My post assumes that you're interested in sequences.
Well, you can always have a mixture of two models. For example:
P(X_t | X_{t-1}, Theta) = theta_p P(X_t) + theta_q P(X_t | X_{t-1})
where theta_p + theta_q = 1. Note that you need to model the two
submodels too, but I didn't list their parameters.
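
To make the mixture concrete, here is a minimal sketch in Python. It assumes
both submodels are plain maximum-likelihood count tables over a discrete
alphabet, and that the mixing weight is fixed by hand rather than estimated.
The names fit_counts and p_mixture are made up for illustration, and the
fallback to P(X_t) for a context that was never seen is my own assumption,
added only so the probabilities still sum to 1.

```python
from collections import Counter


def fit_counts(seq):
    """Count single symbols and (previous, current) transitions in seq."""
    unigram = Counter(seq)
    bigram = {}
    for prev, cur in zip(seq, seq[1:]):
        bigram.setdefault(prev, Counter())[cur] += 1
    return unigram, bigram


def p_mixture(x, prev, unigram, bigram, theta_p=0.3):
    """theta_p * P(X_t = x) + theta_q * P(X_t = x | X_{t-1} = prev)."""
    theta_q = 1.0 - theta_p  # the weights must sum to 1

    # Context-free submodel P(X_t).
    p0 = unigram[x] / sum(unigram.values())

    # First-order submodel P(X_t | X_{t-1}).
    ctx = bigram.get(prev, Counter())
    ctx_total = sum(ctx.values())
    # Assumption: an unseen context falls back to the context-free estimate.
    p1 = ctx[x] / ctx_total if ctx_total else p0

    return theta_p * p0 + theta_q * p1


seq = "abababac"
unigram, bigram = fit_counts(seq)
print(p_mixture("b", "a", unigram, bigram, theta_p=0.5))
```

In practice you would not fix theta_p by hand; it could be fit on held-out
data, for instance, but that is beyond this sketch.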
Anyway, this kind of stuff is in the domain of context modelling. With
additional data, the context "grows" in your model. For starters, you
could examine an application of context-modelling to the JPEG-LS image
Applications of universal context modeling to lossless compression of
gray-scale images.
Marcelo J. Weinberger, Jorma Rissanen, and Ron Arps.
IEEE Trans. Image Processing, 5(4):575--586, April 1996.
There are more papers referenced at
In addition to the universal modelling work started by Rissanen (and
of which JPEG-LS is a special case), Marcus Hutter told me that he
knows how to come up with these priors: http://www.yqcomputer.com/
Sorry, the references I'm giving you do not address your concerns
directly, but the questions about sequences you're asking have really
been studied far more in information theory than in statistics.
mag. Aleks Jakulin
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science,
University of Ljubljana, Slovenia.