## What's a random sequence for Bayesians?

### What's a random sequence for Bayesians?

Suppose I believe that every n-bit binary sequence is equally likely.
Now, suppose I observe a new sequence s. Can I quantify how consistent
that sequence is with my belief?

There are various tests I can do

1. Count number of 1's in the sequence. If number of 1's is
considerably higher than expected, then my belief is probably wrong
2. Count number of runs and if there are too many or too few runs of
1's, then this sequence is not very consistent with my belief.
3. Say that because any string is equally likely, then observing
111......1111 is fully consistent with my prior belief, and there's no
reason to update it.

So what would be the Bayesian way to choose between these approaches?

Yaroslav

### What's a random sequence for Bayesians?

This is a prior. One could call it the null.

Yes, you can use the likelihood of your model.

You're comparing your null hypothesis here with various alternative
hypotheses:

Alternative = Binomial

Alternative = HMM

Right. It depends, however, on whether you consider the string to be a
sequence or a sample. In 1) it is a sample (each 0 or 1 is an event),
in 2) it is a sequence (each 0 or 1 is an event in the context of the
previous event) and in 3) the whole sequence is a single event.

Aleks
--
mag. Aleks Jakulin
http://www.yqcomputer.com/
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science,
University of Ljubljana, Slovenia.

### What's a random sequence for Bayesians?

>

OK, so the question here -- which representation should I choose?

The criteria here is to choose a prior consistent with my beliefs. So
I believe that true distribution is uniform, and want p(data).

A uniform distribution, in my belief:
1. Assigns low probability to strings with too many or too few ones
2. Assigns low probability to strings with too many or too few runs
3. Generally assigns low probability to strings with too much order --
the ones that can be compressed a lot

Ignoring belief 3 for now, I can come up with a good representation of
belief 1 (Binomial distribution), and good representation of belief 2
(runs test). But how would I combine these two distributions? The two
aren't really independent since lots of 1's mean few runs.

Belief 1 and belief 2 are really special cases of belief 3. Counting
number of 1's or number of runs in a string, is an approximation of
compressibility. General compressibility requires finding smallest
Turing Machine that generates string, and is uncomputable. The
practical question is, which other good approximation of
compressibility exist, and how do we combine them to form a consistent
prior?

### What's a random sequence for Bayesians?

My post assumes that you're interested in sequences.

Well, you can always have a mixture of two models. Like:

P(X_t|X_t-1, Theta) = theta_p P(X_t) + theta_q P(X_t | X_t-1)

here theta_p + theta_q = 1. Note that you need to model the two
submodels too, but I didn't list the parameters.

Anyway, this kind of stuff is in the domain of context modelling. With
could examine an application of context-modelling to the JPEG-LS image
compression standard:

http://www.yqcomputer.com/
http://www.yqcomputer.com/

and

Applications of universal context modeling to lossless compression of
gray-scale images.
Marcelo J. Weinberger, Jorma Rissanen, and Ron Arps.
IEEE Trans. Image Processing, 5(4):575--586, April 1996.

There are more papers referenced at
http://www.yqcomputer.com/

In addition to the universal modelling work started by Rissanen (and
of which JPEG-LS is a special case), Marcus Hutter told me that he
knows how to come up with these priors: http://www.yqcomputer.com/ ~marcus