The whole thing comes down to whether it is better to first transmit

"N", the number of symbols and then the data, or to include an EOF

signal in-line.

There are costs to doing both.

Imagine there is a probability model for the data itself

p(s|past) available at any point. The data

First case: transmitting N ahead of time.

The cost for the data is L_data = sum(i=1..N) -log p(s_i | past) plus

the cost to transmit N. The Elias delta code can provide a code for

natural integers in less than L(N) <= log N + 2 log log (2 N ).

L = L_data + log N + 2 log log (2N)

Second case: transmititng an EOF signal in-band.

Here, we will have to model the probability of an EOF signal.

The common choice is p(EOF) = 1/(i+1) with i the number of symbols

processed. Thus at any step the codelength for the actual signal

is -log [ (1-p(EOF))*p(s_i | past) ] (since an EOF didn't occur), and then

at the end, one EOF, -log [ P(EOF) ]

this is L = L_data + \sum_(i=1..N) (- log( i/i+1) ) + log (N+1)

= L_data + 2*log(N+1)

Conclusion: the length information costs

Lelias = log N + 2 log log (2N) for pre-transmitting the length.

Leof = 2 log(N+1) for an inband EOF.

(all logarithms are base 2)

For N >= 37, the Elias code wins, providing a shorter codelength.

So, at least with this EOF model, there is an advantage to

pre-transmitting the length for files longer than 37 bits.

Note that you can make an EOF marker cost of

1 + min( Lelias, Leof )

by pre-transmitting one bit saying which scheme you will be using.

The difference between Lelias and Leof is small, being less than 6

bits for N=10000.

PS: That Elias codelength is an upper bound, more specifically

L(elias) = 1 + floor(log(N)) + 2*floor(log(1+floor(log(N))))

counts the integral concrete bits.

Dr Chaos" < XXXX@XXXXX.COM > wrote in message

news: XXXX@XXXXX.COM ...

I'm not sure where you got this limit as I can't find any references with

this limit and it doesn't quite seem to agree with reality. The actual

length of an Elias delta code (which you noted later) is:

floor[lg x] + 2 floor[lg (1 + floor[lg x] ) ] + 1

And your formula is sometimes less than this. I think you need to add some

floor or ceilings to make it correct.

then

I certainly agree that for larger N the Elias code takes less bits than

encoding the EOF as a symbol. But one cannot say that this means it is

"better". You cannot say one is better than another without making some

assumption of what the probability distribution is for the inputs. If the

probability distribution function is heavily weighted to small values of N

then the EOF symbol wins.

The key thing is that you can only say that one code is better for some set

of probability distribution functions. You cannot say one code is

universally better than the other.

For example, let's compare the Elias delta code with a unary coding of an

integer that encodes an integer N as N-1 zeroes followed by a 1, thus

L(unary) = N. L(unary) < L(elias) only for N <= 4. Is Elias better than

Unary? It depends on what the probability distribution is for the value of

N. If all values of N are equally likely, then yes it is.

But let's consider if the value of N we were encoding was the number of coin

flips we made until it come up heads. There p( N = x ) = 2^-x. The p( N <=

x ) = 1 - 2^-x. Here the unary code is the optimal code.

news: XXXX@XXXXX.COM ...

I'm not sure where you got this limit as I can't find any references with

this limit and it doesn't quite seem to agree with reality. The actual

length of an Elias delta code (which you noted later) is:

floor[lg x] + 2 floor[lg (1 + floor[lg x] ) ] + 1

And your formula is sometimes less than this. I think you need to add some

floor or ceilings to make it correct.

then

I certainly agree that for larger N the Elias code takes less bits than

encoding the EOF as a symbol. But one cannot say that this means it is

"better". You cannot say one is better than another without making some

assumption of what the probability distribution is for the inputs. If the

probability distribution function is heavily weighted to small values of N

then the EOF symbol wins.

The key thing is that you can only say that one code is better for some set

of probability distribution functions. You cannot say one code is

universally better than the other.

For example, let's compare the Elias delta code with a unary coding of an

integer that encodes an integer N as N-1 zeroes followed by a 1, thus

L(unary) = N. L(unary) < L(elias) only for N <= 4. Is Elias better than

Unary? It depends on what the probability distribution is for the value of

N. If all values of N are equally likely, then yes it is.

But let's consider if the value of N we were encoding was the number of coin

flips we made until it come up heads. There p( N = x ) = 2^-x. The p( N <=

x ) = 1 - 2^-x. Here the unary code is the optimal code.

Add one bit to the formula.

L(N) <= 1+ log N + 2 log log (2 N ).

for certain needs I wanted a formula continuous in N.

Right. Less than 37.

Of course.

I would guess that in practice, the sizes of streams which need to be

compressed follow a mysterious power law distribution, like so many in

Nature.

In fact if it were P(N) ~ 1/N then you can't even normalize that to

make it a probability density function and thus the optimal prior for

coding of integers.

1. How to Recover the the Carrier Signal from SSB Transmitted Signal?

2. <pre> versus <code> white-space:pre; when pasting

3. pre-determine the length of buffer for `sprinf()' (pre-c99)...

4. eof on channel, not eof on transform: [eof] returns true

5. How do I optimize filter coefficient bit length and signal bit length?

6. Resampling signals of different lengths to a relative length of 1000 samples

7. POSIX signal handling versus traditional signal handling

8. real baseband signal versus complex baseband signal

9. POSIX signal handling versus traditional signal handling

10. wxAda (pre-pre-pre-pre-release)

11. [ANNOUNCE] wxAda (pre-pre-pre-pre-release)

12. Transmit 3 -- what features are limited pre-purchase?

13. TCPSocket, send always transmit an even length message

14. Does usbser.sys transmit DTR and RTS signals to hardware RS232 p

15. Unlocking on desktop doesn't transmit unlocked signal to PocketPC ???

3 post • Page:**1** of **1**