There was a small update to the source. The main addition (requested

in several emails) was an option:

#define _ASM 1 // Set to 0 to disable inline asm

in Qitypes.h, which allows disabling the VC6 inline asm (used in

some macros). The speed of operations (mostly the radix codes decoder)

drops by a few percent without the inline asm. There were also some

minor compiler warnings, reported in several emails, which were

cleaned up in the latest code (I left the file name unchanged for the

old links to work):

http://www.1stworks.com/ref/C/QIC100.zip
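
Since the post describes a compile-time switch, here is a minimal

sketch of how such an _ASM switch is typically wired (hypothetical

helper and names, not the actual Qitypes.h/EncDec.c contents):

/* returns the position of the lowest set bit of a nonzero word x */
static int lowest_bit(unsigned x)
{
#if _ASM
    __asm {
        bsf eax, x      /* bit scan forward, result left in eax */
    }
    /* VC6 idiom: value returned in eax (expect warning C4035) */
#else
    int i = 0;          /* portable C fallback, somewhat slower */
    while (!(x & 1)) { x >>= 1; i++; }
    return i;
#endif
}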

There is also a thread about the algorithm discovery in the Computer

Chess Club ( http://www.talkchess.com/ <-- signup screen), where the

seed for it was planted way back in the last century. Here is the

excerpt for those interested in that aspect:

------ Computer Chess Club thread excerpt (a few typos fixed) ----------

"Number of positions in chess -- The rest of the story"

http://www.talkchess.com/forums/1/message.html?476509

Posted by Ratko V. Tomic on January 03, 2006 at 08:12:12:

Hi Uri, that little exercise in enumeration you brought up

(and mentioned in the other thread) set me off back then to

try to make it work as a general-purpose compression algorithm.

While a neat idea on paper, the problem was that the arithmetic

precision had to be of the size of the output.

After struggling for a while, I searched the literature and

it turned out such a compression algorithm already existed,

called "Enumerative Coding", since the 1960s (first in the Russian

literature, from Kolmogorov and his disciples, then shortly

thereafter here in the USA, from Lynch and Davisson). And, as

in my version, the precision problem was still unsolved after

over four decades of various attempts to make the algorithm

practical.

Since I arrived at it on my own, my conventions for

enumeration happened to be backwards from those that

existed in the literature (mine sorted right to left,

the so-called colex sorting of combinations, and built

up the enumerative addends bottom up, while the standard

scheme sorted lexicographically & worked recursively top

down, plus all my pictures were rotated 45 degrees from

theirs). Further, due to my playing with lattice methods

in QCD (in my physics graduate school days), I also had

my own visual representation of combinatorics as lattice

walks, which is a very intuitive, heuristically rewarding way

of looking at it, allowing one to see all of the combinatorial

identities at a glance (especially useful for tossing around

and checking algorithms in the head when going to sleep

or waking up, without a pencil and paper). The lattice

formulation turns out to have existed in the EC literature

as well (as Schalkwijk's Pascal triangle walks), although

not in as general or elegant a formalism as mine, lacking even

a notation for the lattice walks, key sets of paths, enumerative

classes, constraints... (stuff I worked out while doing physics).
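
To make the colex convention concrete: here is a minimal illustration

(mine, not the QI source) of ranking a binary string by summing

binomial addends bottom up, i.e. the combinatorial number system in

colex order; it reproduces the index 2+6+35 = 43 for the string

00101001 that comes up later in this thread:

#include <stdio.h>

/* binomial coefficient C(n,k); small inputs only, no overflow checks */
static unsigned long binom(unsigned n, unsigned k) {
    unsigned long r = 1;
    unsigned i;
    if (k > n) return 0;
    for (i = 1; i <= k; i++)
        r = r * (n - k + i) / i;    /* division is always exact here */
    return r;
}

/* colex rank: sum of C(p_i, i) over the 0-based positions
   p_1 < p_2 < ... of the 1-bits (addends built up bottom to top) */
static unsigned long colex_rank(const char *bits) {
    unsigned long rank = 0;
    unsigned k = 0, p;
    for (p = 0; bits[p]; p++)
        if (bits[p] == '1')
            rank += binom(p, ++k);
    return rank;
}

int main(void) {
    /* 1-bits of 00101001 sit at positions 2, 4, 7:
       C(2,1) + C(4,2) + C(7,3) = 2 + 6 + 35 = 43 */
    printf("%lu\n", colex_rank("00101001"));
    return 0;
}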

Since that thread, I kept returning to the problem, on and

off, trying various ideas. Nothing worked. Then, in summer

2004, when my wife and kids went to a summer camp for

a week, I stayed home to work on a programming project

(a video codec). The first night home alone, at about 2AM,

while debugging the latest batch of code, out of nowhere

an idea popped into my head on that pesky enumeration

problem, something I hadn't yet tried. I quickly coded just a

toy version...

------------------ (end of Computer Chess Club excerpt) ---------------

"nightlight" < XXXX@XXXXX.COM > wrote in

I know you claim to have this great code. But others, such as Matt,

the inventor of the PAQ coders, at least wrote a simple arithmetic coder,

FPAQ0, to show how his methods would compete on files where zero-order

arithmetic coding comes into play. Since your model is always

"tighter than the present best entropy coding algorithm", do you

actually have test code to compare with real files and test sets? Or is

the method not yet advanced enough to do real compression on files?

Don't get me wrong, Matt is not saying FPAQ0 is the best entropy

coder; he knows it isn't that good. He is just showing how it works

for a certain model. The same sort of thing could be done with your

method. At least, that is, if your method is comparable at all with

real-world entropy coders. And since it would use the same model as

most, one could calculate the true entropy against standard test sets.

If one does this, one can see Matt's code does not produce on average

the shortest file. Maybe yours could do better, if you ever actually

get real working code. But I find it hard to believe it could compress

as well as some current entropy encoders.

Where Matt's code shines is his use of various models and a slick

way to combine them, which makes his family of PAQ coders among the

best on the net. The point is, if your code is half as good as you

claim, then his simple 2-state entropy coder could be replaced by

your faster and tighter 2-state coder, which would bring you fame.

But I won't hold my breath.

David A. Scott

--

My Crypto code

http://www.yqcomputer.com/

http://www.yqcomputer.com/

My Compression code http://www.yqcomputer.com/

**TO EMAIL ME drop the roman "five" **

Disclaimer: I am in no way responsible for any of the statements

made in the above text. For all I know I might be drugged.

As a famous person once said, "any cryptographic

system is only as strong as its weakest link"

gt; "tigther than the present best entropy coding algorithm" do you

You can download the source code, which shows the coding aspects where

QI is essentially different from existing entropy coding

algorithms. All that the source will show you is that QI is a more

accurate and much faster entropy coder than any AC you may have, tested

under the same conditions (e.g. same model, same coding of model

information, of data lengths, counts, output serialization or

packaging... etc). There are some quasi-arithmetic coders, which run

faster than the full AC (paying for speed in extra output redundancy),

but even these won't run 20 or 30 times faster, let alone 200 times

faster than the full AC, as QI does (they're usually 2-3 times faster).

But if you have one such, you're welcome to test it. I would be curious

to know.

The source code itself is not a file archiver or video codec or any

such higher level application. Any differences for these higher level

applications, however interesting they may be otherwise, are a simple

consequence of the fundamental differences:

a) AC always codes with greater redundancy than QI (under the same

coding conditions, obviously; this is the result of AC being a direct

approximation of QI; see [T2] pp. 19-25, the chapter on that exact

difference, how much and for which data, with additional

details on AC in [40],[41],[41a],[41b]), and

b) AC codes much more slowly than QI due to:

.. b1) AC has to perform coding operations for all input symbols, while

QI can just skip over the most probable symbols at memory speed (you

can see the top of the coding loop in EncDec.c, where it merely scans

the memory for the less probable symbols, 32 symbols per loop step,

at basically memory-scan speed; a sketch of the idea follows below), and

.. b2) AC performs more complex operations for the least probable

symbols (which QI also needs to encode explicitly), i.e. mul/div vs a

simple array lookup and add. This difference, which remains even for

incompressible data (where the least & most probable symbols are

approximately equally likely), allows QI to still code at least 6 times

faster than the full AC even in the high-entropy limit.
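
The skipping idea in b1 can be sketched as follows (my own minimal

illustration, not the actual EncDec.c loop): with binary input packed

into 32-bit words, a whole word of most-probable symbols is passed

over in one test, and only the 1-bits are visited:

#include <stdint.h>
#include <stddef.h>

/* scan_ones() stands in for the coder's outer loop: a zero word means
   32 most-probable symbols are skipped in a single compare, and only
   the less probable symbols (1-bits) get individual work */
static unsigned long scan_ones(const uint32_t *buf, size_t nwords) {
    unsigned long coded = 0;
    size_t i;
    for (i = 0; i < nwords; i++) {
        uint32_t w = buf[i];
        if (w == 0) continue;   /* skip 32 symbols at memory-scan speed */
        while (w) {
            w &= w - 1;         /* clear the lowest set bit */
            coded++;            /* a real coder would add an addend here */
        }
    }
    return coded;
}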

All that is, of course, measurable using the source code provided

(which also includes very accurate timing functions). The above are not

religious claims or an invitation to believe or debate belief systems, but

a simple statement of easily verifiable empirical facts. If you need to

know also how it would do on a "real" file, and can't extrapolate from

how it does on memory buffers filled with arbitrary content, well, you

are welcome to add such file-related code and try it. Now, if you do

add file i/o which takes hundreds of times longer than the coding, I can

predict you will see almost no speed difference.

-- References ( http://www.1stworks.com/ref/RefLib.htm )

T2. R.V. Tomic, "Quantized Indexing: Background Information," 1stWorks

TR05-0625, 39 pp., Jun 2005.

http://www.1stworks.com/ref/TR/tr05-0625a.pdf

40. J.C. Kieffer, "Second Order Analysis of Data Compression

Algorithms" (preprint from J.C. Kieffer's lectures).

http://citeseer.ist.psu.edu/370131.html

41. M. Drmota, H-K. Hwang, W. Szpankowski, "Precise Average Redundancy

of an Idealized Arithmetic Coding," DCC 2002, pp. 222-231.

http://citeseer.ist.psu.edu/drmota02precise.html

41a. P.A.J. Volf, "Weighting Techniques in Data Compression: Theory

and Algorithms," Ph.D. thesis, Eindhoven University of Technology, De

nightlight" < XXXX@XXXXX.COM > wrote in

news: XXXX@XXXXX.COM :

I guess that means you don't yet have code where you can compare

it to even simple arithmetic file coders. I don't have faith in your

work from your earlier posts.

A simple "No, you can't actually test it in any real applications yet"

would have been enough. Again, from the earlier thread it's not obvious

to me you have a full understanding of arithmetic coding methods.

Very funny.

It's very strange you make a big deal of claiming you compared it

against what you claim is a good entropy coder, Moffat's. Yet you don't

even test against it on a level playing field. You think by modifying

Moffat's code you are giving an honest test. However, if you really wanted

an honest test, since you are the new kid on the block, you would think

you could easily convert your code to work on files like Moffat's. Or are

you afraid to test it on the same playing field Moffat and others have

picked, so various methods can be tested against yours?

I for one believe you shifted ground because you fear real arithmetic

coders, and people could take existing software without modification and

show you directly that your coder does not lead to shorter compressed output

than already existing coders. I suspect this since most would have tested

the Moffat code on the same playground instead of modding it to one of your

choice, where it's not easy to compare against any other standard codings.

Look, maybe you have something. If your method is any good at all,

surely you could easily add the stuff you stripped out of Moffat's

to your code so that you can compare the compression results. Or is

there some reason you can't? If you can't, then it would seem to be of

little use.

See:

http://groups.google.com/group/comp.compression/browse_frm/thread/ffc7f7ca84c76378/792884848866dc4c?q=moffat&rnum=4#792884848866dc4c

From the quote below, if it's true at all and if your code works at all,

you should have the courage of your convictions to test it against Moffat

and the others where they were designed to run. Doesn't yours work there?

What's the problem?

YOUR QUOTE:

"Note also that for the test we had stripped the Moffat et al. V3 binary

coder to its bare engine, replaced stream i/o with memory to memory

coding, no allocations were done within the timed loop, model decisions

were taken out (since only order 0 was tested) and all dynamical coding

options (alternatives that didn't matter) were hardwired as static to

avoid time waste in the test. So the arithmetic coder tested was

already quite a bit faster than the out-of-the-box Moffat et al code. We

wanted it to take its best shot, the best it possibly could (since QI

didn't need any such implementation/code level edge, being so much more

efficient at the fundamental, algorithmic level)."

If you wanted it to take its best shot... Again, let me state that for

the dense: if you wanted it to take its best shot, then test it like the

others, on files, where Moffat's was made to work. If you don't, one can

only wonder what you are hiding. Especially since you claim this is so

much better than arithmetic.

I thought about downloading it as you suggested and converting it to files

so it really can honestly be compared to Moffat's. But I realize from your

post that you would claim I did a poor job of converting, so I will not do

it. After all, you're the one making the claim it's better than Moffat's.

You're the one saying you compared it

NIGHTLIGHT, what is wrong with you?

You claim a coder better than the best arithmetic, yet

your only proof is your word, using Moffat code you

modified yourself.

Look, people here are willing to help those trying to

learn. Yet you never seem to answer real questions.

Such as why you went to the trouble to modify Moffat's

code yourself and then proclaim your method is better

than the best present-day arithmetic encoders.

Please stop with the ridiculous posts of references that

have nothing to do with this, unless you're trying to pull

the wool over people's eyes.

We don't work for you and don't have to kiss

your ass for a job. If you really want to show it's better

and can compress to smaller sizes than Moffat's, then do

an honest test on unmodified code.

Why is it you can change his code and put him down

with code that is not even completely his?

Yet you seem unable to change your code to work with

the data his is designed for? Surely you have the

ability with your team of people to do that. Is it

that you're afraid other current arithmetic coders

can already compress most files better?

My tests on my time, your tests on your time.

You must be damn rich to afford time ownership.

Can I share your secret to owning time?

No entropy... please.

> You claim a coder better than the best arithmetic,

That's not the "only" proof. Analysis of the algorithms,

as sketched below (and covered in more detail in the

preprint & the tech reports) will give you the same answer,

showing it even more clearly. The source code which is

available allows anyone to test as they wish. The source

also allows examination, so that one can explain the results

from the code itself, rather than from the mathematical

description of the algorithms.

The main mod on their code was to make it code from memory

to memory instead of using much slower C streams. The

alternative of adding stream in/out to the QI code would simply

add a large and variable term and uncertainty to both coding

times, making the tests measure mostly irrelevant aspects (such

as how much time the C stream library takes under all OS &

buffering fluctuations).

Hence, regarding the speed tests, the removal of their

stream i/o has allowed much more accurate and reliable

measurement of the coding speed differences. If you wish

to measure your C stream i/o functions' speed, with some

coding time added on, go ahead and test that. That

doesn't interest me.

The other mods were selection of the coding mode and precision

via their own options (in the headers). Binary coder, order 0

mode was used. Their code for the higher order contexts was

commented out, so the coder could run faster (instead of checking

higher order model variables inside their coding loop). Also, all

their memory allocs were moved outside of their coding loop (just

their coding loop was timed using the hi-res win32 timer).
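
For reference, timing just a coding loop with the hi-res win32 timer

typically looks like this (a minimal sketch; code_buffer() is a

hypothetical stand-in for the call being measured, not a function

from either coder):

#include <windows.h>
#include <stdio.h>

/* time one call with QueryPerformanceCounter, the hi-res win32 timer */
void time_coding_loop(void (*code_buffer)(void)) {
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);   /* ticks per second */
    QueryPerformanceCounter(&t0);
    code_buffer();                      /* only the coding loop is timed */
    QueryPerformanceCounter(&t1);
    printf("%.3f us\n",
           (t1.QuadPart - t0.QuadPart) * 1e6 / (double)freq.QuadPart);
}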

As with the speed tests, the restriction to the order-0 model was

made because that is where the coders differ (since one can

always use the same model and the same encoding of the model

parameters with both coders on higher order models). Hence, the

optimum way to measure the coding efficiency _difference_ was to

remove any extra variables in which they don't differ. Again, if

you are interested in the modeling quality of the Moffat98 AC

implementation beyond order 0, go test that.

Of course, both aspects, the speed and the coding accuracy

advantage of QI, are self-evident to anyone who understands the

mathematical description of the two coding algorithms. QI is

always more accurate since AC is its direct approximation

(assuming you use both under 'everything else set the same way',

such as model, model parameter encoding, etc). AC and QI use the

same type of addends, except that AC has a different normalization,

in which it rescales what is in QI an exact integer "path count"

at some point (x,y), by dividing it by QI's total "path count"

(e.g. the path count at the path endpoint B in Fig 1, p. 4). This

AC scaling turns QI's integers into unlimited binary fractions,

which the AC then truncates to the given number of bits.

This truncation of the infinite fractions, even for a small number

of symbols (a loss absent in QI's integer format of addends),

is a loss of precision which leads to AC losing parts of its

coding interval in each step. If one were to use fractions of

infinite precision, all intervals would fit exactly next to each

other, without gaps. Since allowing intervals to overlap would

result in non-decodable output (by the Kraft inequality), any loss in

precision for specifying interval boundaries must leave unused

gaps in the output code space.
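
As a toy numeric illustration of that precision loss (my own numbers,

reusing the path counts from the 00101001 example discussed later in

this thread): the integer addend C(2,1) = 2 out of C(8,3) = 56 total

paths becomes the fraction 2/56, and truncating it to a g-bit register

already opens a gap:

#include <stdio.h>

/* the exact rational addend/total is cut to g fractional bits; the
   difference is code space the coder can never use (an unused gap) */
int main(void) {
    unsigned addend = 2, total = 56;    /* exact integer path counts */
    int g = 5;                          /* toy register width in bits */
    unsigned t = (addend << g) / total; /* floor(2^g * addend/total) */
    printf("exact     = %f\n", (double)addend / total);   /* 0.035714 */
    printf("truncated = %f\n", (double)t / (1u << g));    /* 0.031250 */
    printf("gap       = %f\n",
           (double)addend / total - (double)t / (1u << g));
    return 0;
}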

The basic arithmetic differen

That's not the "only" proof. Analysis of the algorithms,

as sketched below (and covered in more detail in the the

preprint & the tech reports) will give you the same answer,

showing it even more clearly. The source code which is

available allows anyone to test as they wish. The source

also allows examination so that one can explain the results

from the code itself, rather than form the mathematical

description of the algorithms.

The main mod on their code was to make it code from memory

to memory instead of using much slower C streams. The

alternative of adding stream in/out to QI code would simply

add a large and variable term and uncertainty to both coding

times making tests measure mostly the irrelevant aspects (such

as how much time the C stream library takes under all OS &

buffering fluctuations).

Hence, regarding the speed tests, the removal of their

stream i/o has allowed much more accurate and reliable

measurements of the coding speeds differences. If you wish

to measure your C stream i/o functions speed, which combines

some addons of coding times, you go ahead, test that. That

doesn't interest me.

The other mods were selection of the coding mode and precision

via their own options (in the headers). Binary coder, order 0

mode was used. Their code for the higher order contexts was

commented out, so the coder can run faster (instead of checking

higher order model variables inside their coding loop). Also, all

their memory allocs were moved outside of their coding loop (just

their coding loop was timed using the hi-res win32 timer).

As with the speed tests, the restriction to order 0 model was

made because that is where the coders differ (since one can

always use the same model and the same encoding of the model

parameters with both coders on higher order models). Hence, the

optimum way to measure the coding efficiency _difference_ was to

remove any extra variables in which they don't differ. Again, if

you are interested in the modeling quality of Moffat98 AC

implementation beyond order 0, go test that.

Of course, both aspects, the speed and the coding accuracy

advantage of QI, are self-evident to anyone who understands the

mathematical description of the two coding algorithms. QI is

always more accurate since AC is its direct approximation

(assuming you use both under 'everything else set the same way',

such as model, model parameter encoding, etc). AC and QI use the

same type of addends, except that AC has different normalization,

in which it rescales what is in QI an exact integer "path count"

at some point (x,y), by dividing it with QI's total "path count"

(e.g. the path count at the path endpoint B in Fig 1, p. 4). This

AC scaling turns QI's integers into unlimited binary fractions,

which the AC then truncates to the given number of bits.

This truncation of the infinite fractions even for a small number

of symbols (which is absent in QI's integer format of addends),

is a loss of precision which leads to AC losing parts of its

coding interval in each step. If one were to use fractions of

infinite precision, all intervals would fit exactly next to each

other, without gaps. Since allowing intervals to overlap would

result in non-decodable output (by Kraft inequality), any loss in

precision for specifying interval boundaries must leave unused

gaps in the output code space.

The basic arithmetic differen

That's not the "only" proof. Analysis of the algorithms,

as sketched below (and covered in more detail in the the

preprint & the tech reports) will give you the same answer,

showing it even more clearly. The source code which is

available allows anyone to test as they wish. The source

also allows examination so that one can explain the results

from the code itself, rather than form the mathematical

description of the algorithms.

The main mod on their code was to make it code from memory

to memory instead of using much slower C streams. The

alternative of adding stream in/out to QI code would simply

add a large and variable term and uncertainty to both coding

times making tests measure mostly the irrelevant aspects (such

as how much time the C stream library takes under all OS &

buffering fluctuations).

Hence, regarding the speed tests, the removal of their

stream i/o has allowed much more accurate and reliable

measurements of the coding speeds differences. If you wish

to measure your C stream i/o functions speed, which combines

some addons of coding times, you go ahead, test that. That

doesn't interest me.

The other mods were selection of the coding mode and precision

via their own options (in the headers). Binary coder, order 0

mode was used. Their code for the higher order contexts was

commented out, so the coder can run faster (instead of checking

higher order model variables inside their coding loop). Also, all

their memory allocs were moved outside of their coding loop (just

their coding loop was timed using the hi-res win32 timer).

As with the speed tests, the restriction to order 0 model was

made because that is where the coders differ (since one can

always use the same model and the same encoding of the model

parameters with both coders on higher order models). Hence, the

optimum way to measure the coding efficiency _difference_ was to

remove any extra variables in which they don't differ. Again, if

you are interested in the modeling quality of Moffat98 AC

implementation beyond order 0, go test that.

Of course, both aspects, the speed and the coding accuracy

advantage of QI, are self-evident to anyone who understands the

mathematical description of the two coding algorithms. QI is

always more accurate since AC is its direct approximation

(assuming you use both under 'everything else set the same way',

such as model, model parameter encoding, etc). AC and QI use the

same type of addends, except that AC has different normalization,

in which it rescales what is in QI an exact integer "path count"

at some point (x,y), by dividing it with QI's total "path count"

(e.g. the path count at the path endpoint B in Fig 1, p. 4). This

AC scaling turns QI's integers into unlimited binary fractions,

which the AC then truncates to the given number of bits.

This truncation of the infinite fractions even for a small number

of symbols (which is absent in QI's integer format of addends),

is a loss of precision which leads to AC losing parts of its

coding interval in each step. If one were to use fractions of

infinite precision, all intervals would fit exactly next to each

other, without gaps. Since allowing intervals to overlap would

result in non-decodable output (by Kraft inequality), any loss in

precision for specifying interval boundaries must leave unused

gaps in the output code space.

The basic arithmetic differen

-- Errata:

The figure 16384 should be replaced by 128*1024/9 = 14563.5... since the

8 ones produce 9 sections of zeros.

) That's not the "only" proof. Analysis of the algorithms,

) as sketched below (and covered in more detail in the

) preprint & the tech reports) will give you the same answer,

) showing it even more clearly.

This analysis is no different from any other analysis in that

you have to make lots of assumptions. This means that if you use

such an analysis to make real-world predictions, then their accuracy

depends on how well your assumptions match the real world.

Because of the massive nature of your posts, I haven't been able to

find an answer to a few questions I have about the QI coder:

- if I read you correctly, it is not an adaptive coder,

so how do you transmit the model information for the QI coder?

- how would you use a QI coder with an adaptive model?

- assuming I have a stream of symbols, where at each position in

the stream the probability distribution of the symbols is different,

then how does the QI coder adapt itself to all those different distributions?

(I have a few more questions, but let's stick to these for now.)

SaSW, Willem

--

Disclaimer: I am in no way responsible for any of the statements

made in the above text. For all I know I might be

drugged or something..

No, I'm not paranoid. You all think I'm paranoid, don't you!

#EOT

"nightlight" < XXXX@XXXXX.COM > wrote in

I wish we had someone here that was an expert on both

algorithms, since your previous posts sure indicate

to me that you are no expert on arithmetic coding. I

admit I know very little about your QI; however, I know enough

about arithmetic coding to realize that proper optimal bijective

file coding is one of the best methods, and it is an optimal

method, something you don't seem to understand from your various

posts. Even if QI could be used to make some sort of optimal

file compressor, it could never compress all files as well

as an optimal arithmetic coder. Since you can't grasp that simple

fact, you can't be an expert in arithmetic, so I doubt what you

think is self-evident has any relationship to reality.

You seem to think your stuff is better than arithmetic when

used as an entropy encoder. Yet it appears this so-called neat

method of yours has yet to be even tested in any simple code

where one is using an entropy coder. Why is that?

Even Matt did FPAQ0; can't you or your team do something

similar with QI, or is it too complex a task?

I know you couldn't care less what I think. But many people

here would like to see real results. We get plenty of people

that can quote and talk about how good their stuff is, and people

here realize talk is cheap. They want to see REAL RESULTS. Can

you do that? Or don't you care what the average person here thinks

of your method?

As mentioned above, it appears you can modify Moffat's, which

is not the best, but you picked what you wanted to pick and then

called it the best. I would continue asking new questions, but

for some reason you seem not to answer even the simplest

of questions.

David A. Scott

--

My Crypto code

http://www.yqcomputer.com/

http://www.yqcomputer.com/

My Compression code http://www.yqcomputer.com/

**TO EMAIL ME drop the roman "five" **

Disclaimer: I am in no way responsible for any of the statements

made in the above text. For all I know I might be drugged.

As a famous person once said, "any cryptographic

system is only as strong as its weakest link"

I agree that someone with enough time to read his massive posts

could instead read the paper and find the answers himself.

I will try to answer some questions on the basis of my understanding of

what he has posted here (I have not read his papers yet).

He says that QI is based on enumerative coding, where models are

conceptually different from the PPM models we are more familiar with.

So it will probably mean that we will need to study EC from the ground up

and then see how QI fits in (at least I will have to, as I am not

familiar with EC) and how it relates to PPM models.

Or if someone can summarize the difference here for people like me,

please do so.

He said that QI is the "natural" choice for BWT post-processing. This

probably means that QI itself can't be used for higher-order adaptive

coding, but by using BWT, the higher-order adaptive-modelling problem

can be reduced to something which QI can handle.

I don't know the answer to this one.

I hope this helps (my apologies to nightlight if I am mistaken

somewhere, feel free to correct me).

As for nightlight's comments on QI's speed, I am afraid that as the

modelling scheme for QI is different from the modelling scheme for

arithmetic coding, we will need to compare the speed of "QI + its

modelling code" with "AC + its modelling code", where both models

should be of the same order, or chosen to give the same compression

ratio. (My doubt here is: what if QI just *shifts* the computation

burden to the modelling code instead of reducing it?)

Sachin Garg [India]

http://www.yqcomputer.com/

Sachin Garg" < XXXX@XXXXX.COM > wrote in

news: XXXX@XXXXX.COM :

If the paper is that small pdf file at his site, it does not answer

the simple questions being asked. So I don't see this as an answer.

And if it's some stuff where one would have to sign some nondisclosure

agreement, I think that too would be a waste of time, since the questions

asked of him are not that hard.

He seems to believe arithmetic compression is a poor approximation

of EC compression. Here is a summary: take two symbols, one and zero;

suppose you have 2 ones and 2 zeros, then there are 4!/(2!*2!) or 6

combinations. His method would assign the 6 possible numbers to this

problem, so you could say it's exact, at least if the numbers are small.

But the paper really doesn't say how to do compression. You read various

other papers that show how to assign a number to a combination.

All he seems to do is get a number for a combination.

He has in his paper the example of 00101001, in which he calculates

the value as 2+6+35 = 43; this was done in the long post, message 59, at

http://groups.google.com/group/comp.compression/browse_frm/thread/ffc7f7ca84c76378/30566aec6aa7d363?q=nightlight&rnum=1#30566aec6aa7d363

Big deal, that's not compression. He states there:

"The answer is on page 6, where it shows the string index

I(S8)=2+6+35=43 and how to calculate it (eq. 17). The actual size in

bits of the index is L=log(56)=5.807... bits since the valid values of

the index are 0..55 (there are 56 paths from A to B). The fractional

bits of L don't normally go to waste since they are coded in the mixed

radix with other items sent, usually with the fractional bits of

indices for other blocks (cf. p.9 [N4]). The encoder also sends the

count of 1's per block, which is 3 here and the length of the entire

string which is 8. The latter two items get coded in a variety of ways in

different variants and different parts of the coder/s and experimental

prototypes (cf. p.9 [N6])."

This means he never compares it to an arithmetic compressor. He only

makes claims that it's better. I think if he could actually compress the

one example he gives in the paper then he might be more believable. But he

wants to say it's exact. But he can't be pinned down on any application

that compares it to an arithmetic coder. So not having any simple examples

done in a complete way saves him from being shown arithmetic ways that beat

his method. He leaves it to you to do the coding, at which point he

could claim you didn't do it the right way. Actually it's a clever way

to prevent it from being compared to any real-world entropy compressor.

So the only thing I got out of the paper, besides the fact he never

completes anything, is that for a string made of only ones and zeros

the compressed, encoded result is 3 items that are easy to combine;

just don't ask him how to combine them, since he seems not likely to

do so (one standard way is sketched after the list):

1) the length of the entire string

2) the number of ones

3) the index value.
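
A generic mixed-radix packing (my sketch of the standard technique the

quoted passage alludes to, not nightlight's code) would combine such

bounded items like this:

#include <stdio.h>

/* pack indices v[i], each with its own range m[i], into one integer,
   so the fractional bits of each log2(m[i]) are not wasted */
static unsigned long pack(const unsigned *v, const unsigned *m, int n) {
    unsigned long x = 0;
    int i;
    for (i = 0; i < n; i++)
        x = x * m[i] + v[i];        /* requires v[i] < m[i] */
    return x;
}

/* decode by peeling the items back off with div/mod in reverse order */
static void unpack(unsigned long x, unsigned *v, const unsigned *m, int n) {
    int i;
    for (i = n - 1; i >= 0; i--) {
        v[i] = (unsigned)(x % m[i]);
        x /= m[i];
    }
}

int main(void) {
    /* the thread's index 43 (range 56) packed with a hypothetical second
       block's index 10 (range 20): 56*20 = 1120 codes, about 10.13 bits,
       instead of 6 + 5 = 11 bits when the two are padded separately */
    unsigned v[2] = { 43, 10 }, m[2] = { 56, 20 }, w[2];
    unsigned long x = pack(v, m, 2);
    unpack(x, w, m, 2);
    printf("packed=%lu decoded=%u,%u\n", x, w[0], w[1]);
    return 0;
}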

He doesn't risk actually combining these in an output string;

it's possible he does not want to risk being laughed at, or he

fears one could show how a simple bijective string compressor gets

better results. If you can think of any other reason no examples

like this are done to the end, please Sachin, tell us what you think

it is.

Maybe it depends on the definition of "natural".

I suspect he will paste in lots of links to various papers;

that seems to be his style. So don't hold your breath.
