Some ideas and questions.

Post by chaotikmin » Fri, 12 Dec 2008 22:10:17


Hello,
to avoid doing useless work, I just want to submit two ideas and
gather some criticism.

As far as I know these ideas are not used anywhere (I googled for
them), but I could be very wrong, hence this post.

1- Has anyone tried to model context with a Bayesian approach (or
even a hierarchical Bayesian analysis)?
If you don't think it's a good approach to the problem, why not?

2- Has anyone tried to implement a multiple-sliding-window
compressor?
My point here is to keep compression fast, but having a second window
at a carefully chosen position could lead to a nice compression
improvement, no? (On some types of data, at least.)
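To make idea 1 concrete, here is a minimal sketch of one common
Bayesian take on context modeling - a per-context Dirichlet/Laplace
estimator whose posterior-mean probability would feed an arithmetic
coder. All class and parameter names here are purely illustrative,
not taken from any existing codec:

```python
from collections import defaultdict

class BayesianContextModel:
    """Order-k context model with a Laplace (add-one) prior per context.

    The predictive probability of symbol s after context c is the
    posterior mean of a uniform Dirichlet prior updated with counts:
        p(s | c) = (count(c, s) + 1) / (total(c) + alphabet_size)
    """

    def __init__(self, order=2, alphabet_size=256):
        self.order = order
        self.alphabet = alphabet_size
        self.counts = defaultdict(lambda: defaultdict(int))
        self.totals = defaultdict(int)

    def prob(self, context, symbol):
        c = context[-self.order:]
        return (self.counts[c][symbol] + 1) / (self.totals[c] + self.alphabet)

    def update(self, context, symbol):
        c = context[-self.order:]
        self.counts[c][symbol] += 1
        self.totals[c] += 1

# Drive the model over a byte string, symbol by symbol.
model = BayesianContextModel(order=1)
data = b"abracadabra"
for i, s in enumerate(data):
    ctx = data[max(0, i - 1):i]
    p = model.prob(ctx, s)   # this probability would feed an arithmetic coder
    model.update(ctx, s)
```

An order-k model predicts each byte from its k predecessors; a
hierarchical variant would additionally share statistics across
context lengths, somewhat as PPM-style mixing does.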

regards.

Post by Thomas Ric » Sat, 13 Dec 2008 03:55:49


Sure it's a nice approach, and sure it's used. If I just search Google
for "bayesian compression", I get lots of results. I'm not clear on
the hierarchical approach, but I haven't looked very carefully.


In which sense? Typically, the window approach is justified because
the correlation between elements decreases with distance, so you can
simply stop caring once elements are too far apart, because keeping
them doesn't buy you anything.


Sorry, where would that second window be, and how would the data from
that window be used?

So long,
Thomas


Post by chaotikmin » Sat, 13 Dec 2008 04:36:46


Hmm, I don't know how I missed that one; searching for "bayesian
compression" seems so obvious. Oh well...
Nice to know it's used, anyway ;)
Is it used in some easy-to-obtain archiver?

> I'm not clear on the hierarchical approach, but I haven't looked
> very carefully.

I'm more of an AI guy; I fear I threw the question out without proper
thinking!
(Same as the difference between BOA and HBOA, but I just figured it's
probably irrelevant in the compression domain.)


There are some cases where we know that the data at a position is
similar to the data at that position plus some arbitrary offset.
OK, I understand that I may have overlooked the fact that this is a
very special case.


Let's say we make a first pass over the file to be compressed, during
which we find one (or more) positions that have an interesting
similarity with another.
We then start the compression with two windows, the second one having
a negative offset determined by our first pass.
(Here I meant the second window to be only a search buffer, in fact.)

You could object that at compression start the second window would be
outside of the data, but that is obviously an easy problem to rule out.
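A rough sketch of how that two-pass scheme could look, just to make
the proposal concrete. Every name, threshold, and heuristic here is a
made-up illustration, not an existing implementation: the first pass
votes for a promising back-offset beyond the normal window, and the
matcher then also searches a small buffer around `pos - offset`:

```python
from collections import Counter

def best_offset(data, window=4096, probe=64, step=1024):
    """First pass (hypothetical heuristic): vote for a back-offset,
    larger than the normal window, at which probe-sized blocks repeat."""
    votes, seen = Counter(), {}
    for i in range(0, len(data) - probe + 1, step):
        block = bytes(data[i:i + probe])
        if block in seen and i - seen[block] > window:
            votes[i - seen[block]] += 1
        seen[block] = i
    return votes.most_common(1)[0][0] if votes else None

def find_match(data, pos, offset, window=4096, second=256):
    """Second pass: longest match, searching the recent window plus a
    small extra search buffer centred near pos - offset."""
    candidates = list(range(max(0, pos - window), pos))
    if offset is not None and pos - offset >= 0:
        lo = max(0, pos - offset - second // 2)
        candidates += range(lo, min(pos, lo + second))
    best = (0, 0)  # (match length, source position)
    for c in candidates:
        n = 0
        while pos + n < len(data) and c + n < pos and data[c + n] == data[pos + n]:
            n += 1
        best = max(best, (n, c))
    return best
```

For the special case where data at `pos` resembles data at
`pos + arbitrary offset`, the second buffer lets the matcher reach far
behind the normal window without paying the cost of searching
everything in between.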

regards .

Post by Thomas Ric » Sat, 13 Dec 2008 04:57:32


Not exactly my corner of the compression business, but the
"mainstream" products are mostly old-fashioned in this respect, AFAIK.
There are many experimental codecs out there that have been
benchmarked - you should probably take a look at

http://www.yqcomputer.com/

and wait for responses from people who work more on this side of the
field. I'm the imaging guy, you know; I don't compress arbitrary
data. There, Bayesian approaches are used, but the applications I
know of go more in the direction of de-noising than compression.


Don't ask me, sorry. Somebody else wants to jump in, probably?


> Let's say we make a first pass over the file to be compressed,
> during which we find one (or more) positions that have an
> interesting similarity with another.
> We then start the compression with two windows, the second one
> having a negative offset determined by our first pass.
> (Here I meant the second window to be only a search buffer, in fact.)

Well, in my field the "second window" would be "the row of pixels
above the current row". In that sense, yes, but that's because the
source data is two-dimensional rather than one-dimensional. I believe
I understand why that could make sense - it could be understood as
some "optimized" PPM scheme where you ignore the data between the two
windows. You could also view the LZ algorithm as a two-window
approach: the current scanning window you check, and the position in
the text the dictionary entry came from, which you currently compare
against. Probably a bit far-fetched, I agree.
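That two-window reading of LZ shows up even in a toy greedy LZ77
coder (a sketch for illustration only, not any particular product's
format): the lookahead at `pos` is one window, and the candidate
match position in the search buffer is the other:

```python
def lz77_tokens(data, window=32):
    """Greedy LZ77: emit (offset, length, next_byte) triples.
    The 'two windows' are the search buffer behind pos and the
    lookahead starting at pos that we compare against it."""
    tokens, pos = [], 0
    while pos < len(data):
        best_len, best_off = 0, 0
        for c in range(max(0, pos - window), pos):
            n = 0
            # Matches may run past pos (self-referential copies are fine).
            while pos + n < len(data) - 1 and data[c + n] == data[pos + n]:
                n += 1
            if n > best_len:
                best_len, best_off = n, pos - c
        nxt = data[pos + best_len]
        tokens.append((best_off, best_len, nxt))
        pos += best_len + 1
    return tokens

def lz77_decode(tokens):
    out = bytearray()
    for off, length, nxt in tokens:
        for _ in range(length):
            out.append(out[-off])  # byte-by-byte handles overlapping copies
        out.append(nxt)
    return bytes(out)
```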


Sure.

So long,
Thomas

Post by chaotikmin » Sat, 13 Dec 2008 05:08:23

> Not exactly my corner in the compression business, but the "mainstream"

Interesting link, thanks.

> and wait for responses from people that are more working on this
> side of the field. I'm the imaging guy, you know, I don't compress
> arbitrary data. There, bayesian approaches are used, but the
> applications I know go more into the direction of de-noising than
> compression.

I just found a paper about Bayesian compression, so we can consider
my question answered:
http://www.yqcomputer.com/

> Don't ask me, sorry. Somebody else wants to jump in, probably?

Not so important; I guess my questions fell more into the "curious to
know" category.

> Well, in my field the "second window" would be "the row of pixels
> above the current row". In that sense, yes, but that's because the
> source data is two-dimensional and not one-dimensional. I believe I
> understand why that could make sense - it could be understood as
> some "optimized" PPM-scheme where you ignore data between the two
> windows. You could also understand the LZ-algorithm as a two-window
> approach: The current scanning window you check, and the position in
> the text the dictionary entry came from you currently compare
> against. Probably a bit far-fetched, I agree.

Something like that!

Thanks for the quick answer.

regards.