DSP opcode decompression on a microcontroller.

DSP opcode decompression on a microcontroller.

Post by kenh » Sat, 14 Jan 2006 03:57:26

Hi all,

Let me first introduce myself. My name is Ken Helberg and I've been
reading this newsgroup for over a year simply out of a general interest
in compression; learning what I can when it interests me. With my
limited knowledge, I have little to add to various discussions that
occur in this group. My work also only allows me to read this via
google groups, so I apologize ahead for anything funky that will occur
with this message.

I'm at a point with a particular project where I need to consider
compressing some data in order to make space available for other pieces
of code.

In the system I have a small scale audio DSP that has a 35-bit,
512-word program space. The code is downloaded to the DSP only once at
startup and never changes. The 35-bit values are stored as 40-bit
values currently in flash. So the DSP program code takes up 2560 bytes.
If possible, I would like to get that smaller. The tradeoff, of course,
is that whatever code I add to decompress needs to be smaller than the
savings in compressed data. The time it takes to decompress is of
little concern since it's only done once.

The first thing I thought I could do is to store the data in 36-bit
chunks. Possibly with the upper nibble of the 36-bit words stored in
their own array so I don't have to swap nibbles for every odd-numbered
word if I were to store them consecutively. Similarly, it'd be too
costly in code to store the 35-bit values and shift the bits around
just to save a bit per word over the 36-bit method. It might take 40
bytes of code to handle this little compression method; giving a
savings of 216 bytes - not much.

Another method I considered is that there are various runs of (what
look to be) NOPs that I could take out and pseudo-RLE; storing where
the run of NOPs exists and the length of the run. I could use the
offset of the next run off the start or end of the previous run to try
and save space, but unfortunately one of the steps between NOPs is over
256, which means I'd still have store the offset as a 16-bit value. I
could take a small hit in code complexity and contain the length of the
run as well as the offset in a single 16-bit value. WIth the current
DSP program code I have, this'll save me an additional 600 bytes,
approximately. This is different for different program code,
understandably, so I'd like to find a better method.

...which brings me to why I'm really writing to the group. I've got no
information as to how the opcodes are setup; which bits mean what. I
was curious if I could use a method similar to how the probabilities
are gathered for arithmetic coding to determine the bit-to-bit or
bit-to-multibit correlation. Figuring I could, I wondered if anyone
here has done that and has some sample code I could tweak for my needs.

This information would help me to find where the different pieces of
the opcode are split, allowing me to then analyze each word and the
probability of the separate opcode parts. With the small micro I have,
it would probably be easier to do Huffman with known probabilities than
it would to do arithmetic. With my limited RAM (1K) and processing
capability, I don't believe adaptive is an option, either. But, maybe
you folks have some more insight into this.

Here's a text representation of the DSP program code:

Thanks in advance for any help,


DSP opcode decompression on a microcontroller.

Post by Jasen Bett » Sat, 14 Jan 2006 19:07:21

I'm new here too, looking at your DSP code it appears to me that it can
probably be compressed very well...

I'm mostly posting to see what the others think of this idea...
here's my idea, split the lines into 5 7 bit sections and code them like this.

0ddd dddd

if the high bit is zero send the 7 data bits (ddddddd) to the DSP,

1nnn mmmm

it it's 1 do the following split out the numbers n (represented by bits 654)
and m (represented by the low 4 bits)

go back in the output history the distance specified by the m+1
from there m+1 (7 bit) symbols worth of data

* represents start of a 35 bit word.

compressed raw

0000 0000 ; *0000000
1001 0000 ; 0000000
0000 0011 ; 0000011
0000 0100 ; 0000100
1010 0010 ; *0000000
0010 0010 ; 0100010
0000 0000 ; 0000000
1001 0101 ; *0000100
0000 1000 ; 0001000
1001 0010 ; 0000000

something like that should be fairly easy to program....
later on in the data some of the repeated or recently encoded lines can
be stored as a single byte...
(I just can't be botherd to encode that far by hand...)

you could also implement run-length encoding by reserving a
special byte (eg) 1111 1111 indicate the bext byte to be a count
of times to repeat the meaning of the byte after it.

eg, for 20 NOPS in a row after coding the first nop.

1111 1111 ; RLE
0001 0011 ; 19 times
1100 0100 ; last 5 output bytes


With decompression code there's a trade off between code length and data
length, huffman etc might be better suited as I think a decoder for that
can be fairly compact and would probably do better than my ad-hoc attempt



DSP opcode decompression on a microcontroller.

Post by Pasi Ojal » Sat, 14 Jan 2006 19:58:30

I would recommed using any kind of simple LZ77 approach.
In our DSP (32-bit instruction word) a byte-wise LZ77 gives
roughly 50% compression ratio.

For simplicity you should probably first try simple instruction-
word granularity. Because the media is 8-bit, something like:

0ccccxxx + 4 bytes = repeat instruction code X C+1 times (RLE possible)
1ccccccc llllllll = copy C+1 instructions from location cur-L-1 to cur
Because you have 512-word address space, 9 bits of L would be
better, but 8 is probably easier to handle -> shorter decompressor.

Involving huffman and probabilities will make the code "too big".

/Nynaeve could only shake her head. She remembered the night the girl had
filled her fool self with wine. At least she had never done that again;
her head the next morning had seemed an effective cure./
-- Nynaeve about Elayne in The Wheel of Time:"The Fires of Heaven"