- (Z80/PC) (De)Compression Routines
- 26 Feb 2015 02:19:53 pm
- Last edited by Iambian on 18 Mar 2015 08:50:23 pm; edited 5 times in total
To start off, thanks to Xeda112358 for inspiring me to take a stab at this for myself. I'm doing this since I wanted to learn Python (which is a great language) and wanted my first .py script to be something useful. So...
LZ77 Compression
After reading a relatively easy-to-understand paper regarding LZ77, I've decided to take a crack at it. The algorithm I wrote (at the moment) uses fixed-length 8-bit codes, so the maximum back-reference distance and code size is 255. I put in two optimizations over the method described, which are:
* If size = 0, omit backref and output symbol
* If backref = 0, emit following symbols by size+1
The 8-bit fixed-length codes really hurt the compression ratio for files that don't match very well, or that have matches further back than 255 bytes, but I wasn't really writing this to have the best compression ratio. I just wanted something that worked for my first Python project.
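Concretely, with those two special cases, the decoder only ever has to handle three token shapes (the byte values below are made up purely for illustration, not taken from a real compressed file):

```python
# Three token shapes in this scheme (values are illustrative only):
single_literal = [0, 0x41]        # size=0: backref omitted, emit the one symbol
literal_run    = [2, 0, 7, 8, 9]  # backref=0: size+1 = 3 literal bytes follow
match          = [4, 2, 0x2E]     # copy 4 bytes starting 2 back, then emit literal 0x2E
```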
Python script: Compress/Decompress (the important bits)
Thanks to Sorunome, who got the ball rolling by providing me with the file I/O functions.
Code:
#!/usr/bin/python3
import sys
#File format:
# header [compressedArraySizeLSB,compressedArraySizeMSB]
# header [outputArraySizeLSB,outputArraySizeMSB]
# tuples [count,backref,symbol(s)]
#If count=0, omit backref, emit symbol
#If backref=0, emit # of symbols by count+1
#Else, canonical LZ77.
MAXCODELEN = 255

def readFile(file):
    a = []
    f = open(file, 'rb')
    b = f.read(1)
    while b != b'':
        a.append(ord(b))
        b = f.read(1)
    f.close()
    return a

def writeFile(file, a):
    f = open(file, 'wb+')
    f.write(bytes(a))
    f.close()

#winsize: default None for start-of-file, else curpos-winsize bounded by start-of-file.
def resetSlidingWindow(curpos, winsize=None):
    if winsize is None:
        a = 0
    else:
        if winsize > curpos:
            a = 0
        else:
            a = curpos - winsize
    return a

#Tuple: [Size,Backref,Symbol(s)]
#If Size=0, then it becomes [0,Symbol]
#If backref=0, then symbols is a string of literals.
def returnLZCodes(count, backrefptr, symbol):
    if isinstance(symbol, list):
        b = symbol
    else:
        b = [symbol]
    if count < 1:
        a = [0] + b
    else:
        a = [count, backrefptr] + b
    return a
# do not allow "compression" of files less than 2 bytes long
def compress(inArray):
    global MAXCODELEN
    lbhptr = 0          #filestart
    cptr = 1            #filestart+1
    matchfound = 0      #flag
    foundcount = 0
    foundbackref = 0
    outArray = []
    literalbuf = [inArray[0]]
    EoA = len(inArray)  #not +1, cuz we need var at EoF to emit as literal
    while cptr < EoA:
        if inArray[lbhptr] == inArray[cptr]:
            csrchptr = cptr
            clbhptr = lbhptr
            rlecount = 0
            mcount = 0
            while inArray[clbhptr] == inArray[csrchptr]:
                if (csrchptr+1) == (EoA-1): #do not allow final tok to be consumed
                    break
                if mcount >= MAXCODELEN:
                    break
                if clbhptr == cptr:
                    clbhptr = lbhptr
                    rlecount += 1
                mcount += 1
                clbhptr += 1
                csrchptr += 1
            if (mcount > foundcount) and (mcount > 3): #replace 3 later with codelen
                matchfound = 1
                foundcount = mcount
                foundbackref = cptr - lbhptr
                foundposend = csrchptr
        lbhptr += 1
        if lbhptr >= cptr:
            if matchfound == 1:
                if len(literalbuf) > 0:
#                    if len(literalbuf) > 255:
#                        print("Error literalbuffer overrun!", str(literalbuf))
                    outArray.extend(returnLZCodes(len(literalbuf)-1, 0, literalbuf))
                    del literalbuf[:]
                outArray.extend(returnLZCodes(foundcount, foundbackref, inArray[foundposend]))
#                print("Match found: count " + str(foundcount) + ", backref " + str(foundbackref) + ", position " + str(cptr) + ", trailing num " + str(inArray[cptr+foundcount]) + " sanity check match " + str(inArray[foundposend]))
                cptr = foundposend #compensates for lookahead literal write; if last symbol, next check falls through and exits the while loop normally
                matchfound = 0
                foundcount = 0
                foundbackref = 0
            else:
                literalbuf.append(inArray[cptr])
                if len(literalbuf) >= MAXCODELEN: #flush buffer if max code size reached
                    outArray.extend(returnLZCodes(len(literalbuf)-1, 0, literalbuf))
                    del literalbuf[:]
                    print("literalbuf filled, forcing buffer flush.")
            cptr += 1
            lbhptr = resetSlidingWindow(cptr, 255)
            if cptr == (EoA-1):
                literalbuf.append(inArray[cptr])
                break #break out of the while loop and force a buffer flush
    if len(literalbuf) > 0:
        outArray.extend(returnLZCodes(len(literalbuf)-1, 0, literalbuf))
    #later on, return 2D array, [[outArray],[listOfCompressionDetails]]
    return outArray
#Ver 0.1: fixed len [size,backref,symbol(s)]
# If size=0, omit backref, use symbol.
# If backref=0, size+1 = number of symbols (including trailing symbol)
def decompress(inArray):
    outArray = []
    cptr = 0
    while cptr < len(inArray):
        temps = inArray[cptr]
        if temps == 0:
#            print("Single literal found: " + str(inArray[cptr+1]))
            outArray.append(inArray[cptr+1])
            cptr += 2
        else:
            tempb = inArray[cptr+1]
            if tempb == 0:
                cptr += 2
                tmpArray = []
                for idx in range(temps+1):
                    tmpArray.append(inArray[cptr])
                    cptr += 1
#                print("Literal array found: " + str(tmpArray))
                outArray.extend(tmpArray)
            else:
                cptrorig = len(outArray)
#                print("LZ encoding: cptrorig " + str(cptrorig) + ", tempb " + str(tempb) + ", size " + str(temps))
                for idx in range(temps):
                    outArray.append(outArray[(cptrorig-tempb)+idx])
                outArray.append(inArray[cptr+2])
                cptr += 3
    return outArray
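To make the token walk above concrete, here is a minimal, self-contained sketch of the same decode logic applied to a hand-built stream (the `decode` name and the byte values are mine, for illustration; this is not the on-calc routine):

```python
def decode(stream):
    # Walks the [size,backref,symbol(s)] token format described above.
    out = []
    i = 0
    while i < len(stream):
        size = stream[i]
        if size == 0:                       # [0, symbol]: one literal
            out.append(stream[i+1])
            i += 2
        elif stream[i+1] == 0:              # [size, 0, ...]: size+1 literals follow
            out.extend(stream[i+2:i+3+size])
            i += 3 + size
        else:                               # [size, backref, symbol]: match + trailing literal
            start = len(out) - stream[i+1]
            for k in range(size):           # byte-by-byte so overlapping (RLE-style) matches work, like LDIR
                out.append(out[start + k])
            out.append(stream[i+2])
            i += 3
    return out

# [1,0,1,2] emits the two literals 1,2; [4,2,3] copies 4 bytes from 2 back
# (overlapping, so it repeats 1,2) and then emits the trailing literal 3.
print(decode([1, 0, 1, 2, 4, 2, 3]))  # -> [1, 2, 1, 2, 1, 2, 3]
```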
z80 decompressor routine (again, the important bits)
Code:
;LZ77 (1) Decompression
;Header: [outputSizeLSB,outputSizeMSB,inputSizeLSB,inputSizeMSB]
;Format: [size,backref,symbol(s)]
;If size=0, omit backref, output only symbol.
;If backref=0, a string of size+1 symbols follows.
;Otherwise, it's a string of matches starting from curptr-backref.
;
;* Input may be streamed one byte at a time. It does not need random access.
;* Output is assumed to happen in-place and start anywhere.
;* Output data structure must be allocated yourself.
;
;HL = input file address start (must start at inputSizeLSB byte)
;DE = output file address start
;BC = file size (if called from dlz77_1_loop instead of dlz77_1_main)
dlz77_1_main:
	ld c,(hl)
	inc hl
	ld b,(hl)
	inc hl
dlz77_1_loop:
	ld a,(hl)
	inc hl
	dec bc
	or a
	jr z,dlz77_1_print_one_literal
	dec bc
	push bc
	ld b,a
	ld a,(hl)
	inc hl
	or a
	jr z,dlz77_1_print_many_literals
	push hl
	neg
	ld l,a
	ld h,$FF
	add hl,de		;-offset+curpos
	ld c,b
	ld b,0
	ldir
	pop hl
dlz77_1_print_last_literal_in_block:
	pop bc
dlz77_1_print_one_literal:
	ld a,(hl)
	inc hl
	ld (de),a
	inc de
	dec bc
	ld a,c
	or b
	jr nz,dlz77_1_loop
	ret
dlz77_1_print_many_literals:
	ld a,(hl)
	inc hl
	ld (de),a
	inc de
	ex (sp),hl
	dec hl			;decrement the BC that was pushed
	ex (sp),hl
	djnz dlz77_1_print_many_literals
	jr dlz77_1_print_last_literal_in_block
So, the thing you all really want to know: Performance.
For files that contain many repeating sequences, such as tilemaps, this algorithm works quite well. For things such as text... not so well. The results are compared against WinRAR (.rar, default compression), a popular archiver for Windows.
Code:
                               IN       OUT          WINRAR
NaCaNiS Sprite Set:            13401 -> 05083 (37%)  02541
E:SoR tilemap hive 1:          06714 -> 02842 (42%)  01600
The Great Gatsby (txt, part):  28721 -> 26637 (92%)  12372
Moby-[censored] (txt, part):   36678 -> 33429 (91%)  16131
Things will probably improve when I add such things as variable-width codes, and some of the goodies that make products such as pucrunch so awesome.
Edit: Feb 27, 2015
Test: On-calc decompression was tested in an emulator running in 6MHz mode, decompressing the NaCaNiS sprite set in a loop 255 times. The test took somewhat less than 25 seconds. The spriteset expands to 13401 bytes, so data was written out at a rate of about 133KB/sec. To put that in perspective, a straight LDIR at 6MHz copies at about 279KB/sec, and since the important part of the main loop *is* an LDIR, we get a bit of speed from that. Even though this was done in an emulator, that shouldn't change the fact that decompression is very fast.
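The arithmetic behind those two figures can be checked quickly (the 21 T-states per byte for LDIR is the standard Z80 timing, assumed here; the 13401-byte size, 255 loops, and ~25 s run come from the test above):

```python
# Throughput of the decompression test: bytes emitted / time taken
total_bytes = 13401 * 255          # spriteset size x 255 loop iterations
print(total_bytes / 25 / 1024)     # ~133.5 KB/s, matching the ~133KB/sec claim

# Ceiling for comparison: LDIR copies one byte per 21 T-states
print(6_000_000 / 21 / 1024)       # ~279 KB/s at 6MHz, the straight-LDIR rate quoted
```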
----
Improved LZ77 (Runer112's variation): http://www.cemetech.net/forum/viewtopic.php?p=231705#231705
More improvements, and an on-calc compressor: http://www.cemetech.net/forum/viewtopic.php?p=232677#232677
This post will be modified periodically to keep any extra stuff I post all in one place.