- (Z80/PC) (De)Compression Routines
- 26 Feb 2015 02:19:53 pm
- Last edited by Iambian on 18 Mar 2015 08:50:23 pm; edited 5 times in total
To start off, thanks to Xeda112358 for inspiring me to take a stab at this for myself. I'm doing this since I wanted to learn Python (which is a great language) and wanted my first .py script to be something useful. So...
LZ77 Compression
After reading a relatively easy-to-understand paper regarding LZ77, I've decided to take a crack at it. The algorithm I wrote (at the moment) uses fixed-length 8-bit codes, so the maximum back-reference distance and code size is 255. I put in two optimizations over the method described, which are:
* If size = 0, omit backref and output symbol
* If backref = 0, emit following symbols by size+1
The 8-bit fixed-length codes really hurt the compression ratio for files that don't match very well, or that have matches further back than 255 bytes, but I wasn't really writing this to have the best compression ratio. I just wanted something that worked for my first Python project.
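Concretely, with those two special cases, the decoder only ever has to handle three token shapes (the byte values below are made up purely for illustration, not taken from a real compressed file):

```python
# Three token shapes in this scheme (values are illustrative only):
single_literal = [0, 0x41]        # size=0: backref omitted, emit the one symbol
literal_run    = [2, 0, 7, 8, 9]  # backref=0: size+1 = 3 literal bytes follow
match          = [4, 2, 0x2E]     # copy 4 bytes starting 2 back, then emit literal 0x2E
```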
Python script: Compress/Decompress (the important bits)
Thanks to Sorunome, who got the ball rolling by providing me with the file I/O functions.
Code:
#!/usr/bin/python3
import sys
#File format:
# header [compressedArraySizeLSB,compressedArraySizeMSB]
# header [outputArraySizeLSB,outputArraySizeMSB]
# tuples [count,backref,symbol(s)]
#If count=0, omit backref, emit symbol
#If backref=0, emit # of symbols by count+1
#Else, canonical LZ77.
MAXCODELEN = 255

def readFile(file):
    a = []
    f = open(file, 'rb')
    b = f.read(1)
    while b != b'':
        a.append(ord(b))
        b = f.read(1)
    f.close()
    return a

def writeFile(file, a):
    f = open(file, 'wb+')
    f.write(bytes(a))
    f.close()

#winsize: default None for start-of-file, else curpos-winsize bounded by start-of-file.
def resetSlidingWindow(curpos, winsize=None):
    if winsize is None:
        a = 0
    else:
        if winsize > curpos:
            a = 0
        else:
            a = curpos - winsize
    return a

#Tuple: [Size,Backref,Symbol(s)]
#If Size=0, then it becomes [0,Symbol]
#If backref=0, then symbols is a string of literals.
def returnLZCodes(count, backrefptr, symbol):
    if isinstance(symbol, list):
        b = symbol
    else:
        b = [symbol]
    if count < 1:
        a = [0] + b
    else:
        a = [count, backrefptr] + b
    return a
# do not allow "compression" of files less than 2 bytes long
def compress(inArray):
    global MAXCODELEN
    lbhptr = 0          #filestart
    cptr = 1            #filestart+1
    matchfound = 0      #flag
    foundcount = 0
    foundbackref = 0
    outArray = []
    literalbuf = [inArray[0]]
    EoA = len(inArray)  #not +1, cuz we need var at EoF to emit as literal
    while cptr < EoA:
        if inArray[lbhptr] == inArray[cptr]:
            csrchptr = cptr
            clbhptr = lbhptr
            rlecount = 0
            mcount = 0
            while inArray[clbhptr] == inArray[csrchptr]:
                if (csrchptr+1) == (EoA-1): #do not allow final tok to be consumed
                    break
                if mcount >= MAXCODELEN:
                    break
                if clbhptr == cptr:
                    clbhptr = lbhptr
                    rlecount += 1
                mcount += 1
                clbhptr += 1
                csrchptr += 1
            if (mcount > foundcount) and (mcount > 3): #replace 3 later with codelen
                matchfound = 1
                foundcount = mcount
                foundbackref = cptr - lbhptr
                foundposend = csrchptr
        lbhptr += 1
        if lbhptr >= cptr:
            if matchfound == 1:
                if len(literalbuf) > 0:
#                    if len(literalbuf) > 255:
#                        print("Error literalbuffer overrun!", str(literalbuf))
                    outArray.extend(returnLZCodes(len(literalbuf)-1, 0, literalbuf))
                    del literalbuf[:]
                outArray.extend(returnLZCodes(foundcount, foundbackref, inArray[foundposend]))
#                print("Match found: count " + str(foundcount) + ", backref " + str(foundbackref) + ", position " + str(cptr) + ", trailing num " + str(inArray[cptr+foundcount]) + " sanity check match " + str(inArray[foundposend]))
                cptr = foundposend #compensates for lookahead literal write; if last symbol, next check falls through and exits the while loop normally
                matchfound = 0
                foundcount = 0
                foundbackref = 0
            else:
                literalbuf.append(inArray[cptr])
                if len(literalbuf) >= MAXCODELEN: #flush buffer if max code size reached
                    outArray.extend(returnLZCodes(len(literalbuf)-1, 0, literalbuf))
                    del literalbuf[:]
                    print("literalbuf filled, forcing buffer flush.")
            cptr += 1
            lbhptr = resetSlidingWindow(cptr, 255)
            if cptr == (EoA-1):
                literalbuf.append(inArray[cptr])
                break #break out of the while loop and force a buffer flush
    if len(literalbuf) > 0:
        outArray.extend(returnLZCodes(len(literalbuf)-1, 0, literalbuf))
    #later on, return 2D array, [[outArray],[listOfCompressionDetails]]
    return outArray
#Ver 0.1: fixed len [size,backref,symbol(s)]
# If size=0, omit backref, use symbol.
# If backref=0, size+1 = number of symbols (including trailing symbol)
def decompress(inArray):
    outArray = []
    cptr = 0
    while cptr < len(inArray):
        temps = inArray[cptr]
        if temps == 0:
#            print("Single literal found: " + str(inArray[cptr+1]))
            outArray.append(inArray[cptr+1])
            cptr += 2
        else:
            tempb = inArray[cptr+1]
            if tempb == 0:
                cptr += 2
                tmpArray = []
                for idx in range(temps+1):
                    tmpArray.append(inArray[cptr])
                    cptr += 1
#                print("Literal array found: " + str(tmpArray))
                outArray.extend(tmpArray)
            else:
                cptrorig = len(outArray)
#                print("LZ encoding: cptrorig " + str(cptrorig) + ", tempb " + str(tempb) + ", size " + str(temps))
                for idx in range(temps):
                    outArray.append(outArray[(cptrorig-tempb)+idx])
                outArray.append(inArray[cptr+2])
                cptr += 3
    return outArray
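To make the token walk above concrete, here is a minimal, self-contained sketch of the same decode logic applied to a hand-built stream (the `decode` name and the byte values are mine, for illustration; this is not the on-calc routine):

```python
def decode(stream):
    # Walks the [size,backref,symbol(s)] token format described above.
    out = []
    i = 0
    while i < len(stream):
        size = stream[i]
        if size == 0:                       # [0, symbol]: one literal
            out.append(stream[i+1])
            i += 2
        elif stream[i+1] == 0:              # [size, 0, ...]: size+1 literals follow
            out.extend(stream[i+2:i+3+size])
            i += 3 + size
        else:                               # [size, backref, symbol]: match + trailing literal
            start = len(out) - stream[i+1]
            for k in range(size):           # byte-by-byte so overlapping (RLE-style) matches work, like LDIR
                out.append(out[start + k])
            out.append(stream[i+2])
            i += 3
    return out

# [1,0,1,2] emits the two literals 1,2; [4,2,3] copies 4 bytes from 2 back
# (overlapping, so it repeats 1,2) and then emits the trailing literal 3.
print(decode([1, 0, 1, 2, 4, 2, 3]))  # -> [1, 2, 1, 2, 1, 2, 3]
```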
z80 decompressor routine (again, the important bits)
Code:
;LZ77 (1) Decompression
;Header: [outputSizeLSB,outputSizeMSB,inputSizeLSB,inputSizeMSB]
;Format: [size,backref,symbol(s)]
;If size=0, omit backref, output only symbol.
;If backref=0, a string of size+1 symbols follows.
;Otherwise, it's a string of matches starting from curptr-backref.
;
;* Input may be streamed one byte at a time. It does not need random access.
;* Output is assumed to happen in-place and start anywhere.
;* Output data structure must be allocated yourself.
;
;HL = input file address start (must start at inputSizeLSB byte)
;DE = output file address start
;BC = file size (if called from dlz77_1_loop instead of dlz77_1_main)
dlz77_1_main:
	ld c,(hl)
	inc hl
	ld b,(hl)
	inc hl
dlz77_1_loop:
	ld a,(hl)
	inc hl
	dec bc
	or a
	jr z,dlz77_1_print_one_literal
	dec bc
	push bc
	ld b,a
	ld a,(hl)
	inc hl
	or a
	jr z,dlz77_1_print_many_literals
	push hl
	neg
	ld l,a
	ld h,$FF
	add hl,de		;-offset+curpos
	ld c,b
	ld b,0
	ldir
	pop hl
dlz77_1_print_last_literal_in_block:
	pop bc
dlz77_1_print_one_literal:
	ld a,(hl)
	inc hl
	ld (de),a
	inc de
	dec bc
	ld a,c
	or b
	jr nz,dlz77_1_loop
	ret
dlz77_1_print_many_literals:
	ld a,(hl)
	inc hl
	ld (de),a
	inc de
	ex (sp),hl
	dec hl			;decrement the BC that was pushed
	ex (sp),hl
	djnz dlz77_1_print_many_literals
	jr dlz77_1_print_last_literal_in_block
So, the thing you all really want to know: Performance.
For files that contain many repeating sequences, such as tilemaps, this algorithm works quite well. For things such as text... not so well. The results are compared against WinRAR (.rar, default compression), a popular archiver for Windows.
Code:
                               IN       OUT          WINRAR
NaCaNiS Sprite Set:            13401 -> 05083 (37%)  02541
E:SoR tilemap hive 1:          06714 -> 02842 (42%)  01600
The Great Gatsby (txt, part):  28721 -> 26637 (92%)  12372
Moby-[censored] (txt, part):   36678 -> 33429 (91%)  16131
Things will probably improve when I add such things as variable-width codes, and some of the goodies that make products such as pucrunch so awesome.
Edit: Feb 27, 2015
Test: On-calc decompression was tested in an emulator running in 6MHz mode, decompressing the NaCaNiS sprite set in a loop 255 times. The test took somewhat less than 25 seconds. The spriteset expands to 13401 bytes, so data was written out at a rate of about 133KB/sec. To put that in perspective, a straight LDIR at 6MHz copies at about 279KB/sec, and since the important part of the main loop *is* an LDIR, we get a bit of speed from that. Even though this was done in an emulator, that shouldn't change the fact that decompression is very fast.
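The arithmetic behind those two figures can be checked quickly (the 21 T-states per byte for LDIR is the standard Z80 timing, assumed here; the 13401-byte size, 255 loops, and ~25 s run come from the test above):

```python
# Throughput of the decompression test: bytes emitted / time taken
total_bytes = 13401 * 255          # spriteset size x 255 loop iterations
print(total_bytes / 25 / 1024)     # ~133.5 KB/s, matching the ~133KB/sec claim

# Ceiling for comparison: LDIR copies one byte per 21 T-states
print(6_000_000 / 21 / 1024)       # ~279 KB/s at 6MHz, the straight-LDIR rate quoted
```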
----
Improved LZ77 (Runer112's variation): http://www.cemetech.net/forum/viewtopic.php?p=231705#231705
More improvements, and an on-calc compressor: http://www.cemetech.net/forum/viewtopic.php?p=232677#232677
This post will be modified periodically to keep any extra stuff I post all in one place.