- Code optimisation
- 09 Jun 2007 06:28:57 pm
- Last edited by Fallen Ghost on 10 Jun 2007 08:57:26 am; edited 1 time in total
Ok, so here's my problem. This is the critical point in the development of StarCalc. If I find a way to fit this routine in, the whole thing might work (at something like 3 frames/s); if not, there's no point in continuing. Keep in mind that the timings I give are the worst case for each condition (the average would be about the "x compare only" timing). The ways to improve the speed would be to:
- reduce number of searches per frame (now 25/frame, meaning all units searched in 8 frames)
- unroll the routine (that's no fun on memory, but can save a max of 196500 T states/frame over all timings)
- optimize the routine itself
You guys come in for the 3rd point. Every clock you shave off the loop body saves 25*600 = 15000 T states overall. You only have a relatively small amount of RAM (~1k max), but you can use all the registers you can think of, even IX, IY, the shadow registers, SP and R (everything except I, which is used for interrupt speed regulation). You may also change the inputs and the data structure.
The routine runs entirely from RAM.
Data structure (16 bytes per unit, in two halves of 8 bytes):
- type (below 64 = ground, 64 and above = air), internal (7 bytes)
- xcoordinate, ycoordinate, xtarget, ytarget, internal (4 bytes)
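For anyone who wants to try ideas on a PC first, the record layout above can be sketched in Python. This is only a model under my reading of the description (1 type byte + 7 internal bytes + 4 coordinate bytes + 4 internal bytes = 16 bytes); the names `UNIT_FORMAT`, `UNIT_SIZE` and `unpack_unit` are made up for illustration and do not exist in StarCalc.

```python
import struct

# Assumed layout of one unit record, read off the description above:
# type(1), internal(7), x(1), y(1), xtarget(1), ytarget(1), internal(4)
UNIT_FORMAT = "<B7sBBBB4s"
UNIT_SIZE = struct.calcsize(UNIT_FORMAT)

def unpack_unit(record):
    """Split one raw record into named fields (field names are illustrative)."""
    type_, _internal_a, x, y, xt, yt, _internal_b = struct.unpack(UNIT_FORMAT, record)
    return {
        "type": type_,
        "is_empty": type_ == 0,   # the routine skips records whose type byte is 0
        "is_air": type_ >= 64,    # below 64 = ground, 64 and above = air
        "x": x, "y": y, "xtarget": xt, "ytarget": yt,
    }

# Example record: an air unit (type 70) at (10, 20) heading for (12, 22).
raw = bytes([70]) + bytes(7) + bytes([10, 20, 12, 22]) + bytes(4)
unit = unpack_unit(raw)
print(UNIT_SIZE, unit["is_air"], unit["x"], unit["y"])
```

The two 8-byte halves are what lets the routine step through a record with SP = 8 and two `add hl,sp`.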
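Timings, worst case per condition (400 units and 200 buildings, let's say all marines + turrets, so 600 searches per frame). Each search costs 101 T of setup plus the loop time:
- not in type search range: 101+1200 T per search, 780 600 T states per frame (7.6 FPS)
- x compare only: 101+2200 T per search, 1 380 600 T states per frame (4.3 FPS)
- x and y compare: 101+3650 T per search, 2 250 600 T states per frame (2.6 FPS) (and that's only with target search)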
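The totals are just (setup + loop time) multiplied by the number of searches, so they are easy to re-check whenever the loop changes. Here is a small Python sketch of that arithmetic. The 6 MHz clock is an assumption on my part (it is roughly what the FPS figures imply, but the post never states it), and the names are illustrative only.

```python
SETUP_T = 101          # setup cost per search, from the listing
SEARCHES = 600         # 400 units + 200 buildings
PER_SEARCH = 25        # records checked per search
CLOCK_HZ = 6_000_000   # ASSUMPTION: ~6 MHz CPU clock, not stated in the post

# Worst-case loop time (T states for 25 records) in each condition.
worst_case_loop_t = {
    "not in type search range": 1200,
    "x compare only": 2200,
    "x and y compare": 3650,
}

def frame_cost(loop_t):
    """T states spent in target search during one frame."""
    return (SETUP_T + loop_t) * SEARCHES

for name, loop_t in worst_case_loop_t.items():
    total = frame_cost(loop_t)
    print(f"{name}: {total} T states, {CLOCK_HZ / total:.2f} FPS")

# One T state removed from the loop body is saved 25 times in each search.
saving_per_t = PER_SEARCH * SEARCHES
print(saving_per_t, "T states saved per frame for each T removed")
```

The three products come out as 780 600, 1 380 600 and 2 250 600 T states, and each T state shaved off the loop body is worth 15 000 T states per frame.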
Code:
;Inputs:
; -(DE) =xcoordinate byte of the unit that's searching a target
; -C reg8 =range of search+2
; -A reg8 =24 for all, 48 for air, 56 for ground
; -(Start) =location to start search, because we search 25/frame only
;Range explanation:
;
;X * * * * Y * * *
;for Y to be in range of X, the normal range (without the +2 offset) must be 5, so the input is C=7
Target_search:
ld (loop+7),a ;13 ;SMC: A becomes the opcode of the jr after cp 64 (24=jr, 48=jr nc, 56=jr c)
ld HL,(Start) ;16
ld b,25 ;7
push de ;11
exx ;4
pop de ;10
inc de ;6
exx ;4
ld (SaveSP),SP;20
ld SP,8 ;10 ;initialization total = 101 T
loop:
ld a,(HL) ;7
add hl,sp ;11
or a ;4
jr z,next_unit ;12/7
cp 64 ;7
jr next_unit ;12/7 ;opcode patched by SMC at Target_search to set the condition
ld a,(de) ;7 ;7 ;7 ;7
sub a,(hl) ;7 ;7-7=0(nc);7-3=4(nc);7-11=-4 /252 (carry)
jr nc,$+2+2 ;12/7;jump ;jump ;not jump
neg ;8 ; ; ;+4
cp C ;4 ;cp 5 (r+2);cp 5 ;cp 5
jr nc,next_unit+1;12/7;carry ;carry ;carry
inc hl ;6
exx ;4
ld a,(DE) ;7
exx ;4
sub a,(hl) ;7
jr nc,$+2+2 ;12/7
neg ;8
dec hl ;6
cp C ;4
call c,target_found;17/10
next_unit:
add HL,SP ;11
;repeat 25 times
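If you rewrite the loop, it helps to have something to compare results against. Below is a rough Python reference model of the range test as I read it from the listing: each `sub` / `neg` / `cp C` sequence is an absolute difference compared against C, and `target_found` is only reached when both the x and y distances are below C. The function names and the `(type, x, y)` tuples are hypothetical, and the ground/air filter patched in by SMC is deliberately left out.

```python
def in_range(sx, sy, tx, ty, c):
    """Mirror of the two sub/neg/cp C tests: both axis distances must be below c."""
    return abs(sx - tx) < c and abs(sy - ty) < c

def search_targets(searcher, units, c, start=0, count=25):
    """Scan `count` records from `start` and return the indices in range.

    `units` is a list of (type, x, y) tuples. Type 0 marks an empty slot,
    matching the `or a` / `jr z,next_unit` skip in the listing.
    """
    sx, sy = searcher
    found = []
    for i in range(start, min(start + count, len(units))):
        type_, tx, ty = units[i]
        if type_ == 0:
            continue  # empty record, nothing to target
        if in_range(sx, sy, tx, ty, c):
            found.append(i)
    return found

# Searcher at (7, 7) with C = 7, i.e. a normal range of 5 plus the +2 offset.
units = [(1, 12, 10), (0, 7, 7), (70, 7, 20), (1, 3, 9)]
print(search_targets((7, 7), units, c=7))
```

Coordinates are single bytes, so the 8-bit `sub` followed by `neg` on carry gives the exact absolute difference, and a plain `abs()` is enough on the Python side.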
Thanks in advance for your help!