Hello, I am writing floating-point math routines in Axe, but I can't figure out how IEEE 754 works. I have a basic grasp of floating-point math, but IEEE 754 is binary, which is very different from what I have made already. Can anyone give some help or an explanation of the system? Thanks!
Do you know the terms "Sign" "Exponent" "Mantissa"?
Yep, I know these.
I think I am going to understand the format:
1byte sign-11bit exponent-13-nib mantissa (+1bit)
is that right?
1 bit sign, 8 bit unsigned exponent, 23 bit mantissa.

if exponent == 0x00, you have unnormalized numbers (which includes ±0). ±(2^-126)*0.mantissabits

if 0x00 < exponent < 0xff, you have a normalized number ±(2^(exponentbits-127))*1.mantissabits

if exponent == 0xff && mantissabits == 0, you have ±infinity
if exponent == 0xff && mantissabits != 0, you have NaN.
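
For anyone who finds the rules above easier to read as code, here is a minimal sketch in Python (not Axe) that applies the same four cases to a 32-bit pattern. The function names decode_float32 and float32_bits are made up for this illustration; only the standard struct module is used.

```python
import struct

def decode_float32(bits):
    """Interpret a 32-bit integer as an IEEE 754 single, case by case."""
    sign = -1.0 if (bits >> 31) & 1 else 1.0
    exponent = (bits >> 23) & 0xFF      # 8 exponent bits
    mantissa = bits & 0x7FFFFF          # 23 mantissa bits
    if exponent == 0x00:
        # Unnormalized (subnormal) or zero: 0.mantissabits * 2^-126
        return sign * 2.0 ** -126 * (mantissa / 2 ** 23)
    if exponent == 0xFF:
        # All-ones exponent: infinity if the mantissa is zero, otherwise NaN
        return sign * float("inf") if mantissa == 0 else float("nan")
    # Normalized: 1.mantissabits * 2^(exponent - 127)
    return sign * 2.0 ** (exponent - 127) * (1 + mantissa / 2 ** 23)

def float32_bits(x):
    """Return the 32-bit pattern Python's struct module produces for x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

print(hex(float32_bits(0.15625)))    # bit pattern of 0.15625
print(decode_float32(0x3F800000))    # exponent 127, mantissa 0
```

Comparing decode_float32(float32_bits(x)) against x for a few values is a quick way to check a hand-written decoder.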
Um, I will be using 64 bit format. This is 32-bit. Also, I do not really get the pseudocode in your post. Could you explain it?
aeTIos wrote:
Um, I will be using 64 bit format. This is 32-bit.

1 bit sign, 11 bits exponent, 52 bit mantissa. Though actually, the IEEE wants you to call it the "significand" rather than the mantissa (due to some historically conflicting definitions of mantissa).
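
As a rough sketch of that 64-bit layout, the Python below splits a double into its three fields and puts them back together. The names split_float64 and join_float64 are hypothetical helpers for this post; the exponent bias of 1023 is the 64-bit counterpart of the 127 used for 32-bit numbers.

```python
import struct

def split_float64(x):
    """Return (sign, exponent, significand) fields of a 64-bit double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                        # 1 bit
    exponent = (bits >> 52) & 0x7FF          # 11 bits, biased by 1023
    significand = bits & ((1 << 52) - 1)     # 52 stored bits
    return sign, exponent, significand

def join_float64(sign, exponent, significand):
    """Pack the three fields back into a double."""
    bits = (sign << 63) | (exponent << 52) | significand
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

print(split_float64(1.0))     # 1.0 = 1.0 * 2^(1023 - 1023)
print(split_float64(-2.5))    # -2.5 = -1.25 * 2^(1024 - 1023)
```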

Quote:
Also, I do not really get the pseudocode in your post. Could you explain it?

Which part is confusing you? Do you understand hexadecimal?
I understand hexadecimal. It is just a bit unreadable IMO. Could you explain it in words?
aeTIos wrote:
I understand hexadecimal. It is just a bit unreadable IMO. Could you explain it in words?

Being able to change bases in your head is invaluable. I'm happy to explain it, but why don't you try explaining it to me by interpreting the explanation I already gave in your own words first.
Okay, I'll try to explain.
If the exponent is $000, you have unnormalized (what is that?) numbers, which includes plus or minus 0.
Format is like (2^-126) then the mantissa bits
Did you intend it this way?
Also, could you explain normalized and unnormalized numbers?
thanks.
aeTIos wrote:
Okay, I'll try to explain.
If the exponent is $000, you have unnormalized (what is that?) numbers, which includes plus or minus 0.

Un-normalized (also called denormalized or subnormal) numbers are the small ones that don't have an implied 1._____ in front of the mantissa bits (they are 0.______ instead). Additionally, the scale factor is fixed at 2^-126 for the 32-bit format (2^-1022 for the 64-bit format).
aeTIos wrote:
Hello, I am writing floating-point math routines in Axe, but I can't figure out how IEEE 754 works. I have a basic grasp of floating-point math, but IEEE 754 is binary, which is very different from what I have made already. Can anyone give some help or an explanation of the system? Thanks!
At the risk of sounding like a pedant, everything on a calculator (and computer) is binary. The only thing special here is that the binary bits are being used to represent floating point numbers instead of integers.
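
To make that point concrete, here is a small Python sketch (the variable names are just for illustration) that reads the same four bytes once as an unsigned integer and once as a 32-bit float. Nothing about the bits changes; only the interpretation does.

```python
import struct

raw = bytes([0x41, 0x20, 0x00, 0x00])       # four arbitrary example bytes

as_int = struct.unpack(">I", raw)[0]        # read as a big-endian unsigned integer
as_float = struct.unpack(">f", raw)[0]      # read as a big-endian IEEE 754 single

print(as_int)
print(as_float)
```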
'everything on a calculator (and computer) is binary'
I knew that. I think I figured it out. I made an adding routine in Axe, but it only works for non-negative numbers. Can someone help with some code for negative numbers? (Pseudocode, please.)
Can you post your pseudocode for adding non-negative IEEE 754 floating-point numbers? And you are indeed using 32-bit numbers here, right?
Not right now. I was using some self-built floating-point math code, but in fact it's the same. Will post in a minute... (if my parents don't come home)
KermMartian wrote:
Can you post your pseudocode for adding non-negative IEEE 754 floating-point numbers? And you are indeed using 32-bit numbers here, right?

He just said he wanted 64, so I'm a little confused.
Wait, I mean 64-bit.
Anyway, here is the code:

Code:
:.FLOATING MATH
:Lbl FCA
:GetCalc("appvtmpmath",100)→A
:Return
:Lbl FPA
:GetCalc("appvtmpmath")→A
:...
:format:
:expon-number
:...
:
:.Rewrite to high exponent
:{r1}→I
:{r2}→J
:max(I,J)→{A}
:If I-J
:If I>J
:I-J→V
:copy(r1+1,A+20,4)
:copy(r2+1,A+40+V,4)
:Else
:J-I→V
:copy(r2+1,A+20,4)
:copy(r1+1,A+40+V,4)
:End
:Else
:copy(r1+1,A+20,4)
:copy(r2+1,A+40,4)
:End
:
:A+20→r1
:A+40→r2
:.Do math
:For(B,0,3
:{A+5-B}+{r1+3-B}+{r2+3-B}→θ
:If θ>9
:1→{A+4-B}
:θ-10→θ
:End
:θ→{A+5-B}
:End
:
:.Check if exp should be incr.
:If {A+1}=1
:If {A+5}>4
:{A+4}+1→{A+4}
:End
:For(B,0,3
:{A+4-B}→{A+5-B}
:End
:0→{A+1}
:{A}+1→{A}
:Else
:If I-J
:If {A+44}>4
:{A+5}+1→{A+5}
:End
:End
:End
:A
:Return
:Lbl DFP
:GetCalc("appvTMP",40)→G
:{r1}→Z
:0→C
:For(B,2,5
:{r1+B}+48→{G+C+B}
:!If (B-2)-Z
:1→C
:46→{G+C+B}
:End
:End
:Disp G+2
:DelVar "appvTMP"
:Return

A bit of Axe knowledge is recommended, but not 100% necessary. If you need any explanation, please ask and include a quote from the code.

Edit: The routine takes two pointers, in r1 and r2, to two 4-digit floats. Lbl DFP is a float-displaying routine.
***Bump*** Any suggestions for my code?
What do you mean by 4-digit floats? 4 bytes? 4 bytes would be 32-bit floats, but you're talking potentially about 64-bit floats, right? I think you and we are both unclear about what bit-length numbers you're trying to implement.
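
On the earlier request for pseudocode that handles negative numbers: here is a minimal sketch of the usual sign-and-magnitude approach, written in Python rather than Axe. It assumes a made-up decimal layout of (sign, exponent, magnitude) with value = (-1)^sign * magnitude * 10^exponent and at most 4 digits of magnitude, which is not necessarily how the routine above stores its numbers. The idea is: if the signs match, add the magnitudes and keep the sign; if they differ, subtract the smaller magnitude from the larger and take the sign of the larger. The same idea carries over to binary IEEE 754 numbers.

```python
def add_signed(a, b, digits=4):
    """Add two (sign, exponent, magnitude) decimal floats.

    Assumed layout (hypothetical): value = (-1)**sign * magnitude * 10**exponent.
    """
    (sa, ea, ma), (sb, eb, mb) = a, b
    # Align both magnitudes to the smaller exponent so no digits are lost yet
    e = min(ea, eb)
    ma *= 10 ** (ea - e)
    mb *= 10 ** (eb - e)
    if sa == sb:
        s, m = sa, ma + mb          # same sign: add magnitudes, keep the sign
    elif ma >= mb:
        s, m = sa, ma - mb          # different signs: larger magnitude wins
    else:
        s, m = sb, mb - ma
    if m == 0:
        return (0, 0, 0)            # treat any zero result as +0
    # Renormalize: round half up once so the magnitude fits in `digits` digits
    limit = 10 ** digits
    excess = max(0, len(str(m)) - digits)
    if excess:
        m = (m + 10 ** excess // 2) // 10 ** excess
        e += excess
        if m == limit:              # rounding carried into an extra digit
            m //= 10
            e += 1
    return (s, e, m)

print(add_signed((0, 0, 1234), (1, 0, 234)))   # 1234 + (-234)
print(add_signed((0, 0, 100), (1, 0, 250)))    # 100 + (-250)
```

Using Python's big integers for the aligned magnitudes keeps the sketch short; a calculator routine would work digit by digit with a borrow instead, the mirror image of the carry in the addition loop.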
  