Java Floating-Point Errors Explained with the IEEE 754 Specification

dilshan ukwattage
3 min read · May 5, 2021

Take a look at the code snippet below. What do you think the answer will be?

Most people expect the answer to be 0.5, but it is not. Let’s look at the actual output first.
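The original snippet is not reproduced here, so the following is only a minimal sketch of the kind of calculation being discussed. The operands 0.7 and 0.2 and the class name `FloatingPointSurprise` are assumptions chosen for illustration, not the author’s original code.

```java
// Hypothetical example: the operands 0.7 and 0.2 are assumed,
// they are not taken from the original post.
class FloatingPointSurprise {

    // Subtracts two doubles that look as if they should give exactly 0.5.
    static double subtract() {
        double a = 0.7;
        double b = 0.2;
        return a - b;
    }

    public static void main(String[] args) {
        System.out.println(subtract()); // prints 0.49999999999999994
    }
}
```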

As you can see, the answer is not what we expected. So why do we get this result instead of 0.5? Because Java’s float and double data types implement the IEEE 754 specification, which stores numbers in binary, and many decimal fractions have no exact binary representation. So what is the IEEE 754 specification? Before we get to that, let’s look at another example.

Look at the code snippet below and check whether you get the answer you expected.
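This second snippet is also missing from the text, so here is a hedged stand-in rather than the author’s code. It assumes a sum of 0.1 and 0.2 compared against 0.3; the class name `SumComparison` is hypothetical.

```java
// Hypothetical example: the values 0.1, 0.2 and 0.3 are assumed for illustration.
class SumComparison {

    // Adds two doubles whose decimal sum would be exactly 0.3.
    static double sum() {
        double a = 0.1;
        double b = 0.2;
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(sum());        // prints 0.30000000000000004
        System.out.println(sum() == 0.3); // prints false
    }
}
```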

Here, too, the answer is not what we expected. So let’s learn about the IEEE 754 standard.

IEEE 754 Standard

Today, the most common representation for real numbers on computers is IEEE Standard 754 floating point, which is used on Intel-based PCs, Macs, and most Unix platforms. An IEEE 754 number has three basic components:

  1. Sign bit
  2. Exponent
  3. Mantissa

Sign bit

This is as straightforward as the title says. A positive number is represented by 0 and a negative number is represented by 1.

Exponent

Both positive and negative exponents must be represented in the exponent field. To get the stored exponent, a bias is added to the real exponent: 127 for single precision and 1023 for double precision.

Mantissa

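The mantissa (also called the significand) holds the significant digits of the number, just as in scientific notation. In a normalised binary number the leading digit is always 1, so it is not stored; only the bits after the binary point are kept.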
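The three components can be read back out of a Java float with `Float.floatToIntBits`. The sketch below is only an illustration: the class name `FloatFields`, its helper methods, and the sample value -6.5 are assumptions, not part of the original article.

```java
// Illustrative sketch: splits a single-precision float into its three fields.
class FloatFields {

    // Sign: the top bit (bit 31). 0 means positive, 1 means negative.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Stored (biased) exponent: the next 8 bits.
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Mantissa: the low 23 bits (the leading 1 is implicit and not stored).
    static int mantissa(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float value = -6.5f; // -6.5 = -1.101 x 2^2 in binary

        System.out.println("sign            = " + sign(value));                   // 1
        System.out.println("stored exponent = " + biasedExponent(value));         // 129
        System.out.println("real exponent   = " + (biasedExponent(value) - 127)); // 2
        System.out.println("mantissa        = "
                + Integer.toBinaryString(mantissa(value))); // 10100000000000000000000
    }
}
```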

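IEEE 754 numbers are mainly divided into three types, which differ in how many bits they give to each of the three components above.

  1. Single precision (32 bits: 1 sign, 8 exponent, 23 mantissa). This is Java’s float.
  2. Double precision (64 bits: 1 sign, 11 exponent, 52 mantissa). This is Java’s double.
  3. Long double precision (extended precision). Java has no primitive type for this.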

Take a look at the example below, which converts 9.1 into IEEE 754 format.
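The worked conversion for 9.1 is not reproduced in the text. As a rough substitute, the sketch below asks Java for the single-precision bit pattern of 9.1 and for the exact decimal value that pattern represents. The class name `NinePointOne` and its helpers are hypothetical.

```java
import java.math.BigDecimal;

// Illustrative sketch: shows how 9.1 is actually stored as a float.
class NinePointOne {

    // The 32-bit pattern as a string, zero-padded on the left.
    static String bits(float f) {
        String raw = Integer.toBinaryString(Float.floatToIntBits(f));
        return String.format("%32s", raw).replace(' ', '0');
    }

    // The exact decimal value of the float that is stored.
    // The float is widened to double (exactly) and passed to BigDecimal(double).
    static String exactValue(float f) {
        return new BigDecimal(f).toString();
    }

    public static void main(String[] args) {
        System.out.println(bits(9.1f));       // 01000001000100011001100110011010
        System.out.println(exactValue(9.1f)); // 9.1000003814697265625
    }
}
```

The fractional part 0.1 is a repeating pattern in binary (0.000110011...), so it has to be rounded to fit in 23 mantissa bits. That rounding is why the stored value is slightly larger than 9.1.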

Now let’s look at a more detailed example. Below, 85.125 is converted to IEEE 754 format.

85.125
85 = 1010101 (binary)
0.125 = 0.001 (binary)
85.125 = 1010101.001
= 1.010101001 x 2^6
sign = 0

1. Single precision:
biased exponent 127+6=133
133 = 10000101
Normalised mantissa = 010101001
we will add 0's to complete the 23 bits

The IEEE 754 Single precision is:
= 0 10000101 01010100100000000000000
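The hand conversion above can be cross-checked in Java. This is a small sketch; the class name `SinglePrecisionCheck` and the method `fields` are made up for the example.

```java
// Illustrative sketch: prints the sign, exponent and mantissa bits of a float,
// separated by spaces, so they can be compared with a manual conversion.
class SinglePrecisionCheck {

    static String fields(float f) {
        String raw = Integer.toBinaryString(Float.floatToIntBits(f));
        String bits = String.format("%32s", raw).replace(' ', '0');
        return bits.substring(0, 1) + " "   // 1 sign bit
             + bits.substring(1, 9) + " "   // 8 exponent bits
             + bits.substring(9);           // 23 mantissa bits
    }

    public static void main(String[] args) {
        System.out.println(fields(85.125f)); // prints 0 10000101 01010100100000000000000
    }
}
```

Because 85.125 is a sum of powers of two (64 + 16 + 4 + 1 + 1/8), it fits exactly, so no rounding takes place here.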

BigDecimal

The solution is to use BigDecimal instead of float or double. BigDecimal is a class in the java.math package. It extends the Number class and implements the Comparable interface. I have used the BigDecimal class to solve the problem described above.
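The author’s BigDecimal snippet is not reproduced in the text, so what follows is a minimal sketch under assumptions. The operands 0.7 and 0.2 are hypothetical; with double, the expression 0.7 - 0.2 evaluates to 0.49999999999999994 rather than 0.5. The class name `BigDecimalFix` is also invented for the example.

```java
import java.math.BigDecimal;

// Hypothetical example: the operands 0.7 and 0.2 are assumed for illustration.
class BigDecimalFix {

    static BigDecimal subtract() {
        // Build the values from strings. new BigDecimal(0.7) would copy the
        // double's binary rounding error into the BigDecimal.
        BigDecimal a = new BigDecimal("0.7");
        BigDecimal b = new BigDecimal("0.2");
        return a.subtract(b);
    }

    public static void main(String[] args) {
        System.out.println(subtract()); // prints 0.5
    }
}
```

Note that BigDecimal objects are immutable, so `subtract` returns a new value instead of changing `a`.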

As you can see, I got exactly the correct answer. The BigDecimal class offers a variety of methods, so have a look at the class documentation and use whichever ones you need.
