2007-08-15 16:28:22 +02:00
|
|
|
.. _tut-fp-issues:
|
|
|
|
|
|
|
|
**************************************************
|
|
|
|
Floating Point Arithmetic: Issues and Limitations
|
|
|
|
**************************************************
|
|
|
|
|
|
|
|
.. sectionauthor:: Tim Peters <tim_one@users.sourceforge.net>
|
|
|
|
|
|
|
|
|
|
|
|
Floating-point numbers are represented in computer hardware as base 2 (binary)
|
|
|
|
fractions. For example, the decimal fraction ::
|
|
|
|
|
|
|
|
0.125
|
|
|
|
|
|
|
|
has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction ::
|
|
|
|
|
|
|
|
0.001
|
|
|
|
|
|
|
|
has value 0/2 + 0/4 + 1/8. These two fractions have identical values, the only
|
|
|
|
real difference being that the first is written in base 10 fractional notation,
|
|
|
|
and the second in base 2.
|
|
|
|
|
|
|
|
Unfortunately, most decimal fractions cannot be represented exactly as binary
|
|
|
|
fractions. A consequence is that, in general, the decimal floating-point
|
|
|
|
numbers you enter are only approximated by the binary floating-point numbers
|
|
|
|
actually stored in the machine.
|
|
|
|
|
|
|
|
The problem is easier to understand at first in base 10. Consider the fraction
|
|
|
|
1/3. You can approximate that as a base 10 fraction::
|
|
|
|
|
|
|
|
0.3
|
|
|
|
|
|
|
|
or, better, ::
|
|
|
|
|
|
|
|
0.33
|
|
|
|
|
|
|
|
or, better, ::
|
|
|
|
|
|
|
|
0.333
|
|
|
|
|
|
|
|
and so on. No matter how many digits you're willing to write down, the result
|
|
|
|
will never be exactly 1/3, but will be an increasingly better approximation of
|
|
|
|
1/3.
|
|
|
|
|
|
|
|
In the same way, no matter how many base 2 digits you're willing to use, the
|
|
|
|
decimal value 0.1 cannot be represented exactly as a base 2 fraction. In base
|
|
|
|
2, 1/10 is the infinitely repeating fraction ::
|
|
|
|
|
|
|
|
0.0001100110011001100110011001100110011001100110011...
|
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
Stop at any finite number of bits, and you get an approximation. On most
|
|
|
|
machines today, floats are approximated using a binary fraction with
|
|
|
|
the numerator using the first 53 bits following the most significant bit and
|
|
|
|
with the denominator as a power of two. In the case of 1/10, the binary fraction
|
|
|
|
is ``3602879701896397 / 2 ** 55`` which is close to but not exactly
|
|
|
|
equal to the true value of 1/10.
|
|
|
|
|
|
|
|
Many users are not aware of the approximation because of the way values are
|
|
|
|
displayed. Python only prints a decimal approximation to the true decimal
|
|
|
|
value of the binary approximation stored by the machine. On most machines, if
|
|
|
|
Python were to print the true decimal value of the binary approximation stored
|
|
|
|
for 0.1, it would have to display ::
|
2007-08-15 16:28:22 +02:00
|
|
|
|
|
|
|
>>> 0.1
|
2009-04-24 05:09:06 +02:00
|
|
|
0.1000000000000000055511151231257827021181583404541015625
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
That is more digits than most people find useful, so Python keeps the number
|
|
|
|
of digits manageable by displaying a rounded value instead ::
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
>>> 1 / 10
|
|
|
|
0.1
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
Just remember, even though the printed result looks like the exact value
|
|
|
|
of 1/10, the actual stored value is the nearest representable binary fraction.
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
Interestingly, there are many different decimal numbers that share the same
|
|
|
|
nearest approximate binary fraction. For example, the numbers ``0.1`` and
|
|
|
|
``0.10000000000000001`` and
|
|
|
|
``0.1000000000000000055511151231257827021181583404541015625`` are all
|
|
|
|
approximated by ``3602879701896397 / 2 ** 55``. Since all of these decimal
|
2009-04-24 21:06:29 +02:00
|
|
|
values share the same approximation, any one of them could be displayed
|
2009-04-24 05:09:06 +02:00
|
|
|
while still preserving the invariant ``eval(repr(x)) == x``.
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
Historically, the Python prompt and built-in :func:`repr` function would chose
|
|
|
|
the one with 17 significant digits, ``0.10000000000000001``, Starting with
|
|
|
|
Python 3.1, Python (on most systems) is now able to choose the shortest of
|
|
|
|
these and simply display ``0.1``.
|
2007-08-15 16:28:22 +02:00
|
|
|
|
|
|
|
Note that this is in the very nature of binary floating-point: this is not a bug
|
|
|
|
in Python, and it is not a bug in your code either. You'll see the same kind of
|
|
|
|
thing in all languages that support your hardware's floating-point arithmetic
|
|
|
|
(although some languages may not *display* the difference by default, or in all
|
|
|
|
output modes).
|
|
|
|
|
2009-02-21 21:59:32 +01:00
|
|
|
Python's built-in :func:`str` function produces only 12 significant digits, and
|
2007-08-15 16:28:22 +02:00
|
|
|
you may wish to use that instead. It's unusual for ``eval(str(x))`` to
|
|
|
|
reproduce *x*, but the output may be more pleasant to look at::
|
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
>>> str(math.pi)
|
|
|
|
'3.14159265359'
|
|
|
|
|
|
|
|
>>> repr(math.pi)
|
|
|
|
'3.141592653589793'
|
|
|
|
|
|
|
|
>>> format(math.pi, '.2f')
|
|
|
|
'3.14'
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
It's important to realize that this is, in a real sense, an illusion: you're
|
|
|
|
simply rounding the *display* of the true machine value.
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2009-04-26 22:10:50 +02:00
|
|
|
One illusion may beget another. For example, since 0.1 is not exactly 1/10,
|
2009-04-26 23:37:46 +02:00
|
|
|
summing three values of 0.1 may not yield exactly 0.3, either::
|
|
|
|
|
|
|
|
>>> .1 + .1 + .1 == .3
|
|
|
|
False
|
|
|
|
|
|
|
|
Also, since the 0.1 cannot get any closer to the exact value of 1/10 and
|
|
|
|
0.3 cannot get any closer to the exact value of 3/10, then pre-rounding with
|
|
|
|
:func:`round` function cannot help::
|
|
|
|
|
|
|
|
>>> round(.1, 1) + round(.1, 1) + round(.1, 1) == round(.3, 1)
|
|
|
|
False
|
|
|
|
|
|
|
|
Though the numbers cannot be made closer to their intended exact values,
|
|
|
|
the :func:`round` function can be useful for post-rounding so that results
|
|
|
|
have inexact values that are comparable to one another::
|
|
|
|
|
|
|
|
>>> round(.1 + .1 + .1, 1) == round(.3, 1)
|
|
|
|
True
|
2007-08-15 16:28:22 +02:00
|
|
|
|
|
|
|
Binary floating-point arithmetic holds many surprises like this. The problem
|
|
|
|
with "0.1" is explained in precise detail below, in the "Representation Error"
|
|
|
|
section. See `The Perils of Floating Point <http://www.lahey.com/float.htm>`_
|
|
|
|
for a more complete account of other common surprises.
|
|
|
|
|
|
|
|
As that says near the end, "there are no easy answers." Still, don't be unduly
|
|
|
|
wary of floating-point! The errors in Python float operations are inherited
|
|
|
|
from the floating-point hardware, and on most machines are on the order of no
|
|
|
|
more than 1 part in 2\*\*53 per operation. That's more than adequate for most
|
|
|
|
tasks, but you do need to keep in mind that it's not decimal arithmetic, and
|
|
|
|
that every float operation can suffer a new rounding error.
|
|
|
|
|
|
|
|
While pathological cases do exist, for most casual use of floating-point
|
|
|
|
arithmetic you'll see the result you expect in the end if you simply round the
|
|
|
|
display of your final results to the number of decimal digits you expect.
|
2008-05-26 03:03:56 +02:00
|
|
|
:func:`str` usually suffices, and for finer control see the :meth:`str.format`
|
|
|
|
method's format specifiers in :ref:`formatstrings`.
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2008-10-05 19:57:52 +02:00
|
|
|
For use cases which require exact decimal representation, try using the
|
|
|
|
:mod:`decimal` module which implements decimal arithmetic suitable for
|
|
|
|
accounting applications and high-precision applications.
|
|
|
|
|
|
|
|
Another form of exact arithmetic is supported by the :mod:`fractions` module
|
|
|
|
which implements arithmetic based on rational numbers (so the numbers like
|
|
|
|
1/3 can be represented exactly).
|
|
|
|
|
2007-08-31 05:25:11 +02:00
|
|
|
If you are a heavy user of floating point operations you should take a look
|
|
|
|
at the Numerical Python package and many other packages for mathematical and
|
|
|
|
statistical operations supplied by the SciPy project. See <http://scipy.org>.
|
2008-10-05 18:46:29 +02:00
|
|
|
|
|
|
|
Python provides tools that may help on those rare occasions when you really
|
|
|
|
*do* want to know the exact value of a float. The
|
|
|
|
:meth:`float.as_integer_ratio` method expresses the value of a float as a
|
|
|
|
fraction::
|
|
|
|
|
|
|
|
>>> x = 3.14159
|
|
|
|
>>> x.as_integer_ratio()
|
|
|
|
(3537115888337719L, 1125899906842624L)
|
|
|
|
|
|
|
|
Since the ratio is exact, it can be used to losslessly recreate the
|
|
|
|
original value::
|
|
|
|
|
|
|
|
>>> x == 3537115888337719 / 1125899906842624
|
|
|
|
True
|
|
|
|
|
|
|
|
The :meth:`float.hex` method expresses a float in hexadecimal (base
|
|
|
|
16), again giving the exact value stored by your computer::
|
|
|
|
|
|
|
|
>>> x.hex()
|
|
|
|
'0x1.921f9f01b866ep+1'
|
|
|
|
|
|
|
|
This precise hexadecimal representation can be used to reconstruct
|
|
|
|
the float value exactly::
|
|
|
|
|
|
|
|
>>> x == float.fromhex('0x1.921f9f01b866ep+1')
|
|
|
|
True
|
|
|
|
|
|
|
|
Since the representation is exact, it is useful for reliably porting values
|
|
|
|
across different versions of Python (platform independence) and exchanging
|
|
|
|
data with other languages that support the same format (such as Java and C99).
|
|
|
|
|
2009-04-27 00:01:46 +02:00
|
|
|
Another helpful tool is the :func:`math.fsum` function which helps mitigate
|
|
|
|
loss-of-precision during summation. It tracks "lost digits" as values are
|
|
|
|
added onto a running total. That can make a difference in overall accuracy
|
|
|
|
so that the errors do not accumulate to the point where they affect the
|
|
|
|
final total:
|
|
|
|
|
|
|
|
>>> sum([0.1] * 10) == 1.0
|
|
|
|
False
|
|
|
|
>>> math.fsum([0.1] * 10) == 1.0
|
|
|
|
True
|
2008-10-05 18:46:29 +02:00
|
|
|
|
2007-08-15 16:28:22 +02:00
|
|
|
.. _tut-fp-error:
|
|
|
|
|
|
|
|
Representation Error
|
|
|
|
====================
|
|
|
|
|
|
|
|
This section explains the "0.1" example in detail, and shows how you can perform
|
|
|
|
an exact analysis of cases like this yourself. Basic familiarity with binary
|
|
|
|
floating-point representation is assumed.
|
|
|
|
|
|
|
|
:dfn:`Representation error` refers to the fact that some (most, actually)
|
|
|
|
decimal fractions cannot be represented exactly as binary (base 2) fractions.
|
|
|
|
This is the chief reason why Python (or Perl, C, C++, Java, Fortran, and many
|
2009-04-24 05:09:06 +02:00
|
|
|
others) often won't display the exact decimal number you expect.
|
2007-08-15 16:28:22 +02:00
|
|
|
|
|
|
|
Why is that? 1/10 is not exactly representable as a binary fraction. Almost all
|
|
|
|
machines today (November 2000) use IEEE-754 floating point arithmetic, and
|
|
|
|
almost all platforms map Python floats to IEEE-754 "double precision". 754
|
|
|
|
doubles contain 53 bits of precision, so on input the computer strives to
|
Merged revisions 69129-69131,69139-69140,69143,69154-69159,69169,69288-69289,69293,69297-69301,69348 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r69129 | benjamin.peterson | 2009-01-30 19:42:55 -0600 (Fri, 30 Jan 2009) | 1 line
check the errno in bad fd cases
........
r69130 | andrew.kuchling | 2009-01-30 20:50:09 -0600 (Fri, 30 Jan 2009) | 1 line
Add a section
........
r69131 | andrew.kuchling | 2009-01-30 21:26:02 -0600 (Fri, 30 Jan 2009) | 1 line
Text edits and markup fixes
........
r69139 | mark.dickinson | 2009-01-31 10:44:04 -0600 (Sat, 31 Jan 2009) | 2 lines
Add an extra test for long <-> float hash equivalence.
........
r69140 | benjamin.peterson | 2009-01-31 10:52:03 -0600 (Sat, 31 Jan 2009) | 1 line
PyErr_BadInternalCall() raises a SystemError, not TypeError #5112
........
r69143 | benjamin.peterson | 2009-01-31 15:00:10 -0600 (Sat, 31 Jan 2009) | 1 line
I believe the intention here was to avoid a global lookup
........
r69154 | benjamin.peterson | 2009-01-31 16:33:02 -0600 (Sat, 31 Jan 2009) | 1 line
fix indentation in comment
........
r69155 | david.goodger | 2009-01-31 16:53:46 -0600 (Sat, 31 Jan 2009) | 1 line
markup fix
........
r69156 | gregory.p.smith | 2009-01-31 16:57:30 -0600 (Sat, 31 Jan 2009) | 4 lines
- Issue #5104: The socket module now raises OverflowError when 16-bit port and
protocol numbers are supplied outside the allowed 0-65536 range on bind()
and getservbyport().
........
r69157 | benjamin.peterson | 2009-01-31 17:43:25 -0600 (Sat, 31 Jan 2009) | 1 line
add explanatory comment
........
r69158 | benjamin.peterson | 2009-01-31 17:54:38 -0600 (Sat, 31 Jan 2009) | 1 line
more flags which only work for function blocks
........
r69159 | gregory.p.smith | 2009-01-31 18:16:01 -0600 (Sat, 31 Jan 2009) | 2 lines
Update doc wording as suggested in issue4903.
........
r69169 | guilherme.polo | 2009-01-31 20:56:16 -0600 (Sat, 31 Jan 2009) | 3 lines
Restore Tkinter.Tk._loadtk so this test doesn't fail for problems
related to ttk.
........
r69288 | georg.brandl | 2009-02-05 04:30:57 -0600 (Thu, 05 Feb 2009) | 1 line
#5153: fix typo in example.
........
r69289 | georg.brandl | 2009-02-05 04:37:07 -0600 (Thu, 05 Feb 2009) | 1 line
#5144: document that PySys_SetArgv prepends the script directory (or the empty string) to sys.path.
........
r69293 | georg.brandl | 2009-02-05 04:59:28 -0600 (Thu, 05 Feb 2009) | 1 line
#5059: fix example.
........
r69297 | georg.brandl | 2009-02-05 05:32:18 -0600 (Thu, 05 Feb 2009) | 1 line
#5015: document PythonHome API functions.
........
r69298 | georg.brandl | 2009-02-05 05:33:21 -0600 (Thu, 05 Feb 2009) | 1 line
#4827: fix callback example.
........
r69299 | georg.brandl | 2009-02-05 05:35:28 -0600 (Thu, 05 Feb 2009) | 1 line
#4820: use correct module for ctypes.util.
........
r69300 | georg.brandl | 2009-02-05 05:38:23 -0600 (Thu, 05 Feb 2009) | 1 line
#4563: disable alpha and roman lists, fixes wrong formatting of contributor list.
........
r69301 | georg.brandl | 2009-02-05 05:40:35 -0600 (Thu, 05 Feb 2009) | 1 line
#5031: fix Thread.daemon property docs.
........
r69348 | benjamin.peterson | 2009-02-05 19:47:31 -0600 (Thu, 05 Feb 2009) | 1 line
fix download link
........
2009-02-06 03:40:07 +01:00
|
|
|
convert 0.1 to the closest fraction it can of the form *J*/2**\ *N* where *J* is
|
2007-08-15 16:28:22 +02:00
|
|
|
an integer containing exactly 53 bits. Rewriting ::
|
|
|
|
|
|
|
|
1 / 10 ~= J / (2**N)
|
|
|
|
|
|
|
|
as ::
|
|
|
|
|
|
|
|
J ~= 2**N / 10
|
|
|
|
|
|
|
|
and recalling that *J* has exactly 53 bits (is ``>= 2**52`` but ``< 2**53``),
|
|
|
|
the best value for *N* is 56::
|
|
|
|
|
|
|
|
>>> 2**52
|
2008-08-10 14:16:45 +02:00
|
|
|
4503599627370496
|
2007-08-15 16:28:22 +02:00
|
|
|
>>> 2**53
|
2008-08-10 14:16:45 +02:00
|
|
|
9007199254740992
|
2007-08-15 16:28:22 +02:00
|
|
|
>>> 2**56/10
|
2008-08-10 14:16:45 +02:00
|
|
|
7205759403792794.0
|
2007-08-15 16:28:22 +02:00
|
|
|
|
|
|
|
That is, 56 is the only value for *N* that leaves *J* with exactly 53 bits. The
|
|
|
|
best possible value for *J* is then that quotient rounded::
|
|
|
|
|
|
|
|
>>> q, r = divmod(2**56, 10)
|
|
|
|
>>> r
|
2008-08-10 14:16:45 +02:00
|
|
|
6
|
2007-08-15 16:28:22 +02:00
|
|
|
|
|
|
|
Since the remainder is more than half of 10, the best approximation is obtained
|
|
|
|
by rounding up::
|
|
|
|
|
|
|
|
>>> q+1
|
2008-08-10 14:16:45 +02:00
|
|
|
7205759403792794
|
2007-08-15 16:28:22 +02:00
|
|
|
|
|
|
|
Therefore the best possible approximation to 1/10 in 754 double precision is
|
|
|
|
that over 2\*\*56, or ::
|
|
|
|
|
|
|
|
7205759403792794 / 72057594037927936
|
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
Dividing both the numerator and denominator by two reduces the fraction to::
|
|
|
|
|
|
|
|
3602879701896397 / 36028797018963968
|
|
|
|
|
2007-08-15 16:28:22 +02:00
|
|
|
Note that since we rounded up, this is actually a little bit larger than 1/10;
|
|
|
|
if we had not rounded up, the quotient would have been a little bit smaller than
|
|
|
|
1/10. But in no case can it be *exactly* 1/10!
|
|
|
|
|
|
|
|
So the computer never "sees" 1/10: what it sees is the exact fraction given
|
|
|
|
above, the best 754 double approximation it can get::
|
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
>>> 0.1 * 2 ** 55
|
|
|
|
3602879701896397.0
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
If we multiply that fraction by 10\*\*60, we can see the value of out to
|
|
|
|
60 decimal digits::
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
>>> 3602879701896397 * 10 ** 60 // 2 ** 55
|
|
|
|
1000000000000000055511151231257827021181583404541015625
|
2007-08-15 16:28:22 +02:00
|
|
|
|
|
|
|
meaning that the exact number stored in the computer is approximately equal to
|
|
|
|
the decimal value 0.100000000000000005551115123125. Rounding that to 17
|
|
|
|
significant digits gives the 0.10000000000000001 that Python displays (well,
|
|
|
|
will display on any 754-conforming platform that does best-possible input and
|
|
|
|
output conversions in its C library --- yours may not!).
|
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
The :mod:`fractions` and :mod:`decimal` modules make these calculations
|
|
|
|
easy::
|
2007-08-15 16:28:22 +02:00
|
|
|
|
2009-04-24 05:09:06 +02:00
|
|
|
>>> from decimal import Decimal
|
|
|
|
>>> from fractions import Fraction
|
|
|
|
>>> print(Fraction.from_float(0.1))
|
|
|
|
3602879701896397/36028797018963968
|
|
|
|
>>> print(Decimal.from_float(0.1))
|
|
|
|
0.1000000000000000055511151231257827021181583404541015625
|