Checking Against Infinity and NaN (not a number)

Validating That a Sequence Has No Infinity or Not a Number Values

Given a sequence of floats, it may be necessary to check that there are only finite inputs, since many functions behave poorly with inputs of infinity or not a number (NaN). To do this, I wrote a quick function to check against three values, math.inf, -math.inf and math.nan.

from collections.abc import Sequence
import math

def sequence_values_are_finite(input_seq: Sequence[float]) -> bool:
    """
    Given an input sequence of floats checks that none of the values
    are infinity, negative infinity, or not a number (therefore finite).

    Returns True if all values are finite, otherwise False.
    """

    for val in input_seq:
        if val in [math.inf, -math.inf, math.nan]:
            return False
    return True

I also wrote a series of tests to show that everything works as intended:

import unittest


class TestSequenceValuesAreFinite(unittest.TestCase):
    def test_empty_sequence(self) -> None:
        self.assertTrue(sequence_values_are_finite([]))

    def test_valid_sequence(self) -> None:
        self.assertTrue(sequence_values_are_finite([1.0, 0.0, -42.0, 47102378931.0]))

    def test_has_math_infinity(self) -> None:
        self.assertFalse(sequence_values_are_finite([1.0, math.inf, -2.0]))

    def test_has_negative_math_infinity(self) -> None:
        self.assertFalse(sequence_values_are_finite([1.0, -2.0, -math.inf]))

    def test_has_math_nan(self) -> None:
        self.assertFalse(sequence_values_are_finite([math.nan, 1.0, -2.0]))

Not long after, I started getting reports that NaNs were not being caught by the above function, so I added another test:

    def test_has_float_nan(self) -> None:
        self.assertFalse(sequence_values_are_finite([float("nan"), 1.0, -2.0]))

This test fails: float("nan") does not match math.nan in the membership check, yet for some reason math.nan itself does match.

To validate to myself that it isn’t a difference between creating the values manually with float and the math library, I created tests that use float for the infinities, and these pass. Thus, it looks like only NaNs are exhibiting this behavior:

    def test_has_float_infinity(self) -> None:
        self.assertFalse(sequence_values_are_finite([float("inf"), 1.0, -2.0]))

    def test_has_float_negative_infinity(self) -> None:
        self.assertFalse(sequence_values_are_finite([float("-inf"), 1.0, -2.0]))

The function and all tests are in original_validation_code.py on GitHub.

How does `in` behave?

In my post about iterable membership checks, I discussed the different ways in can go through objects to determine membership, but I didn’t discuss how it does comparisons.

The membership test operations documentation details how this works. Personally, I always thought it used the value comparison operator (==), but that is only half of it. Specifically, it performs both identity and value equality verifications. The check passes if either val1 is val2 or val1 == val2 is True.
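As a rough illustration of that rule, the sketch below writes the list membership check out by hand. The helper name contains is made up for this example and is not part of Python; it only mirrors the "identity first, then equality" behavior described in the documentation.

```python
import math


def contains(needle: object, haystack: list) -> bool:
    """Hypothetical helper mirroring the documented rule for list membership."""
    # An item matches if it is the same object OR if it compares equal.
    return any(needle is item or needle == item for item in haystack)


same_nan = math.nan          # another reference to the object stored in math.nan
other_nan = float("nan")     # a separate NaN object

print(contains(same_nan, [math.nan]))   # True: identity matches
print(contains(other_nan, [math.nan]))  # False: neither identity nor equality matches
print(contains(1.0, [1]))               # True: value equality matches
```

The hand-written helper and the built-in in operator agree on all three cases, which is why the original function caught one kind of NaN but not the other.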

Going back to the original code, I then modified it in value_equality_validation_code.py to not use in but just check against value equality:

def value_equality_based_sequence_values_are_finite(input_seq: Sequence[float]) -> bool:
    for val in input_seq:
        if val == math.inf or val == -math.inf or val == math.nan:
            return False
    return True

Running the same tests as above, now neither math.nan nor float("nan") is detected, so what is the difference?

IEEE 754

IEEE 754 is the “IEEE Standard for Floating-Point Arithmetic”, which is the source of our interesting conundrum. The IEEE 754 Wikipedia article goes into more detail, but from the separate article for NaN or the Python math.nan documentation we learn that NaNs are not equal to anything in IEEE 754, even themselves.

Looking at value comparisons between different nan objects, we see the result is always False:

>>> math.nan == float("nan")
False
>>> float("nan") == float("nan")
False
>>> math.nan == math.nan
False

Therefore, we cannot compare NaN to another NaN for equality. Instead, the math.isnan function should be used to test whether a value is nan.
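A short sketch of what that looks like in practice: math.isnan inspects the value itself, so it gives the same answer no matter how the NaN was produced. The variable names below are only for illustration.

```python
import math

values = [math.nan, float("nan"), float("inf") * 0, 1.0, math.inf]

# math.isnan looks at the value, not at which object holds it.
nan_flags = [math.isnan(v) for v in values]
print(nan_flags)      # [True, True, True, False, False]

# Because NaN never equals itself, "v != v" is True only for NaN values.
self_unequal = [v != v for v in values]
print(self_unequal)   # [True, True, True, False, False]
```

The v != v comparison is shown only to make the IEEE 754 rule visible; math.isnan states the intent far more clearly.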

How does `is` behave with `nan`?

IEEE 754 explains why a check with equality will always fail to find a nan, but in the original code, in also compared by identity, which has nothing to do with IEEE 754! In this case, if you have two references to the same nan object, the is evaluation will return True.

float("nan") will create a new object each time. This is also true for math computations that create a nan, like float("inf") * 0. math.nan is an attribute of the math module and holds a reference to a single float object that has a value of nan. Thus, is comparisons between two references that came from math.nan will evaluate to True, while other comparisons will evaluate to False. Doing similar tests as the value equalities above, we see that only math.nan is math.nan evaluates to True:

>>> math.nan is float("nan")
False
>>> float("nan") is float("nan")
False
>>> math.nan is math.nan
True

Therefore, the difference between how is and == behave for math.nan is why our original function, which used in, caught math.nan but not float("nan").
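The same effect is not specific to math.nan. Here is a small sketch, with made-up variable names, showing that any NaN object is found by in when the list holds that very object, and missed when the list holds a different one:

```python
import math

shared = float("nan")            # one NaN object, kept alive by this name
values = [1.0, shared, -2.0]

print(shared in values)          # True: the list holds the very same object
print(float("nan") in values)    # False: a fresh object, and NaN != NaN
print(any(math.isnan(v) for v in values))  # True: the value-based check finds it
```

So whether in finds a NaN depends on where the object came from, not on its value, which makes it unsuitable for validation.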

What is `math.nan`?

I’d like to make a minor detour to show how Python can cause problems if you are not careful or if there is a malicious actor. The source code for CPython shows that nan is added with PyModule_Add. According to the documentation, PyModule_Add is new in 3.13 and acts similarly to PyModule_AddObjectRef, which will “Add an object to module as name”.

This means that unlike some other values, like False, which is a keyword and cannot be reassigned, math.nan is just a normal module attribute. When the PyModule_Add function is run, a new nan float is created and assigned to math.nan. This does mean that you can overwrite math.nan like most other objects in a module!

A simple line like math.nan = 42 would break every later lookup of math.nan in the running program once that line has executed, because all importers share the same module object. A quick manual run shows this behavior is not protected:

>>> math.nan = 42
>>> math.isnan(math.nan)
False
>>> math.nan == 42
True

The lesson of the day is that Python will sometimes let you do things that you shouldn’t. It would be very easy to overwrite math.nan (or any of the other “constants” in math like e, pi, tau, etc.) and cause very strange and unexpected behavior for later callers.
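To make the scope of the problem concrete, here is a minimal sketch that overwrites math.nan, looks it up through sys.modules the way any other importer effectively would, and then restores it. The variable names are only for this example, and the try/finally is there so the experiment cleans up after itself.

```python
import math
import sys

original_nan = math.nan  # keep a reference so the attribute can be restored

try:
    math.nan = 42  # rebinding a module attribute is allowed
    # Every importer shares the one module object cached in sys.modules.
    seen_elsewhere = sys.modules["math"].nan
    # math.isnan inspects its argument, not the math.nan attribute,
    # so it still recognizes a real NaN value.
    still_detects_nan = math.isnan(float("nan"))
finally:
    math.nan = original_nan  # undo the overwrite

print(seen_elsewhere)        # 42
print(still_detects_nan)     # True
print(math.isnan(math.nan))  # True again after restoring
```

One takeaway from this sketch is that math.isnan does not rely on the math.nan attribute at all, which is one more reason to prefer it over comparing against the "constant".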

How to fix the code?

As discussed earlier, there is the math.isnan function. Similarly, there is math.isinf, which checks whether a number is positive or negative infinity. Instead of having to use two checks, it is possible to just use math.isfinite, which does both checks at the same time. This lets our old code collapse into a one-line solution using all, map, and math.isfinite:

def fixed_sequence_values_are_finite(input_seq: Sequence[float]) -> bool:
    """
    Given an input sequence of floats checks that none of the values
    are infinity, negative infinity, or not a number (therefore finite).

    Returns True if all values are finite, otherwise False.

    """

    return all(map(math.isfinite, input_seq))

The tests are the same as for the original code, and now all pass. The code and tests are all in fixed_validation_code.py on GitHub.
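If it is also useful to know which values failed, the same math.isfinite check can drive a small reporting helper. This is only a sketch: non_finite_indices is a hypothetical name and is not part of the code on GitHub.

```python
import math
from collections.abc import Sequence


def non_finite_indices(input_seq: Sequence[float]) -> list[int]:
    """Hypothetical helper: positions of values that are inf, -inf, or NaN."""
    return [i for i, val in enumerate(input_seq) if not math.isfinite(val)]


samples = [1.0, float("nan"), math.nan, float("-inf"), 0.0, float("inf") * 0]

print(non_finite_indices(samples))            # [1, 2, 3, 5]
print(all(map(math.isfinite, samples)))       # False
print(all(map(math.isfinite, [1.0, -42.0])))  # True
```

Both kinds of NaN and the computed NaN at the end are reported, regardless of which object they came from.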

Conclusion

nan is a special case that needs to be treated with care. Since nan does not equal itself under IEEE 754, checks for it must be done with math.isnan or, if checking for finite values, math.isfinite.

It is also important to understand that in checks not just value equality (==) but also identity (is). While nan is a special case that does not equal itself, other Python objects can show a similar split between identity and value. For example, a dataclass declared with eq=False gets no generated __eq__ method and falls back to identity comparison, so in finds the very same instance but not a separate instance with identical field values.
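A minimal sketch of the dataclass case, using a made-up Reading class purely for illustration:

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Reading:
    """Hypothetical example class; eq=False means no __eq__ is generated."""

    value: float


first = Reading(1.0)
twin = Reading(1.0)  # same field values, but a different object

print(first in [first])  # True: identity matches
print(twin in [first])   # False: no value-based __eq__, so only identity counts
```

As with nan, whether the membership check succeeds depends on which object is in the list rather than on what it contains.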