Inconsistent behavior serializing python numbers and decimals #826

Open
ucwillg opened this issue Mar 19, 2025 · 2 comments

ucwillg commented Mar 19, 2025

Description

Hello all - we use msgspec to encode API call responses as JSON.

We recently discovered a discrepancy between how Python floats and decimals are encoded.

In particular, NaN and Infinity float values are encoded as null, but NaN and Infinity Decimal values are written out verbatim (as NaN and Infinity), which produces invalid JSON.

We prefer the behavior of encoding these invalid numeric values as null.

I couldn't find a way to work around this without preprocessing our responses before passing them to msgspec - it doesn't seem like there is a way to add a custom hook for the encoding of known types like decimal. However, it is possible that I missed a way to do so.

Can you all please advise on whether this behavior is expected or not?

Replication code:


import msgspec
import math
import decimal

msgspec_encoder = msgspec.json.Encoder(decimal_format="number")


def encode(arg: object) -> str:
    return msgspec_encoder.encode(arg).decode()


def test(arg: object) -> None:
    wrapped_arg = {"value": arg}
    print(f"`{wrapped_arg}` encodes as `{encode(wrapped_arg)}`")

print(f"Version: {msgspec.__version__}")
test(1)
test(math.inf)
test(math.nan)
test(decimal.Decimal(1))
test(decimal.Decimal("Infinity"))
test(decimal.Decimal("NaN"))

Output:

Version: 0.19.0
`{'value': 1}` encodes as `{"value":1}`
`{'value': inf}` encodes as `{"value":null}`
`{'value': nan}` encodes as `{"value":null}`
`{'value': Decimal('1')}` encodes as `{"value":1}`
`{'value': Decimal('Infinity')}` encodes as `{"value":Infinity}`
`{'value': Decimal('NaN')}` encodes as `{"value":NaN}`

@provinzkraut

Not saying this is invalid, but it's documented behaviour at least:

decimal.Decimal values are encoded as their string representation in all protocols by default. This ensures no precision loss during serialization, as would happen with a float representation.

Floats map to floats in all supported protocols. Note that per RFC8259, JSON doesn’t support nonfinite numbers (nan, infinity, -infinity); msgspec.json handles this by encoding these values as null.

https://jcristharif.com/msgspec/supported-types.html#decimal
https://jcristharif.com/msgspec/supported-types.html#float

ucwillg commented Mar 20, 2025

That's fair. However, the next section of the docs on decimals discusses the mode we're using here, and it doesn't have a note similar to the one you cited for floats.

For JSON and MessagePack you may instead encode decimal values the same as numbers by creating a Encoder and specifying decimal_format='number'.

In any case, the default behavior of either encoding mode is less important to us than having some way to efficiently encode all valid decimal.Decimal values as valid JSON, including NaN and Infinity values. We're happy to configure something custom, but it's not clear to me if there's a way to achieve this behavior without preprocessing the entire data structure before handing it off to msgspec.
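
For reference, the preprocessing workaround mentioned above could look roughly like the sketch below. It is only an illustration, not something msgspec provides: the helper name replace_nonfinite_decimals is hypothetical, it uses only the standard library, and it assumes the payload is built from plain dicts, lists, and tuples (other containers, such as msgspec Structs or dataclasses, would need their own branches).

```python
import decimal
from typing import Any


def replace_nonfinite_decimals(obj: Any) -> Any:
    """Return a copy of obj with NaN/Infinity Decimal values replaced by None.

    Hypothetical helper: only dicts, lists, and tuples are traversed.
    """
    if isinstance(obj, decimal.Decimal):
        # is_finite() is False for NaN, sNaN, Infinity, and -Infinity.
        return obj if obj.is_finite() else None
    if isinstance(obj, dict):
        return {k: replace_nonfinite_decimals(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Tuples come back as lists; both encode as JSON arrays anyway.
        return [replace_nonfinite_decimals(v) for v in obj]
    # Everything else (ints, floats, strings, None, ...) is left untouched.
    return obj


payload = {
    "a": decimal.Decimal("1.5"),
    "b": [decimal.Decimal("NaN"), decimal.Decimal("-Infinity")],
}
cleaned = replace_nonfinite_decimals(payload)
print(cleaned)  # {'a': Decimal('1.5'), 'b': [None, None]}
```

The cleaned structure can then be handed to an encoder such as the msgspec_encoder from the replication code. The cost is the one described above: a full extra pass over (and copy of) the data before encoding, which is what we would like to avoid.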
