Inconsistent typing for DataFrame.to_json #1179

Closed
@sk-

Describe the bug
The stubs for DataFrame.to_json reject binary buffers, even though the original type annotations in pandas accept them, both in the base class NDFrame and in the underlying JSON module.

This could be fixed by simply adding WriteBuffer[bytes] as a valid argument type. A better fix would probably be to accept a binary buffer only when compression is set, although I am not sure whether there are other cases in which a binary buffer is accepted without compression.
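
The stricter option could look roughly like the sketch below. It is self-contained and does not use pandas or pandas-stubs: `to_json_sketch` is a hypothetical stand-in function, the `WriteBuffer` protocol here is a simplified placeholder rather than the one used by the stubs, and only `"gzip"` is modeled as a compression value. The real stubs have more overloads and parameters, so treat this only as an illustration of pairing the buffer type with the compression argument.

```python
import gzip
import io
import json
from typing import Any, Literal, Optional, Protocol, TypeVar, overload

T_contra = TypeVar("T_contra", contravariant=True)


class WriteBuffer(Protocol[T_contra]):
    """Simplified, hypothetical stand-in for a writable buffer protocol."""

    def write(self, data: T_contra, /) -> Any: ...


# Text buffers are accepted only when no compression is requested.
@overload
def to_json_sketch(
    data: dict[str, Any],
    path_or_buf: WriteBuffer[str],
    compression: None = ...,
) -> None: ...


# Binary buffers are accepted only together with a compression method.
@overload
def to_json_sketch(
    data: dict[str, Any],
    path_or_buf: WriteBuffer[bytes],
    compression: Literal["gzip"],
) -> None: ...


def to_json_sketch(
    data: dict[str, Any],
    path_or_buf: Any,
    compression: Optional[Literal["gzip"]] = None,
) -> None:
    text = json.dumps(data)
    if compression == "gzip":
        # Compressed output is bytes, so it needs a binary buffer.
        path_or_buf.write(gzip.compress(text.encode("utf-8")))
    else:
        path_or_buf.write(text)


binary_buffer = io.BytesIO()
to_json_sketch({"a": 1}, binary_buffer, compression="gzip")  # accepted

text_buffer = io.StringIO()
to_json_sketch({"a": 1}, text_buffer)  # accepted

# A type checker should reject this call, because a text buffer is paired
# with a compression method:
# to_json_sketch({"a": 1}, text_buffer, compression="gzip")
```

With overloads shaped like this, passing a BytesIO without compression would also be rejected, which is the trade-off of the stricter variant.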

To Reproduce

  1. Provide a minimal runnable pandas example that is not properly checked by the stubs.
import io

import pandas as pd

buffer = io.BytesIO()

df = pd.DataFrame()
df.to_json(buffer, compression="gzip")

print(len(buffer.getvalue()))

Note that if we change the buffer to a StringIO as suggested by the types we get the runtime warning:

pandas_types.py:8: RuntimeWarning: compression has no effect when passing a non-binary object as input.
  df.to_json(buffer, compression="gzip")

and compression is disabled.

  2. Indicate which type checker you are using (mypy or pyright).
Both

  3. Show the error message received from that type checker while checking your example.
Mypy

pandas_types.py:8: error: No overload variant of "to_json" of "DataFrame" matches argument types "BytesIO", "str"  [call-overload]
pandas_types.py:8: note: Possible overload variants:
pandas_types.py:8: note:     def to_json(self, path_or_buf: str | PathLike[str] | WriteBuffer[str], *, orient: Literal['records'], date_format: Literal['epoch', 'iso'] | None = ..., double_precision: int = ..., force_ascii: bool = ..., date_unit: Literal['s', 'ms', 'us', 'ns'] = ..., default_handler: Callable[[Any], str | float | bool | list[Any] | dict[Any, Any]] | None = ..., lines: Literal[True], compression: Literal['infer', 'gzip', 'bz2', 'zip', 'xz', 'zstd'] | dict[str, Any] | None = ..., index: bool = ..., indent: int | None = ..., mode: Literal['a']) -> None
pandas_types.py:8: note:     def to_json(self, path_or_buf: None = ..., *, orient: Literal['records'], date_format: Literal['epoch', 'iso'] | None = ..., double_precision: int = ..., force_ascii: bool = ..., date_unit: Literal['s', 'ms', 'us', 'ns'] = ..., default_handler: Callable[[Any], str | float | bool | list[Any] | dict[Any, Any]] | None = ..., lines: Literal[True], compression: Literal['infer', 'gzip', 'bz2', 'zip', 'xz', 'zstd'] | dict[str, Any] | None = ..., index: bool = ..., indent: int | None = ..., mode: Literal['a']) -> str
pandas_types.py:8: note:     def to_json(self, path_or_buf: None = ..., orient: Literal['split', 'records', 'index', 'columns', 'values', 'table'] | None = ..., date_format: Literal['epoch', 'iso'] | None = ..., double_precision: int = ..., force_ascii: bool = ..., date_unit: Literal['s', 'ms', 'us', 'ns'] = ..., default_handler: Callable[[Any], str | float | bool | list[Any] | dict[Any, Any]] | None = ..., lines: bool = ..., compression: Literal['infer', 'gzip', 'bz2', 'zip', 'xz', 'zstd'] | dict[str, Any] | None = ..., index: bool = ..., indent: int | None = ..., mode: Literal['w'] = ...) -> str
pandas_types.py:8: note:     def to_json(self, path_or_buf: str | PathLike[str] | WriteBuffer[str], orient: Literal['split', 'records', 'index', 'columns', 'values', 'table'] | None = ..., date_format: Literal['epoch', 'iso'] | None = ..., double_precision: int = ..., force_ascii: bool = ..., date_unit: Literal['s', 'ms', 'us', 'ns'] = ..., default_handler: Callable[[Any], str | float | bool | list[Any] | dict[Any, Any]] | None = ..., lines: bool = ..., compression: Literal['infer', 'gzip', 'bz2', 'zip', 'xz', 'zstd'] | dict[str, Any] | None = ..., index: bool = ..., indent: int | None = ..., mode: Literal['w'] = ...) -> None
Found 1 error in 1 file (checked 1 source file)

Pyright

pandas_types.py:8:1 - error: No overloads for "to_json" match the provided arguments (reportCallIssue)
pandas_types.py:8:12 - error: Argument of type "BytesIO" cannot be assigned to parameter "path_or_buf" of type "FilePath | WriteBuffer[str]" in function "to_json"
    Type "BytesIO" is not assignable to type "FilePath | WriteBuffer[str]"
      "BytesIO" is not assignable to "str"
      "BytesIO" is incompatible with protocol "PathLike[str]"
        "__fspath__" is not present
      "BytesIO" is incompatible with protocol "WriteBuffer[str]"
        "write" is an incompatible type
          Type "(buffer: ReadableBuffer, /) -> int" is not assignable to type "(__b: AnyStr_con@WriteBuffer, /) -> Any"
            Parameter 1: type "AnyStr_con@WriteBuffer" is incompatible with type "ReadableBuffer"
    ... (reportArgumentType)
2 errors, 0 warnings, 0 informations
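
The `write` mismatch that pyright reports has a runtime counterpart, which the following sketch illustrates using only the standard library. It does not reproduce what pandas does with a StringIO (pandas warns and disables compression, as noted above); it only shows that gzip output is bytes, which a text buffer cannot store.

```python
import gzip
import io

compressed = gzip.compress(b"{}")  # compression always yields bytes

binary_buffer = io.BytesIO()
binary_buffer.write(compressed)  # a binary buffer accepts the bytes

text_buffer = io.StringIO()
try:
    # Deliberately wrong: a text buffer only accepts str.
    text_buffer.write(compressed)  # type: ignore[arg-type]
    text_rejected = False
except TypeError:
    text_rejected = True

print(text_rejected)
```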

Please complete the following information:

  • OS: macOS
  • OS Version: 15.3.1
  • python version: 3.12.9
  • version of type checker: mypy 1.15.0 (compiled: yes), pyright 1.1.398
  • version of installed pandas-stubs: 2.2.3.250308

Labels: IO JSON (read_json, to_json, json_normalize)