Description
Describe the bug
The type for pandas.read_sql_query
is specified by pandas-stubs
as a dictionary mapping from str
to str | bytes | date | timedelta | datetime64 | timedelta64 | int | float | complex
. But for some database drivers like psycopg2
or mysql-connector
, it's also possible to pass in tuples e.g. as the right-hand side of IN
conditions (WHERE ... IN (1, 2, 3)
).
To Reproduce
A minimal example using Postgres (it looks similar for MySQL except the parameter is written as :ids
in the query):
- Launch a Postgres Docker container:
docker run -p 9876:5432 -e POSTGRES_HOST_AUTH_METHOD=trust postgres
- Install dependencies and run this script to see that tuple substitutions for
IN
conditions work fine (I included PEP 723 script metadata so you can just douv run
orpipx run
on it to avoid needing to install dependencies manually):
# To make this executable directly w/ pipx run or uv run:
# /// script
# dependencies = [
# "psycopg2==2.9.9",
# "pandas==2.2.2"
# ]
# ///
import pandas as pd
import psycopg2
conn = psycopg2.connect(
host="localhost", dbname="postgres", user="postgres", port=9876
)
cur = conn.cursor()
cur.execute("CREATE TABLE test (id serial PRIMARY KEY)")
cur.execute("INSERT INTO test(id) VALUES(1)")
cur.execute("INSERT INTO test(id) VALUES(2)")
cur.execute("INSERT INTO test(id) VALUES(3)")
df = pd.read_sql_query(
"SELECT * FROM test WHERE id IN %(ids)s",
conn,
params={"ids": (1, 2)},
)
print(df)
Output is as expected:
id
0 1
1 2
- Run Mypy on it and get this error message:
bug.py:24: error: Dict entry 0 has incompatible type "str": "list[int]"; expected "str": "str | bytes | date | timedelta | datetime64 | timedelta64 | int | float | complex" [dict-item]
Please complete the following information:
- OS: Linux
- OS Version: Ubuntu 22.04
- python version: 3.10
- version of type checker: Mypy 1.11.2
- version of installed
pandas-stubs
: 2.2.2.240909
Additional context
Something like this doesn't work for sqlite3
connections, because it really does only accept "primitive" values for substitutions.
But IMO it doesn't make sense to restrict the set of allowed types to the universally supported subset, because some other weird database could decide to not even support int
substitutions or something like that and then even the current set wouldn't be valid.