
Conversation

anuunchin (Contributor):

This PR adjusts sqla_col_to_column_schema in dlt/sources/sql_database/schema_types.py so that it correctly infers the bigint dlt type for Oracle NUMBER columns that represent integers. Previously, they were inferred as "double", which caused issues downstream when pyarrow tried to build float values from very large integers. An appropriate test is added.

Resolves #3133
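
A minimal sketch of the downstream problem, assuming an illustrative 23-digit value (not taken from the PR):

import pyarrow as pa

# a 23-digit integer is valid for Oracle NUMBER(38,0) but exceeds the 53-bit
# mantissa of a float64, so a "double" column cannot hold it exactly
big = 12345678901234567890123
assert int(float(big)) != big  # silent rounding already in plain Python

# the same loss shows up once pyarrow materializes the value as float64
arr = pa.array([float(big)], type=pa.float64())
print(arr[0].as_py())  # low-order digits are gone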

@anuunchin anuunchin self-assigned this Sep 29, 2025

@sh-rp (Collaborator) left a comment:

LGTM!

@rudolfix (Collaborator) left a comment:

I'm not sure the solution here is correct. See my comments. @jflipts, are you able to review this? If not, we'll need to run this through a real Oracle DB.

# check for Numeric type first and integer later, some numeric types (ie. Oracle)
# derive from both
# special handling for oracle NUMBER types
if isinstance(sql_t, NUMBER) and sql_t.scale is None or sql_t.scale == 0:
Collaborator:

I'm not sure this can be implemented this way. What about NUMERIC with 38-digit precision and scale 0? We can't convert it into an integer. This will work in Python (unlimited integers) but won't work with Arrow, which will not coerce such data.

We do not have Oracle in our test setup, so we cannot check it ourselves. My take would be to ask @jflipts to take a look at this. Btw, in the issue a different method is used (._type_affinity).
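
A minimal sketch of this concern, assuming an illustrative 21-digit value: Python ints are unbounded, but Arrow's int64 is not, so mapping every zero-scale NUMBER to "bigint" breaks as soon as a value exceeds 64 bits.

import pyarrow as pa

too_big_for_int64 = 10**20  # fine for NUMBER(38,0), outside the int64 range

try:
    pa.array([too_big_for_int64], type=pa.int64())
except (OverflowError, pa.ArrowInvalid) as exc:
    print(f"arrow will not coerce it: {exc!r}")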

anuunchin (Contributor, PR author):

I see how it's not the correct way to do this.

PS: the initial implementation of this PR was based on ._type_affinity, but is it really okay to use private attributes like this one? 👀

jflipts:

I'm not an Oracle expert either. I'm just tasked with implementing extracts from an existing Oracle DB. ._type_affinity uses the same implementation as is used here.

If I understand it correctly, Arrow doesn't support 128-bit integers? I'm not sure how you would deal with that. Maybe interpret it as a decimal so it is coerced to https://arrow.apache.org/docs/python/generated/pyarrow.decimal128.html?
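
A minimal sketch of that decimal128 suggestion, assuming an illustrative value:

from decimal import Decimal
import pyarrow as pa

value = Decimal(10) ** 20  # too wide for int64, exact in a 38-digit decimal
arr = pa.array([value], type=pa.decimal128(38, 0))
print(arr[0].as_py() == value)  # True, no precision loss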

@rudolfix (Collaborator) commented on Oct 5, 2025:

Oracle does not have 128-bit either. What our current fix does is interpret any NUMERIC with 0 scale as an integer Python type, while it should simply be DECIMAL with 0 scale.

@anuunchin let's go for type affinity. And we'll add Oracle to our tests because more problems are popping up.
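
A rough sketch of what a type-affinity-based check could look like; the helper name and fallback values are illustrative rather than dlt's actual code, and _type_affinity is the private SQLAlchemy attribute discussed above:

import sqlalchemy as sa

def affinity_data_type(sql_t: sa.types.TypeEngine) -> str:
    # Oracle NUMBER reports Integer affinity when scale is None or 0,
    # and Numeric affinity otherwise (per the SQLAlchemy implementation
    # referenced in this thread)
    affinity = sql_t._type_affinity
    if affinity is not None and issubclass(affinity, sa.Integer):
        return "bigint"
    if affinity is not None and issubclass(affinity, sa.Numeric):
        return "decimal"
    return "text"  # placeholder fallback, not dlt's real default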

Collaborator:

OK, I checked again what Oracle does, and it looks like it always uses decimal arithmetic on NUMERIC types. So there are no binary integers in Oracle at all.

Binary arithmetic is done on float/double types only.

@jflipts it looks like our code works correctly. Numeric types must always be Python Decimals, otherwise you'll do the wrong kind of arithmetic on them, i.e. decimal division is very different from binary division.

I just wonder why you are getting floats in the first place? Are you coercing decimals to floats?
https://docs.sqlalchemy.org/en/20/dialects/oracle.html#cx-oracle-numeric

If so, large NUMERIC must fail, otherwise you lose significant bits in your numbers.
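
A small illustration of the decimal vs. binary arithmetic point, with illustrative numbers:

from decimal import Decimal, getcontext

getcontext().prec = 38  # Oracle NUMBER carries up to 38 significant decimal digits

print(Decimal(1) / Decimal(3))  # 38 significant digits of decimal arithmetic
print(1 / 3)                    # binary float64, roughly 15-17 significant digits

# and the "lose significant bits" part: a 38-digit value does not survive float64
exact = Decimal("12345678901234567890123456789012345678")
print(Decimal(float(exact)) == exact)  # False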

jflipts:

I'm getting floats because that is the default behavior of the combination of SQLAlchemy's NUMBER with dlt's type mapping for NUMBER columns whose scale is None or 0.

SQLAlchemy sets asdecimal to False: asdecimal = bool(scale and scale > 0) (https://github.com/sqlalchemy/sqlalchemy/blob/e833474db064df32a7c02e809893e71ac4df4996/lib/sqlalchemy/dialects/oracle/types.py#L48)

I'm not sure why they introduced it like this in SQLAlchemy, over 15 years ago.

dlt then sets the type to double if asdecimal is False:

if sql_t.asdecimal is False:
    col["data_type"] = "double"

My original solution was essentially to interpret NUMBER in dlt the same way it is implemented in SQLAlchemy, but now I'm not even sure it is correct there. In any case, double/float is very wrong for NUMBER(p) with a 0 scale, as the Oracle documentation states it is an integer.
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Data-Types.html#GUID-9401BC04-81C4-4CD5-99E7-C5E25C83F608:~:text=Specify%20an%20integer%20using%20the%20following%20form%3A
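
A small sketch of that chain; the SQLAlchemy behavior is the linked one, while the dlt branch paraphrases the snippet quoted above rather than the exact code:

from sqlalchemy.dialects.oracle import NUMBER

col_type = NUMBER(10)      # NUMBER(p) with no scale, i.e. an Oracle integer column
print(col_type.asdecimal)  # False, because asdecimal = bool(scale and scale > 0)

# dlt's mapping then takes the float branch:
data_type = "double" if col_type.asdecimal is False else "decimal"
print(data_type)           # "double", although the column only ever holds integers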

# special handling for oracle NUMBER types
if isinstance(sql_t, NUMBER) and sql_t.scale is None or sql_t.scale == 0:
    col["data_type"] = "bigint"
    if add_precision and sql_t.precision is not None:
Collaborator:

I'm not sure it is correct. If there's a precision on the numeric type: is it a decimal precision or a binary precision when we deal with an integer? Probably the safest way would be to not set it at all if we detect an integer, until we have a proper test setup with Oracle (btw, we've added mssql recently).
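
One way to encode "don't set precision when we detect an integer", as a hypothetical helper rather than the actual dlt code:

def maybe_add_precision(col: dict, sql_t, add_precision: bool, is_integer: bool) -> None:
    # Oracle NUMBER precision counts decimal digits; its meaning for a column
    # mapped to bigint is unclear, so skip it until a real Oracle test setup exists
    if add_precision and not is_integer and sql_t.precision is not None:
        col["precision"] = sql_t.precision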

jflipts:

The precision is decimal, but the combination with integer doesn't make much sense to me.

@anuunchin anuunchin added the "question" label (Further information is requested) on Oct 2, 2025

Linked issue: Oracle NUMBER type incorrectly inferred as double (#3133)