feat: Add support for Microsoft Fabric Warehouse #4751
Thanks for creating this PR draft, so I can try it out 😃 I tried the models creating by
ProgrammingError:
('42000', '[42000] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]An object or column name is missing or empty. For SELECT INTO statements, verify each column has a name. For other statements, look for empty alias names. Aliases defined as "" or [] are not allowed. Change the alias to a valid name. (1038) (SQLExecDirectW)')
The log shows some interesting stuff:
This in particular looks suspect:
Here's my model:
MODEL (
name data_according_to_business.hook.frame__northwind__customers,
kind FULL
);
SELECT
*
FROM data_according_to_business.dbo.raw__northwind__customers
And the rendered SQL works just fine when evaluating:
$ uv run sqlmesh evaluate hook.frame__northwind__customers
customer_id company_name contact_name contact_title ... fax _dlt_load_id _dlt_id region
0 ALFKI Alfreds Futterkiste Maria Anders Sales Representative ... 030-0076545 1750078533.6436024 xpfDb7mcWB5ijQ None
1 ANATR Ana Trujillo Emparedados y helados Ana Trujillo Owner ... (5) 555-3745 1750078533.6436024 Pr3sRmDpwu66mA None
2 ANTON Antonio Moreno Taquería Antonio Moreno Owner ... None 1750078533.6436024 X206DXOYfUMhMA None
3 AROUT Around the Horn Thomas Hardy Sales Representative ... (171) 555-6750 1750078533.6436024 UvMQUiuIwfPMVw None
4 BERGS Berglunds snabbköp Christina Berglund Order Administrator ... 0921-12 34 67 1750078533.6436024 sPupxoT/AS8XXA None
.. ... ... ... ... ... ... ... ... ...
86 WARTH Wartian Herkku Pirkko Koskitalo Accounting Manager ... 981-443655 1750078533.6436024 sbnEuPm0vmJbTw None
87 WELLI Wellington Importadora Paula Parente Sales Manager ... None 1750078533.6436024 JUEwhfkd5hbtYQ SP
88 WHITC White Clover Markets Karl Jablonski Owner ... (206) 555-4115 1750078533.6436024 iwjZC43nTrqgKg WA
89 WILMK Wilman Kala Matti Karttunen Owner/Marketing Assistant ... 90-224 8858 1750078533.6436024 LTCR7N1bsPyuhw None
90 WOLZA Wolski Zajazd Zbyszek Piestrzeniewicz Owner ... (26) 642-7012 1750078533.6436024 fDzC3tFHAgLfPQ None
[91 rows x 13 columns]
Thanks. I will investigate later - perhaps we need |
I've made some progress... it fails later now:
But I can't find where the failing part is actually generated. |
@mattiasthalen the information schema query is generated in SQLGlot (source code) for "create if not exists" expressions, which are constructed in SQLMesh when trying to materialize the model (create physical table, etc). |
@georgesittas, would you say that most of these changes would be more suitable in sqlglot, as a fabric-tsql dialect, if you will? There seem to be more differences between tsql and the version in fabric.
Could you summarize what the differences are? I thought fabric used t-sql under the hood, but if the two diverge then what you say is reasonable. I'd start with this information schema example and then see if there are more examples besides that. |
@mattiasthalen yeah that's the conclusion I came to when I first started investigating this. Like all abstractions, the Fabric TSQL abstraction is leaky enough to be subtly different from the TSQL supported by SQL Server and not a drop-in replacement. @fresioAS thanks for giving this a go! The general process for adding new engines to SQLMesh is:
I know this is an early draft, but rather than implementing two separate adapters for The connection config could take a Note that the lakehouse side can just throw |
@erindru, I don't think there should be any separation between warehouse and Lakehouse. Both use the same type of sql endpoint, the "fabric flavored t-sql". The only difference I can think of is whether or not the Lakehouse supports schemas. As of now, you get the option to activate schemas when creating a Lakehouse. And that comes with its own issues, e.g., a weaker API. This might merit a parameter to tell if the catalog/database is a Lakehouse with/without schema, or a warehouse. But I agree that a different engine is overkill. With that said, the current code in this PR can actually query a Lakehouse. The host/endpoint used is the same for LH & WH, and you specify which one by the catalog/database. The same thing happens with the sql database object: they share host/endpoint and you select the object by setting the database.
In that case, a coherent MS has probably improved this since I last looked, but isn't Lakehouse based on Spark SQL and Warehouse based on the "Polaris" flavour of TSQL? |
Well, yeah. Spark SQL is used in a Lakehouse to create tables. But you can use the SQL endpoint to query it, and I think you can create views with it. The warehouse can use both tsql and spark. |
The latest commit, including the dialect found in this sqlglot fork, allows me to reference lakehouse external data. There are most likely some overlaps between the engine and the dialect now, and there is also a good amount of generated code that is probably irrelevant. Try it out @mattiasthalen and check if we get a bit closer to the goal.
You're fast! I haven't even fired up a codespace for sqlglot yet. Did a quick test, but all I got was that there is no fabric dialect. Not sure if the error comes from sqlmesh or sqlglot. |
Did my own attempt at creating a fabric dialect (https://github.com/mattiasthalen/sqlglot/tree/add-fabric-tsql-dialect), so far only ensures
|
Test it on datetime2. I had to make some changes there to get it to work.
@georgesittas / @erindru, what's needed for the config to be available for Python configs? |
Do you mean Nothing - if you're using Python config, you're just manually instantiating the same classes that Yaml config would automatically instantiate |
You should also add |
I got it all working here: https://github.com/mattiasthalen/sqlmesh/tree/fabric It requires the SQLGlot from main, hopefully that will be released soon. |
Makes the codebase simpler - probably a standalone PR to change the mssql adapter |
@fresioAS & @georgesittas I'm curious, how is the fabric dialect from sqlglot included here? I thought sqlglot needed a new tag and the version bumped in sqlmesh. |
Looks great!
I have a pending PR in SQLGlot that will sort the datetime2 errors
Hey folks 👋 Just a heads up that I've released the latest Fabric fixes / additions in SQLGlot and there's a PR up here to bump its version. Keep an eye out for when it's merged so you can rebase and run the CI with the latest state of the other repo. |
Found a pesky bug, it tries to execute "das"."raw"."raw__northwind__territories"
ProgrammingError:
('42000', "[42000] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Incorrect syntax near 'MERGE'. (102) (SQLExecDirectW)")
The model config is:
@model(
"das.raw.@{name}",
is_sql=True,
kind=dict(
name=ModelKindName.INCREMENTAL_BY_UNIQUE_KEY,
unique_key="_record__hash",
disable_restatement=True
),
blueprints=generate_blueprints("northwind"),
)
|
@georgesittas / @erindru, from my understanding, the merge error I got is due to the unique key incremental strategy. I thought this would be handled by fabric setting the strategy to delete insert. What is needed to fix that? Override the merge function in mssql?
The
|
Is it as simple as: FabricEngineAdapter(LogicalMergeMixin, MSSQLEngineAdapter): |
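To illustrate why the base-class order in that question matters, here is a minimal sketch using hypothetical stand-in classes (`BaseAdapterSketch`, `LogicalMergeSketch`, `FabricSketch`), not SQLMesh's actual adapters. It assumes the mixin emulates MERGE with a DELETE followed by an INSERT. Because the mixin is listed first, Python's method resolution order picks its `merge` over the base adapter's native one.

```python
class BaseAdapterSketch:
    """Hypothetical stand-in for an adapter that issues a native MERGE."""

    def __init__(self):
        self.executed = []  # records SQL instead of hitting a database

    def execute(self, sql):
        self.executed.append(sql)

    def merge(self, target, source, key):
        self.execute(f"MERGE INTO {target} USING {source} ON {key}")


class LogicalMergeSketch:
    """Hypothetical mixin: emulate MERGE with DELETE followed by INSERT."""

    def merge(self, target, source, key):
        self.execute(
            f"DELETE FROM {target} WHERE {key} IN (SELECT {key} FROM {source})"
        )
        self.execute(f"INSERT INTO {target} SELECT * FROM {source}")


class FabricSketch(LogicalMergeSketch, BaseAdapterSketch):
    """Mixin listed first, so its merge() wins in the MRO."""


adapter = FabricSketch()
adapter.merge("das.raw.some_table", "staging_table", "_record__hash")
for statement in adapter.executed:
    print(statement)
```

Swapping the two base classes would resolve `merge` to the native version again, which is the kind of statement the engine rejected above.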
Co-authored-by: Mattias Thalén <bitter-polders0x@icloud.com>
@fresioAS, I'm sorry, I forgot |
The work so far looks reasonable to me, but I think we need to have integration tests to consider this ready. At least the sushi project should be able to run end-to-end without issues. @treysp @erindru since you folks have already implemented a few adapters, I'll let you take a look as well and verify the above claim re: testing. |
@georgesittas, wouldn't that need access to a fabric capacity? Trying to wrap my head around it |
@mattiasthalen yes, it would. In short, a new adapter can't be merged until it passes our integration test suite. We can assist with this however. I'm going to do some work to enable integration tests for Fabric, with the intention that you set some environment variables to point to a Fabric endpoint you control (the same way the rest of our integration tests work). This should enable you to run the tests locally and address any gaps. I assume you have access to a Fabric capacity since you've been testing it so far? |
@erindru yes |
DIALECT: t.ClassVar[t.Literal["fabric"]] = "fabric"  # type: ignore
DISPLAY_NAME: t.ClassVar[t.Literal["Fabric"]] = "Fabric"  # type: ignore
DISPLAY_ORDER: t.ClassVar[t.Literal[17]] = 17  # type: ignore
driver: t.Literal["pyodbc"] = "pyodbc"
This driver literal doesn't seem to work. If I don't manually specify it in my Fabric connection config:
gateways:
  fabric:
    connection:
      type: fabric
      #driver: pyodbc
Then the pymssql check still happens and SQLMesh aborts with:
Error: Failed to import the 'pymssql' engine library.
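One possible explanation, sketched here with plain Python stand-ins rather than SQLMesh's real classes: if the import check runs against the raw user-supplied mapping before field defaults are applied, and falls back to the parent's default driver, then a subclass default of `pyodbc` never reaches it. The names `library_to_import`, `PARENT_DEFAULT_DRIVER`, and `FabricConfigSketch` are hypothetical, and whether this is the actual cause in this PR is an assumption.

```python
from dataclasses import dataclass

# Assumption: the parent config falls back to this driver when none is given.
PARENT_DEFAULT_DRIVER = "pymssql"


def library_to_import(raw: dict) -> str:
    """Hypothetical pre-validation check that only sees the raw mapping."""
    return raw.get("driver", PARENT_DEFAULT_DRIVER)


@dataclass
class FabricConfigSketch:
    """Hypothetical subclass-style config whose default driver is pyodbc."""

    type: str = "fabric"
    driver: str = "pyodbc"


raw_yaml = {"type": "fabric"}  # driver line commented out, as in the report

checked = library_to_import(raw_yaml)           # what the early check looks for
resolved = FabricConfigSketch(**raw_yaml).driver  # what the config ends up with

print(f"import check wants: {checked}; resolved driver: {resolved}")
```

Under this assumption, a fix would have the check consult the class's own default instead of a hard-coded fallback.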
Here's a commit where I add the primitives to enable running our integration tests against Fabric. It relies on a service account currently but you can modify it to suit your local auth. To use it, set the following env vars:
and then run
Only 6 tests are failing currently, so nice work 👍
Got the test suite running. Just need to find the mental capacity to decode what's wrong 😅 |
Add Microsoft Fabric Engine Support

Overview
This PR adds support for Microsoft Fabric as a new execution engine in SQLMesh. Users can now connect to and execute queries on Microsoft Fabric Data Warehouses.

Changes

Documentation:
- docs/integrations/engines/fabric.md with Fabric connection options, installation, and configuration instructions.
- docs/guides/configuration.md and docs/integrations/overview.md.
- mkdocs.yml to include the new Fabric documentation page.

Core Implementation:
- FabricConnectionConfig, inheriting from MSSQLConnectionConfig, with Fabric-specific defaults and validation.
- (FabricAdapter) in the registry.
- sqlmesh/core/engine_adapter/fabric.py with Fabric-specific logic, including the use of DELETE/INSERT for overwrite operations.

Testing:
- tests/core/engine_adapter/test_fabric.py for adapter logic, table checks, insert/overwrite, and replace query tests.
- tests/core/test_connection_config.py for config validation and ODBC connection string generation.

Configuration:
- pyproject.toml to add a fabric test marker.
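
As a rough illustration of the DELETE/INSERT overwrite approach mentioned under Core Implementation, the sketch below builds the two statements as strings. The helper name `overwrite_statements` and its parameters are hypothetical and do not reflect the adapter's real interface. It only shows the ordering: clear the target rows first, then insert the fresh result set.

```python
from typing import List, Optional


def overwrite_statements(
    table: str, source_query: str, where: Optional[str] = None
) -> List[str]:
    """Hypothetical helper: emulate an overwrite with DELETE followed by INSERT."""
    delete = f"DELETE FROM {table}"
    if where:
        # Restrict the delete to the slice being replaced.
        delete += f" WHERE {where}"
    insert = f"INSERT INTO {table} {source_query}"
    return [delete, insert]


full = overwrite_statements("hook.customers", "SELECT * FROM raw.customers")
partial = overwrite_statements(
    "hook.customers",
    "SELECT * FROM raw.customers WHERE region = 'WA'",
    where="region = 'WA'",
)

for statement in full + partial:
    print(statement)
```

In practice the two statements would presumably run in a single transaction, so readers never see an empty table between them.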