Commit 3b66046

feat: add named secret support DuckDB (#4912)
1 parent c6b9e18 commit 3b66046

3 files changed: +209 -19 lines changed

docs/integrations/engines/duckdb.md

Lines changed: 92 additions & 14 deletions
@@ -10,15 +10,15 @@

 ### Connection options

-| Option | Description | Type | Required |
-|--------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------:|:--------:|
-| `type` | Engine type name - must be `duckdb` | string | Y |
-| `database` | The optional database name. If not specified, the in-memory database is used. Cannot be defined if using `catalogs`. | string | N |
-| `catalogs` | Mapping to define multiple catalogs. Can [attach DuckDB catalogs](#duckdb-catalogs-example) or [catalogs for other connections](#other-connection-catalogs-example). First entry is the default catalog. Cannot be defined if using `database`. | dict | N |
-| `extensions` | Extension to load into duckdb. Only autoloadable extensions are supported. | list | N |
-| `connector_config` | Configuration to pass into the duckdb connector. | dict | N |
-| `secrets` | Configuration for authenticating external sources (e.g., S3) using DuckDB secrets. | dict | N |
-| `filesystems` | Configuration for registering `fsspec` filesystems to the DuckDB connection. | dict | N |
+| Option | Description | Type | Required |
+|--------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------:|:--------:|
+| `type` | Engine type name - must be `duckdb` | string | Y |
+| `database` | The optional database name. If not specified, the in-memory database is used. Cannot be defined if using `catalogs`. | string | N |
+| `catalogs` | Mapping to define multiple catalogs. Can [attach DuckDB catalogs](#duckdb-catalogs-example) or [catalogs for other connections](#other-connection-catalogs-example). First entry is the default catalog. Cannot be defined if using `database`. | dict | N |
+| `extensions` | Extension to load into duckdb. Only autoloadable extensions are supported. | list | N |
+| `connector_config` | Configuration to pass into the duckdb connector. | dict | N |
+| `secrets` | Configuration for authenticating external sources (e.g., S3) using DuckDB secrets. Can be a list of secret configurations or a dictionary with custom secret names. | list/dict | N |
+| `filesystems` | Configuration for registering `fsspec` filesystems to the DuckDB connection. | dict | N |

 #### DuckDB Catalogs Example

@@ -194,9 +194,18 @@ DuckDB can read data directly from cloud services via extensions (e.g., [httpfs]

 The `secrets` option allows you to configure DuckDB's [Secrets Manager](https://duckdb.org/docs/configuration/secrets_manager.html) to authenticate with external services like S3. This is the recommended approach for cloud storage authentication in DuckDB v0.10.0 and newer, replacing the [legacy authentication method](https://duckdb.org/docs/stable/extensions/httpfs/s3api_legacy_authentication.html) via variables.

-##### Secrets Configuration Example for S3
+##### Secrets Configuration

-The `secrets` accepts a list of secret configurations, each defining the necessary authentication parameters for the specific service:
+The `secrets` option supports two formats:
+
+1. **List format** (default secrets): A list of secret configurations where each secret uses DuckDB's default naming
+2. **Dictionary format** (named secrets): A dictionary where keys are custom secret names and values are the secret configurations
+
+This flexibility allows you to organize multiple secrets of the same type or reference specific secrets by name in your SQL queries.
+
+##### List Format Example (Default Secrets)
+
+Using a list creates secrets with DuckDB's default naming:

 === "YAML"

@@ -253,6 +262,75 @@ The `secrets` accepts a list of secret configurations, each defining the necessa
     )
     ```

+##### Dictionary Format Example (Named Secrets)
+
+Using a dictionary allows you to assign custom names to your secrets for better organization and reference:
+
+=== "YAML"
+
+    ```yaml linenums="1"
+    gateways:
+      duckdb:
+        connection:
+          type: duckdb
+          catalogs:
+            local: local.db
+            remote: "s3://bucket/data/remote.duckdb"
+          extensions:
+            - name: httpfs
+          secrets:
+            my_s3_secret:
+              type: s3
+              region: "YOUR_AWS_REGION"
+              key_id: "YOUR_AWS_ACCESS_KEY"
+              secret: "YOUR_AWS_SECRET_KEY"
+            my_azure_secret:
+              type: azure
+              account_name: "YOUR_AZURE_ACCOUNT"
+              account_key: "YOUR_AZURE_KEY"
+    ```
+
+=== "Python"
+
+    ```python linenums="1"
+    from sqlmesh.core.config import (
+        Config,
+        ModelDefaultsConfig,
+        GatewayConfig,
+        DuckDBConnectionConfig
+    )
+
+    config = Config(
+        model_defaults=ModelDefaultsConfig(dialect="duckdb"),
+        gateways={
+            "duckdb": GatewayConfig(
+                connection=DuckDBConnectionConfig(
+                    catalogs={
+                        "local": "local.db",
+                        "remote": "s3://bucket/data/remote.duckdb"
+                    },
+                    extensions=[
+                        {"name": "httpfs"},
+                    ],
+                    secrets={
+                        "my_s3_secret": {
+                            "type": "s3",
+                            "region": "YOUR_AWS_REGION",
+                            "key_id": "YOUR_AWS_ACCESS_KEY",
+                            "secret": "YOUR_AWS_SECRET_KEY"
+                        },
+                        "my_azure_secret": {
+                            "type": "azure",
+                            "account_name": "YOUR_AZURE_ACCOUNT",
+                            "account_key": "YOUR_AZURE_KEY"
+                        }
+                    }
+                )
+            ),
+        }
+    )
+    ```
+
 After configuring the secrets, you can directly reference S3 paths in your catalogs or in SQL queries without additional authentication steps.

 Refer to the official DuckDB documentation for the full list of [supported S3 secret parameters](https://duckdb.org/docs/stable/extensions/httpfs/s3api.html#overview-of-s3-secret-parameters) and for more information on the [Secrets Manager configuration](https://duckdb.org/docs/configuration/secrets_manager.html).
@@ -273,9 +351,9 @@ The `filesystems` accepts a list of file systems to register in the DuckDB conne
273351
type: duckdb
274352
catalogs:
275353
ducklake:
276-
type: ducklake
277-
path: myducklakecatalog.duckdb
278-
data_path: abfs://MyFabricWorkspace/MyFabricLakehouse.Lakehouse/Files/DuckLake.Files
354+
type: ducklake
355+
path: myducklakecatalog.duckdb
356+
data_path: abfs://MyFabricWorkspace/MyFabricLakehouse.Lakehouse/Files/DuckLake.Files
279357
extensions:
280358
- ducklake
281359
filesystems:
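
The documentation change above says the dictionary format helps "organize multiple secrets of the same type". A minimal sketch of what that could look like follows, assuming two S3 credentials distinguished by DuckDB's `scope` secret parameter. The secret names, bucket names, credential placeholders, and the `render_named_secrets` helper are all hypothetical; the helper only imitates the f-string used in `connection.py` in this commit and is not part of SQLMesh.

```python
from typing import Any, Dict, List

# Hypothetical named secrets: two S3 credentials scoped to different buckets.
# All names and values here are illustrative placeholders.
secrets: Dict[str, Dict[str, Any]] = {
    "raw_bucket_secret": {
        "type": "s3",
        "key_id": "RAW_KEY_ID",
        "secret": "RAW_SECRET",
        "scope": "s3://raw-bucket",
    },
    "curated_bucket_secret": {
        "type": "s3",
        "key_id": "CURATED_KEY_ID",
        "secret": "CURATED_SECRET",
        "scope": "s3://curated-bucket",
    },
}


def render_named_secrets(named: Dict[str, Dict[str, Any]]) -> List[str]:
    """Render one CREATE SECRET statement per dictionary entry (illustrative only)."""
    statements = []
    for name, settings in named.items():
        # Each setting becomes "field 'value'", joined with commas.
        clause = ", ".join(f"{field} '{value}'" for field, value in settings.items())
        statements.append(f"CREATE SECRET {name} ({clause});")
    return statements


for statement in render_named_secrets(secrets):
    print(statement)
```

Because Python dictionaries preserve insertion order, the statements come out in the order the secrets are declared, which matches the ordering the tests in this commit rely on.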

sqlmesh/core/config/connection.py

Lines changed: 12 additions & 4 deletions
@@ -277,7 +277,7 @@ class BaseDuckDBConnectionConfig(ConnectionConfig):
     catalogs: t.Optional[t.Dict[str, t.Union[str, DuckDBAttachOptions]]] = None
     extensions: t.List[t.Union[str, t.Dict[str, t.Any]]] = []
     connector_config: t.Dict[str, t.Any] = {}
-    secrets: t.List[t.Dict[str, t.Any]] = []
+    secrets: t.Union[t.List[t.Dict[str, t.Any]], t.Dict[str, t.Dict[str, t.Any]]] = []
     filesystems: t.List[t.Dict[str, t.Any]] = []

     concurrent_tasks: int = 1
@@ -362,14 +362,22 @@ def init(cursor: duckdb.DuckDBPyConnection) -> None:
                         "More info: https://duckdb.org/docs/stable/extensions/httpfs/s3api_legacy_authentication.html"
                     )
                 else:
-                    for secrets in self.secrets:
+                    if isinstance(self.secrets, list):
+                        secrets_items = [(secret_dict, "") for secret_dict in self.secrets]
+                    else:
+                        secrets_items = [
+                            (secret_dict, secret_name)
+                            for secret_name, secret_dict in self.secrets.items()
+                        ]
+
+                    for secret_dict, secret_name in secrets_items:
                         secret_settings: t.List[str] = []
-                        for field, setting in secrets.items():
+                        for field, setting in secret_dict.items():
                             secret_settings.append(f"{field} '{setting}'")
                         if secret_settings:
                             secret_clause = ", ".join(secret_settings)
                             try:
-                                cursor.execute(f"CREATE SECRET ({secret_clause});")
+                                cursor.execute(f"CREATE SECRET {secret_name} ({secret_clause});")
                             except Exception as e:
                                 raise ConfigError(f"Failed to create secret: {e}")
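
The hunk above normalizes both `secrets` formats into `(settings, name)` pairs before building each statement. The sketch below shows the same idea as standalone functions. It is not the code in this commit: `normalize_secrets` and `build_create_secret` are hypothetical names, and the sketch deliberately differs in two ways that are assumptions on my part rather than behavior of SQLMesh. It omits the name entirely for list entries instead of interpolating an empty string, and it doubles single quotes inside values so a value containing `'` cannot terminate the SQL string literal early.

```python
from typing import Any, Dict, List, Optional, Tuple, Union

SecretSettings = Dict[str, Any]
SecretsConfig = Union[List[SecretSettings], Dict[str, SecretSettings]]


def normalize_secrets(secrets: SecretsConfig) -> List[Tuple[Optional[str], SecretSettings]]:
    """Turn either supported format into a list of (name, settings) pairs."""
    if isinstance(secrets, list):
        # List format: no custom name, so DuckDB picks its default name.
        return [(None, settings) for settings in secrets]
    # Dictionary format: keys are the custom secret names.
    return list(secrets.items())


def quote_literal(value: Any) -> str:
    """Render a value as a single-quoted SQL string literal, doubling embedded quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def build_create_secret(name: Optional[str], settings: SecretSettings) -> Optional[str]:
    """Build a CREATE SECRET statement, or return None when there are no settings."""
    if not settings:
        return None
    clause = ", ".join(f"{field} {quote_literal(value)}" for field, value in settings.items())
    head = "CREATE SECRET" if name is None else f"CREATE SECRET {name}"
    return f"{head} ({clause});"


for secret_name, secret_settings in normalize_secrets({"my_s3_secret": {"type": "s3"}}):
    print(build_create_secret(secret_name, secret_settings))
```

Using `None` rather than `""` for unnamed secrets keeps the generated statement free of a stray extra space, which matters mainly for exact string comparisons in tests.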

tests/core/test_connection_config.py

Lines changed: 105 additions & 1 deletion
@@ -4,7 +4,7 @@

 import pytest
 from _pytest.fixtures import FixtureRequest
-from unittest.mock import patch
+from unittest.mock import patch, MagicMock

 from sqlmesh.core.config.connection import (
     BigQueryConnectionConfig,
@@ -455,6 +455,110 @@ def test_duckdb(make_config):
     assert not config.is_recommended_for_state_sync


+@patch("duckdb.connect")
+def test_duckdb_multiple_secrets(mock_connect, make_config):
+    """Test that multiple secrets are correctly converted to CREATE SECRET SQL statements."""
+    mock_cursor = MagicMock()
+    mock_connection = MagicMock()
+    mock_connection.cursor.return_value = mock_cursor
+    mock_connection.execute = mock_cursor.execute
+    mock_connect.return_value = mock_connection
+
+    # Create config with 2 secrets
+    config = make_config(
+        type="duckdb",
+        secrets=[
+            {
+                "type": "s3",
+                "region": "us-east-1",
+                "key_id": "my_aws_key",
+                "secret": "my_aws_secret",
+            },
+            {
+                "type": "azure",
+                "account_name": "myaccount",
+                "account_key": "myaccountkey",
+            },
+        ],
+    )
+
+    assert isinstance(config, DuckDBConnectionConfig)
+    assert len(config.secrets) == 2
+
+    # Create cursor which triggers _cursor_init
+    cursor = config.create_engine_adapter().cursor
+
+    execute_calls = [call[0][0] for call in mock_cursor.execute.call_args_list]
+    create_secret_calls = [call for call in execute_calls if call.startswith("CREATE SECRET")]
+
+    # Should have exactly 2 CREATE SECRET calls
+    assert len(create_secret_calls) == 2
+
+    # Verify the SQL for the first secret (S3)
+    assert (
+        create_secret_calls[0]
+        == "CREATE SECRET (type 's3', region 'us-east-1', key_id 'my_aws_key', secret 'my_aws_secret');"
+    )
+
+    # Verify the SQL for the second secret (Azure)
+    assert (
+        create_secret_calls[1]
+        == "CREATE SECRET (type 'azure', account_name 'myaccount', account_key 'myaccountkey');"
+    )
+
+
+@patch("duckdb.connect")
+def test_duckdb_named_secrets(mock_connect, make_config):
+    """Test that named secrets are correctly converted to CREATE SECRET SQL statements."""
+    mock_cursor = MagicMock()
+    mock_connection = MagicMock()
+    mock_connection.cursor.return_value = mock_cursor
+    mock_connection.execute = mock_cursor.execute
+    mock_connect.return_value = mock_connection
+
+    # Create config with named secrets using dictionary format
+    config = make_config(
+        type="duckdb",
+        secrets={
+            "my_s3_secret": {
+                "type": "s3",
+                "region": "us-east-1",
+                "key_id": "my_aws_key",
+                "secret": "my_aws_secret",
+            },
+            "my_azure_secret": {
+                "type": "azure",
+                "account_name": "myaccount",
+                "account_key": "myaccountkey",
+            },
+        },
+    )
+
+    assert isinstance(config, DuckDBConnectionConfig)
+    assert len(config.secrets) == 2
+
+    # Create cursor which triggers _cursor_init
+    cursor = config.create_engine_adapter().cursor
+
+    execute_calls = [call[0][0] for call in mock_cursor.execute.call_args_list]
+    create_secret_calls = [call for call in execute_calls if call.startswith("CREATE SECRET")]
+
+    # Should have exactly 2 CREATE SECRET calls
+    assert len(create_secret_calls) == 2
+
+    # Verify the SQL for the first secret (S3) includes the secret name
+    assert (
+        create_secret_calls[0]
+        == "CREATE SECRET my_s3_secret (type 's3', region 'us-east-1', key_id 'my_aws_key', secret 'my_aws_secret');"
+    )
+
+    # Verify the SQL for the second secret (Azure) includes the secret name
+    assert (
+        create_secret_calls[1]
+        == "CREATE SECRET my_azure_secret (type 'azure', account_name 'myaccount', account_key 'myaccountkey');"
+    )
+
+
 @pytest.mark.parametrize(
     "kwargs1, kwargs2, shared_adapter",
     [
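
Both new tests repeat the same two-step filter over `mock_cursor.execute.call_args_list`. As a sketch only, that step could be factored into a small helper like the one below. `create_secret_statements` is a hypothetical name that does not appear in this commit, and the stand-in cursor here is a bare `MagicMock` rather than the patched `duckdb.connect` used in the real tests.

```python
from typing import List
from unittest.mock import MagicMock


def create_secret_statements(mock_cursor: MagicMock) -> List[str]:
    """Return the SQL strings passed to execute() that start with CREATE SECRET."""
    # call.args[0] is the first positional argument of each recorded execute() call.
    executed = [call.args[0] for call in mock_cursor.execute.call_args_list]
    return [sql for sql in executed if sql.startswith("CREATE SECRET")]


# Usage against a stand-in cursor that records two unrelated statements.
cursor = MagicMock()
cursor.execute("INSTALL httpfs")
cursor.execute("CREATE SECRET my_s3_secret (type 's3');")
print(create_secret_statements(cursor))
```

The helper assumes every recorded `execute()` call received a string as its first positional argument, which holds for the calls made in these tests.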
