Survey and fix logger initialization #2216
Conversation
Tests should not do this.
The `.level` property is undocumented and doesn't need to be set: when unset, the logger level is inherited. Using `.getEffectiveLevel()` is the way this should be done.
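(A minimal sketch of the distinction being made here, using stand-in logger names: `.level` stays `NOTSET` on a child logger, while `.getEffectiveLevel()` walks up the hierarchy.)

```python
import logging

parent = logging.getLogger("databricks")
parent.setLevel(logging.INFO)

child = logging.getLogger("databricks.labs.lakebridge")
print(child.level)                # 0 (logging.NOTSET): nothing was set on the child itself
print(child.getEffectiveLevel())  # 20 (logging.INFO): inherited from the "databricks" parent
```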
…mport. Entry points are responsible for configuring logging, not the code that does the logging.
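(For context, a generic sketch of the convention this comment refers to, not this project's actual code: library modules only create loggers, while the entry point configures logging once at startup; `do_work` is a made-up stand-in.)

```python
import logging

logger = logging.getLogger(__name__)  # library module: create a logger, attach nothing


def do_work() -> None:
    logger.info("library code only emits records; it never configures handlers")


if __name__ == "__main__":
    # The entry point, and only the entry point, configures logging at startup.
    logging.basicConfig(level=logging.INFO)
    do_work()
```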
Codecov Report

❌ Patch coverage is …

```
@@            Coverage Diff             @@
##             main    #2216      +/-   ##
==========================================
- Coverage   64.05%   63.96%   -0.09%
==========================================
  Files         100       99       -1
  Lines        8625     8626       +1
  Branches      889      888       -1
==========================================
- Hits         5525     5518       -7
- Misses       2928     2936       +8
  Partials      172      172
```
✅ 129/129 passed, 8 flaky, 5 skipped, 19m42s total

Flaky tests: …

Running from acceptance #3446
```python
        self._logger_instance = self._logger
        self._logger_instance.setLevel(logging.INFO)
        return self._logger_instance

    def _log_level(self, raw: str) -> int:
```
Are we doing this change here because it is faster than releasing blueprint with that change?
Yes, and I'm hoping that it's temporary. (My plan is to implement this workaround in blueprint next week if I don't have a better idea about when it will be fixed upstream.)
```python
# Ensure that anything that imports this (or a lower) submodule triggers setup of the blueprint
# logging.
install_logger()
```
We should move this to the entry points, similar to bladebridge.
Hmm… why?
Because this means we keep reinitializing the logger over and over again instead of just once. It feels odd and will break some handlers, e.g. file handlers. I know we don't use those, but we'll be blocked from using them in the future if we want to.
@m-abulazm: That's not how imports work in Python: modules are effectively imported once and cached. The way import works is:

1. Python looks inside `sys.modules` for the module. If it's there, import just uses the already-initialised module.
2. If not, locate it via `sys.path` and initialise it, which "runs" the module. (This is where the above `install_logger()` call happens.)
3. Add the initialised module to `sys.modules`, after which step 1 will always find it.

(Without this approach things like using local imports to break cycles would not work.)

This means that the `install_logger()` above is only invoked once, not over and over again.
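(A quick, runnable illustration of that caching behaviour, using a stdlib module in place of the project's own:)

```python
import sys

import json           # first import: the module's top-level code executes (if not already cached)
first = sys.modules["json"]

import json           # second import: a cache hit in sys.modules; nothing re-runs
assert json is first  # the very same module object is reused
```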
OK, got it. I also googled a bit more, and it looks like `import mypkg` vs `import src.mypkg` will cause a module to run twice. Ideally this should not happen; otherwise it would be a bug.
```python
databricks_log_level = logging.DEBUG if is_in_debug() else logging.INFO
logging.getLogger("databricks").setLevel(databricks_log_level)
```
Suggested change:

```diff
-databricks_log_level = logging.DEBUG if is_in_debug() else logging.INFO
-logging.getLogger("databricks").setLevel(databricks_log_level)
+log_level = logging.DEBUG if is_in_debug() else logging.INFO
+install_logger(log_level)
```
Can you elaborate on this a bit more? I don't understand why you're suggesting this change.

Here we're setting the filtering log-level on the `databricks` logger (which the `databricks.*` loggers delegate to by default).

This is different to the log-level that we pass to `install_logger()`: that one configures the minimum level that will be written out to the console. It defaults to DEBUG so that anything that reaches it will be written out, which is the conventional way to configure the handlers. (See the sketch below.)
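(To illustrate the two independent levels, a minimal sketch rather than the project's actual setup: the handler is wide open at DEBUG, while the `databricks` logger does the filtering.)

```python
import logging
import sys

# Handler level: the minimum level the handler will actually write out.
handler = logging.StreamHandler(sys.stderr)
handler.setLevel(logging.DEBUG)          # emit anything that reaches the handler
logging.getLogger().addHandler(handler)  # attach to the root logger

# Logger level: what the "databricks" subtree filters before records reach any handler.
logging.getLogger("databricks").setLevel(logging.INFO)

logging.getLogger("databricks.labs").debug("dropped: filtered out by the logger level")
logging.getLogger("databricks.labs").info("written: passes the logger; the handler lets it through")
```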
Blueprint configures the databricks logger depending on the log level requested by the user. Why do we need to set it again?

We should also configure the root logger with the right log level. If we do not set it and only set the databricks logger, the user might get unwanted DEBUG logs when they run with INFO, since the root logger allows DEBUG.
@m-abulazm: Blueprint does not configure the logger level; it only configures the level of the handler it attaches. Handlers and loggers have their own (independent) levels. This is described in the docstring as such:
> The root logger will be modified:
> - Its logging level will be left as-is.
> - All existing handlers will be removed.
> - A new handler will be installed with our custom formatter. It will be configured to emit logs at the given level (default: DEBUG) or higher, to the specified stream (default: sys.stderr).
Note that the level is set on the handler; we leave the logging level as it was.
> We should also configure the root logger with the right log level. If we do not set it and only set the databricks logger, the user might get unwanted DEBUG logs when they run with INFO, since the root logger allows DEBUG.
As documented here, Python's logging system defaults to `WARNING` and we don't touch that. When the user requests `--debug`, the intent is that this only applies to the `databricks.*` modules/loggers.
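(A small sketch of that default and the intended scoping; `some.other.library` is a made-up name:)

```python
import logging

# In a fresh interpreter the root logger defaults to WARNING...
assert logging.getLogger().getEffectiveLevel() == logging.WARNING

# ...so setting DEBUG only on the "databricks" logger scopes --debug to that subtree.
logging.getLogger("databricks").setLevel(logging.DEBUG)
assert logging.getLogger("databricks.labs").getEffectiveLevel() == logging.DEBUG
assert logging.getLogger("some.other.library").getEffectiveLevel() == logging.WARNING
```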
`App#__call__()`, which gets called in `cli.py`, configures the databricks logger.
Meaning that we could leverage that instead of adding our own method.
sundarshankar89 left a comment:
I will run e2e tests with this branch before we can merge. I'm unsure about the log level behaviour.
src/databricks/labs/lakebridge/resources/assessments/synapse/common/functions.py
sundarshankar89 left a comment:
LGTM, now that tests pass.
gueniai left a comment:
LGTM
m-abulazm left a comment:
LGTM
Changes
What does this PR do?
This PR makes the following changes related to logging within the project:
- … handling of `--log-level=warn` (#2167; resolves [BUG]: CLI mishandles `--log-level=` and `--debug` #2211) for the CLI is introduced so that we can once again get `--debug` logs if we need them.

Relevant implementation details
The `base_install.py` entry point for installation has been moved into `install.py`: this is where it was logging from, and where it's normally located in blueprint-based applications.

Caveats/things to watch out for when reviewing:
Linked issues
Resolves: #2211
Relates to: #2167
Functionality
Modified existing commands:
- `databricks labs lakebridge *` (all subcommands)
- `databricks labs install lakebridge`
- `databricks labs uninstall lakebridge`

Tests
Manual Tests
- `databricks labs install lakebridge`
- `databricks labs install lakebridge@consistent-log-initialization`
- `rm -fr ~/.databricks/labs`
- `databricks labs install lakebridge@consistent-log-initialization`
- `databricks labs lakebridge install-transpile --debug`
- `databricks labs install lakebridge@consistent-log-initialization`
- `databricks labs uninstall lakebridge`