Fix: SQL export reports success on failure #63 (#64)

Merged
monozoide merged 2 commits into dev from fix/63-sql-export-status-check
Oct 17, 2025

@monozoide monozoide commented Oct 17, 2025

🐞 Bug Summary

The --sql-export command incorrectly reports success even when data conversion errors occur, resulting in an invalid or empty SQL file being created. When values like 'N/A' cannot be converted to integers for NOT NULL columns, the current code logs a warning and proceeds silently, giving users a false sense that their export succeeded.

Severity: high

🔁 Reproduction

  1. Run --sql-export on a dataset containing null-like values ('N/A', 'null', etc.) in NOT NULL integer columns
  2. Observe that conversion errors are logged as warnings but processing continues
  3. Verify that the command displays a success message despite conversion failures
  4. Check that an invalid or incomplete SQL file was created

Actual result:
The command returns a success message and zero exit code, but creates an invalid SQL file with missing or improperly converted data. Conversion errors are silently suppressed.

Expected result:
The command should detect conversion errors for NOT NULL columns, fail immediately, provide appropriate error feedback, and clean up any incomplete SQL files before returning a failure status.

🌍 Runtime Context

  • Affected version(s): 5.15.x
  • Environment(s): PROD
  • OS: Debian 13.1 - Trixie
  • Python: 3.13, 3.12, 3.11

🕵️ Analysis & Root Cause

The root cause stems from overly permissive error handling in the data conversion pipeline:

  • The format_sql_value function only logs a warning and returns NULL when conversion fails, even for NOT NULL columns, without raising any error signal
  • The generate_insert_statement function does not validate conversion success or propagate errors
  • The run_sql_export function has no error aggregation mechanism, so it cannot detect when conversion failures have occurred
  • Incomplete SQL files are never cleaned up, leaving invalid artifacts behind

This silent error suppression allows the process to report success while producing unusable data.

  • Confirmed root cause: Insufficient validation in format_sql_value combined with lack of error tracking in run_sql_export

✅ Applied Fix

The fix implements strict error handling and validation throughout the conversion pipeline:

Changes to format_sql_value:

  • Now checks if a column is NOT NULL
  • Raises SQLExportError (new custom exception) when a null-like value ('N/A', 'null', etc.) is provided for a NOT NULL column
  • Raises SQLExportError when type conversion (e.g., integer conversion) fails for a NOT NULL column
  • Returns NULL only for nullable columns with invalid values
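
The behaviour listed above could look roughly like the following. This is a minimal sketch, not the code in sql_exporter.py: only the names format_sql_value and SQLExportError come from this PR. The signature, the NULL_LIKE set, the detection of NOT NULL from the type definition string, and the string fallback are assumptions made for illustration.

```python
class SQLExportError(Exception):
    """Raised when a value cannot be safely exported for its column."""


# Assumed null-like markers; the real exporter may recognise a different set.
NULL_LIKE = {"", "n/a", "null", "none"}


def format_sql_value(value, sql_type_def):
    """Return an SQL literal for `value`, or raise SQLExportError (illustrative)."""
    type_upper = sql_type_def.upper()
    not_null = "NOT NULL" in type_upper
    is_integer = type_upper.startswith(("INT", "BIGINT", "SMALLINT"))

    # Null-like input: an error for NOT NULL columns, NULL otherwise.
    if value is None or str(value).strip().lower() in NULL_LIKE:
        if not_null:
            raise SQLExportError(
                f"null-like value {value!r} for NOT NULL column ({sql_type_def})"
            )
        return "NULL"

    if is_integer:
        try:
            return str(int(value))
        except (TypeError, ValueError) as exc:
            if not_null:
                raise SQLExportError(
                    f"cannot convert {value!r} to {sql_type_def}"
                ) from exc
            return "NULL"  # nullable column: fall back to NULL

    # Any other type is treated as text here; single quotes are doubled.
    return "'" + str(value).replace("'", "''") + "'"
```

With this sketch, `format_sql_value("N/A", "INTEGER")` yields `NULL`, while the same value for `"INTEGER NOT NULL"` raises SQLExportError instead of being silently replaced.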

Changes to generate_insert_statement:

  • Wraps format_sql_value calls to catch SQLExportError
  • Re-raises caught exceptions with additional context (row data) for better debugging
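
A hedged sketch of the wrap-and-re-raise pattern described above. The format_sql_value shown here is a tiny stand-in that only handles NOT NULL integers, and the generate_insert_statement signature, the column_types mapping, and the double-quote identifier style are assumptions, not the project's actual code.

```python
class SQLExportError(Exception):
    """Raised when a value cannot be safely exported for its column."""


def format_sql_value(value, sql_type_def):
    """Stand-in converter for this sketch: NOT NULL integers only."""
    try:
        return str(int(value))
    except (TypeError, ValueError) as exc:
        raise SQLExportError(f"cannot convert {value!r} to {sql_type_def}") from exc


def generate_insert_statement(table, row, column_types):
    """Build one INSERT statement; column_types maps column name to SQL type."""
    values = []
    for column, sql_type_def in column_types.items():
        try:
            values.append(format_sql_value(row.get(column), sql_type_def))
        except SQLExportError as exc:
            # Re-raise with the column and the full row for easier debugging.
            raise SQLExportError(f"column {column!r}: {exc} (row: {row!r})") from exc

    column_list = ", ".join(f'"{name}"' for name in column_types)
    value_list = ", ".join(values)
    return f'INSERT INTO "{table}" ({column_list}) VALUES ({value_list});'
```

Chaining with `from exc` keeps the original conversion error in the traceback while the new message carries the row context.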

Changes to run_sql_export:

  • Added a conversion_errors counter initialized to 0
  • Wrapped the main processing loop in a try...except block to catch SQLExportError
  • When an error is caught: logs the error with details, increments the error counter, and continues to the next row
  • After the loop, checks if any errors occurred
  • If conversion_errors > 0: logs a final error message, deletes the incomplete SQL file, and returns False
  • Only reports success if the export completed without errors

This ensures strict validation of data conversions while providing comprehensive error reporting and cleanup.
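
The count-then-clean-up flow in run_sql_export can be sketched as follows. This is an illustration under stated assumptions: the real function's parameters are not shown in this PR, so rows, output_path, and the build_statement callable are hypothetical, as is the logger name.

```python
import logging
from pathlib import Path

logger = logging.getLogger("sql_export_sketch")  # hypothetical logger name


class SQLExportError(Exception):
    """Raised when a value cannot be safely exported for its column."""


def run_sql_export(rows, output_path, build_statement):
    """Write one statement per row; return False and remove the file on errors."""
    output_path = Path(output_path)
    conversion_errors = 0

    with output_path.open("w", encoding="utf-8") as sql_file:
        for row_number, row in enumerate(rows, start=1):
            try:
                sql_file.write(build_statement(row) + "\n")
            except SQLExportError as exc:
                # Log, count, and move on so every bad row gets reported.
                conversion_errors += 1
                logger.error("Row %d: %s", row_number, exc)

    if conversion_errors > 0:
        logger.error("SQL export failed: %d conversion error(s)", conversion_errors)
        output_path.unlink(missing_ok=True)  # do not leave a partial file behind
        return False
    return True
```

Counting errors rather than stopping at the first one means a single run reports every offending row, at the cost of processing the whole dataset before failing.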

  • Affected modules: sql_exporter.py
  • Risks/side effects:
    • Exports that previously appeared to succeed but contained conversion errors will now correctly fail and delete the incomplete file
    • Existing workflows or scripts relying on the previous (incorrect) behavior may need adjustment to handle failures appropriately
    • Performance impact: minimal (only additional validation checks and error tracking)
    • The stricter validation may surface previously hidden data quality issues in source datasets

🧪 Tests

Manual validation procedure:

  1. Prepare a test dataset with a NOT NULL integer column containing the value 'N/A'
  2. Run --sql-export on this dataset
  3. Verify the command returns a non-zero exit code (failure)
  4. Verify an error log message identifies the conversion failure with row details
  5. Verify that no SQL file, not even an incomplete one, remains in the output directory
  6. Prepare a valid dataset with all properly formatted values matching column types
  7. Run --sql-export on the valid dataset
  8. Expected result: Command returns zero exit code (success), produces a valid SQL file, no error logs
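
Steps 3 and 5 of the procedure above could be scripted along these lines. This is only a sketch: the helper name export_failed_cleanly is hypothetical, the command to run is left to the caller, and the assumption that exported files end in .sql is not confirmed by this PR.

```python
import subprocess
from pathlib import Path


def export_failed_cleanly(command, output_dir):
    """Run `command`; True if it exits non-zero and leaves no .sql file behind."""
    result = subprocess.run(command, capture_output=True, text=True)
    leftovers = list(Path(output_dir).glob("*.sql"))  # assumed file extension
    return result.returncode != 0 and not leftovers
```

Pass the full --sql-export invocation used in your environment as a list of arguments, together with the directory where the SQL file would normally be written.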

🔗 Links

Closes #63


✅ Project checklist

  • All my commits are signed (GPG/SSH) in accordance with the contribution guide
  • lint passes locally and in CI
  • No sensitive data exposed (credentials, tokens)
  • Manual validation completed with both valid and invalid datasets
  • Security/performance impacts have been reviewed
  • I have described a clear manual validation procedure

The --sql-export command was displaying a misleading success message even when data conversion errors occurred. This was because the `format_sql_value` function would log a warning and return NULL on conversion failure, but it did not propagate the error.

This commit makes the data conversion stricter by raising an `SQLExportError` when a conversion for a NOT NULL column fails. The `run_sql_export` function now handles this exception, counts the errors, and returns `False` if any errors occurred. It also deletes the incomplete SQL file to avoid leaving invalid artifacts.

This ensures that the SQL export process provides accurate feedback and only reports success when the export is actually successful.
@monozoide monozoide self-assigned this Oct 17, 2025
@monozoide monozoide added the bug Something isn't working label Oct 17, 2025
@monozoide monozoide moved this from Todo to In progress in MailLogSentinel Roadmap Oct 17, 2025
@monozoide monozoide added this to the v5.15.3 milestone Oct 17, 2025
@monozoide monozoide linked an issue Oct 17, 2025 that may be closed by this pull request
4 tasks
Tests now expect SQLExportError when None is provided for NOT NULL columns. Updated assertions to match new SQL statement formatting with quoted column names. This improves test accuracy and enforces stricter validation in SQL export logic.
@monozoide monozoide merged commit daa0c8f into dev Oct 17, 2025
4 checks passed
@github-project-automation github-project-automation bot moved this from In progress to Done in MailLogSentinel Roadmap Oct 17, 2025
@monozoide monozoide deleted the fix/63-sql-export-status-check branch October 17, 2025 14:15
monozoide added a commit that referenced this pull request Oct 17, 2025
* workflows/update-gitignore-and-create-mls-ci #34 (#38)

* workflows/update-gitignore #34

Updated .gitignore to allow .log files in the repository for test data.

* workflow/add-gh-actions-workflow #34

Introduces a CI workflow that runs linting and tests on code changes to the main and dev branches, while skipping these steps for documentation-only changes. This setup uses flake8 for linting and pytest for testing, and optimizes CI runs by detecting code vs. docs changes.

* docs: Add sample email report and log files to dataset #31 (#39)

Added sample_email_report_output.txt, sample_mail.log, and sample_sasl.log to docs/dataset for documentation and testing purposes. These files provide example outputs and logs for MailLogSentinel.

* chore: add standardized PR templates for all contribution types (#35) (#42)

* Add PR templates

Introduces standardized pull request templates for bugfixes, code changes, documentation, CI/CD, and features in the .github/PR_TEMPLATES directory. These templates help ensure consistent and thorough PR descriptions, validation steps, and project checklists across different types of contributions.

* Delete PULL_REQUEST_TEMPLATE.md #35

Deleted the .github/PULL_REQUEST_TEMPLATE.md file. This change removes the default template for new pull requests.

* Revamp README with clearer setup and feature guide #32 (#43)

* Revamp README with clearer setup and feature guide #32

The README has been rewritten for clarity and conciseness, featuring a new quick start guide, clearer prerequisites, simplified command references, and improved documentation links. The overview, installation, and usage instructions are now more accessible, and advanced features are summarized with direct links to the Wiki. The new format is more user-friendly for first-time users and contributors.

* Update README links and formatting #32

Corrected documentation and sample output links, updated the contributing guide URL, and improved formatting for the closing quote in the README.

* Fix relative link to sample email report in README #32

Updated the link to the daily email report example to use the correct relative path, ensuring the documentation points to the right file location.

* ci: fix path filter for docs-only changes #44 (#45)

Enhanced the GitHub Actions workflow to better distinguish between code and documentation changes using separate path filters for pull requests and pushes. Updated the lint job to use Python 3.11 and ruff instead of flake8, and improved dependency installation for both lint and test jobs. The workflow now supports a fast path for documentation-only changes, skipping unnecessary jobs.

* Revise and expand contributing guidelines #33 (#46)

* Revise and expand contributing guidelines #33

Updated CONTRIBUTING.md with clearer, more structured quick-start instructions and recommendations. Added a new CONTRIBUTING_DETAILED.md file providing comprehensive workflow, commit signing, quality standards, and contribution requirements to help contributors follow best practices.

* Fix relative links in contributing docs #33

Updated relative paths in CONTRIBUTING.md and CONTRIBUTING_DETAILED.md to ensure links to detailed guidelines and discussions work correctly.

Closes #33

* Revise and condense maillogsentinel man page #47 (#49)

The man page for maillogsentinel was rewritten for clarity, brevity, and improved structure. Redundant and verbose sections were condensed, option descriptions were clarified, and auxiliary tool documentation was streamlined. The new version emphasizes practical usage, configuration, diagnostics, and security best practices, while removing excessive detail and outdated formatting.

* Add manpages for ipinfo and log_anonymizer #48 (#50)

Introduces manual pages for the ipinfo and log_anonymizer command-line tools, providing usage instructions, options, examples, and related information for system administrators.

* Add comprehensive FAQ documentation (#51)

Introduces a detailed FAQ (docs/wiki/FAQ.md) covering installation, configuration, usage, maintenance, integrations, troubleshooting, data analysis, security, and development for MailLogSentinel. This resource aims to assist users and contributors with common questions and operational guidance.

* Update documentation links and add manual pages #52 (#53)

Adjusted wiki links to use correct relative paths, added FAQ link, and included references to manual pages for maillogsentinel, ipinfo, and log_anonymizer in the README.

* Create readable markdown versions of manpages #54 (#55)

Introduces manual pages in Markdown format for the ipinfo, log_anonymizer, and maillogsentinel utilities. These documents provide usage instructions, options, configuration details, examples, and security considerations for each tool as part of the MailLogSentinel project.

* Add Debian install guide for MailLogSentinel #21 (#56)

Introduces a comprehensive installation and configuration guide for MailLogSentinel on Debian 12/13. The guide covers prerequisites, system preparation, installation steps, configuration, verification, service and timer setup, advanced options, troubleshooting, security, and additional resources.

* Fix linting errors in CI workflow #58 (#59)

* Fix linting errors in CI workflow #58

This commit fixes a number of linting errors that were causing the CI workflow to fail. The errors were primarily related to unused imports, f-strings without placeholders, and unused variables.

* Refactor and clean up test code #58

Removed unused imports and variables in test_maillogsentinel_setup.py and test_sql_exporter.py. Updated test_run_sql_export_basic_flow to use context manager for patching datetime and simplified the mocking of the logger. These changes improve test clarity and maintainability.

* Remove duplicate unittest.mock import #58

Consolidated the import of patch and MagicMock from unittest.mock to avoid redundancy in the test file.

* Remove unused MagicMock import #58

Cleaned up the import statements by removing MagicMock, which was not used in the test file.

* Fix Python 3.13 compatibility with pathlib #61 (#62)

Refactored the SQL import/export functionality to use `importlib.resources.as_file` instead of the deprecated `pathlib.Path` context manager. This resolves a crash on Python 3.13, where `pathlib.Path` objects no longer support the context manager protocol.

close #61

* Fix: SQL export reports success on failure #63 (#64)

* Fix: SQL export reports success on failure #63

The --sql-export command was displaying a misleading success message even when data conversion errors occurred. This was because the `format_sql_value` function would log a warning and return NULL on conversion failure, but it did not propagate the error.

This commit makes the data conversion stricter by raising an `SQLExportError` when a conversion for a NOT NULL column fails. The `run_sql_export` function now handles this exception, counts the errors, and returns `False` if any errors occurred. It also deletes the incomplete SQL file to avoid leaving invalid artifacts.

This ensures that the SQL export process provides accurate feedback and only reports success when the export is actually successful.

* Update SQL export tests #63

Tests now expect SQLExportError when None is provided for NOT NULL columns. Updated assertions to match new SQL statement formatting with quoted column names. This improves test accuracy and enforces stricter validation in SQL export logic.

Closes #63

* Improve NULL handling for NOT NULL integer columns Fix #65 (#66)

Refines the logic in format_sql_value to treat columns as nullable only if 'NOT NULL' is absent from the SQL type definition. Adds stricter validation to prevent empty strings from being converted to integers for NOT NULL columns, and introduces a new test case to verify data conversion failure for NOT NULL integer fields.