Skip to content

Conversation

ghanse
Copy link
Collaborator

@ghanse ghanse commented Oct 15, 2025

Changes

Applies formatting and adds type hints to the following modules:

  • data_analyzer.py
  • text_generation_plugins.py

Linked issues

N/A

Requirements

  • manually tested
  • updated documentation
  • updated demos
  • updated tests

@ghanse ghanse requested review from a team as code owners October 15, 2025 22:31
@ghanse ghanse requested review from akshayamin, howardwu-db, nfx and renardeinside and removed request for a team October 15, 2025 22:31
@ghanse ghanse changed the title Format data_analyzer, text_generation_plugins Format data_analyzer and text_generation_plugins Oct 15, 2025
@ghanse ghanse self-assigned this Oct 15, 2025
Copy link
Collaborator

@howardwu-db howardwu-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff, just some typos and maybe consistency issue with type checking!

:param dfData: Source ``DataFrame`` to summarize
:param rowLimit: Number of rows to use for ``DataFrame`` summarization
:param dfSummary: Summary metrics ``DataFrame``
:return: Summary metrics ``DataFrame`` with the added measure
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returns plural here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed ✅

:param mname: Method name to invoke
:param args: Positional argumentss to pass to the Faker text generation method
:param _lib: Optional import alias of Faker library (dfault is ``"faker"``)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spelling "default"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed ✅

result = f"""minValue={minValue}, maxValue={maxValue}, step=0.1"""
elif sqlType == TimestampType():

elif isinstance(sqlType, types.DecimalType):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we keep this as isinstance or replace with ==. If we want to have this take into account inheritance should we make the other typechecks also isinstance rather than ==?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed the input is the actual instance types.DecimalType(), so we need to use equality and we'll need to keep the heuristic minimum and maximum values.

@ghanse ghanse requested a review from howardwu-db October 20, 2025 02:47
@ghanse ghanse merged commit e664451 into master Oct 20, 2025
3 checks passed
@ghanse ghanse deleted the fmt_modules_002 branch October 20, 2025 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants