Consolidate dataframe examples (#18142) #18862
High-Level Overview
This PR consolidates all cargo run --example dataframe -- [dataframe|default_column_values|deserialize_to_struct]
alamb
left a comment
Thank you @cj-zhukov and @martin-g
I have a suggestion about moving one of these examples to another file -- let me know if it makes sense
///
/// The metadata-based approach provides a flexible way to store default values as strings
/// and cast them to the appropriate types at query time.
pub async fn default_column_values() -> Result<()> {
I am not quite sure that this is a "dataframe" example -- it does use the DataFrame API but I think it is really illustrating how to provide a custom data source that fills in default column values
I suggest we move it to https://github.com/apache/datafusion/tree/main/datafusion-examples/examples/custom_data_source instead
Agree with you, let's move the example
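For readers following this thread, the metadata-based approach described in the doc comment above can be pictured roughly as follows. This is a std-only sketch, not the code from the PR: the `Field`, `ColumnType`, and `Value` types and the `example.default_value` metadata key are all hypothetical stand-ins, and the real example works against DataFusion and Arrow types instead.

```rust
use std::collections::HashMap;

/// Hypothetical metadata key under which a column's default is stored as a string.
pub const DEFAULT_KEY: &str = "example.default_value";

/// Hypothetical, simplified column types (stand-ins for Arrow data types).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ColumnType {
    Int64,
    Float64,
    Utf8,
}

/// Hypothetical typed value produced when a default is materialized.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int64(i64),
    Float64(f64),
    Utf8(String),
    Null,
}

/// Hypothetical field description carrying string metadata.
pub struct Field {
    pub name: String,
    pub column_type: ColumnType,
    pub metadata: HashMap<String, String>,
}

impl Field {
    /// Builds a field, optionally recording a default value as a string.
    pub fn new(name: &str, column_type: ColumnType, default: Option<&str>) -> Self {
        let mut metadata = HashMap::new();
        if let Some(d) = default {
            metadata.insert(DEFAULT_KEY.to_string(), d.to_string());
        }
        Field { name: name.to_string(), column_type, metadata }
    }
}

/// Casts the stored string default to the field's type.
/// Returns `Value::Null` when no default is recorded.
pub fn default_for(field: &Field) -> Result<Value, String> {
    let raw = match field.metadata.get(DEFAULT_KEY) {
        Some(raw) => raw,
        None => return Ok(Value::Null),
    };
    match field.column_type {
        ColumnType::Int64 => raw
            .parse::<i64>()
            .map(Value::Int64)
            .map_err(|e| format!("column {}: {}", field.name, e)),
        ColumnType::Float64 => raw
            .parse::<f64>()
            .map(Value::Float64)
            .map_err(|e| format!("column {}: {}", field.name, e)),
        ColumnType::Utf8 => Ok(Value::Utf8(raw.clone())),
    }
}
```

Storing the default as a string keeps the metadata format uniform across column types; the cost is that a malformed default only surfaces as an error when it is cast.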
let usage = format!(
    "Usage: cargo run --example {} -- [{}]",
    ExampleKind::EXAMPLE_NAME,
    ExampleKind::variants().join("|")
BTW for some reason cargo run --example dataframe doesn't show me the valid options, it only tells me the argument is missing 🤔
Usage: cargo run --example dataframe -- [dataframe|default_column_values|deserialize_to_struct]
Error: Execution("Missing argument")
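To make the output above easier to follow, here is a minimal sketch of how such a dispatcher might produce it. Only `ExampleKind::EXAMPLE_NAME`, `ExampleKind::variants()`, and the usage format string appear in the diff; the enum variants, `usage()`, the `FromStr` impl, `select_example`, and the use of `String` as the error type are assumptions made for illustration (the real code reports a DataFusion `Execution` error).

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the PR's `ExampleKind`; its real definition is not shown here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExampleKind {
    Dataframe,
    DefaultColumnValues,
    DeserializeToStruct,
}

impl ExampleKind {
    pub const EXAMPLE_NAME: &'static str = "dataframe";

    /// Names accepted on the command line.
    pub fn variants() -> Vec<&'static str> {
        vec!["dataframe", "default_column_values", "deserialize_to_struct"]
    }

    /// Same format string as in the diff above.
    pub fn usage() -> String {
        format!(
            "Usage: cargo run --example {} -- [{}]",
            Self::EXAMPLE_NAME,
            Self::variants().join("|")
        )
    }
}

impl FromStr for ExampleKind {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "dataframe" => Ok(Self::Dataframe),
            "default_column_values" => Ok(Self::DefaultColumnValues),
            "deserialize_to_struct" => Ok(Self::DeserializeToStruct),
            other => Err(format!("Unknown example: {}", other)),
        }
    }
}

/// Picks the example from the CLI arguments (program name already skipped).
/// On a missing argument, prints the usage line to stderr and returns an error,
/// so the usage line appears before the error message.
pub fn select_example(
    mut args: impl Iterator<Item = String>,
) -> Result<ExampleKind, String> {
    let arg = args.next().ok_or_else(|| {
        eprintln!("{}", ExampleKind::usage());
        "Missing argument".to_string()
    })?;
    arg.parse()
}
```

A `main` function would call `select_example(std::env::args().skip(1))` and match on the result. Because the usage line is printed first and the error is reported last, the valid options are easy to overlook, which matches what is described above.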
The options are listed here: '-- [dataframe|default_column_values|deserialize_to_struct]'
Maybe another format should be used...
Thanks! I see what you mean - while the valid options are included in the usage line, they are not very visible, especially since they appear only after the “Missing argument” error.
I agree the formatting could be improved to make the valid options clearer. We could print them separately (as a list) or improve the usage string so it’s easier to read.
I'm happy to adjust the formatting if you have a preferred style, or I can propose one:
Usage:
cargo run --example dataframe <EXAMPLE>
Examples:
dataframe
default_column_values
deserialize_to_struct
I think I just missed the options in my haste. Maybe we can improve the formatting for these errors in a follow-on PR.
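If the list-style format proposed earlier in this thread is picked up in a follow-on PR, it could be generated along these lines. This is only a sketch: `format_usage` is a hypothetical helper, and the two-space indentation is an assumption, since the proposal as quoted here shows no indentation.

```rust
/// Hypothetical helper that renders the list-style usage message.
/// `example_name` is the cargo example binary; `variants` are the accepted names.
pub fn format_usage(example_name: &str, variants: &[&str]) -> String {
    let mut out = String::new();
    out.push_str("Usage:\n");
    out.push_str(&format!("  cargo run --example {} <EXAMPLE>\n", example_name));
    out.push_str("Examples:\n");
    // One accepted name per line, so the options are easy to scan.
    for v in variants {
        out.push_str(&format!("  {}\n", v));
    }
    out
}
```

Calling `format_usage("dataframe", &["dataframe", "default_column_values", "deserialize_to_struct"])` yields the six-line message from the proposal, one option per line.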
Which issue does this PR close?
Rationale for this change
This PR consolidates all the dataframe examples (dataframe, default_column_values, deserialize_to_struct) into a single example binary. We have agreed on the pattern and can apply it to the remaining examples.
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?