Document the data options #1539

Azaya89 · 2025-04-04T17:35:44Z

This PR documents the Data Options in the new API reference guide.

maximlt · 2025-04-07T09:53:48Z

doc/ref/plot_options/data.ipynb

@@ -19,7 +19,7 @@
   "source": [
    "## `by`\n",
    "\n",
-    "Text TBD."
+    "The `by` option allows you to group your data based on one or more categorical variables. By specifying a column name (or a list of column names) with `by`, the plot automatically separates the data into groups. This makes it easier to compare different subsets of your data in a single visualization. For instance, in the penguin dataset, grouping by 'species' column creates separate overlays (or subplots when using `subplots=True`) for each species."


Let's indicate that an `NdOverlay` is returned by default and an `NdLayout` when `subplots=True`. `NdOverlay` and `NdLayout` should link to their pages in the HoloViews documentation using intersphinx cross-references.

maximlt

@Azaya89 I had a quick look and that's a great start!! The documentation is going to be so much better with that new content, really looking forward to it. One comment is that the page is tabular-centric so far. Instead of the term "column", it may be best to use a more generic term like "variable" or "dimension". Having some xarray examples would be nice too. Other than that, I've left a few comments.

maximlt · 2025-04-07T09:54:56Z

doc/ref/plot_options/data.ipynb

@@ -19,7 +19,7 @@
   "source": [
    "## `by`\n",
    "\n",
-    "Text TBD."
+    "The `by` option allows you to group your data based on one or more categorical variables. By specifying a column name (or a list of column names) with `by`, the plot automatically separates the data into groups. This makes it easier to compare different subsets of your data in a single visualization. For instance, in the penguin dataset, grouping by 'species' column creates separate overlays (or subplots when using `subplots=True`) for each species."


hvPlot users are often confused by `by`, `groupby`, and `color`. Let's keep in mind that the reference should make clear how they differ (I also think we will have specific how-tos for that).

maximlt · 2025-04-07T09:56:11Z

doc/ref/plot_options/data.ipynb

+    "\n",
+    "df = hvsampledata.penguins(\"pandas\")\n",
+    "\n",
+    "df.hvplot.scatter(x='bill_length_mm', y='bill_depth_mm', by='species', subplots=True).cols(1)"


I'd say either add a note to explain cols(1) or keep it simple without calling it.

OK. Chose to remove it instead, heh.

maximlt · 2025-04-07T10:00:24Z

doc/ref/plot_options/data.ipynb

+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The `fields` option lets you rename or transform your dataset’s dimensions before plotting. If your data contains column names that aren’t descriptive or need minor adjustments for clarity, you can use `fields` to rename them or apply simple transformations. This can help to make your plots more understandable and tailored to your needs."


What sort of transformation can one do?

I looked at the `converter.py` file and it seems you can also assign metadata. Will update that section.

maximlt · 2025-04-07T10:10:32Z

doc/ref/plot_options/data.ipynb

+   "metadata": {},
+   "source": [
+    "## `kind`\n",
+    "The kind option determines the type of plot to generate from your data. By specifying a plot kind (such as ‘line’, ‘scatter’, or ‘bar’), you tell hvPlot which plot to create. The default is 'line', which generates a line plot. Changing the `kind` parameter lets you quickly experiment with different visual representations without altering your data."


`'line'` is the default for tabular data but not for xarray.

Oh? Will look into it then...

maximlt · 2025-04-07T10:13:34Z

doc/ref/plot_options/data.ipynb

+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The label option allows you to specify a custom name for your dataset that appears in the plot title or legend."


There's the `title` plot option that has a somewhat similar effect. It'd be nice to explain when `label` should be used over `title`.

I'm not sure yet what the answer to this is. Will look into it more...

Azaya89 · 2025-04-11T17:51:22Z

doc/ref/plot_options/data.ipynb

@@ -19,7 +17,7 @@
   "source": [
    "## `by`\n",
    "\n",
-    "Text TBD."
+    "The `by` option allows you to group your data based on one or more categorical variables. By specifying a dimension name (or a list of dimension names) with `by`, the plot automatically separates the data into groups, making it easier to compare different subsets in a single visualization. By default, an :class:holoviews.NdOverlay is returned, overlaying all groups in one plot. However, when you set `subplots=True`, a :class:holoviews.NdLayout is returned instead, arranging the groups as separate subplots."


The `:class:holoviews....` intersphinx role did not render correctly in my local build, so I'm not sure I wrote it correctly.

Azaya89 · 2025-04-11T19:22:38Z

doc/ref/plot_options/data.ipynb

+   "outputs": [],
+   "source": [
+    "import hvplot.pandas  # noqa\n",
+    "from bokeh.sampledata.sea_surface_temperature import sea_surface_temperature as sst\n",


This seemed like a good enough dataset for this example, but I'm open to using something else from hvsampledata.

Azaya89 added 3 commits April 4, 2025 18:29

initial work on data options

b1a6974

update 'group_label' docstring

8e448fc

minor code formatting

76556d9

Azaya89 self-assigned this Apr 4, 2025

Azaya89 added the NF SDG 2025 label (NumFocus Software Development Grant 2025) Apr 4, 2025

maximlt reviewed Apr 7, 2025

Azaya89 added 8 commits April 7, 2025 20:00

add more data options

0d20ac7

add width to subplots

be90476

add final options

64ae1c4

remove redundant docstring

af02b7e

moved 'robust' to style options

3cea0a5

self-review

fc5e73b

Merge branch 'main' into data-op

b9b8748

add robust to the correct special list

8741a1e

Azaya89 marked this pull request as ready for review April 11, 2025 20:15

Azaya89 requested a review from maximlt April 11, 2025 20:15

Azaya89 commented Apr 11, 2025
