wrong plot for geom_path, when using color and group #935

Open
szst11 opened this issue Apr 25, 2025 · 7 comments

Comments

@szst11
szst11 commented Apr 25, 2025

When I use a table like this:

Group1  Group2  Data1  Data2
A       Y       0      0
A       X       1      0
A       Y       2      0
A       X       3      0
A       X       4      0
B       Y       5      1
B       Y       6      1
B       X       7      1
B       X       8      1
B       X       9      1

and plot that with:

from plotnine import ggplot, aes, geom_path

# df holds the table above as a pandas DataFrame
chart = (
    ggplot(
        df,
        aes(
            x='Data1',
            y='Data2',
            color="Group1",
            group="Group2",
        ),
    )
    + geom_path()
)

I get a graph in which Data2 for A is not always 0:
(Uploading the PNG does not work for me, but hopefully the link does.)

Marimo notebook

@TyberiusPrime
Contributor

Your group and color aesthetics aren't mapped to the same column. If you add a geom_point, this becomes visible. What color should the line connecting a red and a blue dot have?

@szst11
Author

szst11 commented Apr 25, 2025

Hi @TyberiusPrime ,
I added a plotly graph to the notebook to visualise the expected behaviour.
new Notebook

The goal is for group and color to differ, so that the line is split into separate sections (each of which should still be drawn as a line).
There should not be a connection between the A(always 0) and B(always 1) lines.

@TyberiusPrime
Contributor

I think what you need is to have one distinct group2 value per line you want drawn.

Plotnine will connect all points within one group.

I have no clue what plotly is doing - possibly drawing one line per group & color combination?

@szst11
Author

szst11 commented Apr 25, 2025

@TyberiusPrime
yes, one line per group & color combination would also be my expectation.

@TyberiusPrime
Contributor

I imagine plotnine does the same as ggplot2 does.

Then you'll need to pass in Group1+Group2 for the group aesthetic.

That works more or less incidentally, since your columns are string-typed.

Maybe @has2k1 can chime in whether there's an explicit 'interaction of these levels' operator one would use here.
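
A minimal sketch of the suggestion above, assuming the table from the original post is loaded as a pandas DataFrame. The column name `line_id` and the helper `make_chart` are made up for illustration, and the string concatenation only works because both grouping columns hold strings.

```python
import pandas as pd

# Table from the original post
df = pd.DataFrame({
    "Group1": list("AAAAABBBBB"),
    "Group2": list("YXYXXYYXXX"),
    "Data1": range(10),
    "Data2": [0] * 5 + [1] * 5,
})

# One distinct value per (Group1, Group2) combination.
# "line_id" is a hypothetical column name.
df["line_id"] = df["Group1"] + "_" + df["Group2"]


def make_chart(data):
    # plotnine is imported lazily so the data preparation above
    # can be inspected without it installed.
    from plotnine import ggplot, aes, geom_path

    return (
        ggplot(data, aes(x="Data1", y="Data2", color="Group1", group="line_id"))
        + geom_path()
    )
```

Each path then contains only rows that share both Group1 and Group2, so no segment can run from the A rows to the B rows. Points of one combination are still joined even when they are not adjacent in the table (for example, A_Y joins Data1 = 0 and Data1 = 2).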

@szst11
Author

szst11 commented Apr 26, 2025

I added a hopefully more precise example in that Notebook (now with the plot):
No datapoint of a is in the high value-range.

Image

@has2k1
Owner

has2k1 commented May 2, 2025

> There should not be a connection between the A(always 0) and B(always 1) lines.

In plotnine, there should be a connection because the group aesthetic is special. It does not interact with any other aesthetic. All points that are mapped to the same group will be part of the same path. If you do not map to group, it is derived internally from the interaction of the discrete aesthetic mappings.
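
To make the difference concrete, here is a rough pandas sketch (not plotnine's internal code) that lists which points end up in each path under the two groupings discussed in this thread. The helper `paths_by` is hypothetical, and the "derived" case is only an approximation of the interaction described above.

```python
import pandas as pd

# Table from the original post
df = pd.DataFrame({
    "Group1": list("AAAAABBBBB"),
    "Group2": list("YXYXXYYXXX"),
    "Data1": range(10),
    "Data2": [0] * 5 + [1] * 5,
})


def paths_by(data, keys):
    """Return {group key: [(Data1, Data2), ...]} in row order."""
    return {
        key: list(zip(part["Data1"].tolist(), part["Data2"].tolist()))
        for key, part in data.groupby(keys)
    }


# Explicit group="Group2": colour plays no role in forming paths.
explicit = paths_by(df, "Group2")

# No group mapping: approximated as the interaction of the
# discrete columns that are mapped to aesthetics.
derived = paths_by(df, ["Group1", "Group2"])
```

In `explicit`, the "X" path runs from (4, 0) to (7, 1), which is the connecting segment in the original plot. In `derived`, every path stays at a single Data2 value.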

import pandas as pd
from plotnine import *

# Dataframe from original query
df1 = pd.DataFrame({
  "Group1": ["A","A", "A", "A", "A", "B", "B", "B", "B", "B"],
  "Group2": ["Y", "X", "Y", "X", "X", "Y", "Y", "X", "X", "X"],
  "Data1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
  "Data2": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
})

# Simpler dataframe for demonstration
df2 = pd.DataFrame({
    "x": range(5),
    "y": range(5),
    "g1": list("abcde"),
    "g2": "R",
    "g3": list("XXXYY")
})

If you map to some other aesthetic, then you get the interaction you "expect".

(
    ggplot(df1, aes('Data1', 'Data2', color="Group1", shape="Group2"))
    + geom_path(size=2)
    + geom_point(size=2, color="black")
)

Image

This is a valid expectation because we can account for all aesthetics. The colour of any segment within the path is determined by the colour at the starting point.

# All points have different colours, but belong to same group
(
    ggplot(df2, aes("x", "y", color="g1", group="g2"))
    + geom_path(size=2)
    + geom_point(size=2)
)

Image

# All points have different colours, but belong to two groups, so the path is disjoint
(
    ggplot(df2, aes("x", "y", color="g1", group="g3"))
    + geom_path(size=2)
    + geom_point(size=2)
)

Image

However, it is due to a limitation of the underlying graphics device that points with different colours get a segment that is a single colour. Ideally it should be a linear gradient from one color to the next.
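
The colouring rule described above can be pictured with a small sketch. It only illustrates the rule as stated in this comment and is not plotnine's drawing code; `segments_with_colour` is a made-up helper.

```python
import pandas as pd

# Simpler dataframe from the demonstration (only the columns used here)
df2 = pd.DataFrame({
    "x": range(5),
    "y": range(5),
    "g1": list("abcde"),
    "g3": list("XXXYY"),
})


def segments_with_colour(data, group, colour):
    """List (start, end, colour value) for every segment, group by group."""
    out = []
    for _, part in data.groupby(group, sort=False):
        pts = list(zip(part["x"].tolist(), part["y"].tolist()))
        cols = part[colour].tolist()
        # A segment joins two consecutive points of the same group and
        # takes the colour value of its starting point.
        for start, end, col in zip(pts, pts[1:], cols):
            out.append((start, end, col))
    return out


segs = segments_with_colour(df2, group="g3", colour="g1")
```

With `group="g3"` there is no segment between (2, 2) and (3, 3), and the colour values "c" and "e" never reach a segment because those points end their paths; they only show up on the points themselves.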
