docs: adding translation stats to docs #511

Open
wants to merge 8 commits into base: main

Conversation

RobPasMue
Contributor

Adding plotly as a dependency of this package to provide nice stats graphs.

See the rendered docs for a demonstration, but here is a snapshot:

image

@RobPasMue
Contributor Author

This PR is related to #493 -- second step declared in #493 (comment)

@RobPasMue
Contributor Author

The solution is a Sphinx extension that reads the generated JSON data, builds a Plotly graph (with plotlyjs), and embeds it in the docs. Hovering over a bar shows the full statistics report for that locale and module; the report is rendered through the hover tooltip.
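A minimal sketch of the data side of such an extension, assuming a JSON layout of locale -> module -> message counts. The layout, the field names (`total`, `translated`, `fuzzy`, `untranslated`), and the sample numbers are hypothetical and are not taken from this PR; the strings built here are the kind of text a hover tooltip could carry.

```python
import json

# Hypothetical stats layout: locale -> module -> message counts.
# The numbers are made up for illustration.
SAMPLE_JSON = """
{
  "es": {"documentation": {"total": 40, "translated": 30, "fuzzy": 4, "untranslated": 6}},
  "ja": {"documentation": {"total": 40, "translated": 40, "fuzzy": 0, "untranslated": 0}}
}
"""


def percent_complete(counts):
    """Return the translated share of a module as a percentage (0 to 100)."""
    total = counts["total"]
    if total == 0:
        return 0.0
    return round(100.0 * counts["translated"] / total, 1)


def hover_text(locale, module, counts):
    """Build the tooltip string for one locale/module pair."""
    return (
        f"{locale} / {module}: {percent_complete(counts)}% translated, "
        f"{counts['fuzzy']} fuzzy, {counts['untranslated']} untranslated"
    )


stats = json.loads(SAMPLE_JSON)
for locale, modules in stats.items():
    for module, counts in modules.items():
        print(hover_text(locale, module, counts))
```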

Hope you like it! Looking forward to feedback related to it @lwasser!

@lwasser lwasser requested review from flpm and sneakers-the-rat May 28, 2025 16:20
@lwasser
Member

lwasser commented May 28, 2025

@RobPasMue this is awesome!! Here is my one question, and then I'll provide a suggestion, but let's see what @flpm and @sneakers-the-rat think! I suspect that because translating.md is NOT actually a part of our Sphinx guide, people won't be able to see the graphic that you made in the page (i could be wrong but it looks like it needs to render plotly).

So as a middle ground, could we do the following

  1. Could we export the plot as a PNG too, so we could drop it in the readme file?
  2. Could we make TRANSLATING.md a part of our Sphinx build and add a contributing tab?
  • That way the interactive plot gets rendered via Sphinx (and this page is no longer an "orphan").
  • We can also include a static image in our README file for drive-by contributors on GitHub.

y'all - let me know how that lands!

))

# Create figure
fig = go.Figure(data=traces)
Member

@RobPasMue could we create a grid of plots - one for each language?

then each plot could have 3 bars - one for fuzzy, one for complete and one for incomplete (or it could be stacked bars too).

What you have now is awesome but if we add more languages it will get complex over time. And a static version of the plot would be nice too.

Contributor Author

Sure sounds good!

Member

I think a good idea would be a heat map, which is condensed enough we can add many more languages.

Contributor

@sneakers-the-rat May 29, 2025

agreed on the heatmap. i would expect it oriented with languages as rows and pages as columns (which satisfies the need for expandability to future languages)


# Create figure
fig = go.Figure(data=traces)
fig.update_layout(
Member

Could the plot please use our pyOS colors?

Dark Purple: #33205c
Light Purple: #735fab
Pale Purple: #bab3d4
Magenta: #bb82b0
Sea Green: #81c0aa
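A minimal sketch of turning this palette into a colorscale, assuming the plot accepts Plotly's list-of-pairs form (`[[position, color], ...]` with positions from 0 to 1). Which palette colors to use and in what order is an illustrative choice, not something decided in this thread, and the helper name is hypothetical.

```python
# pyOS palette, copied from the comment above.
PYOS_COLORS = {
    "dark_purple": "#33205c",
    "light_purple": "#735fab",
    "pale_purple": "#bab3d4",
    "magenta": "#bb82b0",
    "sea_green": "#81c0aa",
}


def evenly_spaced_colorscale(colors):
    """Return [[position, color], ...] with positions spread evenly over 0..1."""
    if len(colors) < 2:
        raise ValueError("a colorscale needs at least two colors")
    step = 1 / (len(colors) - 1)
    return [[round(i * step, 4), color] for i, color in enumerate(colors)]


# Illustrative ordering: pale (little done) to sea green (complete).
colorscale = evenly_spaced_colorscale(
    [
        PYOS_COLORS["pale_purple"],
        PYOS_COLORS["light_purple"],
        PYOS_COLORS["sea_green"],
    ]
)
print(colorscale)
```

Keeping the hex values in one dictionary at least avoids repeating them across the plotting code.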

Contributor Author

Most def! =) I'll look into the colors as soon as I can

Contributor

oh whoops. just saw this comment. probably would be good to use the css variables directly when we can to avoid having them hardcoded in multiple places. i couldn't find a rhyme or reason to when i was able to use css vars in the plotly values and when i needed to declare them in the stylesheet, but ya some examples in this comment

Member

@lwasser left a comment

This looks so good - i just suggested a few changes. We could add translating, contributing etc to the guidebook as pages in another pr as well if we want to merge this. I just worry that the beautiful work done here won't render until we add these pages to the guide (i could be wrong!).

@RobPasMue
Contributor Author

I suspect that because translating.md is NOT actually a part of our Sphinx guide, people won't be able to see the graphic that you made in the page (i could be wrong but it looks like it needs to render plotly).

Hi @lwasser -- just to clarify, the translation page is available in our published docs: https://www.pyopensci.org/python-package-guide/TRANSLATING.html

I think we should just link it properly to the landing page =) so that it is no longer an orphan as you mentioned.

  1. Could we export the plot as a PNG too, so we could drop it in the readme file?

Nonetheless, this is possible if y'all prefer. Although my feeling is that only people interested in the translation would be curious about this information. So, IMO, the best location would be the TRANSLATING.md file. But I'm up for discussion! =)

@flpm
Member

flpm commented May 29, 2025

Wow, that's pretty neat! 🤩 I like the idea of having an interactive visualization that people can consult and the Translation guide feels like a natural place.

We should add a link to the live version on the site inside TRANSLATING.md. I imagine most users will look at the md in their own clones or on GitHub, and they will not see the chart at first. The link will be a quick way to get them to the real data.

On the visualization itself: I am not sure the bar chart is the best approach to show this data. In my head it feels more natural to imagine it as a heat map, where the rows are the files, the columns are the languages.

In the heat map each cell would show the % complete and be colored with the proper intensity. I think the main advantage of a heat map is that empty cells are clearly defined and will be easier to spot than missing bars.

I would also include English (hard coded at 100% in all cells), as a first column. And order the languages from most done to least done (currently, JA then ES).

But I have never used plotly so I am not sure how much work that would be, we could always have that as a future improvement in a separate issue.
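A minimal sketch of the data layout described above, assuming per-file completion percentages are already available per locale. The function name and the sample percentages are hypothetical. It puts files in rows and languages in columns, hard-codes the source language at 100% as the first column, and orders the remaining languages from most done to least done.

```python
def build_heatmap(percent_by_locale, source_locale="en"):
    """Return (files, locales, z) for a files-by-languages heat map.

    percent_by_locale maps locale -> {file name: percent complete}.
    """
    files = sorted(
        {name for per_file in percent_by_locale.values() for name in per_file}
    )

    def mean_percent(locale):
        # A file with no data counts as 0% when ranking languages.
        if not files:
            return 0.0
        per_file = percent_by_locale[locale]
        return sum(per_file.get(name, 0.0) for name in files) / len(files)

    # Most complete language first; ties fall back to alphabetical order.
    ordered = sorted(percent_by_locale, key=lambda loc: (-mean_percent(loc), loc))
    locales = [source_locale] + ordered

    # One row per file; the source language is hard-coded at 100%.
    # Missing data stays None so the cell can render as empty.
    z = [
        [100.0] + [percent_by_locale[loc].get(name) for loc in ordered]
        for name in files
    ]
    return files, locales, z


# Made-up percentages, for illustration only.
sample = {
    "es": {"index": 50.0, "tutorials": 20.0},
    "ja": {"index": 90.0},
}
files, locales, z = build_heatmap(sample)
print(locales)
for name, row in zip(files, z):
    print(name, row)
```

The three return values line up with the `y`, `x`, and `z` arguments of Plotly's `go.Heatmap`, so swapping rows and columns later is a matter of transposing `z`.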


```{translation-graph}
```

Member

Maybe here would be a good spot to include the link to the site

Contributor

seems like we would want both directions? from the contributing page here and vice versa?

@lwasser
Member

lwasser commented May 29, 2025

https://www.pyopensci.org/python-package-guide/TRANSLATING.html

My apologies, @RobPasMue you are correct, i saw that it was flagged orphan but now i realize it's just a matter of adding a link to it from the guide somewhere.

Please ignore my comment. I'll defer to @flpm for what the final plots look like!! I do want the ability to add more languages in the future. And the ability for users to easily identify gaps and determine where they can contribute (without needing to run a nox session).

Let's add a link to the translating page in a separate PR so you don't have to worry about it here and can focus on the data viz challenge!!

Contributor

@sneakers-the-rat left a comment

cool, ya nice, i needed a distraction today so i spent some time on the plot. haven't used plotly since it came out and whew somehow it became three different packages or something? anyway it's very slick. put my version of the plot in suggestion.


@sneakers-the-rat
Contributor

I guess the last thing to do would be accessibility. since we already have all the data when generating the plot, and we're generating an SVG, we might as well add alt or aria-label or whatever is appropriate for the cells.
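A minimal sketch of generating the label text, assuming the heat map data is held as a matrix with languages as rows and pages as columns. How the labels get attached to the rendered SVG (`alt`, `aria-label`, or something else) is left open here, and the function names and sample values are hypothetical.

```python
def cell_label(language, page, percent):
    """Return a screen-reader label for one heat map cell."""
    if percent is None:
        return f"{page}, {language}: no translation data"
    return f"{page}, {language}: {percent:.0f} percent translated"


def cell_labels(languages, pages, z):
    """Return labels in the same rows-by-columns shape as z (languages as rows)."""
    return [
        [cell_label(language, page, value) for page, value in zip(pages, row)]
        for language, row in zip(languages, z)
    ]


# Made-up values, for illustration only.
labels = cell_labels(
    ["es", "ja"],
    ["index", "tutorials"],
    [[50.0, 20.0], [90.0, None]],
)
for row in labels:
    print(row)
```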

Co-authored-by: Jonny Saunders <sneakers-the-rat@protonmail.com>
@RobPasMue
Contributor Author

Thank you for the heatmap implementation @sneakers-the-rat! If y'all like it the way it's rendering (@flpm, @lwasser) I can add the suggestions to the code and then all solved.

I will work on the auto-generation of the JSON info as part of #493 (comment) -- we can probably move away from having the persistent file and generate it on the fly. Although it would imply that every time someone wants to build the docs, they have to go through that step... which might be unnecessary. If we have a workflow that updates it regularly, I think we solve our problem -- but I am up for discussion.

Bottom line: I can have it implemented either way - either through a Sphinx hook at build time or as a GitHub actions workflow that updates it on a scheduled basis.

@sneakers-the-rat
Contributor

If we have a workflow that updates it regularly, I think we solve our problem

to be clear i'm good with whatever does something in this neighborhood! just as long as we avoid forgetting something and eventually realizing the wonderful colorful square box has been broken for awhile. cosplaying as ornery reviewer who refuses all nice tooling and docs and rehearses the edge cases.

Also sorry for making a code suggestion that was just like "here's a whole different thing," that is pretty rude of me. I was just having an afternoon where i needed a bit of time to distract myself and you had set up this great canvas in these two PRs, so thanks for that! you can feel free to take or leave any part of that.

And order the languages from most done to least done

rats, i did forget this. i like nice color sorting and it does make sense. it induces a tiny amount of gamification (the translation scoreboard!) which i think could be both cute and useful as a way of knowing where to target effort. there is a maybe remote chance that it makes someone feel bad or unwittingly participate in linguistic-cultural rivalries. So weigh that as an "i'm aware of this being possible and think it's worth raising but have no estimate of either likelihood or magnitude" vs a higher probability of useful information and tidier color gradients.

~ yielding da floor ~ thanks for ur patience

@RobPasMue
Contributor Author

Also sorry for making a code suggestion that was just like "here's a whole different thing," that is pretty rude of me. I was just having an afternoon where i needed a bit of time to distract myself and you had set up this great canvas in these two PRs, so thanks for that! you can feel free to take or leave any part of that.

Oh no, never apologize for that! I really appreciate that you took the time to come up with a whole new implementation of the heatmap! I'm really glad it helped you distract yourself. Coding is the perfect way to do it! =) It also helped me understand how to do it with a heatmap too! =)

to be clear i'm good with whatever does something in this neighborhood! just as long as we avoid forgetting something and eventually realizing the wonderful colorful square box has been broken for awhile. cosplaying as ornery reviewer who refuses all nice tooling and docs and rehearses the edge cases.

I fully agree - I'm fine with either option too! As long as the data keeps getting updated haha. Also suffered from this in the past.. so I know the feeling.

And order the languages from most done to least done

Regarding this last point - I understand all points as well. I just went on default sorting (alphabetical) but sorting based on completeness makes sense as well to reach out for help on those languages that are not fully there yet. But I'll let @lwasser comment on this last point too =)

and @sneakers-the-rat .. thanks for your thorough review. Once again, I really appreciate your efforts to read it through, provide code suggestions and help out on the implementation! =)

@flpm
Member

flpm commented May 30, 2025

I think it looks awesome! It will allow us to fit many languages into a relatively small screen space.

I think we should add the percentage in the cell. In addition to the "identify where the bigger gaps are" use case, there is also keeping the translation up to date as it ages, "spot where new gaps are appearing".

As the English text evolves, more and more entries will start to be marked fuzzy and the 100% will drift down to 99%, 98% etc. Without the numbers it will be hard to spot the small changes in the color.

Reviewing fuzzy entries in the .PO file is a very easy task too, so this will help new contributors find those opportunities.
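A small worked example of that drift, assuming fuzzy entries are not counted as translated (the sample counts are made up). The cell text rounds down, an illustrative choice so that anything short of complete never displays as 100%.

```python
import math


def percent_translated(total, fuzzy, untranslated):
    """Percent of entries that are translated and not marked fuzzy."""
    if total == 0:
        return 0.0
    translated = total - fuzzy - untranslated
    return 100.0 * translated / total


def cell_text(percent):
    """Text to print inside a heat map cell, rounded down to a whole percent."""
    return f"{math.floor(percent)}%"


# A fully translated file with 200 entries, before and after the
# English source changes and some entries get marked fuzzy.
for fuzzy in (0, 2, 4):
    print(fuzzy, "fuzzy ->", cell_text(percent_translated(200, fuzzy, 0)))
```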

@lwasser
Member

lwasser commented Jun 4, 2025

Hi Friends!! Gosh I love how this has evolved and all of the discussion. Thank you for all of the work on this.
I have a few cents of comments to add ✨

Here is our color palette:

Screenshot 2025-06-04 at 4 34 33 PM
  • What if we used a categorical color approach rather than a linear gradient? And then we had categories like:
    • 0-25% complete: white or yellow (which might look white to some, will stand out well and will look "empty") -- help!
    • partially done, 25-75% (light magenta -- getting there!)
    • fuzzy / almost there: light green (so close)
    • complete! Green, DONESO (we could make the green darker too, to make the light green less similar visually for accessibility)

That might make it easier to figure out what is done quickly.
We can also add numbers to each box for additional clarity, which is a fantastic idea, Felipe!
I just wonder if bins will be visually easier to quickly understand where we need help (with a prominent legend next to it)

Screenshot 2025-06-04 at 4 52 59 PM

I'm not married to these colors, but I do like green == done and simplifying the legend (which I think Jonny alluded to above)! And some version of white / light color to signify empty / help.

The light green and dark green might be too similar in my example, but I'm just thinking there is value in someone being able to glance and see what areas need help quickly!
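A minimal sketch of the binning, assuming the cut-offs sit at 25%, 75%, and 100%. The comment above does not pin down where "almost there" starts or which bin a boundary value belongs to, so those choices and the bin labels are assumptions.

```python
def completion_bin(percent):
    """Map a completion percentage to one of four categories."""
    if percent >= 100:
        return "complete"
    if percent >= 75:
        return "almost there"
    if percent >= 25:
        return "partially done"
    return "needs help"


# Made-up percentages, for illustration only.
for value in (0, 24.9, 25, 60, 75, 99, 100):
    print(value, "->", completion_bin(value))
```

With discrete bins the legend only needs four entries, and the exact percentage can still be printed in each cell.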

@RobPasMue
Contributor Author

@lwasser's approach sounds good to me! We can do a more step-wise distribution rather than gradient. I will add @sneakers-the-rat's suggestion and work shortly on the color palette. I will also add the numbers to the cells if possible =)

Thank you all for your contributions!

RobPasMue and others added 2 commits June 5, 2025 08:40
Co-authored-by: Jonny Saunders <sneakers-the-rat@protonmail.com>
@RobPasMue
Contributor Author

Had to go local and commit the suggestions myself because GitHub didn't allow me to apply them automatically (especially for the heatmap) 😢

Now that @sneakers-the-rat's code is in, I will continue to work on it shortly! Probably starting tomorrow :)

5 participants