Website value being deleted for some users in the contributors.yml file output #271

Open
lwasser opened this issue Mar 14, 2025 · 2 comments

Comments

@lwasser
Member

lwasser commented Mar 14, 2025

In the latest contributor run, I noticed that some users' websites were removed, but not all. I think this is a bug in our build.

@banesullivan
Contributor

I want to follow up here after seeing several of these over the last few weeks:

I'm not convinced there's anything wrong per se in pyosmeta that is causing websites to be removed. I think we're seeing intermittent failures from checking lots of websites at once, many of which are hosted on GitHub Pages. We may be hitting outages or small interruptions, or we could be getting rate limited in some cases (not sure).

I think it may be worth looking into whether there is a more robust mechanism for checking websites than the plain GET request we use today:

def check_url(url: str) -> bool:
    """Test url. Return true if there's a valid response, False if not

    Parameters
    ----------
    url : str
        String for a url to a website to test.
    """
    try:
        response = requests.get(url, timeout=6)
        return response.status_code == 200
    except Exception:  # pragma: no cover
        return False

One avenue is to look at what Sphinx has done for link checking, which has felt less flaky to me: https://github.com/sphinx-doc/sphinx/blob/master/sphinx/builders/linkcheck.py. From what I can tell, it is well thought out: it uses a queue and attempts to prevent rate limiting.
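To make the idea of a more robust check concrete, here is a minimal sketch of a retry-with-backoff variant. It is not how Sphinx's linkcheck or pyosmeta works. The names `check_url_with_retries`, `fetch_status`, and `RETRYABLE_STATUSES`, the retry count, and the choice of which status codes count as transient are all assumptions for illustration. It also uses the standard library's `urllib` rather than `requests` so that it runs without extra dependencies.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional

# Assumption: these statuses are treated as transient and worth retrying.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def fetch_status(url: str, timeout: float = 6) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return err.code
    except (OSError, ValueError):
        # Connection errors, timeouts, DNS failures, or a malformed URL.
        return None


def check_url_with_retries(
    url: str,
    attempts: int = 3,
    backoff: float = 1.0,
    fetch: Callable[[str], Optional[int]] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if url answers 200 within the allowed number of attempts."""
    for attempt in range(attempts):
        status = fetch(url)
        if status == 200:
            return True
        # A definitive answer such as 404 is unlikely to change: stop early.
        if status is not None and status not in RETRYABLE_STATUSES:
            return False
        # Wait longer before each retry (1x, 2x, 4x, ... the base backoff).
        if attempt < attempts - 1:
            sleep(backoff * 2**attempt)
    return False
```

With something like this, a single dropped connection or a brief 503 from GitHub Pages would no longer be enough to drop a contributor's website from the output. Passing `fetch` and `sleep` as parameters is only there to keep the function testable without network access.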

@lwasser changed the title from "Website being deleted for some users" to "Website value being deleted for some users in the contributors.yml file output" on Apr 29, 2025
@lwasser
Member Author

lwasser commented Apr 29, 2025

For documentation purposes: Bane and I discussed moving away from displaying the user's website on our contributor page and instead linking to their GitHub landing page, as that is where we are getting the information from anyway! So the next step here is simply to update the website repo so it creates a GitHub-based URL rather than using a user's website.

This will be easier to maintain in the long run, and we won't have to perform link checking here at all.
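For illustration only, building a GitHub-based URL could look roughly like the sketch below. The helper name `github_profile_url` is hypothetical, and the actual change would live in the website repo's templates, which may not be Python at all.

```python
def github_profile_url(github_username: str) -> str:
    """Build a GitHub profile URL from a username (hypothetical helper)."""
    # Tolerate stray whitespace and a leading "@" in the stored value.
    username = github_username.strip().lstrip("@")
    if not username:
        raise ValueError("A GitHub username is required to build the URL.")
    return f"https://github.com/{username}"


print(github_profile_url("lwasser"))
```

Because the URL is derived from the username alone, no network request is needed at build time.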
