
[py] Auto-generate Python API docs from code #15822


Merged


cgoldberg
Contributor

@cgoldberg cgoldberg commented May 29, 2025

User description

🔗 Related Issues

Fixes #14178

💥 What does this PR do?

This PR updates the Python documentation build process so that API documentation is auto-generated from code. Previously, ./py/docs/source/api.rst had to be updated by hand whenever a new module was added; otherwise the module was missing from the API docs. That step was often forgotten.

Now, we have a script (./py/generate_api_module_listing.py) that scans the codebase for Python modules and generates a new api.rst file. This file is later used by sphinx-autogen to generate sphinx autodoc stub pages used in the Python API documentation.
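
As a rough illustration of the generation step described above, the sketch below renders a list of dotted module names into reStructuredText that sphinx-autogen can consume. The function name `render_api_rst`, the grouping rule, and the page title are assumptions made for this example; the actual script in this PR may lay out the file differently.

```python
from collections import defaultdict


def render_api_rst(modules):
    """Render dotted module names as an autosummary listing.

    Assumption: modules are grouped by their first two dotted components
    (for example "selenium.webdriver"), one autosummary table per group.
    """
    groups = defaultdict(list)
    for name in sorted(set(modules)):
        group = ".".join(name.split(".")[:2])
        groups[group].append(name)

    # An rST comment marking the file as generated.
    lines = ["..", "   This file is auto-generated. Do not edit by hand.", ""]
    title = "API Reference"  # hypothetical title
    lines += [title, "=" * len(title), ""]
    for group in sorted(groups):
        lines += [group, "-" * len(group), ""]
        # sphinx-autogen creates stub pages for entries under :toctree:.
        lines += [".. autosummary::", "   :toctree: " + group.replace(".", "_"), ""]
        lines += [f"   {name}" for name in groups[group]]
        lines.append("")
    return "\n".join(lines)
```

Duplicates are dropped and the output is sorted, so regenerating the file from an unchanged codebase produces an identical api.rst.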

The docs can be built using tox -c py/tox.ini -e docs or ./go py:docs.

Other changes:

  • update .readthedocs.yaml config file to use the new script
  • remove unused Makefile

🔄 Types of changes

  • build infrastructure
  • documentation

PR Type

Enhancement, Documentation


Description

  • Automate generation of Python API docs from codebase

    • Add script to generate api.rst from modules
    • Remove manual maintenance of API module list
  • Update documentation build process and configs

    • Modify tox and ReadTheDocs to use new script
    • Remove obsolete Makefile and update README
  • Regenerate and restructure api.rst for autodoc


Changes walkthrough 📝

Relevant files

Enhancement

generate_api_module_listing.py: Add script to auto-generate API module listing
py/generate_api_module_listing.py (+83/-0)

  • Add script to scan selenium package for modules
  • Script generates docs/source/api.rst for Sphinx autodoc
  • Automates API documentation module listing

Configuration changes

.readthedocs.yaml: Update ReadTheDocs config for auto-generated API docs
py/docs/.readthedocs.yaml (+2/-1)

  • Update Python version to 3.12
  • Add step to run API module generation script before Sphinx
  • Ensure docs build uses auto-generated api.rst

tox.ini: Update tox docs environment for auto-generated API docs
py/tox.ini (+2/-0)

  • Add step to run API module generation script in docs build
  • Ensure Sphinx uses up-to-date module listing

Cleanup

Makefile: Remove obsolete Sphinx Makefile
py/docs/Makefile (+0/-131)

  • Remove obsolete Sphinx Makefile
  • Clean up legacy manual build commands

Documentation

README.rst: Update docs for automated API doc generation
py/docs/README.rst (+3/-11)

  • Update documentation to reflect new API doc generation
  • Remove references to manual api.rst maintenance and Makefile

api.rst: Regenerate and restructure API reference file
py/docs/source/api.rst (+109/-58)

  • Regenerate API reference file with new module structure
  • Add auto-generation notice and reorganize sections
  • Reflects current modules found by the new script

    @selenium-ci selenium-ci added the C-py Python Bindings label May 29, 2025
    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Path Handling

    The code uses site.getsitepackages()[-1] to get the site packages path, which might be unreliable across different environments. This approach could fail if the site packages list is empty or has a different structure than expected.

    .removeprefix(site.getsitepackages()[-1])
    .removeprefix(os.sep)
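
To make the concern above concrete: `str.removeprefix` (Python 3.9+) returns the string unchanged when the prefix does not match, so a mismatched site-packages path would not raise an error but would silently yield a wrong module name. The sketch below mirrors the quoted chain with an arbitrary prefix; the helper name and all paths are hypothetical.

```python
import os


def dotted_name_via_prefix(path, prefix):
    """Mirror the removeprefix chain quoted above for an arbitrary prefix."""
    return (
        path.removeprefix(prefix)
        .removeprefix(os.sep)
        .removesuffix(".py")
        .replace(os.sep, ".")
    )


# Hypothetical paths, built with os.sep so the example is portable.
site_dir = os.sep + os.path.join("opt", "venv", "site-packages")
installed = os.path.join(site_dir, "selenium", "webdriver", "remote.py")
other_root = os.sep + os.path.join("home", "dev", "selenium", "webdriver", "remote.py")

# Prefix matches: the expected module name comes out.
print(dotted_name_via_prefix(installed, site_dir))   # selenium.webdriver.remote

# Prefix does not match: no error, just a wrong dotted name.
print(dotted_name_via_prefix(other_root, site_dir))  # home.dev.selenium.webdriver.remote
```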
    Error Handling

    The script lacks error handling for file operations and directory access. If the output directory doesn't exist or if there are permission issues when writing the file, the script will fail without a helpful error message.

    with open(output_file, "w") as f:
        f.write(
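
One possible way to address the error-handling point, sketched under the assumption that the script should stop with a short message rather than a traceback. The helper name `write_listing` is hypothetical and not part of the PR.

```python
import os
import sys


def write_listing(output_file, content):
    """Write content to output_file, exiting with a readable message on failure."""
    try:
        with open(output_file, "w", encoding="utf-8") as f:
            f.write(content)
    except OSError as exc:
        # Covers a missing directory, permission problems, and similar failures.
        sys.exit(f"could not write {output_file}: {exc.strerror}")
```

`sys.exit` with a string prints the message to stderr and exits with status 1, which is enough for tox or a CI job to flag the docs build as failed.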

    Contributor

    qodo-merge-pro bot commented May 29, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category | Suggestion | Impact
    Possible issue
    Fix path handling issue

    The site.getsitepackages()[-1] call may fail if the site module doesn't have any
    site packages or if the list is empty. This can happen in virtual environments
    or custom Python installations. Use a more robust approach to handle the package
    path.

    py/generate_api_module_listing.py [15-28]

     def find_modules(package_name):
         modules = []
    +    package_path = os.path.abspath(package_name)
         for dirpath, _, filenames in os.walk(package_name):
             for filename in filenames:
                 if filename.endswith(".py") and not filename.startswith("__"):
    -                module_name = (
    -                    os.path.join(dirpath, filename)
    -                    .removeprefix(site.getsitepackages()[-1])
    -                    .removeprefix(os.sep)
    -                    .removesuffix(".py")
    -                    .replace(os.sep, ".")
    -                )
    +                rel_path = os.path.relpath(os.path.join(dirpath, filename), os.path.dirname(package_path))
    +                module_name = rel_path.removesuffix(".py").replace(os.sep, ".")
                     modules.append(module_name)
         return sorted(set(modules))
    Suggestion importance[1-10]: 8


    Why: The suggestion correctly identifies that site.getsitepackages()[-1] can fail in virtual environments or custom Python installations. The improved approach using os.path.relpath is more robust and appropriate for walking a local package directory.
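
For reference, a self-contained version of the `os.path.relpath` approach from the suggestion above. It is a sketch, not the code that was merged: it walks the absolute package path and collects names in a set, but is otherwise meant to behave like the suggested diff.

```python
import os


def find_modules(package_dir):
    """Return sorted dotted module names for .py files under package_dir."""
    package_path = os.path.abspath(package_dir)
    # Names are computed relative to the package's parent directory so that
    # the package name itself is the first dotted component.
    root = os.path.dirname(package_path)
    modules = set()
    for dirpath, _, filenames in os.walk(package_path):
        for filename in filenames:
            # Skip dunder files such as __init__.py, as the original does.
            if filename.endswith(".py") and not filename.startswith("__"):
                rel_path = os.path.relpath(os.path.join(dirpath, filename), root)
                modules.add(rel_path.removesuffix(".py").replace(os.sep, "."))
    return sorted(modules)
```

Because the result depends only on the directory layout, it does not matter whether the package is installed in site-packages or checked out locally. For a hypothetical tree containing `demo_pkg/a.py` and `demo_pkg/sub/b.py`, the function returns `['demo_pkg.a', 'demo_pkg.sub.b']`.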

    Impact: Medium
    Ensure output directory exists

    The script assumes the current working directory is the repository root, but
    this may not always be the case. Add a check to ensure the output directory
    exists before writing to it, or create it if needed.

    py/generate_api_module_listing.py [31-35]

     if __name__ == "__main__":
         package_name = "selenium"
    -    output_file = os.path.join("docs", "source", "api.rst")
    +    output_dir = os.path.join("docs", "source")
    +    output_file = os.path.join(output_dir, "api.rst")
    +    
    +    # Ensure output directory exists
    +    os.makedirs(output_dir, exist_ok=True)
    +    
         print(f"generating module list for sphinx autodoc in: {output_file}\n")
         modules = find_modules(package_name)
    Suggestion importance[1-10]: 6


    Why: This is good defensive programming practice that prevents potential directory not found errors. While the build environments likely have the directory structure in place, adding os.makedirs with exist_ok=True is a reasonable safeguard.
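
The working-directory assumption mentioned in this suggestion could also be removed by building the output path from a known base directory. The sketch below is one way to do that; the helper name `resolve_output_file` is hypothetical, and anchoring to the script's own location is an assumption about the repository layout.

```python
import os


def resolve_output_file(base_dir, *parts):
    """Build an output path under base_dir and make sure its directory exists."""
    output_file = os.path.join(base_dir, *parts)
    # exist_ok=True makes this safe to call on every build.
    os.makedirs(os.path.dirname(output_file), exist_ok=True)
    return output_file


# Assumption: the script lives in the directory that contains docs/.
# script_dir = os.path.dirname(os.path.abspath(__file__))
# output_file = resolve_output_file(script_dir, "docs", "source", "api.rst")
```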

    Impact: Low

    @shbenzer
    Contributor

    shbenzer commented May 30, 2025

    Were the changes to api.rst performed by the script?

    @cgoldberg
    Contributor Author

    > Were the changes to api.rst performed by the script?

    Yes, they were. There are some minor differences from the old api.rst (module ordering, capitalization, etc.), but those changes were intentional.

    @shbenzer
    Contributor

    Sweet, good work!

    Contributor

    @shbenzer shbenzer left a comment


    LGTM

    @cgoldberg cgoldberg merged commit a5dd13f into SeleniumHQ:trunk May 30, 2025
    16 checks passed
    @cgoldberg cgoldberg deleted the py-auto-generate-api-docs-module-list branch May 30, 2025 17:39

    Successfully merging this pull request may close these issues.

    [🚀 Feature]: Auto-generate the Python API from the code