Skip to content

Conversation

@pritkc
Copy link
Collaborator

@pritkc pritkc commented Dec 17, 2025

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Example
  • Documentation

Description

This PR fixes an issue where images worked correctly on branch RTD builds but failed when merged to main. The solution implements automatic image handling that works consistently across all build environments (local, branch RTD, and main RTD).

Key Changes:

  • Implemented automatic image collection from multiple source locations (doc/assets/img/, figures/ directories, etc.)
  • Added path fixing for included markdown files at both source and doctree levels
  • Changed from config-inited to builder-inited event for more reliable timing
  • Added build output copying to ensure images are available in final HTML
  • Fixed HTML paths to use relative paths for local file viewing
  • Updated README with image handling instructions for contributors

Technical Details:
The fix uses Sphinx event hooks to:

  1. Collect images from various locations before build starts (builder-inited event)
  2. Fix image paths in source markdown before MyST processes includes (source-read event)
  3. Fix broken paths in doctree after MyST processing (doctree-read event)
  4. Copy images to build output directory (build-finished event)
  5. Convert absolute paths to relative paths in HTML (build-finished event)

The solution is generic and works for any contributor adding images anywhere in the documentation, without requiring special configuration or knowledge of Sphinx internals.

Related Issues & Documents

QA Instructions, Screenshots, Recordings

RTD Hosted link -- https://mole-form.readthedocs.io/en/main/

Testing Performed:

  1. Local Build Testing: Verified images work correctly with make doc-html
  2. RTD Branch Build: Tested on personal fork branch - images display correctly
  3. RTD Main Build: Merged to fork's main branch - images work correctly
  4. Multiple Image Locations: Tested images from:
    • doc/assets/img/ (OSE_ORGANIZATION.md images)
    • source/math_functions/figures/ (StaggeredGrids.md, CSRCReportOnMOLE.md)
    • source/api/examples/md/figures/ (example figures)
    • Section-specific figures/ directories

How to Test:

  1. Build documentation locally:
    cd doc/sphinx
    make doc-clean
    make doc-html
  2. Verify images appear in build/html/ output
  3. Check RTD builds (branch and main) to ensure images display correctly
  4. Test adding a new image:
    • Place image in doc/assets/img/ or a figures/ directory
    • Reference it in markdown: ![Alt text](doc/assets/img/image.png)
    • Build and verify it appears correctly

Screenshots:

  • Images now display correctly on RTD main builds (previously broken)
image - No effect to latexpdf image

Added/updated tests?

  • Yes
  • No, and this is why: please replace this line with details on why tests
    have not been included
  • I need help with writing tests

Read Contributing Guide and Code of Conduct

[optional] Are there any post deployment tasks we need to perform?

No post-deployment tasks required. The fix is automatic and works for all existing and future images.

[optional] What gif best describes this PR or how it makes you feel?

- Implement automatic image collection from multiple source locations
- Add path fixing for included markdown files (source and doctree level)
- Use builder-inited event for reliable image copying timing
- Add build output copying to ensure images are available in HTML
- Fix HTML paths to use relative paths for local file viewing
- Remove debug code and update README with image handling instructions

This fix ensures images work correctly on both branch and main RTD builds,
as well as local builds. The solution is generic and works for any
contributor adding images anywhere in the documentation.

Tested and verified on both test branch and main branch RTD builds.
Copy link
Collaborator

@aboada aboada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed conf.py. Some of the comments apply to other parts of the code.

Please let me know if you have any questions or need any help. Thanks!

# Source locations to copy from
image_sources = [
# Main image directory (for OSE_ORGANIZATION.md and similar)
repo_root / "doc" / "assets" / "img",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check the validity of repo_root before use or handle any exceptions?

Copy link
Collaborator Author

@pritkc pritkc Jan 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repo_root is derived from app.confdir which is guaranteed by Sphinx to be valid during builds.
Path operations on it won't raise exceptions; they'll just return non-existent paths, which we already handle via img_source.exists() checks in the loop

try:
shutil.copy2(img, dest_file)
copied_count += 1
except Exception as e:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should handle specific errors here. Some examples: PermissionError, FileNotFoundError. It applies to other exceptions in this file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 3a15829. Covered exception handling for shutil.SameFileError, PermissionError, and FileNotFoundError.

except Exception as e:
print(f"Warning: Could not copy {img.name}: {e}")
# Don't warn on overwrites (same file)
if "already exists" not in str(e).lower():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this the same as:

exception shutil.SameFileError
This exception is raised if source and destination in copyfile() are the same file.

from shutil-documentation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right! Fixed in commit 3a15829

figures_dir = source_file.parent / "figures"
if figures_dir.exists():
# Copy to _images (they'll be found there)
for pattern in ["*.png", "*.jpg", "*.jpeg", "*.gif", "*.svg"]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we are processing images every time we find a markdown file. My suggestion for improving this is to keep track of the images (a set may do the job) and process them once.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion! Currently, the code handles duplicate copies via the try/except block.
I'll note this as a future optimization since the current approach works correctly. Should I implement this optimization in this PR or defer it?

2. Standalone files with relative paths also benefit from the fix
3. The logic is safe - it only changes relative paths and skips absolute paths/URLs
"""
import re
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adhering to PEP8 guidelines.

From PEP8:

Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
Imports should be grouped in the following order:

Standard library imports.
Related third party imports.
Local application/library specific imports.
You should put a blank line between each group of imports.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those function-level imports are for lazy loading. This pattern was common in Sphinx configurations where functions may not always be called. If you'd prefer I remove the redundant imports.

pass # Ignore overwrites


def fix_included_image_paths_source(app, docname, source):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the functions in this file are too long and may be against the Single Responsibility Principle. I suggest keeping them short, concise, and focusing on a single concern (when possible). Just a comment for the future and something to keep in mind, it does not need to be addressed here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged, thank you for the feedback. I'll keep this in mind for future refactoring


import shutil
from pathlib import Path
import os
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 45151e2

pass # Ignore overwrites


def fix_included_image_paths_source(app, docname, source):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think app and docname are not used within this scope. Please check.

Copy link
Collaborator Author

@pritkc pritkc Jan 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit - Fixed in commit 1b2a8d6. Prefixed them with underscore (_app, _docname) to indicate intentional non-use, and added docstring documentation explaining this

These parameters are required by Sphinx's source-read event signature (app.connect('source-read', fix_included_image_paths_source)). Sphinx passes these arguments automatically, so we must accept them in the function signature even if not used.
See Sphinx source-read event docs.

return fixed_path

# Match markdown image syntax: ![alt](path)
image_pattern = r'!\[([^\]]*)\]\(([^)]+)\)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this regex consider a line with 2 or more images? What if the image has a title like:
![alt](path "Some Title Here")

See markdown-guide.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question!

  1. Yes, re.sub replaces all matches globally
  2. The current pattern [^)]+ matches greedily up to the first ), so ![alt](path "Title") would match path "Title" as the path. When we extract the filename with Path(img_path).name, this would result in "Title") which is incorrect. However, after re-reviewing our documentation source files, we don't use image titles anywhere. Should I add support for this edge case for future-proofing, or document it as a known limitation?

@jbrzensk
Copy link
Collaborator

@pritkc I checked the code, and it seems to function correctly. You only need to address the comments from @aboada, to ensure that if there are errors, they are handled correctly. Then we can modify the backward_euler image to be in the correct directory.

pritkc added 4 commits January 2, 2026 19:06
Addresses review comment: 'Unused import.'

Removed unused 'import os' statements from:
- fix_included_image_paths_source()
- fix_included_image_paths_doctree()
- copy_images_to_build_output()
Addresses review comments:
- 'Should handle specific errors here. Examples: PermissionError, FileNotFoundError'
- 'Isn't this the same as shutil.SameFileError?'

Changed exception handling in image copy operations:
- Use shutil.SameFileError for same-file scenarios (silently skip)
- Catch PermissionError and FileNotFoundError specifically
- Removed string matching on error messages
Addresses review comment: 'I think app and docname are not used within
this scope. Please check.'

In fix_included_image_paths_source(), the 'app' and 'docname' parameters
are required by Sphinx's source-read event signature but not used in the
function body. Prefixed with underscore to indicate intentional non-use.

Added docstring Args section to document this design decision.
@pritkc pritkc requested a review from aboada January 3, 2026 05:33
@pritkc
Copy link
Collaborator Author

pritkc commented Jan 3, 2026

Thanks, @aboada, for the detailed feedback! I’ve worked through most of the points and would appreciate you taking another look. I’d also love your thoughts on a few questions I still have.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OSE Organization figures missing from HTML document

3 participants