Add automated check for broken links in documentation #15490

Open
per1234 opened this issue Apr 21, 2025 · 0 comments

Feature Description:

The repository contains a large amount of documentation, written in Markdown, which includes many helpful links. Links tend to break over time, which reduces the utility of the documentation. Currently there is no automated system for detecting such breakage, so a broken link is only discovered when a reader runs into it.

It would be beneficial to set up an automated system to detect breakage so that it can be fixed before affecting the readers.

Implementation

Overview

Causes of link breakage can be classified into two distinct classes:

  • Internal: the target of a link to a resource inside the project is moved or removed during development without updating the link.
  • External: the target of a link to a resource outside the project is moved or removed through actions unrelated to the project development.

Potential internal breakage can be caught before the regression is introduced into the project by triggering an automated link check on pull requests.

External breakage can be caught by running the automated link check on the content of the production branch, triggered on a schedule.
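The internal/external distinction above can be sketched in code. This is only an illustration and not part of any existing tool: the function name `classify_link` is hypothetical, and it assumes that any link target carrying a URL scheme or host points outside the project, while relative paths and same-page fragments point inside it.

```python
from urllib.parse import urlsplit


def classify_link(target: str) -> str:
    """Return "external" for links that leave the project, else "internal".

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(target)
    # A scheme (https:, mailto:) or a network location (//host/path) means
    # the target lives outside the repository.
    if parts.scheme or parts.netloc:
        return "external"
    # Relative paths and same-page fragments resolve inside the repository.
    return "internal"
```

Under this split, a pull request check would mainly be guarding the "internal" links, while the scheduled check is what catches rot in the "external" ones.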

Technology

Automation

A GitHub Actions workflow can provide both the pull request-triggered check and the scheduled check.

Link Checker

I use the markdown-link-check tool for detecting broken links in Markdown content in dozens of projects, and am quite satisfied with it. I'm sure there are other excellent tools that could be used for the purpose.

When evaluating a prospective link check tool, I recommend checking whether it can detect broken fragment links. Links to the anchors that are automatically generated for headings are especially prone to breakage: the editor of the target document may not realize that changing a heading's text is a potentially breaking change unless they manually add markup for a backwards compatibility anchor.

markdown-link-check does support checking for broken links to anchors within the same page (e.g., [foo](#foo)). Unfortunately it does not have the same capability for links to anchors in other documents: it correctly detects when the target file is not accessible, but cannot detect when the target file is accessible but doesn't contain the specified anchor.
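To make the cross-document gap concrete, here is a minimal sketch of what such a check involves. It is not how markdown-link-check works. All function names are hypothetical, and the slug rule (lowercase, drop punctuation, spaces to hyphens) only approximates the anchors GitHub generates for headings. It ignores duplicate headings, manually added anchors, and `#` lines inside code fences.

```python
import re


def heading_slug(heading: str) -> str:
    """Approximate the anchor auto-generated for a heading (assumed rule)."""
    slug = heading.strip().lower()
    # Drop everything except word characters, hyphens, and spaces.
    slug = re.sub(r"[^\w\- ]", "", slug)
    return slug.replace(" ", "-")


def anchors_in(markdown: str) -> set[str]:
    """Collect approximate anchors for ATX headings (lines starting with 1-6 '#')."""
    anchors = set()
    for line in markdown.splitlines():
        match = re.match(r"#{1,6}\s+(.+)", line)
        if match:
            anchors.add(heading_slug(match.group(1)))
    return anchors


def fragment_exists(target_markdown: str, fragment: str) -> bool:
    """Report whether `fragment` matches a heading anchor in the target document."""
    return fragment in anchors_in(target_markdown)
```

With this sketch, a link such as `other.md#check-execution` would pass only while `other.md` still contains a heading that slugs to `check-execution`. Renaming that heading would then be reported as a broken link.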

Check Execution

I generally use a Task-based system to run markdown-link-check:

https://github.com/arduino/tooling-project-assets/blob/main/workflow-templates/check-markdown-task.md

Since this project does not use the Task task runner tool, that approach is unlikely to be suitable here.

In projects that don't use Task-based infrastructure, I have also used the gaurav-nelson/github-action-markdown-link-check GitHub Actions action. That action is now deprecated, but the markdown-link-check maintainers are in the process of taking over its maintenance (tcort/markdown-link-check#439). The downside of using a GitHub Actions action is that it is not practical for contributors to run the system locally to validate their development work before submitting a pull request. So you must choose either to provide no infrastructure for running the check locally (which is reasonable in a project where you don't expect any contributors to bother doing that anyway, but that is probably not the case here), or to maintain redundant infrastructure for a local check (in which case you might as well use that infrastructure in the GitHub Actions workflow too).

So I would guess that you would use an npm script to run the link check operation, then execute that script both in the GitHub Actions workflow and locally.

Related
