@mre commented Sep 5, 2025

Summary

Fixes #1657 by properly handling different types of Markdown reference links that were not being extracted by lychee:

  • Reference links: [text][ref]
  • Collapsed links: [text][]
  • Shortcut links: [text]
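
For readers unfamiliar with the terminology, the forms above (plus the inline form, which already worked) can be told apart purely by what follows the first closing bracket. The snippet below is a rough illustration only: `LinkForm` and `classify` are hypothetical names invented here, not part of lychee or pulldown-cmark, and the helper assumes the input is exactly one link with no nesting or escapes.

```rust
/// The four Markdown link forms discussed in this PR.
#[derive(Debug, PartialEq)]
enum LinkForm {
    Inline,    // [text](dest)
    Reference, // [text][ref]
    Collapsed, // [text][]
    Shortcut,  // [text]
}

/// Hypothetical helper: classify a string that consists of exactly one link.
/// It does not handle nested brackets, escapes, or surrounding text.
fn classify(link: &str) -> Option<LinkForm> {
    let rest = link.strip_prefix('[')?;
    let close = rest.find(']')?;
    // Everything after the first closing bracket decides the form.
    let tail = &rest[close + 1..];
    if tail.is_empty() {
        Some(LinkForm::Shortcut)
    } else if tail == "[]" {
        Some(LinkForm::Collapsed)
    } else if tail.starts_with('[') && tail.ends_with(']') {
        Some(LinkForm::Reference)
    } else if tail.starts_with('(') && tail.ends_with(')') {
        Some(LinkForm::Inline)
    } else {
        None
    }
}

fn main() {
    for s in ["[link1](target1.md)", "[link2][ref2]", "[link3][]", "[link4]"] {
        println!("{s} -> {:?}", classify(s));
    }
}
```

In a real document the last three forms only become links when a matching definition such as `[ref2]: target2.md` exists; resolving that is the Markdown parser's job and is not modelled here.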

Root Cause

The issue was in the markdown extraction logic where reference link destinations were processed using extract_raw_uri_from_plaintext(), which relies on the linkify crate. Linkify only recognizes URLs with schemes (http://, https://, etc.) and ignores relative file paths like "target.md".
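
The failure mode can be reproduced in miniature without the real crates. The sketch below does not call linkify; `has_url_scheme`, `extract_via_scheme_detection`, and `extract_directly` are hypothetical stand-ins that only mimic the behavior described above (a detector that requires a `scheme://` prefix versus taking the destination as-is).

```rust
/// Hypothetical stand-in for a scheme-based URL detector.
/// Returns true only for strings shaped like `scheme://something`.
fn has_url_scheme(s: &str) -> bool {
    match s.split_once("://") {
        Some((scheme, rest)) => {
            !rest.is_empty()
                && scheme.chars().next().map_or(false, |c| c.is_ascii_alphabetic())
                && scheme
                    .chars()
                    .all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'))
        }
        None => false,
    }
}

/// Old behavior (simplified): the destination survives only if it has a scheme.
fn extract_via_scheme_detection(dest: &str) -> Option<String> {
    has_url_scheme(dest).then(|| dest.to_string())
}

/// New behavior (simplified): the parser already told us this is a link
/// destination, so keep it verbatim.
fn extract_directly(dest: &str) -> Option<String> {
    (!dest.is_empty()).then(|| dest.to_string())
}

fn main() {
    for dest in ["https://example.com", "target.md"] {
        println!(
            "{dest}: scheme-based = {:?}, direct = {:?}",
            extract_via_scheme_detection(dest),
            extract_directly(dest)
        );
    }
}
```

With the scheme-based path, `target.md` is silently dropped, which matches the symptom reported in #1657.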

Solution

  1. Enable footnote support (ENABLE_FOOTNOTES) to properly differentiate footnotes from reference links in pulldown-cmark
  2. Add explicit handling to skip footnote events (FootnoteReference, FootnoteDefinition) since they're not links to check
  3. Create RawUri directly for reference links to handle relative file paths that linkify doesn't recognize

This approach leverages pulldown-cmark's built-in semantic distinction between footnotes and reference links rather than using heuristics.
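
The three steps can be sketched roughly as follows. This is not the code from the PR: `Event` is a hypothetical, heavily simplified stand-in for the parser's event stream, and `RawUri` here carries only a `text` field as an assumption for illustration. The point is the shape of the match: footnote events are skipped explicitly, and link destinations become `RawUri` values without going through plaintext URL detection.

```rust
/// Hypothetical, simplified stand-in for a Markdown parser's events.
#[allow(dead_code)]
#[derive(Debug)]
enum Event {
    /// Any resolved link (inline, reference, collapsed, or shortcut).
    Link { dest: String },
    /// `[^1]` in running text; only emitted when footnotes are enabled.
    FootnoteReference(String),
    /// `[^1]: ...` definition; only emitted when footnotes are enabled.
    FootnoteDefinition(String),
    Text(String),
}

/// Simplified stand-in for the extracted-link type (assumed single field).
#[derive(Debug, PartialEq)]
struct RawUri {
    text: String,
}

fn extract_links(events: &[Event]) -> Vec<RawUri> {
    let mut uris = Vec::new();
    for event in events {
        match event {
            // Step 3: build the RawUri directly from the destination so that
            // relative paths like "target.md" are kept.
            Event::Link { dest } => uris.push(RawUri { text: dest.clone() }),
            // Step 2: footnotes are not links to check, so skip them.
            Event::FootnoteReference(_) | Event::FootnoteDefinition(_) => {}
            // Plain text is out of scope for this sketch.
            Event::Text(_) => {}
        }
    }
    uris
}

fn main() {
    let events = vec![
        Event::Link { dest: "target1.md".into() },
        Event::FootnoteReference("1".into()),
        Event::Link { dest: "https://example.com".into() },
        Event::FootnoteDefinition("1".into()),
        Event::Text("some prose".into()),
    ];
    println!("{:?}", extract_links(&events));
}
```

Because the parser distinguishes footnotes from links once footnote support is enabled, no bracket-counting or `^`-prefix heuristics are needed in the extractor itself.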

Test Plan

  • Added comprehensive test case covering all reference link types
  • Verified footnotes are properly ignored (existing test still passes)
  • All existing markdown extraction tests pass
  • Manual testing confirms all four link types (inline, reference, collapsed, shortcut) are now extracted correctly

Verification

Before:

[link1](target1.md) ✅ extracted
[link2][ref2] ❌ not extracted  
[link3][] ❌ not extracted
[link4] ❌ not extracted

After:

[link1](target1.md) ✅ extracted
[link2][ref2] ✅ extracted
[link3][] ✅ extracted  
[link4] ✅ extracted

The fix ensures lychee can now properly validate all types of Markdown reference links while maintaining backward compatibility.

- Remove unnecessary hashes from raw string literal
- Add allow annotations for explicit footnote event handling
- Add allow annotation for function length (102 lines vs 100 limit)

@thomas-zahner left a comment

That's great!

@mre merged commit 67a1571 into master Sep 11, 2025
6 checks passed
@mre deleted the issue-1657 branch September 11, 2025 22:35
@mre mentioned this pull request Sep 11, 2025
This was referenced Oct 21, 2025

Development

Successfully merging this pull request may close these issues.

Cannot extract relative reference links in Markdown
