@katrinafyi
Contributor

@katrinafyi katrinafyi commented Sep 23, 2025

the previous behaviour accepted relative directory paths as bases but this led to later InvalidBaseJoin errors, because relative bases cannot be used to join relative URLs. this means that relative local bases were only useful for resolving root-relative links, and were confusingly problematic with ordinary relative links.

see #1574 which talks about errors when passing --base ../network-documentation/ or, in a later comment, --base build. it is my opinion that it is better to fail earlier in these cases, so the user is not hit with mysterious InvalidBaseJoin.

also, the previous behaviour would parse something like --base-url google.com to a local base pointing to ./google.com. this would also lead to downstream errors or incorrect URLs and it's better to guard against this.

after this pr, trying to use a non-absolute local base will be a CLI parsing error:

error: invalid value 'build' for '--base-url <BASE_URL>': Error with
base dir `build` : Base must either be a URL (with scheme) or an
absolute path. Alternatively, if you want to resolve root-relative links
in local files, see `--root-dir`.
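
A minimal sketch of the rule described above, using only the standard library. The names `Base`, `has_scheme`, and `parse_base` are hypothetical and simplified for illustration; this is not lychee-lib's actual code, which would delegate URL handling to a real URL parser and also reject URLs that cannot act as a base.

```rust
use std::path::{Path, PathBuf};

/// Simplified stand-in for a base: a URL with a scheme, or a local directory.
#[derive(Debug, PartialEq)]
enum Base {
    Remote(String),
    Local(PathBuf),
}

/// Rough check for a leading URL scheme such as `https:` or `file:`.
/// Hypothetical helper; it does not validate the rest of the URL.
fn has_scheme(s: &str) -> bool {
    match s.split_once(':') {
        Some((scheme, _)) => {
            let mut chars = scheme.chars();
            // A scheme starts with a letter, then letters, digits, '+', '-' or '.'.
            chars.next().map_or(false, |c| c.is_ascii_alphabetic())
                && chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'))
        }
        None => false,
    }
}

/// Accept URLs with a scheme and absolute paths; reject everything else early.
fn parse_base(s: &str) -> Result<Base, String> {
    if has_scheme(s) {
        return Ok(Base::Remote(s.to_string()));
    }
    let path = Path::new(s);
    // `is_absolute` is platform-dependent: `/srv/site` is absolute on Unix only.
    if path.is_absolute() {
        Ok(Base::Local(path.to_path_buf()))
    } else {
        Err(format!(
            "Error with base dir `{}` : Base must either be a URL (with scheme) or an absolute path.",
            s
        ))
    }
}
```

Under these assumptions, `build`, `google.com`, and `../network-documentation/` are all rejected at parse time instead of surfacing later as join errors.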

in a slightly opinionated touch, i've mentioned --root-dir in the error message, since i think --root-dir is more suitable for most use cases where people try to use relative local bases. this agrees with later comments in #1574.

however, this does make the error message quite long. so i'm happy to take on feedback and changes about this. also, i'm not quite sure which conventional commit verb to use in this PR title. i've put "feat" for now.

also, an alternative approach to think about is to ban "local" bases altogether and require use of file:/// for local paths. this would unify the argument under the URL format and maybe help prod people towards --root-dir for local paths.

also, another alternative instead of doing path.is_absolute() is to make a Local base then try to .join() with it and ensure the join succeeds. this will more precisely guard against join failures (and hence, InvalidBaseJoin errors)

this pr implements (1) in my outline to reduce InvalidBaseJoin confusion, as described in #1624 (comment)

Comment on lines 75 to 80
"Base must either be a URL (with scheme) or an absolute path.",
"Alternatively, if you want to resolve root-relative links in",
"local files, see `--root-dir`.",
]
.join(" "),
))
Member

Do you think we should add a few examples or reference to the docs here?

Contributor Author

@katrinafyi katrinafyi Sep 23, 2025

For adding examples, I think that this message location isn't quite suitable; the current message is already relatively large. Maybe a link to the docs would be okay, but idk which docs. The base url page is not super informative.

Maybe it could mention that more info is in --help? But that's also kind of obvious.

Member

You're right.

The base url page is definitely not great and needs a makeover at some point. Besides, hardcoding website URLs is probably not a good idea. Would be quite ironic if that link breaks in the future. 😆

Setting up the base url and root dir are probably the trickiest things to get right from a user's perspective, so in this case mentioning --help might be a good idea.

@katrinafyi
Contributor Author

katrinafyi commented Sep 29, 2025

I've made some changes in response to the review comments, but is the CI workflow working? It's failed in a setup step twice in a row.

Also, I'm wondering if this PR's approach is a good idea. It's adding a message to TryFrom<&str> for Base that mentions command-line flags, but this is part of lychee-lib and could be used entirely without the CLI.

@mre
Member

mre commented Sep 30, 2025

> is the CI workflow working? It's failed in a setup step twice in a row.

Yeah, I noticed that in a few other projects as well.
#1862

@mre
Member

mre commented Sep 30, 2025

> Also, I'm wondering if this PR's approach is a good idea.

The approach is fine, I think. Or at least it's a step in the right direction to avoid confusion around these options.

> It's adding a message to TryFrom<&str> for Base that mentions command-line flags, but this is part of lychee-lib and could be used entirely without the CLI.

Yeah, that's indeed not great. Sorry for not noticing this sooner and leading you astray. Maybe we can instead add more .context wherever this TryFrom gets called in the CLI for a cleaner separation of concerns.
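
The separation suggested here could look roughly like the sketch below: the library error only states what is wrong, and the CLI call site appends the flag-specific hints. `InvalidBase`, `validate_base`, and `parse_base_arg` are hypothetical names, and plain `map_err` stands in for `.context`; the real code in lychee may be structured differently.

```rust
use std::fmt;
use std::path::Path;

/// Library-level error: describes the problem without naming any CLI flag.
#[derive(Debug, PartialEq)]
struct InvalidBase(String);

impl fmt::Display for InvalidBase {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Library-side validation (hypothetical and simplified: a `://` substring
/// stands in for real URL parsing).
fn validate_base(s: &str) -> Result<(), InvalidBase> {
    if s.contains("://") || Path::new(s).is_absolute() {
        Ok(())
    } else {
        Err(InvalidBase(format!(
            "Error with base dir `{}` : Base must either be a URL (with scheme) or an absolute path.",
            s
        )))
    }
}

/// CLI-side wrapper: the only place that knows about `--help` and `--root-dir`.
fn parse_base_arg(s: &str) -> Result<(), String> {
    validate_base(s).map_err(|e| {
        format!(
            "{} See `--help` for more information. If you want to resolve \
             root-relative links in local files, also see `--root-dir`.",
            e
        )
    })
}
```

With this split, a library user calling `validate_base` directly never sees CLI flag names, while `parse_base_arg` produces the longer message for command-line users.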

@mre
Member

mre commented Sep 30, 2025

#1862 is merged so you can rebase.

katrinafyi and others added 6 commits October 1, 2025 00:05
the previous behaviour accepted relative directory paths as bases
but this led to later InvalidBaseJoin errors, because relative paths
cannot be used to join relative URLs. this means that relative local
bases were *only* useful for resolving root-relative links, and were
confusingly problematic with ordinary relative links.

see https://www.github.com/lycheeverse/lychee/issues/1574 which talks about
errors when passing `--base ../network-documentation/` or, in a later
comment, `--base build`.

also, the previous behaviour would parse something like
`--base-url google.com` to a local base pointing to `./google.com`.
this would also lead to downstream errors and it's better to guard
against this.

it is my opinion that it is better to fail earlier in these cases, so
the user is not hit with mysterious InvalidBaseJoin.

after this pr, there will be a command-line argument parsing error:
```
error: invalid value 'build' for '--base-url <BASE_URL>': Error with
base dir `build` : Base must either be a URL (with scheme) or an
absolute path. Alternatively, if you want to resolve root-relative links
in local files, see `--root-dir`.
```

in a slightly opinionated touch, i've mentioned --root-dir in the error
message, since i think --root-dir is more suitable for most use cases
where people try to use relative local bases. this agrees with later
comments in https://www.github.com/lycheeverse/lychee/issues/1574.

however, this does make the error message quite long. so i'm happy to
take on feedback and changes about this.

this pr implements (1) in my outline to reduce InvalidBaseJoin
confusion, as described in
https://www.github.com/lycheeverse/lychee/pull/1624#issuecomment-3274485963
TODO: probably mention these restrictions and suggestions in --help too
this does have the side-effect of attaching the --help and --root-dir
suggestions even to the "cannot be a base" error.

```
error: invalid value 'a:datafdajsio' for '--base-url <BASE_URL>': Error
with base dir `a:datafdajsio` : The given URL cannot be used as a base
URL. See `--help` for more information. If you want to resolve
root-relative links in local files, also see `--root-dir`.
```

this could be a bit confusing, but idk a way around it without
inspecting the error message of InvalidBase to determine which context
to add.
@katrinafyi
Contributor Author

Thanks for the fix and the hint about context. I've made the changes and it makes a bit more sense now. See the commit comment, though.

Member

@mre mre left a comment

This looks great now! Unless you'd like to try the suggestion in the comment, which I'm not sure will work (and which you've possibly tried already), we can go ahead and merge this. Great progress.

add a sentence about URL schemes to cover the "cannot be a base"
error.
@katrinafyi
Contributor Author

I've tweaked the --base-url help text to read a bit better from top to bottom. I'm happy for it to be merged anytime now :)

@mre mre merged commit ff53ca3 into lycheeverse:master Oct 1, 2025
6 checks passed
@mre
Member

mre commented Oct 1, 2025

Good work, thanks

@mre mre mentioned this pull request Oct 1, 2025
This was referenced Oct 21, 2025
@thomas-zahner
Member

Documentation is outdated:

      - name: Link Checker
        id: lychee
        uses: lycheeverse/lychee-action@v2
        with:
          args: --base-url dist --exclude-all-private dist
          fail: false

Will now lead to an error as expected. @katrinafyi how would you fix this?

I tried

          args: --dump --base-url "$(pwd)/dist/" --exclude-all-private dist

but this doesn't work. I guess this is a GitHub workflow related issue. How would you be able to access the absolute path?
I've also tried ${GITHUB_WORKSPACE} and ${{ github.workspace }} without success.

See lycheeverse/lycheeverse.github.io#114 and https://github.com/lycheeverse/lycheeverse.github.io/actions/runs/19075734005/job/54490637825?pr=114

@katrinafyi
Contributor Author

@thomas-zahner I started a fix here but there's other (unrelated) issues to do with how existing links are excluded: lycheeverse/lycheeverse.github.io#111

In hindsight, maybe we should've been more careful with this change 😅 even if it was slightly buggy, i think base-url was very widely used...

@thomas-zahner
Member

Ah, I missed that one. Thank you. Yes, maybe we were a bit quick to release that, but at the same time we're still not on v1, so I think it's okay as long as the docs are up to date.
