-
-
Notifications
You must be signed in to change notification settings - Fork 183
Redirect reporting #1725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redirect reporting #1725
Conversation
…g an error. This allows users to still accept redirections
|
Glad you're tackling this.
Yeah, that's fine. Although the naming of the flag is a bit confusing then. Maybe
Looking at planned features (like per-host rate-limiting and recursion), I believe we'll end up adding more metadata to requests and responses in the future anyway. The question is if we should start with this PR already and instead of a redirect-map, we add a |
|
Sure, no problem. Thanks for the idea in your second point, I'll take a look at it.
So you think we should rename the feature flag even though we don't change its functionality? |
|
I think we should probably separate redirect tracking from the existing I initially thought redirects could be considered suggestions since you might want to update links to point directly to the final destination, but the reality is more nuanced. Many redirects are intentional and shouldn't be "fixed" - URL shorteners, CDN routing, domain migrations with proper redirects, etc. Treating every redirect as a "suggestion to fix" might create noise and dilute the actionable nature of actual suggestions. If we at some point introduce an The use cases feel somewhat different too. Suggestions tend to answer "what's broken and needs fixing" while redirects answer "what path did this request take." Broken links generally force action - your site is broken until you fix them. Redirects are more about inefficiency - your site works fine, you're just making extra HTTP requests. Perhaps some users are also just curious about the redirect chain (similar to I'm leaning toward keeping |
|
Yes I agree with that. That was basically my initial proposal, maybe you misunderstood me.
What do you think about this? I'm also okay with your proposal of a new flag called |
Yes! Good idea to show them by default. Or alternative with |
Yes, probably. Sorry about that. |
d637544 to
5009924
Compare
Yes, it probably makes sense to do that at one point. However, this will not resolve the |
|
Makes sense, yeah; The draft looks good. use std::{collections::HashMap, sync::{Arc, Mutex}};
use url::Url;
#[derive(Debug, Clone)]
pub struct RedirectTracker(Arc<Mutex<HashMap<Url, Url>>>);
impl RedirectTracker {
pub fn new() -> Self {
Self(Arc::new(Mutex::new(HashMap::new())))
}
pub fn record_redirect(&self, original: Url, resolved: Url) {
if let Ok(mut map) = self.0.lock() {
map.insert(original, resolved);
}
}
pub fn get_resolved(&self, original: &Url) -> Option<Url> {
self.0.lock().ok()?.get(original).cloned()
}
pub fn all_redirects(&self) -> HashMap<Url, Url> {
self.0.lock().unwrap_or_else(|_| {
// Handle poisoned mutex gracefully
HashMap::new()
}).clone()
}
pub fn clear(&self) {
if let Ok(mut map) = self.0.lock() {
map.clear();
}
}
}
impl Default for RedirectTracker {
fn default() -> Self {
Self::new()
}
}I also thought of a simple type alias, but I believe this is a bit more declarative and encapsulates the actual tracking. |
|
@mre Thanks, I like the |
5009924 to
c11fee9
Compare
59d8919 to
7513537
Compare
|
FYI In 7513537 I removed When the redirect limit is reached the resulting status should be treated just like any normal status code. Depending on the accept config this will be either This is the best and simplest approach in my opinion because of the configurability with |
|
Fully agree. Great explanation. 👌 |
|
@mre Thanks for taking a look already! I've just now finished some final changes and the PR is now in a state I'm happy with, apart from this one other comment. Will reply to your comments later. |
57413c4 to
21f434b
Compare
I've reconsidered this statement and realised that this is not the path we should follow. I've reverted mentioned commit, so that SummaryUpdated default behaviourNow we list the Max redirectsThe following lead to a hard error. Now it does no longer result in a "hard" error, but the link is just no longer followed so that the result will normally be a 3XX status code. Detailed output and JSONPreviously it was impossible to discern redirects. Now it is and users get new useful information. |
This error message sounds a bit misleading to me given that changing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit of switching to macros? Or was it necessary because of the changes?
| pub(crate) suggestion_map: HashMap<InputSource, HashSet<Suggestion>>, | ||
| /// Map to store excluded responses (if `detailed_stats` is enabled) | ||
| /// Store redirected responses with their redirection list (if `detailed_stats` is enabled) | ||
| pub(crate) redirect_map: HashMap<InputSource, HashSet<ResponseBody>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really store the redirection list here as the comment suggests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see my previous comment section Detailed output and JSON. It's the new redirect_map field. I've added it for a more detailed JSON output. Maybe with their redirection list was a bit misleading? I've removed it now.
| .. | ||
| } = request.try_into()?; | ||
|
|
||
| // Allow filtering based on element and attribute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was dead code but can we replace it with a TODO? I think we still want to do that. We could even create an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, creating an issue for that definitely makes sense. Could you do that? Because I'm not quite sure what the comment and disabled code tried to convey. To me, the comment does not seem to relate to the code below.
Is the comment requesting a feature like: Allow filtering based on HTML elements and HTML attributes? If so, the comment is in the wrong place.
@mre That was the behaviour before this PR, it always resulted in
The macros |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! 👌
Fixes #1433
We want to report redirects at the end of the link check process, the same way we do for example with
--suggest. The output formatmarkdownfor example should then also contain a section with all listed redirects.Summary bug
There is a category
Redirectedin the summary. However, redirects were never counted. This is fixed with 6b28e2b.This means
echo "https://http.codes/301 " | cargo run - --format detailedyieldednow it yields
Questions
Redirectssection in the end just as we currently do withSuggestions. But I think we should show them by default without introducing a new feature/flag. Tying the feature to--suggestwouldn't make sense in my opinion. Do you agree?redirect_map. It is used to keep track of the redirects that lychee followed. This allows us to create the new summary section. (not yet implemented) The map is wrapped in anArc<Mutex<_>>. Do you think this is okay? I probably could try to use channels instead. The reason for theArc<Mutex<_>>is thatredirect::Policy::customtakesFn(Attempt) -> Action + Send + Sync + 'staticas argument, so we can't mutate data in there. I also briefly considered getting rid ofredirect::Policy::customand implement our own mechanism. But that turns out to be way more effort then necessary.