-
Notifications
You must be signed in to change notification settings - Fork 18.9k
hotfix: fix _extract_filename for rfc 5987 #26230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hotfix: fix _extract_filename for rfc 5987 #26230
Conversation
Signed-off-by: NeatGuyCoding <[email protected]>
Summary of ChangesHello @NeatGuyCoding, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request delivers a hotfix that significantly enhances the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes filename extraction from HTTP Content-Disposition headers to properly support RFC 5987 encoding, specifically the filename* parameter which allows for UTF-8 encoded filenames.
Key Changes:
- Added proper RFC 5987
filename*parameter parsing with charset/language support - Enhanced URL path decoding with percent-encoding handling
- Added security protections against path injection attacks
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| api/factories/file_factory.py | Implements RFC 5987 compliant filename extraction with proper charset handling and security measures |
| api/tests/unit_tests/factories/test_file_factory.py | Adds comprehensive test coverage for RFC 5987 parsing, security scenarios, and edge cases |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request provides a hotfix for filename extraction from the Content-Disposition header, specifically to add support for RFC 5987's filename* parameter. The new implementation is significantly more robust, handling various encodings and malformed headers. The addition of a comprehensive test suite is a great improvement and ensures the new logic is well-covered. I've identified a couple of areas for improvement to enhance robustness and code quality, detailed in the comments below.
Signed-off-by: NeatGuyCoding <[email protected]>
Signed-off-by: NeatGuyCoding <[email protected]>
Important
Fixes #<issue number>.Summary
close #28754
This pull request refactors and strengthens the filename extraction logic from remote file URLs and Content-Disposition headers, with a particular focus on proper RFC5987 support and security against path injection. It also introduces a comprehensive set of unit tests to ensure correct behavior across a wide range of scenarios, including encoding edge cases and malicious input.
Filename extraction improvements:
_extract_filenamefunction infile_factory.pyto robustly parse RFC5987filename*parameters, handle charset and language tags, decode percent-encoding, and always sanitize the result usingos.path.basenameto prevent path injection. The function now also returnsNonefor empty or whitespace-only filenames.reimport to support regular expression parsing in the filename extraction logic.Testing enhancements:
TestExtractFilenameclass totest_file_factory.pywith extensive unit tests covering RFC5987 parsing, encoding/decoding, language tags, fallback behaviors, path injection protection, and edge cases like empty filenames._extract_filenamefor direct testing.Screenshots
Checklist
dev/reformat(backend) andcd web && npx lint-staged(frontend) to appease the lint gods