-
Notifications
You must be signed in to change notification settings - Fork 1.4k
ReplaceNewLinesLayoutRendererWrapper - Replace all line ending characters #5915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ters Windows uses \r\n, Linux and new Macs use \n. Older Macs use \r. Unicode adds more line endings. Old implementation didn't even replace single \r characters. Text editors like notepad++ treat single \r as new line character and log files aren't formatted as expected when using newline replacements. Line ending characters replaced are those defined in unicode 16.0.0 specification section 5.8 Newline guidelines in table 5-1.
|
Thanks for opening this pull request! |
|
""" WalkthroughThe changes update the newline replacement logic in the ReplaceNewLinesLayoutRendererWrapper to explicitly handle a wider range of newline and line separator characters. The implementation now processes newlines using a character-by-character approach. Corresponding unit tests are added to verify behavior with various newline types, multicharacter replacements, consecutive newlines, and default replacement scenarios. Changes
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/NLog/LayoutRenderers/Wrappers/ReplaceNewLinesLayoutRendererWrapper.cs (1)
75-75: Consider using HashSet for better performance.The LineEndCharacters array contains the correct Unicode newline characters as per the Unicode 16.0.0 specification. However, using
char[].Contains()in the processing loop results in O(n) lookups for each character.Consider replacing the array with a HashSet for O(1) lookups:
-private static readonly char[] LineEndCharacters = new char[] { '\u000D', '\u000A', '\u0085', '\u2028', '\u000C', '\u2029' }; +private static readonly HashSet<char> LineEndCharacters = new HashSet<char> { '\u000D', '\u000A', '\u0085', '\u2028', '\u000C', '\u2029' };
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/NLog/LayoutRenderers/Wrappers/ReplaceNewLinesLayoutRendererWrapper.cs(2 hunks)tests/NLog.UnitTests/LayoutRenderers/Wrappers/ReplaceNewLinesTests.cs(1 hunks)
🔇 Additional comments (6)
src/NLog/LayoutRenderers/Wrappers/ReplaceNewLinesLayoutRendererWrapper.cs (3)
37-37: LGTM - Import added for Contains() method.The System.Linq import is correctly added to support the
Contains()method used in the new implementation.
82-82: LGTM - Efficient early detection of newline characters.Using
IndexOfAnyis an efficient way to detect if any newline characters are present before processing the string character by character.
85-109: LGTM - Correct implementation of Unicode newline replacement.The new character-by-character processing logic correctly handles:
- Windows newline sequence
\r\nas a single replacement (lines 93-97)- Other Unicode newline characters individually (lines 98-101)
- Non-newline characters are preserved (lines 104-106)
The StringBuilder capacity calculation (line 87) is a good optimization to reduce memory allocations.
tests/NLog.UnitTests/LayoutRenderers/Wrappers/ReplaceNewLinesTests.cs (3)
139-149: LGTM - Test covers standalone carriage return replacement.This test correctly verifies that standalone
\rcharacters are now properly replaced, addressing the core issue mentioned in the PR description where single\rcharacters were not handled by the previous implementation.
151-161: LGTM - Comprehensive test for all Unicode newline characters.This test thoroughly verifies that all Unicode newline characters defined in the implementation are correctly replaced with a custom replacement string. The test string includes:
\n(LF)\r\n(CRLF)\r(CR)\u0085(NEL)\u2028(LS)\u000C(FF)\u2029(PS)
163-173: LGTM - Test verifies default replacement behavior.This test ensures that all Unicode newline characters are correctly replaced with the default space character, providing comprehensive coverage of the enhanced functionality.
| /// <docgen category='Layout Options' order='50' /> | ||
| public string ReplaceNewLines { get => Replacement; set => Replacement = value; } | ||
|
|
||
| private static readonly char[] LineEndCharacters = new char[] { '\u000D', '\u000A', '\u0085', '\u2028', '\u000C', '\u2029' }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using unicode-numbers directly, then I prefer the more human-readable standard chars when possible. Ex \n + \r + etc.
src/NLog/LayoutRenderers/Wrappers/ReplaceNewLinesLayoutRendererWrapper.cs
Outdated
Show resolved
Hide resolved
src/NLog/LayoutRenderers/Wrappers/ReplaceNewLinesLayoutRendererWrapper.cs
Outdated
Show resolved
Hide resolved
src/NLog/LayoutRenderers/Wrappers/ReplaceNewLinesLayoutRendererWrapper.cs
Outdated
Show resolved
Hide resolved
src/NLog/LayoutRenderers/Wrappers/ReplaceNewLinesLayoutRendererWrapper.cs
Outdated
Show resolved
Hide resolved
src/NLog/LayoutRenderers/Wrappers/ReplaceNewLinesLayoutRendererWrapper.cs
Outdated
Show resolved
Hide resolved
src/NLog/LayoutRenderers/Wrappers/ReplaceNewLinesLayoutRendererWrapper.cs
Show resolved
Hide resolved
src/NLog/LayoutRenderers/Wrappers/ReplaceNewLinesLayoutRendererWrapper.cs
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for being patient with me and all my comments.
Very happy about the result, and think it is a good improvement together with the new unit-tests.
|
Closing an re-opening the pull-request to tickle the build-servers. |
Windows uses \r\n, Linux and new Macs use \n. Older Macs use \r. Unicode adds more line endings. Old implementation didn't even replace single \r characters. Text editors like notepad++ treat single \r as new line character and log files aren't formatted as expected when using newline replacements.
Line ending characters replaced are those defined in unicode 16.0.0 specification section 5.8 Newline guidelines in table 5-1.
\r// CARRIAGE RETURN\n// LINE FEED\f// FORM FEED (FF)\u0085// NEXT LINE (NEL)\u2028// LINE SEPARATOR\u2029// PARAGRAPH SEPARATOR