perf(#1470): cache compiled PathMatchers in GlobFilter constructor#1471
perf(#1470): cache compiled PathMatchers in GlobFilter constructor#1471idrisoffrinat-cpu wants to merge 2 commits into
Conversation
…nstructor GlobFilter.test() used to compile the include/exclude glob patterns into PathMatcher objects on every invocation, despite includes and excludes being final. In a typical run over thousands of .class files that meant every pattern was compiled N times instead of once. Move the compilation to the constructor and store the resulting matchers as final fields. test() now reads the precomputed sets.
📝 WalkthroughWalkthroughThe Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/main/java/org/eolang/jeo/GlobFilter.java (1)
54-59: Optional: considerList<PathMatcher>instead ofSet<PathMatcher>.
PathMatcherimplementations don't overrideequals/hashCode, so theSetdeduplicates by identity only — every compiled matcher is distinct regardless of pattern. Since the sourceincludes/excludesare alreadySet<String>(patterns are unique), you gain no dedup but pay for hashing on insert and iteration. AList<PathMatcher>would be a touch more efficient and semantically clearer.♻️ Proposed refactor
- private final Set<PathMatcher> whitelist; + private final List<PathMatcher> whitelist; @@ - private final Set<PathMatcher> blacklist; + private final List<PathMatcher> blacklist; @@ - this.whitelist = includes.stream() - .map(GlobFilter::matcher) - .collect(Collectors.toSet()); - this.blacklist = excludes.stream() - .map(GlobFilter::matcher) - .collect(Collectors.toSet()); + this.whitelist = includes.stream() + .map(GlobFilter::matcher) + .collect(Collectors.toUnmodifiableList()); + this.blacklist = excludes.stream() + .map(GlobFilter::matcher) + .collect(Collectors.toUnmodifiableList());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/org/eolang/jeo/GlobFilter.java` around lines 54 - 59, The whitelist and blacklist are built as Set<PathMatcher> but PathMatcher doesn't implement equals/hashCode so the Set provides no real dedup and wastes hashing; change the fields in GlobFilter from Set<PathMatcher> to List<PathMatcher>, build them with includes.stream().map(GlobFilter::matcher).collect(Collectors.toList()) and excludes.stream().map(GlobFilter::matcher).collect(Collectors.toList()), and update any methods that iterate or test matchers (referencing whitelist, blacklist and the matcher(...) method) to use List semantics (e.g., iterate with for/stream anyMatch) rather than Set-specific operations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/main/java/org/eolang/jeo/GlobFilter.java`:
- Around line 54-59: The whitelist and blacklist are built as Set<PathMatcher>
but PathMatcher doesn't implement equals/hashCode so the Set provides no real
dedup and wastes hashing; change the fields in GlobFilter from Set<PathMatcher>
to List<PathMatcher>, build them with
includes.stream().map(GlobFilter::matcher).collect(Collectors.toList()) and
excludes.stream().map(GlobFilter::matcher).collect(Collectors.toList()), and
update any methods that iterate or test matchers (referencing whitelist,
blacklist and the matcher(...) method) to use List semantics (e.g., iterate with
for/stream anyMatch) rather than Set-specific operations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 955fbaca-d7cd-484b-8c53-d08b9fb148e6
📒 Files selected for processing (1)
src/main/java/org/eolang/jeo/GlobFilter.java
volodya-lombrozo
left a comment
There was a problem hiding this comment.
@idrisoffrinat-cpu great thanks for the contribution, just a small suggestion.
| GlobFilter(final Set<String> includes, final Set<String> excludes) { | ||
| this.includes = includes; | ||
| this.excludes = excludes; | ||
| this.whitelist = includes.stream() |
There was a problem hiding this comment.
@idrisoffrinat-cpu Can we create a separate constructor for the GolbFilter that will accept all four params? GlobFilter(final Set<String> includes, final Set<String> excludes, Set<PathMatcher> whitelist, private final Set<PathMatcher> blacklist;)
🚀 Performance AnalysisAll benchmarks are within the acceptable range. No critical degradation detected (threshold is 100%). Please refer to the detailed report for more information. Click to see the detailed report
|
PathMatcher does not override equals/hashCode, so Set deduplicates only by identity — providing no real dedup since the source includes and excludes are already Set<String> with unique patterns. Switch to List<PathMatcher> built with Collectors.toUnmodifiableList() to drop the unnecessary hashing on insert and to express intent more clearly. Per CodeRabbit nitpick on PR objectionary#1471.
Closes #1470.
Problem
GlobFilter.test()recompiled every include/exclude glob into aPathMatcheron every invocation, even though
includesandexcludesarefinalandimmutable. For a plugin run that filters thousands of
.classfiles, eachpattern was compiled N times instead of once.
Fix
Move the
PathMatchercompilation to the constructor and store the resultingsets as
finalfields.test()now reads precomputedwhitelist/blacklistinstead of rebuilding them per call. The matcher-building logic itself is
unchanged (same
GlobFilter::matcherhelper, sameCollectors.toSet()).This also aligns with the immutable-object principle the class already follows
for its string fields — all state is established at construction time.
Testing
mvn test -Dtest=GlobFilterTest— 12/12 green, no new or modified testsneeded since behaviour is identical.
Summary by CodeRabbit
Release Notes