fix(core): Reduce false positives in truncateBase64 for path-like strings by yamadashy · Pull Request #1307 · yamadashy/repomix

yamadashy · 2026-03-26T14:03:18Z

truncateBase64 was incorrectly truncating XPath and file-path strings (e.g., postTransactionAmounts/sharesOwnedFollowingTransaction/value) by misidentifying them as standalone base64 data.

Changes:

Raise MIN_BASE64_LENGTH_STANDALONE from 60 to 256 — truncating short strings saves negligible tokens, and the feature targets large embedded blobs
Require digits in isLikelyBase64() heuristic — real base64-encoded binary data virtually always contains numbers, while path-like strings typically do not
Update and expand tests with XPath/path false-positive cases and a 256-char boundary test

Checklist

Run npm run test
Run npm run lint

…ings Raise MIN_BASE64_LENGTH_STANDALONE from 60 to 256 since truncating short strings saves negligible tokens. Require digits in isLikelyBase64 heuristic since real base64-encoded binary data virtually always contains numbers, while XPath and file path strings typically do not. Closes #1298 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-26T14:03:32Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 13a39b16-eb88-46b1-ab8a-8820faa4e25e

📥 Commits

Reviewing files that changed from the base of the PR and between 1919956 and e2101a5.

📒 Files selected for processing (2)

src/core/file/truncateBase64.ts
tests/core/file/truncateBase64.test.ts

📝 Walkthrough

Walkthrough

The base64 truncation logic is refined to reduce false positives on path-like and XPath-like strings. The minimum standalone base64 length threshold increases from 60 to 256 characters, and the heuristic now rejects candidates lacking numeric digits before proceeding with character-type validation.

Changes

Cohort / File(s)	Summary
Base64 Truncation Logic `src/core/file/truncateBase64.ts`	Increased `MIN_BASE64_LENGTH_STANDALONE` from `60` to `256` to narrow the scope of standalone base64 eligibility. Added a requirement in `isLikelyBase64` to reject strings without digits before applying character-type checks, reducing false positives on paths and XPath expressions.
Base64 Truncation Tests `tests/core/file/truncateBase64.test.ts`	Introduced a new realistic `longBase64` constant and added boundary tests around the 256-character threshold. Added regression test cases to verify truncation does not occur on path-like and XPath-like strings, including those without digits.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

PR #733: Directly introduced the truncateBase64 function and its heuristics, which this PR refines by increasing the threshold and improving the digit-checking logic to address false positives on path-like strings.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: reducing false positives in truncateBase64 for path-like strings, which matches the primary objective of fixing issue `#1298`.
Description check	✅ Passed	The description addresses the linked issue, explains the changes made, and includes a completed checklist with test and lint verification.
Linked Issues check	✅ Passed	The PR fulfills both proposed solutions from issue `#1298`: increasing MIN_BASE64_LENGTH_STANDALONE to 256 and improving isLikelyBase64() by requiring digits to reduce false positives on path-like strings.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to fixing the false positive issue: threshold increase, heuristic improvement, and comprehensive test updates including path-like string regression tests.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/truncate-base64-false-positives

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2026-03-26T14:03:36Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where the truncateBase64 utility was incorrectly identifying and truncating path-like strings (e.g., XPath expressions) as base64 data. The changes refine the base64 detection heuristic and adjust the truncation threshold, ensuring that only genuinely large, embedded base64 blobs are truncated, thereby preventing unintended modifications to valid string data.

Highlights

Base64 Truncation Threshold: The minimum length for standalone base64 strings to be considered for truncation (MIN_BASE64_LENGTH_STANDALONE) has been increased from 60 to 256 characters to prevent truncating shorter, non-blob-like strings.
Base64 Detection Heuristic: The isLikelyBase64() function now requires the presence of digits in a string to identify it as base64, reducing false positives for path-like strings that typically lack numbers.
Test Coverage: New test cases were added to specifically address XPath and file-path false positives, along with a test for the new 256-character truncation boundary.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

github-actions · 2026-03-26T14:04:02Z

⚡ Performance Benchmark

Latest commit:	`e2101a5` test(core): Add boundary test for exactly 256-char base64 string
Status:	✅ Benchmark complete!
Ubuntu:	2.40s (±0.01s) → 2.40s (±0.02s) · -0.00s (-0.2%)
macOS:	1.98s (±0.10s) → 2.06s (±0.16s) · +0.08s (+4.2%)
Windows:	3.65s (±0.06s) → 3.67s (±0.07s) · +0.02s (+0.6%)

Details

Packing the repomix repository with node bin/repomix.cjs
Warmup: 2 runs (discarded)
Measurement: 10 runs / 20 on macOS (median ± IQR)
Workflow run

codecov · 2026-03-26T14:05:12Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.19%. Comparing base (1919956) to head (e2101a5).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1307   +/-   ##
=======================================
  Coverage   87.18%   87.19%           
=======================================
  Files         115      115           
  Lines        4331     4333    +2     
  Branches     1007     1008    +1     
=======================================
+ Hits         3776     3778    +2     
  Misses        555      555

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

claude · 2026-03-26T14:06:39Z

Code Review

Overall: Looks good — clean, well-scoped fix for the false positive issue reported in #1298. The two-pronged approach (threshold increase + digits requirement) is complementary and appropriate.

Noteworthy Observations

The belt-and-suspenders approach works well here. The XPath string from #1298 (postTransactionAmounts/sharesOwnedFollowingTransaction/value, 61 chars) would be caught by either change alone — it's under 256 chars AND lacks digits. Having both guards makes the heuristic more robust against different classes of false positives.

Minor observation on isLikelyBase64 logic flow: After the !hasNumbers early return, hasNumbers is always true in the charTypeCount check. This means MIN_CHAR_TYPE_COUNT (3) effectively requires 2 of the remaining 3 types (uppercase, lowercase, +/). This is fine and maintains the existing contract — just noting the implicit tightening.

Potential edge case to consider (low risk)

Base64-encoded ASCII text (e.g., btoa("Hello World") → SGVsbG8gV29ybGQ=) in the 60–255 char range will no longer be truncated. This is an acceptable trade-off since:

The token savings for strings under 256 chars is minimal (the stated goal)
ASCII-text base64 is less common in real codebases than binary blobs
False positives on real content (like the XPath case) are worse than missing a small base64 string

Test coverage notes

Good coverage of:

Boundary at exactly 256 chars ✓
Above threshold (344 chars) ✓
Below threshold / medium strings (60 chars) ✓
Original false positive (XPath without digits) ✓
Long paths without digits ✓

One optional addition for the future: a test with a base64 string at 255 chars (just below threshold) that does contain digits, verifying it's preserved. The current "medium-length" test uses a 60-char string, so there's a gap in boundary testing from the "not truncated" side.

LGTM 👍

Reviewed by Claude

gemini-code-assist

Code Review

This pull request refines the base64 truncation logic by increasing the minimum length for standalone base64 strings to be truncated from 60 to 256 characters. It also introduces a new heuristic in the isLikelyBase64 function, requiring strings to contain digits to be considered base64, which addresses false positives for path-like or XPath strings (as reported in #1298). The changes include updated and new tests to validate these adjustments and the new truncation threshold. There are no review comments to provide feedback on.

yamadashy and others added 2 commits March 26, 2026 22:52

test(core): Add boundary test for exactly 256-char base64 string

e2101a5

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

devin-ai-integration Bot reviewed Mar 26, 2026

View reviewed changes

coderabbitai Bot approved these changes Mar 26, 2026

View reviewed changes

gemini-code-assist Bot reviewed Mar 26, 2026

View reviewed changes

yamadashy merged commit 2d70fc3 into main Mar 26, 2026
64 checks passed

yamadashy deleted the fix/truncate-base64-false-positives branch March 26, 2026 14:12

This was referenced Mar 26, 2026

Bug: truncateBase64 causes false positives on 60+ char XPath / path-like strings #1298

Open

docs(release): Add v1.13.1 release notes #1309

Merged

coderabbitai Bot mentioned this pull request Apr 26, 2026

perf(core): Automated performance tuning by Claude #1516

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(core): Reduce false positives in truncateBase64 for path-like strings#1307

fix(core): Reduce false positives in truncateBase64 for path-like strings#1307
yamadashy merged 2 commits into
mainfrom
fix/truncate-base64-false-positives

yamadashy commented Mar 26, 2026 •

edited by devin-ai-integration Bot

Loading

Uh oh!

coderabbitai Bot commented Mar 26, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Uh oh!

gemini-code-assist Bot commented Mar 26, 2026

Uh oh!

github-actions Bot commented Mar 26, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Mar 26, 2026 •

edited

Loading

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

claude Bot commented Mar 26, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

yamadashy commented Mar 26, 2026 • edited by devin-ai-integration Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Checklist

Uh oh!

coderabbitai Bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Uh oh!

gemini-code-assist Bot commented Mar 26, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

github-actions Bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚡ Performance Benchmark

Uh oh!

codecov Bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

✅ Devin Review: No Issues Found

Uh oh!

claude Bot commented Mar 26, 2026

Code Review

Noteworthy Observations

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

yamadashy commented Mar 26, 2026 •

edited by devin-ai-integration Bot

Loading

coderabbitai Bot commented Mar 26, 2026 •

edited

Loading

github-actions Bot commented Mar 26, 2026 •

edited

Loading

codecov Bot commented Mar 26, 2026 •

edited

Loading