
Commit 8fe3977

only split by newlines
To reduce the overhead of the Extractor itself, we can chunk the work by lines instead of by every whitespace-separated chunk. This seems to improve the overall cost even more!

Co-authored-by: Jordan Pittman <jordan@cryptica.me>
1 parent e99f276 commit 8fe3977

1 file changed, +1 -1 lines changed

crates/oxide/src/lib.rs

Lines changed: 1 addition & 1 deletion
@@ -456,7 +456,7 @@ fn read_all_files(changed_content: Vec<ChangedContent>) -> Vec<Vec<u8>> {
 fn parse_all_blobs(blobs: Vec<Vec<u8>>) -> Vec<String> {
     let mut result: Vec<_> = blobs
         .par_iter()
-        .flat_map(|blob| blob.par_split(|x| x.is_ascii_whitespace()))
+        .flat_map(|blob| blob.par_split(|x| matches!(x, b'\n' | b'\r')))
         .map(|blob| Extractor::unique(blob, Default::default()))
         .reduce(Default::default, |mut a, b| {
             a.extend(b);
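
To illustrate why the new predicate produces fewer and larger work items, here is a minimal std-only sketch. It swaps rayon's `par_split` for the sequential `slice::split` so it runs without external crates. The helper names `split_on_whitespace` and `split_on_newlines` and the sample markup are hypothetical and are not part of the crate.

```rust
/// Old behaviour (sketch): every ASCII whitespace byte starts a new chunk.
/// Hypothetical helper; the real code uses rayon's `par_split`.
fn split_on_whitespace(blob: &[u8]) -> Vec<&[u8]> {
    blob.split(|x| x.is_ascii_whitespace()).collect()
}

/// New behaviour (sketch): only `\n` and `\r` start a new chunk.
/// Hypothetical helper; the real code uses rayon's `par_split`.
fn split_on_newlines(blob: &[u8]) -> Vec<&[u8]> {
    blob.split(|x| matches!(x, b'\n' | b'\r')).collect()
}

fn main() {
    // Made-up sample input, used only to compare chunk counts.
    let blob = b"<div class=\"flex p-4\">\n  <span class=\"font-bold\">hi</span>\n</div>";

    let by_whitespace = split_on_whitespace(blob);
    let by_newline = split_on_newlines(blob);

    // Each chunk would become one call to the extractor, so fewer chunks
    // means less per-call overhead.
    println!("whitespace chunks: {}", by_whitespace.len());
    println!("newline chunks:    {}", by_newline.len());
}
```

Because both `\n` and `\r` act as separators, a CRLF line ending yields an empty chunk between the two bytes. Whether the extractor treats empty input as cheap is not shown in this diff.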
