Commit 52b01fd
Added binary index header implementation with benchmarks.
This PR adds an index-header implementation based on [this design](https://thanos.io/proposals/201912_thanos_binary_index_header.md/).
It adds separate indexheader.Binary* structs and methods that allow building and reading the index-header in binary format.
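To illustrate what "building and reading a header in binary format" generally involves, here is a minimal sketch in Go. It is not the format from the design doc and not the indexheader.Binary* API: the magic number, version byte, layout, and the names `encodeHeader` and `decodeHeader` are all hypothetical, chosen only to show a length-prefixed round trip with `encoding/binary`.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// Hypothetical values for illustration only; the real format is defined by the design doc.
const (
	sketchMagic   uint32 = 0xC0FFEE01
	sketchVersion byte   = 1
	fixedLen             = 9 // 4 bytes magic + 1 byte version + 4 bytes payload length
)

// encodeHeader writes magic, version and a length-prefixed payload.
func encodeHeader(payload []byte) []byte {
	var fixed [fixedLen]byte
	binary.BigEndian.PutUint32(fixed[0:4], sketchMagic)
	fixed[4] = sketchVersion
	binary.BigEndian.PutUint32(fixed[5:9], uint32(len(payload)))

	var buf bytes.Buffer
	buf.Write(fixed[:])
	buf.Write(payload)
	return buf.Bytes()
}

// decodeHeader validates magic and version, then returns the payload.
func decodeHeader(b []byte) ([]byte, error) {
	if len(b) < fixedLen {
		return nil, errors.New("header too short")
	}
	if binary.BigEndian.Uint32(b[0:4]) != sketchMagic {
		return nil, errors.New("bad magic")
	}
	if b[4] != sketchVersion {
		return nil, fmt.Errorf("unsupported version %d", b[4])
	}
	n := binary.BigEndian.Uint32(b[5:9])
	if uint64(len(b)-fixedLen) < uint64(n) {
		return nil, errors.New("truncated payload")
	}
	return b[fixedLen : fixedLen+n], nil
}

func main() {
	raw := encodeHeader([]byte("symbols"))
	payload, err := decodeHeader(raw)
	fmt.Println(len(raw), string(payload), err)
}
```

A fixed-size prefix with explicit lengths lets a reader validate and slice the file without parsing it as a whole, which is one plausible reason a binary layout reads faster than JSON.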
## Stats
Size difference:
For a 10k-series block of my autogenerated data, the index-header is about 2.1x smaller than index.cache.json:
-rw-r--r-- 1 bwplotka bwplotka 6.1M Jan 10 13:20 index
-rw-r--r-- 1 bwplotka bwplotka 23K Jan 10 13:20 index.cache.json
-rw-r--r-- 1 bwplotka bwplotka 9.2K Jan 10 13:20 index-header
For a realistic block with 8mln series, the gain is similar:
-rw-r--r-- 1 bwplotka bwplotka 1.9G Jan 10 13:29 index
-rw-r--r-- 1 bwplotka bwplotka 287M Jan 10 13:29 index.cache.json
-rw-r--r-- 1 bwplotka bwplotka 122M Jan 10 13:29 index-header
NOTE: The size is smaller, but that is not what we are trying to optimize for. Nevertheless,
PostingOffsets and Symbols take a significant number of bytes. The only downside of the size
is that to create such an index-header we have to fetch those two parts (~60MB each) from object storage.
Idea for improvement, if that becomes a problem: cache only a 32nd of the posting ranges and fetch the gaps in between on demand
at query time (with some cache).
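The sampling idea above can be sketched as follows. This is a hypothetical illustration, not code from this PR: the `postingOffset` type, the helper names, and the choice to keep the last entry are assumptions. It keeps every n-th entry of a sorted offset table in memory and, for a lookup, returns the byte range between the two neighbouring kept entries that would have to be fetched on demand.

```go
package main

import (
	"fmt"
	"sort"
)

// postingOffset is a hypothetical table entry: a label value and its byte offset.
type postingOffset struct {
	value  string
	offset int64
}

// buildTable generates a sorted example table with fixed-size entries.
func buildTable(n int) []postingOffset {
	t := make([]postingOffset, n)
	for i := range t {
		t[i] = postingOffset{value: fmt.Sprintf("v%03d", i), offset: int64(i) * 16}
	}
	return t
}

// sample keeps every n-th entry of a sorted table, plus the last entry.
func sample(all []postingOffset, n int) []postingOffset {
	var kept []postingOffset
	for i, e := range all {
		if i%n == 0 || i == len(all)-1 {
			kept = append(kept, e)
		}
	}
	return kept
}

// fetchRange returns the offsets of the kept entries that bracket value.
// ok is false when value lies outside the table.
func fetchRange(kept []postingOffset, value string) (start, end int64, ok bool) {
	if len(kept) == 0 || value < kept[0].value || value > kept[len(kept)-1].value {
		return 0, 0, false
	}
	// Index of the first kept entry whose value is >= the wanted value.
	i := sort.Search(len(kept), func(i int) bool { return kept[i].value >= value })
	if kept[i].value == value {
		return kept[i].offset, kept[i].offset, true // exact hit, nothing to scan
	}
	return kept[i-1].offset, kept[i].offset, true
}

func main() {
	kept := sample(buildTable(100), 32)
	start, end, ok := fetchRange(kept, "v040")
	fmt.Println(len(kept), start, end, ok)
}
```

With 100 entries and n = 32, only 5 entries stay in memory, and a lookup for `v040` needs the bytes between offsets 512 and 1024. The trade-off is an extra (cacheable) range request at query time in exchange for a much smaller header.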
Wall-clock latencies for creation and loading (without network traffic):
For the 10k block they are similar for both formats (milliseconds/microseconds); for 8mln series we can spot the difference:
index-header:
* write 134.197732ms
* read 415.971774ms
index.cache.json:
* write 6.712496338s
* read 6.112222132s
## Go Benchmarks:
Before comparing, I renamed the benchmarks so that the old and new results line up:
* BenchmarkJSONReader-12 -> BenchmarkRead-12 (old)
* BenchmarkBinaryReader-12 -> BenchmarkRead-12 (new)
* BenchmarkJSONWrite-12 -> BenchmarkWrite-12 (old)
* BenchmarkBinaryWrite-12 -> BenchmarkWrite-12 (new)
### 10k series block:
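| benchmark | old ns/op | new ns/op | delta |
|---|---|---|---|
| BenchmarkRead-12 | 591780 | 66613 | -88.74% |
| BenchmarkWrite-12 | 2458454 | 6532651 | +165.72% |

| benchmark | old allocs | new allocs | delta |
|---|---|---|---|
| BenchmarkRead-12 | 2306 | 629 | -72.72% |
| BenchmarkWrite-12 | 1995 | 64 | -96.79% |

| benchmark | old bytes | new bytes | delta |
|---|---|---|---|
| BenchmarkRead-12 | 150904 | 32976 | -78.15% |
| BenchmarkWrite-12 | 161501 | 73412 | -54.54% |

The CPU time for writing the smaller index file is interesting: the binary write is slower here (+165.72%). The absolute value is low anyway (about 6.5ms per op). It might be
something to follow up on.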
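For reading the delta columns: the reported values are consistent with the relative change `(new - old) / old`, expressed in percent. The small Go sketch below recomputes two of the ns/op deltas from the 10k block under that assumption; `formatDelta` is a hypothetical helper, not part of any benchmark tool.

```go
package main

import "fmt"

// delta returns the relative change from oldV to newV in percent.
func delta(oldV, newV float64) float64 {
	return (newV - oldV) / oldV * 100
}

// formatDelta renders the change with an explicit sign and two decimals.
func formatDelta(oldV, newV float64) string {
	return fmt.Sprintf("%+.2f%%", delta(oldV, newV))
}

func main() {
	fmt.Println("Read: ", formatDelta(591780, 66613))   // ns/op, 10k block
	fmt.Println("Write:", formatDelta(2458454, 6532651)) // ns/op, 10k block
}
```

A negative delta means the binary format is cheaper than the JSON one; the positive write delta is the outlier discussed above.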
### 8mln series (index takes 2GB so not committed to git):
| benchmark | old ns/op | new ns/op | delta |
|---|---|---|---|
| BenchmarkRead-12 | 7026290474 | 552913402 | -92.13% |
| BenchmarkWrite-12 | 6480769814 | 276441977 | -95.73% |

| benchmark | old allocs | new allocs | delta |
|---|---|---|---|
| BenchmarkRead-12 | 20100014 | 5501312 | -72.63% |
| BenchmarkWrite-12 | 18263356 | 64 | -100.00% |

| benchmark | old bytes | new bytes | delta |
|---|---|---|---|
| BenchmarkRead-12 | 1873789526 | 406021516 | -78.33% |
| BenchmarkWrite-12 | 2385193317 | 74187 | -100.00% |

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

1 parent 718e51a, commit 52b01fd
File tree (17 files changed, +1376 -106 lines changed):
- docs/components
- pkg
- block
- indexheader
- testdata
- index_format_v1
- chunks
- index_format_v2
- chunks
- store
- testutil