Commit ca2690d
committed
Added binary index header implementation with benchmarks.
This PR adds index-header implementation based on [this design](https://thanos.io/proposals/201912_thanos_binary_index_header.md/)
It adds a separate indexheader.Binary* structs and method allowing to build and read index-header in binary format.
## Stats
Size difference:
10k series for my autogenerated data it's 2.1x
-rw-r--r-- 1 bwplotka bwplotka 6.1M Jan 10 13:20 index
-rw-r--r-- 1 bwplotka bwplotka 23K Jan 10 13:20 index.cache.json
-rw-r--r-- 1 bwplotka bwplotka 9.2K Jan 10 13:20 index-header
For realistic block 8mln series, also similar gain.
-rw-r--r-- 1 bwplotka bwplotka 1.9G Jan 10 13:29 index
-rw-r--r-- 1 bwplotka bwplotka 287M Jan 10 13:29 index.cache.json
-rw-r--r-- 1 bwplotka bwplotka 122M Jan 10 13:29 index-header
NOTE: Size is smaller, but it's not what we are trying to optimize for. Nevertheless
PostingOffsets and Symbols takes significant amount of bytes. The only downsides of size
is the fact that to create such index-header we have to fetch those two parts ~60MB each from object storage.
Idea for improvement if that will become a problem: Cache only 32th of the posting ranges and fetch gaps between on demand
on query time (with some cache).
Real time latencies for creation and loading (without network traffic):
For 10k block it's similar for both (ms/micros), for 8mln we can spot the difference:
index-header:
* write 134.197732ms
* read 415.971774ms
index-cache.json:
* write 6.712496338s
* read 6.112222132s
## Go Benchmarks:
Before comparing I changed names to correlate tests:
BenchmarkJSONReader-12-> BenchmarkRead-12 old
BenchmarkBinaryReader-12 -> BenchmarkRead-12 new
BenchmarkJSONWrite-12 -> BenchmarkWrite-12 old
BenchmarkBinaryWrite-12 -> BenchmarkWrite-12 new
### 10k series block:
benchmark old ns/op new ns/op delta
BenchmarkRead-12 591780 66613 -88.74%
BenchmarkWrite-12 2458454 6532651 +165.72%
benchmark old allocs new allocs delta
BenchmarkRead-12 2306 629 -72.72%
BenchmarkWrite-12 1995 64 -96.79%
benchmark old bytes new bytes delta
BenchmarkRead-12 150904 32976 -78.15%
BenchmarkWrite-12 161501 73412 -54.54%
CPU time for smaller index file is interesting. Value is low anyway. Might be
something to follow up.
### 8mln series (index takes 2GB so not committed to git):
benchmark old ns/op new ns/op delta
BenchmarkRead-12 6221319001 502898265 -91.92%
BenchmarkWrite-12 5609757863 286510336 -94.89%
benchmark old allocs new allocs delta
BenchmarkRead-12 20099976 5501314 -72.63%
BenchmarkWrite-12 18263425 66 -100.00%
benchmark old bytes new bytes delta
BenchmarkRead-12 1873778853 406021704 -78.33%
BenchmarkWrite-12 2133929462 8462761 -99.60%
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>1 parent 718e51a commit ca2690d
File tree
17 files changed
+1376
-106
lines changed- docs/components
- pkg
- block
- indexheader
- testdata
- index_format_v1
- chunks
- index_format_v2
- chunks
- store
- testutil
17 files changed
+1376
-106
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
221 | 221 | | |
222 | 222 | | |
223 | 223 | | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
| 276 | + | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
70 | 70 | | |
71 | 71 | | |
72 | 72 | | |
73 | | - | |
| 73 | + | |
74 | 74 | | |
75 | 75 | | |
76 | 76 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
446 | 446 | | |
447 | 447 | | |
448 | 448 | | |
449 | | - | |
450 | | - | |
| 449 | + | |
| 450 | + | |
451 | 451 | | |
452 | 452 | | |
453 | 453 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
27 | 27 | | |
28 | 28 | | |
29 | 29 | | |
30 | | - | |
| 30 | + | |
31 | 31 | | |
| 32 | + | |
| 33 | + | |
32 | 34 | | |
33 | 35 | | |
34 | 36 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
6 | 5 | | |
7 | 6 | | |
8 | 7 | | |
| |||
12 | 11 | | |
13 | 12 | | |
14 | 13 | | |
15 | | - | |
16 | 14 | | |
17 | 15 | | |
18 | 16 | | |
| |||
104 | 102 | | |
105 | 103 | | |
106 | 104 | | |
107 | | - | |
| 105 | + | |
108 | 106 | | |
109 | 107 | | |
110 | 108 | | |
| |||
115 | 113 | | |
116 | 114 | | |
117 | 115 | | |
118 | | - | |
| 116 | + | |
119 | 117 | | |
120 | 118 | | |
121 | 119 | | |
| |||
125 | 123 | | |
126 | 124 | | |
127 | 125 | | |
128 | | - | |
| 126 | + | |
129 | 127 | | |
130 | 128 | | |
131 | 129 | | |
| |||
136 | 134 | | |
137 | 135 | | |
138 | 136 | | |
139 | | - | |
| 137 | + | |
140 | 138 | | |
141 | 139 | | |
142 | 140 | | |
| |||
170 | 168 | | |
171 | 169 | | |
172 | 170 | | |
173 | | - | |
174 | | - | |
175 | | - | |
176 | | - | |
177 | | - | |
178 | | - | |
179 | | - | |
180 | | - | |
181 | | - | |
182 | | - | |
183 | | - | |
184 | | - | |
185 | | - | |
186 | | - | |
187 | | - | |
188 | | - | |
189 | | - | |
190 | | - | |
191 | | - | |
192 | | - | |
193 | | - | |
194 | | - | |
195 | | - | |
196 | | - | |
197 | | - | |
198 | 171 | | |
199 | 172 | | |
200 | 173 | | |
| |||
0 commit comments