Commit 1a7a125

Add slides and move/fix lessons.
1 parent d028c50 commit 1a7a125

104 files changed: +1101 -440 lines changed

README.md

Lines changed: 7 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -28,12 +28,12 @@ Fuzzing experience is not required.
2828
4. Writing fuzzers (simple examples)
2929
5. Finding Heartbleed (CVE-2014-0160)
3030
6. Finding c-ares $100,000 bug (CVE-2016-5180)
31-
7. Fuzzing libxml2, learning how to improve the fuzzer and analyze performance
32-
8. Fuzzing libpng, learning the importance of a seed corpus and other topics
33-
9. Fuzzing re2 (TODO: add problems?)
34-
10. Fuzzing pcre2
35-
11. Chromium integration
36-
12. OSS-Fuzz project
31+
7. How to improve your fuzzer
32+
8. Fuzzing libxml2, learning how to improve the fuzzer and analyze performance
33+
9. Fuzzing libpng, learning the importance of a seed corpus and other topics
34+
10. Fuzzing re2 (TODO: add problems?)
35+
11. Fuzzing pcre2
36+
12. Chromium integration & homework assignment
3737

3838

3939
## Prerequisites
@@ -48,6 +48,7 @@ Fuzzer/build.sh
4848

4949
## Links
5050

51+
* all slides in a single presentation: [Google Slides](https://docs.google.com/presentation/d/1pbbXRL7HaNSjyCHWgGkbpNotJuiC4O7L_PDZoGqDf5Q/edit?usp=sharing)
5152
* libFuzzer documentation: [http://libfuzzer.info](http://libfuzzer.info)
5253
* libFuzzer tutorial: [http://tutorial.libfuzzer.info](http://tutorial.libfuzzer.info)
5354
* Google Online Security Blog: [Guided in-process fuzzing of Chrome components](https://security.googleblog.com/2016/08/guided-in-process-fuzzing-of-chrome.html)
576 KB
Binary file not shown.

lessons/01/README.md

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,3 @@
11
# Lesson 01
22

3-
This is a theoretical introduction. Slides will be added here.
4-
5-
TODO: Add slides for *"An introduction to fuzz testing"*
3+
This is a theoretical introduction; see the slides.

lessons/02/README.md

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,8 @@
1111

1212
## Instruction
1313

14-
Use `radamsa` to generate testcases from `seed_corpus`:
14+
Take a look at the [generate_testcases.py](generate_testcases.py) script. Then use
15+
`radamsa` to generate testcases from `seed_corpus`:
1516
```bash
1617
./generate_testcases.py
1718
```
@@ -22,13 +23,14 @@ ls work/corpus/ | wc -l
2223
1000
2324
```
2425

25-
Run fuzzing:
26+
Take a look at the [run_fuzzing.py](run_fuzzing.py) script. Then run fuzzing:
2627
```bash
2728
unxz bin/asan.tar.xz && tar xf bin/asan.tar
2829
./run_fuzzing.py
2930
```
3031

31-
If you don't see any output, no crash has been found.
32+
If you don't see any output, no crash has been found. Feel free to regenerate
33+
the testcases as many times as you like, though it may take a while to find a crash.
3234

3335

3436
[pdfium]: https://pdfium.googlesource.com/pdfium/
507 KB
Binary file not shown.

lessons/03/README.md

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,3 @@
11
# Lesson 03
22

3-
This is a theoretical lesson. Slides will be added here.
4-
5-
TODO: Add slides for *"Coverage-guided fuzzing"*
3+
This is a theoretical lesson; see the slides.
1.01 MB
Binary file not shown.

lessons/07/README.md

Lines changed: 1 addition & 286 deletions
Original file line numberDiff line numberDiff line change
@@ -1,288 +1,3 @@
11
# Lesson 07
22

3-
Here we will be fuzzing [libxml2]. During this lesson we will:
4-
* see the importance of dictionaries
5-
* learn how to minimize the corpus
6-
* generate a coverage report
7-
* catch Out-of-Memory errors and memory leaks
8-
9-
10-
### Build the library
11-
12-
```bash
13-
tar xzf libxml2.tgz
14-
cd libxml2
15-
16-
./autogen.sh
17-
18-
export FUZZ_CXXFLAGS="-O2 -fno-omit-frame-pointer -g -fsanitize=address \
19-
-fsanitize-coverage=edge,indirect-calls,8bit-counters,trace-cmp,trace-div,trace-gep"
20-
21-
CXX="clang++ $FUZZ_CXXFLAGS" CC="clang $FUZZ_CXXFLAGS" \
22-
CCLD="clang++ $FUZZ_CXXFLAGS" ./configure
23-
make -j$(nproc)
24-
```
25-
26-
### Build the first fuzzer
27-
28-
Take a look at the following fuzzer. Note the `xmlSetGenericErrorFunc` call. It
29-
is there to disable logging of error messages like "Incorrect XML document".
30-
These messages are very noisy, given the number of invalid inputs generated by
31-
the fuzzer:
32-
33-
```cpp
34-
#include <stddef.h>
#include <stdint.h>

#include "libxml/parser.h"
35-
36-
void ignore (void* ctx, const char* msg, ...) {
37-
  // Error handler to avoid spam of error messages from libxml parser.
38-
}
39-
40-
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
41-
  xmlSetGenericErrorFunc(NULL, &ignore);
42-
43-
  if (auto doc = xmlReadMemory(reinterpret_cast<const char*>(data),
44-
                               static_cast<int>(size), "noname.xml", NULL, 0)) {
45-
    xmlFreeDoc(doc);
46-
  }
47-
48-
  return 0;
49-
}
50-
```
51-
52-
Then build it:
53-
54-
```bash
55-
cd ..
56-
clang++ -std=c++11 xml_read_memory_fuzzer.cc $FUZZ_CXXFLAGS -I libxml2/include \
57-
libxml2/.libs/libxml2.a ../../libFuzzer/libFuzzer.a -lz \
58-
-o xml_read_memory_fuzzer
59-
```
60-
61-
### Run the fuzzer with and without a dictionary
62-
63-
Run the fuzzer on empty corpus for 5 minutes (`-max_total_time=300`):
64-
65-
```bash
66-
mkdir corpus1
67-
./xml_read_memory_fuzzer -max_total_time=300 -print_final_stats=1 corpus1
68-
```
69-
70-
Open a new terminal and run the fuzzing on empty corpus again, but also add a
71-
dictionary (`-dict=`):
72-
73-
```bash
74-
mkdir corpus2
75-
./xml_read_memory_fuzzer -dict=./xml.dict -max_total_time=300 \
76-
-print_final_stats=1 corpus2
77-
```
78-
79-
Compare the output of both processes while they are running. You should see that the
80-
second process reaches the same coverage as the first one and then overtakes it very
81-
quickly. This is the effect of using a dictionary.
82-
83-
84-
### Corpus and coverage
85-
86-
The first process terminates somewhere at:
87-
88-
```
89-
#1975901 DONE cov: 1736 ft: 5795 corp: 1544/75Kb exec/s: 6564 rss: 494Mb
90-
```
91-
92-
Let's minimize its corpus (using the `-merge=1` flag):
93-
94-
```bash
95-
mkdir corpus1_min
96-
./xml_read_memory_fuzzer -merge=1 corpus1_min corpus1
97-
```
98-
99-
The output looks like:
100-
101-
```bash
102-
INFO: Seed: 1508800405
103-
INFO: Loaded 1 modules (79184 guards): [0xd017e0, 0xd4ed20),
104-
INFO: -max_len is not provided, using 1048576
105-
Loaded 1024/1539 files from corpus1
106-
=== Merging extra 1539 units
107-
#1539 MIN0 cov: 1723 ft: 5810 units: 1008 exec/s: 0 rss: 95Mb
108-
#2547 MIN1 cov: 1724 ft: 5764 units: 987 exec/s: 0 rss: 125Mb
109-
#3534 MIN2 cov: 1724 ft: 5765 units: 975 exec/s: 0 rss: 154Mb
110-
#4509 MIN3 cov: 1724 ft: 5763 units: 971 exec/s: 0 rss: 183Mb
111-
=== Merge: written 971 units
112-
```
113-
114-
That means libFuzzer reduced the corpus from `1539` testcases to `971` with the same code
115-
coverage.
116-
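The size of that reduction can be read straight from the merge log. The snippet below is a minimal sketch, not part of the workshop scripts: the helper name `merge_reduction` is hypothetical, and the sample lines are copied from the output shown above.

```python
import re

# Sample lines copied from the merge output above.
MERGE_LOG = """\
=== Merging extra 1539 units
#4509 MIN3 cov: 1724 ft: 5763 units: 971 exec/s: 0 rss: 183Mb
=== Merge: written 971 units
"""


def merge_reduction(log):
    """Return (units_before, units_after, fraction_removed) from a merge log."""
    before = int(re.search(r"Merging extra (\d+) units", log).group(1))
    after = int(re.search(r"Merge: written (\d+) units", log).group(1))
    return before, after, (before - after) / before


before, after, removed = merge_reduction(MERGE_LOG)
print(f"{before} -> {after} units ({removed:.1%} removed)")
```

For the log above, this reports that a bit more than a third of the units were dropped.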
117-
To get some understanding of inputs generated by the fuzzer from scratch, let's
118-
briefly go through the corpus:
119-
120-
```bash
121-
strings corpus1_min/* | more
122-
```
123-
124-
The second process terminates somewhere at:
125-
126-
```
127-
#2317811 DONE cov: 2873 ft: 8005 corp: 2359/121Kb exec/s: 7700 rss: 438Mb
128-
```
129-
130-
The coverage is significantly higher than in the first process's output (`cov: 2873` versus `cov: 1736`).
131-
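To put numbers on that comparison, the two final status lines can be parsed and compared. This is only a sketch: `parse_status` is a hypothetical helper, the regular expression assumes the `cov:`, `ft:` and `exec/s:` field layout seen in the two `DONE` lines quoted above, and other libFuzzer versions may print different fields.

```python
import re

# Matches the integer-valued fields of interest in a libFuzzer status line.
FIELD = re.compile(r"(cov|ft|exec/s): (\d+)")


def parse_status(line):
    """Extract the integer cov, ft and exec/s fields from a status line."""
    return {key: int(value) for key, value in FIELD.findall(line)}


# The two DONE lines quoted above (without and with a dictionary).
without_dict = parse_status(
    "#1975901 DONE cov: 1736 ft: 5795 corp: 1544/75Kb exec/s: 6564 rss: 494Mb")
with_dict = parse_status(
    "#2317811 DONE cov: 2873 ft: 8005 corp: 2359/121Kb exec/s: 7700 rss: 438Mb")

for key in ("cov", "ft"):
    gain = with_dict[key] / without_dict[key] - 1
    print(f"{key}: {without_dict[key]} -> {with_dict[key]} ({gain:+.1%})")
```

Numbers from a single pair of five-minute runs vary between machines and seeds, so treat the ratio as indicative rather than exact.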
132-
Let's minimize its corpus as well:
133-
134-
```bash
135-
mkdir corpus2_min
136-
./xml_read_memory_fuzzer -merge=1 corpus2_min corpus2
137-
```
138-
139-
The output:
140-
141-
```bash
142-
INFO: Seed: 2449634923
143-
INFO: Loaded 1 modules (79184 guards): [0xd017e0, 0xd4ed20),
144-
INFO: -max_len is not provided, using 1048576
145-
Loaded 1024/2356 files from corpus2
146-
Loaded 2048/2356 files from corpus2
147-
=== Merging extra 2356 units
148-
#2356 MIN0 cov: 2829 ft: 8012 units: 1571 exec/s: 0 rss: 126Mb
149-
#3927 MIN1 cov: 2830 ft: 7970 units: 1516 exec/s: 0 rss: 169Mb
150-
#5443 MIN2 cov: 2830 ft: 7969 units: 1503 exec/s: 0 rss: 210Mb
151-
#6946 MIN3 cov: 2830 ft: 7968 units: 1496 exec/s: 6946 rss: 250Mb
152-
#8442 MIN4 cov: 2830 ft: 7967 units: 1494 exec/s: 8442 rss: 291Mb
153-
=== Merge: written 1494 units
154-
```
155-
156-
And quickly go through the inputs generated by the fuzzer with a dictionary:
157-
158-
```bash
159-
strings corpus2_min/* | more
160-
```
161-
162-
### Generate coverage report
163-
164-
```bash
165-
ASAN_OPTIONS=coverage=1 ./xml_read_memory_fuzzer corpus1_min -runs=0
166-
```
167-
168-
This command should generate a `.sancov` file in your working directory:
169-
170-
```bash
171-
$ ls *.sancov
172-
xml_read_memory_fuzzer.26851.sancov
173-
```
174-
175-
Then we need to convert that binary file to a symbolized `.symcov` file:
176-
177-
```bash
178-
sancov -symbolize xml_read_memory_fuzzer xml_read_memory_fuzzer.26851.sancov \
179-
> xml_read_memory_fuzzer.symcov
180-
```
181-
182-
To view the coverage report in a user-friendly interface, let's launch a local
183-
[coverage report server]:
184-
185-
```bash
186-
python3 coverage-report-server.py --symcov xml_read_memory_fuzzer.symcov \
187-
--srcpath libxml2
188-
```
189-
190-
Open [localhost:8001](http://localhost:8001/) in your browser to see the report.
191-
192-
193-
Let's generate a coverage report for the second corpus (generated with a dictionary)
194-
and compare both reports visually. Open a new terminal and repeat the same steps:
195-
196-
```bash
197-
ASAN_OPTIONS=coverage=1 ./xml_read_memory_fuzzer corpus2_min -runs=0
198-
199-
sancov -symbolize xml_read_memory_fuzzer <NEW_.SANCOV_FILE_PATH> \
200-
> xml_read_memory_fuzzer_2.symcov
201-
202-
python3 coverage-report-server.py --symcov xml_read_memory_fuzzer_2.symcov \
203-
--srcpath libxml2 --port 8002
204-
```
205-
206-
Go to [localhost:8002](http://localhost:8002/).
207-
208-
The second report shows a higher coverage percentage for the same files,
209-
and more source files are covered overall.
210-
211-
212-
### Build the second fuzzer
213-
214-
The second fuzzer targets the `xmlRegexpCompile` function of the libxml2 library:
215-
216-
```cpp
217-
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

#include "libxml/parser.h"
218-
#include "libxml/tree.h"
219-
#include "libxml/xmlversion.h"
220-
221-
void ignore (void * ctx, const char * msg, ...) {
222-
  // Error handler to avoid spam of error messages from libxml parser.
223-
}
224-
225-
// Entry point for LibFuzzer.
226-
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
227-
  xmlSetGenericErrorFunc(NULL, &ignore);
228-
229-
  std::vector<uint8_t> buffer(size + 1, 0);
230-
  std::copy(data, data + size, buffer.data());
231-
232-
  xmlRegexpPtr x = xmlRegexpCompile(buffer.data());
233-
  if (x)
234-
    xmlRegFreeRegexp(x);
235-
236-
  return 0;
237-
}
238-
```
239-
240-
Let's build and run it:
241-
242-
```bash
243-
clang++ -std=c++11 xml_compile_regexp_fuzzer.cc $FUZZ_CXXFLAGS \
244-
-I libxml2/include libxml2/.libs/libxml2.a ../../libFuzzer/libFuzzer.a -lz \
245-
-o xml_compile_regexp_fuzzer
246-
247-
mkdir corpus3
248-
./xml_compile_regexp_fuzzer -dict=./xml.dict corpus3
249-
```
250-
251-
You will quickly get an out-of-memory crash:
252-
253-
```bash
254-
#796 NEW cov: 289 bits: 845 indir: 49 corp: 54/1518b exec/s: 0 rss: 43Mb L: 64 MS: 4 CrossOver-PersAutoDict-CrossOver-ChangeByte- DE: " xml:id=\"1\""-
255-
#800 NEW cov: 289 bits: 855 indir: 49 corp: 55/1556b exec/s: 0 rss: 43Mb L: 38 MS: 3 PersAutoDict-ChangeBit-CrossOver- DE: "%a"-
256-
==27928== ERROR: libFuzzer: out-of-memory (used: 2100Mb; limit: 2048Mb)
257-
To change the out-of-memory limit use -rss_limit_mb=<N>
258-
259-
Live Heap Allocations: 1003258238 bytes from 30527559 allocations; showing top 95%
260-
732653304 byte(s) (73%) in 30527221 allocation(s)
261-
#0 0x4c2a0c in __interceptor_malloc (/home/mmoroz/projects/libfuzzer-workshop/lessons/07/xml_compile_regexp_fuzzer+0x4c2a0c)
262-
#1 0x5d8506 in xmlRegNewRange /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:719:28
263-
#2 0x5d8506 in xmlRegAtomAddRange /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:1251
264-
#3 0x5d717e in xmlFAParseCharRange /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:5066:9
265-
#4 0x5d717e in xmlFAParsePosCharGroup /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:5084
266-
#5 0x5d4c40 in xmlFAParseCharGroup /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:5125:6
267-
#6 0x5d2f89 in xmlFAParseCharClass /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:5145:2
268-
#7 0x5d2f89 in xmlFAParseAtom /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:5299
269-
#8 0x5d2f89 in xmlFAParsePiece /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:5316
270-
#9 0x5d25e4 in xmlFAParseBranch /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:5351:8
271-
#10 0x5b03ad in xmlFAParseRegExp /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:5377:5
272-
#11 0x5af8f4 in xmlRegexpCompile /home/mmoroz/projects/libfuzzer-workshop/lessons/07/libxml2/xmlregexp.c:5473:5
273-
#12 0x4f14d0 in LLVMFuzzerTestOneInput /home/mmoroz/projects/libfuzzer-workshop/lessons/07/xml_compile_regexp_fuzzer.cc:27:20
274-
<...>
275-
```
276-
277-
In some cases the cause can be a memory leak. To detect leaks, enable the `detect_leaks=1`
278-
option of AddressSanitizer and run the fuzzer again:
279-
280-
```bash
281-
ASAN_OPTIONS=detect_leaks=1 ./xml_compile_regexp_fuzzer -dict=./xml.dict corpus3
282-
```
283-
284-
That option enables LeakSanitizer (a part of AddressSanitizer), which reports memory
285-
leaks and stops the fuzzer in the same way as other crash reports.
286-
287-
[coverage report server]: http://llvm.org/svn/llvm-project/llvm/trunk/tools/sancov/coverage-report-server.py
288-
[libxml2]: http://www.xmlsoft.org/
3+
This is a theoretical lesson; see the slides.
