Commit 21db238

move unfinished compilation articles to drafts
1 parent fe2510a commit 21db238

5 files changed: +37 -34 lines changed

content/english/hpc/compilation/abstractions.md

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 title: Non-Zero-Cost Abstractions
 weight: 7
+draft: true
 ---

 In general, abstractions are great. Applied well, they reduce the amount of code and the mental burden of a programmer.

content/english/hpc/compilation/limitations.md

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 title: What Compilers Can and Can't Do
 weight: 10
+draft: true
 ---

 Let's sum up this chapter with some general advice.

content/english/hpc/compilation/pgo.md

Lines changed: 0 additions & 32 deletions
This file was deleted.

content/english/hpc/compilation/precalc.md

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 title: Compile-Time Computation
 weight: 8
+draft: true
 ---

 ### Precalculation

content/english/hpc/compilation/situational.md

Lines changed: 34 additions & 2 deletions
@@ -71,9 +71,41 @@ int factorial(int n) {
 ```

 <!--
-
 What it usually does is swap the branches so that the more likely one goes immediately after the jump (recall that the "don't jump" branch is taken by default). The performance gain is usually rather small, because for most hot spots hardware branch prediction works just fine.
-
 -->

 There are many other cases like this where you need to point the compiler in the right direction, but we will get to them later when they become more relevant.
+
+### Profile-Guided Optimization
+
+Adding all this metadata to the source code is tedious. People already hate writing C++ even without having to do it.
+
+It is also not always obvious whether a given optimization is beneficial. To make a decision about branch reordering, function inlining, or loop unrolling, we need answers to questions like these:
+
+- How often is this branch taken?
+- How often is this function called?
+- What is the average number of iterations in this loop?
+
+Luckily for us, there is a way to provide this real-world information automatically.
+
+*Profile-guided optimization* (PGO, also called "pogo" because it's easier and more fun to pronounce) is a technique that uses [profiling data](/hpc/profiling) to improve performance beyond what can be achieved with just static analysis. In a nutshell, it involves adding timers and counters to the points of interest in the program, compiling and running it on real data, and then compiling it again, but this time supplying additional information from the test run.
+
+The whole process is automated by modern compilers. For example, the `-fprofile-generate` flag will let GCC instrument the program with profiling code:
+
+```
+g++ -fprofile-generate [other flags] source.cc -o binary
+```
+
+After we run the program (preferably on input that is as representative of the real use case as possible), it will create a bunch of `*.gcda` files that contain log data for the test run. We can then rebuild the program, this time adding the `-fprofile-use` flag:
+
+```
+g++ -fprofile-use [other flags] source.cc -o binary
+```
+
+It usually improves performance by 10-20% for large codebases, and for this reason it is commonly included in the build process of performance-critical projects. One more reason to invest in solid benchmarking code.
+
+<!--
+
+We will study how profiling works in more depth in the [next chapter](../../profiling).
+
+-->
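
The "counters at the points of interest" idea from the added section can be illustrated by hand. The sketch below is only an illustration under stated assumptions, not how GCC actually instruments code: `BranchProfile` and `sum_positive` are hypothetical names that do not appear in the article, and the counters are maintained manually to answer the first question from the list ("How often is this branch taken?") for a single branch.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical hand-written counters for one branch. Real PGO instrumentation
// is generated by the compiler and stored in its own format (*.gcda for GCC).
struct BranchProfile {
    std::uint64_t taken = 0;      // times the condition was true
    std::uint64_t not_taken = 0;  // times the condition was false

    // Fraction of executions in which the branch was taken (0 if never run).
    double taken_ratio() const {
        std::uint64_t total = taken + not_taken;
        return total == 0 ? 0.0 : static_cast<double>(taken) / total;
    }
};

// Sums the positive elements and records how often the `x > 0` branch is taken.
long long sum_positive(const std::vector<int>& data, BranchProfile& profile) {
    long long sum = 0;
    for (int x : data) {
        if (x > 0) {
            ++profile.taken;      // manual "instrumentation" of the taken path
            sum += x;
        } else {
            ++profile.not_taken;  // and of the fall-through path
        }
    }
    return sum;
}
```

Calling `sum_positive` on representative input and then reading `profile.taken_ratio()` gives the kind of statistic a compiler could use to decide which side of the branch to lay out first. With `-fprofile-generate`, conceptually similar bookkeeping is added by the compiler itself, so none of it has to be written or maintained in the source.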
