11= Nginx Variables (04) =
22
3- Even if a Nginx variable is hooked with "get handler", it can opt-in to
4- use the value container as cache, so that when a variable is read multiple
5- times, "get handler" is executed only once.Here is an example:
3+ == Value Containers for Caching & ngx_map ==
4+
5+ Some Nginx variables use their value containers as a data cache when a "get
6+ handler" is configured. In this setting, the "get handler" runs only once,
7+ i.e., the first time the variable is read, which reduces overhead when the
8+ variable is read multiple times during its lifetime. Let's look at an
9+ example.
610
711 :nginx
812 map $args $foo {
@@ -17,141 +21,155 @@ times, "get handler" is executed only once.Here is an example:
1721 set $orig_foo $foo;
1822 set $args debug;
1923
20- echo "orginal foo: $orig_foo";
24+ echo "original foo: $orig_foo";
2125 echo "foo: $foo";
2226 }
2327 }
2428
25- Module L<ngx_map> and its command L<ngx_map/map> is new, let me explain.
26- command L<ngx_map/map> in Nginx defines the mapping in between two Nginx
27- variables. Back to our example, command L<ngx_map/map> defines the mapping
28- from builtin variable L<ngx_core/$args> to user variable C<$foo>, in other
29- words, the value of C<$foo> is decided by the value of L<ngx_core/$args>
30- with the given mapping.
31-
32- What exactly our mapping is defined as ?
29+ Here we use the L<ngx_map/map> directive from the standard module L<ngx_map>
30+ for the first time, which deserves some introduction. The word C<map> here
31+ means mapping or correspondence. For example, functions in mathematics are a
32+ kind of "mapping". Nginx's L<ngx_map/map> directive defines such a "mapping"
33+ relationship between two Nginx variables, or in other words, a "function
34+ relationship". Back to this example, we use the L<ngx_map/map> directive to
35+ define the "mapping" relationship between user variable C<$foo> and built-in
36+ variable L<ngx_core/$args>. In mathematical function notation, C<y = f(x)>,
37+ our C<$args> variable is effectively the "independent variable", C<x>, while
38+ C<$foo> is the "dependent variable", C<y>. That is, the value of C<$foo>
39+ depends on the value of L<ngx_core/$args>, or rather, we I<map> the value of
40+ L<ngx_core/$args> onto the C<$foo> variable (in some way).
41+
42+ Now let's look at the exact mapping rule defined by the L<ngx_map/map>
43+ directive in this example.
3344
3445 :nginx
3546 map $args $foo {
3647 default 0;
3748 debug 1;
3849 }
3950
40- C<default>, found in the first line within curly bracket, defines the
41- default mapping rule. It means if no other rules can be applied, mapping
42- executes the default one, which assigns variable C<$foo> with value C<0>.
43- The second line in the curly bracket defines another rule, which assigns
44- variable C<$foo> with value C<1> when builtin variable L<ngx_core/$args>
45- equals to string C<debug>. Therefore, variable C<$foo> is either C<0> or
46- C<1>,
47- up to whether L<ngx_core/$args> equals to string C<debug>.
48-
49- It's cleared enough. Back to our C<location /test>, we saved the value
50- of
51- C<$foo> to another user variable C<$orig_foo> and forcefully overwrite
52- the
53- value of L<ngx_core/$args> as C<debug>. At last, we print both C<$orig_foo>
54- and C<$foo> using L<ngx_echo/echo>.
55-
56- When L<ngx_core/$args> is forcefully overwritten as C<debug>, we might
57- have
58- thought C<$foo> has the value C<1> according to our L<ngx_map/map> mappings,
59- but testing defeats us:
51+ The first line within the curly braces is a special rule condition: it holds
52+ if and only if all the other conditions fail. When this "default" condition
53+ holds, the "dependent variable" C<$foo> is assigned the value C<0>. The second
54+ line within the curly braces means that the "dependent variable" C<$foo> is
55+ assigned the value C<1> if the "independent variable" C<$args> matches the
56+ string value C<debug>. Combining these two lines, we
57+ obtain the following complete mapping rule: if the value of L<ngx_core/$args>
58+ is C<debug>, variable C<$foo> gets the value C<1>; otherwise C<$foo> gets the
59+ value C<0>. So essentially, this is a conditional assignment to the variable
60+ C<$foo>.
61+
62+ Now that we understand what the L<ngx_map/map> directive does, let's look at
63+ the definition of C<location /test>. We first save the value of C<$foo> into
64+ another user variable C<$orig_foo>, then overwrite the value of
65+ L<ngx_core/$args> to C<debug>, and finally output the values of C<$orig_foo>
66+ and C<$foo>, respectively.
67+
68+ Intuitively, after we overwrite the value of L<ngx_core/$args> to C<debug>, the
69+ value of C<$foo> should automatically get adjusted to C<1> according to the
70+ mapping rule defined earlier, regardless of the original value of C<$foo>. But
71+ the test result shows otherwise.
6072
6173 :bash
6274 $ curl 'http://localhost:8080/test'
6375 original foo: 0
6476 foo: 0
6577
66- As expected, C<$orig_foo> is C<0>, since the request has no URL parameters
67- and
68- L<ngx_core/$args> is empty, our default mapping rule is effective, and
69- C<$foo>
70- gets its value C<0>.
71-
72- But the second output appears confusing, as L<ngx_core/args> is already
73- overwritten
74- as C<debug>, our mapping rule should have assigned variable C<$foo> with
75- value C<1>,
76- what's wrong?
77-
78- The reason is simple, when variable C<$foo> is needed the first time, its
79- calculated
80- value from the mapping algorithm is cached, as being said, Nginx module
81- can opt-in to
82- use value container as cache for the outcome of its "get handler". Apparently,
83- L<ngx_map>
84- caches the outcome to avoid further expensive calculation, so that Nginx
85- can use the cached
86- result for that variable in the subsequent handling for free.
87-
88- To verify this, we request again with an URL parameter C<debug>:
78+ The first output line indicates that the value of C<$orig_foo> is C<0>, which
79+ is exactly what we expected: the original request does not take a URL query
80+ string, so the initial value of L<ngx_core/$args> is empty, leading to the C<0>
81+ initial value of C<$foo>, according to the "default" condition in our mapping
82+ rule.
83+
84+ But surprisingly, the second output line indicates that the final value of
85+ C<$foo> is still C<0>, even after we overwrite L<ngx_core/$args> to the value
86+ C<debug>. This apparently violates our mapping rule because when
87+ L<ngx_core/$args> takes the value C<debug>, the value of C<$foo> should really
88+ be C<1>. So what is happening here?
89+
90+ Actually the reason is pretty simple: the first time variable C<$foo> is
91+ read, its value computed by L<ngx_map>'s "get handler" is
92+ cached in its value container. We already learned earlier that Nginx modules
93+ may choose to use the value container of a variable they create as
94+ a data cache for its "get handler". Evidently, the L<ngx_map> module considers
95+ the mapping computation between variables expensive enough and caches the result
96+ automatically, so that the next time the same variable is read within the
97+ lifetime of the current request, Nginx can just return the cached result
98+ without invoking the "get handler" again.
99+
100+ To verify this further, we can try specifying the URL query string as C<debug>
101+ in the original request.
89102
90103 :bash
91104 $ curl 'http://localhost:8080/test?debug'
92105 original foo: 1
93106 foo: 1
94107
95- Granted, the value of C<$orig_foo> becomes C<1>. Since builtin variable
96- L<ngx_core/$args>
97- equals C<debug>, according to the mapping rule, variable C<$foo> is calculated
98- as C<1>, and
99- the calculation result is cached and remains as C<1> no matter how L<ngx_core/$args>
100- will
101- be modified subsequently.
102-
103- Command L<ngx_map/map> is really more than what it looks, the command actually
104- hooks a
105- "get handler" for user variables, and exposes the script interface so that
106- exact devalue
107- logic can be easily modified by user themselves. The price of doing this,
108- is to restrict
109- the logic be the mapping from one variable to another. Meanwhile, let's
110- recall what we've
111- learnt back in L<vartut/ (03)>, even if a variable is devalued by a "get
112- handler", it does
113- not necessarily uses a value container as cache, such as the L<$arg_XXX>
114- variables.
115-
116- Just like module L<ngx_map>, another builtin module L<ngx_geo> uses cache
117- for variables.
118-
119- We should have noticed that command L<ngx_map/map> is written in front
120- of C<server>
121- directive, i.e. the mappings are defined directly within C<http>. Is it
122- possible to
123- write it within a C<location> directive since it is used only in C<location
124- /test> in
125- our example, the answer is no !
126-
127- People who have just learnt Nginx, would argue this global configuration
128- of
129- mappings by L<ngx_map/map>, is likely to be inefficient since request to
130- every C<location>
131- will cause the mapping be repeatedly calculated. Have no worry and let us
132- review,
133- command L<ngx_map/map> actually defines a "get handler" for a user variable,
134- the
135- get handler is only executed when the variable needs to be devalued (if
136- cache is used, the
137- handler is executed once for all), therefore, for those requests to certain
138- C<location>
139- which has not used the variable, no calculation will be triggered.
140-
141- The technique, which only calculates till the needed moment, is called
142- "lazy evaluation" in
143- computing. "Lazy evaluation", contrary to "eager evaluation", is not natively
144- supported by
145- most programming languages, a classic one who does is Haskell. In the mini
146- language of Nginx,
147- "eager evaluation" is far more common, such as following statement using
148- L<ngx_rewrite/set>:
108+ It can be seen that the value of C<$orig_foo> becomes C<1>, complying with our
109+ mapping rule. And subsequent readings of C<$foo> always yield the same cached
110+ result, C<1>, regardless of the new value of L<ngx_core/$args> later on.
111+
112+ The L<ngx_map/map> directive is actually a unique example, because it not only
113+ registers a "get handler" for the user variable, but also allows the user to
114+ define the computing rule in the "get handler" directly in the Nginx
115+ configuration file. Of course, the rule that can be defined here is limited to
116+ simple mapping relations with another variable. Meanwhile, it must be made
117+ clear that not all the variables using a "get handler" will cache the result.
118+ For instance, we have already seen earlier that the L<$arg_XXX> variable does
119+ not use its value container at all.
120+
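For contrast with the cached C<$foo>, here is a minimal sketch of the non-caching case. It is not part of the original example: the C<location /test-arg> name is hypothetical, and the block is assumed to sit inside the same C<server> listening on port 8080, with the L<ngx_echo> module available.

:nginx
    location /test-arg {
        # Read $arg_a once, before the query string is overwritten
        set $orig_a $arg_a;
        set $args "a=5";

        echo "original a: $orig_a";
        # $arg_a has no cached value, so this read re-parses $args
        echo "a: $arg_a";
    }

:bash
    $ curl 'http://localhost:8080/test-arg?a=3'
    original a: 3
    a: 5

Unlike C<$foo> above, the second read of C<$arg_a> reflects the new value of L<ngx_core/$args>, because its "get handler" runs on every read.
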
121+ Similar to the L<ngx_map> module, the standard module L<ngx_geo> that we
122+ encountered earlier also enables value caching for the variables created by its
123+ L<ngx_geo/geo> directive.
124+
125+ === A Side Note for Use Contexts of Directives ===
126+
127+ In the previous example, we should also note that the L<ngx_map/map> directive
128+ is put outside the C<server> configuration block, that is, it is defined
129+ directly within the outermost C<http> configuration block. Some readers may be
130+ curious about this setting, since we only use it in C<location /test> after
131+ all. If we try putting the L<ngx_map/map> statement within the C<location>
132+ block, however, we will get the following error while starting Nginx:
133+
134+ [emerg] "map" directive is not allowed here in ...
135+
136+ So it is explicitly prohibited. In fact, the
137+ L<ngx_map/map> directive is only allowed in the C<http>
138+ block. Every configuration directive has a pre-defined set of use contexts in
139+ the configuration file. When in doubt, always refer to the corresponding
140+ documentation for the exact use contexts of a particular directive.
141+
142+ == Lazy Evaluation of Variable Values ==
143+
144+ Many Nginx newcomers worry that the use of the L<ngx_map/map> directive
145+ within the global scope (i.e., the C<http> block) will lead to unnecessary
146+ variable value computation and assignment for all the C<location>s in all the
147+ virtual servers even if only one C<location> block actually uses it.
148+ Fortunately, this is I<not> what is happening here. We have already learned how
149+ the L<ngx_map/map>
150+ directive works. It is the "get handler" (registered by the L<ngx_map> module)
151+ that performs the value computation and related assignment. And the "get
152+ handler" will not run at all
153+ unless the corresponding user variable is actually being read. Therefore, for
154+ those requests that never access that variable, there cannot be any (useless)
155+ computation involved.
156+
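The point can be illustrated with a small configuration sketch. The C<location /other> block and its message are hypothetical additions to the running example, and the L<ngx_echo> module is assumed to be available.

:nginx
    map $args $foo {
        default 0;
        debug   1;
    }

    server {
        listen 8080;

        # Reads $foo, so the "get handler" registered by ngx_map
        # runs (once) for requests handled here.
        location /test {
            echo "foo: $foo";
        }

        # Never reads $foo, so no mapping computation takes place
        # for requests handled here.
        location /other {
            echo "no mapping computed";
        }
    }
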
157+ The technique that postpones the value computation until the point where the
158+ value is actually needed is called "lazy evaluation" in the computing world.
159+ Programming languages that natively offer "lazy evaluation" are not very
160+ common, though. The most famous example is the Haskell programming language,
161+ where lazy evaluation is the default semantics. In contrast with "lazy
162+ evaluation", it is much more common to see "eager evaluation". We are lucky
163+ to see examples of lazy evaluation here in the L<ngx_map> module, but
164+ "eager evaluation" is also the more common semantics in the Nginx
165+ world. Consider the following L<ngx_rewrite/set> statement, which could hardly be
166+ simpler:
149167
150168 :nginx
151169 set $b "$a,$a";
152170
153- When variable C<$b> is declared by command L<ngx_rewrite/set>, the value
154- of C<$b> is computed right away, the calculation won't be delayed
155- till
156- variable C<$b> needs to be devalued .
171+ When running the L<ngx_rewrite/set> directive, Nginx eagerly
172+ computes and assigns the new value for the variable C<$b> without postponing to
173+ the point when C<$b> is actually read later on. Similarly, the
174+ L<ngx_set_misc/set_unescape_uri> directive also evaluates eagerly.
157175
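A minimal sketch makes the eager behavior visible. The C<location /eager> name and the values C<hello> and C<world> are hypothetical; the block is assumed to sit inside a C<server> listening on port 8080 with the L<ngx_echo> module available.

:nginx
    location /eager {
        set $a hello;
        # $b is computed right here, from the current value of $a
        set $b "$a,$a";
        # Changing $a afterwards does not affect $b
        set $a world;

        echo "b: $b";
    }

:bash
    $ curl 'http://localhost:8080/eager'
    b: hello,hello

Had L<ngx_rewrite/set> been lazy, C<$b> would have been computed only when L<ngx_echo/echo> read it, and the output would have reflected the later value of C<$a>.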