Commit 63f04ca

committed
[en] massive wording improvements in the "Nginx Variables (02)" article, also with proper section names added.
1 parent 930c01f commit 63f04ca

2 files changed, +139 -115 lines changed

en/01-NginxVariables02.tut

Lines changed: 136 additions & 112 deletions
Original file line number | Diff line number | Diff line change
@@ -1,8 +1,15 @@
11
= Nginx Variables (02) =
22

3-
One common misunderstanding with Nginx variable, is its life cycle be bounded
4-
with the C<Location> directive. Let's challenge it by checking another
5-
example
3+
== Variable Lifetime & Internal Redirection ==
4+
5+
We already know that Nginx variables are bound to each request handled by
6+
Nginx; for this reason they have exactly the same lifetime as the corresponding
7+
request.
8+
9+
There is another common misunderstanding here though: some newcomers tend to
10+
assume
11+
that the lifetime of Nginx variables is bound to the C<location> configuration
12+
block. Let's consider the following counterexample:
613

714
:nginx
815
server {
@@ -18,52 +25,51 @@ example
1825
}
1926
}
2027

21-
We use the 3rd party module L<ngx_echo> and its command L<ngx_echo/echo_exec>
22-
to execute C<location /bar> from within C<location /foo>. The mechanism
23-
is like
24-
"jumping" from one C<location> to another C<location> in Nginx internals
25-
when
26-
handling a specific request. This is different with HTTP C<301> and C<302>
27-
redirect, which conducts the external jump by collaborating with HTTP client.
28-
External redirect can be asserted when requesting URL is modified. Whereas
29-
Nginx
30-
internal jumps has no impact on the client side, and just like the C<exec>
31-
command
32-
found in C<Bourne Shell> (or C<Bash>), the execution path has no returns.
33-
Another
34-
close schema is the C<goto> statement in C<C>.
35-
36-
Being an internal jump in between the C<location> directives for Nginx
37-
processing,
38-
the request remains to be the original one, its copy of declared variables
39-
has not changed neither.
40-
Back to our example, if C</foo> is requested, the processing is like following:
41-
variable C<$a>
42-
is declared and initialized with string value C<hello> by command L<ngx_rewrite/set>
43-
within
44-
C<location> directive, then internal jump occurs by L<ngx_echo/echo_exec>
45-
and processing continues
46-
from C<location /bar>. Since C<$a> in the latter case is the same C<$a>
47-
variable initialized earlier
48-
we can expect the outcome is C<hello> when variable C<$a> is printed. Let'
49-
s prove ourselves with
50-
request:
28+
Here in C<location /foo> we use the L<ngx_echo/echo_exec> directive (provided
29+
by the 3rd-party module L<ngx_echo>) to initiate an "internal redirection" to
30+
C<location /bar>. The "internal redirection" is an operation that makes Nginx
31+
jump
32+
from one C<location> to another while processing a request. This "jumping"
33+
happens
34+
completely within the server itself. This is different from those "external
35+
redirections"
36+
based on the HTTP C<301> and C<302> responses, because the latter are
37+
carried out with the cooperation of the HTTP client. Also, in the case of "external
38+
redirections", the
39+
end user could usually observe the change of the URL in her web browser's
40+
address bar while this is not the case for internal ones. "Internal
41+
redirections"
42+
are very similar to the C<exec> command in
43+
C<Bourne Shell> (or C<Bash>); it is a "one way trip" and never returns. Another
44+
similar example is the C<goto> statement in the C<C> language.
45+
46+
Being an "internal redirection", the request after the redirection
47+
remains the original one. It is just the current C<location> that is changed,
48+
so we are still using the original copy of the Nginx variable containers. Back
49+
to our example, the whole workflow is like this: Nginx first assigns to the
50+
C<$a> variable the string value C<hello> via the L<ngx_rewrite/set> directive
51+
in C<location /foo>, and then it issues an internal redirection via the
52+
L<ngx_echo/echo_exec> directive, thus leaving C<location /foo> and entering
53+
C<location /bar>, and finally it
54+
outputs the value of C<$a>. Because the value container of C<$a> remains the
55+
original one, we can expect the response output to be C<hello>. The test result
56+
confirms this:
5157

5258
:bash
5359
$ curl localhost:8080/foo
5460
a = [hello]
5561

56-
If however, the C</bar> is requested directly, C<$a> still has an empty
57-
value because
58-
it is initialized in C<location /foo> only.
62+
But when accessing C</bar> directly from the client side, we will get an empty
63+
value for the C<$a> variable, since this variable relies on C<location /foo> to
64+
get initialized.
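
The C<exec> analogy used in the added text can be tried directly in a shell. The sketch below is only an analogy and says nothing about Nginx internals: the process environment stands in for the request's variable containers, and the names `bar_script`, `foo`, and `direct_bar` are hypothetical, chosen to mirror the two locations.

```sh
# Stand-in for "location /bar": print whatever $a currently holds.
bar_script='echo "a = [$a]"'

# Stand-in for "location /foo": set $a, then hand control over for good.
foo() {
    a=hello
    export a
    exec sh -c "$bar_script"   # replaces the current process, never returns
    echo "unreachable"         # not executed: exec is a one-way trip
}

# Stand-in for requesting /bar directly: $a was never set.
direct_bar() {
    ( unset a; sh -c "$bar_script" )
}

( foo )        # run in a subshell so exec does not replace the calling shell
direct_bar
```

The first call prints `a = [hello]` because the value set before the jump is still visible afterwards; the second prints `a = []` because nothing initialized it.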
5965

60-
The example tells, when a request is being handled, even across multiple
61-
C<location> directives,
62-
its copy of Nginx variables has not been reconstructed. The concept of
63-
"internal jumps" is also
64-
worth of noting, that the built-in L<ngx_rewrite> module and its command
65-
L<ngx_rewrite/rewrite>
66-
can execute exactly the same kind of internal jump. To rewrite our example:
66+
It can be observed that during a request's lifetime, the copy
67+
of Nginx variable containers does not change at all even when Nginx goes across
68+
different C<location> configuration blocks. Here we also meet the concept of
69+
"internal redirections" for the first time and it's worth mentioning that, the
70+
L<ngx_rewrite/rewrite> directive of the L<ngx_rewrite> module can also be used
71+
to initiate "internal redirections". For instance, we can rewrite the example
72+
above with the L<ngx_rewrite/rewrite> directive as follows:
6773

6874
:nginx
6975
server {
@@ -79,45 +85,42 @@ can execute exactly the same kind of internal jump. To rewrite our example:
7985
}
8086
}
8187

88+
It's functionally equivalent to L<ngx_echo/echo_exec>. We will discuss the
89+
L<ngx_rewrite/rewrite> directive in more depth in later chapters, including
90+
its use for initiating "external redirections" such as C<301> and C<302>.
91+
92+
To conclude, the lifetime of Nginx variable containers is indeed bound to the
93+
request being processed, and is irrelevant to C<location>.
94+
95+
== Nginx Built-in Variables ==
96+
97+
The Nginx variables we have seen so far are all (implicitly) created by
98+
directives like L<ngx_rewrite/set>. We usually call such variables "user-defined
99+
variables", or simply "user variables". There is also another kind of Nginx
100+
variables that are I<pre-defined> by either the Nginx core or Nginx modules.
101+
Let's call this kind of variables "built-in variables".
82102

83-
Net effect has no differences with L<ngx_echo/echo_exec>. L<ngx_rewrite/rewrite>
84-
will
85-
be addressed more specifically later, for its usage in C<301> and C<302>
86-
redirects.
87-
88-
Again, we have asserted that Nginx variable's life time is bounded with
89-
the request being
90-
handled and it has nothing to do with C<location> directives.
91-
92-
So far, the variables we have discussed are implicitly created by L<ngx_rewrite/set>.
93-
They
94-
are called "user defined variables" or simply "user variables". Besides
95-
variables defined
96-
by user, Nginx core and various Nginx modules can provide "pre-defined
97-
variables" or "builtin
98-
variables).
99-
100-
Builtin variables are mostly used to provide request or response information.
101-
For instance
102-
builtin variable L<ngx_core/$uri>, declared by L<ngx_http_core> module,
103-
gives the URI of
104-
the request being handled (url-decoded and exclude request parameters).
105-
Another builtin
106-
variable L<ngx_core/$request_uri> gives the original request URI (url-encoded
107-
and include
108-
request parameter). Another example:
103+
=== $uri & $request_uri ===
104+
105+
One common use of Nginx built-in variables is to retrieve various types of
106+
information about the current request or response. For instance, the built-in
107+
variable L<ngx_core/$uri> provided by L<ngx_http_core> is used to fetch the
108+
(decoded) URI of the current request, excluding any query-string arguments.
109+
Another example is the L<ngx_core/$request_uri> variable provided by the same
110+
module, which is used to fetch the raw, non-decoded form of the URI, including
111+
any query-string. Let's look at the following example.
109112

110113
:nginx
111114
location /test {
112115
echo "uri = $uri";
113116
echo "request_uri = $request_uri";
114117
}
115118

116-
for the sake of clearness, C<server> directive is omitted. As usual the
117-
server is listening
118-
on C<8080> port, the example prints Nginx builtin variables L<ngx_core/$uri>
119-
and L<ngx_core/$request_uri>
120-
in the response. Now let's send a request to C<test>:
119+
We omit the C<server> configuration block here for brevity. Just as in the
120+
samples above, we still listen on the local port C<8080>. In this example, we
121+
output both the L<ngx_core/$uri> and L<ngx_core/$request_uri> into the response
122+
body. Below is the result of testing this C</test> interface with different
123+
requests:
121124

122125
:bash
123126
$ curl 'http://localhost:8080/test'
@@ -132,22 +135,30 @@ in the response. Now let's send a request to C<test>:
132135
uri = /test/hello world
133136
request_uri = /test/hello%20world?a=3&b=4
134137

135-
There is another category of builtin variables, these variable names
136-
has the C<arg_> prefix, such as C<$arg_name>, its value is the url-encoded
137-
URI parameter C<name>, here is a finer example:
138+
=== Variables with Infinite Names ===
139+
140+
There is another very common built-in variable that does not have a fixed
141+
variable name. Instead, it has I<infinite> variations. That is, all those
142+
variables whose names have the prefix C<arg_>, like C<$arg_foo> and
143+
C<$arg_bar>. Let's just call it the C<$arg_XXX> "variable group". For example,
144+
the C<$arg_name> variable is evaluated to the value of the C<name> URI argument
145+
for the current request. Also, the URI argument's value obtained here is not
146+
decoded yet, potentially containing C<%XX> sequences. Let's check out a
147+
complete example:
138148

139149
:nginx
140150
location /test {
141151
echo "name: $arg_name";
142152
echo "class: $arg_class";
143153
}
144154

145-
We test C</test> with a few scenarios, each with different URL parameter
155+
Then we test this interface out with various different URI argument
156+
combinations:
146157

147158
:bash
148159
$ curl 'http://localhost:8080/test'
149-
name:
150-
class:
160+
name:
161+
class:
151162

152163
$ curl 'http://localhost:8080/test?name=Tom&class=3'
153164
name: Tom
@@ -157,22 +168,24 @@ We test C</test> with a few scenarios, each with different URL parameter
157168
name: hello%20world
158169
class: 9
159170

160-
C<$arg_name> is case-insensitive, it matches to C<name> URL parameter
161-
and it matches the C<NAME> or C<Name> as well:
171+
In fact, C<$arg_name> does not only match the C<name> argument name, but also
172+
C<NAME> or even C<Name>. That is, the letter case does not matter here:
162173

163174
$ curl 'http://localhost:8080/test?NAME=Marry'
164175
name: Marry
165-
class:
176+
class:
166177

167178
$ curl 'http://localhost:8080/test?Name=Jimmy'
168179
name: Jimmy
169-
class:
180+
class:
170181

171-
Nginx lower-cases all URL parameter keys before it declares those builtin
172-
variables.
182+
Behind the scenes, Nginx just converts the URI argument names into the
183+
pure lower-case form before matching against the name specified by
184+
C<$arg_XXX>.
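
The matching rule described here can be imitated in a few lines of shell. This is a minimal sketch of the observable behavior only, not of Nginx's implementation; the function name `arg_value` is hypothetical, and returning the first matching argument is an assumption of the sketch.

```sh
# Print the raw (still percent-encoded) value of one query-string argument,
# comparing argument names case-insensitively.
#   $1 = query string without the leading "?"
#   $2 = argument name to look up
arg_value() {
    query=$1
    want=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    (
        set -f          # disable globbing while splitting on "&"
        IFS='&'
        for pair in $query; do
            # Lower-case the name part before comparing it.
            key=$(printf '%s' "${pair%%=*}" | tr '[:upper:]' '[:lower:]')
            if [ "$key" = "$want" ]; then
                case $pair in
                    *=*) printf '%s\n' "${pair#*=}" ;;  # value is left undecoded
                    *)   printf '\n' ;;                 # name without "=value"
                esac
                exit 0
            fi
        done
        printf '\n'     # no such argument: empty value
    )
}

arg_value 'NAME=Marry' name
arg_value 'name=hello%20world&class=9' name
```

The two calls print `Marry` and `hello%20world`, matching the C<curl> results shown in this section: the name is matched regardless of letter case, and the value is returned without decoding.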
173185

174-
A 3rd party module L<ngx_set_misc> and its command L<ngx_set_misc/set_unescape_uri>
175-
can execute URL decoding for string sequences like C<%XX>
186+
If you want to decode the special sequences like C<%20> in the URI argument
187+
values, then you could use the L<ngx_set_misc/set_unescape_uri> directive
188+
provided by the 3rd-party module L<ngx_set_misc>.
176189

177190
:nginx
178191
location /test {
@@ -183,47 +196,58 @@ can execute URL decoding for string sequences like C<%XX>
183196
echo "class: $class";
184197
}
185198

186-
Again:
199+
Let's check out the actual effect:
187200

188201
:bash
189202
$ curl 'http://localhost:8080/test?name=hello%20world&class=9'
190203
name: hello world
191204
class: 9
192205

193-
white space is decoded !
206+
The encoded space has indeed been decoded!
207+
208+
Another thing that we can observe from this example is that the
209+
L<ngx_set_misc/set_unescape_uri> directive can also implicitly create Nginx
210+
user-defined variables, just like the L<ngx_rewrite/set> directive. We will
211+
discuss the
212+
L<ngx_set_misc> module in more detail in future chapters.
213+
214+
Variables of this type, like L<$arg_XXX>, have an infinite number of possible
215+
names, so they do not correspond to any value containers.
216+
Furthermore, such variables are handled in a very specific way within the Nginx
217+
core. It
218+
is thus not possible for 3rd-party modules to introduce such magical built-in
219+
variables of their own.
220+
221+
The Nginx core offers a lot of such built-in variables in addition to
222+
L<$arg_XXX>, like the L<$cookie_XXX> variable group for fetching HTTP cookie
223+
values, the L<$http_XXX> variable group for fetching request headers, as well
224+
as the L<$sent_http_XXX> variable group for retrieving response headers. We
225+
will not go into the details for each of them here. Interested readers can
226+
refer to the official documentation for the L<ngx_http_core> module.
194227

195-
As we can see, command L<ngx_set_misc/set_unescape_uri> is like command
196-
L<ngx_rewrite/set>
197-
has the capability of declare and initialize Nginx variables. Later on
198-
we will discuss more of
199-
the L<ngx_rewrite/set> module.
228+
=== Writing to Built-in Variables ===
200229

201-
Variables like L<$arg_XXX>, are declared specifically within Nginx core.
202-
3rd party module
203-
has no equivalent capabilities. There are similar category of variables,
204-
C<$cookie_XXX> to retrieve
205-
cookie, L<$http_XXX> the headers and L<$sent_http_XXX> response headers.
206-
Please reference
207-
official documentation of L<ngx_http_core> module for details.
230+
All the user-defined variables are writable. Actually the way that we declare
231+
or create such variables so far is to use a configuration directive, like
232+
L<ngx_rewrite/set>, that performs value assignment at request time. But it is
233+
I<not> necessarily the case for built-in variables.
208234

209-
Attention, many builtin variables are read-only. Such as the one we have
210-
lately introduced
211-
L<ngx_core/$uri> and L<ngx_core/$request_uri>. One must avoid to assign
212-
values to read-only variables,
213-
unless they enjoy surprises, for example:
235+
Most of the built-in variables are effectively I<read-only>, like the
236+
L<ngx_core/$uri> and L<ngx_core/$request_uri> variables that we just introduced
237+
earlier. Assignments to such read-only variables must always be avoided.
238+
Otherwise it will lead to unexpected consequences, for example,
214239

215240
:nginx
216241
? location /bad {
217242
? set $uri /blah;
218243
? echo $uri;
219244
? }
220245

221-
This problematic configuration dumps fatal error when Nginx is started
222-
and leaves absolute no clue:
246+
This problematic configuration just triggers a confusing error message when
247+
Nginx is started:
223248

224249
[emerg] the duplicate "uri" variable in ...
225250

226-
Attempt to write other read-only variables such as L<$arg_XXX> variables,
227-
can blow the Nginx
228-
process right away in a few particular releases.
251+
Attempts to write to some other read-only built-in variables like L<$arg_XXX>
252+
will just lead to server crashes in some particular Nginx versions.
229253

utils/wiki2html-en.pl

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -267,15 +267,15 @@ sub usage {
267267
sub quote_anchor {
268268
my $id = shift;
269269
for ($id) {
270-
s/\$/dollar/g;
271-
s/\&/and/g;
270+
s/\$/-dollar-/g;
271+
s/\&/-and-/g;
272272
s/[^-\w.]/-/g;
273273
s/--+/-/g;
274274
s/^-+|-+$//g;
275275
$_ = lc;
276276
}
277277

278-
$id =~ s/^01-nginxvariables01-/nginx-variables-/;
278+
$id =~ s/^01-nginxvariables\d+-/nginx-variables-/;
279279

280280
return $id;
281281
}
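
To see what the updated substitutions produce, the post-commit rules can be approximated with `sed` and `tr`. This is a rough shell port for illustration, not the script's own code: it treats `\w` as ASCII letters, digits, and underscore, and the sample input string is an assumption, since the diff does not show what IDs are actually passed to `quote_anchor`.

```sh
# Approximate shell port of the post-commit quote_anchor rules.
quote_anchor() {
    printf '%s\n' "$1" |
        sed -e 's/\$/-dollar-/g' \
            -e 's/&/-and-/g' \
            -e 's/[^-A-Za-z0-9_.]/-/g' \
            -e 's/--*/-/g' \
            -e 's/^-*//' \
            -e 's/-*$//' |
        tr '[:upper:]' '[:lower:]' |
        sed -e 's/^01-nginxvariables[0-9][0-9]*-/nginx-variables-/'
}

# Hypothetical input: file stem joined with a section title.
quote_anchor '01-NginxVariables02-$uri & $request_uri'
```

For this input the function prints `nginx-variables-dollar-uri-and-dollar-request_uri`. The surrounding dashes added by this commit keep `dollar` and `and` from fusing with neighboring words, and the `\d+` change lets the prefix rewrite apply to article numbers other than `01`.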
