Commit abcb371

committed
[en] massive wording improvements in "Nginx Variables (07)".
1 parent fef718c commit abcb371

1 file changed

+124
-129
lines changed

en/01-NginxVariables07.tut

Lines changed: 124 additions & 129 deletions
@@ -1,27 +1,32 @@
11
= Nginx Variables (07) =
22

3-
We have learnt in L<vartut/ (01)>, that Nginx variables could only
4-
be strings. We are in fact, not entirely correct because variables
5-
can have non-values. There are 2 kinds of non-values in Nginx, one
6-
is "invalid" value, another is "not found" value.
7-
8-
For example, if Nginx variable C<$foo> is declared but not initialized
9-
it has an "invalid" value. Whereas if there exists no C<XXX> parameter
10-
in the current request URL, builtin variable C<$arg_XXX> has a "not found"
11-
value.
12-
13-
Nginx special values, such as "invalid" value or "not found" value, are
14-
totally different from empty string (""). Like C<undefined> and C<null>
15-
found in JavaScript, or C<nil> found in Lua, these non-values are not
16-
numerical value C<0>, they are not boolean value C<false> either. In fact
17-
the C<NULL> found in SQL is an equivalent element.
18-
19-
Although back in L<vartut/ (01)>, the uninitialized value becomes empty
20-
string in "variable interpolation" via command L<ngx_rewrite/set>. This
21-
is
22-
because L<ngx_rewrite/set> hooks a "get handler" for the variable it declares,
23-
and the handler turns "invalid" value into empty string. Let's review the
24-
example in L<vartut/ (01)> for this assertion:
3+
== Special Value "Invalid" and "Not Found" ==
4+
5+
We have mentioned that the values of Nginx variables can only be of one single
6+
type, that is, the string type, but variables could also have no meaningful
7+
values
8+
at all. Variables without any meaningful values still take a special value
9+
though.
10+
There are two possible special values: "invalid" and "not found".
11+
12+
For example, when a user variable C<$foo> is created but not assigned yet,
13+
C<$foo> takes the special value of "invalid". And when the current URL
14+
query string does not have the C<XXX> argument at all, the built-in variable
15+
L<$arg_XXX> takes the special value of "not found".
16+
17+
Both "invalid" and "not found" are special values, completely different from an
18+
empty string value (C<"">). This is very similar to those distinct special
19+
values in some dynamic programming languages, like C<undef> in Perl, C<nil> in
20+
Lua, and C<null>
21+
in JavaScript.
22+
23+
We have seen earlier that an uninitialized variable is evaluated to an
24+
empty
25+
string when used in an interpolated string. Its real value, however, is not an
26+
empty
27+
string at all. It is the "get handler" registered by the L<ngx_rewrite/set>
28+
directive that automatically converts the "invalid" special value into an empty
29+
string. To verify this, let's return to the example we have discussed before:
2530

2631
:nginx
2732
location /foo {
@@ -33,93 +38,80 @@ example in L<vartut/ (01)> for this assertion:
3338
echo "foo = [$foo]";
3439
}
3540

36-
Again to make it clearer, the C<server> directive is omitted. In this example
37-
command L<ngx_rewrite/set> implicitly declares variable C<$foo> within
38-
C<location /bar>
39-
Then we print the uninitialized C<$foo> within C<location /foo> by using
40-
command
41-
L<ngx_echo/echo>. The result is following when C<location /foo> was requested:
41+
When accessing C</foo>, the user variable C<$foo> is uninitialized when used in
42+
the interpolated string for the L<ngx_echo/echo> directive. The output shows
43+
that the variable is evaluated to an empty string:
4244

4345
:bash
4446
$ curl 'http://localhost:8080/foo'
4547
foo = []
4648

47-
If we look at the output, uninitialized variable C<$foo> is equivalent
48-
to an empty
49-
string. However if we look further into Nginx error log (usually the file
50-
name is F<error.log>)
51-
it has a warning message when the request is handled:
49+
From the output, the uninitialized C<$foo> variable behaves just like
50+
taking an empty string value. But careful readers should have already noticed
51+
that, for the request above, there is a warning in the Nginx error log file
52+
(which is F<logs/error.log> by default):
5253

5354
[warn] 5765#0: *1 using uninitialized "foo" variable, ...
5455

55-
How is the warning generated ? The answer is the "get handler" hooked to
56-
variable C<$foo>
57-
when it is declared by command L<ngx_rewrite/set>. By the time command
58-
L<ngx_echo/echo>
59-
gets executed within C<location /foo>, it needs to evaluate its parameter
60-
C<"foo = [$foo]">
61-
this is where "variable interpolation" is happening and variable C<$foo>
62-
is devalued,
63-
Nginx first checks the value container, which has a special "invalid" value,
64-
so it decides
65-
to execute the variable's "get handler". The handler prints a warning message
66-
in Nginx's error
67-
log, then returns and caches an empty string as the value of C<$foo>.
68-
69-
You might have perceived, this is exactly the same process with which those
70-
builtin variable
71-
works, when it opt-in a value container as cache. Command L<ngx_rewrite/set>
72-
uses the very
73-
mechanism to handle those uninitialized Nginx variables. Be careful though,
74-
only special value
75-
"invalid" will trigger Nginx to execute its "get handler", another special
76-
value "no found" won't.
77-
78-
The warning message is helpful, as it tells we might have miss spelled
79-
variables in Nginx
80-
configuration, or we might have used uninitialized variables under an incorrect
81-
context. Since
82-
cache exists, the warning won't repeat itself for a request life cycle.
83-
Besides, the warning
84-
can be turned off by module L<ngx_rewrite> and its command L<ngx_rewrite/uninitialized_variable_warn>
85-
86-
As we said earlier, builtin variable L<$arg_XXX> has a special value "not
87-
found" when
88-
the request URL has no C<XXX> parameter. However we cannot as easily distinguish
89-
it from
90-
an empty string, using Nginx native syntax.
56+
Who on earth generates this warning? The answer is the "get handler" of C<$foo>,
57+
registered by the L<ngx_rewrite/set> directive. When C<$foo> is read, Nginx
58+
first checks the value in its container but sees the "invalid" special value,
59+
then Nginx decides to continue running C<$foo>'s "get handler", which first
60+
prints the warning (as shown above) and then returns an empty string value,
61+
which thereafter gets cached in C<$foo>'s value container.
62+
63+
Careful readers should have identified that this process for user variables is
64+
exactly the same as the mechanism we discussed earlier for built-in variables
65+
involving "get handlers" and result caching in value containers. Yes, it is the
66+
same mechanism in action. It is also worth noting that only the "invalid"
67+
special value will trigger the "get handler" invocation in the Nginx core while
68+
"not found" will not.
69+
70+
The warning message above usually indicates a typo in the variable name or
71+
misuse of uninitialized variables, not necessarily in the context of an
72+
interpolated string. Because of the existence of value caching in the variable
73+
container, this warning will not get printed multiple times in the lifetime of
74+
the current request. Also, the L<ngx_rewrite> module provides the
75+
L<ngx_rewrite/uninitialized_variable_warn> directive for disabling this warning
76+
altogether.
77+
78+
=== Testing Special Values of Nginx Variables in Lua ===
79+
80+
As we have just mentioned, the built-in variable L<$arg_XXX> takes the special
81+
value "not found" when the URL argument C<XXX> does not exist, but
82+
unfortunately, it is not easy to distinguish it from the empty string value
83+
directly in the Nginx configuration file, for example:
9184

9285
:nginx
9386
location /test {
9487
echo "name: [$arg_name]";
9588
}
9689

97-
We print variable C<$arg_name> meanwhile not to provide C<name> parameter
98-
in the request
90+
Here we intentionally omit the URL argument C<name> in our request:
9991

10092
:bash
10193
$ curl 'http://localhost:8080/test'
10294
name: []
10395

104-
Special value "not found" cannot be asserted in the output, it looks like
105-
an empty string.
106-
The "variable interpolation" of Nginx simply ignores "not found" when it
107-
is evaluated.
96+
We can see that we are still getting an empty string value, because this time
97+
it is the Nginx "script engine" that automatically converts the "not found"
98+
special value to an empty string when performing variable interpolation.
10899

109-
So how do we trace "not found" ? What exactly we can do to distinguish
110-
it from an empty
111-
string ? Obviously, URL parameter C<name> has an empty string in the request
112-
below:
100+
Then how can we test the special value "not found"? Or in other
101+
words, how can we distinguish it from normal empty string values? Obviously, in
102+
the following example, the URL argument C<name> does take an ordinary value,
103+
which is a
104+
true empty string:
113105

114106
:bash
115107
$ curl 'http://localhost:8080/test?name='
116108
name: []
117109

118-
We cannot yet tell any differences from the earlier example.
110+
But we cannot really differentiate this from the earlier case that does not
111+
mention the C<name> argument at all.
119112

120-
Good news is, with the help of 3rd party module L<ngx_lua>, it can be done
121-
in
122-
lua code. Now check example below:
113+
Luckily, we can easily achieve this in Lua by means of the 3rd-party module
114+
L<ngx_lua>. Please look at the following example:
123115

124116
:nginx
125117
location /test {
@@ -132,70 +124,73 @@ lua code. Now check example below:
132124
';
133125
}
134126

135-
This configuration is pretty close to the earlier one, except
136-
we have used module L<ngx_lua> and its command L<ngx_lua/content_by_lua>,
137-
to check Nginx variables and their possible special values using lua code.
138-
Specifically, we print C<name: missing> if variable C<$arg_name> has
139-
a non-value "not found" or "invalid":
127+
This example is very close to the previous one in terms of functionality.
128+
We use the L<ngx_lua/content_by_lua> directive from the L<ngx_lua> module to
129+
embed a small piece of our own Lua code to test against the special value of
130+
the Nginx variable C<$arg_name>. When C<$arg_name> takes a special value
131+
(either "not found" or "invalid"), we will get the following output when
132+
requesting C</test>:
140133

141134
:bash
142-
curl 'http://localhost:8080/test'
135+
$ curl 'http://localhost:8080/test'
143136
name: missing
144137

145-
Let me briefly introduce module L<ngx_lua>, the module embeds lua interpreter
146-
(standard or L<LuaJIT|http://luajit.org/luajit.html> in Nginx core, so
147-
that
148-
lua programs can be executed directly inside Nginx. The lua programs can
149-
be
150-
written right away in Nginx configuration or be written in external F<.
151-
lua>
152-
file and loaded via Nginx command referencing the F<.lua> path.
153-
154-
Back to our example, Nginx variables are referenced by C<ngx.var> from
155-
within
156-
lua, it is bridged by module L<ngx_lua>. For example, Nginx variable C<$VARIABLE>
157-
can be written as L<ngx_lua/ngx.var.VARIABLE> in lua code. When Nginx variable
158-
C<$arg_name> has non-value (special value "invalid" or "not found"), the
159-
corresponding
160-
variable C<ngx.var.arg_name> is C<nil> in lua. Further more, module L<ngx_lua>
161-
provides lua function L<ngx_lua/ngx.say>, functionally it is equivalent
162-
to
163-
module L<ngx_echo> and its command L<ngx_echo/echo>.
164-
165-
Now if we request with C<name> parameter being an empty string, the output
166-
becomes
167-
different:
138+
This is our first time meeting the L<ngx_lua> module, which deserves a brief
139+
introduction. This module embeds the Lua language interpreter (or LuaJIT's
140+
Just-in-Time compiler) into the Nginx core, to allow Nginx users to directly run
141+
their own Lua programs inside the server. The user can choose to insert
142+
her Lua code into different running phases of the server, to fulfill different
143+
requirements. Such Lua code is either specified directly as literal strings in
144+
the Nginx
145+
configuration file, or reside in external F<.lua> source files (or Lua binary
146+
bytecode
147+
files) whose paths are specified in the Nginx configuration.
148+
149+
Back to our example, we cannot directly write something like C<$arg_name> in
150+
our Lua code. Instead, we reference Nginx variables in Lua by means of the
151+
C<ngx.var> API provided by the L<ngx_lua> module. For example, to reference the
152+
Nginx variable C<$VARIABLE> in Lua, we just write L<ngx_lua/ngx.var.VARIABLE>.
153+
When the Nginx variable C<$arg_name> takes the special value "not found" (or
154+
"invalid"), C<ngx.var.arg_name> is evaluated to the C<nil> value in the Lua
155+
world. It is also worth noting that we use the Lua function L<ngx_lua/ngx.say>
156+
to print out the response body contents, which is functionally equivalent to
157+
the L<ngx_echo/echo> directive we are already very familiar with.
158+
159+
If we provide a C<name> URI argument that takes an empty value in the request,
160+
the output is now very different:
168161

169162
:bash
170163
$ curl 'http://localhost:8080/test?name='
171164
name: []
172165

173-
In this case, Nginx variable C<$arg_name> is an empty string, which
174-
is neither "not found" nor "invalid", so Lua code prints empty string
175-
"" for C<ngx.var.arg_name>. Apparently we have distinguished it from
176-
Lua C<nil>
166+
In this test, the value of the Nginx variable C<$arg_name> is a true empty
167+
string, neither "not found" nor "invalid". So in Lua, the expression
168+
C<ngx.var.arg_name> evaluates to the Lua empty string (C<"">), clearly
169+
distinguished from the Lua C<nil> value in the previous test.
177170

178-
The distinction becomes significant in a few scenarios. For example,
179-
a web service might filter its returns by C<name> by checking if
180-
C<name> parameter exists in URL parameters, even if C<name> has an
181-
empty string, it still can be used in a filtering operation.
171+
This differentiation is important in certain application scenarios. For
172+
instance, some web services have to decide whether to use a column value to
173+
filter the data set by checking the I<existence> of the corresponding URI
174+
argument. For these services, when the C<name> URI argument is absent, the
175+
whole data set is just returned; when the C<name> argument takes an empty
176+
value, however, only those records that take an empty value are returned.
182177

183-
Admittedly, there are some restrictions with builtin variable L<$arg_XXX>
184-
as we can see from our request to C<location /test>:
178+
It is worth mentioning a few limitations in the standard L<$arg_XXX> variable.
179+
Consider using the following request to test C</test> in our previous example
180+
using Lua:
185181

186182
$ curl 'http://localhost:8080/test?name'
187183
name: missing
188184

189-
In this case, C<$arg_name> is still computed as "not found" non-value,
190-
which
191-
is counter common sense. Besides, L<$arg_XXX> only resolutes to the first
192-
C<XXX>
193-
parameter if there are multiple C<XXX> URL parameters, the rest are discarded:
185+
Now the C<$arg_name> variable still reads the "not found" special value, which
186+
is apparently counter-intuitive. Additionally, when multiple URI arguments with
187+
the same name are specified in the request, L<$arg_XXX> just
188+
returns the first value of the argument, discarding other values silently:
194189

195190
:bash
196191
$ curl 'http://localhost:8080/test?name=Tom&name=Jim&name=Bob'
197192
name: [Tom]
198193

199-
To fix these defects, one can use module L<ngx_lua> and its lua function
200-
L<ngx_lua/ngx.req.get_uri_args> in lua code.
194+
To solve these problems, we can directly use the Lua function
195+
L<ngx_lua/ngx.req.get_uri_args> provided by the L<ngx_lua> module.
201196
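The closing paragraph of the new text names L<ngx_lua/ngx.req.get_uri_args> but does not show it in use. The following is a minimal sketch of how the C</test> location might be rewritten with it, not part of this commit. It assumes the behavior described in the ngx_lua documentation: an argument given without C<=> (as in C<?name>) is reported as the boolean C<true>, and an argument that appears several times is reported as a Lua table holding every value in order. The output wording is purely illustrative.

```nginx
location /test {
    content_by_lua '
        -- Illustrative sketch only: fetch all URI arguments as a Lua table.
        local args = ngx.req.get_uri_args()
        local name = args.name

        if name == nil then
            -- The argument does not appear in the query string at all.
            ngx.say("name: missing")
        elseif name == true then
            -- "?name" without "=" is assumed to arrive as boolean true.
            ngx.say("name: present without a value")
        elseif type(name) == "table" then
            -- Repeated arguments are assumed to arrive as an array-like table.
            -- tostring() guards against valueless entries (boolean true).
            local parts = {}
            for i, v in ipairs(name) do
                parts[i] = tostring(v)
            end
            ngx.say("name: [", table.concat(parts, ", "), "]")
        else
            -- A single ordinary value, possibly the empty string.
            ngx.say("name: [", name, "]")
        end
    ';
}
```

Under those assumptions, this version can tell apart the four cases that L<$arg_XXX> blurs together: an absent argument, C<?name>, C<?name=>, and a repeated C<name> argument.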
