11= Nginx Variables (02) =
22
3- One common misunderstanding with Nginx variable, is its life cycle be bounded
4- with the C<Location> directive. Let's challenge it by checking another
5- example
3+ == Variable Lifetime & Internal Redirection ==
4+
5+ We already know that Nginx variables are bound to each request handled by
6+ Nginx; for this reason they have exactly the same lifetime as the corresponding
7+ request.
8+
9+ There is another common misunderstanding here though: some newcomers tend to
10+ assume
11+ that the lifetime of Nginx variables is bound to the C<location> configuration
12+ block. Let's consider the following counterexample:
613
714 :nginx
815 server {
@@ -18,52 +25,51 @@ example
1825 }
1926 }
2027
21- We use the 3rd party module L<ngx_echo> and its command L<ngx_echo/echo_exec>
22- to execute C<location /bar> from within C<location /foo>. The mechanism
23- is like
24- "jumping" from one C<location> to another C<location> in Nginx internals
25- when
26- handling a specific request. This is different with HTTP C<301> and C<302>
27- redirect, which conducts the external jump by collaborating with HTTP client.
28- External redirect can be asserted when requesting URL is modified. Whereas
29- Nginx
30- internal jumps has no impact on the client side, and just like the C<exec>
31- command
32- found in C<Bourne Shell> (or C<Bash>), the execution path has no returns.
33- Another
34- close schema is the C<goto> statement in C<C>.
35-
36- Being an internal jump in between the C<location> directives for Nginx
37- processing,
38- the request remains to be the original one, its copy of declared variables
39- has not changed neither.
40- Back to our example, if C</foo> is requested, the processing is like following:
41- variable C<$a>
42- is declared and initialized with string value C<hello> by command L<ngx_rewrite/set>
43- within
44- C<location> directive, then internal jump occurs by L<ngx_echo/echo_exec>
45- and processing continues
46- from C<location /bar>. Since C<$a> in the latter case is the same C<$a>
47- variable initialized earlier
48- we can expect the outcome is C<hello> when variable C<$a> is printed. Let'
49- s prove ourselves with
50- request:
28+ Here in C<location /foo> we use the L<ngx_echo/echo_exec> directive (provided
29+ by the 3rd-party module L<ngx_echo>) to initiate an "internal redirection" to
30+ C<location /bar>. The "internal redirection" is an operation that makes Nginx
31+ jump
32+ from one C<location> to another while processing a request. This "jumping"
33+ happens
34+ completely within the server itself. This is different from those "external
35+ redirections"
36+ based on the HTTP C<301> and C<302> responses because the latter require
37+ the cooperation of the HTTP client. Also, in the case of "external
38+ redirections", the
39+ end user could usually observe the change of the URL in her web browser's
40+ address bar while this is not the case for internal ones. "Internal
41+ redirections"
42+ are very similar to the C<exec> command in
43+ C<Bourne Shell> (or C<Bash>); it is a "one way trip" and never returns. Another
44+ similar example is the C<goto> statement in the C<C> language.
45+
46+ Being an "internal redirection", the request after the redirection
47+ remains the original one. It is just the current C<location> that is changed,
48+ so we are still using the original copy of the Nginx variable containers. Back
49+ to our example, the whole workflow is like this: Nginx first assigns to the
50+ C<$a> variable the string value C<hello> via the L<ngx_rewrite/set> directive
51+ in C<location /foo>, and then it issues an internal redirection via the
52+ L<ngx_echo/echo_exec> directive, thus leaving C<location /foo> and entering
53+ C<location /bar>, and finally it
54+ outputs the value of C<$a>. Because the value container of C<$a> remains the
55+ original one, we can expect the response output to be C<hello>. The test result
56+ confirms this:
5157
5258 :bash
5359 $ curl localhost:8080/foo
5460 a = [hello]
5561
56- If however, the C</bar> is requested directly, C<$a> still has an empty
57- value because
58- it is initialized in C<location /foo> only .
62+ But when accessing C</bar> directly from the client side, we will get an empty
63+ value for the C<$a> variable, since this variable relies on C<location /foo> to
64+ get initialized.
5965
60- The example tells, when a request is being handled, even across multiple
61- C<location> directives,
62- its copy of Nginx variables has not been reconstructed. The concept of
63- "internal jumps" is also
64- worth of noting, that the built-in L<ngx_rewrite> module and its command
65- L<ngx_rewrite/ rewrite>
66- can execute exactly the same kind of internal jump. To rewrite our example :
66+ It can be observed that during a request's lifetime, the copy
67+ of Nginx variable containers does not change at all even when Nginx goes across
68+ different C<location> configuration blocks. Here we also meet the concept of
69+ "internal redirections" for the first time, and it's worth mentioning that the
70+ L<ngx_rewrite/rewrite> directive of the L<ngx_rewrite> module can also be used
71+ to initiate "internal redirections". For instance, we can rewrite the example
72+ above with the L<ngx_rewrite/rewrite> directive as follows:
6773
6874 :nginx
6975 server {
@@ -79,45 +85,42 @@ can execute exactly the same kind of internal jump. To rewrite our example:
7985 }
8086 }
8187
88+ It's functionally equivalent to L<ngx_echo/echo_exec>. We will discuss the
89+ L<ngx_rewrite/rewrite> directive in more depth in later chapters, including
90+ how it initiates "external redirections" such as C<301> and C<302>.
91+
92+ To conclude, the lifetime of Nginx variable containers is indeed bound to the
93+ request being processed, and is independent of C<location>.
94+
95+ == Nginx Built-in Variables ==
96+
97+ The Nginx variables we have seen so far are all (implicitly) created by
98+ directives like L<ngx_rewrite/set>. We usually call such variables "user-defined
99+ variables", or simply "user variables". There is also another kind of Nginx
100+ variable that is I<pre-defined> by either the Nginx core or Nginx modules.
101+ Let's call variables of this kind "built-in variables".
82102
83- Net effect has no differences with L<ngx_echo/echo_exec>. L<ngx_rewrite/rewrite>
84- will
85- be addressed more specifically later, for its usage in C<301> and C<302>
86- redirects.
87-
88- Again, we have asserted that Nginx variable's life time is bounded with
89- the request being
90- handled and it has nothing to do with C<location> directives.
91-
92- So far, the variables we have discussed are implicitly created by L<ngx_rewrite/set>.
93- They
94- are called "user defined variables" or simply "user variables". Besides
95- variables defined
96- by user, Nginx core and various Nginx modules can provide "pre-defined
97- variables" or "builtin
98- variables).
99-
100- Builtin variables are mostly used to provide request or response information.
101- For instance
102- builtin variable L<ngx_core/$uri>, declared by L<ngx_http_core> module,
103- gives the URI of
104- the request being handled (url-decoded and exclude request parameters).
105- Another builtin
106- variable L<ngx_core/$request_uri> gives the original request URI (url-encoded
107- and include
108- request parameter). Another example:
103+ === $uri & $request_uri ===
104+
105+ One common use of Nginx built-in variables is to retrieve various types of
106+ information about the current request or response. For instance, the built-in
107+ variable L<ngx_core/$uri> provided by L<ngx_http_core> is used to fetch the
108+ (decoded) URI of the current request, excluding any query-string arguments.
109+ Another example is the L<ngx_core/$request_uri> variable provided by the same
110+ module, which is used to fetch the raw, non-decoded form of the URI, including
111+ any query-string. Let's look at the following example.
109112
110113 :nginx
111114 location /test {
112115 echo "uri = $uri";
113116 echo "request_uri = $request_uri";
114117 }
115118
116- for the sake of clearness, C<server> directive is omitted. As usual the
117- server is listening
118- on C<8080> port, the example prints Nginx builtin variables L<ngx_core/$uri>
119- and L<ngx_core/$request_uri>
120- in the response. Now let's send a request to C<test> :
119+ We omit the C<server> configuration block here for brevity. Just as in all the
120+ samples above, we still listen on local port C<8080>. In this example, we
121+ output both L<ngx_core/$uri> and L<ngx_core/$request_uri> into the response
122+ body. Below is the result of testing this C</test> interface with different
123+ requests:
121124
122125 :bash
123126 $ curl 'http://localhost:8080/test'
@@ -132,22 +135,30 @@ in the response. Now let's send a request to C<test>:
132135 uri = /test/hello world
133136 request_uri = /test/hello%20world?a=3&b=4
134137
135- There is another category of builtin variables, these variable names
136- has the C<arg_> prefix, such as C<$arg_name>, its value is the url-encoded
137- URI parameter C<name>, here is a finer example:
138+ === Variables with Infinite Names ===
139+
140+ There is another very common built-in variable that does not have a fixed
141+ variable name. Instead, it has I<infinite> variations. That is, all those
142+ variables whose names have the prefix C<arg_>, like C<$arg_foo> and
143+ C<$arg_bar>. Let's just call it the C<$arg_XXX> "variable group". For example,
144+ the C<$arg_name> variable is evaluated to the value of the C<name> URI argument
145+ for the current request. Also, the URI argument's value obtained here is not
146+ decoded yet, potentially containing C<%XX> sequences. Let's check out a
147+ complete example:
138148
139149 :nginx
140150 location /test {
141151 echo "name: $arg_name";
142152 echo "class: $arg_class";
143153 }
144154
145- We test C</test> with a few scenarios, each with different URL parameter
155+ Then we test this interface out with various different URI argument
156+ combinations:
146157
147158 :bash
148159 $ curl 'http://localhost:8080/test'
149- name:
150- class:
160+ name:
161+ class:
151162
152163 $ curl 'http://localhost:8080/test?name=Tom&class=3'
153164 name: Tom
@@ -157,22 +168,24 @@ We test C</test> with a few scenarios, each with different URL parameter
157168 name: hello%20world
158169 class: 9
159170
160- C<$arg_name> is case-insensitive, it matches to C<name> URL parameter
161- and it matches the C<NAME> or C<Name> as well :
171+ In fact, C<$arg_name> does not only match the C<name> argument name, but also
172+ C<NAME> or even C<Name>. That is, the letter case does not matter here:
162173
163174 $ curl 'http://localhost:8080/test?NAME=Marry'
164175 name: Marry
165- class:
176+ class:
166177
167178 $ curl 'http://localhost:8080/test?Name=Jimmy'
168179 name: Jimmy
169- class:
180+ class:
170181
171- Nginx lower-cases all URL parameter keys before it declares those builtin
172- variables.
182+ Behind the scenes, Nginx just converts the URI argument names into the
183+ pure lower-case form before matching against the name specified by
184+ C<$arg_XXX>.
173185
174- A 3rd party module L<ngx_set_misc> and its command L<ngx_set_misc/set_unescape_uri>
175- can execute URL decoding for string sequences like C<%XX>
186+ If you want to decode the special sequences like C<%20> in the URI argument
187+ values, then you could use the L<ngx_set_misc/set_unescape_uri> directive
188+ provided by the 3rd-party module L<ngx_set_misc>.
176189
177190 :nginx
178191 location /test {
@@ -183,47 +196,58 @@ can execute URL decoding for string sequences like C<%XX>
183196 echo "class: $class";
184197 }
185198
186- Again :
199+ Let's check out the actual effect:
187200
188201 :bash
189202 $ curl 'http://localhost:8080/test?name=hello%20world&class=9'
190203 name: hello world
191204 class: 9
192205
193- white space is decoded !
206+ The encoded space has indeed been decoded!
207+
208+ Another thing that we can observe from this example is that the
209+ L<ngx_set_misc/set_unescape_uri> directive can also implicitly create Nginx
210+ user-defined variables, just like the L<ngx_rewrite/set> directive. We will
211+ discuss the
212+ L<ngx_set_misc> module in more detail in future chapters.
213+
214+ Variables of this type, like L<$arg_XXX>, have an infinite number of possible
215+ names, so they do not correspond to any value containers.
216+ Furthermore, such variables are handled in a very specific way within the Nginx
217+ core. It
218+ is thus not possible for 3rd-party modules to introduce such magical built-in
219+ variables of their own.
220+
221+ The Nginx core offers a lot of such built-in variables in addition to
222+ L<$arg_XXX>, like the L<$cookie_XXX> variable group for fetching HTTP cookie
223+ values, the L<$http_XXX> variable group for fetching request headers, as well
224+ as the L<$sent_http_XXX> variable group for retrieving response headers. We
225+ will not go into the details for each of them here. Interested readers can
226+ refer to the official documentation for the L<ngx_http_core> module.
194227
195- As we can see, command L<ngx_set_misc/set_unescape_uri> is like command
196- L<ngx_rewrite/set>
197- has the capability of declare and initialize Nginx variables. Later on
198- we will discuss more of
199- the L<ngx_rewrite/set> module.
228+ === Writing to Built-in Variables ===
200229
201- Variables like L<$arg_XXX>, are declared specifically within Nginx core.
202- 3rd party module
203- has no equivalent capabilities. There are similar category of variables,
204- C<$cookie_XXX> to retrieve
205- cookie, L<$http_XXX> the headers and L<$sent_http_XXX> response headers.
206- Please reference
207- official documentation of L<ngx_http_core> module for details.
230+ All the user-defined variables are writable. Actually the way that we declare
231+ or create such variables so far is to use a configuration directive, like
232+ L<ngx_rewrite/set>, that performs value assignment at request time. But it is
233+ I<not> necessarily the case for built-in variables.
208234
209- Attention, many builtin variables are read-only. Such as the one we have
210- lately introduced
211- L<ngx_core/$uri> and L<ngx_core/$request_uri>. One must avoid to assign
212- values to read-only variables,
213- unless they enjoy surprises, for example:
235+ Most of the built-in variables are effectively I<read-only>, like the
236+ L<ngx_core/$uri> and L<ngx_core/$request_uri> variables that we just introduced
237+ earlier. Assignments to such read-only variables must always be avoided;
238+ otherwise they will lead to unexpected consequences. For example:
214239
215240 :nginx
216241 ? location /bad {
217242 ? set $uri /blah;
218243 ? echo $uri;
219244 ? }
220245
221- This problematic configuration dumps fatal error when Nginx is started
222- and leaves absolute no clue :
246+ This problematic configuration just triggers a confusing error message when
247+ Nginx is started:
223248
224249 [emerg] the duplicate "uri" variable in ...
225250
226- Attempt to write other read-only variables such as L<$arg_XXX> variables,
227- can blow the Nginx
228- process right away in a few particular releases.
251+ Attempts to write to some other read-only built-in variables like L<$arg_XXX>
252+ can even lead to server crashes in some particular Nginx versions.
229253