Provided by: varnish-selector_2.6.0-1_amd64
NAME
vmod_selector - Varnish Module for matching fixed strings, and mapping strings to backends, regexen and other data
SYNOPSIS
import selector;

# Set creation
new <obj> = selector.set([BOOL case_sensitive] [, BOOL allow_overlaps])
VOID <obj>.add(STRING [, STRING string] [, STRING regex] [, BACKEND backend] [, INT integer] [, BOOL bool] [, SUB sub])
VOID <obj>.create_stats()

# Matching
BOOL <obj>.match(STRING)
BOOL <obj>.hasprefix(STRING)

# Match properties
INT <obj>.nmatches()
BOOL <obj>.matched([INT n] [, STRING element] [, ENUM select])
INT <obj>.which([ENUM select] [, STRING element])
BOOL <obj>.check_call([INT n] [, STRING element] [, ENUM select])

# Retrieving objects by index, by string, or after match
STRING <obj>.element([INT n] [, ENUM select])
STRING <obj>.string([INT n] [, STRING element] [, ENUM select])
BACKEND <obj>.backend([INT n] [, STRING element] [, ENUM select])
INT <obj>.integer([INT n] [, STRING element] [, ENUM select])
BOOL <obj>.bool([INT n] [, STRING element] [, ENUM select])
BOOL <obj>.re_match(STRING [, INT n] [, STRING element] [, ENUM select])
STRING <obj>.sub(STRING text, STRING rewrite [, BOOL all] [, INT n] [, STRING element] [, ENUM select])
SUB <obj>.subroutine([INT n] [, STRING element] [, ENUM select])

# VMOD version
STRING selector.version()
DESCRIPTION
Varnish Module (VMOD) for matching strings against sets of fixed strings. A VMOD object may also function as an associative array, mapping the matched string to one or more of a backend, another string, an integer, or a regular expression. The string may also map to a subroutine that can be invoked with call. The VMOD is intended to support a variety of use cases that are typical for VCL deployments, such as: · Determining the backend based on the Host header or the prefix of the URL. · Rewriting the URL or a header. · Generating redirect responses, based on a header or the URL. · Permitting or rejecting request methods. · Matching the Basic Authentication credentials in an Authorization request header. · Matching media types in the Content-Type header of a backend response to determine if the content is compressible. · Accessing data by string match, as in an associative array, or by numeric index, as in a standard array. · Dispatching subroutine calls based on string matches. · Executing conditional logic that depends on features of the request or response that can be determined by matching headers or URLs. Operations such as these are commonly implemented in native VCL with an if-elsif-elsif sequence of string comparisons or regex matches. As the number of matches increases, such a sequence becomes cumbersome and scales poorly -- the time needed to execute the sequence increases with the number of matches to be performed. With the VMOD, the strings to be matched are declared in a tabular form in vcl_init, and the operation is executed in a few lines. For example: import selector; # Assume that you have defined these subroutines to execute logic # in vcl_recv for URLs beginning with /foo/, /bar/ or /baz/. sub foo { # ... } sub bar { # ... } sub baz { # ... } sub vcl_init { # Requests for URLs with these prefixes will be sent to the # associated backend. In vcl_recv, the associated subroutine # will be called. 
new url_prefix = selector.set(); url_prefix.add("/foo/", backend=foo_backend, sub=foo); url_prefix.add("/bar/", backend=bar_backend, sub=bar); url_prefix.add("/baz/", backend=baz_backend, sub=baz); # For requests with these Host headers, generate a redirect # response, using the associated string to construct the # Location header, and the integer to set the response code. new redirect = selector.set(); redirect.add("www.foo.com", string="/foo", integer=301); redirect.add("www.bar.com", string="/bar", integer=302); redirect.add("www.baz.com", string="/baz", integer=303); redirect.add("www.quux.com", string="/quux", integer=307); # Requests for these URLs are rewritten by altering the # query string, using the associated regex for a # substitution operation, each of which removes a # parameter. new rewrite = selector.set(); rewrite.add("/alpha/beta", regex="(\?.*)\bfoo=[^&]+&?(.*)$"); rewrite.add("/delta/gamma", regex="(\?.*)\bbar=[^&]+&?(.*)$"); rewrite.add("/epsilon/zeta", regex="(\?.*)\bbaz=[^&]+&?(.*)$"); } sub vcl_recv { # .match() returns true if the Host header exactly matches # one of the strings in the set. if (redirect.match(req.http.Host)) { # .string() returns the string added to the set above with # the 'string' parameter, for the string that was # matched. We use it to construct a Location header, which # will be retrieved in vcl_synth below to construct the # redirect response. # # .integer() returns the integer added to the set with the # 'integer' parameter, for the string that was matched. We # use it as the argument of synth() to set the response # status (one of the redirect status codes). set req.http.Location = "http://other.com" + redirect.string() + req.url; return (synth(redirect.integer())); } # If the URL matches the rewrite set, change the query string by # applying a substitution using the associated regex (removing a # query parameter). 
if (rewrite.match(req.url)) { set req.url = rewrite.sub(req.url, "\1\2"); } # If the URL has a prefix in the url_prefix set, call the # associated subroutine. if (url_prefix.hasprefix(req.url)) { call url_prefix.subroutine(); } } sub vcl_synth { # We come here when Host matched the redirect set in vcl_recv # above. Set the Location response header from the request header # set in vcl_recv. if (req.http.Location && resp.status >= 301 && resp.status <= 307) { set resp.http.Location = req.http.Location; return (deliver); } } sub vcl_backend_fetch { # The .hasprefix() method returns true if the URL has a prefix # in the set. if (url_prefix.hasprefix(bereq.url)) { # .backend() returns the backend associated with the # string in the set that was matched as a prefix. set bereq.backend = url_prefix.backend(); } } Matches with the .match() and .hasprefix() methods scale well as the number of strings in the set increases. Experience has shown that both operations are predictable and fast for large sets of strings. When new strings are added to a set (with new .add() statements in vcl_init), the VCL code that executes the various operations (rewrites, backend assignment and so forth) can remain unchanged. So the VMOD can contribute to better code maintainability. Matches with .match() and .hasprefix() are fixed string matches; characters such as wildcards and regex metacharacters are matched literally, and have no special meaning. Regex operations such as matching or substitution can be performed after set matches, using the regex saved with the regex parameter. But if you need to match against sets of patterns, consider using the set interface of VMOD re2, which provides techniques similar to the present VMOD. The limited expressiveness of strings to be matched means that this VMOD can implement fast algorithms. While regexen and a VMOD like re2 can be used to match fixed strings and prefixes, the matching operations of VMOD selector are orders of magnitude faster. 
That in turn contributes to scalability by consuming less CPU time for matches. So if your use case allows matches against strings without patterns, prefer the use of this VMOD. Selecting matched elements of a set The .match() operation is an exact, fixed string match, and hence always matches exactly one string in the set if it succeeds. With .hasprefix(), more than one string in the set may be matched, if the set includes strings that are prefixes of other strings in the same set: sub vcl_init { new myset = selector.set(); myset.add("/foo/"); # element 1 myset.add("/foo/bar/"); # element 2 myset.add("/foo/bar/baz/"); # element 3 } sub vcl_recv { # With .hasprefix(), a URL such as /foo/bar/baz/quux matches all # 3 elements in the set. if (myset.hasprefix(req.url)) { # ... } } Just calling .hasprefix() may be sufficient if all that matters is whether a string has any prefix that appears in the set. But for some uses it may be necessary to identify one matching element of the set; this is done in particular for the methods that retrieve data associated with a specific set element. For such cases, the method parameters INT n, STRING element and ENUM select are used to choose a matched element. As indicated in the example, elements of a set are implicitly numbered in the order in which they were added to the set using the .add() method, starting from 1. In all of the following, the n, element and select parameters for a method call are evaluated as follows: · If n >= 1, then the n-th element of the set is chosen, and the element and select parameters have no effect. A method with n >= 1 can be called in any context, and does not depend on prior match operations. This is essentially a lookup by index. · If n is greater than the number of elements in the set, the method invokes VCL failure (see ERRORS). · If n <= 0 and the element parameter is set, then the VMOD searches for the string specified by element, in the same way that the .match() method is executed. 
This is in essence a lookup in an associative array. If element is set but the lookup fails, that is if there is no such element in the set, then VCL failure is invoked, with the string "no such element" in the VCL_Error log message. If the lookup for the element succeeds, then the successful match establishes a match context for subsequent code. That means that the rules presently described can be applied again, as if .match() had returned true for the element (internally, that is in fact what happens). The internal match against element is case sensitive if and only if the case_sensitive flag was true in the set constructor (this is the default). n is 0 by default, so it can be left out of the method call when element is set. · If n <= 0 and element is unset, then the select parameter is used to choose an element based on the most recent .match() or .hasprefix() call for the same set object in the same task scope; that is, the most recent call in the same client or backend context. Thus a method call in one of the vcl_backend_* subroutines refers back to the most recent .match() or .hasprefix() invocation in the same backend context. By default, n is 0 and element is unset, so both of them can be left out of the call to use select. · If n <= 0 and element is unset, and neither of .match() or .hasprefix() has been called for the same set object in the same task scope, or if the most recent call resulted in a failed match, then the method invokes VCL failure. · When n <= 0 and element is unset after a successful .match() call, then for any value of select, the element chosen is the one that matched. · When n <= 0 and element is unset after a successful .hasprefix() call, then the value of select determines the element chosen, as follows: · UNIQUE (default): if exactly one element of the set matched, choose that element. The method invokes VCL failure in this case if more than one element matched. 
Since the defaults for n and select are 0 and UNIQUE, and element is unset by default, select=UNIQUE is in effect if all three parameters are left out of the method call. · EXACT: if one of the elements in the set matched exactly (even if other prefixes in the set matched as well), choose that element. VCL failure is invoked if there was no exact match. Thus if a prefix match for /foo/bar is run against a set containing /foo and /foo/bar, the latter element is chosen with select=EXACT. · FIRST: choose the first element in the set that matched (in the order in which they were added with .add()). · LAST: choose the last element in the set that matched. · SHORTEST: choose the shortest element in the set that matched. · LONGEST: choose the longest element in the set that matched. So for sets of strings with common prefixes, a strategy for selecting the matched element after a prefix match can be implemented by ordering the strings added to the set, by choosing only an exact match or the longest match, and so on: # In this example, we set the backend for a fetch based on the most # specific matching prefix of the URL, i.e. the longest prefix in # the URL that appears in the set. sub vcl_init { new myset = selector.set(); myset.add("/foo/", backend=foo_backend); myset.add("/foo/bar/", backend=bar_backend); myset.add("/foo/bar/baz/", backend=baz_backend); } sub vcl_backend_fetch { if (myset.hasprefix(bereq.url)) { set bereq.backend = myset.backend(select=LONGEST); } } # This sets baz_backend for /foo/bar/baz/quux # bar_backend for /foo/bar/quux # foo_backend for /foo/quux To re-state the rules more informally: · Use only one of n, element or select to select a string in the set. · If n > 0, use n. n = 0 by default. · Otherwise if element is set, use element. element is unset by default. · Otherwise use select, default UNIQUE. · n is a lookup by numeric index, as implied by the order of .add() in vcl_init. · element is an associative array lookup by string. 
· select refers back to the previous invocation of .match() or .hasprefix(). · The value of select is irrelevant (and can just as well be left out) if the prior invocation was .match(), or if it was .hasprefix() and exactly one string was found (which is always the case if strings in the set have no common prefixes). select is meant to pick an element when .hasprefix() finds more than one string. new xset = selector.set(BOOL case_sensitive, BOOL allow_overlaps) new xset = selector.set( BOOL case_sensitive=1, BOOL allow_overlaps=1 ) Create a set object. When case_sensitive is false, matches using the .match() and .hasprefix() methods are case-insensitive. By default, case_sensitive is true. When allow_overlaps is false, the VCL load fails if any string added to the set is a prefix of another string in the set. This can be used to ensure that methods using the select=UNIQUE enum will always succeed after .hasprefix() matches (and to fail fast if the restriction is not met). By default, allow_overlaps is true. The initialization of a set is completed when vcl_init finishes, or when the deprecated .compile() method is called. This prepares the set for use with the strings added with the .add() method described below. The VCL load fails if: · The same string is added to the same set more than once (that string is included in the error message). · The set contains a string that is a prefix of another string in the same set, but allow_overlaps was set to false in the constructor. Set initialization may also fail due to conditions such as out of memory. If no strings were added to the set before vcl_init finishes or .compile() is invoked, the VCL load will not fail, but all match operations on the set will fail. In that case, a warning is emitted to the log with the VCL_Error tag. Since that happens outside of any request/response transaction, the error message can only be seen when a tool like varnishlog(1) is used with raw grouping (-g raw). 
Examples: sub vcl_init { # By default, matches are case-sensitive, and overlapping # prefixes are permitted. new myset = selector.set(); # ... # For case-insensitive matching. new caseless = selector.set(case_sensitive=false); # ... # Forbid overlapping prefixes. new allunique = selector.set(allow_overlaps=false); # ... } VOID xset.add(STRING, [STRING string], [STRING regex], [BACKEND backend], [INT integer], [BOOL bool], [SUB sub]) VOID xset.add( STRING, [STRING string], [STRING regex], [BACKEND backend], [INT integer], [BOOL bool], [SUB sub] ) Add the given string to the set. As indicated above, elements added to the set are implicitly numbered in the order in which they are added with .add(), starting with 1. If values are set for any of the following optional parameters, then those values are associated with this element, and can be retrieved with the method shown in the second column. The retrieval methods are documented below. ┌─────────────────┬─────────────────────┐ │.add() parameter │ Retrieval methods │ ├─────────────────┼─────────────────────┤ │string │ .string() │ ├─────────────────┼─────────────────────┤ │regex │ .re_match(), .sub() │ ├─────────────────┼─────────────────────┤ │backend │ .backend() │ ├─────────────────┼─────────────────────┤ │integer │ .integer() │ ├─────────────────┼─────────────────────┤ │bool │ .bool() │ ├─────────────────┼─────────────────────┤ │sub │ .subroutine() │ └─────────────────┴─────────────────────┘ A regular expression in the regex parameter is compiled at VCL load time. If the compile fails, then the VCL load fails with an error message. Regular expressions are evaluated exactly as native regexen in VCL. A VCL subroutine specified by the sub parameter MUST be defined prior to the definition of vcl_init in which .add() is invoked. The VCL compiler does not support forward definitions for this purpose. .add() invokes VCL failure if it is called in any subroutine besides vcl_init. 
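The pairing of .add() parameters with retrieval methods shown in the table above can be sketched as follows. This is only an illustration: the set name paths, the backend app_backend and its address, and the X-* request headers are hypothetical and do not come from the VMOD.

```vcl
vcl 4.1;

import selector;

# Hypothetical backend, used only for this illustration.
backend app_backend {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_init {
    new paths = selector.set();
    # One element that carries a string, a boolean and a backend.
    paths.add("/app/", string="/v2/app/", bool=true,
              backend=app_backend);
}

sub vcl_recv {
    if (paths.hasprefix(req.url)) {
        # The set has a single element, so the default select=UNIQUE
        # applies to every retrieval below.
        if (paths.bool()) {
            set req.http.X-Flagged = "1";
        }
        set req.http.X-New-Prefix = paths.string();
        set req.backend_hint = paths.backend();
    }
}
```

Each retrieval method reads back only the value that was passed under the corresponding .add() parameter; calling a retrieval method for a value that was never set invokes VCL failure, as documented for each method.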
The VCL load fails if: · The string to be added is NULL. · A regular expression in the regex parameter fails to compile. · A subroutine specified by the sub parameter was not defined previously in the VCL source. · The deprecated .compile() method has already been called. Example: sub my_quux_sub { set req.http.Quux = "xyzzy"; } sub vcl_init { new myset = selector.set(); myset.add("www.foo.com"); myset.add("www.bar.com", string="/bar"); myset.add("www.baz.com", string="/baz", backend=baz_backend); myset.add("www.quux.com", string="/quux", backend=quux_backend, regex="^/quux/([^/]+)/", sub=my_quux_sub); } VOID xset.compile() This method is deprecated, and will be removed in a future version. .compile() may be omitted, since compilation happens automatically when vcl_init finishes. .compile() compiles the set. This is done after all of the strings have been added. .compile() invokes VCL failure if it is called in any subroutine besides vcl_init. The VCL load may fail for the same reasons described for set initialization above, or if .compile() is invoked more than once. VOID xset.create_stats() Create statistics counters for this object that are displayed by tools such as varnishstat(1). See STATISTICS for details. It must be called in vcl_init. No statistics are created for a set object if .create_stats() is not invoked. .create_stats() invokes VCL failure if it is called in any VCL subroutine besides vcl_init. Example: sub vcl_init { new myset = selector.set(); myset.add("foo"); myset.add("bar"); myset.add("baz"); myset.create_stats(); } BOOL xset.match(STRING) Returns true if the given STRING exactly matches one of the strings in the set. The match is case insensitive if and only if the parameter case_sensitive was set to false in the set constructor (matches are case sensitive by default). .match() invokes VCL failure if: · No strings were added to the set. · There is insufficient workspace for internal operations. 
If the string to be matched is NULL, for example when an unset header is specified, then .match() returns false, and a warning is emitted to the log with the Notice tag (see LOGGING). This is because a match against an unset header may or may not have been intentional. If you need to distinguish whether or not the header exists when using .match(), you can evaluate the header in boolean context:

    if (!myset.match(req.http.Foo)) {
        # Either there is no such header in the client request, or
        # the header does not match the set.
        # ...
    }

    if (req.http.Foo && !myset.match(req.http.Foo)) {
        # The header exists, but does not match the set.
        # ...
    }

BOOL xset.hasprefix(STRING)

Returns true if the STRING to be matched has a prefix that is in the set. The match is case insensitive if case_sensitive was set to false in the constructor.

.hasprefix() invokes VCL failure under the same conditions given for .match() above.

Like .match(), .hasprefix() returns false if the string to be matched is NULL, for example if it is an unset header, and a Notice message is emitted to the log (see LOGGING).

Example:

    if (myset.hasprefix(req.url)) {
        call do_if_prefix_matched;
    }

INT xset.nmatches()

Returns the number of elements that were matched by the most recent successful invocation of .match() or .hasprefix() for the same set object in the same task scope (that is, in the same client or backend context).

.nmatches() returns 0 after either of .match() or .hasprefix() returned false, and it returns 1 after .match() returned true. After a successful .hasprefix() call, it returns the number of strings in the set that are prefixes of the string that was matched.

.nmatches() invokes VCL failure if there was no prior invocation of .match() or .hasprefix() in the same task scope.

Example:

    # For a use case that requires a unique prefix match, use
    # .nmatches() to ensure that there was exactly one match, and fail
    # fast with VCL failure otherwise.
    # (std.log() requires "import std;" at the top of the VCL.)
    if (myset.hasprefix(bereq.url)) {
        if (myset.nmatches() != 1) {
            std.log(bereq.url + " matched > 1 prefix in the set");
            return (fail);
        }
        set bereq.backend = myset.backend(select=UNIQUE);
    }

BOOL xset.matched(INT n, STRING element, ENUM select)

    BOOL xset.matched(
        INT n=0,
        STRING element=0,
        ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
    )

After a successful .match() or .hasprefix() call for the same set object in the same task scope, return true if the element indicated by the n, element and select parameters was matched, according to the rules described above.

For example if n > 0, .matched(n) returns true if and only if the n-th element matched. The numbering corresponds to the order of .add() invocations in vcl_init (starting from 1). The select and element parameters are ignored in this case.

If n <= 0 and element is set, then .matched() returns true if and only if the string specified by element was matched in the previous successful .match() or .hasprefix() call. If element is not in the set, then .matched() does not invoke VCL failure (this is a deviation from the general rules for element), but .matched() always returns false in that case. Thus .matched() can always be used with element to safely check if a string was previously matched, regardless of whether the string is in the set. n defaults to 0, so the n parameter can be left out if element is set.

If n <= 0 and element is unset, the set element is determined by the select enum. In that case, .matched() returns true if and only if the element indicated by the enum was matched by the previous successful match operation. These distinctions are only relevant if the previous operation was .hasprefix(), and more than one string was matched due to overlapping prefixes. .matched() returns true for all values of select if the previous successful operation was .match(). n defaults to 0 and element is unset by default, so the n and element parameters can be left out if the use of select is intended.
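The three ways of addressing an element with .matched() (by index, by string, and by select) can be sketched side by side. This is a minimal illustration under assumed names: the set prefixes, the placeholder backend and the X-* headers are hypothetical.

```vcl
vcl 4.1;

import selector;

# Placeholder backend so that the sketch is a complete VCL.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_init {
    new prefixes = selector.set();
    prefixes.add("/foo/");      # element 1
    prefixes.add("/foo/bar/");  # element 2
}

sub vcl_recv {
    if (prefixes.hasprefix(req.url)) {
        # By index: true if element 2 ("/foo/bar/") was among the
        # matched prefixes.
        if (prefixes.matched(2)) {
            set req.http.X-Matched-Bar = "1";
        }
        # By string: "/quux/" is not in the set, so per the rules
        # above this returns false and does not invoke VCL failure.
        if (prefixes.matched(element="/quux/")) {
            set req.http.X-Matched-Quux = "1";
        }
        # By select: true only if exactly one prefix matched, so
        # retrieval with select=UNIQUE is safe inside this block.
        if (prefixes.matched(select=UNIQUE)) {
            set req.http.X-Prefix = prefixes.element(select=UNIQUE);
        }
    }
}
```

For a URL such as /foo/bar/x, both elements match, so the index check succeeds and the UNIQUE check is expected to return false; for /foo/x only element 1 matches.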
If n <= 0, element is unset, and select is UNIQUE or EXACT, then .matched() returns true if the enum's criteria are met; otherwise it returns false, and does not fail. This can be used as a safeguard for the methods described below, which invoke VCL failure if either of these two enums is specified, but its criteria are not met.

The other enum values (FIRST, LAST, SHORTEST and LONGEST) are included for consistency with the other methods, but they don't make a relevant distinction. If the prior invocation of .match() or .hasprefix() was successful (returned true), then .matched() returns true for each of these, since there is always an element that meets the criteria.

.matched() always returns false if the most recent .match() or .hasprefix() call returned false.

.matched() invokes VCL failure if:

· The n parameter is out of range -- greater than the number of elements in the set.

· There was no prior invocation of .match() or .hasprefix() in the same task scope.

Example:

    if (hosts.match(req.http.Host)) {
        if (hosts.matched(1)) {
            call do_if_the_first_host_element_matched;
        }
    }

    if (url_prefixes.hasprefix(req.url)) {
        if (url_prefixes.matched(select=UNIQUE)) {
            call do_if_a_unique_url_prefix_was_matched;
        }
    }

    if (url_prefixes.hasprefix(bereq.url)) {
        if (url_prefixes.matched(element="/foo/")) {
            call do_if_foo_was_matched;
        }
    }

INT xset.which(ENUM select, STRING element)

    INT xset.which(
        ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE,
        STRING element=0
    )

Return the index of the element indicated by element or select. The numbering corresponds to the order of .add() calls in vcl_init, starting from 1.

If the element parameter is set, then return the numeric index for that string in the set.

If element is unset, then the index is chosen with the select parameter, and refers to the previous .match() or .hasprefix() call for the same set object in the same task scope, according to the rules given above. By default, select is UNIQUE.
If element is unset, and the most recent .match() or .hasprefix() call returned false, return 0. .which() invokes VCL failure if: · The choice of element or select indicates failure, as documented above; that is, if element is a string that is not in the set, or select is UNIQUE or EXACT, but there was no unique or exact match, respectively. · There was no prior invocation of .match() or .hasprefix() in the same task scope. Example: if (myset.hasprefix(req.url)) { if (myset.which(select=SHORTEST) > 1) { call do_if_the_shortest_match_was_not_the_first_element; } } if (myset.which(element=bereq.url) == 1) { call do_if_the_url_was_the_first_element; } STRING xset.element(INT n, ENUM select) STRING xset.element( INT n=0, ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE ) Returns the element of the set indicated by the n and select parameters as described above. Thus if n >= 1, the n-th element of the set is returned; otherwise the matched element indicated by select is returned after calling .match() or .hasprefix(). The string returned is the same as it was added to the set; even if a prior match was case insensitive, and the matched string differs in case, the string with the case as added to the set is returned. .element() invokes VCL failure if the rules for n and select indicate failure; that is: · n is out of range (greater than the number of elements in the set) · n < 1 and select fails for UNIQUE or EXACT · n < 1 and there was no prior invocation of .match() or .hasprefix(). Example: if (myset.hasprefix(req.url)) { # Construct a redirect response for another host, using the # matching prefix in the request URL as the new URL path. 
set resp.http.Location = "http://other.com" + myset.element(); } BACKEND xset.backend(INT n, STRING element, ENUM select) BACKEND xset.backend( INT n=0, STRING element=0, ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE ) Returns the backend associated with the element of the set indicated by n, element and select, according to the rules given above; that is, it returns the backend that was set via the backend parameter in .add(). .backend() invokes VCL failure if: · The rules for n, element and select indicate failure. · No backend was set with the backend parameter in the .add() call corresponding to the selected element. Example: if (myset.hasprefix(bereq.url)) { # Set the backend associated with the string in the set that # forms the longest prefix of the URL set bereq.backend = myset.backend(select=LONGEST); } STRING xset.string(INT n, STRING element, ENUM select) STRING xset.string( INT n=0, STRING element=0, ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE ) Returns the string set by the string parameter for the element of the set indicated by n, element and select, according to the rules given above. .string() invokes VCL failure if: · The rules for n, element and select indicate failure. · No string was set with the string parameter in .add(). Example: # Rewrite the URL if it matches one of the strings in the set. if (myset.match(req.url)) { set req.url = myset.string(); } INT xset.integer(INT n, STRING element, ENUM select) INT xset.integer( INT n=0, STRING element=0, ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE ) Returns the integer set by the integer parameter for the element of the set indicated by n, element and select, according to the rules given above. .integer() invokes VCL failure if: · The rules for n, element and select indicate failure. · No integer was set with the integer parameter in .add(). 
Example:

    # Send a synthetic response if the URL has a prefix in the set,
    # using the response code set in .add().
    if (myset.hasprefix(req.url)) {
        # Check .nmatches() to ensure that select=UNIQUE can be used
        # without risk of VCL failure.
        if (myset.nmatches() == 1) {
            return (synth(myset.integer(select=UNIQUE)));
        }
    }

BOOL xset.bool(INT n, STRING element, ENUM select)

    BOOL xset.bool(
        INT n=0,
        STRING element=0,
        ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
    )

Returns the boolean value set by the bool parameter for the element of the set indicated by n, element and select, according to the rules given above.

.bool() invokes VCL failure if:

· The rules for n, element and select indicate failure.

· No boolean was set with the bool parameter in .add().

Example:

    # Match domains to the Host header, and prepend "www." where
    # necessary.
    sub vcl_init {
        new domains = selector.set();
        domains.add("example.com", bool=true);
        domains.add("www.example.net", bool=false);
        domains.add("example.org", bool=true);
        domains.add("www.example.edu", bool=false);
    }

    sub vcl_recv {
        if (domains.match(req.http.Host)) {
            if (domains.bool()) {
                set req.http.Host = "www." + req.http.Host;
            }
        }
    }

BOOL xset.re_match(STRING subject, INT n, STRING element, ENUM select)

    BOOL xset.re_match(
        STRING subject,
        INT n=0,
        STRING element=0,
        ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
    )

Using the regular expression set by the regex parameter for the element of the set indicated by n, element and select, return the result of matching the regex against subject. The regex match is the same operation performed for the native VCL ~ operator, see vcl(7).

In other words, this method can be used to perform a second match with the saved regular expression, after matching a fixed string against the set.
The regex match is subject to the same conditions imposed for matching in native VCL; in particular, it may be limited by the varnishd parameters pcre_match_limit and pcre_match_limit_recursion (see varnishd(1)).

.re_match() invokes VCL failure if:

· The rules for n, element and select indicate failure.

· No regular expression was set with the regex parameter in .add().

The regex match may fail for any of the reasons that cause a native match to fail. In that case, .re_match() returns false, and a log message with tag VCL_Error is emitted (as for native regex match failures).

Example:

    # If the Host header exactly matches a string in the set, perform a
    # regex match against the URL.
    if (myset.match(req.http.Host)) {
        if (myset.re_match(req.url)) {
            call do_if_the_URL_matches_the_regex_for_Host;
        }
    }

STRING xset.sub(STRING str, STRING sub, BOOL all, INT n, STRING element, ENUM select)

    STRING xset.sub(
        STRING str,
        STRING sub,
        BOOL all=0,
        INT n=0,
        STRING element=0,
        ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
    )

Using the regular expression set by the regex parameter for the element of the set indicated by n, element and select, return the result of a substitution using str and sub.

Note that the method name "sub" refers to string substitution. To retrieve the subroutine set with the sub parameter in .add(), use the .subroutine() method documented below.

If all is false, then return the result of replacing the first portion of str that matches the regex with sub. sub may contain backreferences \0 through \9, to include captured substrings from str in the substitution. This is the same operation performed by the native VCL function regsub(str, regex, sub) (see vcl(7)). By default, all is false.

If all is true, return the result of replacing each non-overlapping portion of str that matches the regex with sub (possibly with backreferences). This is the same operation as native VCL's regsuball(str, regex, sub).
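The difference between all=false and all=true can be sketched with a single-element set. The names here (the set cleanup, the placeholder backend, the header X-First-Only) and the regex are assumptions chosen for illustration only.

```vcl
vcl 4.1;

import selector;

# Placeholder backend so that the sketch is a complete VCL.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_init {
    new cleanup = selector.set();
    # The saved regex matches any run of two or more slashes.
    cleanup.add("/search", regex="//+");
}

sub vcl_recv {
    if (cleanup.hasprefix(req.url)) {
        # all=false (the default): only the first run of slashes is
        # collapsed, comparable to regsub(req.url, "//+", "/").
        set req.http.X-First-Only = cleanup.sub(req.url, "/");

        # all=true: every run of slashes is collapsed, comparable to
        # regsuball(req.url, "//+", "/").
        set req.url = cleanup.sub(req.url, "/", all=true);
    }
}
```

Because the set has only one element, the default select=UNIQUE is sufficient here; with overlapping prefixes, an explicit select (or n or element) would be needed to pick the regex to apply.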
       .sub() invokes VCL failure if:

       · The rules for n, element and select indicate failure.

       · No regular expression was set with the regex parameter in .add().

       The substitution may fail for any of the reasons that cause native
       regsub() or regsuball() to fail. In that case, .sub() returns str,
       and a VCL_Error message is written to the log, as for failures of the
       native substitution functions. As with the native functions, str is
       returned if regex does not match str.

       Example:

          # In this example we match the URL prefix, and if a match is found,
          # rewrite the URL by exchanging path components as indicated.
          sub vcl_init {
              new rewrite = selector.set();
              rewrite.add("/foo/", regex="^(/foo)/([^/]+)/([^/]+)/");
              rewrite.add("/foo/bar/", regex="^(/foo/bar)/([^/]+)/([^/]+)/");
              rewrite.add("/foo/bar/baz/", regex="^(/foo/bar/baz)/([^/]+)/([^/]+)/");
          }

          if (rewrite.hasprefix(req.url)) {
              set req.url = rewrite.sub(req.url, "\1/\3/\2/", select=LAST);
          }

          # /foo/1/2/* is rewritten as /foo/2/1/*
          # /foo/bar/1/2/* is rewritten as /foo/bar/2/1/*
          # /foo/bar/baz/1/2/* is rewritten as /foo/bar/baz/2/1/*

   SUB xset.subroutine(INT n, STRING element, ENUM select)
          SUB xset.subroutine(
                INT n=0,
                STRING element=0,
                ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
          )

       Returns the subroutine set by the sub parameter for the element of
       the set indicated by n, element and select, according to the rules
       given above. The subroutine may be invoked with VCL call.

       Note: you must ensure that the subroutine may be invoked legally in
       the context in which it is called. This means that:

       · The subroutine may only refer to VCL elements that are legal in the
         invocation context. For example, if the subroutine only refers to
         headers in req.http.*, then it may be called in vcl_recv, but not if
         it refers to any header in resp.http.*. See vcl-var(7) for the
         specification of which VCL variables may be used in which contexts.

       · Recursive subroutine calls are not permitted in VCL.
         The subroutine invocation may not appear anywhere in its own call
         stack.

       For standard subroutine invocations with call, the VCL compiler
       checks these conditions and issues a compile-time error if either one
       is violated. This is not possible with invocations using
       .subroutine(); the error can only be determined at runtime. So it is
       advisable to test the use of .subroutine() carefully before using it
       in production. You can use the .check_call() method described below
       to determine if the subroutine call is legal.

       .subroutine() invokes VCL failure if:

       · The rules for n, element and select indicate failure.

       · No subroutine was set with the sub parameter in .add().

       · The subroutine is invoked with call, but the call is not legal in
         the invocation context, for the reasons given above.

       Example:

          # Due to the use of resp.http.*, this subroutine may only be invoked
          # in vcl_deliver or vcl_synth, as documented in vcl-var(7). Note
          # that subroutine definitions must appear before vcl_init to be
          # permitted for the sub parameter in .add().
          sub resp_sub {
              set resp.http.Call-Me = "but only in deliver or synth";
          }

          sub vcl_init {
              new myset = selector.set();
              myset.add("/foo", sub=resp_sub);
              myset.add("/foo/bar", sub=some_other_sub);
              # ...
          }

          sub vcl_deliver {
              # The methods are called on the set object (myset), not on
              # the subroutine.
              if (myset.hasprefix(req.url)) {
                  call myset.subroutine(select=LONGEST);
              }
          }

   BOOL xset.check_call(INT n, STRING element, ENUM select)
          BOOL xset.check_call(
                INT n=0,
                STRING element=0,
                ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
          )

       Returns true iff the subroutine returned by .subroutine() for the
       element of the set indicated by n, element and select may be invoked
       legally in the current context. The conditions for legal invocation
       are documented for .subroutine() above.

       .check_call() never invokes VCL failure, but rather returns false
       under conditions for which the use of .subroutine() would invoke VCL
       failure, as described above.
       In that case, a message is emitted to the Varnish log using the
       Notice tag (the same message that would appear with the VCL_Error tag
       if the subroutine were called).

       Example:

          sub vcl_deliver {
              if (myset.hasprefix(req.url)) {
                  if (myset.check_call(select=LONGEST)) {
                      call myset.subroutine(select=LONGEST);
                  }
                  else {
                      call do_if_resp_sub_is_illegal;
                  }
              }
          }

   STRING version()
       Return the version string for this VMOD.

       Example:

          std.log("Using VMOD selector version: " + selector.version());
STATISTICS
       When .create_stats() is invoked for a set object, statistics are
       created that can be viewed with a tool like varnishstat(1).

       The stats have the following naming schema:

          SELECTOR.<vcl>.<object>.<stat>

       ... where <vcl> is the VCL instance name, <object> is the object
       name, and <stat> is the statistic. So the elements stat of the myset
       object in the VCL instance boot is named:

          SELECTOR.boot.myset.elements

       The statistics describe properties of the set, and their values are
       constant, never changing during the lifetime of the VCL instance.

       Statistics provided by the VMOD include:

       · elements: the number of elements in the set (added via .add())

       · setsz: the total size of the strings in the set -- the sum of the
         lengths of all of the strings, including their terminating null
         bytes

       · minlen: the length of the shortest string in the set

       · maxlen: the length of the longest string in the set

       The remaining stats refer to properties of a set object's internal
       data structures, and depend on the internal implementation. The
       implementation may be changed in any new version of the VMOD, and
       hence the stats may change. These are described further in an
       external document (see STATISTICS in the source repository).

       The stats for a VCL instance are removed from view when the instance
       is set to the cold state, and become visible again when it is set to
       the warm state. They are removed permanently when the VCL instance is
       discarded (see varnish-cli(7)).
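       As a minimal sketch of how statistics might be enabled for a set:
       the object name myset and the strings added to it are hypothetical,
       and calling .create_stats() after all of the .add() calls is an
       assumption made for this example rather than a documented
       requirement.

```vcl
vcl 4.1;

import selector;

# Placeholder so that this sketch is a complete VCL; replace it with
# your real backend definitions.
backend default none;

sub vcl_init {
    # Hypothetical set, for illustration only.
    new myset = selector.set();
    myset.add("/foo/");
    myset.add("/bar/baz/");

    # Assumption: create the stats once the set is fully populated.
    myset.create_stats();
}
```

       If this VCL were loaded under the instance name boot, the naming
       schema above gives stat names such as SELECTOR.boot.myset.elements
       and SELECTOR.boot.myset.setsz. A one-shot listing along the lines of
       varnishstat -1 -f 'SELECTOR.*' should then show them.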
ERRORS
       The method documentation above describes illegal uses for which VCL
       failure is invoked. VCL failure has the same results as if
       return(fail) is called from a VCL subroutine:

       · If the failure occurs in vcl_init, then the VCL load fails with an
         error message.

       · If the failure occurs in any other subroutine besides vcl_synth,
         then a VCL_Error message is written to the log, and control is
         directed immediately to vcl_synth, with resp.status set to 503 and
         resp.reason set to "VCL failed".

       · If the failure occurs in vcl_synth, then vcl_synth is aborted, and
         the response line "503 VCL failed" is sent.

       VCL failure is meant to "fail fast" on conditions that cannot be
       correct, or when resource limitations such as workspace exhaustion
       prevent further processing.

       Depending on your use case, you may be able to use the VMOD's methods
       without additional checking and with no risk of failure. For example,
       if it is known that none of the strings in a set have common
       prefixes, then methods with select=UNIQUE can be used safely after
       calling .hasprefix().

       If you need to check against possible failure conditions:

       · If .nmatches() == 1, then select=UNIQUE can be used safely.

       · The UNIQUE and EXACT conditions can also be checked with
         .matched(select=UNIQUE) and .matched(select=EXACT).

       · The allow_overlaps flag can be set to false in the constructor, to
         ensure that VCL load fails if a set unintentionally has strings
         with common prefixes.

       · In most cases, a method invokes VCL failure if the value of the
         element parameter is not in the set. But element can be used safely
         with any string in .matched() to check if a string matched
         previously -- .matched() returns false if the element is not in the
         set.

       · The .check_call() method may be used to avoid VCL failure if a
         subroutine call using .subroutine() would be illegal.

       See LIMITATIONS for considerations if you encounter conditions such
       as workspace exhaustion.
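       The checks listed above can be combined into a defensive pattern, as
       in the following sketch. The set name routes, the backends app and
       app_admin and their addresses are all hypothetical; the methods used
       are the ones listed in the SYNOPSIS.

```vcl
vcl 4.1;

import selector;

# Hypothetical backends, for illustration only.
backend app { .host = "192.0.2.10"; .port = "8080"; }
backend app_admin { .host = "192.0.2.11"; .port = "8080"; }

sub vcl_init {
    # These two strings share a prefix, so a URL under /app/admin/
    # matches both of them in .hasprefix().
    new routes = selector.set();
    routes.add("/app/", backend=app);
    routes.add("/app/admin/", backend=app_admin);
}

sub vcl_recv {
    if (routes.hasprefix(req.url)) {
        if (routes.matched(select=UNIQUE)) {
            # Exactly one string matched, so UNIQUE cannot fail here.
            set req.backend_hint = routes.backend(select=UNIQUE);
        }
        else {
            # More than one prefix matched; choose one explicitly
            # rather than risk VCL failure with UNIQUE.
            set req.backend_hint = routes.backend(select=LONGEST);
        }
    }
}
```

       In a set like this one, using select=LONGEST unconditionally would
       likely be simpler; the branch is shown only to illustrate the check
       with .matched(select=UNIQUE).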
LOGGING
       Both .match() and .hasprefix() return false when the string to be
       matched is NULL, typically because an unset header was specified.
       Such usage may be deliberate; you might intend VCL logic to depend on
       whether a header either doesn't match or does not exist. But it may
       be an error, for example due to misspelling the header name.

       When the string to be matched is NULL, the VMOD emits a warning to
       the log with the tag Notice, in this format:

          vmod_selector: <obj>.<method>(): subject string is NULL

       ... where <obj> is the object name and <method> is either match or
       hasprefix.

       If .check_call() returns false, indicating that the use of
       .subroutine() would be illegal in that context, then the VMOD emits a
       log message using Notice in this format:

          vmod_selector: <obj>.check_call(): <errmsg>

       ... where <obj> is the object name and <errmsg> is the message that
       would have been logged with VCL_Error if the subroutine were invoked.

       As noted above, VCL failure during request/response transactions
       (after successful VCL load) is logged with an error message using the
       VCL_Error tag. These messages begin with the prefix vmod selector
       failure.
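       If a missing header is an expected condition and the Notice message
       is unwanted, one possible approach is to test for the header before
       matching, as sketched below. The header names X-Tenant and
       X-Known-Tenant and the set name tenants are hypothetical.

```vcl
vcl 4.1;

import selector;

# Placeholder so that this sketch is a complete VCL; replace it with
# your real backend definitions.
backend default none;

sub vcl_init {
    # Hypothetical set of known tenant names.
    new tenants = selector.set();
    tenants.add("alpha");
    tenants.add("beta");
}

sub vcl_recv {
    # Only attempt the match when the header is set, so that the
    # subject string passed to .match() is never NULL.
    if (req.http.X-Tenant) {
        if (tenants.match(req.http.X-Tenant)) {
            set req.http.X-Known-Tenant = "true";
        }
    }
}
```

       Whether to guard the match this way is a design choice: leaving the
       guard out keeps the VCL shorter, and the Notice message can then
       help to reveal a misspelled header name.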
REQUIREMENTS
       The VMOD requires Varnish version 6.6.0 or later. See the source
       repository for versions of the VMOD that are compatible with other
       released versions of Varnish.
INSTALLATION
See INSTALL.rst in the source repository.
LIMITATIONS
       The VMOD uses workspace for two purposes:

       · Saving task-scoped data about a match with .match() and
         .hasprefix(), for use by the methods that retrieve information
         about the prior match. This data is stored separately for each
         object for which a match is executed.

       · A copy of the string to be matched for case insensitive matches
         (the copy is set to all one case).

       The default workspace sizes are usually more than large enough for
       typical usages, but that depends on workspace consumption for other
       purposes. If you find that methods are failing with VCL_Error
       messages indicating "out of space", consider increasing the varnishd
       parameters workspace_client and/or workspace_backend (see
       varnishd(1)).

       Set objects and their internal structures are allocated from the
       heap, and hence are only limited by available RAM.

       The regex methods .re_match() and .sub() use the same internal
       mechanisms as native VCL's ~ operator and the regsub/all() functions,
       and are subject to the same limitations. In particular, they may be
       limited by the varnishd parameters pcre_match_limit and
       pcre_match_limit_recursion, in which case they emit the same
       VCL_Error messages as the native operations. If necessary, adjust
       these parameters as advised in varnishd(1).
SEE ALSO
       · varnishd(1)

       · vcl(7)

       · vcl-var(7)

       · varnishstat(1)

       · varnishlog(1)

       · varnish-cli(7)

       · VMOD source repository:
         https://code.uplex.de/uplex-varnish/libvmod-selector

       · Gitlab mirror: https://gitlab.com/uplex/varnish/libvmod-selector

       · VMOD re2: https://code.uplex.de/uplex-varnish/libvmod-re2
COPYRIGHT
       Copyright (c) 2018 UPLEX Nils Goroll Systemoptimierung
       All rights reserved

       Author: Geoffrey Simmons <geoffrey.simmons@uplex.de>

       See LICENSE

VMOD_SELECTOR(3)