Provided by: erlang-manpages_18.3-dfsg-1ubuntu3.1_all 

NAME
qlc - Query Interface to Mnesia, ETS, Dets, etc
DESCRIPTION
The qlc module provides a query interface to Mnesia, ETS, Dets and other data structures that implement
an iterator style traversal of objects.
OVERVIEW
The qlc module implements a query interface to QLC tables. Typical QLC tables are ETS, Dets, and Mnesia
tables. There is also support for user defined tables, see the Implementing a QLC table section. A query
is stated using Query List Comprehensions (QLCs). The answers to a query are determined by data in QLC
tables that fulfill the constraints expressed by the QLCs of the query. QLCs are similar to ordinary list
comprehensions as described in the Erlang Reference Manual and Programming Examples except that variables
introduced in patterns cannot be used in list expressions. In fact, in the absence of optimizations and
options such as cache and unique (see below), every QLC free of QLC tables evaluates to the same list of
answers as the identical ordinary list comprehension.
While ordinary list comprehensions evaluate to lists, calling qlc:q/1,2 returns a Query Handle. To
obtain all the answers to a query, qlc:eval/1,2 should be called with the query handle as first argument.
Query handles are essentially functional objects ("funs") created in the module calling q/1,2. As the
funs refer to the module's code, one should be careful not to keep query handles too long if the module's
code is to be replaced. Code replacement is described in the Erlang Reference Manual. The list of answers
can also be traversed in chunks by use of a Query Cursor. Query cursors are created by calling
qlc:cursor/1,2 with a query handle as first argument. Query cursors are essentially Erlang processes. One
answer at a time is sent from the query cursor process to the process that created the cursor.
SYNTAX
Syntactically QLCs have the same parts as ordinary list comprehensions:
[Expression || Qualifier1, Qualifier2, ...]
Expression (the template) is an arbitrary Erlang expression. Qualifiers are either filters or generators.
Filters are Erlang expressions returning bool(). Generators have the form Pattern <- ListExpression,
where ListExpression is an expression evaluating to a query handle or a list. Query handles are returned
from qlc:table/2, qlc:append/1,2, qlc:sort/1,2, qlc:keysort/2,3, qlc:q/1,2, and
qlc:string_to_handle/1,2,3.
EVALUATION
The evaluation of a query handle begins by the inspection of options and the collection of information
about tables. As a result qualifiers are modified during the optimization phase. Next all list
expressions are evaluated. If a cursor has been created evaluation takes place in the cursor process. For
those list expressions that are QLCs, the list expressions of the QLCs' generators are evaluated as well.
One has to be careful if list expressions have side effects since the order in which list expressions are
evaluated is unspecified. Finally the answers are found by evaluating the qualifiers from left to right,
backtracking when some filter returns false, or collecting the template when all filters return true.
Filters that do not return bool() but fail are handled differently depending on their syntax: if the
filter is a guard it returns false, otherwise the query evaluation fails. This behavior makes it possible
for the qlc module to do some optimizations without affecting the meaning of a query. For example, when
testing some position of a table and one or more constants for equality, only the objects with equal
values are candidates for further evaluation. The other objects are guaranteed to make the filter return
false, but never fail. The (small) set of candidate objects can often be found by looking up some key
values of the table or by traversing the table using a match specification. It is necessary to place the
guard filters immediately after the table's generator, otherwise the candidate objects will not be
restricted to a small set. The reason is that objects that could make the query evaluation fail must not
be excluded by looking up a key or running a match specification.
JOIN
The qlc module supports fast join of two query handles. Fast join is possible if some position P1 of one
query handler and some position P2 of another query handler are tested for equality. Two fast join
methods have been implemented:
* Lookup join traverses all objects of one query handle and finds objects of the other handle (a QLC
table) such that the values at P1 and P2 match or compare equal. The qlc module does not create any
indices but looks up values using the key position and the indexed positions of the QLC table.
* Merge join sorts the objects of each query handle if necessary and filters out objects where the
values at P1 and P2 do not compare equal. If there are many objects with the same value of P2 a
temporary file will be used for the equivalence classes.
The qlc module warns at compile time if a QLC combines query handles in such a way that more than one
join is possible. In other words, there is no query planner that can choose a good order between possible
join operations. It is up to the user to order the joins by introducing query handles.
The join is to be expressed as a guard filter. The filter must be placed immediately after the two joined
generators, possibly after guard filters that use variables from no other generators but the two joined
generators. The qlc module inspects the operands of =:=/2, ==/2, is_record/2, element/2, and logical
operators (and/2, or/2, andalso/2, orelse/2, xor/2) when determining which joins to consider.
COMMON OPTIONS
The following options are accepted by cursor/2, eval/2, fold/4, and info/2:
* {cache_all, Cache} where Cache is equal to ets or list adds a {cache, Cache} option to every list
expression of the query except tables and lists. Default is {cache_all, no}. The option cache_all is
equivalent to {cache_all, ets}.
* {max_list_size, MaxListSize} where MaxListSize is the size in bytes of terms on the external format.
If the accumulated size of collected objects exceeds MaxListSize the objects are written onto a
temporary file. This option is used by the {cache, list} option as well as by the merge join method.
Default is 512*1024 bytes.
* {tmpdir_usage, TmpFileUsage} determines the action taken when qlc is about to create temporary files
on the directory set by the tmpdir option. If the value is not_allowed an error tuple is returned,
otherwise temporary files are created as needed. Default is allowed which means that no further
action is taken. The values info_msg, warning_msg, and error_msg mean that the function with the
corresponding name in the module error_logger is called for printing some information (currently the
stacktrace).
* {tmpdir, TempDirectory} sets the directory used by merge join for temporary files and by the {cache,
list} option. The option also overrides the tmpdir option of keysort/3 and sort/2. The default value
is "" which means that the directory returned by file:get_cwd() is used.
* {unique_all, true} adds a {unique, true} option to every list expression of the query. Default is
{unique_all, false}. The option unique_all is equivalent to {unique_all, true}.
GETTING STARTED
As already mentioned queries are stated in the list comprehension syntax as described in the Erlang
Reference Manual. In the following some familiarity with list comprehensions is assumed. There are
examples in Programming Examples that can get you started. It should be stressed that list comprehensions
do not add any computational power to the language; anything that can be done with list comprehensions
can also be done without them. But they add a syntax for expressing simple search problems which is
compact and clear once you get used to it.
Many list comprehension expressions can be evaluated by the qlc module. Exceptions are expressions such
that variables introduced in patterns (or filters) are used in some generator later in the list
comprehension. As an example consider an implementation of lists:append(L): [X ||Y <- L, X <- Y]. Y is
introduced in the first generator and used in the second. The ordinary list comprehension is normally to
be preferred when there is a choice as to which to use. One difference is that qlc:eval/1,2 collects
answers in a list which is finally reversed, while list comprehensions collect answers on the stack which
is finally unwound.
What the qlc module primarily adds to list comprehensions is that data can be read from QLC tables in
small chunks. A QLC table is created by calling qlc:table/2. Usually qlc:table/2 is not called directly
from the query but via an interface function of some data structure. There are a few examples of such
functions in Erlang/OTP: mnesia:table/1,2, ets:table/1,2, and dets:table/1,2. For a given data structure
there can be several functions that create QLC tables, but common for all these functions is that they
return a query handle created by qlc:table/2. Using the QLC tables provided by OTP is probably sufficient
in most cases, but for the more advanced user the section Implementing a QLC table describes the
implementation of a function calling qlc:table/2.
Besides qlc:table/2 there are other functions that return query handles. They might not be used as often
as tables, but are useful from time to time. qlc:append traverses objects from several tables or lists
after each other. If, for instance, you want to traverse all answers to a query QH and then finish off by
a term {finished}, you can do that by calling qlc:append(QH, [{finished}]). append first returns all
objects of QH, then {finished}. If there is one tuple {finished} among the answers to QH it will be
returned twice from append.
As another example, consider concatenating the answers to two queries QH1 and QH2 while removing all
duplicates. The means to accomplish this is to use the unique option:
qlc:q([X || X <- qlc:append(QH1, QH2)], {unique, true})
The cost is substantial: every returned answer will be stored in an ETS table. Before returning an answer
it is looked up in the ETS table to check if it has already been returned. Without the unique options all
answers to QH1 would be returned followed by all answers to QH2. The unique options keeps the order
between the remaining answers.
If the order of the answers is not important there is the alternative to sort the answers uniquely:
qlc:sort(qlc:q([X || X <- qlc:append(QH1, QH2)], {unique, true})).
This query also removes duplicates but the answers will be sorted. If there are many answers temporary
files will be used. Note that in order to get the first unique answer all answers have to be found and
sorted. Both alternatives find duplicates by comparing answers, that is, if A1 and A2 are answers found
in that order, then A2 is a removed if A1 == A2.
To return just a few answers cursors can be used. The following code returns no more than five answers
using an ETS table for storing the unique answers:
C = qlc:cursor(qlc:q([X || X <- qlc:append(QH1, QH2)],{unique,true})),
R = qlc:next_answers(C, 5),
ok = qlc:delete_cursor(C),
R.
Query list comprehensions are convenient for stating constraints on data from two or more tables. An
example that does a natural join on two query handles on position 2:
qlc:q([{X1,X2,X3,Y1} ||
{X1,X2,X3} <- QH1,
{Y1,Y2} <- QH2,
X2 =:= Y2])
The qlc module will evaluate this differently depending on the query handles QH1 and QH2. If, for
example, X2 is matched against the key of a QLC table the lookup join method will traverse the objects of
QH2 while looking up key values in the table. On the other hand, if neither X2 nor Y2 is matched against
the key or an indexed position of a QLC table, the merge join method will make sure that QH1 and QH2 are
both sorted on position 2 and next do the join by traversing the objects one by one.
The join option can be used to force the qlc module to use a certain join method. For the rest of this
section it is assumed that the excessively slow join method called "nested loop" has been chosen:
qlc:q([{X1,X2,X3,Y1} ||
{X1,X2,X3} <- QH1,
{Y1,Y2} <- QH2,
X2 =:= Y2],
{join, nested_loop})
In this case the filter will be applied to every possible pair of answers to QH1 and QH2, one at a time.
If there are M answers to QH1 and N answers to QH2 the filter will be run M*N times.
If QH2 is a call to the function for gb_trees as defined in the Implementing a QLC table section,
gb_table:table/1, the iterator for the gb-tree will be initiated for each answer to QH1 after which the
objects of the gb-tree will be returned one by one. This is probably the most efficient way of traversing
the table in that case since it takes minimal computational power to get the following object. But if QH2
is not a table but a more complicated QLC, it can be more efficient use some RAM memory for collecting
the answers in a cache, particularly if there are only a few answers. It must then be assumed that
evaluating QH2 has no side effects so that the meaning of the query does not change if QH2 is evaluated
only once. One way of caching the answers is to evaluate QH2 first of all and substitute the list of
answers for QH2 in the query. Another way is to use the cache option. It is stated like this:
QH2' = qlc:q([X || X <- QH2], {cache, ets})
or just
QH2' = qlc:q([X || X <- QH2], cache)
The effect of the cache option is that when the generator QH2' is run the first time every answer is
stored in an ETS table. When next answer of QH1 is tried, answers to QH2' are copied from the ETS table
which is very fast. As for the unique option the cost is a possibly substantial amount of RAM memory. The
{cache, list} option offers the possibility to store the answers in a list on the process heap. While
this has the potential of being faster than ETS tables since there is no need to copy answers from the
table it can often result in slower evaluation due to more garbage collections of the process' heap as
well as increased RAM memory consumption due to larger heaps. Another drawback with cache lists is that
if the size of the list exceeds a limit a temporary file will be used. Reading the answers from a file is
very much slower than copying them from an ETS table. But if the available RAM memory is scarce setting
the limit to some low value is an alternative.
There is an option cache_all that can be set to ets or list when evaluating a query. It adds a cache or
{cache, list} option to every list expression except QLC tables and lists on all levels of the query.
This can be used for testing if caching would improve efficiency at all. If the answer is yes further
testing is needed to pinpoint the generators that should be cached.
IMPLEMENTING A QLC TABLE
As an example of how to use the qlc:table/2 function the implementation of a QLC table for the gb_trees
module is given:
-module(gb_table).
-export([table/1]).
table(T) ->
TF = fun() -> qlc_next(gb_trees:next(gb_trees:iterator(T))) end,
InfoFun = fun(num_of_objects) -> gb_trees:size(T);
(keypos) -> 1;
(is_sorted_key) -> true;
(is_unique_objects) -> true;
(_) -> undefined
end,
LookupFun =
fun(1, Ks) ->
lists:flatmap(fun(K) ->
case gb_trees:lookup(K, T) of
{value, V} -> [{K,V}];
none -> []
end
end, Ks)
end,
FormatFun =
fun({all, NElements, ElementFun}) ->
ValsS = io_lib:format("gb_trees:from_orddict(~w)",
[gb_nodes(T, NElements, ElementFun)]),
io_lib:format("gb_table:table(~s)", [ValsS]);
({lookup, 1, KeyValues, _NElements, ElementFun}) ->
ValsS = io_lib:format("gb_trees:from_orddict(~w)",
[gb_nodes(T, infinity, ElementFun)]),
io_lib:format("lists:flatmap(fun(K) -> "
"case gb_trees:lookup(K, ~s) of "
"{value, V} -> [{K,V}];none -> [] end "
"end, ~w)",
[ValsS, [ElementFun(KV) || KV <- KeyValues]])
end,
qlc:table(TF, [{info_fun, InfoFun}, {format_fun, FormatFun},
{lookup_fun, LookupFun},{key_equality,'=='}]).
qlc_next({X, V, S}) ->
[{X,V} | fun() -> qlc_next(gb_trees:next(S)) end];
qlc_next(none) ->
[].
gb_nodes(T, infinity, ElementFun) ->
gb_nodes(T, -1, ElementFun);
gb_nodes(T, NElements, ElementFun) ->
gb_iter(gb_trees:iterator(T), NElements, ElementFun).
gb_iter(_I, 0, _EFun) ->
'...';
gb_iter(I0, N, EFun) ->
case gb_trees:next(I0) of
{X, V, I} ->
[EFun({X,V}) | gb_iter(I, N-1, EFun)];
none ->
[]
end.
TF is the traversal function. The qlc module requires that there is a way of traversing all objects of
the data structure; in gb_trees there is an iterator function suitable for that purpose. Note that for
each object returned a new fun is created. As long as the list is not terminated by [] it is assumed that
the tail of the list is a nullary function and that calling the function returns further objects (and
functions).
The lookup function is optional. It is assumed that the lookup function always finds values much faster
than it would take to traverse the table. The first argument is the position of the key. Since qlc_next
returns the objects as {Key, Value} pairs the position is 1. Note that the lookup function should return
{Key, Value} pairs, just as the traversal function does.
The format function is also optional. It is called by qlc:info to give feedback at runtime of how the
query will be evaluated. One should try to give as good feedback as possible without showing too much
details. In the example at most 7 objects of the table are shown. The format function handles two cases:
all means that all objects of the table will be traversed; {lookup, 1, KeyValues} means that the lookup
function will be used for looking up key values.
Whether the whole table will be traversed or just some keys looked up depends on how the query is stated.
If the query has the form
qlc:q([T || P <- LE, F])
and P is a tuple, the qlc module analyzes P and F in compile time to find positions of the tuple P that
are tested for equality to constants. If such a position at runtime turns out to be the key position, the
lookup function can be used, otherwise all objects of the table have to be traversed. It is the info
function InfoFun that returns the key position. There can be indexed positions as well, also returned by
the info function. An index is an extra table that makes lookup on some position fast. Mnesia maintains
indices upon request, thereby introducing so called secondary keys. The qlc module prefers to look up
objects using the key before secondary keys regardless of the number of constants to look up.
KEY EQUALITY
In Erlang there are two operators for testing term equality, namely ==/2 and =:=/2. The difference
between them is all about the integers that can be represented by floats. For instance, 2 == 2.0
evaluates to true while 2 =:= 2.0 evaluates to false. Normally this is a minor issue, but the qlc module
cannot ignore the difference, which affects the user's choice of operators in QLCs.
If the qlc module can find out at compile time that some constant is free of integers, it does not matter
which one of ==/2 or =:=/2 is used:
1> E1 = ets:new(t, [set]), % uses =:=/2 for key equality
Q1 = qlc:q([K ||
{K} <- ets:table(E1),
K == 2.71 orelse K == a]),
io:format("~s~n", [qlc:info(Q1)]).
ets:match_spec_run(lists:flatmap(fun(V) ->
ets:lookup(20493, V)
end,
[a,2.71]),
ets:match_spec_compile([{{'$1'},[],['$1']}]))
In the example the ==/2 operator has been handled exactly as =:=/2 would have been handled. On the other
hand, if it cannot be determined at compile time that some constant is free of integers and the table
uses =:=/2 when comparing keys for equality (see the option key_equality), the qlc module will not try to
look up the constant. The reason is that there is in the general case no upper limit on the number of key
values that can compare equal to such a constant; every combination of integers and floats has to be
looked up:
2> E2 = ets:new(t, [set]),
true = ets:insert(E2, [{{2,2},a},{{2,2.0},b},{{2.0,2},c}]),
F2 = fun(I) ->
qlc:q([V || {K,V} <- ets:table(E2), K == I])
end,
Q2 = F2({2,2}),
io:format("~s~n", [qlc:info(Q2)]).
ets:table(53264,
[{traverse,
{select,[{{'$1','$2'},[{'==','$1',{const,{2,2}}}],['$2']}]}}])
3> lists:sort(qlc:e(Q2)).
[a,b,c]
Looking up just {2,2} would not return b and c.
If the table uses ==/2 when comparing keys for equality, the qlc module will look up the constant
regardless of which operator is used in the QLC. However, ==/2 is to be preferred:
4> E3 = ets:new(t, [ordered_set]), % uses ==/2 for key equality
true = ets:insert(E3, [{{2,2.0},b}]),
F3 = fun(I) ->
qlc:q([V || {K,V} <- ets:table(E3), K == I])
end,
Q3 = F3({2,2}),
io:format("~s~n", [qlc:info(Q3)]).
ets:match_spec_run(ets:lookup(86033, {2,2}),
ets:match_spec_compile([{{'$1','$2'},[],['$2']}]))
5> qlc:e(Q3).
[b]
Lookup join is handled analogously to lookup of constants in a table: if the join operator is ==/2 and
the table where constants are to be looked up uses =:=/2 when testing keys for equality, the qlc module
will not consider lookup join for that table.
DATA TYPES
abstract_expr() = erl_parse:abstract_expr()
Parse trees for Erlang expression, see the abstract format documentation in the ERTS User's Guide.
answer() = term()
answers() = [answer()]
cache() = ets | list | no
match_expression() = ets:match_spec()
Match specification, see the match specification documentation in the ERTS User's Guide and
ms_transform(3erl).
no_files() = integer() >= 1
Actually an integer > 1.
key_pos() = integer() >= 1 | [integer() >= 1]
max_list_size() = integer() >= 0
order() = ascending | descending | order_fun()
order_fun() = fun((term(), term()) -> boolean())
query_cursor()
A query cursor.
query_handle()
A query handle.
query_handle_or_list() = query_handle() | list()
query_list_comprehension() = term()
A literal query list comprehension.
spawn_options() = default | [proc_lib:spawn_option()]
sort_options() = [sort_option()] | sort_option()
sort_option() =
{compressed, boolean()} |
{no_files, no_files()} |
{order, order()} |
{size, integer() >= 1} |
{tmpdir, tmp_directory()} |
{unique, boolean()}
See file_sorter(3erl).
tmp_directory() = [] | file:name()
tmp_file_usage() =
allowed | not_allowed | info_msg | warning_msg | error_msg
EXPORTS
append(QHL) -> QH
Types:
QHL = [query_handle_or_list()]
QH = query_handle()
Returns a query handle. When evaluating the query handle QH all answers to the first query handle
in QHL are returned followed by all answers to the rest of the query handles in QHL.
append(QH1, QH2) -> QH3
Types:
QH1 = QH2 = query_handle_or_list()
QH3 = query_handle()
Returns a query handle. When evaluating the query handle QH3 all answers to QH1 are returned
followed by all answers to QH2.
append(QH1, QH2) is equivalent to append([QH1, QH2]).
cursor(QH) -> Cursor
cursor(QH, Options) -> Cursor
Types:
QH = query_handle_or_list()
Options = [Option] | Option
Option =
{cache_all, cache()} |
cache_all |
{max_list_size, max_list_size()} |
{spawn_options, spawn_options()} |
{tmpdir_usage, tmp_file_usage()} |
{tmpdir, tmp_directory()} |
{unique_all, boolean()} |
unique_all
Cursor = query_cursor()
Creates a query cursor and makes the calling process the owner of the cursor. The cursor is to be
used as argument to next_answers/1,2 and (eventually) delete_cursor/1. Calls erlang:spawn_opt to
spawn and link a process which will evaluate the query handle. The value of the option
spawn_options is used as last argument when calling spawn_opt. The default value is [link].
1> QH = qlc:q([{X,Y} || X <- [a,b], Y <- [1,2]]),
QC = qlc:cursor(QH),
qlc:next_answers(QC, 1).
[{a,1}]
2> qlc:next_answers(QC, 1).
[{a,2}]
3> qlc:next_answers(QC, all_remaining).
[{b,1},{b,2}]
4> qlc:delete_cursor(QC).
ok
cursor(QH) is equivalent to cursor(QH, []).
delete_cursor(QueryCursor) -> ok
Types:
QueryCursor = query_cursor()
Deletes a query cursor. Only the owner of the cursor can delete the cursor.
eval(QH) -> Answers | Error
eval(QH, Options) -> Answers | Error
e(QH) -> Answers | Error
e(QH, Options) -> Answers | Error
Types:
QH = query_handle_or_list()
Options = [Option] | Option
Option =
{cache_all, cache()} |
cache_all |
{max_list_size, max_list_size()} |
{tmpdir_usage, tmp_file_usage()} |
{tmpdir, tmp_directory()} |
{unique_all, boolean()} |
unique_all
Answers = answers()
Error = {error, module(), Reason}
Reason = file_sorter:reason()
Evaluates a query handle in the calling process and collects all answers in a list.
1> QH = qlc:q([{X,Y} || X <- [a,b], Y <- [1,2]]),
qlc:eval(QH).
[{a,1},{a,2},{b,1},{b,2}]
eval(QH) is equivalent to eval(QH, []).
fold(Function, Acc0, QH) -> Acc1 | Error
fold(Function, Acc0, QH, Options) -> Acc1 | Error
Types:
QH = query_handle_or_list()
Function = fun((answer(), AccIn) -> AccOut)
Acc0 = Acc1 = AccIn = AccOut = term()
Options = [Option] | Option
Option =
{cache_all, cache()} |
cache_all |
{max_list_size, max_list_size()} |
{tmpdir_usage, tmp_file_usage()} |
{tmpdir, tmp_directory()} |
{unique_all, boolean()} |
unique_all
Error = {error, module(), Reason}
Reason = file_sorter:reason()
Calls Function on successive answers to the query handle together with an extra argument AccIn.
The query handle and the function are evaluated in the calling process. Function must return a new
accumulator which is passed to the next call. Acc0 is returned if there are no answers to the
query handle.
1> QH = [1,2,3,4,5,6],
qlc:fold(fun(X, Sum) -> X + Sum end, 0, QH).
21
fold(Function, Acc0, QH) is equivalent to fold(Function, Acc0, QH, []).
format_error(Error) -> Chars
Types:
Error = {error, module(), term()}
Chars = io_lib:chars()
Returns a descriptive string in English of an error tuple returned by some of the functions of the
qlc module or the parse transform. This function is mainly used by the compiler invoking the parse
transform.
info(QH) -> Info
info(QH, Options) -> Info
Types:
QH = query_handle_or_list()
Options = [Option] | Option
Option = EvalOption | ReturnOption
EvalOption =
{cache_all, cache()} |
cache_all |
{max_list_size, max_list_size()} |
{tmpdir_usage, tmp_file_usage()} |
{tmpdir, tmp_directory()} |
{unique_all, boolean()} |
unique_all
ReturnOption =
{depth, Depth} |
{flat, boolean()} |
{format, Format} |
{n_elements, NElements}
Depth = infinity | integer() >= 0
Format = abstract_code | string
NElements = infinity | integer() >= 1
Info = abstract_expr() | string()
Returns information about a query handle. The information describes the simplifications and
optimizations that are the results of preparing the query for evaluation. This function is
probably useful mostly during debugging.
The information has the form of an Erlang expression where QLCs most likely occur. Depending on
the format functions of mentioned QLC tables it may not be absolutely accurate.
The default is to return a sequence of QLCs in a block, but if the option {flat, false} is given,
one single QLC is returned. The default is to return a string, but if the option {format,
abstract_code} is given, abstract code is returned instead. In the abstract code port identifiers,
references, and pids are represented by strings. The default is to return all elements in lists,
but if the {n_elements, NElements} option is given, only a limited number of elements are
returned. The default is to show all of objects and match specifications, but if the {depth,
Depth} option is given, parts of terms below a certain depth are replaced by '...'.
1> QH = qlc:q([{X,Y} || X <- [x,y], Y <- [a,b]]),
io:format("~s~n", [qlc:info(QH, unique_all)]).
begin
V1 =
qlc:q([
SQV ||
SQV <- [x,y]
],
[{unique,true}]),
V2 =
qlc:q([
SQV ||
SQV <- [a,b]
],
[{unique,true}]),
qlc:q([
{X,Y} ||
X <- V1,
Y <- V2
],
[{unique,true}])
end
In this example two simple QLCs have been inserted just to hold the {unique, true} option.
1> E1 = ets:new(e1, []),
E2 = ets:new(e2, []),
true = ets:insert(E1, [{1,a},{2,b}]),
true = ets:insert(E2, [{a,1},{b,2}]),
Q = qlc:q([{X,Z,W} ||
{X, Z} <- ets:table(E1),
{W, Y} <- ets:table(E2),
X =:= Y]),
io:format("~s~n", [qlc:info(Q)]).
begin
V1 =
qlc:q([
P0 ||
P0 = {W,Y} <- ets:table(17)
]),
V2 =
qlc:q([
[G1|G2] ||
G2 <- V1,
G1 <- ets:table(16),
element(2, G1) =:= element(1, G2)
],
[{join,lookup}]),
qlc:q([
{X,Z,W} ||
[{X,Z}|{W,Y}] <- V2
])
end
In this example the query list comprehension V2 has been inserted to show the joined generators
and the join method chosen. A convention is used for lookup join: the first generator (G2) is the
one traversed, the second one (G1) is the table where constants are looked up.
info(QH) is equivalent to info(QH, []).
keysort(KeyPos, QH1) -> QH2
keysort(KeyPos, QH1, SortOptions) -> QH2
Types:
KeyPos = key_pos()
SortOptions = sort_options()
QH1 = query_handle_or_list()
QH2 = query_handle()
Returns a query handle. When evaluating the query handle QH2 the answers to the query handle QH1
are sorted by file_sorter:keysort/4 according to the options.
The sorter will use temporary files only if QH1 does not evaluate to a list and the size of the
binary representation of the answers exceeds Size bytes, where Size is the value of the size
option.
keysort(KeyPos, QH1) is equivalent to keysort(KeyPos, QH1, []).
next_answers(QueryCursor) -> Answers | Error
next_answers(QueryCursor, NumberOfAnswers) -> Answers | Error
Types:
QueryCursor = query_cursor()
Answers = answers()
NumberOfAnswers = all_remaining | integer() >= 1
Error = {error, module(), Reason}
Reason = file_sorter:reason()
Returns some or all of the remaining answers to a query cursor. Only the owner of QueryCursor can
retrieve answers.
The optional argument NumberOfAnswersdetermines the maximum number of answers returned. The
default value is 10. If less than the requested number of answers is returned, subsequent calls to
next_answers will return [].
q(QLC) -> QH
q(QLC, Options) -> QH
Types:
QH = query_handle()
Options = [Option] | Option
Option =
{max_lookup, MaxLookup} |
{cache, cache()} |
cache |
{join, Join} |
{lookup, Lookup} |
{unique, boolean()} |
unique
MaxLookup = integer() >= 0 | infinity
Join = any | lookup | merge | nested_loop
Lookup = boolean() | any
QLC = query_list_comprehension()
Returns a query handle for a query list comprehension. The query list comprehension must be the
first argument to qlc:q/1,2 or it will be evaluated as an ordinary list comprehension. It is also
necessary to add the line
-include_lib("stdlib/include/qlc.hrl").
to the source file. This causes a parse transform to substitute a fun for the query list
comprehension. The (compiled) fun will be called when the query handle is evaluated.
When calling qlc:q/1,2 from the Erlang shell the parse transform is automatically called. When
this happens the fun substituted for the query list comprehension is not compiled but will be
evaluated by erl_eval(3erl). This is also true when expressions are evaluated by means of
file:eval/1,2 or in the debugger.
To be very explicit, this will not work:
...
A = [X || {X} <- [{1},{2}]],
QH = qlc:q(A),
...
The variable A will be bound to the evaluated value of the list comprehension ([1,2]). The
compiler complains with an error message ("argument is not a query list comprehension"); the shell
process stops with a badarg reason.
q(QLC) is equivalent to q(QLC, []).
The {cache, ets} option can be used to cache the answers to a query list comprehension. The
answers are stored in one ETS table for each cached query list comprehension. When a cached query
list comprehension is evaluated again, answers are fetched from the table without any further
computations. As a consequence, when all answers to a cached query list comprehension have been
found, the ETS tables used for caching answers to the query list comprehension's qualifiers can be
emptied. The option cache is equivalent to {cache, ets}.
The {cache, list} option can be used to cache the answers to a query list comprehension just like
{cache, ets}. The difference is that the answers are kept in a list (on the process heap). If the
answers would occupy more than a certain amount of RAM memory a temporary file is used for storing
the answers. The option max_list_size sets the limit in bytes and the temporary file is put on the
directory set by the tmpdir option.
The cache option has no effect if it is known that the query list comprehension will be evaluated
at most once. This is always true for the top-most query list comprehension and also for the list
expression of the first generator in a list of qualifiers. Note that in the presence of side
effects in filters or callback functions the answers to query list comprehensions can be affected
by the cache option.
The {unique, true} option can be used to remove duplicate answers to a query list comprehension.
The unique answers are stored in one ETS table for each query list comprehension. The table is
emptied every time it is known that there are no more answers to the query list comprehension. The
option unique is equivalent to {unique, true}. If the unique option is combined with the {cache,
ets} option, two ETS tables are used, but the full answers are stored in one table only. If the
unique option is combined with the {cache, list} option the answers are sorted twice using
keysort/3; once to remove duplicates, and once to restore the order.
The cache and unique options apply not only to the query list comprehension itself but also to the
results of looking up constants, running match specifications, and joining handles.
1> Q = qlc:q([{A,X,Z,W} ||
A <- [a,b,c],
{X,Z} <- [{a,1},{b,4},{c,6}],
{W,Y} <- [{2,a},{3,b},{4,c}],
X =:= Y],
{cache, list}),
io:format("~s~n", [qlc:info(Q)]).
begin
V1 =
qlc:q([
P0 ||
P0 = {X,Z} <-
qlc:keysort(1, [{a,1},{b,4},{c,6}], [])
]),
V2 =
qlc:q([
P0 ||
P0 = {W,Y} <-
qlc:keysort(2, [{2,a},{3,b},{4,c}], [])
]),
V3 =
qlc:q([
[G1|G2] ||
G1 <- V1,
G2 <- V2,
element(1, G1) == element(2, G2)
],
[{join,merge},{cache,list}]),
qlc:q([
{A,X,Z,W} ||
A <- [a,b,c],
[{X,Z}|{W,Y}] <- V3,
X =:= Y
])
end
In this example the cached results of the merge join are traversed for each value of A. Note that
without the cache option the join would have been carried out three times, once for each value of
A
sort/1,2 and keysort/2,3 can also be used for caching answers and for removing duplicates. When
sorting answers are cached in a list, possibly stored on a temporary file, and no ETS tables are
used.
Sometimes (see qlc:table/2 below) traversal of tables can be done by looking up key values, which
is assumed to be fast. Under certain (rare) circumstances it could happen that there are too many
key values to look up. The {max_lookup, MaxLookup} option can then be used to limit the number of
lookups: if more than MaxLookup lookups would be required no lookups are done but the table
traversed instead. The default value is infinity which means that there is no limit on the number
of keys to look up.
1> T = gb_trees:empty(),
QH = qlc:q([X || {{X,Y},_} <- gb_table:table(T),
((X == 1) or (X == 2)) andalso
((Y == a) or (Y == b) or (Y == c))]),
io:format("~s~n", [qlc:info(QH)]).
ets:match_spec_run(
lists:flatmap(fun(K) ->
case
gb_trees:lookup(K,
gb_trees:from_orddict([]))
of
{value,V} ->
[{K,V}];
none ->
[]
end
end,
[{1,a},{1,b},{1,c},{2,a},{2,b},{2,c}]),
ets:match_spec_compile([{{{'$1','$2'},'_'},[],['$1']}]))
In this example using the gb_table module from the Implementing a QLC table section there are six
keys to look up: {1,a}, {1,b}, {1,c}, {2,a}, {2,b}, and {2,c}. The reason is that the two elements
of the key {X, Y} are compared separately.
The {lookup, true} option can be used to ensure that the qlc module will look up constants in some
QLC table. If there are more than one QLC table among the generators' list expressions, constants
have to be looked up in at least one of the tables. The evaluation of the query fails if there are
no constants to look up. This option is useful in situations when it would be unacceptable to
traverse all objects in some table. Setting the lookup option to false ensures that no constants
will be looked up ({max_lookup, 0} has the same effect). The default value is any which means that
constants will be looked up whenever possible.
The {join, Join} option can be used to ensure that a certain join method will be used: {join,
lookup} invokes the lookup join method; {join, merge} invokes the merge join method; and {join,
nested_loop} invokes the method of matching every pair of objects from two handles. The last
method is mostly very slow. The evaluation of the query fails if the qlc module cannot carry out
the chosen join method. The default value is any which means that some fast join method will be
used if possible.
sort(QH1) -> QH2
sort(QH1, SortOptions) -> QH2
Types:
SortOptions = sort_options()
QH1 = query_handle_or_list()
QH2 = query_handle()
Returns a query handle. When evaluating the query handle QH2 the answers to the query handle QH1
are sorted by file_sorter:sort/3 according to the options.
The sorter will use temporary files only if QH1 does not evaluate to a list and the size of the
binary representation of the answers exceeds Size bytes, where Size is the value of the size
option.
sort(QH1) is equivalent to sort(QH1, []).
string_to_handle(QueryString) -> QH | Error
string_to_handle(QueryString, Options) -> QH | Error
string_to_handle(QueryString, Options, Bindings) -> QH | Error
Types:
QueryString = string()
Options = [Option] | Option
Option =
{max_lookup, MaxLookup} |
{cache, cache()} |
cache |
{join, Join} |
{lookup, Lookup} |
{unique, boolean()} |
unique
MaxLookup = integer() >= 0 | infinity
Join = any | lookup | merge | nested_loop
Lookup = boolean() | any
Bindings = erl_eval:binding_struct()
QH = query_handle()
Error = {error, module(), Reason}
Reason = erl_parse:error_info() | erl_scan:error_info()
A string version of qlc:q/1,2. When the query handle is evaluated the fun created by the parse
transform is interpreted by erl_eval(3erl). The query string is to be one single query list
comprehension terminated by a period.
1> L = [1,2,3],
Bs = erl_eval:add_binding('L', L, erl_eval:new_bindings()),
QH = qlc:string_to_handle("[X+1 || X <- L].", [], Bs),
qlc:eval(QH).
[2,3,4]
string_to_handle(QueryString) is equivalent to string_to_handle(QueryString, []).
string_to_handle(QueryString, Options) is equivalent to string_to_handle(QueryString, Options,
erl_eval:new_bindings()).
This function is probably useful mostly when called from outside of Erlang, for instance from a
driver written in C.
table(TraverseFun, Options) -> QH
Types:
TraverseFun = TraverseFun0 | TraverseFun1
TraverseFun0 = fun(() -> TraverseResult)
TraverseFun1 = fun((match_expression()) -> TraverseResult)
TraverseResult = Objects | term()
Objects = [] | [term() | ObjectList]
ObjectList = TraverseFun0 | Objects
Options = [Option] | Option
Option =
{format_fun, FormatFun} |
{info_fun, InfoFun} |
{lookup_fun, LookupFun} |
{parent_fun, ParentFun} |
{post_fun, PostFun} |
{pre_fun, PreFun} |
{key_equality, KeyComparison}
FormatFun = undefined | fun((SelectedObjects) -> FormatedTable)
SelectedObjects =
all |
{all, NElements, DepthFun} |
{match_spec, match_expression()} |
{lookup, Position, Keys} |
{lookup, Position, Keys, NElements, DepthFun}
NElements = infinity | integer() >= 1
DepthFun = fun((term()) -> term())
FormatedTable = {Mod, Fun, Args} | abstract_expr() | string()
InfoFun = undefined | fun((InfoTag) -> InfoValue)
InfoTag = indices | is_unique_objects | keypos | num_of_objects
InfoValue = undefined | term()
LookupFun = undefined | fun((Position, Keys) -> LookupResult)
LookupResult = [term()] | term()
ParentFun = undefined | fun(() -> ParentFunValue)
PostFun = undefined | fun(() -> term())
PreFun = undefined | fun((PreArgs) -> term())
PreArgs = [PreArg]
PreArg = {parent_value, ParentFunValue} | {stop_fun, StopFun}
ParentFunValue = undefined | term()
StopFun = undefined | fun(() -> term())
KeyComparison = '=:=' | '=='
Position = integer() >= 1
Keys = [term()]
Mod = Fun = atom()
Args = [term()]
QH = query_handle()
Returns a query handle for a QLC table. In Erlang/OTP there is support for ETS, Dets and Mnesia
tables, but it is also possible to turn many other data structures into QLC tables. The way to
accomplish this is to let function(s) in the module implementing the data structure create a query
handle by calling qlc:table/2. The different ways to traverse the table as well as properties of
the table are handled by callback functions provided as options to qlc:table/2.
The callback function TraverseFun is used for traversing the table. It is to return a list of
objects terminated by either [] or a nullary fun to be used for traversing the not yet traversed
objects of the table. Any other return value is immediately returned as value of the query
evaluation. Unary TraverseFuns are to accept a match specification as argument. The match
specification is created by the parse transform by analyzing the pattern of the generator calling
qlc:table/2 and filters using variables introduced in the pattern. If the parse transform cannot
find a match specification equivalent to the pattern and filters, TraverseFun will be called with
a match specification returning every object. Modules that can utilize match specifications for
optimized traversal of tables should call qlc:table/2 with a unary TraverseFun while other modules
can provide a nullary TraverseFun. ets:table/2 is an example of the former; gb_table:table/1 in
the Implementing a QLC table section is an example of the latter.
PreFun is a unary callback function that is called once before the table is read for the first
time. If the call fails, the query evaluation fails. Similarly, the nullary callback function
PostFun is called once after the table was last read. The return value, which is caught, is
ignored. If PreFun has been called for a table, PostFun is guaranteed to be called for that table,
even if the evaluation of the query fails for some reason. The order in which pre (post) functions
for different tables are evaluated is not specified. Other table access than reading, such as
calling InfoFun, is assumed to be OK at any time. The argument PreArgs is a list of tagged values.
Currently there are two tags, parent_value and stop_fun, used by Mnesia for managing transactions.
The value of parent_value is the value returned by ParentFun, or undefined if there is no
ParentFun. ParentFun is called once just before the call of PreFun in the context of the process
calling eval, fold, or cursor. The value of stop_fun is a nullary fun that deletes the cursor if
called from the parent, or undefined if there is no cursor.
The binary callback function LookupFun is used for looking up objects in the table. The first
argument Position is the key position or an indexed position and the second argument Keys is a
sorted list of unique values. The return value is to be a list of all objects (tuples) such that
the element at Position is a member of Keys. Any other return value is immediately returned as
value of the query evaluation. LookupFun is called instead of traversing the table if the parse
transform at compile time can find out that the filters match and compare the element at Position
in such a way that only Keys need to be looked up in order to find all potential answers. The key
position is obtained by calling InfoFun(keypos) and the indexed positions by calling
InfoFun(indices). If the key position can be used for lookup it is always chosen, otherwise the
indexed position requiring the least number of lookups is chosen. If there is a tie between two
indexed positions the one occurring first in the list returned by InfoFun is chosen. Positions
requiring more than max_lookup lookups are ignored.
The unary callback function InfoFun is to return information about the table. undefined should be
returned if the value of some tag is unknown:
* indices. Returns a list of indexed positions, a list of positive integers.
* is_unique_objects. Returns true if the objects returned by TraverseFun are unique.
* keypos. Returns the position of the table's key, a positive integer.
* is_sorted_key. Returns true if the objects returned by TraverseFun are sorted on the key.
* num_of_objects. Returns the number of objects in the table, a non-negative integer.
The unary callback function FormatFun is used by qlc:info/1,2 for displaying the call that created
the table's query handle. The default value, undefined, means that info/1,2 displays a call to
'$MOD':'$FUN'/0. It is up to FormatFun to present the selected objects of the table in a suitable
way. However, if a character list is chosen for presentation it must be an Erlang expression that
can be scanned and parsed (a trailing dot will be added by qlc:info though). FormatFun is called
with an argument that describes the selected objects based on optimizations done as a result of
analyzing the filters of the QLC where the call to qlc:table/2 occurs. The possible values of the
argument are:
* {lookup, Position, Keys, NElements, DepthFun}. LookupFun is used for looking up objects in the
table.
* {match_spec, MatchExpression}. No way of finding all possible answers by looking up keys was
found, but the filters could be transformed into a match specification. All answers are found
by calling TraverseFun(MatchExpression).
* {all, NElements, DepthFun}. No optimization was found. A match specification matching all
objects will be used if TraverseFun is unary.
NElements is the value of the info/1,2 option n_elements, and DepthFun is a function that can be
used for limiting the size of terms; calling DepthFun(Term) substitutes '...' for parts of Term
below the depth specified by the info/1,2 option depth. If calling FormatFun with an argument
including NElements and DepthFun fails, FormatFun is called once again with an argument excluding
NElements and DepthFun ({lookup, Position, Keys} or all).
The value of key_equality is to be '=:=' if the table considers two keys equal if they match, and
to be '==' if two keys are equal if they compare equal. The default is '=:='.
See ets(3erl), dets(3erl) and mnesia(3erl) for the various options recognized by table/1,2 in
respective module.
SEE ALSO
dets(3erl), Erlang Reference Manual, erl_eval(3erl), erlang(3erl), ets(3erl), file(3erl),
error_logger(3erl), file_sorter(3erl), mnesia(3erl), Programming Examples, shell(3erl)
Ericsson AB stdlib 2.8 qlc(3erl)