<!DOCTYPE html>
<!--
 | Generated by Apache Maven Doxia Site Renderer 1.8.1 from src/site/markdown/aql/manual.md at 2019-03-07
 | Rendered using Apache Maven Fluido Skin 1.7
-->
6<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
7 <head>
8 <meta charset="UTF-8" />
9 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
 <meta name="Date-Revision-yyyymmdd" content="20190307" />
 <meta http-equiv="Content-Language" content="en" />
12 <title>AsterixDB &#x2013; The Asterix Query Language, Version 1.0</title>
13 <link rel="stylesheet" href="../css/apache-maven-fluido-1.7.min.css" />
14 <link rel="stylesheet" href="../css/site.css" />
15 <link rel="stylesheet" href="../css/print.css" media="print" />
16 <script type="text/javascript" src="../js/apache-maven-fluido-1.7.min.js"></script>
17
18 </head>
19 <body class="topBarDisabled">
20 <div class="container-fluid">
21 <div id="banner">
22 <div class="pull-left"><a href=".././" id="bannerLeft"><img src="../images/asterixlogo.png" alt="AsterixDB"/></a></div>
23 <div class="pull-right"></div>
24 <div class="clear"><hr/></div>
25 </div>
26
27 <div id="breadcrumbs">
28 <ul class="breadcrumb">
 <li id="publishDate">Last Published: 2019-03-07</li>
 <li id="projectVersion" class="pull-right">Version: 0.9.4</li>
 <li class="pull-right"><a href="../index.html" title="Documentation Home">Documentation Home</a></li>
32 </ul>
33 </div>
34 <div class="row-fluid">
35 <div id="leftColumn" class="span2">
36 <div class="well sidebar-nav">
37 <ul class="nav nav-list">
38 <li class="nav-header">Get Started - Installation</li>
39 <li><a href="../ncservice.html" title="Option 1: using NCService"><span class="none"></span>Option 1: using NCService</a></li>
40 <li><a href="../ansible.html" title="Option 2: using Ansible"><span class="none"></span>Option 2: using Ansible</a></li>
41 <li><a href="../aws.html" title="Option 3: using Amazon Web Services"><span class="none"></span>Option 3: using Amazon Web Services</a></li>
42 <li class="nav-header">AsterixDB Primer</li>
 <li><a href="../sqlpp/primer-sqlpp.html" title="Option 1: using SQL++"><span class="none"></span>Option 1: using SQL++</a></li>
 <li><a href="../aql/primer.html" title="Option 2: using AQL"><span class="none"></span>Option 2: using AQL</a></li>
 <li class="nav-header">Data Model</li>
 <li><a href="../datamodel.html" title="The Asterix Data Model"><span class="none"></span>The Asterix Data Model</a></li>
 <li class="nav-header">Queries - SQL++</li>
 <li><a href="../sqlpp/manual.html" title="The SQL++ Query Language"><span class="none"></span>The SQL++ Query Language</a></li>
 <li><a href="../sqlpp/builtins.html" title="Builtin Functions"><span class="none"></span>Builtin Functions</a></li>
 <li class="nav-header">Queries - AQL</li>
 <li class="active"><a href="#"><span class="none"></span>The Asterix Query Language (AQL)</a></li>
 <li><a href="../aql/builtins.html" title="Builtin Functions"><span class="none"></span>Builtin Functions</a></li>
 <li class="nav-header">API/SDK</li>
 <li><a href="../api.html" title="HTTP API"><span class="none"></span>HTTP API</a></li>
 <li><a href="../csv.html" title="CSV Output"><span class="none"></span>CSV Output</a></li>
 <li class="nav-header">Advanced Features</li>
 <li><a href="../aql/fulltext.html" title="Support of Full-text Queries"><span class="none"></span>Support of Full-text Queries</a></li>
 <li><a href="../aql/externaldata.html" title="Accessing External Data"><span class="none"></span>Accessing External Data</a></li>
 <li><a href="../feeds/tutorial.html" title="Support for Data Ingestion"><span class="none"></span>Support for Data Ingestion</a></li>
 <li><a href="../udf.html" title="User Defined Functions"><span class="none"></span>User Defined Functions</a></li>
 <li><a href="../aql/filters.html" title="Filter-Based LSM Index Acceleration"><span class="none"></span>Filter-Based LSM Index Acceleration</a></li>
 <li><a href="../aql/similarity.html" title="Support of Similarity Queries"><span class="none"></span>Support of Similarity Queries</a></li>
</ul>
64 <hr />
65 <div id="poweredBy">
66 <div class="clear"></div>
67 <div class="clear"></div>
68 <div class="clear"></div>
69 <div class="clear"></div>
70<a href=".././" title="AsterixDB" class="builtBy"><img class="builtBy" alt="AsterixDB" src="../images/asterixlogo.png" /></a>
71 </div>
72 </div>
73 </div>
74 <div id="bodyColumn" class="span10" >
75<!--
76 ! Licensed to the Apache Software Foundation (ASF) under one
77 ! or more contributor license agreements. See the NOTICE file
78 ! distributed with this work for additional information
79 ! regarding copyright ownership. The ASF licenses this file
80 ! to you under the Apache License, Version 2.0 (the
81 ! "License"); you may not use this file except in compliance
82 ! with the License. You may obtain a copy of the License at
83 !
84 ! http://www.apache.org/licenses/LICENSE-2.0
85 !
86 ! Unless required by applicable law or agreed to in writing,
87 ! software distributed under the License is distributed on an
88 ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
89 ! KIND, either express or implied. See the License for the
90 ! specific language governing permissions and limitations
91 ! under the License.
92 !-->
93<h1>The Asterix Query Language, Version 1.0</h1>
94<div class="section">
95<h2><a name="Table_of_Contents"></a><a name="toc" id="toc">Table of Contents</a></h2>
96<ul>
97
98<li><a href="#Introduction">1. Introduction</a></li>
99<li><a href="#Expressions">2. Expressions</a></li>
100<li><a href="#Statements">3. Statements</a></li>
101</ul></div>
102<div class="section">
103<h2><a name="a1._Introduction_.5BBack_to_TOC.5D"></a><a name="Introduction" id="Introduction">1. Introduction</a><font size="4"> <a href="#toc">[Back to TOC]</a></font></h2>
<p>This document is intended as a reference guide to the full syntax and semantics of the Asterix Query Language (AQL), the language for talking to AsterixDB. This guide covers both the data manipulation language (DML) aspects of AQL, including its support for queries and data modification, and its data definition language (DDL) aspects. New AsterixDB users are encouraged to read and work through the (friendlier) guide &#x201c;AsterixDB 101: An ADM and AQL Primer&#x201d; before attempting to make use of this document. In addition, readers are advised to read and understand the Asterix Data Model (ADM) reference guide, since a basic understanding of ADM concepts is a prerequisite to understanding AQL. In what follows, we detail the features of the AQL language in a grammar-guided manner: we list and briefly explain each of the productions in the AQL grammar, offering examples for clarity in cases where doing so seems needed or helpful.</p></div>
105<div class="section">
106<h2><a name="a2._Expressions_.5BBack_to_TOC.5D"></a><a name="Expressions" id="Expressions">2. Expressions</a> <font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
107
108<div>
109<div>
<pre class="source">Query ::= Expression
</pre></div></div>
112
113<p>An AQL query can be any legal AQL expression.</p>
114
115<div>
116<div>
<pre class="source">Expression ::= ( OperatorExpr | IfThenElse | FLWOR | QuantifiedExpression )
</pre></div></div>
119
120<p>AQL is a fully composable expression language. Each AQL expression returns zero or more Asterix Data Model (ADM) instances. There are four major kinds of expressions in AQL. At the topmost level, an AQL expression can be an OperatorExpr (similar to a mathematical expression), an IfThenElse (to choose between two alternative values), a FLWOR expression (the heart of AQL, pronounced &#x201c;flower expression&#x201d;), or a QuantifiedExpression (which yields a boolean value). Each will be detailed as we explore the full AQL grammar.</p>
121<div class="section">
122<h3><a name="Primary_Expressions"></a>Primary Expressions</h3>
123
124<div>
125<div>
<pre class="source">PrimaryExpr ::= Literal
              | VariableRef
              | ParenthesizedExpression
              | FunctionCallExpr
              | DatasetAccessExpression
              | ListConstructor
              | ObjectConstructor
</pre></div></div>
134
135<p>The most basic building block for any AQL expression is the PrimaryExpr. This can be a simple literal (constant) value, a reference to a query variable that is in scope, a parenthesized expression, a function call, an expression accessing the ADM contents of a dataset, a newly constructed list of ADM instances, or a newly constructed ADM object.</p>
136<div class="section">
137<h4><a name="Literals"></a>Literals</h4>
138
139<div>
140<div>
<pre class="source">Literal        ::= StringLiteral
                 | IntegerLiteral
                 | FloatLiteral
                 | DoubleLiteral
                 | &quot;null&quot;
                 | &quot;true&quot;
                 | &quot;false&quot;
StringLiteral  ::= (&quot;\&quot;&quot; (&lt;ESCAPE_QUOT&gt; | ~[&quot;\&quot;&quot;])* &quot;\&quot;&quot;)
                 | (&quot;\'&quot; (&lt;ESCAPE_APOS&gt; | ~[&quot;\'&quot;])* &quot;\'&quot;)
&lt;ESCAPE_QUOT&gt;  ::= &quot;\\\&quot;&quot;
&lt;ESCAPE_APOS&gt;  ::= &quot;\\\'&quot;
IntegerLiteral ::= &lt;DIGITS&gt;
&lt;DIGITS&gt;       ::= [&quot;0&quot; - &quot;9&quot;]+
FloatLiteral   ::= &lt;DIGITS&gt; ( &quot;f&quot; | &quot;F&quot; )
                 | &lt;DIGITS&gt; ( &quot;.&quot; &lt;DIGITS&gt; ( &quot;f&quot; | &quot;F&quot; ) )?
                 | &quot;.&quot; &lt;DIGITS&gt; ( &quot;f&quot; | &quot;F&quot; )
DoubleLiteral  ::= &lt;DIGITS&gt;
                 | &lt;DIGITS&gt; ( &quot;.&quot; &lt;DIGITS&gt; )?
                 | &quot;.&quot; &lt;DIGITS&gt;
</pre></div></div>
161
162<p>Literals (constants) in AQL can be strings, integers, floating point values, double values, boolean constants, or the constant value null. The null value in AQL has &#x201c;unknown&#x201d; or &#x201c;missing&#x201d; value semantics, similar to (though not identical to) nulls in the relational query language SQL.</p>
163<p>The following are some simple examples of AQL literals. Since AQL is an expression language, each example is also a complete, legal AQL query (!).</p>
164<div class="section">
165<h5><a name="Examples"></a>Examples</h5>
166
167<div>
168<div>
<pre class="source">&quot;a string&quot;
42
</pre></div></div>
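
<p>As a further sketch based only on the Literal grammar above (the values are illustrative and not drawn from any dataset), the following lines show a single-quoted string, a string containing an escaped double quote, a float, a double, a boolean, and null. Each line is intended as its own query.</p>

<div>
<div>
<pre class="source">'another string'
&quot;she said \&quot;hi\&quot;&quot;
3.14f
3.14
true
null
</pre></div></div>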
172</div></div>
173<div class="section">
174<h4><a name="Variable_References"></a>Variable References</h4>
175
176<div>
177<div>
<pre class="source">VariableRef ::= &lt;VARIABLE&gt;
&lt;VARIABLE&gt;  ::= &quot;$&quot; &lt;LETTER&gt; (&lt;LETTER&gt; | &lt;DIGIT&gt; | &quot;_&quot;)*
&lt;LETTER&gt;    ::= [&quot;A&quot; - &quot;Z&quot;, &quot;a&quot; - &quot;z&quot;]
</pre></div></div>
182
183<p>A variable in AQL can be bound to any legal ADM value. A variable reference refers to the value to which an in-scope variable is bound. (E.g., a variable binding may originate from one of the for or let clauses of a FLWOR expression or from an input parameter in the context of an AQL function body.)</p>
184<div class="section">
185<h5><a name="Examples"></a>Examples</h5>
186
187<div>
188<div>
<pre class="source">$tweet
$id
</pre></div></div>
192</div></div>
193<div class="section">
194<h4><a name="Parenthesized_Expressions"></a>Parenthesized Expressions</h4>
195
196<div>
197<div>
<pre class="source">ParenthesizedExpression ::= &quot;(&quot; Expression &quot;)&quot;
</pre></div></div>
200
201<p>As in most languages, an expression may be parenthesized.</p>
202<p>Since AQL is an expression language, the following example expression is actually also a complete, legal AQL query whose result is the value 2. (As such, you can have Big Fun explaining to your boss how AsterixDB and AQL can turn your 1000-node shared-nothing Big Data cluster into a $5M calculator in its spare time.)</p>
203<div class="section">
204<h5><a name="Example"></a>Example</h5>
205
206<div>
207<div>
<pre class="source">( 1 + 1 )
</pre></div></div>
210</div></div>
211<div class="section">
212<h4><a name="Function_Calls"></a>Function Calls</h4>
213
214<div>
215<div>
<pre class="source">FunctionCallExpr ::= FunctionOrTypeName &quot;(&quot; ( Expression ( &quot;,&quot; Expression )* )? &quot;)&quot;
</pre></div></div>
218
219<p>Functions are included in AQL, like most languages, as a way to package useful functionality or to componentize complicated or reusable AQL computations. A function call is a legal AQL query expression that represents the ADM value resulting from the evaluation of its body expression with the given parameter bindings; the parameter value bindings can themselves be any AQL expressions.</p>
220<p>The following example is a (built-in) function call expression whose value is 8.</p>
221<div class="section">
222<h5><a name="Example"></a>Example</h5>
223
224<div>
225<div>
<pre class="source">string-length(&quot;a string&quot;)
</pre></div></div>
228</div></div>
229<div class="section">
230<h4><a name="Dataset_Access"></a>Dataset Access</h4>
231
232<div>
233<div>
<pre class="source">DatasetAccessExpression ::= &quot;dataset&quot; ( ( Identifier ( &quot;.&quot; Identifier )? )
                          | ( &quot;(&quot; Expression &quot;)&quot; ) )
Identifier              ::= &lt;IDENTIFIER&gt; | StringLiteral
&lt;IDENTIFIER&gt;            ::= &lt;LETTER&gt; (&lt;LETTER&gt; | &lt;DIGIT&gt; | &lt;SPECIALCHARS&gt;)*
&lt;SPECIALCHARS&gt;          ::= [&quot;$&quot;, &quot;_&quot;, &quot;-&quot;]
</pre></div></div>
240
241<p>Querying Big Data is the main point of AsterixDB and AQL. Data in AsterixDB reside in datasets (collections of ADM objects), each of which in turn resides in some namespace known as a dataverse (data universe). Data access in a query expression is accomplished via a DatasetAccessExpression. Dataset access expressions are most commonly used in FLWOR expressions, where variables are bound to their contents.</p>
<p>Note that the Identifier that identifies a dataset (or any other Identifier in AQL) can also be a StringLiteral. This is especially useful to avoid conflicts with AQL keywords (e.g., &#x201c;dataset&#x201d;, &#x201c;null&#x201d;, or &#x201c;type&#x201d;).</p>
<p>The following are three examples of legal dataset access expressions. The first one accesses a dataset called Customers in the dataverse called SalesDV. The second one accesses the Customers dataset in whatever the current dataverse is. The third one does the same thing as the second but uses a slightly older AQL syntax.</p>
244<div class="section">
245<h5><a name="Examples"></a>Examples</h5>
246
247<div>
248<div>
<pre class="source">dataset SalesDV.Customers
dataset Customers
dataset(&quot;Customers&quot;)
</pre></div></div>
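
<p>To illustrate the StringLiteral form of Identifier, here is a small sketch that accesses a hypothetical dataset whose name, type, collides with an AQL keyword. The dataset name is an assumption made purely for illustration; going by the grammar above, quoting the name is what should let it parse.</p>

<div>
<div>
<pre class="source">dataset SalesDV.&quot;type&quot;
dataset &quot;type&quot;
</pre></div></div>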
253</div></div>
254<div class="section">
255<h4><a name="Constructors"></a>Constructors</h4>
256
257<div>
258<div>
<pre class="source">ListConstructor          ::= ( OrderedListConstructor | UnorderedListConstructor )
OrderedListConstructor   ::= &quot;[&quot; ( Expression ( &quot;,&quot; Expression )* )? &quot;]&quot;
UnorderedListConstructor ::= &quot;{{&quot; ( Expression ( &quot;,&quot; Expression )* )? &quot;}}&quot;
ObjectConstructor        ::= &quot;{&quot; ( FieldBinding ( &quot;,&quot; FieldBinding )* )? &quot;}&quot;
FieldBinding             ::= Expression &quot;:&quot; Expression
</pre></div></div>
265
266<p>A major feature of AQL is its ability to construct new ADM data instances. This is accomplished using its constructors for each of the major ADM complex object structures, namely lists (ordered or unordered) and objects. Ordered lists are like JSON arrays, while unordered lists have bag (multiset) semantics. Objects are built from attributes that are field-name/field-value pairs, again like JSON. (See the AsterixDB Data Model document for more details on each.)</p>
267<p>The following examples illustrate how to construct a new ordered list with 3 items, a new unordered list with 4 items, and a new object with 2 fields, respectively. List elements can be homogeneous (as in the first example), which is the common case, or they may be heterogeneous (as in the second example). The data values and field name values used to construct lists and objects in constructors are all simply AQL expressions. Thus the list elements, field names, and field values used in constructors can be simple literals (as in these three examples) or they can come from query variable references or even arbitrarily complex AQL expressions.</p>
268<div class="section">
269<h5><a name="Examples"></a>Examples</h5>
270
271<div>
272<div>
<pre class="source">[ &quot;a&quot;, &quot;b&quot;, &quot;c&quot; ]

{{ 42, &quot;forty-two&quot;, &quot;AsterixDB!&quot;, 3.14f }}

{
  &quot;project name&quot;: &quot;AsterixDB&quot;,
  &quot;project members&quot;: {{ &quot;vinayakb&quot;, &quot;dtabass&quot;, &quot;chenli&quot; }}
}
</pre></div></div>
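
<p>Since list elements and field values are ordinary AQL expressions, constructors can also be fed computed values. The sketch below builds an ordered list from an arithmetic expression, a function call, and a literal, and then builds one object per user from variable references; the FacebookUsers dataset and its name and id fields are assumed here as they appear in the FLWOR examples of this document.</p>

<div>
<div>
<pre class="source">[ 1 + 1, string-length(&quot;a string&quot;), &quot;c&quot; ]

for $user in dataset FacebookUsers
return { &quot;uname&quot;: $user.name, &quot;id&quot;: $user.id }
</pre></div></div>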
282</div>
283<div class="section">
284<h5><a name="Note"></a>Note</h5>
285<p>When constructing nested objects there needs to be a space between the closing braces to avoid confusion with the <tt>}}</tt> token that ends an unordered list constructor: <tt>{ &quot;a&quot; : { &quot;b&quot; : &quot;c&quot; }}</tt> will fail to parse while <tt>{ &quot;a&quot; : { &quot;b&quot; : &quot;c&quot; } }</tt> will work.</p></div></div></div>
286<div class="section">
287<h3><a name="Path_Expressions"></a>Path Expressions</h3>
288
289<div>
290<div>
<pre class="source">ValueExpr ::= PrimaryExpr ( Field | Index )*
Field     ::= &quot;.&quot; Identifier
Index     ::= &quot;[&quot; ( Expression | &quot;?&quot; ) &quot;]&quot;
</pre></div></div>
295
<p>Components of complex types in ADM are accessed via path expressions. Path access can be applied to the result of an AQL expression that yields an instance of such a type, e.g., an object or list instance. For objects, path access is based on field names. For ordered lists, path access is based on (zero-based) array-style indexing. AQL also supports an &#x201c;I&#x2019;m feeling lucky&#x201d; style index accessor, [?], for selecting an arbitrary element from an ordered list. Attempts to access non-existent fields or list elements produce a null (i.e., missing information) result as opposed to signaling a runtime error.</p>
<p>The following examples illustrate field access for an object, index-based element access for an ordered list, and also a composition thereof.</p>
298<div class="section">
299<div class="section">
300<h5><a name="Examples"></a>Examples</h5>
301
302<div>
303<div>
<pre class="source">({&quot;list&quot;: [ &quot;a&quot;, &quot;b&quot;, &quot;c&quot;]}).list

([&quot;a&quot;, &quot;b&quot;, &quot;c&quot;])[2]

({ &quot;list&quot;: [ &quot;a&quot;, &quot;b&quot;, &quot;c&quot;]}).list[2]
</pre></div></div>
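
<p>The remaining behaviors described above can be sketched the same way. Going by that description, the first expression below selects some arbitrary element of the list, while the second (a field that does not exist; the name nosuchfield is made up for illustration) and the third (an index past the end of the list) should each produce null rather than a runtime error.</p>

<div>
<div>
<pre class="source">([&quot;a&quot;, &quot;b&quot;, &quot;c&quot;])[?]

({&quot;list&quot;: [ &quot;a&quot;, &quot;b&quot;, &quot;c&quot;]}).nosuchfield

([&quot;a&quot;, &quot;b&quot;, &quot;c&quot;])[7]
</pre></div></div>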
310</div></div></div>
311<div class="section">
312<h3><a name="Logical_Expressions"></a>Logical Expressions</h3>
313
314<div>
315<div>
<pre class="source">OperatorExpr ::= AndExpr ( &quot;or&quot; AndExpr )*
AndExpr      ::= RelExpr ( &quot;and&quot; RelExpr )*
</pre></div></div>
319
320<p>As in most languages, boolean expressions can be built up from smaller expressions by combining them with the logical connectives and/or. Legal boolean values in AQL are true, false, and null. (Nulls in AQL are treated much like SQL treats its unknown truth value in boolean expressions.)</p>
<p>The following is an example of a conjunctive range predicate in AQL. It will yield true if $a is bound to 4, null if $a is bound to null, and false otherwise.</p>
322<div class="section">
323<div class="section">
324<h5><a name="Example"></a>Example</h5>
325
326<div>
327<div>
<pre class="source">$a &gt; 3 and $a &lt; 5
</pre></div></div>
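
<p>The predicate above refers to a variable that must already be in scope. As a self-contained sketch, the FLWOR expressions below bind $a with a let clause first (the bound values are chosen only for illustration). With $a bound to 4 the conjunction should come out true, and with $a bound to 7 the disjunction in the second query should also come out true, since its second comparison holds.</p>

<div>
<div>
<pre class="source">let $a := 4
return $a &gt; 3 and $a &lt; 5

let $a := 7
return $a &lt; 3 or $a &gt; 5
</pre></div></div>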
330</div></div></div>
331<div class="section">
332<h3><a name="Comparison_Expressions"></a>Comparison Expressions</h3>
333
334<div>
335<div>
<pre class="source">RelExpr ::= AddExpr ( ( &quot;&lt;&quot; | &quot;&gt;&quot; | &quot;&lt;=&quot; | &quot;&gt;=&quot; | &quot;=&quot; | &quot;!=&quot; | &quot;~=&quot; ) AddExpr )?
</pre></div></div>
338
339<p>AQL has the usual list of suspects, plus one, for comparing pairs of atomic values. The &#x201c;plus one&#x201d; is the last operator listed above, which is the &#x201c;roughly equal&#x201d; operator provided for similarity queries. (See the separate document on <a href="similarity.html">AsterixDB Similarity Queries</a> for more details on similarity matching.)</p>
340<p>An example comparison expression (which yields the boolean value true) is shown below.</p>
341<div class="section">
342<div class="section">
343<h5><a name="Example"></a>Example</h5>
344
345<div>
346<div>
<pre class="source">5 &gt; 3
</pre></div></div>
349</div></div></div>
350<div class="section">
351<h3><a name="Arithmetic_Expressions"></a>Arithmetic Expressions</h3>
352
353<div>
354<div>
<pre class="source">AddExpr   ::= MultExpr ( ( &quot;+&quot; | &quot;-&quot; ) MultExpr )*
MultExpr  ::= UnaryExpr ( ( &quot;*&quot; | &quot;/&quot; | &quot;%&quot; | &quot;^&quot;| &quot;idiv&quot; ) UnaryExpr )*
UnaryExpr ::= ( ( &quot;+&quot; | &quot;-&quot; ) )? ValueExpr
</pre></div></div>
359
360<p>AQL also supports the usual cast of characters for arithmetic expressions. The example below evaluates to 25.</p>
361<div class="section">
362<div class="section">
363<h5><a name="Example"></a>Example</h5>
364
365<div>
366<div>
<pre class="source">3 ^ 2 + 4 ^ 2
</pre></div></div>
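
<p>A few more sketches may help in reading the grammar. Because MultExpr is nested inside AddExpr, multiplication binds more tightly than addition, so the first expression below evaluates to 7 and the parenthesized variant to 9. The last two lines assume that idiv and % carry their usual integer-division and remainder meanings, in which case they would give 3 and 1 respectively.</p>

<div>
<div>
<pre class="source">1 + 2 * 3
( 1 + 2 ) * 3
7 idiv 2
7 % 2
</pre></div></div>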
369</div></div></div>
370<div class="section">
371<h3><a name="FLWOR_Expression"></a>FLWOR Expression</h3>
372
373<div>
374<div>
<pre class="source">FLWOR          ::= ( ForClause | LetClause ) ( Clause )* (&quot;return&quot;|&quot;select&quot;) Expression
Clause         ::= ForClause | LetClause | WhereClause | OrderbyClause
                 | GroupClause | LimitClause | DistinctClause
ForClause      ::= (&quot;for&quot;|&quot;from&quot;) Variable ( &quot;at&quot; Variable )? &quot;in&quot; ( Expression )
LetClause      ::= (&quot;let&quot;|&quot;with&quot;) Variable &quot;:=&quot; Expression
WhereClause    ::= &quot;where&quot; Expression
OrderbyClause  ::= &quot;order&quot; &quot;by&quot; Expression ( ( &quot;asc&quot; ) | ( &quot;desc&quot; ) )?
                   ( &quot;,&quot; Expression ( ( &quot;asc&quot; ) | ( &quot;desc&quot; ) )? )*
GroupClause    ::= &quot;group&quot; &quot;by&quot; ( Variable &quot;:=&quot; )? Expression ( &quot;,&quot; ( Variable &quot;:=&quot; )? Expression )*
                   (&quot;with&quot;|&quot;keeping&quot;) VariableRef ( &quot;,&quot; VariableRef )*
LimitClause    ::= &quot;limit&quot; Expression ( &quot;offset&quot; Expression )?
DistinctClause ::= &quot;distinct&quot; &quot;by&quot; Expression ( &quot;,&quot; Expression )*
Variable       ::= &lt;VARIABLE&gt;
</pre></div></div>
389
390<p>The heart of AQL is the FLWOR (for-let-where-orderby-return) expression. The roots of this expression were borrowed from the expression of the same name in XQuery. A FLWOR expression starts with one or more clauses that establish variable bindings. A <tt>for</tt> clause binds a variable incrementally to each element of its associated expression; it includes an optional positional variable for counting/numbering the bindings. By default no ordering is implied or assumed by a <tt>for</tt> clause. A <tt>let</tt> clause binds a variable to the collection of elements computed by its associated expression.</p>
<p>Following the initial <tt>for</tt> or <tt>let</tt> clause(s), a FLWOR expression may contain an arbitrary sequence of other clauses. The <tt>where</tt> clause in a FLWOR expression filters the preceding bindings via a boolean expression, much like a <tt>where</tt> clause does in a SQL query. The <tt>order by</tt> clause in a FLWOR expression induces an ordering on the data. The <tt>group by</tt> clause, discussed further below, forms groups based on its group by expressions, optionally naming the expressions&#x2019; values (which together form the grouping key for the expression). The <tt>with</tt> subclause of a <tt>group by</tt> clause specifies the variable(s) whose values should be grouped based on the grouping key(s); following the grouping clause, only the grouping key(s) and the variables named in the <tt>with</tt> subclause remain in scope, and the named grouping variables now contain lists formed from their input values. The <tt>limit</tt> clause caps the number of values returned, optionally starting its result count from a specified offset. (Web applications can use this feature for doing pagination.) The <tt>distinct by</tt> clause is similar to the <tt>group by</tt> clause, but it forms no groups; it serves only to eliminate duplicate values. As indicated by the grammar, the clauses in an AQL query can appear in any order. To interpret a query, one can think of data as flowing down through the query from the first clause to the <tt>return</tt> clause.</p>
392<p>The following example shows a FLWOR expression that selects and returns one user from the dataset FacebookUsers.</p>
393<div class="section">
394<div class="section">
395<h5><a name="Example"></a>Example</h5>
396
397<div>
398<div>
<pre class="source">for $user in dataset FacebookUsers
where $user.id = 8
return $user
</pre></div></div>
403
404<p>The next example shows a FLWOR expression that joins two datasets, FacebookUsers and FacebookMessages, returning user/message pairs. The results contain one object per pair, with result objects containing the user&#x2019;s name and an entire message.</p></div>
405<div class="section">
406<h5><a name="Example"></a>Example</h5>
407
408<div>
409<div>
<pre class="source">for $user in dataset FacebookUsers
for $message in dataset FacebookMessages
where $message.author-id = $user.id
return
  {
    &quot;uname&quot;: $user.name,
    &quot;message&quot;: $message.message
  };
</pre></div></div>
419
420<p>In the next example, a <tt>let</tt> clause is used to bind a variable to all of a user&#x2019;s FacebookMessages. The query returns one object per user, with result objects containing the user&#x2019;s name and the set of all messages by that user.</p></div>
421<div class="section">
422<h5><a name="Example"></a>Example</h5>
423
424<div>
425<div>
<pre class="source">for $user in dataset FacebookUsers
let $messages :=
  for $message in dataset FacebookMessages
  where $message.author-id = $user.id
  return $message.message
return
  {
    &quot;uname&quot;: $user.name,
    &quot;messages&quot;: $messages
  };
</pre></div></div>
437
<p>The following example returns all TwitterUsers ordered by their followers count (most followers first) and language. If <tt>null</tt>s are encountered in the ordering key(s), <tt>null</tt> is treated as being smaller than any other value.</p></div>
439<div class="section">
440<h5><a name="Example"></a>Example</h5>
441
442<div>
443<div>
<pre class="source">for $user in dataset TwitterUsers
order by $user.followers_count desc, $user.lang asc
return $user
</pre></div></div>
448
<p>The next example illustrates the use of the <tt>group by</tt> clause in AQL. After the <tt>group by</tt> clause in the query, only variables that are either in the <tt>group by</tt> list or in the <tt>with</tt> list are in scope. The variables in the clause&#x2019;s <tt>with</tt> list will each contain a collection of items following the <tt>group by</tt> clause; the collected items are the values that the source variable was bound to in the tuples that formed the group. For grouping purposes, <tt>null</tt> is handled as a single value.</p></div>
450<div class="section">
451<h5><a name="Example"></a>Example</h5>
452
453<div>
454<div>
<pre class="source">for $x in dataset FacebookMessages
let $messages := $x.message
group by $loc := $x.sender-location with $messages
return
  {
    &quot;location&quot; : $loc,
    &quot;message&quot; : $messages
  }
</pre></div></div>
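
<p>Because each variable in the <tt>with</tt> list holds a collection after grouping, it can be handed to an aggregate function. The following sketch varies the query above to report how many messages were sent from each location; it assumes the built-in <tt>count</tt> function, and the output field name message-count is chosen only for illustration.</p>

<div>
<div>
<pre class="source">for $x in dataset FacebookMessages
let $messages := $x.message
group by $loc := $x.sender-location with $messages
return
  {
    &quot;location&quot; : $loc,
    &quot;message-count&quot; : count($messages)
  }
</pre></div></div>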
464
465<p>The use of the <tt>limit</tt> clause is illustrated in the next example.</p></div>
466<div class="section">
467<h5><a name="Example"></a>Example</h5>
468
469<div>
470<div>
<pre class="source">for $user in dataset TwitterUsers
order by $user.followers_count desc
limit 2
return $user
</pre></div></div>
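
<p>The optional <tt>offset</tt> part of the <tt>limit</tt> clause is what makes pagination possible. As a sketch, and assuming the offset counts the number of leading results to skip, the variation below would return the second &#x201c;page&#x201d; of two users from the same ordering.</p>

<div>
<div>
<pre class="source">for $user in dataset TwitterUsers
order by $user.followers_count desc
limit 2 offset 2
return $user
</pre></div></div>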
476
477<p>The final example shows how AQL&#x2019;s <tt>distinct by</tt> clause works. Each variable in scope before the distinct clause is also in scope after the <tt>distinct by</tt> clause. This clause works similarly to <tt>group by</tt>, but for each variable that contains more than one value after the <tt>distinct by</tt> clause, one value is picked nondeterministically. (If the variable is in the <tt>distinct by</tt> list, then its value will be deterministic.) Nulls are treated as a single value when they occur in a grouping field.</p></div>
478<div class="section">
479<h5><a name="Example"></a>Example</h5>
480
481<div>
482<div>
<pre class="source">for $x in dataset FacebookMessages
distinct by $x.sender-location
return
  {
    &quot;location&quot; : $x.sender-location,
    &quot;message&quot; : $x.message
  }
</pre></div></div>
491
492<p>In order to allow SQL fans to write queries in their favored ways, AQL provides synonyms: <i>from</i> for <i>for</i>, <i>select</i> for <i>return</i>, <i>with</i> for <i>let</i>, and <i>keeping</i> for <i>with</i> in the group by clause. The following query is such an example.</p></div>
493<div class="section">
494<h5><a name="Example"></a>Example</h5>
495
496<div>
497<div>
<pre class="source">from $x in dataset FacebookMessages
with $messages := $x.message
group by $loc := $x.sender-location keeping $messages
select
  {
    &quot;location&quot; : $loc,
    &quot;message&quot; : $messages
  }
</pre></div></div>
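
<p>One grammar feature not exercised by the examples so far is the optional positional variable introduced by <tt>at</tt> in a <tt>for</tt> clause. The sketch below iterates over a literal list and pairs each element with its position; the variable and field names are illustrative, and the starting value of the position counter is not specified in this guide, so it is best checked against a running instance.</p>

<div>
<div>
<pre class="source">for $x at $i in [ &quot;a&quot;, &quot;b&quot;, &quot;c&quot; ]
return
  {
    &quot;position&quot; : $i,
    &quot;value&quot; : $x
  }
</pre></div></div>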
507</div></div></div>
508<div class="section">
509<h3><a name="Conditional_Expression"></a>Conditional Expression</h3>
510
511<div>
512<div>
<pre class="source">IfThenElse ::= &quot;if&quot; &quot;(&quot; Expression &quot;)&quot; &quot;then&quot; Expression &quot;else&quot; Expression
</pre></div></div>
515
<p>A conditional expression is useful for choosing between two alternative values based on a boolean condition. If its first (<tt>if</tt>) expression is true, its second (<tt>then</tt>) expression&#x2019;s value is returned; otherwise its third (<tt>else</tt>) expression&#x2019;s value is returned.</p>
517<p>The following example illustrates the form of a conditional expression.</p>
518<div class="section">
519<div class="section">
520<h5><a name="Example"></a>Example</h5>
521
522<div>
523<div>
<pre class="source">if (2 &lt; 3) then &quot;yes&quot; else &quot;no&quot;
</pre></div></div>
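
<p>Since a conditional is itself an expression, it can appear wherever a value is expected, for instance as a field value in an object constructor. The sketch below labels each user according to a follower threshold; the name field and the threshold of 1000 are assumptions made for illustration, while followers_count is the field used in the ordering examples of this document.</p>

<div>
<div>
<pre class="source">for $user in dataset TwitterUsers
return
  {
    &quot;name&quot; : $user.name,
    &quot;popular&quot; : if ($user.followers_count &gt; 1000) then &quot;yes&quot; else &quot;no&quot;
  }
</pre></div></div>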
526</div></div></div>
527<div class="section">
528<h3><a name="Quantified_Expressions"></a>Quantified Expressions</h3>
529
530<div>
531<div>
<pre class="source">QuantifiedExpression ::= ( ( &quot;some&quot; ) | ( &quot;every&quot; ) ) Variable &quot;in&quot; Expression
                         ( &quot;,&quot; Variable &quot;in&quot; Expression )* &quot;satisfies&quot; Expression
</pre></div></div>
535
536<p>Quantified expressions are used for expressing existential or universal predicates involving the elements of a collection.</p>
<p>The following pair of examples illustrates the use of a quantified expression to test that every (or some) element in the set [1, 2, 3] of integers is less than three. The first example yields <tt>false</tt> and the second example yields <tt>true</tt>.</p>
538<p>It is useful to note that if the set were instead the empty set, the first expression would yield <tt>true</tt> (&#x201c;every&#x201d; value in an empty set satisfies the condition) while the second expression would yield <tt>false</tt> (since there isn&#x2019;t &#x201c;some&#x201d; value, as there are no values in the set, that satisfies the condition).</p>
539<div class="section">
540<div class="section">
541<h5><a name="Examples"></a>Examples</h5>
542
543<div>
544<div>
<pre class="source">every $x in [ 1, 2, 3 ] satisfies $x &lt; 3
some $x in [ 1, 2, 3 ] satisfies $x &lt; 3
547</pre></div></div>
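
<p>The empty-collection case described above can be written the same way. In this sketch the empty list constructor <tt>[ ]</tt> stands in for the empty set; per the discussion above, the first expression would yield <tt>true</tt> and the second <tt>false</tt>.</p>

<div>
<div>
<pre class="source">every $x in [ ] satisfies $x &lt; 3
some $x in [ ] satisfies $x &lt; 3
</pre></div></div>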
548</div></div></div></div>
549<div class="section">
550<h2><a name="a3._Statements_.5BBack_to_TOC.5D"></a><a name="Statements" id="Statements">3. Statements</a> <font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
551
552<div>
553<div>
554<pre class="source">Statement ::= ( SingleStatement ( &quot;;&quot; )? )* &lt;EOF&gt;
555SingleStatement ::= DataverseDeclaration
556 | FunctionDeclaration
557 | CreateStatement
558 | DropStatement
559 | LoadStatement
560 | SetStatement
561 | InsertStatement
562 | DeleteStatement
563 | UpsertStatement
564 | Query
565</pre></div></div>
566
<p>In addition to expressions for queries, AQL supports a variety of statements for data definition and manipulation purposes as well as controlling the context to be used in evaluating AQL expressions. AQL supports object-level ACID transactions that begin and terminate implicitly for each object inserted, deleted, upserted, or searched while a given AQL statement is being executed.</p>
568<p>This section details the statements supported in the AQL language.</p>
569<div class="section">
570<h3><a name="Declarations"></a>Declarations</h3>
571
572<div>
573<div>
574<pre class="source">DataverseDeclaration ::= &quot;use&quot; &quot;dataverse&quot; Identifier
575</pre></div></div>
576
577<p>The world of data in an AsterixDB cluster is organized into data namespaces called dataverses. To set the default dataverse for a series of statements, the use dataverse statement is provided.</p>
578<p>As an example, the following statement sets the default dataverse to be TinySocial.</p>
579<div class="section">
580<div class="section">
581<h5><a name="Example"></a>Example</h5>
582
583<div>
584<div>
585<pre class="source">use dataverse TinySocial;
586</pre></div></div>
587
<p>The set statement in AQL is used to control aspects of the expression evaluation context for queries.</p>
589
590<div>
591<div>
592<pre class="source">SetStatement ::= &quot;set&quot; Identifier StringLiteral
593</pre></div></div>
594
595<p>As an example, the following set statements request that Jaccard similarity with a similarity threshold 0.6 be used for set similarity matching when the ~= operator is used in a query expression.</p></div>
596<div class="section">
597<h5><a name="Example"></a>Example</h5>
598
599<div>
600<div>
<pre class="source">set simfunction &quot;jaccard&quot;;
set simthreshold &quot;0.6f&quot;;
603</pre></div></div>
604
<p>When writing a complex AQL query, it can sometimes be helpful to define one or more auxiliary functions that each address a sub-piece of the overall query. The declare function statement supports the creation of such helper functions.</p>
606
607<div>
608<div>
609<pre class="source">FunctionDeclaration ::= &quot;declare&quot; &quot;function&quot; Identifier ParameterList &quot;{&quot; Expression &quot;}&quot;
610ParameterList ::= &quot;(&quot; ( &lt;VARIABLE&gt; ( &quot;,&quot; &lt;VARIABLE&gt; )* )? &quot;)&quot;
611</pre></div></div>
612
613<p>The following is a very simple example of a temporary AQL function definition.</p></div>
614<div class="section">
615<h5><a name="Example"></a>Example</h5>
616
617<div>
618<div>
<pre class="source">declare function add($a, $b) {
    $a + $b
};
622</pre></div></div>
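
<p>As a sketch of how such a helper might be used, the declaration can be followed by a query that calls it. This assumes that the declaration and the query are submitted together, since the function is only temporary; the arguments are illustrative values, and the call should evaluate to their sum.</p>

<div>
<div>
<pre class="source">declare function add($a, $b) {
    $a + $b
};

add(1, 2)
</pre></div></div>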
623</div></div></div>
624<div class="section">
625<h3><a name="Lifecycle_Management_Statements"></a>Lifecycle Management Statements</h3>
626
627<div>
628<div>
629<pre class="source">CreateStatement ::= &quot;create&quot; ( DataverseSpecification
630 | TypeSpecification
631 | DatasetSpecification
632 | IndexSpecification
633 | FunctionSpecification )
634
635QualifiedName ::= Identifier ( &quot;.&quot; Identifier )?
636DoubleQualifiedName ::= Identifier &quot;.&quot; Identifier ( &quot;.&quot; Identifier )?
637</pre></div></div>
638
639<p>The create statement in AQL is used for creating persistent artifacts in the context of dataverses. It can be used to create new dataverses, datatypes, datasets, indexes, and user-defined AQL functions.</p>
640<div class="section">
641<h4><a name="Dataverses"></a>Dataverses</h4>
642
643<div>
644<div>
645<pre class="source">DataverseSpecification ::= &quot;dataverse&quot; Identifier IfNotExists ( &quot;with format&quot; StringLiteral )?
646</pre></div></div>
647
<p>The create dataverse statement is used to create new dataverses. To ease the authoring of reusable AQL scripts, its optional IfNotExists clause allows creation to be requested either unconditionally or only if the dataverse does not already exist. If this clause is absent, an error will be returned if the specified dataverse already exists. The <tt>with format</tt> clause is a placeholder for future functionality that can safely be ignored.</p>
649<p>The following example creates a dataverse named TinySocial.</p>
650<div class="section">
651<h5><a name="Example"></a>Example</h5>
652
653<div>
654<div>
655<pre class="source">create dataverse TinySocial;
656</pre></div></div>
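
<p>For a script that may be run more than once, the IfNotExists clause from the grammar above can be added. The following variant is a sketch of that form; it requests creation only if TinySocial does not already exist.</p>

<div>
<div>
<pre class="source">create dataverse TinySocial if not exists;
</pre></div></div>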
657</div></div>
658<div class="section">
659<h4><a name="Types"></a>Types</h4>
660
661<div>
662<div>
663<pre class="source">TypeSpecification ::= &quot;type&quot; FunctionOrTypeName IfNotExists &quot;as&quot; TypeExpr
664FunctionOrTypeName ::= QualifiedName
665IfNotExists ::= ( &quot;if not exists&quot; )?
666TypeExpr ::= ObjectTypeDef | TypeReference | OrderedListTypeDef | UnorderedListTypeDef
667ObjectTypeDef ::= ( &quot;closed&quot; | &quot;open&quot; )? &quot;{&quot; ( ObjectField ( &quot;,&quot; ObjectField )* )? &quot;}&quot;
668ObjectField ::= Identifier &quot;:&quot; ( TypeExpr ) ( &quot;?&quot; )?
669NestedField ::= Identifier ( &quot;.&quot; Identifier )*
670IndexField ::= NestedField ( &quot;:&quot; TypeReference )?
671TypeReference ::= Identifier
672OrderedListTypeDef ::= &quot;[&quot; ( TypeExpr ) &quot;]&quot;
673UnorderedListTypeDef ::= &quot;{{&quot; ( TypeExpr ) &quot;}}&quot;
674</pre></div></div>
675
<p>The create type statement is used to create a new named ADM datatype. This type can then be used to create datasets or utilized when defining one or more other ADM datatypes. Much more information about the Asterix Data Model (ADM) is available in the <a href="datamodel.html">data model reference guide</a>. A new type can be an object type, a renaming of another type, an ordered list type, or an unordered list type. An object type can be defined as being either open or closed. Instances of a closed object type are not permitted to contain fields other than those specified in the create type statement. Instances of an open object type may carry additional fields, and open is the default for a new type (if neither option is specified).</p>
<p>The following example creates a new ADM object type called FacebookUserType. Since it is closed, its instances will contain only what is specified in the type definition. The first four fields are traditional typed name/value pairs. The friend-ids field is an unordered list of 32-bit integers. The employment field is an ordered list of instances of another named object type, EmploymentType.</p>
678<div class="section">
679<h5><a name="Example"></a>Example</h5>
680
681<div>
682<div>
<pre class="source">create type FacebookUserType as closed {
    &quot;id&quot; : int32,
    &quot;alias&quot; : string,
    &quot;name&quot; : string,
    &quot;user-since&quot; : datetime,
    &quot;friend-ids&quot; : {{ int32 }},
    &quot;employment&quot; : [ EmploymentType ]
}
691</pre></div></div>
692
693<p>The next example creates a new ADM object type called FbUserType. Note that the type of the id field is UUID. You need to use this field type if you want to have this field be an autogenerated-PK field. Refer to the Datasets section later for more details.</p></div>
694<div class="section">
695<h5><a name="Example"></a>Example</h5>
696
697<div>
698<div>
<pre class="source">create type FbUserType as closed {
    &quot;id&quot; : uuid,
    &quot;alias&quot; : string,
    &quot;name&quot; : string
}
704</pre></div></div>
705</div></div>
706<div class="section">
707<h4><a name="Datasets"></a>Datasets</h4>
708
709<div>
710<div>
<pre class="source">DatasetSpecification ::= &quot;internal&quot;? &quot;dataset&quot; QualifiedName &quot;(&quot; QualifiedName &quot;)&quot; IfNotExists
                         PrimaryKey ( &quot;on&quot; Identifier )? ( &quot;hints&quot; Properties )?
                         ( &quot;using&quot; &quot;compaction&quot; &quot;policy&quot; CompactionPolicy ( Configuration )? )?
                         ( &quot;with filter on&quot; Identifier )?
                       | &quot;external&quot; &quot;dataset&quot; QualifiedName &quot;(&quot; QualifiedName &quot;)&quot; IfNotExists
                         &quot;using&quot; AdapterName Configuration ( &quot;hints&quot; Properties )?
                         ( &quot;using&quot; &quot;compaction&quot; &quot;policy&quot; CompactionPolicy ( Configuration )? )?
AdapterName ::= Identifier
Configuration ::= &quot;(&quot; ( KeyValuePair ( &quot;,&quot; KeyValuePair )* )? &quot;)&quot;
KeyValuePair ::= &quot;(&quot; StringLiteral &quot;=&quot; StringLiteral &quot;)&quot;
Properties ::= ( &quot;(&quot; Property ( &quot;,&quot; Property )* &quot;)&quot; )?
Property ::= Identifier &quot;=&quot; ( StringLiteral | IntegerLiteral )
FunctionSignature ::= FunctionOrTypeName &quot;@&quot; IntegerLiteral
PrimaryKey ::= &quot;primary&quot; &quot;key&quot; NestedField ( &quot;,&quot; NestedField )* ( &quot;autogenerated&quot; )?
CompactionPolicy ::= Identifier
PrimaryKey ::= &quot;primary&quot; &quot;key&quot; Identifier ( &quot;,&quot; Identifier )* ( &quot;autogenerated&quot; )?
727</pre></div></div>
728
<p>The create dataset statement is used to create a new dataset. Datasets are named, unordered collections of ADM object instances; they are where data lives persistently and are the targets for queries in AsterixDB. Datasets are typed, and AsterixDB will ensure that their contents conform to their type definitions. An Internal dataset (the default) is a dataset that is stored in and managed by AsterixDB. It must have a specified unique primary key that can be used to partition data across nodes of an AsterixDB cluster. The primary key is also used in secondary indexes to uniquely identify the indexed primary data objects. Random primary key (UUID) values can be auto-generated by declaring the field to be UUID and putting &#x201c;autogenerated&#x201d; after the &#x201c;primary key&#x201d; identifier. In this case, values for the auto-generated PK field should not be provided by the user since they will be auto-generated by AsterixDB. Optionally, a filter can be created on a field to further optimize range queries with predicates on the filter&#x2019;s field. (Refer to <a href="filters.html">Filter-Based LSM Index Acceleration</a> for more information about filters.)</p>
<p>An External dataset is stored outside of AsterixDB (currently datasets in HDFS or on the local filesystem(s) of the cluster&#x2019;s nodes are supported). External dataset support allows AQL queries to treat external data as though it were stored in AsterixDB, making it possible to query &#x201c;legacy&#x201d; file data (e.g., Hive data) without having to physically import it into AsterixDB. For an external dataset, an appropriate adapter must be selected to handle the nature of the desired external data. (See the <a href="externaldata.html">guide to external data</a> for more information on the available adapters.)</p>
<p>When creating a dataset, it is possible to choose a merge policy that controls which of the underlying LSM storage components are merged. Currently, AsterixDB provides four different merge policies that can be configured per dataset: no-merge, constant, prefix, and correlated-prefix. The no-merge policy simply never merges disk components. The constant policy merges disk components when the number of components reaches some constant number k, which can be configured by the user. The prefix policy relies on component sizes and the number of components to decide which components to merge. Specifically, it works by first trying to identify the smallest ordered (oldest to newest) sequence of components such that the sequence does not contain a single component that exceeds some threshold size M and that either the sum of the components&#x2019; sizes exceeds M or the number of components in the sequence exceeds another threshold C. If such a sequence of components exists, then the components in the sequence are merged together to form a single component. Finally, the correlated-prefix policy is similar to the prefix policy, but it delegates the decision of merging the disk components of all the indexes in a dataset to the primary index. When the policy decides that the primary index needs to be merged (using the same decision criteria as for the prefix policy), it will issue successive merge requests on behalf of all other indexes associated with the same dataset. The default policy for AsterixDB is the prefix policy, except when there is a filter on a dataset, in which case the preferred policy is correlated-prefix.</p>
<p>The following example creates an internal dataset for storing FacebookUserType objects. It specifies that their id field is their primary key.</p>
733<div class="section">
734<h5><a name="Example"></a>Example</h5>
735
736<div>
737<div>
738<pre class="source">create internal dataset FacebookUsers(FacebookUserType) primary key id;
739</pre></div></div>
740
<p>The following example creates an internal dataset for storing FbUserType objects. It specifies that their id field is their primary key. It also specifies that the id field is an auto-generated field, meaning that a randomly generated UUID value will be assigned to each object by the system. (A user should therefore not provide a value for this field.) Note that the id field must be of type UUID.</p></div>
742<div class="section">
743<h5><a name="Example"></a>Example</h5>
744
745<div>
746<div>
747<pre class="source">create internal dataset FbMsgs(FbUserType) primary key id autogenerated;
748</pre></div></div>
749
750<p>The next example creates an external dataset for storing LineitemType objects. The choice of the <tt>hdfs</tt> adapter means that its data will reside in HDFS. The create statement provides parameters used by the hdfs adapter: the URL and path needed to locate the data in HDFS and a description of the data format.</p></div>
751<div class="section">
752<h5><a name="Example"></a>Example</h5>
753
754<div>
755<div>
<pre class="source">create external dataset Lineitem(LineitemType) using hdfs (
  (&quot;hdfs&quot;=&quot;hdfs://HOST:PORT&quot;),
  (&quot;path&quot;=&quot;HDFS_PATH&quot;),
  (&quot;input-format&quot;=&quot;text-input-format&quot;),
  (&quot;format&quot;=&quot;delimited-text&quot;),
  (&quot;delimiter&quot;=&quot;|&quot;));
762</pre></div></div>
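
<p>The optional merge policy and filter clauses of the grammar apply to internal datasets. The following is a minimal sketch of how they might be combined, not an example from the original guide: the dataset SensorReadings, its type SensorReadingType, and the fields reading-id and read-time are hypothetical names, and the policy is named without any configuration parameters. It pairs a filter with the correlated-prefix policy, as recommended above for filtered datasets.</p>

<div>
<div>
<pre class="source">create dataset SensorReadings(SensorReadingType)
primary key reading-id
using compaction policy correlated-prefix
with filter on read-time;
</pre></div></div>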
763</div></div>
764<div class="section">
765<h4><a name="Indices"></a>Indices</h4>
766
767<div>
768<div>
769<pre class="source">IndexSpecification ::= &quot;index&quot; Identifier IfNotExists &quot;on&quot; QualifiedName
770 &quot;(&quot; ( IndexField ) ( &quot;,&quot; IndexField )* &quot;)&quot; ( &quot;type&quot; IndexType )? ( &quot;enforced&quot; )?
771IndexType ::= &quot;btree&quot;
772 | &quot;rtree&quot;
773 | &quot;keyword&quot;
774 | &quot;ngram&quot; &quot;(&quot; IntegerLiteral &quot;)&quot;
775 | &quot;fulltext&quot;
776</pre></div></div>
777
<p>The create index statement creates a secondary index on one or more fields of a specified dataset. Supported index types include <tt>btree</tt> for totally ordered datatypes, <tt>rtree</tt> for spatial data, and <tt>keyword</tt>, <tt>ngram</tt>, and <tt>fulltext</tt> for textual (string) data. An index can be created on a nested field (or fields) by providing a valid path expression as an index field identifier. An index field is not required to be part of the datatype associated with a dataset if that datatype is declared as open, the field&#x2019;s type is provided along with its name, and the <tt>enforced</tt> keyword is specified at the end of the index definition. <tt>Enforcing</tt> an open field will introduce a check that will make sure that the actual type of an indexed field (if the field exists in the object) always matches this specified (open) field type.</p>
779<p>The following example creates a btree index called fbAuthorIdx on the author-id field of the FacebookMessages dataset. This index can be useful for accelerating exact-match queries, range search queries, and joins involving the author-id field.</p>
780<div class="section">
781<h5><a name="Example"></a>Example</h5>
782
783<div>
784<div>
785<pre class="source">create index fbAuthorIdx on FacebookMessages(author-id) type btree;
786</pre></div></div>
787
<p>The following example creates an enforced btree index called fbSendTimeIdx on the open send-time field of the FacebookMessages dataset, declaring the field&#x2019;s type to be datetime. This index can be useful for accelerating exact-match queries, range search queries, and joins involving the send-time field.</p></div>
789<div class="section">
790<h5><a name="Example"></a>Example</h5>
791
792<div>
793<div>
794<pre class="source">create index fbSendTimeIdx on FacebookMessages(send-time:datetime) type btree enforced;
795</pre></div></div>
796
797<p>The following example creates a btree index called twUserScrNameIdx on the screen-name field, which is a nested field of the user field in the TweetMessages dataset. This index can be useful for accelerating exact-match queries, range search queries, and joins involving the screen-name field.</p></div>
798<div class="section">
799<h5><a name="Example"></a>Example</h5>
800
801<div>
802<div>
803<pre class="source">create index twUserScrNameIdx on TweetMessages(user.screen-name) type btree;
804</pre></div></div>
805
<p>The following example creates an rtree index called fbSenderLocIndex on the sender-location field of the FacebookMessages dataset. This index can be useful for accelerating queries that use the <a href="functions.html#spatial-intersect"><tt>spatial-intersect</tt> function</a> in a predicate involving the sender-location field.</p></div>
807<div class="section">
808<h5><a name="Example"></a>Example</h5>
809
810<div>
811<div>
812<pre class="source">create index fbSenderLocIndex on FacebookMessages(sender-location) type rtree;
813</pre></div></div>
814
<p>The following example creates a 3-gram index called fbUserIdx on the name field of the FacebookUsers dataset. This index can be used to accelerate some similarity or substring matching queries on the name field. For details refer to the <a href="similarity.html#NGram_Index">document on similarity queries</a>.</p></div>
816<div class="section">
817<h5><a name="Example"></a>Example</h5>
818
819<div>
820<div>
821<pre class="source">create index fbUserIdx on FacebookUsers(name) type ngram(3);
822</pre></div></div>
823
824<p>The following example creates a keyword index called fbMessageIdx on the message field of the FacebookMessages dataset. This keyword index can be used to optimize queries with token-based similarity predicates on the message field. For details refer to the <a href="similarity.html#Keyword_Index">document on similarity queries</a>.</p></div>
825<div class="section">
826<h5><a name="Example"></a>Example</h5>
827
828<div>
829<div>
830<pre class="source">create index fbMessageIdx on FacebookMessages(message) type keyword;
831</pre></div></div>
832
833<p>The following example creates a full-text index called fbMessageIdx on the message field of the FacebookMessages dataset. This full-text index can be used to optimize queries with full-text search predicates on the message field. For details refer to the <a href="fulltext.html#toc">document on full-text queries</a>.</p></div>
834<div class="section">
835<h5><a name="Example"></a>Example</h5>
836
837<div>
838<div>
839<pre class="source">create index fbMessageIdx on FacebookMessages(message) type fulltext;
840</pre></div></div>
841</div></div>
842<div class="section">
843<h4><a name="Functions"></a>Functions</h4>
844<p>The create function statement creates a named function that can then be used and reused in AQL queries. The body of a function can be any AQL expression involving the function&#x2019;s parameters.</p>
845
846<div>
847<div>
848<pre class="source">FunctionSpecification ::= &quot;function&quot; FunctionOrTypeName IfNotExists ParameterList &quot;{&quot; Expression &quot;}&quot;
849</pre></div></div>
850
851<p>The following is a very simple example of a create function statement. It differs from the declare function example shown previously in that it results in a function that is persistently registered by name in the specified dataverse.</p>
852<div class="section">
853<h5><a name="Example"></a>Example</h5>
854
855<div>
856<div>
<pre class="source">create function add($a, $b) {
    $a + $b
};
860</pre></div></div>
861</div></div>
862<div class="section">
863<h4><a name="Removal"></a>Removal</h4>
864
865<div>
866<div>
867<pre class="source">DropStatement ::= &quot;drop&quot; ( &quot;dataverse&quot; Identifier IfExists
868 | &quot;type&quot; FunctionOrTypeName IfExists
869 | &quot;dataset&quot; QualifiedName IfExists
870 | &quot;index&quot; DoubleQualifiedName IfExists
871 | &quot;function&quot; FunctionSignature IfExists )
872IfExists ::= ( &quot;if&quot; &quot;exists&quot; )?
873</pre></div></div>
874
875<p>The drop statement in AQL is the inverse of the create statement. It can be used to drop dataverses, datatypes, datasets, indexes, and functions.</p>
876<p>The following examples illustrate uses of the drop statement.</p>
877<div class="section">
878<h5><a name="Example"></a>Example</h5>
879
880<div>
881<div>
<pre class="source">drop dataset FacebookUsers if exists;

drop index FacebookMessages.fbSenderLocIndex;

drop type FacebookUserType;

drop dataverse TinySocial;

drop function add@2;
891</pre></div></div>
892</div></div></div>
893<div class="section">
894<h3><a name="Import.2FExport_Statements"></a>Import/Export Statements</h3>
895
896<div>
897<div>
898<pre class="source">LoadStatement ::= &quot;load&quot; &quot;dataset&quot; QualifiedName &quot;using&quot; AdapterName Configuration ( &quot;pre-sorted&quot; )?
899</pre></div></div>
900
901<p>The load statement is used to initially populate a dataset via bulk loading of data from an external file. An appropriate adapter must be selected to handle the nature of the desired external data. The load statement accepts the same adapters and the same parameters as external datasets. (See the <a href="externaldata.html">guide to external data</a> for more information on the available adapters.) If a dataset has an auto-generated primary key field, a file to be imported should not include that field in it.</p>
902<p>The following example shows how to bulk load the FacebookUsers dataset from an external file containing data that has been prepared in ADM format.</p>
903<div class="section">
904<div class="section">
905<h5><a name="Example"></a>Example</h5>
906
907<div>
908<div>
<pre class="source">load dataset FacebookUsers using localfs
((&quot;path&quot;=&quot;localhost:///Users/zuck/AsterixDB/load/fbu.adm&quot;),(&quot;format&quot;=&quot;adm&quot;));
911</pre></div></div>
912</div></div></div>
913<div class="section">
914<h3><a name="Modification_Statements"></a>Modification Statements</h3>
915<div class="section">
916<h4><a name="Insert"></a>Insert</h4>
917
918<div>
919<div>
920<pre class="source">InsertStatement ::= &quot;insert&quot; &quot;into&quot; &quot;dataset&quot; QualifiedName ( &quot;as&quot; Variable )? Query ( &quot;returning&quot; Query )?
921</pre></div></div>
922
<p>The AQL insert statement is used to insert data into a dataset. The data to be inserted comes from an AQL query expression. The expression can be as simple as a constant expression, or in general it can be any legal AQL query. Inserts in AsterixDB are processed transactionally, with the scope of each insert transaction being the insertion of a single object plus its affiliated secondary index entries (if any). If the query part of an insert returns a single object, then the insert statement itself will be a single, atomic transaction. If the query part returns multiple objects, then each object inserted will be handled independently as a transaction. If a dataset has an auto-generated primary key field, an insert statement should not include a value for that field. (The system will automatically extend the provided object with this additional field and a corresponding value.) The optional &#x201c;as Variable&#x201d; provides a variable binding for the inserted objects, which can be used in the &#x201c;returning&#x201d; clause. The optional &#x201c;returning Query&#x201d; allows users to run simple queries/functions on the objects returned by the insert. This query cannot refer to any datasets.</p>
924<p>The following example illustrates a query-based insertion.</p>
925<div class="section">
926<h5><a name="Example"></a>Example</h5>
927
928<div>
929<div>
930<pre class="source">insert into dataset UsersCopy as $inserted (for $user in dataset FacebookUsers return $user ) returning $inserted.screen-name
931</pre></div></div>
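
<p>The simplest case mentioned above, a constant expression, could look like the following sketch. The field values are made up for illustration, and the object is assumed to supply every field of the closed FacebookUserType shown earlier, with an empty employment list. Since the query part yields a single object, the statement would run as one atomic transaction.</p>

<div>
<div>
<pre class="source">insert into dataset FacebookUsers (
    { &quot;id&quot;: 11,
      &quot;alias&quot;: &quot;John&quot;,
      &quot;name&quot;: &quot;JohnDoe&quot;,
      &quot;user-since&quot;: datetime(&quot;2010-08-15T08:10:00&quot;),
      &quot;friend-ids&quot;: {{ 5, 7 }},
      &quot;employment&quot;: [ ] }
);
</pre></div></div>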
932</div></div>
933<div class="section">
934<h4><a name="Delete"></a>Delete</h4>
935
936<div>
937<div>
938<pre class="source">DeleteStatement ::= &quot;delete&quot; Variable &quot;from&quot; &quot;dataset&quot; QualifiedName ( &quot;where&quot; Expression )?
939</pre></div></div>
940
941<p>The AQL delete statement is used to delete data from a target dataset. The data to be deleted is identified by a boolean expression involving the variable bound to the target dataset in the delete statement. Deletes in AsterixDB are processed transactionally, with the scope of each delete transaction being the deletion of a single object plus its affiliated secondary index entries (if any). If the boolean expression for a delete identifies a single object, then the delete statement itself will be a single, atomic transaction. If the expression identifies multiple objects, then each object deleted will be handled independently as a transaction.</p>
942<p>The following example illustrates a single-object deletion.</p>
943<div class="section">
944<h5><a name="Example"></a>Example</h5>
945
946<div>
947<div>
948<pre class="source">delete $user from dataset FacebookUsers where $user.id = 8;
949</pre></div></div>
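
<p>For contrast, a predicate that can match several objects is sketched below; the threshold is an arbitrary illustrative value. As described above, each matching object would then be deleted in its own transaction rather than all together.</p>

<div>
<div>
<pre class="source">delete $user from dataset FacebookUsers where $user.id &gt;= 8;
</pre></div></div>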
950</div></div>
951<div class="section">
952<h4><a name="Upsert"></a>Upsert</h4>
953
954<div>
955<div>
956<pre class="source">UpsertStatement ::= &quot;upsert&quot; &quot;into&quot; &quot;dataset&quot; QualifiedName Query
957</pre></div></div>
958
<p>The AQL upsert statement couples a delete (if the object is found) with an insert of data into a dataset. The data to be upserted comes from an AQL query expression. The expression can be as simple as a constant expression, or in general it can be any legal AQL query. Upserts in AsterixDB are processed transactionally, with the scope of each upsert transaction being the upsertion (deletion if found + insertion) of a single object plus its affiliated secondary index entries (if any). If the query part of an upsert returns a single object, then the upsert statement itself will be a single, atomic transaction. If the query part returns multiple objects, then each object upserted will be handled independently as a transaction.</p>
960<p>The following example illustrates a query-based upsertion.</p>
961<div class="section">
962<h5><a name="Example"></a>Example</h5>
963
964<div>
965<div>
966<pre class="source">upsert into dataset Users (for $user in dataset FacebookUsers return $user)
967</pre></div></div>
968
969<p>We close this guide to AQL with one final example of a query expression.</p></div>
970<div class="section">
971<h5><a name="Example"></a>Example</h5>
972
973<div>
974<div>
<pre class="source">for $praise in {{ &quot;great&quot;, &quot;brilliant&quot;, &quot;awesome&quot; }}
return
   string-concat([&quot;AsterixDB is &quot;, $praise])
978</pre></div></div></div></div></div></div>
979 </div>
980 </div>
981 </div>
982 <hr/>
983 <footer>
984 <div class="container-fluid">
985 <div class="row-fluid">
986<div class="row-fluid">Apache AsterixDB, AsterixDB, Apache, the Apache
987 feather logo, and the Apache AsterixDB project logo are either
988 registered trademarks or trademarks of The Apache Software
989 Foundation in the United States and other countries.
990 All other marks mentioned may be trademarks or registered
991 trademarks of their respective owners.
992 </div>
993 </div>
994 </div>
995 </footer>
996 </body>
997</html>