<!DOCTYPE html>
<!--
 | Generated by Apache Maven Doxia at 2017-09-14
 | Rendered using Apache Maven Fluido Skin 1.3.0
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
 <meta charset="UTF-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
 <meta name="Date-Revision-yyyymmdd" content="20170914" />
 <meta http-equiv="Content-Language" content="en" />
 <title>AsterixDB &#x2013; AsterixDB 101: An ADM and AQL Primer</title>
 <link rel="stylesheet" href="../css/apache-maven-fluido-1.3.0.min.css" />
 <link rel="stylesheet" href="../css/site.css" />
 <link rel="stylesheet" href="../css/print.css" media="print" />

 <script type="text/javascript" src="../js/apache-maven-fluido-1.3.0.min.js"></script>

<script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
 (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
 m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
 })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

 ga('create', 'UA-41536543-1', 'uci.edu');
 ga('send', 'pageview');</script>

 </head>
 <body class="topBarDisabled">

 <div class="container-fluid">
 <div id="banner">
 <div class="pull-left">
 <a href=".././" id="bannerLeft">
 <img src="../images/asterixlogo.png" alt="AsterixDB"/>
 </a>
 </div>
 <div class="pull-right"> </div>
 <div class="clear"><hr/></div>
 </div>

 <div id="breadcrumbs">
 <ul class="breadcrumb">

 <li id="publishDate">Last Published: 2017-09-14</li>

 <li id="projectVersion" class="pull-right">Version: 0.9.2</li>

 <li class="divider pull-right">|</li>

 <li class="pull-right"> <a href="../index.html" title="Documentation Home">
 Documentation Home</a>
 </li>

 </ul>
 </div>

 <div class="row-fluid">
 <div id="leftColumn" class="span3">
 <div class="well sidebar-nav">

 <ul class="nav nav-list">
 <li class="nav-header">Get Started - Installation</li>

 <li>
 <a href="../ncservice.html" title="Option 1: using NCService">
 <i class="none"></i>
 Option 1: using NCService</a>
 </li>

 <li>
 <a href="../ansible.html" title="Option 2: using Ansible">
 <i class="none"></i>
 Option 2: using Ansible</a>
 </li>

 <li>
 <a href="../aws.html" title="Option 3: using Amazon Web Services">
 <i class="none"></i>
 Option 3: using Amazon Web Services</a>
 </li>

 <li>
 <a href="../yarn.html" title="Option 4: using YARN">
 <i class="none"></i>
 Option 4: using YARN</a>
 </li>

 <li>
 <a href="../install.html" title="Option 5: using Managix (deprecated)">
 <i class="none"></i>
 Option 5: using Managix (deprecated)</a>
 </li>
 <li class="nav-header">AsterixDB Primer</li>

 <li>
 <a href="../sqlpp/primer-sqlpp.html" title="Option 1: using SQL++">
 <i class="none"></i>
 Option 1: using SQL++</a>
 </li>

 <li class="active">
 <a href="#"><i class="none"></i>Option 2: using AQL</a>
 </li>
 <li class="nav-header">Data Model</li>

 <li>
 <a href="../datamodel.html" title="The Asterix Data Model">
 <i class="none"></i>
 The Asterix Data Model</a>
 </li>
 <li class="nav-header">Queries - SQL++</li>

 <li>
 <a href="../sqlpp/manual.html" title="The SQL++ Query Language">
 <i class="none"></i>
 The SQL++ Query Language</a>
 </li>

 <li>
 <a href="../sqlpp/builtins.html" title="Builtin Functions">
 <i class="none"></i>
 Builtin Functions</a>
 </li>
 <li class="nav-header">Queries - AQL</li>

 <li>
 <a href="../aql/manual.html" title="The Asterix Query Language (AQL)">
 <i class="none"></i>
 The Asterix Query Language (AQL)</a>
 </li>

 <li>
 <a href="../aql/builtins.html" title="Builtin Functions">
 <i class="none"></i>
 Builtin Functions</a>
 </li>
 <li class="nav-header">API/SDK</li>

 <li>
 <a href="../api.html" title="HTTP API">
 <i class="none"></i>
 HTTP API</a>
 </li>

 <li>
 <a href="../csv.html" title="CSV Output">
 <i class="none"></i>
 CSV Output</a>
 </li>
 <li class="nav-header">Advanced Features</li>

 <li>
 <a href="../aql/fulltext.html" title="Support of Full-text Queries">
 <i class="none"></i>
 Support of Full-text Queries</a>
 </li>

 <li>
 <a href="../aql/externaldata.html" title="Accessing External Data">
 <i class="none"></i>
 Accessing External Data</a>
 </li>

 <li>
 <a href="../feeds/tutorial.html" title="Support for Data Ingestion">
 <i class="none"></i>
 Support for Data Ingestion</a>
 </li>

 <li>
 <a href="../udf.html" title="User Defined Functions">
 <i class="none"></i>
 User Defined Functions</a>
 </li>

 <li>
 <a href="../aql/filters.html" title="Filter-Based LSM Index Acceleration">
 <i class="none"></i>
 Filter-Based LSM Index Acceleration</a>
 </li>

 <li>
 <a href="../aql/similarity.html" title="Support of Similarity Queries">
 <i class="none"></i>
 Support of Similarity Queries</a>
 </li>
 </ul>

 <hr class="divider" />

 <div id="poweredBy">
 <div class="clear"></div>
 <div class="clear"></div>
 <div class="clear"></div>
 <a href=".././" title="AsterixDB" class="builtBy">
 <img class="builtBy" alt="AsterixDB" src="../images/asterixlogo.png" />
 </a>
 </div>
 </div>
 </div>

 <div id="bodyColumn" class="span9" >

 <!-- ! Licensed to the Apache Software Foundation (ASF) under one
 ! or more contributor license agreements. See the NOTICE file
 ! distributed with this work for additional information
 ! regarding copyright ownership. The ASF licenses this file
 ! to you under the Apache License, Version 2.0 (the
 ! "License"); you may not use this file except in compliance
 ! with the License. You may obtain a copy of the License at
 !
 ! http://www.apache.org/licenses/LICENSE-2.0
 !
 ! Unless required by applicable law or agreed to in writing,
 ! software distributed under the License is distributed on an
 ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 ! KIND, either express or implied. See the License for the
 ! specific language governing permissions and limitations
 ! under the License.
 ! --><h1>AsterixDB 101: An ADM and AQL Primer</h1>
<div class="section">
<h2><a name="Welcome_to_AsterixDB"></a>Welcome to AsterixDB!</h2>
<p>This document introduces the main features of AsterixDB&#x2019;s data model (ADM) and query language (AQL) by example. The example is a simple scenario involving (synthetic) sample data modeled after data from the social domain. This document describes a set of sample ADM datasets, together with a set of illustrative AQL queries, to introduce you to the &#x201c;AsterixDB user experience&#x201d;. The complete set of steps required to create and load a handful of sample datasets, along with runnable queries and the expected results for each query, is included.</p>
<p>This document assumes that you are at least vaguely familiar with AsterixDB and why you might want to use it. Most importantly, it assumes you already have a running instance of AsterixDB and that you know how to query it using AsterixDB&#x2019;s basic web interface. For more information on these topics, you should go through the steps in <a href="../install.html">Installing Asterix Using Managix</a> before reading this document and make sure that you have a running AsterixDB instance ready to go. To get your feet wet, you should probably start with a simple local installation of AsterixDB on your favorite machine, accepting all of the default settings that Managix offers. Later you can graduate to trying AsterixDB on a cluster, its real intended home (since it targets Big Data). (Note: With the exception of specifying the correct locations where you put the source data for this example, there should be no changes needed in your ADM or AQL statements to run the examples locally and/or to run them on a cluster when you are ready to take that step.)</p>
<p>As you read through this document, you should try each step for yourself on your own AsterixDB instance. Once you have reached the end, you will be fully armed and dangerous, with all the basic AsterixDB knowledge that you&#x2019;ll need to start down the path of modeling, storing, and querying your own semistructured data.</p></div>
<div class="section">
<h2><a name="ADM:_Modeling_Semistructed_Data_in_AsterixDB"></a>ADM: Modeling Semistructured Data in AsterixDB</h2>
<p>In this section you will learn all about modeling Big Data using ADM, the data model of the AsterixDB BDMS.</p>
<div class="section">
<h3><a name="Dataverses_Datatypes_and_Datasets"></a>Dataverses, Datatypes, and Datasets</h3>
<p>The top-level organizing concept in the AsterixDB world is the <i>dataverse</i>. A dataverse, short for &#x201c;data universe&#x201d;, is a place (similar to a database in a relational DBMS) in which to create and manage the types, datasets, functions, and other artifacts for a given AsterixDB application. When you start using an AsterixDB instance for the first time, it starts out &#x201c;empty&#x201d;; it contains no data other than the AsterixDB system catalogs (which live in a special dataverse called the Metadata dataverse). To store your data in AsterixDB, you will first create a dataverse and then use it for the <i>datatypes</i> and <i>datasets</i> for managing your own data. A datatype tells AsterixDB what you know (or more accurately, what you want it to know) a priori about one of the kinds of data instances that you want AsterixDB to hold for you. A dataset is a collection of data instances of a datatype, and AsterixDB makes sure that the data instances that you put in it conform to its specified type. Since AsterixDB targets semistructured data, you can use <i>open</i> datatypes and tell it as little or as much as you wish about your data up front; the more you tell it up front, the less information it will have to store repeatedly in the individual data instances that you give it. Instances of open datatypes are permitted to have additional content, beyond what the datatype says, as long as they at least contain the information prescribed by the datatype definition. Open typing allows data to vary from one instance to another and it leaves wiggle room for application evolution in terms of what might need to be stored in the future. If you want to restrict data instances in a dataset to have only what the datatype says, and nothing extra, you can define a <i>closed</i> datatype for that dataset and AsterixDB will keep users from storing objects that have extra data in them. Datatypes are open by default unless you tell AsterixDB otherwise. Let&#x2019;s put these concepts to work.</p>
<p>Our little sample scenario involves information about users of two hypothetical social networks, Gleambook and Chirp, and their messages. We&#x2019;ll start by defining a dataverse called &#x201c;TinySocial&#x201d; to hold our datatypes and datasets. The AsterixDB data model (ADM) is essentially a superset of JSON: it&#x2019;s what you get by extending JSON with more data types and additional data modeling constructs borrowed from object databases. The following shows how we can create the TinySocial dataverse plus a set of ADM types for modeling Chirp users, their Chirps, Gleambook users, those users&#x2019; employment information, and their messages. (Note: Keep in mind that this is just a tiny and somewhat silly example intended for illustrating some of the key features of AsterixDB. :-))</p>

<div class="source">
<div class="source">
<pre>drop dataverse TinySocial if exists;
create dataverse TinySocial;
use dataverse TinySocial;

create type ChirpUserType as {
    screenName: string,
    lang: string,
    friendsCount: int,
    statusesCount: int,
    name: string,
    followersCount: int
};

create type ChirpMessageType as closed {
    chirpId: string,
    user: ChirpUserType,
    senderLocation: point?,
    sendTime: datetime,
    referredTopics: {{ string }},
    messageText: string
};

create type EmploymentType as {
    organizationName: string,
    startDate: date,
    endDate: date?
};

create type GleambookUserType as {
    id: int,
    alias: string,
    name: string,
    userSince: datetime,
    friendIds: {{ int }},
    employment: [EmploymentType]
};

create type GleambookMessageType as {
    messageId: int,
    authorId: int,
    inResponseTo: int?,
    senderLocation: point?,
    message: string
};
</pre></div></div>
<p>The first three lines above tell AsterixDB to drop the old TinySocial dataverse, if one already exists, and then to create a brand new one and make it the focus of the statements that follow. The first <i>create type</i> statement creates a datatype for holding information about Chirp users. It is an object type with a mix of integer and string data, very much like a (flat) relational tuple. The indicated fields are all mandatory, but because the type is open, additional fields are welcome. The second statement creates a datatype for Chirp messages; this shows how to specify a closed type. Interestingly (based on one of Chirp&#x2019;s APIs), each Chirp message actually embeds an instance of the sending user&#x2019;s information (current as of when the message was sent), so this is an example of a nested object in ADM. Chirp messages can optionally contain the sender&#x2019;s location, which is modeled via the senderLocation field of spatial type <i>point</i>; the question mark following the field type indicates its optionality. An optional field is like a nullable field in SQL: it may be present or missing, but when it&#x2019;s present, its value&#x2019;s data type will conform to the datatype&#x2019;s specification. The sendTime field illustrates the use of a temporal primitive type, <i>datetime</i>. Lastly, the referredTopics field illustrates another way that ADM is richer than the relational model; this field holds a bag (<i>a.k.a.</i> an unordered list) of strings. Since the overall datatype definition for Chirp messages says &#x201c;closed&#x201d;, the fields that it lists are the only fields that instances of this type will be allowed to contain. The next two <i>create type</i> statements create an object type for holding information about one component of the employment history of a Gleambook user and then an object type for holding the user information itself. The Gleambook user type highlights a few additional ADM data model features. Its friendIds field is a bag of integers, presumably the Gleambook user ids for this user&#x2019;s friends, and its employment field is an ordered list of employment objects. The final <i>create type</i> statement defines a type for handling the content of a Gleambook message in our hypothetical social data storage scenario.</p>
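<p>The difference between open and closed types can be sketched outside of AsterixDB. The Python fragment below is only an illustration under stated assumptions: the conforms helper and its parameters are hypothetical names invented for this sketch, it checks only which fields are present (not their types), and it does not reflect how AsterixDB itself validates objects.</p>

<div class="source">
<div class="source">
<pre>
```python
# Hypothetical sketch of open vs. closed type checks; not AsterixDB code.


def conforms(record, required, optional=(), closed=False):
    """Return True if record has the fields a type declares.

    required: field names that must be present
    optional: field names that may be absent (declared with '?')
    closed:   when True, fields not declared by the type are rejected
    """
    declared = set(required) | set(optional)
    if not set(required).issubset(record):
        return False  # a mandatory field is missing
    if closed and not set(record).issubset(declared):
        return False  # closed types forbid undeclared fields
    return True


# Field lists taken from EmploymentType above.
employment_required = ["organizationName", "startDate"]
employment_optional = ["endDate"]

job = {"organizationName": "Codetechno", "startDate": "2006-08-06"}
job_with_extra = dict(job, title="Engineer")  # 'title' is a made-up extra field

print(conforms(job_with_extra, employment_required, employment_optional))
print(conforms(job_with_extra, employment_required, employment_optional, closed=True))
```
</pre></div></div>
<p>With the open check the object carrying the extra field is accepted, while the closed check rejects it; an object missing a mandatory field fails either way.</p>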
<p>Before going on, we need to once again emphasize the idea that AsterixDB is aimed at storing and querying not just Big Data, but Big <i>Semistructured</i> Data. This means that most of the fields listed in the <i>create type</i> statements above could have been omitted without changing anything other than the resulting size of stored data instances on disk. AsterixDB stores its information about the fields defined a priori as separate metadata, whereas the information about other fields that are &#x201c;just there&#x201d; in instances of open datatypes is stored with each instance, making for more bits on disk and longer times for operations affected by data size (e.g., dataset scans). The only fields that <i>must</i> be specified a priori are the primary key fields of each dataset.</p></div>
<div class="section">
<h3><a name="Creating_Datasets_and_Indexes"></a>Creating Datasets and Indexes</h3>
<p>Now that we have defined our datatypes, we can move on and create datasets to store the actual data. (If we wanted to, we could even have several named datasets based on any one of these datatypes.) We can do this as follows, utilizing the DDL capabilities of AsterixDB.</p>

<div class="source">
<div class="source">
<pre>use dataverse TinySocial;

create dataset GleambookUsers(GleambookUserType)
    primary key id;

create dataset GleambookMessages(GleambookMessageType)
    primary key messageId;

create dataset ChirpUsers(ChirpUserType)
    primary key screenName;

create dataset ChirpMessages(ChirpMessageType)
    primary key chirpId
    hints(cardinality=100);

create index gbUserSinceIdx on GleambookUsers(userSince);
create index gbAuthorIdx on GleambookMessages(authorId) type btree;
create index gbSenderLocIndex on GleambookMessages(senderLocation) type rtree;
create index gbMessageIdx on GleambookMessages(message) type keyword;

for $ds in dataset Metadata.Dataset return $ds;
for $ix in dataset Metadata.Index return $ix;
</pre></div></div>
<p>The DDL statements above create four datasets for holding our social data in the TinySocial dataverse: GleambookUsers, GleambookMessages, ChirpUsers, and ChirpMessages. The first <i>create dataset</i> statement creates the GleambookUsers dataset. It specifies that this dataset will store data instances conforming to GleambookUserType and that it has a primary key which is the id field of each instance. The primary key information is used by AsterixDB to uniquely identify instances for the purpose of later lookup and for use in secondary indexes. Each AsterixDB dataset is stored (and indexed) in the form of a B+ tree on primary key; secondary indexes point to their indexed data by primary key. In AsterixDB clusters, the primary key is also used to hash-partition (<i>a.k.a.</i> shard) the dataset across the nodes of the cluster. The next three <i>create dataset</i> statements are similar. The last one illustrates an optional clause for providing useful hints to AsterixDB. In this case, the hint tells AsterixDB that the dataset definer is anticipating that the ChirpMessages dataset will contain roughly 100 objects; knowing this can help AsterixDB to more efficiently manage and query this dataset. (AsterixDB does not yet gather and maintain data statistics; it will currently, arbitrarily, assume a cardinality of one million objects per dataset in the absence of such an optional definition-time hint.)</p>
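<p>To make the idea of hash-partitioning on the primary key concrete, here is a minimal Python sketch. It rests on assumptions: the function names partition_for and shard are hypothetical, CRC32 is just a stand-in hash picked for the sketch (this document does not say which hash function AsterixDB uses), and the sample aliases are made up.</p>

<div class="source">
<div class="source">
<pre>
```python
import zlib


def partition_for(primary_key, num_partitions):
    """Map a primary key to a partition number from 0 to num_partitions - 1."""
    # CRC32 is only a stand-in hash chosen for this sketch.
    digest = zlib.crc32(repr(primary_key).encode("utf-8"))
    return digest % num_partitions


def shard(records, key_field, num_partitions):
    """Place each record in exactly one partition, chosen by its primary key."""
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        target = partition_for(record[key_field], num_partitions)
        partitions[target].append(record)
    return partitions


# Ten made-up users keyed by id, spread over a hypothetical 3-node cluster.
users = [{"id": i, "alias": "user" + str(i)} for i in range(1, 11)]
node_partitions = shard(users, "id", 3)

for node, records in enumerate(node_partitions):
    print(node, [record["id"] for record in records])
```
</pre></div></div>
<p>Because the partition is a pure function of the key, a lookup by primary key only needs to visit the one partition that partition_for returns for that key.</p>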
<p>The <i>create dataset</i> statements above are followed by four more DDL statements, each of which creates a secondary index on a field of one of the datasets. The first one indexes the GleambookUsers dataset on its userSince field. This index will be a B+ tree index; its type is unspecified and <i>btree</i> is the default type. The other three illustrate how you can explicitly specify the desired type of index. In addition to btree, <i>rtree</i> and inverted <i>keyword</i> indexes are supported by AsterixDB. Indexes can also have composite keys, and more advanced text indexing is available as well (ngram(k), where k is the desired gram length).</p></div>
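<p>The keyword and ngram(k) index types can be pictured as inverted indexes that map tokens to primary keys. The Python sketch below is a rough illustration only: the helper names are hypothetical, the tokenizers are deliberately naive (AsterixDB&#x2019;s own tokenization rules may well differ), and the second message is made up for the example.</p>

<div class="source">
<div class="source">
<pre>
```python
from collections import defaultdict


def keyword_tokens(text):
    """Split text into lower-cased, whitespace-separated words."""
    return text.lower().split()


def ngram_tokens(text, k):
    """Return every run of k consecutive characters in text."""
    text = text.lower()
    return [text[i:i + k] for i in range(len(text) - k + 1)]


def build_inverted_index(records, key_field, text_field, tokenize):
    """Map each token to the set of primary keys whose text contains it."""
    index = defaultdict(set)
    for record in records:
        for token in tokenize(record[text_field]):
            index[token].add(record[key_field])
    return index


messages = [
    {"messageId": 1, "message": " love product-b its shortcut-menu is awesome:)"},
    {"messageId": 2, "message": "speed is good"},  # made-up text for illustration
]

keyword_index = build_inverted_index(messages, "messageId", "message", keyword_tokens)
ngram_index = build_inverted_index(
    messages, "messageId", "message", lambda text: ngram_tokens(text, 3)
)

print(sorted(keyword_index["is"]))
print(ngram_tokens("speed", 3))
```
</pre></div></div>
<p>As in the description above, each index entry holds primary keys rather than the objects themselves, so a match is resolved by a follow-up lookup in the primary index.</p>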
<div class="section">
<h3><a name="Querying_the_Metadata_Dataverse"></a>Querying the Metadata Dataverse</h3>
<p>The last two statements above show how you can use queries in AQL to examine the AsterixDB system catalogs and tell what artifacts you have created. Just as relational DBMSs use their own tables to store their catalogs, AsterixDB uses its own datasets to persist descriptions of its datasets, datatypes, indexes, and so on. Running the first of the two queries above will list all of your newly created datasets, and it will also show you a full list of all the metadata datasets. (You can then explore from there on your own if you are curious.) These last two queries also illustrate one other factoid worth knowing: AsterixDB allows queries to span dataverses by allowing the optional use of fully-qualified dataset names (i.e., <i>dataversename.datasetname</i>) to reference datasets that live in a dataverse other than the one that was named in the most recently executed <i>use dataverse</i> directive.</p></div></div>
<div class="section">
<h2><a name="Loading_Data_Into_AsterixDB"></a>Loading Data Into AsterixDB</h2>
<p>Okay, so far so good. AsterixDB is now ready for data, so let&#x2019;s give it some data to store. Our next task will be to insert some sample data into the four datasets that we just defined. Here we will load a tiny set of objects, defined in ADM format (a superset of JSON), into each dataset. In the boxes below you can see insert statements with a list of the objects to be inserted. The files themselves are also linked. Take a few minutes to look carefully at each of the sample datasets. This will give you a better sense of the nature of the data that we are about to load and query. We should note that ADM format is a textual serialization of what AsterixDB will actually store; when persisted in AsterixDB, the data format will be binary and the data in the predefined fields of the data instances will be stored separately from their associated field name and type metadata.</p>
<p><a href="../data/chu.adm">Chirp Users</a></p>

<div class="source">
<div class="source">
<pre>use dataverse TinySocial;

insert into dataset ChirpUsers
([
    {&quot;screenName&quot;:&quot;NathanGiesen@211&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:18,&quot;statusesCount&quot;:473,&quot;name&quot;:&quot;Nathan Giesen&quot;,&quot;followersCount&quot;:49416},
    {&quot;screenName&quot;:&quot;ColineGeyer@63&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:121,&quot;statusesCount&quot;:362,&quot;name&quot;:&quot;Coline Geyer&quot;,&quot;followersCount&quot;:17159},
    {&quot;screenName&quot;:&quot;NilaMilliron_tw&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:445,&quot;statusesCount&quot;:164,&quot;name&quot;:&quot;Nila Milliron&quot;,&quot;followersCount&quot;:22649},
    {&quot;screenName&quot;:&quot;ChangEwing_573&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:182,&quot;statusesCount&quot;:394,&quot;name&quot;:&quot;Chang Ewing&quot;,&quot;followersCount&quot;:32136}
]);
</pre></div></div>
<p><a href="../data/chm.adm">Chirp Messages</a></p>

<div class="source">
<div class="source">
<pre>use dataverse TinySocial;

insert into dataset ChirpMessages
([
    {&quot;chirpId&quot;:&quot;1&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;NathanGiesen@211&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:39339,&quot;statusesCount&quot;:473,&quot;name&quot;:&quot;Nathan Giesen&quot;,&quot;followersCount&quot;:49416},&quot;senderLocation&quot;:point(&quot;47.44,80.65&quot;),&quot;sendTime&quot;:datetime(&quot;2008-04-26T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;product-z&quot;,&quot;customization&quot;}},&quot;messageText&quot;:&quot; love product-z its customization is good:)&quot;},
    {&quot;chirpId&quot;:&quot;2&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;ColineGeyer@63&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:121,&quot;statusesCount&quot;:362,&quot;name&quot;:&quot;Coline Geyer&quot;,&quot;followersCount&quot;:17159},&quot;senderLocation&quot;:point(&quot;32.84,67.14&quot;),&quot;sendTime&quot;:datetime(&quot;2010-05-13T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;ccast&quot;,&quot;shortcut-menu&quot;}},&quot;messageText&quot;:&quot; like ccast its shortcut-menu is awesome:)&quot;},
    {&quot;chirpId&quot;:&quot;3&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;NathanGiesen@211&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:39339,&quot;statusesCount&quot;:473,&quot;name&quot;:&quot;Nathan Giesen&quot;,&quot;followersCount&quot;:49416},&quot;senderLocation&quot;:point(&quot;29.72,75.8&quot;),&quot;sendTime&quot;:datetime(&quot;2006-11-04T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;product-w&quot;,&quot;speed&quot;}},&quot;messageText&quot;:&quot; like product-w the speed is good:)&quot;},
    {&quot;chirpId&quot;:&quot;4&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;NathanGiesen@211&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:39339,&quot;statusesCount&quot;:473,&quot;name&quot;:&quot;Nathan Giesen&quot;,&quot;followersCount&quot;:49416},&quot;senderLocation&quot;:point(&quot;39.28,70.48&quot;),&quot;sendTime&quot;:datetime(&quot;2011-12-26T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;product-b&quot;,&quot;voice-command&quot;}},&quot;messageText&quot;:&quot; like product-b the voice-command is mind-blowing:)&quot;},
    {&quot;chirpId&quot;:&quot;5&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;NathanGiesen@211&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:39339,&quot;statusesCount&quot;:473,&quot;name&quot;:&quot;Nathan Giesen&quot;,&quot;followersCount&quot;:49416},&quot;senderLocation&quot;:point(&quot;40.09,92.69&quot;),&quot;sendTime&quot;:datetime(&quot;2006-08-04T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;product-w&quot;,&quot;speed&quot;}},&quot;messageText&quot;:&quot; can't stand product-w its speed is terrible:(&quot;},
    {&quot;chirpId&quot;:&quot;6&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;ColineGeyer@63&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:121,&quot;statusesCount&quot;:362,&quot;name&quot;:&quot;Coline Geyer&quot;,&quot;followersCount&quot;:17159},&quot;senderLocation&quot;:point(&quot;47.51,83.99&quot;),&quot;sendTime&quot;:datetime(&quot;2010-05-07T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;x-phone&quot;,&quot;voice-clarity&quot;}},&quot;messageText&quot;:&quot; like x-phone the voice-clarity is good:)&quot;},
    {&quot;chirpId&quot;:&quot;7&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;ChangEwing_573&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:182,&quot;statusesCount&quot;:394,&quot;name&quot;:&quot;Chang Ewing&quot;,&quot;followersCount&quot;:32136},&quot;senderLocation&quot;:point(&quot;36.21,72.6&quot;),&quot;sendTime&quot;:datetime(&quot;2011-08-25T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;product-y&quot;,&quot;platform&quot;}},&quot;messageText&quot;:&quot; like product-y the platform is good&quot;},
    {&quot;chirpId&quot;:&quot;8&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;NathanGiesen@211&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:39339,&quot;statusesCount&quot;:473,&quot;name&quot;:&quot;Nathan Giesen&quot;,&quot;followersCount&quot;:49416},&quot;senderLocation&quot;:point(&quot;46.05,93.34&quot;),&quot;sendTime&quot;:datetime(&quot;2005-10-14T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;product-z&quot;,&quot;shortcut-menu&quot;}},&quot;messageText&quot;:&quot; like product-z the shortcut-menu is awesome:)&quot;},
    {&quot;chirpId&quot;:&quot;9&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;NathanGiesen@211&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:39339,&quot;statusesCount&quot;:473,&quot;name&quot;:&quot;Nathan Giesen&quot;,&quot;followersCount&quot;:49416},&quot;senderLocation&quot;:point(&quot;36.86,74.62&quot;),&quot;sendTime&quot;:datetime(&quot;2012-07-21T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;ccast&quot;,&quot;voicemail-service&quot;}},&quot;messageText&quot;:&quot; love ccast its voicemail-service is awesome&quot;},
    {&quot;chirpId&quot;:&quot;10&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;ColineGeyer@63&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:121,&quot;statusesCount&quot;:362,&quot;name&quot;:&quot;Coline Geyer&quot;,&quot;followersCount&quot;:17159},&quot;senderLocation&quot;:point(&quot;29.15,76.53&quot;),&quot;sendTime&quot;:datetime(&quot;2008-01-26T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;ccast&quot;,&quot;voice-clarity&quot;}},&quot;messageText&quot;:&quot; hate ccast its voice-clarity is OMG:(&quot;},
    {&quot;chirpId&quot;:&quot;11&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;NilaMilliron_tw&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:445,&quot;statusesCount&quot;:164,&quot;name&quot;:&quot;Nila Milliron&quot;,&quot;followersCount&quot;:22649},&quot;senderLocation&quot;:point(&quot;37.59,68.42&quot;),&quot;sendTime&quot;:datetime(&quot;2008-03-09T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;x-phone&quot;,&quot;platform&quot;}},&quot;messageText&quot;:&quot; can't stand x-phone its platform is terrible&quot;},
    {&quot;chirpId&quot;:&quot;12&quot;,&quot;user&quot;:{&quot;screenName&quot;:&quot;OliJackson_512&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;friendsCount&quot;:445,&quot;statusesCount&quot;:164,&quot;name&quot;:&quot;Oli Jackson&quot;,&quot;followersCount&quot;:22649},&quot;senderLocation&quot;:point(&quot;24.82,94.63&quot;),&quot;sendTime&quot;:datetime(&quot;2010-02-13T10:10:00&quot;),&quot;referredTopics&quot;:{{&quot;product-y&quot;,&quot;voice-command&quot;}},&quot;messageText&quot;:&quot; like product-y the voice-command is amazing:)&quot;}
]);
</pre></div></div>
<p><a href="../data/gbu.adm">Gleambook Users</a></p>

<div class="source">
<div class="source">
<pre>use dataverse TinySocial;

insert into dataset GleambookUsers
([
    {&quot;id&quot;:1,&quot;alias&quot;:&quot;Margarita&quot;,&quot;name&quot;:&quot;MargaritaStoddard&quot;,&quot;nickname&quot;:&quot;Mags&quot;,&quot;userSince&quot;:datetime(&quot;2012-08-20T10:10:00&quot;),&quot;friendIds&quot;:{{2,3,6,10}},&quot;employment&quot;:[{&quot;organizationName&quot;:&quot;Codetechno&quot;,&quot;startDate&quot;:date(&quot;2006-08-06&quot;)},{&quot;organizationName&quot;:&quot;geomedia&quot;,&quot;startDate&quot;:date(&quot;2010-06-17&quot;),&quot;endDate&quot;:date(&quot;2010-01-26&quot;)}],&quot;gender&quot;:&quot;F&quot;},
    {&quot;id&quot;:2,&quot;alias&quot;:&quot;Isbel&quot;,&quot;name&quot;:&quot;IsbelDull&quot;,&quot;nickname&quot;:&quot;Izzy&quot;,&quot;userSince&quot;:datetime(&quot;2011-01-22T10:10:00&quot;),&quot;friendIds&quot;:{{1,4}},&quot;employment&quot;:[{&quot;organizationName&quot;:&quot;Hexviafind&quot;,&quot;startDate&quot;:date(&quot;2010-04-27&quot;)}]},
    {&quot;id&quot;:3,&quot;alias&quot;:&quot;Emory&quot;,&quot;name&quot;:&quot;EmoryUnk&quot;,&quot;userSince&quot;:datetime(&quot;2012-07-10T10:10:00&quot;),&quot;friendIds&quot;:{{1,5,8,9}},&quot;employment&quot;:[{&quot;organizationName&quot;:&quot;geomedia&quot;,&quot;startDate&quot;:date(&quot;2010-06-17&quot;),&quot;endDate&quot;:date(&quot;2010-01-26&quot;)}]},
    {&quot;id&quot;:4,&quot;alias&quot;:&quot;Nicholas&quot;,&quot;name&quot;:&quot;NicholasStroh&quot;,&quot;userSince&quot;:datetime(&quot;2010-12-27T10:10:00&quot;),&quot;friendIds&quot;:{{2}},&quot;employment&quot;:[{&quot;organizationName&quot;:&quot;Zamcorporation&quot;,&quot;startDate&quot;:date(&quot;2010-06-08&quot;)}]},
    {&quot;id&quot;:5,&quot;alias&quot;:&quot;Von&quot;,&quot;name&quot;:&quot;VonKemble&quot;,&quot;userSince&quot;:datetime(&quot;2010-01-05T10:10:00&quot;),&quot;friendIds&quot;:{{3,6,10}},&quot;employment&quot;:[{&quot;organizationName&quot;:&quot;Kongreen&quot;,&quot;startDate&quot;:date(&quot;2010-11-27&quot;)}]},
    {&quot;id&quot;:6,&quot;alias&quot;:&quot;Willis&quot;,&quot;name&quot;:&quot;WillisWynne&quot;,&quot;userSince&quot;:datetime(&quot;2005-01-17T10:10:00&quot;),&quot;friendIds&quot;:{{1,3,7}},&quot;employment&quot;:[{&quot;organizationName&quot;:&quot;jaydax&quot;,&quot;startDate&quot;:date(&quot;2009-05-15&quot;)}]},
    {&quot;id&quot;:7,&quot;alias&quot;:&quot;Suzanna&quot;,&quot;name&quot;:&quot;SuzannaTillson&quot;,&quot;userSince&quot;:datetime(&quot;2012-08-07T10:10:00&quot;),&quot;friendIds&quot;:{{6}},&quot;employment&quot;:[{&quot;organizationName&quot;:&quot;Labzatron&quot;,&quot;startDate&quot;:date(&quot;2011-04-19&quot;)}]},
    {&quot;id&quot;:8,&quot;alias&quot;:&quot;Nila&quot;,&quot;name&quot;:&quot;NilaMilliron&quot;,&quot;userSince&quot;:datetime(&quot;2008-01-01T10:10:00&quot;),&quot;friendIds&quot;:{{3}},&quot;employment&quot;:[{&quot;organizationName&quot;:&quot;Plexlane&quot;,&quot;startDate&quot;:date(&quot;2010-02-28&quot;)}]},
    {&quot;id&quot;:9,&quot;alias&quot;:&quot;Woodrow&quot;,&quot;name&quot;:&quot;WoodrowNehling&quot;,&quot;nickname&quot;:&quot;Woody&quot;,&quot;userSince&quot;:datetime(&quot;2005-09-20T10:10:00&quot;),&quot;friendIds&quot;:{{3,10}},&quot;employment&quot;:[{&quot;organizationName&quot;:&quot;Zuncan&quot;,&quot;startDate&quot;:date(&quot;2003-04-22&quot;),&quot;endDate&quot;:date(&quot;2009-12-13&quot;)}]},
    {&quot;id&quot;:10,&quot;alias&quot;:&quot;Bram&quot;,&quot;name&quot;:&quot;BramHatch&quot;,&quot;userSince&quot;:datetime(&quot;2010-10-16T10:10:00&quot;),&quot;friendIds&quot;:{{1,5,9}},&quot;employment&quot;:[{&quot;organizationName&quot;:&quot;physcane&quot;,&quot;startDate&quot;:date(&quot;2007-06-05&quot;),&quot;endDate&quot;:date(&quot;2011-11-05&quot;)}]}
]);
</pre></div></div>
410<p><a href="../data/gbm.adm">Gleambook Messages</a></p>
411
412<div class="source">
413<div class="source">
414<pre> use dataverse TinySocial;
415
416 insert into dataset GleambookMessages
417 ([
418 {&quot;messageId&quot;:1,&quot;authorId&quot;:3,&quot;inResponseTo&quot;:2,&quot;senderLocation&quot;:point(&quot;47.16,77.75&quot;),&quot;message&quot;:&quot; love product-b its shortcut-menu is awesome:)&quot;},
419 {&quot;messageId&quot;:2,&quot;authorId&quot;:1,&quot;inResponseTo&quot;:4,&quot;senderLocation&quot;:point(&quot;41.66,80.87&quot;),&quot;message&quot;:&quot; dislike x-phone its touch-screen is horrible&quot;},
420 {&quot;messageId&quot;:3,&quot;authorId&quot;:2,&quot;inResponseTo&quot;:4,&quot;senderLocation&quot;:point(&quot;48.09,81.01&quot;),&quot;message&quot;:&quot; like product-y the plan is amazing&quot;},
421 {&quot;messageId&quot;:4,&quot;authorId&quot;:1,&quot;inResponseTo&quot;:2,&quot;senderLocation&quot;:point(&quot;37.73,97.04&quot;),&quot;message&quot;:&quot; can't stand acast the network is horrible:(&quot;},
422 {&quot;messageId&quot;:5,&quot;authorId&quot;:6,&quot;inResponseTo&quot;:2,&quot;senderLocation&quot;:point(&quot;34.7,90.76&quot;),&quot;message&quot;:&quot; love product-b the customization is mind-blowing&quot;},
423 {&quot;messageId&quot;:6,&quot;authorId&quot;:2,&quot;inResponseTo&quot;:1,&quot;senderLocation&quot;:point(&quot;31.5,75.56&quot;),&quot;message&quot;:&quot; like product-z its platform is mind-blowing&quot;},
424 {&quot;messageId&quot;:7,&quot;authorId&quot;:5,&quot;inResponseTo&quot;:15,&quot;senderLocation&quot;:point(&quot;32.91,85.05&quot;),&quot;message&quot;:&quot; dislike product-b the speed is horrible&quot;},
425 {&quot;messageId&quot;:8,&quot;authorId&quot;:1,&quot;inResponseTo&quot;:11,&quot;senderLocation&quot;:point(&quot;40.33,80.87&quot;),&quot;message&quot;:&quot; like ccast the 3G is awesome:)&quot;},
426 {&quot;messageId&quot;:9,&quot;authorId&quot;:3,&quot;inResponseTo&quot;:12,&quot;senderLocation&quot;:point(&quot;34.45,96.48&quot;),&quot;message&quot;:&quot; love ccast its wireless is good&quot;},
427 {&quot;messageId&quot;:10,&quot;authorId&quot;:1,&quot;inResponseTo&quot;:12,&quot;senderLocation&quot;:point(&quot;42.5,70.01&quot;),&quot;message&quot;:&quot; can't stand product-w the touch-screen is terrible&quot;},
428 {&quot;messageId&quot;:11,&quot;authorId&quot;:1,&quot;inResponseTo&quot;:1,&quot;senderLocation&quot;:point(&quot;38.97,77.49&quot;),&quot;message&quot;:&quot; can't stand acast its plan is terrible&quot;},
429 {&quot;messageId&quot;:12,&quot;authorId&quot;:10,&quot;inResponseTo&quot;:6,&quot;senderLocation&quot;:point(&quot;42.26,77.76&quot;),&quot;message&quot;:&quot; can't stand product-z its voicemail-service is OMG:(&quot;},
430 {&quot;messageId&quot;:13,&quot;authorId&quot;:10,&quot;inResponseTo&quot;:4,&quot;senderLocation&quot;:point(&quot;42.77,78.92&quot;),&quot;message&quot;:&quot; dislike x-phone the voice-command is bad:(&quot;},
431 {&quot;messageId&quot;:14,&quot;authorId&quot;:9,&quot;inResponseTo&quot;:12,&quot;senderLocation&quot;:point(&quot;41.33,85.28&quot;),&quot;message&quot;:&quot; love acast its 3G is good:)&quot;},
432 {&quot;messageId&quot;:15,&quot;authorId&quot;:7,&quot;inResponseTo&quot;:11,&quot;senderLocation&quot;:point(&quot;44.47,67.11&quot;),&quot;message&quot;:&quot; like x-phone the voicemail-service is awesome&quot;}
433 ]);
434</pre></div></div></div>
435<div class="section">
436<h2><a name="AQL:_Querying_Your_AsterixDB_Data"></a>AQL: Querying Your AsterixDB Data</h2>
437<p>Congratulations! You now have sample social data stored (and indexed) in AsterixDB. (You are part of an elite and adventurous group of individuals. :-)) Now that you have successfully loaded the provided sample data into the datasets that we defined, you can start running queries against them.</p>
438<p>The query language for AsterixDB is AQL&#x2014;the Asterix Query Language. AQL is loosely based on XQuery, the language developed and standardized in the early to mid 2000s by the World Wide Web Consortium (W3C) for querying semistructured data stored in their XML format. We have tossed all of the &#x201c;XML cruft&#x201d; out of their language but retained many of its core ideas. We did this because its design was developed over a period of years by a diverse committee of smart and experienced language designers, including &#x201c;SQL people&#x201d;, &#x201c;functional programming people&#x201d;, and &#x201c;XML people&#x201d;, all of whom were focused on how to design a new query language that operates well over semistructured data. (We decided to stand on their shoulders instead of starting from scratch and revisiting many of the same issues.) Note that AQL is not SQL and not based on SQL: In other words, AsterixDB is fully &#x201c;NoSQL compliant&#x201d;. :-)</p>
439<p>In this section we introduce AQL via a set of example queries, along with their expected results, based on the data above, to help you get started. Many of the most important features of AQL are presented in this set of representative queries. You can find more details in the document on the <a href="datamodel.html">Asterix Data Model (ADM)</a> and in the <a href="manual.html">AQL Reference Manual</a>; a complete list of built-in functions is available in the <a href="functions.html">Asterix Functions</a> document.</p>
440<p>AQL is an expression language. Even the expression 1+1 is a valid AQL query that evaluates to 2. (Try it for yourself! Okay, maybe that&#x2019;s <i>not</i> the best use of a 512-node shared-nothing compute cluster.) Most useful AQL queries will be based on the <i>FLWOR</i> (pronounced &#x201c;flower&#x201d;) expression structure that AQL has borrowed from XQuery (<a class="externalLink" href="http://en.wikipedia.org/wiki/FLWOR">http://en.wikipedia.org/wiki/FLWOR</a>). The FLWOR expression syntax supports both the incremental binding (<i>for</i>) of variables to ADM data instances in a dataset (or in the result of any AQL expression, actually) and the full binding (<i>let</i>) of variables to entire intermediate results in a fashion similar to temporary views in the SQL world. FLWOR is an acronym that is short for <i>for</i>-<i>let</i>-<i>where</i>-<i>order by</i>-<i>return</i>, naming five of the most frequently used clauses from the syntax of a full AQL query. AQL also includes <i>group by</i> and <i>limit</i> clauses, as you will see shortly. Roughly speaking, for SQL aficionados, the <i>for</i> clause in AQL is like the <i>from</i> clause in SQL, the <i>return</i> clause in AQL is like the <i>select</i> clause in SQL (but appears at the end instead of the beginning of a query), the <i>let</i> clause in AQL is like SQL&#x2019;s <i>with</i> clause, and the <i>where</i> and <i>order by</i> clauses in both languages are similar.</p>
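<p>For readers who think more readily in a general-purpose language, the clause order of a FLWOR expression can be loosely mimicked in Python. This is only an analogy, not how AsterixDB evaluates queries: the four-user list is a hypothetical excerpt of the sample data, and the function name <tt>flwor</tt> is invented for illustration.</p>

```python
# Hypothetical four-user excerpt of GleambookUsers (two fields only).
users = [
    {"id": 1, "name": "MargaritaStoddard"},
    {"id": 2, "name": "IsbelDull"},
    {"id": 3, "name": "EmoryUnk"},
    {"id": 4, "name": "NicholasStroh"},
]


def flwor(dataset):
    """Mimic for / let / where / order by / return over a list of dicts."""
    kept = []
    for user in dataset:                  # for: bind one instance at a time
        name_len = len(user["name"])      # let: name an intermediate value
        if user["id"] >= 2:               # where: keep qualifying bindings
            kept.append((name_len, user))
    kept.sort(key=lambda pair: pair[0])   # order by: sort surviving bindings
    # return: construct one result object per surviving binding
    return [{"uname": u["name"], "len": n} for n, u in kept]


result = flwor(users)
print(result)
```

<p>The point of the analogy is the reading order: bindings are produced first, filtered and sorted next, and the shape of the output is decided last, which is why <i>return</i> sits at the end of an AQL query.</p>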
441<p>Based on user demand, to let SQL aficionados write AQL queries in their favored ways, AQL supports a few synonyms: <i>from</i> for <i>for</i>, <i>select</i> for <i>return</i>, <i>with</i> for <i>let</i>, and <i>keeping</i> for <i>with</i> in the group by clause. These have been found to help die-hard SQL fans feel a little more at home in AQL and be less likely to (mis)interpret <i>for</i> as imperative looping, <i>return</i> as returning from a function call, and so on.</p>
442<p>Enough talk! Let&#x2019;s go ahead and try writing some queries and see about learning AQL by example.</p>
443<div class="section">
444<h3><a name="Query_0-A_-_Exact-Match_Lookup"></a>Query 0-A - Exact-Match Lookup</h3>
445<p>For our first query, let&#x2019;s find a Gleambook user based on his or her user id. Suppose the user we want is the user whose id is 8:</p>
446
447<div class="source">
448<div class="source">
449<pre> use dataverse TinySocial;
450
451 for $user in dataset GleambookUsers
452 where $user.id = 8
453 return $user;
454</pre></div></div>
455<p>The query&#x2019;s <i>for</i> clause binds the variable <tt>$user</tt> incrementally to the data instances residing in the dataset named GleambookUsers. Its <i>where</i> clause selects only those bindings having a user id of interest, filtering out the rest. The <i>return</i> clause returns the (entire) data instance for each binding that satisfies the predicate. Since this dataset is indexed on user id (its primary key), this query will be done via a quick index lookup.</p>
456<p>The expected result for our sample data is as follows:</p>
457
458<div class="source">
459<div class="source">
460<pre> { &quot;id&quot;: 8, &quot;alias&quot;: &quot;Nila&quot;, &quot;name&quot;: &quot;NilaMilliron&quot;, &quot;userSince&quot;: datetime(&quot;2008-01-01T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 3 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Plexlane&quot;, &quot;startDate&quot;: date(&quot;2010-02-28&quot;) } ] }
461</pre></div></div>
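<p>The difference between answering this query with a full scan and answering it through the primary index can be pictured with a small Python sketch. A dictionary keyed on <tt>id</tt> stands in for the index here; that is an assumption made purely for illustration and says nothing about how AsterixDB actually stores or indexes data. The three-user list and both function names are hypothetical.</p>

```python
# Hypothetical three-user excerpt of GleambookUsers.
users = [
    {"id": 8, "alias": "Nila", "name": "NilaMilliron"},
    {"id": 2, "alias": "Isbel", "name": "IsbelDull"},
    {"id": 4, "alias": "Nicholas", "name": "NicholasStroh"},
]


def scan_lookup(dataset, user_id):
    """Examine every instance and keep the ones whose id matches."""
    return [u for u in dataset if u["id"] == user_id]


# Stand-in for a primary-key index: id -> data instance.
primary_index = {u["id"]: u for u in users}


def index_lookup(index, user_id):
    """Go straight to the matching instance, if there is one."""
    hit = index.get(user_id)
    return [hit] if hit is not None else []


print(index_lookup(primary_index, 8))
```

<p>Both functions return the same answer; the indexed version simply avoids touching the instances that cannot match.</p>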
462<p>Note that, using the SQL keyword synonyms, another way of phrasing the same query would be:</p>
463
464<div class="source">
465<div class="source">
466<pre> use dataverse TinySocial;
467
468 from $user in dataset GleambookUsers
469 where $user.id = 8
470 select $user;
471</pre></div></div></div>
472<div class="section">
473<h3><a name="Query_0-B_-_Range_Scan"></a>Query 0-B - Range Scan</h3>
474<p>AQL, like SQL, supports a variety of different predicates. For example, for our next query, let&#x2019;s find the Gleambook users whose ids are in the range between 2 and 4:</p>
475
476<div class="source">
477<div class="source">
478<pre> use dataverse TinySocial;
479
480 for $user in dataset GleambookUsers
481 where $user.id &gt;= 2 and $user.id &lt;= 4
482 return $user;
483</pre></div></div>
484<p>The expected result of this query, which can also be evaluated using the primary index on user id, is:</p>
485
486<div class="source">
487<div class="source">
488<pre> { &quot;id&quot;: 2, &quot;alias&quot;: &quot;Isbel&quot;, &quot;name&quot;: &quot;IsbelDull&quot;, &quot;userSince&quot;: datetime(&quot;2011-01-22T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 1, 4 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Hexviafind&quot;, &quot;startDate&quot;: date(&quot;2010-04-27&quot;) } ], &quot;nickname&quot;: &quot;Izzy&quot; }
489 { &quot;id&quot;: 4, &quot;alias&quot;: &quot;Nicholas&quot;, &quot;name&quot;: &quot;NicholasStroh&quot;, &quot;userSince&quot;: datetime(&quot;2010-12-27T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 2 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Zamcorporation&quot;, &quot;startDate&quot;: date(&quot;2010-06-08&quot;) } ] }
490 { &quot;id&quot;: 3, &quot;alias&quot;: &quot;Emory&quot;, &quot;name&quot;: &quot;EmoryUnk&quot;, &quot;userSince&quot;: datetime(&quot;2012-07-10T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 1, 5, 8, 9 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;geomedia&quot;, &quot;startDate&quot;: date(&quot;2010-06-17&quot;), &quot;endDate&quot;: date(&quot;2010-01-26&quot;) } ] }
491</pre></div></div></div>
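<p>As a rough picture of why a range predicate on the primary key can use an ordered index, the following Python sketch keeps the keys sorted and uses binary search to find both ends of the range. The sorted list is only a stand-in for an ordered index, and the names <tt>users_by_id</tt> and <tt>range_lookup</tt> are hypothetical.</p>

```python
import bisect

# Hypothetical excerpt of GleambookUsers: id -> name.
users_by_id = {
    1: "MargaritaStoddard",
    2: "IsbelDull",
    3: "EmoryUnk",
    4: "NicholasStroh",
    5: "VonKemble",
}

# Keys kept in sorted order, as an ordered index would keep them.
sorted_ids = sorted(users_by_id)


def range_lookup(lo, hi):
    """Return (id, name) pairs for ids in the closed range [lo, hi]."""
    start = bisect.bisect_left(sorted_ids, lo)    # first key not below lo
    stop = bisect.bisect_right(sorted_ids, hi)    # one past the last key not above hi
    return [(i, users_by_id[i]) for i in sorted_ids[start:stop]]


print(range_lookup(2, 4))
```

<p>Note that this sketch happens to return its matches in key order, whereas AQL leaves result order undefined unless the query includes an <i>order by</i> clause.</p>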
492<div class="section">
493<h3><a name="Query_1_-_Other_Query_Filters"></a>Query 1 - Other Query Filters</h3>
494<p>AQL can do range queries on any data type that supports the appropriate set of comparators. As an example, this next query retrieves the Gleambook users who joined between July 22, 2010 and July 29, 2012:</p>
495
496<div class="source">
497<div class="source">
498<pre> use dataverse TinySocial;
499
500 for $user in dataset GleambookUsers
501 where $user.userSince &gt;= datetime('2010-07-22T00:00:00')
502 and $user.userSince &lt;= datetime('2012-07-29T23:59:59')
503 return $user;
504</pre></div></div>
505<p>The expected result for this query, also an indexable query, is as follows:</p>
506
507<div class="source">
508<div class="source">
509<pre> { &quot;id&quot;: 2, &quot;alias&quot;: &quot;Isbel&quot;, &quot;name&quot;: &quot;IsbelDull&quot;, &quot;userSince&quot;: datetime(&quot;2011-01-22T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 1, 4 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Hexviafind&quot;, &quot;startDate&quot;: date(&quot;2010-04-27&quot;) } ], &quot;nickname&quot;: &quot;Izzy&quot; }
510 { &quot;id&quot;: 4, &quot;alias&quot;: &quot;Nicholas&quot;, &quot;name&quot;: &quot;NicholasStroh&quot;, &quot;userSince&quot;: datetime(&quot;2010-12-27T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 2 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Zamcorporation&quot;, &quot;startDate&quot;: date(&quot;2010-06-08&quot;) } ] }
511 { &quot;id&quot;: 10, &quot;alias&quot;: &quot;Bram&quot;, &quot;name&quot;: &quot;BramHatch&quot;, &quot;userSince&quot;: datetime(&quot;2010-10-16T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 1, 5, 9 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;physcane&quot;, &quot;startDate&quot;: date(&quot;2007-06-05&quot;), &quot;endDate&quot;: date(&quot;2011-11-05&quot;) } ] }
512 { &quot;id&quot;: 3, &quot;alias&quot;: &quot;Emory&quot;, &quot;name&quot;: &quot;EmoryUnk&quot;, &quot;userSince&quot;: datetime(&quot;2012-07-10T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 1, 5, 8, 9 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;geomedia&quot;, &quot;startDate&quot;: date(&quot;2010-06-17&quot;), &quot;endDate&quot;: date(&quot;2010-01-26&quot;) } ] }
513</pre></div></div></div>
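<p>The Query 1 predicate can be checked by hand against the sample data with a few lines of Python. The <tt>userSince</tt> values below are copied from the insert statement earlier in this document; the dictionary and the function name <tt>joined_between</tt> are hypothetical helpers, and the comparison simply uses Python&#x2019;s own datetime ordering as an analogue of AQL&#x2019;s datetime comparators.</p>

```python
from datetime import datetime

# userSince values of the ten sample users, keyed by user id.
user_since = {
    1: "2012-08-20T10:10:00",
    2: "2011-01-22T10:10:00",
    3: "2012-07-10T10:10:00",
    4: "2010-12-27T10:10:00",
    5: "2010-01-05T10:10:00",
    6: "2005-01-17T10:10:00",
    7: "2012-08-07T10:10:00",
    8: "2008-01-01T10:10:00",
    9: "2005-09-20T10:10:00",
    10: "2010-10-16T10:10:00",
}


def joined_between(lo, hi):
    """Return the sorted ids of users whose userSince lies in [lo, hi]."""
    lo_dt = datetime.fromisoformat(lo)
    hi_dt = datetime.fromisoformat(hi)
    return sorted(
        uid
        for uid, stamp in user_since.items()
        if hi_dt >= datetime.fromisoformat(stamp) >= lo_dt
    )


matching_ids = joined_between("2010-07-22T00:00:00", "2012-07-29T23:59:59")
print(matching_ids)
```

<p>The ids it selects are the same four users that appear in the expected result above.</p>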
514<div class="section">
515<h3><a name="Query_2-A_-_Equijoin"></a>Query 2-A - Equijoin</h3>
516<p>In addition to simply binding variables to data instances and returning them &#x201c;whole&#x201d;, an AQL query can construct new ADM instances to return based on combinations of its variable bindings. This gives AQL the power to do joins much like those done using multi-table <i>from</i> clauses in SQL. For example, suppose we wanted a list of all Gleambook users paired with their associated messages, with the list enumerating the author name and the message text associated with each Gleambook message. We could do this as follows in AQL:</p>
517
518<div class="source">
519<div class="source">
520<pre> use dataverse TinySocial;
521
522 for $user in dataset GleambookUsers
523 for $message in dataset GleambookMessages
524 where $message.authorId = $user.id
525 return {
526 &quot;uname&quot;: $user.name,
527 &quot;message&quot;: $message.message
528 };
529</pre></div></div>
530<p>The result of this query is a sequence of new ADM instances, one for each author/message pair. Each instance in the result will be an ADM object containing two fields, &#x201c;uname&#x201d; and &#x201c;message&#x201d;, containing the user&#x2019;s name and the message text, respectively, for each author/message pair. (Note that &#x201c;uname&#x201d; and &#x201c;message&#x201d; are both simple AQL expressions themselves&#x2014;so in the most general case, even the resulting field names can be computed as part of the query, making AQL a very powerful tool for slicing and dicing semistructured data.)</p>
531<p>The expected result of this example AQL join query for our sample data set is:</p>
532
533<div class="source">
534<div class="source">
535<pre> { &quot;uname&quot;: &quot;WillisWynne&quot;, &quot;message&quot;: &quot; love product-b the customization is mind-blowing&quot; }
536 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;message&quot;: &quot; can't stand acast its plan is terrible&quot; }
537 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;message&quot;: &quot; dislike x-phone its touch-screen is horrible&quot; }
538 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;message&quot;: &quot; can't stand acast the network is horrible:(&quot; }
539 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;message&quot;: &quot; like ccast the 3G is awesome:)&quot; }
540 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;message&quot;: &quot; can't stand product-w the touch-screen is terrible&quot; }
541 { &quot;uname&quot;: &quot;IsbelDull&quot;, &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot; }
542 { &quot;uname&quot;: &quot;IsbelDull&quot;, &quot;message&quot;: &quot; like product-y the plan is amazing&quot; }
543 { &quot;uname&quot;: &quot;WoodrowNehling&quot;, &quot;message&quot;: &quot; love acast its 3G is good:)&quot; }
544 { &quot;uname&quot;: &quot;BramHatch&quot;, &quot;message&quot;: &quot; can't stand product-z its voicemail-service is OMG:(&quot; }
545 { &quot;uname&quot;: &quot;BramHatch&quot;, &quot;message&quot;: &quot; dislike x-phone the voice-command is bad:(&quot; }
546 { &quot;uname&quot;: &quot;EmoryUnk&quot;, &quot;message&quot;: &quot; love product-b its shortcut-menu is awesome:)&quot; }
547 { &quot;uname&quot;: &quot;EmoryUnk&quot;, &quot;message&quot;: &quot; love ccast its wireless is good&quot; }
548 { &quot;uname&quot;: &quot;VonKemble&quot;, &quot;message&quot;: &quot; dislike product-b the speed is horrible&quot; }
549 { &quot;uname&quot;: &quot;SuzannaTillson&quot;, &quot;message&quot;: &quot; like x-phone the voicemail-service is awesome&quot; }
550</pre></div></div>
551<p>Again, as an aside, note that the same query expressed using AQL&#x2019;s SQL keyword synonyms would be:</p>
552
553<div class="source">
554<div class="source">
555<pre> use dataverse TinySocial;
556
557 from $user in dataset GleambookUsers
558 from $message in dataset GleambookMessages
559 where $message.authorId = $user.id
560 select {
561 &quot;uname&quot;: $user.name,
562 &quot;message&quot;: $message.message
563 };
564</pre></div></div></div>
565<div class="section">
566<h3><a name="Query_2-B_-_Index_join"></a>Query 2-B - Index join</h3>
567<p>By default, AsterixDB evaluates equijoin queries using hash-based join methods that work well for doing ad hoc joins of very large data sets (<a class="externalLink" href="http://en.wikipedia.org/wiki/Hash_join">http://en.wikipedia.org/wiki/Hash_join</a>). On a cluster, hash partitioning is employed as AsterixDB&#x2019;s divide-and-conquer strategy for computing large parallel joins. AsterixDB includes other join methods, but in the absence of data statistics and selectivity estimates, it doesn&#x2019;t (yet) have the know-how to intelligently choose among its alternatives. We therefore asked ourselves the classic question&#x2014;WWOD?&#x2014;What Would Oracle Do?&#x2014;and in the interim, AQL includes a clunky (but useful) hint-based mechanism for addressing the occasional need to suggest to AsterixDB which join method it should use for a particular AQL query.</p>
568<p>The following query is similar to Query 2-A but includes a suggestion to AsterixDB that it should consider employing an index-based nested-loop join technique to process the query:</p>
569
570<div class="source">
571<div class="source">
572<pre> use dataverse TinySocial;
573
574 for $user in dataset GleambookUsers
575 for $message in dataset GleambookMessages
576 where $message.authorId /*+ indexnl */ = $user.id
577 return {
578 &quot;uname&quot;: $user.name,
579 &quot;message&quot;: $message.message
580 };
581</pre></div></div>
582<p>The expected result is (of course) the same as before, modulo the order of the instances. Result ordering is (intentionally) undefined in AQL in the absence of an <i>order by</i> clause. The query result for our sample data in this case is:</p>
583
584<div class="source">
585<div class="source">
586<pre> { &quot;uname&quot;: &quot;IsbelDull&quot;, &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot; }
587 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;message&quot;: &quot; can't stand acast its plan is terrible&quot; }
588 { &quot;uname&quot;: &quot;BramHatch&quot;, &quot;message&quot;: &quot; can't stand product-z its voicemail-service is OMG:(&quot; }
589 { &quot;uname&quot;: &quot;WoodrowNehling&quot;, &quot;message&quot;: &quot; love acast its 3G is good:)&quot; }
590 { &quot;uname&quot;: &quot;EmoryUnk&quot;, &quot;message&quot;: &quot; love product-b its shortcut-menu is awesome:)&quot; }
591 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;message&quot;: &quot; dislike x-phone its touch-screen is horrible&quot; }
592 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;message&quot;: &quot; can't stand acast the network is horrible:(&quot; }
593 { &quot;uname&quot;: &quot;BramHatch&quot;, &quot;message&quot;: &quot; dislike x-phone the voice-command is bad:(&quot; }
594 { &quot;uname&quot;: &quot;SuzannaTillson&quot;, &quot;message&quot;: &quot; like x-phone the voicemail-service is awesome&quot; }
595 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;message&quot;: &quot; like ccast the 3G is awesome:)&quot; }
596 { &quot;uname&quot;: &quot;EmoryUnk&quot;, &quot;message&quot;: &quot; love ccast its wireless is good&quot; }
597 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;message&quot;: &quot; can't stand product-w the touch-screen is terrible&quot; }
598 { &quot;uname&quot;: &quot;IsbelDull&quot;, &quot;message&quot;: &quot; like product-y the plan is amazing&quot; }
599 { &quot;uname&quot;: &quot;WillisWynne&quot;, &quot;message&quot;: &quot; love product-b the customization is mind-blowing&quot; }
600 { &quot;uname&quot;: &quot;VonKemble&quot;, &quot;message&quot;: &quot; dislike product-b the speed is horrible&quot; }
601</pre></div></div>
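<p>The two join strategies discussed in this section can be contrasted with a simplified, single-machine Python sketch. It is a conceptual illustration only: the data is a hypothetical excerpt of the sample datasets, the dictionary <tt>messages_by_author</tt> merely stands in for an index on <tt>authorId</tt>, and neither function reflects AsterixDB&#x2019;s actual parallel implementation.</p>

```python
from collections import defaultdict

# Hypothetical excerpts of GleambookUsers and GleambookMessages.
users = [
    {"id": 1, "name": "MargaritaStoddard"},
    {"id": 2, "name": "IsbelDull"},
    {"id": 4, "name": "NicholasStroh"},
]
messages = [
    {"messageId": 2, "authorId": 1, "message": " dislike x-phone its touch-screen is horrible"},
    {"messageId": 3, "authorId": 2, "message": " like product-y the plan is amazing"},
    {"messageId": 4, "authorId": 1, "message": " can't stand acast the network is horrible:("},
]


def hash_join(users, messages):
    """Build a hash table on users, then stream messages past it."""
    by_id = {u["id"]: u for u in users}                 # build phase
    return [
        {"uname": by_id[m["authorId"]]["name"], "message": m["message"]}
        for m in messages                               # probe phase
        if m["authorId"] in by_id
    ]


# Stand-in for an existing index on GleambookMessages.authorId.
messages_by_author = defaultdict(list)
for m in messages:
    messages_by_author[m["authorId"]].append(m)


def index_nl_join(users, index):
    """For each user (outer), probe the message index (inner)."""
    out = []
    for u in users:
        for m in index.get(u["id"], []):
            out.append({"uname": u["name"], "message": m["message"]})
    return out


print(hash_join(users, messages))
print(index_nl_join(users, messages_by_author))
```

<p>Both functions produce the same set of pairs, but in a different order, which mirrors the behavior described above: the answer is the same, while the order of the instances is not guaranteed.</p>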
602<p>(It is worth knowing, with respect to influencing AsterixDB&#x2019;s query evaluation, that nested <i>for</i> clauses&#x2014;a.k.a. joins&#x2014;are currently evaluated with the &#x201c;outer&#x201d; clause probing the data of the &#x201c;inner&#x201d; clause.)</p></div>
603<div class="section">
604<h3><a name="Query_3_-_Nested_Outer_Join"></a>Query 3 - Nested Outer Join</h3>
605<p>In order to support joins between tables with missing/dangling join tuples, the designers of SQL ended up shoe-horning a subset of the relational algebra into SQL&#x2019;s <i>from</i> clause syntax&#x2014;and providing a variety of join types there for users to choose from. Left outer joins are particularly important in SQL, e.g., to print a summary of customers and orders, grouped by customer, without omitting those customers who haven&#x2019;t placed any orders yet.</p>
606<p>The AQL language supports nesting, both of queries and of query results, and the combination allows for an arguably cleaner/more natural approach to such queries. As an example, suppose we wanted, for each Gleambook user, to produce an object that has his/her name plus a list of the messages written by that user. In SQL, this would involve a left outer join between users and messages, grouping by user, and having the user name repeated alongside each message. In AQL, this sort of use case can be handled (more naturally) as follows:</p>
607
608<div class="source">
609<div class="source">
610<pre> use dataverse TinySocial;
611
612 for $user in dataset GleambookUsers
613 return {
614 &quot;uname&quot;: $user.name,
615 &quot;messages&quot;: for $message in dataset GleambookMessages
616 where $message.authorId = $user.id
617 return $message.message
618 };
619</pre></div></div>
620<p>This AQL query binds the variable <tt>$user</tt> to the data instances in GleambookUsers; for each user, it constructs a result object containing a &#x201c;uname&#x201d; field with the user&#x2019;s name and a &#x201c;messages&#x201d; field with a nested collection of all messages for that user. The nested collection for each user is specified by using a correlated subquery. (Note: While it looks like nested loops could be involved in computing the result, AsterixDB recognizes the equivalence of such a query to an outer join, and it will use an efficient hash-based strategy when actually computing the query&#x2019;s result.)</p>
621<p>Here is this example query&#x2019;s expected output:</p>
622
623<div class="source">
624<div class="source">
625<pre> { &quot;uname&quot;: &quot;WillisWynne&quot;, &quot;messages&quot;: [ &quot; love product-b the customization is mind-blowing&quot; ] }
626 { &quot;uname&quot;: &quot;MargaritaStoddard&quot;, &quot;messages&quot;: [ &quot; can't stand acast its plan is terrible&quot;, &quot; dislike x-phone its touch-screen is horrible&quot;, &quot; can't stand acast the network is horrible:(&quot;, &quot; like ccast the 3G is awesome:)&quot;, &quot; can't stand product-w the touch-screen is terrible&quot; ] }
627 { &quot;uname&quot;: &quot;IsbelDull&quot;, &quot;messages&quot;: [ &quot; like product-z its platform is mind-blowing&quot;, &quot; like product-y the plan is amazing&quot; ] }
628 { &quot;uname&quot;: &quot;NicholasStroh&quot;, &quot;messages&quot;: [ ] }
629 { &quot;uname&quot;: &quot;NilaMilliron&quot;, &quot;messages&quot;: [ ] }
630 { &quot;uname&quot;: &quot;WoodrowNehling&quot;, &quot;messages&quot;: [ &quot; love acast its 3G is good:)&quot; ] }
631 { &quot;uname&quot;: &quot;BramHatch&quot;, &quot;messages&quot;: [ &quot; can't stand product-z its voicemail-service is OMG:(&quot;, &quot; dislike x-phone the voice-command is bad:(&quot; ] }
632 { &quot;uname&quot;: &quot;EmoryUnk&quot;, &quot;messages&quot;: [ &quot; love product-b its shortcut-menu is awesome:)&quot;, &quot; love ccast its wireless is good&quot; ] }
633 { &quot;uname&quot;: &quot;VonKemble&quot;, &quot;messages&quot;: [ &quot; dislike product-b the speed is horrible&quot; ] }
634 { &quot;uname&quot;: &quot;SuzannaTillson&quot;, &quot;messages&quot;: [ &quot; like x-phone the voicemail-service is awesome&quot; ] }
635</pre></div></div></div>
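<p>The idea behind Query 3, that a correlated subquery can be answered with one pass over the messages followed by one pass over the users, can be sketched in Python as follows. This is a simplified single-machine illustration under stated assumptions, not AsterixDB&#x2019;s implementation; the data excerpt and the function name <tt>users_with_messages</tt> are hypothetical.</p>

```python
from collections import defaultdict

# Hypothetical excerpts of GleambookUsers and GleambookMessages.
users = [
    {"id": 2, "name": "IsbelDull"},
    {"id": 4, "name": "NicholasStroh"},
    {"id": 6, "name": "WillisWynne"},
]
messages = [
    {"authorId": 2, "message": " like product-y the plan is amazing"},
    {"authorId": 6, "message": " love product-b the customization is mind-blowing"},
    {"authorId": 2, "message": " like product-z its platform is mind-blowing"},
]


def users_with_messages(users, messages):
    """Group message texts by author once, then attach them to each user."""
    by_author = defaultdict(list)
    for m in messages:                       # one pass over the messages
        by_author[m["authorId"]].append(m["message"])
    # Every user appears in the output, even with no messages at all.
    return [
        {"uname": u["name"], "messages": by_author.get(u["id"], [])}
        for u in users
    ]


for row in users_with_messages(users, messages):
    print(row)
```

<p>A user with no messages still appears, with an empty list, just as NicholasStroh and NilaMilliron do in the expected output above.</p>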
636<div class="section">
637<h3><a name="Query_4_-_Theta_Join"></a>Query 4 - Theta Join</h3>
638<p>Not all joins are expressible as equijoins and computable using equijoin-oriented algorithms. The join predicates for some use cases involve predicates with functions; AsterixDB supports the expression of such queries and will still evaluate them as best it can using nested loop based techniques (and broadcast joins in the parallel case).</p>
639<p>As an example of such a use case, suppose that we wanted, for each chirp T, to find all of the other chirps that originated from within a circle of radius 1 surrounding chirp T&#x2019;s location. In AQL, this can be specified in a manner similar to the previous query using one of the built-in functions on the spatial data type instead of id equality in the correlated query&#x2019;s <i>where</i> clause:</p>
640
641<div class="source">
642<div class="source">
643<pre> use dataverse TinySocial;
644
645 for $cm in dataset ChirpMessages
646 return {
647 &quot;message&quot;: $cm.messageText,
648 &quot;nearbyMessages&quot;: for $cm2 in dataset ChirpMessages
649 where spatial-distance($cm.senderLocation, $cm2.senderLocation) &lt;= 1
650 return { &quot;msgtxt&quot;:$cm2.messageText}
651 };
652</pre></div></div>
653<p>Here is the expected result for this query:</p>
654
655<div class="source">
656<div class="source">
657<pre> { &quot;message&quot;: &quot; can't stand x-phone its platform is terrible&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; can't stand x-phone its platform is terrible&quot; } ] }
658 { &quot;message&quot;: &quot; like ccast its shortcut-menu is awesome:)&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; like ccast its shortcut-menu is awesome:)&quot; } ] }
659 { &quot;message&quot;: &quot; like product-b the voice-command is mind-blowing:)&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; like product-b the voice-command is mind-blowing:)&quot; } ] }
660 { &quot;message&quot;: &quot; love ccast its voicemail-service is awesome&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; love ccast its voicemail-service is awesome&quot; } ] }
661 { &quot;message&quot;: &quot; love product-z its customization is good:)&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; love product-z its customization is good:)&quot; } ] }
662 { &quot;message&quot;: &quot; can't stand product-w its speed is terrible:(&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; can't stand product-w its speed is terrible:(&quot; } ] }
663 { &quot;message&quot;: &quot; like product-w the speed is good:)&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; like product-w the speed is good:)&quot; }, { &quot;msgtxt&quot;: &quot; hate ccast its voice-clarity is OMG:(&quot; } ] }
664 { &quot;message&quot;: &quot; like x-phone the voice-clarity is good:)&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; like x-phone the voice-clarity is good:)&quot; } ] }
665 { &quot;message&quot;: &quot; like product-y the platform is good&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; like product-y the platform is good&quot; } ] }
666 { &quot;message&quot;: &quot; hate ccast its voice-clarity is OMG:(&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; like product-w the speed is good:)&quot; }, { &quot;msgtxt&quot;: &quot; hate ccast its voice-clarity is OMG:(&quot; } ] }
667 { &quot;message&quot;: &quot; like product-y the voice-command is amazing:)&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; like product-y the voice-command is amazing:)&quot; } ] }
668 { &quot;message&quot;: &quot; like product-z the shortcut-menu is awesome:)&quot;, &quot;nearbyMessages&quot;: [ { &quot;msgtxt&quot;: &quot; like product-z the shortcut-menu is awesome:)&quot; } ] }
669</pre></div></div></div>
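<p>A nested-loop evaluation of a distance predicate like the one in Query 4 can be pictured with the following Python sketch. The three chirps and their coordinates are made up for illustration (they are not taken from the sample data), the function name <tt>nearby</tt> is hypothetical, and plain Euclidean distance between two points is assumed as the meaning of the distance test.</p>

```python
import math

# Made-up chirps with made-up (x, y) locations, for illustration only.
chirps = [
    {"text": "first", "loc": (0.0, 0.0)},
    {"text": "second", "loc": (0.3, 0.4)},   # 0.5 away from "first"
    {"text": "third", "loc": (5.0, 5.0)},    # far from both others
]


def nearby(chirps, radius=1.0):
    """For each chirp, list the texts of all chirps within the radius."""
    out = []
    for outer in chirps:                     # every chirp T ...
        close = [
            inner["text"]
            for inner in chirps              # ... is compared with every chirp
            if radius >= math.dist(outer["loc"], inner["loc"])
        ]
        out.append({"message": outer["text"], "nearbyMessages": close})
    return out


for row in nearby(chirps):
    print(row)
```

<p>Because each chirp is at distance zero from itself, every chirp appears in its own list of nearby messages, which is also visible in the expected result above.</p>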
670<div class="section">
671<h3><a name="Query_5_-_Fuzzy_Join"></a>Query 5 - Fuzzy Join</h3>
672<p>As another example of a non-equijoin use case, we could ask AsterixDB to find, for each Gleambook user, all Chirp users with names &#x201c;similar&#x201d; to their name. AsterixDB supports a variety of &#x201c;fuzzy match&#x201d; functions for use with textual and set-based data. As one example, we could choose to use edit distance with a threshold of 3 as the definition of name similarity, in which case we could write the following query using AQL&#x2019;s operator-based syntax (~=) for testing whether or not two values are similar:</p>
673
674<div class="source">
675<div class="source">
<pre>    use dataverse TinySocial;

    set simfunction &quot;edit-distance&quot;;
    set simthreshold &quot;3&quot;;

    for $gbu in dataset GleambookUsers
    return {
        &quot;id&quot;: $gbu.id,
        &quot;name&quot;: $gbu.name,
        &quot;similarUsers&quot;: for $cm in dataset ChirpMessages
                        let $cu := $cm.user
                        where $cu.name ~= $gbu.name
                        return {
                            &quot;chirpScreenname&quot;: $cu.screenName,
                            &quot;chirpName&quot;: $cu.name
                        }
    };
</pre></div></div>
694<p>The expected result for this query against our sample data is:</p>
695
696<div class="source">
697<div class="source">
698<pre> { &quot;id&quot;: 6, &quot;name&quot;: &quot;WillisWynne&quot;, &quot;similarUsers&quot;: [ ] }
699 { &quot;id&quot;: 1, &quot;name&quot;: &quot;MargaritaStoddard&quot;, &quot;similarUsers&quot;: [ ] }
700 { &quot;id&quot;: 2, &quot;name&quot;: &quot;IsbelDull&quot;, &quot;similarUsers&quot;: [ ] }
701 { &quot;id&quot;: 4, &quot;name&quot;: &quot;NicholasStroh&quot;, &quot;similarUsers&quot;: [ ] }
702 { &quot;id&quot;: 8, &quot;name&quot;: &quot;NilaMilliron&quot;, &quot;similarUsers&quot;: [ { &quot;chirpScreenname&quot;: &quot;NilaMilliron_tw&quot;, &quot;chirpName&quot;: &quot;Nila Milliron&quot; } ] }
703 { &quot;id&quot;: 9, &quot;name&quot;: &quot;WoodrowNehling&quot;, &quot;similarUsers&quot;: [ ] }
704 { &quot;id&quot;: 10, &quot;name&quot;: &quot;BramHatch&quot;, &quot;similarUsers&quot;: [ ] }
705 { &quot;id&quot;: 3, &quot;name&quot;: &quot;EmoryUnk&quot;, &quot;similarUsers&quot;: [ ] }
706 { &quot;id&quot;: 5, &quot;name&quot;: &quot;VonKemble&quot;, &quot;similarUsers&quot;: [ ] }
707 { &quot;id&quot;: 7, &quot;name&quot;: &quot;SuzannaTillson&quot;, &quot;similarUsers&quot;: [ ] }
708</pre></div></div></div>
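<p>To make the similarity test in Query 5 concrete, here is a minimal Python sketch of edit (Levenshtein) distance with a threshold. It is an illustration only, not AsterixDB&#x2019;s implementation; the function names <i>edit_distance</i> and <i>similar</i> are hypothetical, and the sketch assumes that two names count as similar when their distance is at most the threshold.</p>

<div class="source">
<div class="source">
<pre>
```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep a match)
            ))
        previous = current
    return previous[-1]


def similar(a, b, threshold=3):
    """Assumption: similar means the distance is at most the threshold."""
    return threshold >= edit_distance(a, b)


print(edit_distance("NilaMilliron", "Nila Milliron"))
print(similar("WillisWynne", "Nila Milliron"))
```
</pre></div></div>
<p>The Gleambook name &#x201c;NilaMilliron&#x201d; and the Chirp name &#x201c;Nila Milliron&#x201d; differ by a single inserted space, so their distance is 1, which is why that pair is the only match in the expected result.</p>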
709<div class="section">
710<h3><a name="Query_6_-_Existential_Quantification"></a>Query 6 - Existential Quantification</h3>
<p>The expressive power of AQL includes support for queries involving &#x201c;some&#x201d; (existentially quantified) and &#x201c;all&#x201d; (universally quantified) query semantics. As an example of an existential AQL query, here we show a query to list the Gleambook users who are currently employed. Such users will have an employment history containing an object whose endDate value is missing, which leads us to the following AQL query:</p>
712
713<div class="source">
714<div class="source">
<pre>    use dataverse TinySocial;

    for $gbu in dataset GleambookUsers
    where (some $e in $gbu.employment satisfies is-missing($e.endDate))
    return $gbu;
</pre></div></div>
721<p>The expected result in this case is:</p>
722
723<div class="source">
724<div class="source">
725<pre> { &quot;id&quot;: 6, &quot;alias&quot;: &quot;Willis&quot;, &quot;name&quot;: &quot;WillisWynne&quot;, &quot;userSince&quot;: datetime(&quot;2005-01-17T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 1, 3, 7 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;jaydax&quot;, &quot;startDate&quot;: date(&quot;2009-05-15&quot;) } ] }
726 { &quot;id&quot;: 1, &quot;alias&quot;: &quot;Margarita&quot;, &quot;name&quot;: &quot;MargaritaStoddard&quot;, &quot;userSince&quot;: datetime(&quot;2012-08-20T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 2, 3, 6, 10 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Codetechno&quot;, &quot;startDate&quot;: date(&quot;2006-08-06&quot;) }, { &quot;organizationName&quot;: &quot;geomedia&quot;, &quot;startDate&quot;: date(&quot;2010-06-17&quot;), &quot;endDate&quot;: date(&quot;2010-01-26&quot;) } ], &quot;nickname&quot;: &quot;Mags&quot;, &quot;gender&quot;: &quot;F&quot; }
727 { &quot;id&quot;: 2, &quot;alias&quot;: &quot;Isbel&quot;, &quot;name&quot;: &quot;IsbelDull&quot;, &quot;userSince&quot;: datetime(&quot;2011-01-22T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 1, 4 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Hexviafind&quot;, &quot;startDate&quot;: date(&quot;2010-04-27&quot;) } ], &quot;nickname&quot;: &quot;Izzy&quot; }
728 { &quot;id&quot;: 4, &quot;alias&quot;: &quot;Nicholas&quot;, &quot;name&quot;: &quot;NicholasStroh&quot;, &quot;userSince&quot;: datetime(&quot;2010-12-27T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 2 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Zamcorporation&quot;, &quot;startDate&quot;: date(&quot;2010-06-08&quot;) } ] }
729 { &quot;id&quot;: 8, &quot;alias&quot;: &quot;Nila&quot;, &quot;name&quot;: &quot;NilaMilliron&quot;, &quot;userSince&quot;: datetime(&quot;2008-01-01T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 3 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Plexlane&quot;, &quot;startDate&quot;: date(&quot;2010-02-28&quot;) } ] }
730 { &quot;id&quot;: 5, &quot;alias&quot;: &quot;Von&quot;, &quot;name&quot;: &quot;VonKemble&quot;, &quot;userSince&quot;: datetime(&quot;2010-01-05T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 3, 6, 10 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Kongreen&quot;, &quot;startDate&quot;: date(&quot;2010-11-27&quot;) } ] }
731 { &quot;id&quot;: 7, &quot;alias&quot;: &quot;Suzanna&quot;, &quot;name&quot;: &quot;SuzannaTillson&quot;, &quot;userSince&quot;: datetime(&quot;2012-08-07T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 6 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Labzatron&quot;, &quot;startDate&quot;: date(&quot;2011-04-19&quot;) } ] }
732</pre></div></div></div>
733<div class="section">
734<h3><a name="Query_7_-_Universal_Quantification"></a>Query 7 - Universal Quantification</h3>
<p>As an example of a universal AQL query, here we show a query to list the Gleambook users who are currently unemployed. Such users will have an employment history in which every object has an endDate value, leading us to the following AQL query:</p>
736
737<div class="source">
738<div class="source">
<pre>    use dataverse TinySocial;

    for $gbu in dataset GleambookUsers
    where (every $e in $gbu.employment satisfies not(is-missing($e.endDate)))
    return $gbu;
</pre></div></div>
745<p>Here is the expected result for our sample data:</p>
746
747<div class="source">
748<div class="source">
749<pre> { &quot;id&quot;: 9, &quot;alias&quot;: &quot;Woodrow&quot;, &quot;name&quot;: &quot;WoodrowNehling&quot;, &quot;userSince&quot;: datetime(&quot;2005-09-20T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 3, 10 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;Zuncan&quot;, &quot;startDate&quot;: date(&quot;2003-04-22&quot;), &quot;endDate&quot;: date(&quot;2009-12-13&quot;) } ], &quot;nickname&quot;: &quot;Woody&quot; }
750 { &quot;id&quot;: 10, &quot;alias&quot;: &quot;Bram&quot;, &quot;name&quot;: &quot;BramHatch&quot;, &quot;userSince&quot;: datetime(&quot;2010-10-16T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 1, 5, 9 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;physcane&quot;, &quot;startDate&quot;: date(&quot;2007-06-05&quot;), &quot;endDate&quot;: date(&quot;2011-11-05&quot;) } ] }
751 { &quot;id&quot;: 3, &quot;alias&quot;: &quot;Emory&quot;, &quot;name&quot;: &quot;EmoryUnk&quot;, &quot;userSince&quot;: datetime(&quot;2012-07-10T10:10:00.000Z&quot;), &quot;friendIds&quot;: {{ 1, 5, 8, 9 }}, &quot;employment&quot;: [ { &quot;organizationName&quot;: &quot;geomedia&quot;, &quot;startDate&quot;: date(&quot;2010-06-17&quot;), &quot;endDate&quot;: date(&quot;2010-01-26&quot;) } ] }
752</pre></div></div></div>
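<p>The &#x201c;some&#x201d; and &#x201c;every&#x201d; predicates of Queries 6 and 7 can be pictured with Python&#x2019;s built-in <i>any</i> and <i>all</i>. The sketch below is only an analogy: it uses three users abridged from the sample data, stores dates as plain strings, and models a missing endDate as an absent dictionary key. The helper names are hypothetical.</p>

<div class="source">
<div class="source">
<pre>
```python
# Three users abridged from the sample data; dates kept as plain strings.
users = [
    {"name": "WillisWynne",
     "employment": [{"organizationName": "jaydax", "startDate": "2009-05-15"}]},
    {"name": "MargaritaStoddard",
     "employment": [{"organizationName": "Codetechno", "startDate": "2006-08-06"},
                    {"organizationName": "geomedia", "startDate": "2010-06-17",
                     "endDate": "2010-01-26"}]},
    {"name": "WoodrowNehling",
     "employment": [{"organizationName": "Zuncan", "startDate": "2003-04-22",
                     "endDate": "2009-12-13"}]},
]


def is_employed(user):
    # "some": at least one employment object has no endDate
    return any("endDate" not in job for job in user["employment"])


def is_unemployed(user):
    # "every": all employment objects have an endDate
    return all("endDate" in job for job in user["employment"])


employed = [u["name"] for u in users if is_employed(u)]
unemployed = [u["name"] for u in users if is_unemployed(u)]
print(employed)
print(unemployed)
```
</pre></div></div>
<p>Note that Python&#x2019;s <i>all</i> returns True for an empty list, so in this sketch a user with an empty employment history would be reported as unemployed. The sketch makes no claim about how AsterixDB treats an empty or absent employment field.</p>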
753<div class="section">
754<h3><a name="Query_8_-_Simple_Aggregation"></a>Query 8 - Simple Aggregation</h3>
755<p>Like SQL, the AQL language of AsterixDB provides support for computing aggregates over large amounts of data. As a very simple example, the following AQL query computes the total number of Gleambook users:</p>
756
757<div class="source">
758<div class="source">
<pre>    use dataverse TinySocial;

    count(for $gbu in dataset GleambookUsers return $gbu);
</pre></div></div>
763<p>In AQL, aggregate functions can be applied to arbitrary subquery results; in this case, the count function is applied to the result of a query that enumerates the Gleambook users. The expected result here is:</p>
764
765<div class="source">
766<div class="source">
767<pre> 10
768</pre></div></div></div>
769<div class="section">
770<h3><a name="Query_9-A_-_Grouping_and_Aggregation"></a>Query 9-A - Grouping and Aggregation</h3>
771<p>Also like SQL, AQL supports grouped aggregation. For every Chirp user, the following group-by/aggregate query counts the number of chirps sent by that user:</p>
772
773<div class="source">
774<div class="source">
<pre>    use dataverse TinySocial;

    for $cm in dataset ChirpMessages
    group by $uid := $cm.user.screenName with $cm
    return {
        &quot;user&quot;: $uid,
        &quot;count&quot;: count($cm)
    };
</pre></div></div>
<p>The <i>for</i> clause incrementally binds $cm to chirps, and the <i>group by</i> clause groups the chirps by their issuers&#x2019; Chirp screenName. Unlike SQL, where data is tabular&#x2014;flat&#x2014;the data model underlying AQL allows for nesting. Thus, following the <i>group by</i> clause, the <i>return</i> clause in this query sees a sequence of $cm groups, with each such group having an associated $uid variable value (i.e., the chirping user&#x2019;s screen name). In the context of the return clause, due to &#x201c;&#x2026; with $cm &#x2026;&#x201d;, $uid is bound to the chirper&#x2019;s screen name and $cm is bound to the <i>set</i> of chirps issued by that chirper. The return clause constructs a result object containing the chirper&#x2019;s screen name and the count of the items in the associated chirp set. The query result will contain one such object per screen name. This query also illustrates another feature of AQL: each user&#x2019;s screen name is accessed via a path syntax that traverses each chirp&#x2019;s nested object structure.</p>
785<p>Here is the expected result for this query over the sample data:</p>
786
787<div class="source">
788<div class="source">
789<pre> { &quot;user&quot;: &quot;OliJackson_512&quot;, &quot;count&quot;: 1 }
790 { &quot;user&quot;: &quot;ChangEwing_573&quot;, &quot;count&quot;: 1 }
791 { &quot;user&quot;: &quot;ColineGeyer@63&quot;, &quot;count&quot;: 3 }
792 { &quot;user&quot;: &quot;NathanGiesen@211&quot;, &quot;count&quot;: 6 }
793 { &quot;user&quot;: &quot;NilaMilliron_tw&quot;, &quot;count&quot;: 1 }
794</pre></div></div></div>
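<p>The nested grouping described for Query 9-A, where each group carries the whole collection of chirps rather than a flattened row, can be sketched in Python as follows. This is an analogy under stated assumptions: the five chirps are an abridged subset of the sample data (only chirpId and the nested screenName are kept), so the counts differ from the full expected result above.</p>

<div class="source">
<div class="source">
<pre>
```python
from collections import defaultdict

# Abridged subset of the sample chirps (assumption: other fields omitted).
chirps = [
    {"chirpId": "11", "user": {"screenName": "NilaMilliron_tw"}},
    {"chirpId": "2", "user": {"screenName": "ColineGeyer@63"}},
    {"chirpId": "4", "user": {"screenName": "NathanGiesen@211"}},
    {"chirpId": "9", "user": {"screenName": "NathanGiesen@211"}},
    {"chirpId": "6", "user": {"screenName": "ColineGeyer@63"}},
]

# "group by $uid := $cm.user.screenName with $cm":
# each key maps to the full list of chirps in its group.
groups = defaultdict(list)
for cm in chirps:
    groups[cm["user"]["screenName"]].append(cm)

# "return { user: $uid, count: count($cm) }"
result = [{"user": uid, "count": len(group)} for uid, group in groups.items()]
print(result)
```
</pre></div></div>
<p>Because each group keeps its member chirps, the return step could just as easily compute something other than a count, such as the list of chirpIds per user.</p>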
795<div class="section">
796<h3><a name="Query_9-B_-_Hash-Based_Grouping_and_Aggregation"></a>Query 9-B - (Hash-Based) Grouping and Aggregation</h3>
<p>As with joins, AsterixDB has multiple evaluation strategies available for processing grouped aggregate queries. For grouped aggregation, the system knows how to employ both sort-based and hash-based aggregation methods, with sort-based methods being used by default and a hint being available to suggest that a different approach be used in processing a particular AQL query.</p>
798<p>The following query is similar to Query 9-A, but adds a hash-based aggregation hint:</p>
799
800<div class="source">
801<div class="source">
<pre>    use dataverse TinySocial;

    for $cm in dataset ChirpMessages
    /*+ hash*/
    group by $uid := $cm.user.screenName with $cm
    return {
        &quot;user&quot;: $uid,
        &quot;count&quot;: count($cm)
    };
</pre></div></div>
812<p>Here is the expected result:</p>
813
814<div class="source">
815<div class="source">
816<pre> { &quot;user&quot;: &quot;OliJackson_512&quot;, &quot;count&quot;: 1 }
817 { &quot;user&quot;: &quot;ChangEwing_573&quot;, &quot;count&quot;: 1 }
818 { &quot;user&quot;: &quot;ColineGeyer@63&quot;, &quot;count&quot;: 3 }
819 { &quot;user&quot;: &quot;NathanGiesen@211&quot;, &quot;count&quot;: 6 }
820 { &quot;user&quot;: &quot;NilaMilliron_tw&quot;, &quot;count&quot;: 1 }
821</pre></div></div></div>
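<p>The difference between the two strategies can be illustrated with a small Python sketch. It shows the general idea of sort-based versus hash-based grouping only; it does not describe AsterixDB&#x2019;s actual operators, and the input list and function names are made up for the example.</p>

<div class="source">
<div class="source">
<pre>
```python
from itertools import groupby

# Hypothetical input: one screen name per chirp.
screen_names = [
    "NathanGiesen@211", "ColineGeyer@63", "NathanGiesen@211",
    "OliJackson_512", "ColineGeyer@63", "NathanGiesen@211",
]


def sort_based_counts(keys):
    # Sort first so equal keys become adjacent, then count each run.
    return {key: sum(1 for _ in run) for key, run in groupby(sorted(keys))}


def hash_based_counts(keys):
    # One pass over unsorted input, keeping a running count per key.
    counts = {}
    for key in keys:
        counts[key] = counts.get(key, 0) + 1
    return counts


print(sort_based_counts(screen_names) == hash_based_counts(screen_names))
```
</pre></div></div>
<p>Both functions produce the same counts, which mirrors the fact that Queries 9-A and 9-B have the same expected result; the hint affects how the answer is computed, not what the answer is.</p>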
822<div class="section">
823<h3><a name="Query_10_-_Grouping_and_Limits"></a>Query 10 - Grouping and Limits</h3>
824<p>In some use cases it is not necessary to compute the entire answer to a query. In some cases, just having the first <i>N</i> or top <i>N</i> results is sufficient. This is expressible in AQL using the <i>limit</i> clause combined with the <i>order by</i> clause.</p>
825<p>The following AQL query returns the top 3 Chirp users based on who has issued the most chirps:</p>
826
827<div class="source">
828<div class="source">
<pre>    use dataverse TinySocial;

    for $cm in dataset ChirpMessages
    group by $uid := $cm.user.screenName with $cm
    let $c := count($cm)
    order by $c desc
    limit 3
    return {
        &quot;user&quot;: $uid,
        &quot;count&quot;: $c
    };
</pre></div></div>
841<p>The expected result for this query is:</p>
842
843<div class="source">
844<div class="source">
845<pre> { &quot;user&quot;: &quot;NathanGiesen@211&quot;, &quot;count&quot;: 6 }
846 { &quot;user&quot;: &quot;ColineGeyer@63&quot;, &quot;count&quot;: 3 }
847 { &quot;user&quot;: &quot;OliJackson_512&quot;, &quot;count&quot;: 1 }
848</pre></div></div></div>
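<p>As a rough Python analogy for the <i>order by</i> and <i>limit</i> steps of Query 10, the sketch below starts from the per-user counts shown in the Query 9-A result, sorts them in descending order, and keeps the first three. The variable names are illustrative.</p>

<div class="source">
<div class="source">
<pre>
```python
# Per-user counts, in the order shown in the Query 9-A expected result.
counts = [
    {"user": "OliJackson_512", "count": 1},
    {"user": "ChangEwing_573", "count": 1},
    {"user": "ColineGeyer@63", "count": 3},
    {"user": "NathanGiesen@211", "count": 6},
    {"user": "NilaMilliron_tw", "count": 1},
]

# "order by $c desc" followed by "limit 3"
ordered = sorted(counts, key=lambda row: row["count"], reverse=True)
top3 = ordered[:3]
print([row["user"] for row in top3])
```
</pre></div></div>
<p>Three users are tied with a count of 1, so which of them takes third place depends on how ties are broken. Python&#x2019;s sort is stable, so this sketch keeps the input order among ties; a query that needs a deterministic answer would have to add a further ordering key.</p>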
849<div class="section">
850<h3><a name="Query_11_-_Left_Outer_Fuzzy_Join"></a>Query 11 - Left Outer Fuzzy Join</h3>
851<p>As a last example of AQL and its query power, the following query, for each chirp, finds all of the chirps that are similar based on the topics that they refer to:</p>
852
853<div class="source">
854<div class="source">
<pre>    use dataverse TinySocial;

    set simfunction &quot;jaccard&quot;;
    set simthreshold &quot;0.3&quot;;

    for $cm in dataset ChirpMessages
    return {
        &quot;chirp&quot;: $cm,
        &quot;similarChirps&quot;: for $cm2 in dataset ChirpMessages
                         where $cm2.referredTopics ~= $cm.referredTopics
                         and $cm2.chirpId != $cm.chirpId
                         return $cm2.referredTopics
    };
</pre></div></div>
869<p>This query illustrates several things worth knowing in order to write fuzzy queries in AQL. First, as mentioned earlier, AQL offers an operator-based syntax for seeing whether two values are &#x201c;similar&#x201d; to one another or not. Second, recall that the referredTopics field of objects of datatype ChirpMessageType is a bag of strings. This query sets the context for its similarity join by requesting that Jaccard-based similarity semantics (<a class="externalLink" href="http://en.wikipedia.org/wiki/Jaccard_index">http://en.wikipedia.org/wiki/Jaccard_index</a>) be used for the query&#x2019;s similarity operator and that a similarity index of 0.3 be used as its similarity threshold.</p>
870<p>The expected result for this fuzzy join query is:</p>
871
872<div class="source">
873<div class="source">
874<pre> { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;11&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;NilaMilliron_tw&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 445, &quot;statusesCount&quot;: 164, &quot;name&quot;: &quot;Nila Milliron&quot;, &quot;followersCount&quot;: 22649 }, &quot;senderLocation&quot;: point(&quot;37.59,68.42&quot;), &quot;sendTime&quot;: datetime(&quot;2008-03-09T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;x-phone&quot;, &quot;platform&quot; }}, &quot;messageText&quot;: &quot; can't stand x-phone its platform is terrible&quot; }, &quot;similarChirps&quot;: [ {{ &quot;x-phone&quot;, &quot;voice-clarity&quot; }}, {{ &quot;product-y&quot;, &quot;platform&quot; }} ] }
875 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;2&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;ColineGeyer@63&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 121, &quot;statusesCount&quot;: 362, &quot;name&quot;: &quot;Coline Geyer&quot;, &quot;followersCount&quot;: 17159 }, &quot;senderLocation&quot;: point(&quot;32.84,67.14&quot;), &quot;sendTime&quot;: datetime(&quot;2010-05-13T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;ccast&quot;, &quot;shortcut-menu&quot; }}, &quot;messageText&quot;: &quot; like ccast its shortcut-menu is awesome:)&quot; }, &quot;similarChirps&quot;: [ {{ &quot;ccast&quot;, &quot;voicemail-service&quot; }}, {{ &quot;ccast&quot;, &quot;voice-clarity&quot; }}, {{ &quot;product-z&quot;, &quot;shortcut-menu&quot; }} ] }
876 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;4&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;NathanGiesen@211&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 39339, &quot;statusesCount&quot;: 473, &quot;name&quot;: &quot;Nathan Giesen&quot;, &quot;followersCount&quot;: 49416 }, &quot;senderLocation&quot;: point(&quot;39.28,70.48&quot;), &quot;sendTime&quot;: datetime(&quot;2011-12-26T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;product-b&quot;, &quot;voice-command&quot; }}, &quot;messageText&quot;: &quot; like product-b the voice-command is mind-blowing:)&quot; }, &quot;similarChirps&quot;: [ {{ &quot;product-y&quot;, &quot;voice-command&quot; }} ] }
877 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;9&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;NathanGiesen@211&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 39339, &quot;statusesCount&quot;: 473, &quot;name&quot;: &quot;Nathan Giesen&quot;, &quot;followersCount&quot;: 49416 }, &quot;senderLocation&quot;: point(&quot;36.86,74.62&quot;), &quot;sendTime&quot;: datetime(&quot;2012-07-21T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;ccast&quot;, &quot;voicemail-service&quot; }}, &quot;messageText&quot;: &quot; love ccast its voicemail-service is awesome&quot; }, &quot;similarChirps&quot;: [ {{ &quot;ccast&quot;, &quot;shortcut-menu&quot; }}, {{ &quot;ccast&quot;, &quot;voice-clarity&quot; }} ] }
878 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;1&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;NathanGiesen@211&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 39339, &quot;statusesCount&quot;: 473, &quot;name&quot;: &quot;Nathan Giesen&quot;, &quot;followersCount&quot;: 49416 }, &quot;senderLocation&quot;: point(&quot;47.44,80.65&quot;), &quot;sendTime&quot;: datetime(&quot;2008-04-26T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;product-z&quot;, &quot;customization&quot; }}, &quot;messageText&quot;: &quot; love product-z its customization is good:)&quot; }, &quot;similarChirps&quot;: [ {{ &quot;product-z&quot;, &quot;shortcut-menu&quot; }} ] }
879 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;5&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;NathanGiesen@211&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 39339, &quot;statusesCount&quot;: 473, &quot;name&quot;: &quot;Nathan Giesen&quot;, &quot;followersCount&quot;: 49416 }, &quot;senderLocation&quot;: point(&quot;40.09,92.69&quot;), &quot;sendTime&quot;: datetime(&quot;2006-08-04T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;product-w&quot;, &quot;speed&quot; }}, &quot;messageText&quot;: &quot; can't stand product-w its speed is terrible:(&quot; }, &quot;similarChirps&quot;: [ {{ &quot;product-w&quot;, &quot;speed&quot; }} ] }
880 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;3&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;NathanGiesen@211&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 39339, &quot;statusesCount&quot;: 473, &quot;name&quot;: &quot;Nathan Giesen&quot;, &quot;followersCount&quot;: 49416 }, &quot;senderLocation&quot;: point(&quot;29.72,75.8&quot;), &quot;sendTime&quot;: datetime(&quot;2006-11-04T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;product-w&quot;, &quot;speed&quot; }}, &quot;messageText&quot;: &quot; like product-w the speed is good:)&quot; }, &quot;similarChirps&quot;: [ {{ &quot;product-w&quot;, &quot;speed&quot; }} ] }
881 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;6&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;ColineGeyer@63&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 121, &quot;statusesCount&quot;: 362, &quot;name&quot;: &quot;Coline Geyer&quot;, &quot;followersCount&quot;: 17159 }, &quot;senderLocation&quot;: point(&quot;47.51,83.99&quot;), &quot;sendTime&quot;: datetime(&quot;2010-05-07T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;x-phone&quot;, &quot;voice-clarity&quot; }}, &quot;messageText&quot;: &quot; like x-phone the voice-clarity is good:)&quot; }, &quot;similarChirps&quot;: [ {{ &quot;x-phone&quot;, &quot;platform&quot; }}, {{ &quot;ccast&quot;, &quot;voice-clarity&quot; }} ] }
882 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;7&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;ChangEwing_573&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 182, &quot;statusesCount&quot;: 394, &quot;name&quot;: &quot;Chang Ewing&quot;, &quot;followersCount&quot;: 32136 }, &quot;senderLocation&quot;: point(&quot;36.21,72.6&quot;), &quot;sendTime&quot;: datetime(&quot;2011-08-25T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;product-y&quot;, &quot;platform&quot; }}, &quot;messageText&quot;: &quot; like product-y the platform is good&quot; }, &quot;similarChirps&quot;: [ {{ &quot;x-phone&quot;, &quot;platform&quot; }}, {{ &quot;product-y&quot;, &quot;voice-command&quot; }} ] }
883 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;10&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;ColineGeyer@63&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 121, &quot;statusesCount&quot;: 362, &quot;name&quot;: &quot;Coline Geyer&quot;, &quot;followersCount&quot;: 17159 }, &quot;senderLocation&quot;: point(&quot;29.15,76.53&quot;), &quot;sendTime&quot;: datetime(&quot;2008-01-26T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;ccast&quot;, &quot;voice-clarity&quot; }}, &quot;messageText&quot;: &quot; hate ccast its voice-clarity is OMG:(&quot; }, &quot;similarChirps&quot;: [ {{ &quot;ccast&quot;, &quot;shortcut-menu&quot; }}, {{ &quot;ccast&quot;, &quot;voicemail-service&quot; }}, {{ &quot;x-phone&quot;, &quot;voice-clarity&quot; }} ] }
884 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;12&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;OliJackson_512&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 445, &quot;statusesCount&quot;: 164, &quot;name&quot;: &quot;Oli Jackson&quot;, &quot;followersCount&quot;: 22649 }, &quot;senderLocation&quot;: point(&quot;24.82,94.63&quot;), &quot;sendTime&quot;: datetime(&quot;2010-02-13T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;product-y&quot;, &quot;voice-command&quot; }}, &quot;messageText&quot;: &quot; like product-y the voice-command is amazing:)&quot; }, &quot;similarChirps&quot;: [ {{ &quot;product-b&quot;, &quot;voice-command&quot; }}, {{ &quot;product-y&quot;, &quot;platform&quot; }} ] }
885 { &quot;chirp&quot;: { &quot;chirpId&quot;: &quot;8&quot;, &quot;user&quot;: { &quot;screenName&quot;: &quot;NathanGiesen@211&quot;, &quot;lang&quot;: &quot;en&quot;, &quot;friendsCount&quot;: 39339, &quot;statusesCount&quot;: 473, &quot;name&quot;: &quot;Nathan Giesen&quot;, &quot;followersCount&quot;: 49416 }, &quot;senderLocation&quot;: point(&quot;46.05,93.34&quot;), &quot;sendTime&quot;: datetime(&quot;2005-10-14T10:10:00.000Z&quot;), &quot;referredTopics&quot;: {{ &quot;product-z&quot;, &quot;shortcut-menu&quot; }}, &quot;messageText&quot;: &quot; like product-z the shortcut-menu is awesome:)&quot; }, &quot;similarChirps&quot;: [ {{ &quot;ccast&quot;, &quot;shortcut-menu&quot; }}, {{ &quot;product-z&quot;, &quot;customization&quot; }} ] }
886</pre></div></div></div>
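<p>The Jaccard test used by Query 11 can be sketched in a few lines of Python. This is a simplified illustration rather than AsterixDB&#x2019;s implementation: it treats each referredTopics bag as a set (the sample bags contain no duplicates), it assumes that a similarity equal to or above the threshold counts as a match, and the function names are hypothetical.</p>

<div class="source">
<div class="source">
<pre>
```python
def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a.union(b)
    if not union:
        return 0.0  # assumption: two empty collections are not similar
    return len(a.intersection(b)) / len(union)


def similar(a, b, threshold=0.3):
    return jaccard(a, b) >= threshold


print(jaccard(["x-phone", "platform"], ["x-phone", "voice-clarity"]))
print(similar(["x-phone", "platform"], ["product-w", "speed"]))
```
</pre></div></div>
<p>Two chirps that each mention two topics and share exactly one of them have a Jaccard similarity of 1/3, which clears the 0.3 threshold. That is why every pair listed in the expected result shares at least one topic.</p>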
887<div class="section">
<h3><a name="Deleting_Existing_Data"></a>Deleting Existing Data</h3>
<p>In addition to inserting new data, AsterixDB supports deletion from datasets via the AQL <i>delete</i> statement. The statement supports &#x201c;searched delete&#x201d; semantics, and its <i>where</i> clause can involve any valid AQL expression.</p>
<p>The following example deletes the chirp that we just added from user &#x201c;NathanGiesen@211&#x201d;. (Easy come, easy go. :-))</p>
891
892<div class="source">
893<div class="source">
<pre>    use dataverse TinySocial;

    delete $cm from dataset ChirpMessages where $cm.chirpId = &quot;13&quot;;
</pre></div></div>
898<p>It should be noted that one form of data change not yet supported by AsterixDB is in-place data modification (<i>update</i>). Currently, only insert and delete operations are supported; update is not. To achieve the effect of an update, two statements are currently needed&#x2014;one to delete the old object from the dataset where it resides, and another to insert the new replacement object (with the same primary key but with different field values for some of the associated data content).</p></div>
899<div class="section">
900<h3><a name="Upserting_Data"></a>Upserting Data</h3>
901<p>In addition to loading, querying, inserting, and deleting data, AsterixDB supports upserting objects using the AQL <i>upsert</i> statement.</p>
<p>The following example deletes the chirp with chirpId = 20 (if one exists) and inserts the new chirp with chirpId = 20 by user &#x201c;SwanSmitty&#x201d; into the ChirpMessages dataset. The two operations (delete if found, then insert) are performed as a single atomic operation that is either performed completely or not at all.</p>
903
904<div class="source">
905<div class="source">
<pre>    use dataverse TinySocial;
    upsert into dataset ChirpMessages
    (
        {&quot;chirpId&quot;: &quot;20&quot;,
            &quot;user&quot;:
                {&quot;screenName&quot;: &quot;SwanSmitty&quot;,
                 &quot;lang&quot;: &quot;en&quot;,
                 &quot;friendsCount&quot;: 91345,
                 &quot;statusesCount&quot;: 4079,
                 &quot;name&quot;: &quot;Swanson Smith&quot;,
                 &quot;followersCount&quot;: 50420
                },
            &quot;senderLocation&quot;: point(&quot;47.44,80.65&quot;),
            &quot;sendTime&quot;: datetime(&quot;2008-04-26T10:10:35&quot;),
            &quot;referredTopics&quot;: {{&quot;football&quot;}},
            &quot;messageText&quot;: &quot;football is the best sport, period.!&quot;
        }
    );
</pre></div></div>
925<p>The data to be upserted may be specified using any valid AQL query expression. For example, the following statement might be used to double the followers count of all existing users.</p>
926
927<div class="source">
928<div class="source">
<pre>    use dataverse TinySocial;
    upsert into dataset ChirpUsers
    (
        for $user in dataset ChirpUsers
        return {
            &quot;screenName&quot;: $user.screenName,
            &quot;lang&quot;: $user.lang,
            &quot;friendsCount&quot;: $user.friendsCount,
            &quot;statusesCount&quot;: $user.statusesCount,
            &quot;name&quot;: $user.name,
            &quot;followersCount&quot;: $user.followersCount * 2
        }
    );
</pre></div></div>
<p>Note that such an upsert operation is executed in two steps: the query is performed, after which the query&#x2019;s locks are released, and then its result is upserted into the dataset. This means that an object can be modified between computing the query result and performing the upsert.</p></div>
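<p>The &#x201c;delete if found, then insert&#x201d; behavior of upsert can be pictured with a Python dictionary keyed by primary key. This is a minimal sketch of the semantics only; the <i>upsert</i> helper, the abridged objects, and the chirp with chirpId 21 are hypothetical, and nothing here models atomicity or locking.</p>

<div class="source">
<div class="source">
<pre>
```python
def upsert(dataset, record, key="chirpId"):
    # Replace the whole object if the key exists, insert it otherwise.
    dataset[record[key]] = record


# Hypothetical dataset contents, keyed by primary key.
chirp_messages = {"20": {"chirpId": "20", "messageText": "old text"}}

upsert(chirp_messages, {"chirpId": "20",
                        "messageText": "football is the best sport, period.!"})
upsert(chirp_messages, {"chirpId": "21", "messageText": "a new chirp"})

print(sorted(chirp_messages))
print(chirp_messages["20"]["messageText"])
```
</pre></div></div>
<p>The first call replaces the existing object with key &#x201c;20&#x201d; in its entirety; the second call adds a new object because no object with key &#x201c;21&#x201d; exists yet.</p>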
944<div class="section">
945<h3><a name="Transaction_Support"></a>Transaction Support</h3>
946<p>AsterixDB supports object-level ACID transactions that begin and terminate implicitly for each object inserted, deleted, or searched while a given AQL statement is being executed. This is quite similar to the level of transaction support found in today&#x2019;s NoSQL stores. AsterixDB does not support multi-statement transactions, and in fact an AQL statement that involves multiple objects can itself involve multiple independent object-level transactions. An example consequence of this is that, when an AQL statement attempts to insert 1000 objects, it is possible that the first 800 objects could end up being committed while the remaining 200 objects fail to be inserted. This situation could happen, for example, if a duplicate key exception occurs as the 801st insertion is attempted. If this happens, AsterixDB will report the error (e.g., a duplicate key exception) as the result of the offending AQL insert statement, and the application logic above will need to take the appropriate action(s) needed to assess the resulting state and to clean up and/or continue as appropriate.</p></div>
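<p>The 1000-object scenario above can be simulated with a short Python sketch. It only illustrates the consequence of object-level commits; the <i>DuplicateKeyError</i> class and <i>insert_all</i> function are hypothetical stand-ins, not AsterixDB APIs.</p>

<div class="source">
<div class="source">
<pre>
```python
class DuplicateKeyError(Exception):
    """Hypothetical stand-in for a duplicate key exception."""


def insert_all(dataset, records, key="id"):
    for record in records:
        if record[key] in dataset:
            raise DuplicateKeyError(record[key])
        # Each insert takes effect on its own; nothing is rolled back later.
        dataset[record[key]] = record


dataset = {}
# 1000 objects in which the 801st repeats the key of the first.
batch = ([{"id": i} for i in range(800)]
         + [{"id": 0}]
         + [{"id": i} for i in range(801, 1000)])

failed_key = None
try:
    insert_all(dataset, batch)
except DuplicateKeyError as error:
    failed_key = error.args[0]

print(len(batch), len(dataset), failed_key)
```
</pre></div></div>
<p>After the error, the first 800 objects remain in the dataset and the last 199 were never attempted, so the caller has to inspect the resulting state and decide whether to clean up or resume.</p>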
947<div class="section">
948<h3><a name="Loading_New_Data_in_Bulk"></a>Loading New Data in Bulk</h3>
949<p>In addition to incremental additions to datasets via the AQL <i>insert</i> statement, the <i>load</i> statement can be used to take a file from a given node and load it in a more efficient fashion. Note however that a dataset can currently only be loaded if it is empty.</p>
<p>The following example loads a file in ADM format, located at &#x201c;/home/user/gbu.adm&#x201d; on the node named &#x201c;nc1&#x201d;, into the GleambookUsers dataset.</p>
951
952<div class="source">
953<div class="source">
<pre>use dataverse TinySocial;

load dataset GleambookUsers using localfs
    ((&quot;path&quot;=&quot;nc1://home/user/gbu.adm&quot;),(&quot;format&quot;=&quot;adm&quot;));
</pre></div></div></div></div>
959<div class="section">
960<h2><a name="Further_Help"></a>Further Help</h2>
961<p>That&#x2019;s it! You are now armed and dangerous with respect to semistructured data management using AsterixDB and AQL.</p>
962<p>AsterixDB is a powerful new BDMS&#x2014;Big Data Management System&#x2014;that we hope may usher in a new era of much more declarative Big Data management. AsterixDB is powerful, so use it wisely, and remember: &#x201c;With great power comes great responsibility&#x2026;&#x201d; :-)</p>
963<p>Please e-mail the AsterixDB user group (users (at) asterixdb.apache.org) if you run into any problems or simply have further questions about the AsterixDB system, its features, or their proper use.</p></div>
964 </div>
965 </div>
966 </div>
967
968 <hr/>
969
970 <footer>
971 <div class="container-fluid">
972 <div class="row span12">Copyright &copy; 2017
973 <a href="https://www.apache.org/">The Apache Software Foundation</a>.
974 All Rights Reserved.
975
976 </div>
977
979<div class="row-fluid">Apache AsterixDB, AsterixDB, Apache, the Apache
980 feather logo, and the Apache AsterixDB project logo are either
981 registered trademarks or trademarks of The Apache Software
982 Foundation in the United States and other countries.
983 All other marks mentioned may be trademarks or registered
984 trademarks of their respective owners.</div>
985
986
987 </div>
988 </footer>
989 </body>
990</html>