<!DOCTYPE html>
2<!--
3 | Generated by Apache Maven Doxia at 2018-02-09
4 | Rendered using Apache Maven Fluido Skin 1.3.0
5-->
6<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
7 <head>
8 <meta charset="UTF-8" />
9 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
10 <meta name="Date-Revision-yyyymmdd" content="20180209" />
11 <meta http-equiv="Content-Language" content="en" />
12 <title>AsterixDB &#x2013; The SQL++ Query Language</title>
13 <link rel="stylesheet" href="../css/apache-maven-fluido-1.3.0.min.css" />
14 <link rel="stylesheet" href="../css/site.css" />
15 <link rel="stylesheet" href="../css/print.css" media="print" />
16
17
18 <script type="text/javascript" src="../js/apache-maven-fluido-1.3.0.min.js"></script>
19
20
21
24 </head>
25 <body class="topBarDisabled">
26
27
28
29
30 <div class="container-fluid">
31 <div id="banner">
32 <div class="pull-left">
33 <a href=".././" id="bannerLeft">
34 <img src="../images/asterixlogo.png" alt="AsterixDB"/>
35 </a>
36 </div>
37 <div class="pull-right"> </div>
38 <div class="clear"><hr/></div>
39 </div>
40
41 <div id="breadcrumbs">
42 <ul class="breadcrumb">
43
44
45 <li id="publishDate">Last Published: 2018-02-09</li>
46
47
48
49 <li id="projectVersion" class="pull-right">Version: 0.9.3</li>
50
51 <li class="divider pull-right">|</li>
52
53 <li class="pull-right"> <a href="../index.html" title="Documentation Home">
54 Documentation Home</a>
55 </li>
56
57 </ul>
58 </div>
59
60
61 <div class="row-fluid">
228
229
230 <div id="bodyColumn" class="span9" >
231
232 <!-- ! Licensed to the Apache Software Foundation (ASF) under one
233 ! or more contributor license agreements. See the NOTICE file
234 ! distributed with this work for additional information
235 ! regarding copyright ownership. The ASF licenses this file
236 ! to you under the Apache License, Version 2.0 (the
237 ! "License"); you may not use this file except in compliance
238 ! with the License. You may obtain a copy of the License at
239 !
240 ! http://www.apache.org/licenses/LICENSE-2.0
241 !
242 ! Unless required by applicable law or agreed to in writing,
243 ! software distributed under the License is distributed on an
244 ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
245 ! KIND, either express or implied. See the License for the
246 ! specific language governing permissions and limitations
247 ! under the License.
248 ! --><h1>The SQL++ Query Language</h1>
249
250<ul>
251
252<li><a href="#Introduction">1. Introduction</a></li>
253
254<li><a href="#Expressions">2. Expressions</a>
255
256<ul>
257
258<li><a href="#Operator_expressions">Operator Expressions</a>
259
260<ul>
261
262<li><a href="#Arithmetic_operators">Arithmetic Operators</a></li>
263
264<li><a href="#Collection_operators">Collection Operators</a></li>
265
266<li><a href="#Comparison_operators">Comparison Operators</a></li>
267
268<li><a href="#Logical_operators">Logical Operators</a></li>
269 </ul></li>
270
271<li><a href="#Case_expressions">Case Expressions</a></li>
272
273<li><a href="#Quantified_expressions">Quantified Expressions</a></li>
274
275<li><a href="#Path_expressions">Path Expressions</a></li>
276
277<li><a href="#Primary_expressions">Primary Expressions</a>
278
279<ul>
280
281<li><a href="#Literals">Literals</a></li>
282
283<li><a href="#Variable_references">Variable References</a></li>
284
285<li><a href="#Parenthesized_expressions">Parenthesized Expressions</a></li>
286
287<li><a href="#Function_call_expressions">Function call Expressions</a></li>
288
289<li><a href="#Constructors">Constructors</a></li>
290 </ul></li>
291 </ul></li>
292
293<li><a href="#Queries">3. Queries</a>
294
295<ul>
296
297<li><a href="#Declarations">Declarations</a></li>
298
299<li><a href="#SELECT_statements">SELECT Statements</a></li>
300
301<li><a href="#Select_clauses">SELECT Clauses</a>
302
303<ul>
304
305<li><a href="#Select_element">Select Element/Value/Raw</a></li>
306
307<li><a href="#SQL_select">SQL-style Select</a></li>
308
309<li><a href="#Select_star">Select *</a></li>
310
311<li><a href="#Select_distinct">Select Distinct</a></li>
312
313<li><a href="#Unnamed_projections">Unnamed Projections</a></li>
314
315<li><a href="#Abbreviated_field_access_expressions">Abbreviated Field Access Expressions</a></li>
316 </ul></li>
317
318<li><a href="#Unnest_clauses">UNNEST Clauses</a>
319
320<ul>
321
322<li><a href="#Inner_unnests">Inner Unnests</a></li>
323
324<li><a href="#Left_outer_unnests">Left Outer Unnests</a></li>
325
326<li><a href="#Expressing_joins_using_unnests">Expressing Joins Using Unnests</a></li>
327 </ul></li>
328
329<li><a href="#From_clauses">FROM clauses</a>
330
331<ul>
332
333<li><a href="#Binding_expressions">Binding Expressions</a></li>
334
335<li><a href="#Multiple_from_terms">Multiple From Terms</a></li>
336
337<li><a href="#Expressing_joins_using_from_terms">Expressing Joins Using From Terms</a></li>
338
339<li><a href="#Implicit_binding_variables">Implicit Binding Variables</a></li>
340 </ul></li>
341
342<li><a href="#Join_clauses">JOIN Clauses</a>
343
344<ul>
345
346<li><a href="#Inner_joins">Inner Joins</a></li>
347
348<li><a href="#Left_outer_joins">Left Outer Joins</a></li>
349 </ul></li>
350
351<li><a href="#Group_By_clauses">GROUP BY Clauses</a>
352
353<ul>
354
355<li><a href="#Group_variables">Group Variables</a></li>
356
357<li><a href="#Implicit_group_key_variables">Implicit Group Key Variables</a></li>
358
359<li><a href="#Implicit_group_variables">Implicit Group Variables</a></li>
360
361<li><a href="#Aggregation_functions">Aggregation Functions</a></li>
362
363<li><a href="#SQL-92_aggregation_functions">SQL-92 Aggregation Functions</a></li>
364
365<li><a href="#SQL-92_compliant_gby">SQL-92 Compliant GROUP BY Aggregations</a></li>
366
367<li><a href="#Column_aliases">Column Aliases</a></li>
368 </ul></li>
369
370<li><a href="#Where_having_clauses">WHERE Clauses and HAVING Clauses</a></li>
371
372<li><a href="#Order_By_clauses">ORDER BY Clauses</a></li>
373
374<li><a href="#Limit_clauses">LIMIT Clauses</a></li>
375
376<li><a href="#With_clauses">WITH Clauses</a></li>
377
378<li><a href="#Let_clauses">LET Clauses</a></li>
379
380<li><a href="#Union_all">UNION ALL</a></li>
381
382<li><a href="#Vs_SQL-92">SQL++ Vs. SQL-92</a></li>
383 </ul></li>
384
385<li><a href="#Errors">4. Errors</a>
386
387<ul>
388
389<li><a href="#Syntax_errors">Syntax Errors</a></li>
390
391<li><a href="#Identifier_resolution_errors">Identifier Resolution Errors</a></li>
392
393<li><a href="#Type_errors">Type Errors</a></li>
394
395<li><a href="#Resource_errors">Resource Errors</a></li>
396 </ul></li>
397
398<li><a href="#DDL_and_DML_statements">5. DDL and DML Statements</a>
399
400<ul>
401
402<li><a href="#Lifecycle_management_statements">Lifecycle Management Statements</a>
403
404<ul>
405
406<li><a href="#Dataverses">Dataverses</a></li>
407
408<li><a href="#Types">Types</a></li>
409
410<li><a href="#Datasets">Datasets</a></li>
411
412<li><a href="#Indices">Indices</a></li>
413
414<li><a href="#Functions">Functions</a></li>
415
416<li><a href="#Removal">Removal</a></li>
417
418<li><a href="#Load_statement">Load Statement</a></li>
419 </ul></li>
420
421<li><a href="#Modification_statements">Modification Statements</a>
422
423<ul>
424
425<li><a href="#Inserts">Inserts</a></li>
426
427<li><a href="#Upserts">Upserts</a></li>
428
429<li><a href="#Deletes">Deletes</a></li>
430 </ul></li>
431 </ul></li>
432
433<li><a href="#Reserved_keywords">Appendix 1. Reserved Keywords</a></li>
434
435<li><a href="#Performance_tuning">Appendix 2. Performance Tuning</a>
436
437<ul>
438
439<li><a href="#Parallelism_parameter">Parallelism Parameter</a></li>
440
441<li><a href="#Memory_parameters">Memory Parameters</a></li>
442 </ul></li>
443</ul>
461<h1><a name="Introduction" id="Introduction">1. Introduction</a><font size="3" /></h1>
<p>This document is intended as a reference guide to the full syntax and semantics of the SQL++ Query Language, a SQL-inspired language for working with semistructured data. SQL++ has much in common with SQL, but the two differ wherever the data models they were designed to serve differ. SQL was designed in the 1970s for interacting with the flat, schema-ified world of relational databases, while SQL++ is much newer and targets the nested, schema-optional (or even schema-less) world of modern NoSQL systems.</p>
<p>In the context of Apache AsterixDB, SQL++ is intended for working with the Asterix Data Model (<a href="../datamodel.html">ADM</a>), a data model based on a superset of JSON with an enriched and flexible type system. New AsterixDB users are encouraged to read and work through the (much friendlier) guide &#x201c;<a href="primer-sqlpp.html">AsterixDB 101: An ADM and SQL++ Primer</a>&#x201d; before attempting to make use of this document. Readers are also advised to read through the <a href="../datamodel.html">Asterix Data Model (ADM) reference guide</a> first, as an understanding of the data model is a prerequisite to understanding SQL++.</p>
464<p>In what follows, we detail the features of the SQL++ language in a grammar-guided manner. We list and briefly explain each of the productions in the SQL++ grammar, offering examples (and results) for clarity.</p>
482<h1><a name="Expressions" id="Expressions">2. Expressions</a></h1>
<p>SQL++ is a highly composable expression language. Each SQL++ expression returns zero or more data model instances. There are three major kinds of expressions in SQL++. At the topmost level, a SQL++ expression can be an OperatorExpression (similar to a mathematical expression), a CaseExpression (to choose between alternative values), or a QuantifiedExpression (which yields a boolean value). Each will be detailed as we explore the full SQL++ grammar.</p>
501
502<div class="source">
503<div class="source">
504<pre>Expression ::= OperatorExpression | CaseExpression | QuantifiedExpression
505</pre></div></div>
506<p>Note that in the following text, words enclosed in angle brackets denote keywords that are not case-sensitive.</p>
507<div class="section">
508<h2><a name="Operator_Expressions"></a><a name="Operator_expressions" id="Operator_expressions">Operator Expressions</a></h2>
509<p>Operators perform a specific operation on the input values or expressions. The syntax of an operator expression is as follows:</p>
510
511<div class="source">
512<div class="source">
513<pre>OperatorExpression ::= PathExpression
514 | Operator OperatorExpression
515 | OperatorExpression Operator (OperatorExpression)?
516 | OperatorExpression &lt;BETWEEN&gt; OperatorExpression &lt;AND&gt; OperatorExpression
517</pre></div></div>
518<p>SQL++ provides a full set of operators that you can use within its statements. Here are the categories of operators:</p>
519
520<ul>
521
522<li><a href="#Arithmetic_operators">Arithmetic Operators</a>, to perform basic mathematical operations;</li>
523
524<li><a href="#Collection_operators">Collection Operators</a>, to evaluate expressions on collections or objects;</li>
525
526<li><a href="#Comparison_operators">Comparison Operators</a>, to compare two expressions;</li>
527
528<li><a href="#Logical_operators">Logical Operators</a>, to combine operators using Boolean logic.</li>
529</ul>
530<p>The following table summarizes the precedence order (from higher to lower) of the major unary and binary operators:</p>
531
532<table border="0" class="table table-striped">
533 <thead>
534
535<tr class="a">
536
537<th>Operator </th>
538
539<th>Operation </th>
540 </tr>
541 </thead>
542 <tbody>
543
544<tr class="b">
545
546<td>EXISTS, NOT EXISTS </td>
547
548<td>Collection emptiness testing </td>
549 </tr>
550
551<tr class="a">
552
553<td>^ </td>
554
555<td>Exponentiation </td>
556 </tr>
557
558<tr class="b">
559
560<td>*, /, % </td>
561
562<td>Multiplication, division, modulo </td>
563 </tr>
564
565<tr class="a">
566
567<td>+, - </td>
568
569<td>Addition, subtraction </td>
570 </tr>
571
572<tr class="b">
573
574<td>|| </td>
575
576<td>String concatenation </td>
577 </tr>
578
579<tr class="a">
580
581<td>IS NULL, IS NOT NULL, IS MISSING, IS NOT MISSING, <br />IS UNKNOWN, IS NOT UNKNOWN</td>
582
583<td>Unknown value comparison </td>
584 </tr>
585
586<tr class="b">
587
588<td>BETWEEN, NOT BETWEEN </td>
589
590<td>Range comparison (inclusive on both sides) </td>
591 </tr>
592
593<tr class="a">
594
595<td>=, !=, &lt;&gt;, &lt;, &gt;, &lt;=, &gt;=, LIKE, NOT LIKE, IN, NOT IN </td>
596
597<td>Comparison </td>
598 </tr>
599
600<tr class="b">
601
602<td>NOT </td>
603
604<td>Logical negation </td>
605 </tr>
606
607<tr class="a">
608
609<td>AND </td>
610
611<td>Conjunction </td>
612 </tr>
613
614<tr class="b">
615
616<td>OR </td>
617
618<td>Disjunction </td>
619 </tr>
620 </tbody>
621</table>
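<p>As a minimal sketch of how the precedence order above applies, the following queries show the grouping that the table implies for a few mixed expressions. The parenthesized readings in the comments are derived from the table and are not output captured from a running system.</p>

<div class="source">
<div class="source">
<pre>-- * binds tighter than +, so this is read as 1 + (2 * 3)
SELECT VALUE 1 + 2 * 3;

-- comparison binds tighter than NOT, so this is read as NOT (1 = 2)
SELECT VALUE NOT 1 = 2;

-- AND binds tighter than OR, so this is read as TRUE OR (FALSE AND FALSE)
SELECT VALUE TRUE OR FALSE AND FALSE;
</pre></div></div>
<p>Adding explicit parentheses is a reasonable habit whenever the intended grouping differs from these defaults.</p>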
<p>In general, if any operand evaluates to a <tt>MISSING</tt> value, the enclosing operator will return <tt>MISSING</tt>; if no operand evaluates to a <tt>MISSING</tt> value but at least one operand evaluates to a <tt>NULL</tt> value, the enclosing operator will return <tt>NULL</tt>. However, there are a few exceptions listed in <a href="#Comparison_operators">comparison operators</a> and <a href="#Logical_operators">logical operators</a>.</p>
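<p>The general rule can be sketched with literal operands. The results noted in the comments follow from the rule as stated here; they are not captured output.</p>

<div class="source">
<div class="source">
<pre>-- a MISSING operand makes the result MISSING
SELECT VALUE 1 + MISSING;

-- no MISSING operand, but a NULL operand, makes the result NULL
SELECT VALUE 1 + NULL;

-- MISSING takes priority when both NULL and MISSING are present
SELECT VALUE NULL + MISSING;
</pre></div></div>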
623<div class="section">
624<h3><a name="Arithmetic_Operators"></a><a name="Arithmetic_operators" id="Arithmetic_operators">Arithmetic Operators</a></h3>
625<p>Arithmetic operators are used to exponentiate, add, subtract, multiply, and divide numeric values, or concatenate string values.</p>
626
627<table border="0" class="table table-striped">
628 <thead>
629
630<tr class="a">
631
632<th>Operator </th>
633
634<th>Purpose </th>
635
636<th>Example </th>
637 </tr>
638 </thead>
639 <tbody>
640
641<tr class="b">
642
643<td>+, - </td>
644
645<td>As unary operators, they denote a <br />positive or negative expression </td>
646
647<td>SELECT VALUE -1; </td>
648 </tr>
649
650<tr class="a">
651
652<td>+, - </td>
653
654<td>As binary operators, they add or subtract </td>
655
656<td>SELECT VALUE 1 + 2; </td>
657 </tr>
658
659<tr class="b">
660
661<td>*, /, % </td>
662
663<td>Multiply, divide, modulo </td>
664
665<td>SELECT VALUE 4 / 2.0; </td>
666 </tr>
667
668<tr class="a">
669
670<td>^ </td>
671
672<td>Exponentiation </td>
673
674<td>SELECT VALUE 2^3; </td>
675 </tr>
676
677<tr class="b">
678
679<td>|| </td>
680
681<td>String concatenation </td>
682
<td>SELECT VALUE &quot;ab&quot;||&quot;c&quot;||&quot;d&quot;; </td>
684 </tr>
685 </tbody>
686</table></div>
687<div class="section">
688<h3><a name="Collection_Operators"></a><a name="Collection_operators" id="Collection_operators">Collection Operators</a></h3>
689<p>Collection operators are used for membership tests (IN, NOT IN) or empty collection tests (EXISTS, NOT EXISTS).</p>
690
691<table border="0" class="table table-striped">
692 <thead>
693
694<tr class="a">
695
696<th>Operator </th>
697
698<th>Purpose </th>
699
700<th>Example </th>
701 </tr>
702 </thead>
703 <tbody>
704
705<tr class="b">
706
707<td>IN </td>
708
709<td>Membership test </td>
710
<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.user.lang IN [&quot;en&quot;, &quot;de&quot;]; </td>
712 </tr>
713
714<tr class="a">
715
716<td>NOT IN </td>
717
718<td>Non-membership test </td>
719
<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.user.lang NOT IN [&quot;en&quot;]; </td>
721 </tr>
722
723<tr class="b">
724
725<td>EXISTS </td>
726
727<td>Check whether a collection is not empty </td>
728
729<td>SELECT * FROM ChirpMessages cm <br />WHERE EXISTS cm.referredTopics; </td>
730 </tr>
731
732<tr class="a">
733
734<td>NOT EXISTS </td>
735
736<td>Check whether a collection is empty </td>
737
738<td>SELECT * FROM ChirpMessages cm <br />WHERE NOT EXISTS cm.referredTopics; </td>
739 </tr>
740 </tbody>
741</table></div>
742<div class="section">
743<h3><a name="Comparison_Operators"></a><a name="Comparison_operators" id="Comparison_operators">Comparison Operators</a></h3>
<p>Comparison operators are used to compare values. They fall into two sub-categories: missing value comparisons and regular value comparisons. SQL++ (and JSON) has two ways of representing missing information in an object: the presence of the field with a NULL for its value (as in SQL), and the absence of the field (which JSON permits). For example, the first of the following objects represents Jack, whose friend is Jill. In the other examples, Jake is friendless in the SQL manner, with a friend field that is NULL, while Joe is friendless in a way that is more natural for JSON, i.e., by not having a friend field at all.</p>
745<div class="section">
746<div class="section">
747<h5><a name="Examples"></a>Examples</h5>
<p>{&quot;name&quot;: &quot;Jack&quot;, &quot;friend&quot;: &quot;Jill&quot;}</p>
<p>{&quot;name&quot;: &quot;Jake&quot;, &quot;friend&quot;: NULL}</p>
<p>{&quot;name&quot;: &quot;Joe&quot;}</p>
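<p>To make the distinction concrete, the sketch below applies the missing value comparison operators to these three objects, supplied inline as a collection. The field aliases (<tt>friend_is_null</tt> and so on) are names chosen only for this illustration, and the results in the comments are what the missing value comparison table in this section predicts rather than captured output.</p>

<div class="source">
<div class="source">
<pre>SELECT p.name,
       p.friend IS NULL    AS friend_is_null,
       p.friend IS MISSING AS friend_is_missing,
       p.friend IS UNKNOWN AS friend_is_unknown
FROM [
  {&quot;name&quot;: &quot;Jack&quot;, &quot;friend&quot;: &quot;Jill&quot;},
  {&quot;name&quot;: &quot;Jake&quot;, &quot;friend&quot;: NULL},
  {&quot;name&quot;: &quot;Joe&quot;}
] AS p;

-- Jack: IS NULL is FALSE,   IS MISSING is FALSE, IS UNKNOWN is FALSE
-- Jake: IS NULL is TRUE,    IS MISSING is FALSE, IS UNKNOWN is TRUE
-- Joe:  IS NULL is MISSING, IS MISSING is TRUE,  IS UNKNOWN is TRUE
</pre></div></div>
<p>Note that for Joe, <tt>IS NULL</tt> itself yields <tt>MISSING</tt>, so <tt>IS UNKNOWN</tt> is the safer test when either form of absent information should be treated the same way.</p>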
751<p>The following table enumerates all of SQL++&#x2019;s comparison operators.</p>
752
753<table border="0" class="table table-striped">
754 <thead>
755
756<tr class="a">
757
758<th>Operator </th>
759
760<th>Purpose </th>
761
762<th>Example </th>
763 </tr>
764 </thead>
765 <tbody>
766
767<tr class="b">
768
769<td>IS NULL </td>
770
771<td>Test if a value is NULL </td>
772
773<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.user.name IS NULL; </td>
774 </tr>
775
776<tr class="a">
777
778<td>IS NOT NULL </td>
779
780<td>Test if a value is not NULL </td>
781
782<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.user.name IS NOT NULL; </td>
783 </tr>
784
785<tr class="b">
786
787<td>IS MISSING </td>
788
789<td>Test if a value is MISSING </td>
790
791<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.user.name IS MISSING; </td>
792 </tr>
793
794<tr class="a">
795
796<td>IS NOT MISSING </td>
797
798<td>Test if a value is not MISSING </td>
799
800<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.user.name IS NOT MISSING;</td>
801 </tr>
802
803<tr class="b">
804
805<td>IS UNKNOWN </td>
806
807<td>Test if a value is NULL or MISSING </td>
808
809<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.user.name IS UNKNOWN; </td>
810 </tr>
811
812<tr class="a">
813
814<td>IS NOT UNKNOWN </td>
815
816<td>Test if a value is neither NULL nor MISSING </td>
817
818<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.user.name IS NOT UNKNOWN;</td>
819 </tr>
820
821<tr class="b">
822
823<td>BETWEEN </td>
824
<td>Test if a value is between a start value and <br />an end value. The comparison is inclusive <br />of both the start and end values. </td>
826
827<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.chirpId BETWEEN 10 AND 20;</td>
828 </tr>
829
830<tr class="a">
831
832<td>= </td>
833
834<td>Equality test </td>
835
836<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.chirpId=10; </td>
837 </tr>
838
839<tr class="b">
840
841<td>!= </td>
842
843<td>Inequality test </td>
844
845<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.chirpId!=10;</td>
846 </tr>
847
848<tr class="a">
849
850<td>&lt;&gt; </td>
851
852<td>Inequality test </td>
853
854<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.chirpId&lt;&gt;10;</td>
855 </tr>
856
857<tr class="b">
858
859<td>&lt; </td>
860
861<td>Less than </td>
862
863<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.chirpId&lt;10; </td>
864 </tr>
865
866<tr class="a">
867
868<td>&gt; </td>
869
870<td>Greater than </td>
871
872<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.chirpId&gt;10; </td>
873 </tr>
874
875<tr class="b">
876
877<td>&lt;= </td>
878
879<td>Less than or equal to </td>
880
881<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.chirpId&lt;=10; </td>
882 </tr>
883
884<tr class="a">
885
886<td>&gt;= </td>
887
888<td>Greater than or equal to </td>
889
890<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.chirpId&gt;=10; </td>
891 </tr>
892
893<tr class="b">
894
895<td>LIKE </td>
896
<td>Test if the left side matches a<br /> pattern defined on the right<br /> side; in the pattern, &#x201c;%&#x201d; matches <br />any string while &#x201c;_&#x201d; matches <br /> any single character. </td>

<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.user.name LIKE &quot;%Giesen%&quot;;</td>
900 </tr>
901
902<tr class="a">
903
904<td>NOT LIKE </td>
905
<td>Test if the left side does not <br />match a pattern defined on the right<br /> side; in the pattern, &#x201c;%&#x201d; matches <br />any string while &#x201c;_&#x201d; matches <br /> any single character. </td>

<td>SELECT * FROM ChirpMessages cm <br />WHERE cm.user.name NOT LIKE &quot;%Giesen%&quot;;</td>
909 </tr>
910 </tbody>
911</table>
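<p>Because <tt>BETWEEN</tt> is inclusive of both bounds, it can be read as shorthand for a pair of comparisons. As a sketch, assuming a <tt>ChirpMessages</tt> dataset with a numeric <tt>chirpId</tt> field as in the examples above, the two queries below are expected to select the same objects:</p>

<div class="source">
<div class="source">
<pre>-- range test written with BETWEEN
SELECT * FROM ChirpMessages cm
WHERE cm.chirpId BETWEEN 10 AND 20;

-- the same range written with explicit comparisons
SELECT * FROM ChirpMessages cm
WHERE cm.chirpId &gt;= 10 AND cm.chirpId &lt;= 20;
</pre></div></div>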
912<p>The following table summarizes how the missing value comparison operators work.</p>
913
914<table border="0" class="table table-striped">
915 <thead>
916
917<tr class="a">
918
919<th>Operator </th>
920
921<th>Non-NULL/Non-MISSING value </th>
922
923<th>NULL </th>
924
925<th>MISSING </th>
926 </tr>
927 </thead>
928 <tbody>
929
930<tr class="b">
931
932<td>IS NULL </td>
933
934<td>FALSE </td>
935
936<td>TRUE </td>
937
938<td>MISSING </td>
939 </tr>
940
941<tr class="a">
942
943<td>IS NOT NULL </td>
944
945<td>TRUE </td>
946
947<td>FALSE </td>
948
949<td>MISSING </td>
950 </tr>
951
952<tr class="b">
953
954<td>IS MISSING </td>
955
956<td>FALSE </td>
957
958<td>FALSE </td>
959
960<td>TRUE </td>
961 </tr>
962
963<tr class="a">
964
965<td>IS NOT MISSING </td>
966
967<td>TRUE </td>
968
969<td>TRUE </td>
970
971<td>FALSE </td>
972 </tr>
973
974<tr class="b">
975
976<td>IS UNKNOWN </td>
977
978<td>FALSE </td>
979
980<td>TRUE </td>
981
982<td>TRUE </td>
983 </tr>
984
985<tr class="a">
986
987<td>IS NOT UNKNOWN </td>
988
989<td>TRUE </td>
990
991<td>FALSE </td>
992
993<td>FALSE</td>
994 </tr>
995 </tbody>
996</table></div></div></div>
997<div class="section">
998<h3><a name="Logical_Operators"></a><a name="Logical_operators" id="Logical_operators">Logical Operators</a></h3>
999<p>Logical operators perform logical <tt>NOT</tt>, <tt>AND</tt>, and <tt>OR</tt> operations over Boolean values (<tt>TRUE</tt> and <tt>FALSE</tt>) plus <tt>NULL</tt> and <tt>MISSING</tt>.</p>
1000
1001<table border="0" class="table table-striped">
1002 <thead>
1003
1004<tr class="a">
1005
1006<th>Operator </th>
1007
1008<th>Purpose </th>
1009
1010<th>Example </th>
1011 </tr>
1012 </thead>
1013 <tbody>
1014
1015<tr class="b">
1016
1017<td>NOT </td>
1018
1019<td>Returns true if the following condition is false, otherwise returns false </td>
1020
1021<td>SELECT VALUE NOT TRUE; </td>
1022 </tr>
1023
1024<tr class="a">
1025
1026<td>AND </td>
1027
1028<td>Returns true if both branches are true, otherwise returns false </td>
1029
1030<td>SELECT VALUE TRUE AND FALSE; </td>
1031 </tr>
1032
1033<tr class="b">
1034
1035<td>OR </td>
1036
<td>Returns true if at least one branch is true, otherwise returns false </td>
1038
1039<td>SELECT VALUE FALSE OR FALSE; </td>
1040 </tr>
1041 </tbody>
1042</table>
1043<p>The following table is the truth table for <tt>AND</tt> and <tt>OR</tt>.</p>
1044
1045<table border="0" class="table table-striped">
1046 <thead>
1047
1048<tr class="a">
1049
1050<th>A </th>
1051
1052<th>B </th>
1053
1054<th>A AND B </th>
1055
1056<th>A OR B </th>
1057 </tr>
1058 </thead>
1059 <tbody>
1060
1061<tr class="b">
1062
1063<td>TRUE </td>
1064
1065<td>TRUE </td>
1066
1067<td>TRUE </td>
1068
1069<td>TRUE </td>
1070 </tr>
1071
1072<tr class="a">
1073
1074<td>TRUE </td>
1075
1076<td>FALSE </td>
1077
1078<td>FALSE </td>
1079
1080<td>TRUE </td>
1081 </tr>
1082
1083<tr class="b">
1084
1085<td>TRUE </td>
1086
1087<td>NULL </td>
1088
1089<td>NULL </td>
1090
1091<td>TRUE </td>
1092 </tr>
1093
1094<tr class="a">
1095
1096<td>TRUE </td>
1097
1098<td>MISSING </td>
1099
1100<td>MISSING </td>
1101
1102<td>TRUE </td>
1103 </tr>
1104
1105<tr class="b">
1106
1107<td>FALSE </td>
1108
1109<td>FALSE </td>
1110
1111<td>FALSE </td>
1112
1113<td>FALSE </td>
1114 </tr>
1115
1116<tr class="a">
1117
1118<td>FALSE </td>
1119
1120<td>NULL </td>
1121
1122<td>FALSE </td>
1123
1124<td>NULL </td>
1125 </tr>
1126
1127<tr class="b">
1128
1129<td>FALSE </td>
1130
1131<td>MISSING </td>
1132
1133<td>FALSE </td>
1134
1135<td>MISSING </td>
1136 </tr>
1137
1138<tr class="a">
1139
1140<td>NULL </td>
1141
1142<td>NULL </td>
1143
1144<td>NULL </td>
1145
1146<td>NULL </td>
1147 </tr>
1148
1149<tr class="b">
1150
1151<td>NULL </td>
1152
1153<td>MISSING </td>
1154
1155<td>MISSING </td>
1156
1157<td>NULL </td>
1158 </tr>
1159
1160<tr class="a">
1161
1162<td>MISSING </td>
1163
1164<td>MISSING </td>
1165
1166<td>MISSING </td>
1167
1168<td>MISSING </td>
1169 </tr>
1170 </tbody>
1171</table>
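<p>A few rows of the truth table above, written as queries. The results in the comments are read directly from the table and are not captured output.</p>

<div class="source">
<div class="source">
<pre>-- a FALSE branch decides a conjunction: the result is FALSE
SELECT VALUE FALSE AND NULL;

-- a TRUE branch decides a disjunction: the result is TRUE
SELECT VALUE TRUE OR MISSING;

-- neither branch decides the result, so the unknown value propagates: NULL
SELECT VALUE TRUE AND NULL;

-- with NULL and MISSING combined, AND yields MISSING while OR yields NULL
SELECT VALUE NULL AND MISSING;
SELECT VALUE NULL OR MISSING;
</pre></div></div>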
1172<p>The following table demonstrates the results of <tt>NOT</tt> on all possible inputs.</p>
1173
1174<table border="0" class="table table-striped">
1175 <thead>
1176
1177<tr class="a">
1178
1179<th>A </th>
1180
1181<th>NOT A </th>
1182 </tr>
1183 </thead>
1184 <tbody>
1185
1186<tr class="b">
1187
1188<td>TRUE </td>
1189
1190<td>FALSE </td>
1191 </tr>
1192
1193<tr class="a">
1194
1195<td>FALSE </td>
1196
1197<td>TRUE </td>
1198 </tr>
1199
1200<tr class="b">
1201
1202<td>NULL </td>
1203
1204<td>NULL </td>
1205 </tr>
1206
1207<tr class="a">
1208
1209<td>MISSING </td>
1210
1211<td>MISSING </td>
1212 </tr>
1213 </tbody>
1214</table></div></div>
1215<div class="section">
1216<h2><a name="Case_Expressions"></a><a name="Case_expressions" id="Case_expressions">Case Expressions</a></h2>
1217
1218<div class="source">
1219<div class="source">
1220<pre>CaseExpression ::= SimpleCaseExpression | SearchedCaseExpression
1221SimpleCaseExpression ::= &lt;CASE&gt; Expression ( &lt;WHEN&gt; Expression &lt;THEN&gt; Expression )+ ( &lt;ELSE&gt; Expression )? &lt;END&gt;
1222SearchedCaseExpression ::= &lt;CASE&gt; ( &lt;WHEN&gt; Expression &lt;THEN&gt; Expression )+ ( &lt;ELSE&gt; Expression )? &lt;END&gt;
1223</pre></div></div>
<p>In a simple <tt>CASE</tt> expression, the query evaluator searches for the first <tt>WHEN</tt> &#x2026; <tt>THEN</tt> pair in which the <tt>WHEN</tt> expression is equal to the expression following <tt>CASE</tt> and returns the expression following <tt>THEN</tt>. If none of the <tt>WHEN</tt> &#x2026; <tt>THEN</tt> pairs meet this condition, and an <tt>ELSE</tt> branch exists, it returns the <tt>ELSE</tt> expression. Otherwise, <tt>NULL</tt> is returned.</p>
<p>In a searched <tt>CASE</tt> expression, the query evaluator searches from left to right until it finds a <tt>WHEN</tt> expression that evaluates to <tt>TRUE</tt>, and then returns its corresponding <tt>THEN</tt> expression. If no condition is found to be <tt>TRUE</tt>, and an <tt>ELSE</tt> branch exists, it returns the <tt>ELSE</tt> expression. Otherwise, it returns <tt>NULL</tt>.</p>
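<p>As a minimal sketch of the searched form described above (the literal values are chosen here purely for illustration), the following expression has no expression after <tt>CASE</tt>, so its <tt>WHEN</tt> conditions are tested from left to right. Under the rules stated above it should yield &quot;less&quot;, because the first condition is <tt>FALSE</tt> and the second is <tt>TRUE</tt>.</p>

<div class="source">
<div class="source">
<pre>CASE
  WHEN 2 &gt; 3 THEN &quot;greater&quot;
  WHEN 2 &lt; 3 THEN &quot;less&quot;
  ELSE &quot;equal&quot;
END
</pre></div></div>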
1226<p>The following example illustrates the form of a case expression.</p>
1227<div class="section">
1228<div class="section">
1229<div class="section">
1230<h5><a name="Example"></a>Example</h5>
1231
1232<div class="source">
1233<div class="source">
<pre>CASE (2 &lt; 3) WHEN true THEN &quot;yes&quot; ELSE &quot;no&quot; END
</pre></div></div></div></div></div></div>
1236<div class="section">
1237<h2><a name="Quantified_Expressions"></a><a name="Quantified_expressions" id="Quantified_expressions">Quantified Expressions</a></h2>
1238
1239<div class="source">
1240<div class="source">
<pre>QuantifiedExpression ::= ( (&lt;ANY&gt;|&lt;SOME&gt;) | &lt;EVERY&gt; ) Variable &lt;IN&gt; Expression ( &quot;,&quot; Variable &lt;IN&gt; Expression )*
                         &lt;SATISFIES&gt; Expression (&lt;END&gt;)?
</pre></div></div>
1244<p>Quantified expressions are used for expressing existential or universal predicates involving the elements of a collection.</p>
<p>The following pair of examples illustrates the use of a quantified expression to test whether every (or some) element in the set [1, 2, 3] of integers is less than three. The first example yields <tt>FALSE</tt> and the second example yields <tt>TRUE</tt>.</p>
1246<p>It is useful to note that if the set were instead the empty set, the first expression would yield <tt>TRUE</tt> (&#x201c;every&#x201d; value in an empty set satisfies the condition) while the second expression would yield <tt>FALSE</tt> (since there isn&#x2019;t &#x201c;some&#x201d; value, as there are no values in the set, that satisfies the condition).</p>
1247<p>A quantified expression will return a <tt>NULL</tt> (or <tt>MISSING</tt>) if the first expression in it evaluates to <tt>NULL</tt> (or <tt>MISSING</tt>). A type error will be raised if the first expression in a quantified expression does not return a collection.</p>
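<p>The edge cases described above can be sketched as follows; these expressions are illustrative additions rather than examples from a test suite. According to the rules just stated, the first should yield <tt>TRUE</tt>, the second <tt>FALSE</tt>, and the third <tt>NULL</tt> (because the expression after <tt>IN</tt> evaluates to <tt>NULL</tt>).</p>

<div class="source">
<div class="source">
<pre>EVERY x IN [ ] SATISFIES x &lt; 3
SOME x IN [ ] SATISFIES x &lt; 3
SOME x IN NULL SATISFIES x &lt; 3
</pre></div></div>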
1248<div class="section">
1249<div class="section">
1250<div class="section">
1251<h5><a name="Examples"></a>Examples</h5>
1252
1253<div class="source">
1254<div class="source">
<pre>EVERY x IN [ 1, 2, 3 ] SATISFIES x &lt; 3
SOME x IN [ 1, 2, 3 ] SATISFIES x &lt; 3
</pre></div></div></div></div></div></div>
1258<div class="section">
1259<h2><a name="Path_Expressions"></a><a name="Path_expressions" id="Path_expressions">Path Expressions</a></h2>
1260
1261<div class="source">
1262<div class="source">
1263<pre>PathExpression ::= PrimaryExpression ( Field | Index )*
1264Field ::= &quot;.&quot; Identifier
1265Index ::= &quot;[&quot; ( Expression | &quot;?&quot; ) &quot;]&quot;
1266</pre></div></div>
<p>Components of complex types in the data model are accessed via path expressions. Path access can be applied to the result of a SQL++ expression that yields an instance of a complex type, for example, an object or array instance. For objects, path access is based on field names. For arrays, path access is based on (zero-based) array-style indexing. SQL++ also supports an &#x201c;I&#x2019;m feeling lucky&#x201d; style index accessor, <tt>[?]</tt>, for selecting an arbitrary element from an array. Attempts to access non-existent fields or out-of-bound array elements produce the special value <tt>MISSING</tt>. Type errors will be raised for inappropriate use of a path expression, such as applying a field accessor to a numeric value.</p>
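<p>As a small sketch of the <tt>MISSING</tt> and <tt>[?]</tt> behavior described above (the field name <tt>nickname</tt> is a hypothetical one chosen because the object does not contain it), the first two expressions below should produce <tt>MISSING</tt>, since the field does not exist and index 3 is out of bounds for a three-element array. The third selects an arbitrary element, so no particular result should be relied upon.</p>

<div class="source">
<div class="source">
<pre>({&quot;name&quot;: &quot;MyABCs&quot;, &quot;array&quot;: [ &quot;a&quot;, &quot;b&quot;, &quot;c&quot;]}).nickname

([&quot;a&quot;, &quot;b&quot;, &quot;c&quot;])[3]

([&quot;a&quot;, &quot;b&quot;, &quot;c&quot;])[?]
</pre></div></div>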
<p>The following examples illustrate field access for an object, index-based element access for an array, and also a composition thereof.</p>
1269<div class="section">
1270<div class="section">
1271<div class="section">
1272<h5><a name="Examples"></a>Examples</h5>
1273
1274<div class="source">
1275<div class="source">
<pre>({&quot;name&quot;: &quot;MyABCs&quot;, &quot;array&quot;: [ &quot;a&quot;, &quot;b&quot;, &quot;c&quot;]}).array

([&quot;a&quot;, &quot;b&quot;, &quot;c&quot;])[2]

({&quot;name&quot;: &quot;MyABCs&quot;, &quot;array&quot;: [ &quot;a&quot;, &quot;b&quot;, &quot;c&quot;]}).array[2]
</pre></div></div></div></div></div></div>
1282<div class="section">
1283<h2><a name="Primary_Expressions"></a><a name="Primary_expressions" id="Primary_expressions">Primary Expressions</a></h2>
1284
1285<div class="source">
1286<div class="source">
1287<pre>PrimaryExpr ::= Literal
1288 | VariableReference
1289 | ParenthesizedExpression
1290 | FunctionCallExpression
1291 | Constructor
1292</pre></div></div>
1293<p>The most basic building block for any SQL++ expression is PrimaryExpression. This can be a simple literal (constant) value, a reference to a query variable that is in scope, a parenthesized expression, a function call, or a newly constructed instance of the data model (such as a newly constructed object, array, or multiset of data model instances).</p></div>
1294<div class="section">
1295<h2><a name="Literals" id="Literals">Literals</a></h2>
1296
1297<div class="source">
1298<div class="source">
<pre>Literal ::= StringLiteral
    | IntegerLiteral
    | FloatLiteral
    | DoubleLiteral
    | &lt;NULL&gt;
    | &lt;MISSING&gt;
    | &lt;TRUE&gt;
    | &lt;FALSE&gt;
StringLiteral ::= &quot;\&quot;&quot; (
      &lt;EscapeQuot&gt;
    | &lt;EscapeBslash&gt;
    | &lt;EscapeSlash&gt;
    | &lt;EscapeBspace&gt;
    | &lt;EscapeFormf&gt;
    | &lt;EscapeNl&gt;
    | &lt;EscapeCr&gt;
    | &lt;EscapeTab&gt;
    | ~[&quot;\&quot;&quot;,&quot;\\&quot;])*
    &quot;\&quot;&quot;
    | &quot;\'&quot;(
      &lt;EscapeApos&gt;
    | &lt;EscapeBslash&gt;
    | &lt;EscapeSlash&gt;
    | &lt;EscapeBspace&gt;
    | &lt;EscapeFormf&gt;
    | &lt;EscapeNl&gt;
    | &lt;EscapeCr&gt;
    | &lt;EscapeTab&gt;
    | ~[&quot;\'&quot;,&quot;\\&quot;])*
    &quot;\'&quot;
&lt;EscapeApos&gt; ::= &quot;\\\'&quot;
&lt;EscapeQuot&gt; ::= &quot;\\\&quot;&quot;
&lt;EscapeBslash&gt; ::= &quot;\\\\&quot;
&lt;EscapeSlash&gt; ::= &quot;\\/&quot;
&lt;EscapeBspace&gt; ::= &quot;\\b&quot;
&lt;EscapeFormf&gt; ::= &quot;\\f&quot;
&lt;EscapeNl&gt; ::= &quot;\\n&quot;
&lt;EscapeCr&gt; ::= &quot;\\r&quot;
&lt;EscapeTab&gt; ::= &quot;\\t&quot;

IntegerLiteral ::= &lt;DIGITS&gt;
&lt;DIGITS&gt; ::= [&quot;0&quot; - &quot;9&quot;]+
FloatLiteral ::= &lt;DIGITS&gt; ( &quot;f&quot; | &quot;F&quot; )
    | &lt;DIGITS&gt; ( &quot;.&quot; &lt;DIGITS&gt; ( &quot;f&quot; | &quot;F&quot; ) )?
    | &quot;.&quot; &lt;DIGITS&gt; ( &quot;f&quot; | &quot;F&quot; )
DoubleLiteral ::= &lt;DIGITS&gt; &quot;.&quot; &lt;DIGITS&gt;
    | &quot;.&quot; &lt;DIGITS&gt;
</pre></div></div>
<p>Literals (constants) in SQL++ can be strings, integers, floating point values, double values, boolean constants, or special constant values like <tt>NULL</tt> and <tt>MISSING</tt>. The <tt>NULL</tt> value is like a <tt>NULL</tt> in SQL; it is used to represent an unknown field value. The special value <tt>MISSING</tt> is only meaningful in the context of SQL++ field accesses; it occurs when the accessed field simply does not exist at all in an object being accessed.</p>
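<p>To complement the description above, here is a short sketch of the remaining literal kinds, written directly from the grammar: a double, a float (with the <tt>f</tt> suffix), a boolean, the two special constants, and two strings that use the quote escapes. The specific values are arbitrary illustrations.</p>

<div class="source">
<div class="source">
<pre>3.14159
2.5f
true
NULL
MISSING
'it\'s'
&quot;a \&quot;quoted\&quot; word&quot;
</pre></div></div>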
1348<p>The following are some simple examples of SQL++ literals.</p>
1349<div class="section">
1350<div class="section">
1351<div class="section">
1352<h5><a name="Examples"></a>Examples</h5>
1353
1354<div class="source">
1355<div class="source">
<pre>'a string'
&quot;test string&quot;
42
</pre></div></div>
<p>Unlike in standard SQL, double quotes play the same role as single quotes and may be used for string literals in SQL++.</p></div></div></div>
1361<div class="section">
1362<h3><a name="Variable_References"></a><a name="Variable_references" id="Variable_references">Variable References</a></h3>
1363
1364<div class="source">
1365<div class="source">
1366<pre>VariableReference ::= &lt;IDENTIFIER&gt;|&lt;DelimitedIdentifier&gt;
1367&lt;IDENTIFIER&gt; ::= &lt;LETTER&gt; (&lt;LETTER&gt; | &lt;DIGIT&gt; | &quot;_&quot; | &quot;$&quot;)*
1368&lt;LETTER&gt; ::= [&quot;A&quot; - &quot;Z&quot;, &quot;a&quot; - &quot;z&quot;]
1369DelimitedIdentifier ::= &quot;`&quot; (&lt;EscapeQuot&gt;
1370 | &lt;EscapeBslash&gt;
1371 | &lt;EscapeSlash&gt;
1372 | &lt;EscapeBspace&gt;
1373 | &lt;EscapeFormf&gt;
1374 | &lt;EscapeNl&gt;
1375 | &lt;EscapeCr&gt;
1376 | &lt;EscapeTab&gt;
1377 | ~[&quot;`&quot;,&quot;\\&quot;])*
1378 &quot;`&quot;
1379</pre></div></div>
1380<p>A variable in SQL++ can be bound to any legal data model value. A variable reference refers to the value to which an in-scope variable is bound. (E.g., a variable binding may originate from one of the <tt>FROM</tt>, <tt>WITH</tt> or <tt>LET</tt> clauses of a <tt>SELECT</tt> statement or from an input parameter in the context of a function body.) Backticks, for example, `id`, are used for delimited identifiers. Delimiting is needed when a variable&#x2019;s desired name clashes with a SQL++ keyword or includes characters not allowed in regular identifiers.</p>
1381<div class="section">
1382<div class="section">
1383<h5><a name="Examples"></a>Examples</h5>
1384
1385<div class="source">
1386<div class="source">
<pre>tweet
id
`SELECT`
`my-function`
</pre></div></div></div></div></div>
1392<div class="section">
1393<h3><a name="Parenthesized_Expressions"></a><a name="Parenthesized_expressions" id="Parenthesized_expressions">Parenthesized Expressions</a></h3>
1394
1395<div class="source">
1396<div class="source">
1397<pre>ParenthesizedExpression ::= &quot;(&quot; Expression &quot;)&quot; | Subquery
1398</pre></div></div>
<p>An expression can be parenthesized to control the precedence order or otherwise clarify a query. In SQL++, for composability, a subquery is also a parenthesized expression.</p>
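<p>Because a subquery is itself a parenthesized expression, it can be composed with other expression forms such as an index accessor. The sketch below assumes the <tt>GleambookUsers</tt> sample dataset that this document uses in its <tt>SELECT</tt> examples; with that data it would be expected to yield the name of the user whose <tt>id</tt> is 1.</p>

<div class="source">
<div class="source">
<pre>( SELECT VALUE u.name
  FROM GleambookUsers u
  WHERE u.id = 1 )[0]
</pre></div></div>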
1400<p>The following expression evaluates to the value 2.</p>
1401<div class="section">
1402<div class="section">
1403<h5><a name="Example"></a>Example</h5>
1404
1405<div class="source">
1406<div class="source">
<pre>( 1 + 1 )
</pre></div></div></div></div></div>
1409<div class="section">
1410<h3><a name="Function_Call_Expressions"></a><a name="Function_call_expressions" id="Function_call_expressions">Function Call Expressions</a></h3>
1411
1412<div class="source">
1413<div class="source">
1414<pre>FunctionCallExpression ::= FunctionName &quot;(&quot; ( Expression ( &quot;,&quot; Expression )* )? &quot;)&quot;
1415</pre></div></div>
1416<p>Functions are included in SQL++, like most languages, as a way to package useful functionality or to componentize complicated or reusable SQL++ computations. A function call is a legal SQL++ query expression that represents the value resulting from the evaluation of its body expression with the given parameter bindings; the parameter value bindings can themselves be any SQL++ expressions.</p>
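<p>Since parameter bindings can be arbitrary SQL++ expressions, a function argument may itself be, for instance, a path expression. The following illustrative sketch passes the <tt>name</tt> field of a constructed object to the built-in <tt>length</tt> function; it should evaluate to 6, the number of characters in &quot;MyABCs&quot;.</p>

<div class="source">
<div class="source">
<pre>length(({&quot;name&quot;: &quot;MyABCs&quot;}).name)
</pre></div></div>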
1417<p>The following example is a (built-in) function call expression whose value is 8.</p>
1418<div class="section">
1419<div class="section">
1420<h5><a name="Example"></a>Example</h5>
1421
1422<div class="source">
1423<div class="source">
<pre>length('a string')
</pre></div></div></div></div></div>
1426<div class="section">
1427<h3><a name="Constructors" id="Constructors">Constructors</a></h3>
1428
1429<div class="source">
1430<div class="source">
1431<pre>Constructor ::= ArrayConstructor | MultisetConstructor | ObjectConstructor
1432ArrayConstructor ::= &quot;[&quot; ( Expression ( &quot;,&quot; Expression )* )? &quot;]&quot;
1433MultisetConstructor ::= &quot;{{&quot; ( Expression ( &quot;,&quot; Expression )* )? &quot;}}&quot;
1434ObjectConstructor ::= &quot;{&quot; ( FieldBinding ( &quot;,&quot; FieldBinding )* )? &quot;}&quot;
1435FieldBinding ::= Expression &quot;:&quot; Expression
1436</pre></div></div>
1437<p>A major feature of SQL++ is its ability to construct new data model instances. This is accomplished using its constructors for each of the model&#x2019;s complex object structures, namely arrays, multisets, and objects. Arrays are like JSON arrays, while multisets have bag semantics. Objects are built from fields that are field-name/field-value pairs, again like JSON.</p>
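<p>As a brief sketch of the constructor forms in the grammar above, the expressions below build a multiset (in which the duplicate <tt>'b'</tt> is retained, since multisets have bag semantics), an empty array, and an object whose field values are computed by nested expressions. The values and field names are arbitrary illustrations.</p>

<div class="source">
<div class="source">
<pre>{{ 'a', 'b', 'b' }}

[ ]

{ 'total': 1 + 2, 'items': [ 1, 2 ] }
</pre></div></div>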
<p>The following examples illustrate how to construct two new arrays with 4 items each and a new object with 2 fields. Array elements can be homogeneous (as in the first example), which is the common case, or they may be heterogeneous (as in the second example). The data values and field name values used to construct arrays, multisets, and objects in constructors are all simply SQL++ expressions. Thus, the collection elements, field names, and field values used in constructors can be simple literals or they can come from query variable references or even arbitrarily complex SQL++ expressions (including subqueries). Type errors will be raised if the field names in an object are not strings, and duplicate field errors will be raised if they are not distinct.</p>
1439<div class="section">
1440<div class="section">
1441<h5><a name="Examples"></a>Examples</h5>
1442
1443<div class="source">
1444<div class="source">
<pre>[ 'a', 'b', 'c', 'c' ]

[ 42, &quot;forty-two!&quot;, { &quot;rank&quot; : &quot;Captain&quot;, &quot;name&quot;: &quot;America&quot; }, 3.14159 ]

{
  'project name': 'Hyracks',
  'project members': [ 'vinayakb', 'dtabass', 'chenli', 'tsotras', 'tillw' ]
}
</pre></div></div>
1454<!-- ! Licensed to the Apache Software Foundation (ASF) under one
1455 ! or more contributor license agreements. See the NOTICE file
1456 ! distributed with this work for additional information
1457 ! regarding copyright ownership. The ASF licenses this file
1458 ! to you under the Apache License, Version 2.0 (the
1459 ! "License"); you may not use this file except in compliance
1460 ! with the License. You may obtain a copy of the License at
1461 !
1462 ! http://www.apache.org/licenses/LICENSE-2.0
1463 !
1464 ! Unless required by applicable law or agreed to in writing,
1465 ! software distributed under the License is distributed on an
1466 ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
1467 ! KIND, either express or implied. See the License for the
1468 ! specific language governing permissions and limitations
1469 ! under the License.
1470 ! -->
1471<h1><a name="Queries" id="Queries">3. Queries</a></h1>
1472<p>A SQL++ query can be any legal SQL++ expression or <tt>SELECT</tt> statement. A SQL++ query always ends with a semicolon.</p>
1473
1474<div class="source">
1475<div class="source">
1476<pre>Query ::= (Expression | SelectStatement) &quot;;&quot;
1477</pre></div></div>
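<p>As a minimal sketch of the two alternatives in this grammar rule, the first query below is a plain expression and the second is a <tt>SELECT</tt> statement; both are terminated by a semicolon. The arithmetic is chosen only for illustration.</p>

<div class="source">
<div class="source">
<pre>1 + 1;

SELECT VALUE 1 + 1;
</pre></div></div>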
</div></div></div></div>
1495<div class="section">
1496<h2><a name="Declarations" id="Declarations">Declarations</a></h2>
1497
1498<div class="source">
1499<div class="source">
1500<pre>DatabaseDeclaration ::= &quot;USE&quot; Identifier
1501</pre></div></div>
1502<p>At the uppermost level, the world of data is organized into data namespaces called <b>dataverses</b>. To set the default dataverse for a series of statements, the USE statement is provided in SQL++.</p>
1503<p>As an example, the following statement sets the default dataverse to be &#x201c;TinySocial&#x201d;.</p>
1504<div class="section">
1505<div class="section">
1506<div class="section">
1507<h5><a name="Example"></a>Example</h5>
1508
1509<div class="source">
1510<div class="source">
<pre>USE TinySocial;
</pre></div></div>
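<p>Once a default dataverse has been set, subsequent statements can refer to its datasets by their unqualified names. The sketch below assumes that a dataset named <tt>GleambookUsers</tt> (the sample dataset used in this document's <tt>SELECT</tt> examples) has been created in the <tt>TinySocial</tt> dataverse; that placement is an assumption made for illustration.</p>

<div class="source">
<div class="source">
<pre>USE TinySocial;

SELECT VALUE u.name
FROM GleambookUsers u;
</pre></div></div>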
<p>When writing a complex SQL++ query, it can sometimes be helpful to define one or more auxiliary functions that each address a sub-piece of the overall query. The <tt>DECLARE FUNCTION</tt> statement supports the creation of such helper functions. In general, the function body (expression) can be any legal SQL++ query expression.</p>
1531
1532<div class="source">
1533<div class="source">
1534<pre>FunctionDeclaration ::= &quot;DECLARE&quot; &quot;FUNCTION&quot; Identifier ParameterList &quot;{&quot; Expression &quot;}&quot;
1535ParameterList ::= &quot;(&quot; ( &lt;VARIABLE&gt; ( &quot;,&quot; &lt;VARIABLE&gt; )* )? &quot;)&quot;
1536</pre></div></div>
1537<p>The following is a simple example of a temporary SQL++ function definition and its use.</p></div>
1538<div class="section">
1539<h5><a name="Example"></a>Example</h5>
1540
1541<div class="source">
1542<div class="source">
<pre>DECLARE FUNCTION friendInfo(userId) {
    (SELECT u.id, u.name, len(u.friendIds) AS friendCount
     FROM GleambookUsers u
     WHERE u.id = userId)[0]
};

SELECT VALUE friendInfo(2);
</pre></div></div>
1551<p>For our sample data set, this returns:</p>
1552
1553<div class="source">
1554<div class="source">
<pre>[
  { &quot;id&quot;: 2, &quot;name&quot;: &quot;IsbelDull&quot;, &quot;friendCount&quot;: 2 }
]
</pre></div></div>
</div></div></div></div>
1576<div class="section">
1577<h2><a name="SELECT_Statements"></a><a name="SELECT_statements" id="SELECT_statements">SELECT Statements</a></h2>
1578<p>The following shows the (rich) grammar for the <tt>SELECT</tt> statement in SQL++.</p>
1579
1580<div class="source">
1581<div class="source">
1582<pre>SelectStatement ::= ( WithClause )?
1583 SelectSetOperation (OrderbyClause )? ( LimitClause )?
1584SelectSetOperation ::= SelectBlock (&lt;UNION&gt; &lt;ALL&gt; ( SelectBlock | Subquery ) )*
1585Subquery ::= &quot;(&quot; SelectStatement &quot;)&quot;
1586
1587SelectBlock ::= SelectClause
1588 ( FromClause ( LetClause )?)?
1589 ( WhereClause )?
1590 ( GroupbyClause ( LetClause )? ( HavingClause )? )?
1591 |
1592 FromClause ( LetClause )?
1593 ( WhereClause )?
1594 ( GroupbyClause ( LetClause )? ( HavingClause )? )?
1595 SelectClause
1596
1597SelectClause ::= &lt;SELECT&gt; ( &lt;ALL&gt; | &lt;DISTINCT&gt; )? ( SelectRegular | SelectValue )
1598SelectRegular ::= Projection ( &quot;,&quot; Projection )*
1599SelectValue ::= ( &lt;VALUE&gt; | &lt;ELEMENT&gt; | &lt;RAW&gt; ) Expression
1600Projection ::= ( Expression ( &lt;AS&gt; )? Identifier | &quot;*&quot; )
1601
1602FromClause ::= &lt;FROM&gt; FromTerm ( &quot;,&quot; FromTerm )*
1603FromTerm ::= Expression (( &lt;AS&gt; )? Variable)?
1604 ( ( JoinType )? ( JoinClause | UnnestClause ) )*
1605
1606JoinClause ::= &lt;JOIN&gt; Expression (( &lt;AS&gt; )? Variable)? &lt;ON&gt; Expression
1607UnnestClause ::= ( &lt;UNNEST&gt; | &lt;CORRELATE&gt; | &lt;FLATTEN&gt; ) Expression
1608 ( &lt;AS&gt; )? Variable ( &lt;AT&gt; Variable )?
1609JoinType ::= ( &lt;INNER&gt; | &lt;LEFT&gt; ( &lt;OUTER&gt; )? )
1610
1611WithClause ::= &lt;WITH&gt; WithElement ( &quot;,&quot; WithElement )*
1612LetClause ::= (&lt;LET&gt; | &lt;LETTING&gt;) LetElement ( &quot;,&quot; LetElement )*
1613LetElement ::= Variable &quot;=&quot; Expression
1614WithElement ::= Variable &lt;AS&gt; Expression
1615
1616WhereClause ::= &lt;WHERE&gt; Expression
1617
1618GroupbyClause ::= &lt;GROUP&gt; &lt;BY&gt; Expression ( ( (&lt;AS&gt;)? Variable )?
1619 ( &quot;,&quot; Expression ( (&lt;AS&gt;)? Variable )? )* )
1620 ( &lt;GROUP&gt; &lt;AS&gt; Variable
1621 (&quot;(&quot; Variable &lt;AS&gt; VariableReference
1622 (&quot;,&quot; Variable &lt;AS&gt; VariableReference )* &quot;)&quot;)?
1623 )?
1624HavingClause ::= &lt;HAVING&gt; Expression
1625
1626OrderbyClause ::= &lt;ORDER&gt; &lt;BY&gt; Expression ( &lt;ASC&gt; | &lt;DESC&gt; )?
1627 ( &quot;,&quot; Expression ( &lt;ASC&gt; | &lt;DESC&gt; )? )*
1628LimitClause ::= &lt;LIMIT&gt; Expression ( &lt;OFFSET&gt; Expression )?
1629</pre></div></div>
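<p>Two features of this grammar are worth a quick sketch: a <tt>SelectBlock</tt> may place its <tt>SELECT</tt> clause last, after the <tt>FROM</tt> and <tt>WHERE</tt> clauses, and a <tt>SelectStatement</tt> may end with <tt>ORDER BY</tt> and <tt>LIMIT</tt> clauses. The queries below are illustrative only and use the <tt>GleambookUsers</tt> sample dataset introduced just below; with that data, the second query would be expected to return the users with ids 3 and 2.</p>

<div class="source">
<div class="source">
<pre>FROM GleambookUsers u
WHERE u.id = 1
SELECT VALUE u.name;

SELECT u.id, u.name
FROM GleambookUsers u
ORDER BY u.id DESC
LIMIT 2;
</pre></div></div>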
1630<p>In this section, we will make use of two stored collections of objects (datasets), <tt>GleambookUsers</tt> and <tt>GleambookMessages</tt>, in a series of running examples to explain <tt>SELECT</tt> queries. The contents of the example collections are as follows:</p>
1631<p><tt>GleambookUsers</tt> collection (or, dataset):</p>
1632
1633<div class="source">
1634<div class="source">
1635<pre>[ {
1636 &quot;id&quot;:1,
1637 &quot;alias&quot;:&quot;Margarita&quot;,
1638 &quot;name&quot;:&quot;MargaritaStoddard&quot;,
1639 &quot;nickname&quot;:&quot;Mags&quot;,
1640 &quot;userSince&quot;:&quot;2012-08-20T10:10:00&quot;,
1641 &quot;friendIds&quot;:[2,3,6,10],
1642 &quot;employment&quot;:[{
1643 &quot;organizationName&quot;:&quot;Codetechno&quot;,
1644 &quot;start-date&quot;:&quot;2006-08-06&quot;
1645 },
1646 {
1647 &quot;organizationName&quot;:&quot;geomedia&quot;,
1648 &quot;start-date&quot;:&quot;2010-06-17&quot;,
1649 &quot;end-date&quot;:&quot;2010-01-26&quot;
1650 }],
1651 &quot;gender&quot;:&quot;F&quot;
1652},
1653{
1654 &quot;id&quot;:2,
1655 &quot;alias&quot;:&quot;Isbel&quot;,
1656 &quot;name&quot;:&quot;IsbelDull&quot;,
1657 &quot;nickname&quot;:&quot;Izzy&quot;,
1658 &quot;userSince&quot;:&quot;2011-01-22T10:10:00&quot;,
1659 &quot;friendIds&quot;:[1,4],
1660 &quot;employment&quot;:[{
1661 &quot;organizationName&quot;:&quot;Hexviafind&quot;,
1662 &quot;startDate&quot;:&quot;2010-04-27&quot;
1663 }]
1664},
1665{
1666 &quot;id&quot;:3,
1667 &quot;alias&quot;:&quot;Emory&quot;,
1668 &quot;name&quot;:&quot;EmoryUnk&quot;,
1669 &quot;userSince&quot;:&quot;2012-07-10T10:10:00&quot;,
1670 &quot;friendIds&quot;:[1,5,8,9],
1671 &quot;employment&quot;:[{
1672 &quot;organizationName&quot;:&quot;geomedia&quot;,
1673 &quot;startDate&quot;:&quot;2010-06-17&quot;,
1674 &quot;endDate&quot;:&quot;2010-01-26&quot;
1675 }]
1676} ]
1677</pre></div></div>
1678<p><tt>GleambookMessages</tt> collection (or, dataset):</p>
1679
1680<div class="source">
1681<div class="source">
1682<pre>[ {
1683 &quot;messageId&quot;:2,
1684 &quot;authorId&quot;:1,
1685 &quot;inResponseTo&quot;:4,
1686 &quot;senderLocation&quot;:[41.66,80.87],
1687 &quot;message&quot;:&quot; dislike x-phone its touch-screen is horrible&quot;
1688},
1689{
1690 &quot;messageId&quot;:3,
1691 &quot;authorId&quot;:2,
1692 &quot;inResponseTo&quot;:4,
1693 &quot;senderLocation&quot;:[48.09,81.01],
1694 &quot;message&quot;:&quot; like product-y the plan is amazing&quot;
1695},
1696{
1697 &quot;messageId&quot;:4,
1698 &quot;authorId&quot;:1,
1699 &quot;inResponseTo&quot;:2,
1700 &quot;senderLocation&quot;:[37.73,97.04],
1701 &quot;message&quot;:&quot; can't stand acast the network is horrible:(&quot;
1702},
1703{
1704 &quot;messageId&quot;:6,
1705 &quot;authorId&quot;:2,
1706 &quot;inResponseTo&quot;:1,
1707 &quot;senderLocation&quot;:[31.5,75.56],
1708 &quot;message&quot;:&quot; like product-z its platform is mind-blowing&quot;
},
1710{
1711 &quot;messageId&quot;:8,
1712 &quot;authorId&quot;:1,
1713 &quot;inResponseTo&quot;:11,
1714 &quot;senderLocation&quot;:[40.33,80.87],
1715 &quot;message&quot;:&quot; like ccast the 3G is awesome:)&quot;
1716},
1717{
1718 &quot;messageId&quot;:10,
1719 &quot;authorId&quot;:1,
1720 &quot;inResponseTo&quot;:12,
1721 &quot;senderLocation&quot;:[42.5,70.01],
1722 &quot;message&quot;:&quot; can't stand product-w the touch-screen is terrible&quot;
1723},
1724{
1725 &quot;messageId&quot;:11,
1726 &quot;authorId&quot;:1,
1727 &quot;inResponseTo&quot;:1,
1728 &quot;senderLocation&quot;:[38.97,77.49],
1729 &quot;message&quot;:&quot; can't stand acast its plan is terrible&quot;
1730} ]
1731</pre></div></div></div>
1732<div class="section">
1733<h2><a name="SELECT_Clause"></a><a name="Select_clauses" id="Select_clauses">SELECT Clause</a></h2>
1734<p>The SQL++ <tt>SELECT</tt> clause always returns a collection value as its result (even if the result is empty or a singleton).</p>
1735<div class="section">
1736<h3><a name="Select_ElementValueRaw"></a><a name="Select_element" id="Select_element">Select Element/Value/Raw</a></h3>
1737<p>The <tt>SELECT VALUE</tt> clause in SQL++ returns an array or multiset that contains the results of evaluating the <tt>VALUE</tt> expression, with one evaluation being performed per &#x201c;binding tuple&#x201d; (i.e., per <tt>FROM</tt> clause item) satisfying the statement&#x2019;s selection criteria. For historical reasons SQL++ also allows the keywords <tt>ELEMENT</tt> or <tt>RAW</tt> to be used in place of <tt>VALUE</tt> (not recommended).</p>
1738<p>If there is no FROM clause, the expression after <tt>VALUE</tt> is evaluated once with no binding tuples (except those inherited from an outer environment).</p>
1739<div class="section">
1740<div class="section">
1741<h5><a name="Example"></a>Example</h5>
1742
1743<div class="source">
1744<div class="source">
<pre>SELECT VALUE 1;
</pre></div></div>
1747<p>This query returns:</p>
1748
1749<div class="source">
1750<div class="source">
<pre>[
  1
]
</pre></div></div>
1755<p>The following example shows a query that selects one user from the GleambookUsers collection.</p></div>
1756<div class="section">
1757<h5><a name="Example"></a>Example</h5>
1758
1759<div class="source">
1760<div class="source">
<pre>SELECT VALUE user
FROM GleambookUsers user
WHERE user.id = 1;
</pre></div></div>
1765<p>This query returns:</p>
1766
1767<div class="source">
1768<div class="source">
1769<pre>[{
1770 &quot;userSince&quot;: &quot;2012-08-20T10:10:00.000Z&quot;,
1771 &quot;friendIds&quot;: [
1772 2,
1773 3,
1774 6,
1775 10
1776 ],
1777 &quot;gender&quot;: &quot;F&quot;,
1778 &quot;name&quot;: &quot;MargaritaStoddard&quot;,
1779 &quot;nickname&quot;: &quot;Mags&quot;,
1780 &quot;alias&quot;: &quot;Margarita&quot;,
1781 &quot;id&quot;: 1,
1782 &quot;employment&quot;: [
1783 {
1784 &quot;organizationName&quot;: &quot;Codetechno&quot;,
1785 &quot;start-date&quot;: &quot;2006-08-06&quot;
1786 },
1787 {
1788 &quot;end-date&quot;: &quot;2010-01-26&quot;,
1789 &quot;organizationName&quot;: &quot;geomedia&quot;,
1790 &quot;start-date&quot;: &quot;2010-06-17&quot;
1791 }
1792 ]
1793} ]
1794</pre></div></div></div></div></div>
1795<div class="section">
1796<h3><a name="SQL-style_SELECT"></a><a name="SQL_select" id="SQL_select">SQL-style SELECT</a></h3>
<p>In SQL++, the traditional SQL-style <tt>SELECT</tt> syntax is also supported. This syntax can also be reformulated in a <tt>SELECT VALUE</tt> based manner in SQL++. (E.g., <tt>SELECT expA AS fldA, expB AS fldB</tt> is syntactic sugar for <tt>SELECT VALUE { 'fldA': expA, 'fldB': expB }</tt>.) Unlike in SQL, the result of a SQL++ query does not preserve the order of expressions in the <tt>SELECT</tt> clause.</p>
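<p>To make the desugaring concrete, the sketch below writes a SQL-style projection of a user's <tt>alias</tt> and <tt>name</tt> (as <tt>user_alias</tt> and <tt>user_name</tt>) in its <tt>SELECT VALUE</tt> form, following the rewrite rule stated above. It is an illustration over the <tt>GleambookUsers</tt> sample dataset, not output taken from the system.</p>

<div class="source">
<div class="source">
<pre>SELECT VALUE { 'user_alias': user.alias, 'user_name': user.name }
FROM GleambookUsers user
WHERE user.id = 1;
</pre></div></div>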
1798<div class="section">
1799<div class="section">
1800<h5><a name="Example"></a>Example</h5>
1801
1802<div class="source">
1803<div class="source">
<pre>SELECT user.alias user_alias, user.name user_name
FROM GleambookUsers user
WHERE user.id = 1;
</pre></div></div>
1808<p>Returns:</p>
1809
1810<div class="source">
1811<div class="source">
<pre>[ {
    &quot;user_name&quot;: &quot;MargaritaStoddard&quot;,
    &quot;user_alias&quot;: &quot;Margarita&quot;
} ]
</pre></div></div></div></div></div>
1817<div class="section">
1818<h3><a name="SELECT_"></a><a name="Select_star" id="Select_star">SELECT *</a></h3>
<p>In SQL++, <tt>SELECT *</tt> returns an object with a nested field for each input tuple. Each field has as its field name the name of a binding variable generated by either the <tt>FROM</tt> clause or <tt>GROUP BY</tt> clause in the current enclosing <tt>SELECT</tt> statement, and its field value is the value of that binding variable.</p>
<p>Note that the result of <tt>SELECT *</tt> is different from the result of a query that selects all the fields of an object.</p>
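<p>For contrast, the following sketch returns each user object directly by using <tt>SELECT VALUE</tt> on the binding variable. Under the semantics described in this section, each element of its result would be the user object itself, rather than an object that wraps it in a field named after the binding variable as <tt>SELECT *</tt> does.</p>

<div class="source">
<div class="source">
<pre>SELECT VALUE user
FROM GleambookUsers user;
</pre></div></div>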
1821<div class="section">
1822<div class="section">
1823<h5><a name="Example"></a>Example</h5>
1824
1825<div class="source">
1826<div class="source">
<pre>SELECT *
FROM GleambookUsers user;
</pre></div></div>
1830<p>Since <tt>user</tt> is the only binding variable generated in the <tt>FROM</tt> clause, this query returns:</p>
1831
1832<div class="source">
1833<div class="source">
1834<pre>[ {
1835 &quot;user&quot;: {
1836 &quot;userSince&quot;: &quot;2012-08-20T10:10:00.000Z&quot;,
1837 &quot;friendIds&quot;: [
1838 2,
1839 3,
1840 6,
1841 10
1842 ],
1843 &quot;gender&quot;: &quot;F&quot;,
1844 &quot;name&quot;: &quot;MargaritaStoddard&quot;,
1845 &quot;nickname&quot;: &quot;Mags&quot;,
1846 &quot;alias&quot;: &quot;Margarita&quot;,
1847 &quot;id&quot;: 1,
1848 &quot;employment&quot;: [
1849 {
1850 &quot;organizationName&quot;: &quot;Codetechno&quot;,
1851 &quot;start-date&quot;: &quot;2006-08-06&quot;
1852 },
1853 {
1854 &quot;end-date&quot;: &quot;2010-01-26&quot;,
1855 &quot;organizationName&quot;: &quot;geomedia&quot;,
1856 &quot;start-date&quot;: &quot;2010-06-17&quot;
1857 }
1858 ]
1859 }
1860}, {
1861 &quot;user&quot;: {
1862 &quot;userSince&quot;: &quot;2011-01-22T10:10:00.000Z&quot;,
1863 &quot;friendIds&quot;: [
1864 1,
1865 4
1866 ],
1867 &quot;name&quot;: &quot;IsbelDull&quot;,
1868 &quot;nickname&quot;: &quot;Izzy&quot;,
1869 &quot;alias&quot;: &quot;Isbel&quot;,
1870 &quot;id&quot;: 2,
1871 &quot;employment&quot;: [
1872 {
1873 &quot;organizationName&quot;: &quot;Hexviafind&quot;,
1874 &quot;startDate&quot;: &quot;2010-04-27&quot;
1875 }
1876 ]
1877 }
1878}, {
1879 &quot;user&quot;: {
1880 &quot;userSince&quot;: &quot;2012-07-10T10:10:00.000Z&quot;,
1881 &quot;friendIds&quot;: [
1882 1,
1883 5,
1884 8,
1885 9
1886 ],
1887 &quot;name&quot;: &quot;EmoryUnk&quot;,
1888 &quot;alias&quot;: &quot;Emory&quot;,
1889 &quot;id&quot;: 3,
1890 &quot;employment&quot;: [
1891 {
1892 &quot;organizationName&quot;: &quot;geomedia&quot;,
1893 &quot;endDate&quot;: &quot;2010-01-26&quot;,
1894 &quot;startDate&quot;: &quot;2010-06-17&quot;
1895 }
1896 ]
1897 }
1898} ]
1899</pre></div></div></div>
1900<div class="section">
1901<h5><a name="Example"></a>Example</h5>
1902
1903<div class="source">
1904<div class="source">
1905<pre>SELECT *
1906FROM GleambookUsers u, GleambookMessages m
1907WHERE m.authorId = u.id and u.id = 2;
1908</pre></div></div>
1909<p>This query does an inner join that we will discuss in <a href="#Multiple_from_terms">multiple from terms</a>. Since both <tt>u</tt> and <tt>m</tt> are binding variables generated in the <tt>FROM</tt> clause, this query returns:</p>
1910
1911<div class="source">
1912<div class="source">
1913<pre>[ {
1914 &quot;u&quot;: {
1915 &quot;userSince&quot;: &quot;2011-01-22T10:10:00&quot;,
1916 &quot;friendIds&quot;: [
1917 1,
1918 4
1919 ],
1920 &quot;name&quot;: &quot;IsbelDull&quot;,
1921 &quot;nickname&quot;: &quot;Izzy&quot;,
1922 &quot;alias&quot;: &quot;Isbel&quot;,
1923 &quot;id&quot;: 2,
1924 &quot;employment&quot;: [
1925 {
1926 &quot;organizationName&quot;: &quot;Hexviafind&quot;,
1927 &quot;startDate&quot;: &quot;2010-04-27&quot;
1928 }
1929 ]
1930 },
1931 &quot;m&quot;: {
1932 &quot;senderLocation&quot;: [
1933 31.5,
1934 75.56
1935 ],
1936 &quot;inResponseTo&quot;: 1,
1937 &quot;messageId&quot;: 6,
1938 &quot;authorId&quot;: 2,
1939 &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot;
1940 }
1941}, {
1942 &quot;u&quot;: {
1943 &quot;userSince&quot;: &quot;2011-01-22T10:10:00&quot;,
1944 &quot;friendIds&quot;: [
1945 1,
1946 4
1947 ],
1948 &quot;name&quot;: &quot;IsbelDull&quot;,
1949 &quot;nickname&quot;: &quot;Izzy&quot;,
1950 &quot;alias&quot;: &quot;Isbel&quot;,
1951 &quot;id&quot;: 2,
1952 &quot;employment&quot;: [
1953 {
1954 &quot;organizationName&quot;: &quot;Hexviafind&quot;,
1955 &quot;startDate&quot;: &quot;2010-04-27&quot;
1956 }
1957 ]
1958 },
1959 &quot;m&quot;: {
1960 &quot;senderLocation&quot;: [
1961 48.09,
1962 81.01
1963 ],
1964 &quot;inResponseTo&quot;: 4,
1965 &quot;messageId&quot;: 3,
1966 &quot;authorId&quot;: 2,
1967 &quot;message&quot;: &quot; like product-y the plan is amazing&quot;
1968 }
1969} ]
1970</pre></div></div></div></div></div>
1971<div class="section">
1972<h3><a name="SELECT_DISTINCT"></a><a name="Select_distinct" id="Select_distinct">SELECT DISTINCT</a></h3>
1973<p>SQL++&#x2019;s <tt>DISTINCT</tt> keyword is used to eliminate duplicate items in results. The following example shows how it works.</p>
1974<div class="section">
1975<div class="section">
1976<h5><a name="Example"></a>Example</h5>
1977
1978<div class="source">
1979<div class="source">
1980<pre>SELECT DISTINCT * FROM [1, 2, 2, 3] AS foo;
1981</pre></div></div>
1982<p>This query returns:</p>
1983
1984<div class="source">
1985<div class="source">
1986<pre>[ {
1987 &quot;foo&quot;: 1
1988}, {
1989 &quot;foo&quot;: 2
1990}, {
1991 &quot;foo&quot;: 3
1992} ]
1993</pre></div></div></div>
1994<div class="section">
1995<h5><a name="Example"></a>Example</h5>
1996
1997<div class="source">
1998<div class="source">
1999<pre>SELECT DISTINCT VALUE foo FROM [1, 2, 2, 3] AS foo;
2000</pre></div></div>
2001<p>This version of the query returns:</p>
2002
2003<div class="source">
2004<div class="source">
2005<pre>[ 1
2006, 2
2007, 3
2008 ]
2009</pre></div></div></div></div></div>
2010<div class="section">
2011<h3><a name="Unnamed_Projections"></a><a name="Unnamed_projections" id="Unnamed_projections">Unnamed Projections</a></h3>
<p>Similar to standard SQL, SQL++ supports unnamed projections (a.k.a. unnamed <tt>SELECT</tt> clause items), for which names are generated. Name generation has three cases:</p>
2013
2014<ul>
2015
2016<li>If a projection expression is a variable reference expression, its generated name is the name of the variable.</li>
2017
2018<li>If a projection expression is a field access expression, its generated name is the last identifier in the expression.</li>
2019
2020<li>For all other cases, the query processor will generate a unique name.</li>
2021</ul>
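<p>The three cases can be sketched in a single query. This is an illustrative query and not part of the original example set; the literal expression <tt>1 + 1</tt> is only there to trigger the third case:</p>

<div class="source">
<div class="source">
<pre>SELECT user, user.alias, 1 + 1
FROM GleambookUsers user
WHERE user.id = 1;
</pre></div></div>
<p>Under the rules above, the first item should be named <tt>user</tt> (a variable reference), the second <tt>alias</tt> (the last identifier of a field access), and the third should receive a system-generated name.</p>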
2022<div class="section">
2023<div class="section">
2024<h5><a name="Example"></a>Example</h5>
2025
2026<div class="source">
2027<div class="source">
2028<pre>SELECT substr(user.name, 10), user.alias
2029FROM GleambookUsers user
2030WHERE user.id = 1;
2031</pre></div></div>
2032<p>This query outputs:</p>
2033
2034<div class="source">
2035<div class="source">
2036<pre>[ {
2037 &quot;alias&quot;: &quot;Margarita&quot;,
2038 &quot;$1&quot;: &quot;Stoddard&quot;
2039} ]
2040</pre></div></div>
<p>In the result, <tt>$1</tt> is the generated name for <tt>substr(user.name, 10)</tt>, while <tt>alias</tt> is the generated name for <tt>user.alias</tt>.</p></div></div></div>
2042<div class="section">
2043<h3><a name="Abbreviated_Field_Access_Expressions"></a><a name="Abbreviated_field_access_expressions" id="Abbreviated_field_access_expressions">Abbreviated Field Access Expressions</a></h3>
2044<p>As in standard SQL, SQL++ field access expressions can be abbreviated (not recommended) when there is no ambiguity. In the next example, the variable <tt>user</tt> is the only possible variable reference for fields <tt>id</tt>, <tt>name</tt> and <tt>alias</tt> and thus could be omitted in the query.</p>
2045<div class="section">
2046<div class="section">
2047<h5><a name="Example"></a>Example</h5>
2048
2049<div class="source">
2050<div class="source">
2051<pre>SELECT substr(name, 10) AS lname, alias
2052FROM GleambookUsers user
2053WHERE id = 1;
2054</pre></div></div>
2055<p>Outputs:</p>
2056
2057<div class="source">
2058<div class="source">
2059<pre>[ {
2060 &quot;lname&quot;: &quot;Stoddard&quot;,
2061 &quot;alias&quot;: &quot;Margarita&quot;
2062} ]
2063</pre></div></div></div></div></div></div>
2064<div class="section">
2065<h2><a name="UNNEST_Clause"></a><a name="Unnest_clauses" id="Unnest_clauses">UNNEST Clause</a></h2>
<p>For each of its input tuples, the <tt>UNNEST</tt> clause flattens a collection-valued expression into individual items, producing multiple tuples, each of which is one of the clause&#x2019;s original input tuples augmented with a flattened item from its collection.</p>
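<p>A self-contained sketch of this behavior, using a constructed collection instead of a stored dataset (the field names <tt>x</tt> and <tt>ys</tt> are made up for illustration):</p>

<div class="source">
<div class="source">
<pre>SELECT t.x AS x, y
FROM [ {&quot;x&quot;: 1, &quot;ys&quot;: [10, 20]}, {&quot;x&quot;: 2, &quot;ys&quot;: []} ] AS t
UNNEST t.ys AS y;
</pre></div></div>
<p>The tuple with <tt>x</tt> equal to 1 is paired once with each of its two <tt>ys</tt> items, so it should contribute two result objects. The tuple with <tt>x</tt> equal to 2 has an empty collection and should contribute none.</p>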
2067<div class="section">
2068<h3><a name="Inner_UNNEST"></a><a name="Inner_unnests" id="Inner_unnests">Inner UNNEST</a></h3>
2069<p>The following example is a query that retrieves the names of the organizations that a selected user has worked for. It uses the <tt>UNNEST</tt> clause to unnest the nested collection <tt>employment</tt> in the user&#x2019;s object.</p>
2070<div class="section">
2071<div class="section">
2072<h5><a name="Example"></a>Example</h5>
2073
2074<div class="source">
2075<div class="source">
2076<pre>SELECT u.id AS userId, e.organizationName AS orgName
2077FROM GleambookUsers u
2078UNNEST u.employment e
2079WHERE u.id = 1;
2080</pre></div></div>
2081<p>This query returns:</p>
2082
2083<div class="source">
2084<div class="source">
2085<pre>[ {
2086 &quot;orgName&quot;: &quot;Codetechno&quot;,
2087 &quot;userId&quot;: 1
2088}, {
2089 &quot;orgName&quot;: &quot;geomedia&quot;,
2090 &quot;userId&quot;: 1
2091} ]
2092</pre></div></div>
2093<p>Note that <tt>UNNEST</tt> has SQL&#x2019;s inner join semantics &#x2014; that is, if a user has no employment history, no tuple corresponding to that user will be emitted in the result.</p></div></div></div>
2094<div class="section">
2095<h3><a name="Left_Outer_UNNEST"></a><a name="Left_outer_unnests" id="Left_outer_unnests">Left Outer UNNEST</a></h3>
2096<p>As an alternative, the <tt>LEFT OUTER UNNEST</tt> clause offers SQL&#x2019;s left outer join semantics. For example, no collection-valued field named <tt>hobbies</tt> exists in the object for the user whose id is 1, but the following query&#x2019;s result still includes user 1.</p>
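<p>The same effect can be sketched on a constructed collection (the field names <tt>id</tt> and <tt>tags</tt> are made up for illustration), where the second object has no <tt>tags</tt> field at all:</p>

<div class="source">
<div class="source">
<pre>SELECT t.id AS id, tag
FROM [ {&quot;id&quot;: 1, &quot;tags&quot;: [&quot;a&quot;, &quot;b&quot;]}, {&quot;id&quot;: 2} ] AS t
LEFT OUTER UNNEST t.tags AS tag;
</pre></div></div>
<p>Here the object with <tt>id</tt> 2 should still appear in the result, as an object that has an <tt>id</tt> field but no <tt>tag</tt> field.</p>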
2097<div class="section">
2098<div class="section">
2099<h5><a name="Example"></a>Example</h5>
2100
2101<div class="source">
2102<div class="source">
2103<pre>SELECT u.id AS userId, h.hobbyName AS hobby
2104FROM GleambookUsers u
2105LEFT OUTER UNNEST u.hobbies h
2106WHERE u.id = 1;
2107</pre></div></div>
2108<p>Returns:</p>
2109
2110<div class="source">
2111<div class="source">
2112<pre>[ {
2113 &quot;userId&quot;: 1
2114} ]
2115</pre></div></div>
<p>Note that if <tt>u.hobbies</tt> is an empty collection or evaluates to a <tt>MISSING</tt> (as above) or <tt>NULL</tt> value for a given input tuple, there is no corresponding binding value for variable <tt>h</tt> for that tuple. In that case, a <tt>MISSING</tt> value is generated for <tt>h</tt> so that the input tuple can still be propagated.</p></div></div></div>
2117<div class="section">
2118<h3><a name="Expressing_Joins_Using_UNNEST"></a><a name="Expressing_joins_using_unnests" id="Expressing_joins_using_unnests">Expressing Joins Using UNNEST</a></h3>
<p>The SQL++ <tt>UNNEST</tt> clause is similar to SQL&#x2019;s <tt>JOIN</tt> clause except that it allows its right argument to be correlated to its left argument, as in the examples above; think of it as a &#x201c;correlated cross-product&#x201d;. The next example shows this via a query that joins two datasets, <tt>GleambookUsers</tt> and <tt>GleambookMessages</tt>, returning user/message pairs. The results contain one object per pair, with each result object containing the user&#x2019;s name and the text of one message. The query can be thought of as saying &#x201c;for each Gleambook user, unnest the <tt>GleambookMessages</tt> collection and filter the output with the condition <tt>m.authorId = u.id</tt>&#x201d;.</p>
2120<div class="section">
2121<div class="section">
2122<h5><a name="Example"></a>Example</h5>
2123
2124<div class="source">
2125<div class="source">
2126<pre>SELECT u.name AS uname, m.message AS message
2127FROM GleambookUsers u
2128UNNEST GleambookMessages m
2129WHERE m.authorId = u.id;
2130</pre></div></div>
2131<p>This returns:</p>
2132
2133<div class="source">
2134<div class="source">
2135<pre>[ {
2136 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
2137 &quot;message&quot;: &quot; can't stand acast its plan is terrible&quot;
2138}, {
2139 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
2140 &quot;message&quot;: &quot; dislike x-phone its touch-screen is horrible&quot;
2141}, {
2142 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
2143 &quot;message&quot;: &quot; can't stand acast the network is horrible:(&quot;
2144}, {
2145 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
2146 &quot;message&quot;: &quot; like ccast the 3G is awesome:)&quot;
2147}, {
2148 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
2149 &quot;message&quot;: &quot; can't stand product-w the touch-screen is terrible&quot;
2150}, {
2151 &quot;uname&quot;: &quot;IsbelDull&quot;,
2152 &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot;
2153}, {
2154 &quot;uname&quot;: &quot;IsbelDull&quot;,
2155 &quot;message&quot;: &quot; like product-y the plan is amazing&quot;
2156} ]
2157</pre></div></div>
2158<p>Similarly, the above query can also be expressed as the <tt>UNNEST</tt>ing of a correlated SQL++ subquery:</p></div>
2159<div class="section">
2160<h5><a name="Example"></a>Example</h5>
2161
2162<div class="source">
2163<div class="source">
2164<pre>SELECT u.name AS uname, m.message AS message
2165FROM GleambookUsers u
2166UNNEST (
2167 SELECT VALUE msg
2168 FROM GleambookMessages msg
2169 WHERE msg.authorId = u.id
2170) AS m;
2171</pre></div></div></div></div></div></div>
2172<div class="section">
2173<h2><a name="FROM_clauses"></a><a name="From_clauses" id="From_clauses">FROM clauses</a></h2>
2174<p>A <tt>FROM</tt> clause is used for enumerating (i.e., conceptually iterating over) the contents of collections, as in SQL.</p>
2175<div class="section">
2176<h3><a name="Binding_expressions" id="Binding_expressions">Binding expressions</a></h3>
<p>In SQL++, in addition to stored collections, a <tt>FROM</tt> clause can iterate over any intermediate collection returned by a valid SQL++ expression. In the tuple stream generated by a <tt>FROM</tt> clause, the ordering of the input tuples is not guaranteed to be preserved.</p>
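<p>Because the order of the input collection is not guaranteed to carry through, a query that depends on a particular order should request it explicitly. A minimal sketch using <tt>ORDER BY</tt> over a constructed collection:</p>

<div class="source">
<div class="source">
<pre>SELECT VALUE foo
FROM [3, 1, 2] AS foo
ORDER BY foo;
</pre></div></div>
<p>With the <tt>ORDER BY</tt> clause, the items are returned in ascending order regardless of how the <tt>FROM</tt> clause enumerated them.</p>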
2178<div class="section">
2179<div class="section">
2180<h5><a name="Example"></a>Example</h5>
2181
2182<div class="source">
2183<div class="source">
2184<pre>SELECT VALUE foo
2185FROM [1, 2, 2, 3] AS foo
2186WHERE foo &gt; 2;
2187</pre></div></div>
2188<p>Returns:</p>
2189
2190<div class="source">
2191<div class="source">
2192<pre>[
2193 3
2194]
2195</pre></div></div></div></div></div>
2196<div class="section">
2197<h3><a name="Multiple_FROM_Terms"></a><a name="Multiple_from_terms" id="Multiple_from_terms">Multiple FROM Terms</a></h3>
2198<p>SQL++ permits correlations among <tt>FROM</tt> terms. Specifically, a <tt>FROM</tt> binding expression can refer to variables defined to its left in the given <tt>FROM</tt> clause. Thus, the first unnesting example above could also be expressed as follows:</p>
2199<div class="section">
2200<div class="section">
2201<h5><a name="Example"></a>Example</h5>
2202
2203<div class="source">
2204<div class="source">
2205<pre>SELECT u.id AS userId, e.organizationName AS orgName
2206FROM GleambookUsers u, u.employment e
2207WHERE u.id = 1;
2208</pre></div></div></div></div></div>
2209<div class="section">
2210<h3><a name="Expressing_Joins_Using_FROM_Terms"></a><a name="Expressing_joins_using_from_terms" id="Expressing_joins_using_from_terms">Expressing Joins Using FROM Terms</a></h3>
2211<p>Similarly, the join intentions of the other <tt>UNNEST</tt>-based join examples above could be expressed as:</p>
2212<div class="section">
2213<div class="section">
2214<h5><a name="Example"></a>Example</h5>
2215
2216<div class="source">
2217<div class="source">
2218<pre>SELECT u.name AS uname, m.message AS message
2219FROM GleambookUsers u, GleambookMessages m
2220WHERE m.authorId = u.id;
2221</pre></div></div></div>
2222<div class="section">
2223<h5><a name="Example"></a>Example</h5>
2224
2225<div class="source">
2226<div class="source">
2227<pre>SELECT u.name AS uname, m.message AS message
2228FROM GleambookUsers u,
2229 (
2230 SELECT VALUE msg
2231 FROM GleambookMessages msg
2232 WHERE msg.authorId = u.id
2233 ) AS m;
2234</pre></div></div>
2235<p>Note that the first alternative is one of the SQL-92 approaches to expressing a join.</p></div></div></div>
2236<div class="section">
2237<h3><a name="Implicit_Binding_Variables"></a><a name="Implicit_binding_variables" id="Implicit_binding_variables">Implicit Binding Variables</a></h3>
2238<p>Similar to standard SQL, SQL++ supports implicit <tt>FROM</tt> binding variables (i.e., aliases), for which a binding variable is generated. SQL++ variable generation falls into three cases:</p>
2239
2240<ul>
2241
2242<li>If the binding expression is a variable reference expression, the generated variable&#x2019;s name will be the name of the referenced variable itself.</li>
2243
2244<li>If the binding expression is a field access expression (or a fully qualified name for a dataset), the generated variable&#x2019;s name will be the last identifier (or the dataset name) in the expression.</li>
2245
2246<li>For all other cases, a compilation error will be raised.</li>
2247</ul>
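<p>As a sketch of the second case, the following variant of the earlier unnesting query omits the binding variable for the field access expression <tt>u.employment</tt>. Under the rules above, the generated variable should be named <tt>employment</tt>, the last identifier in the expression:</p>

<div class="source">
<div class="source">
<pre>SELECT u.id AS userId, employment.organizationName AS orgName
FROM GleambookUsers u, u.employment
WHERE u.id = 1;
</pre></div></div>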
2248<p>The next two examples show queries that do not provide binding variables in their <tt>FROM</tt> clauses.</p>
2249<div class="section">
2250<div class="section">
2251<h5><a name="Example"></a>Example</h5>
2252
2253<div class="source">
2254<div class="source">
2255<pre>SELECT GleambookUsers.name, GleambookMessages.message
2256FROM GleambookUsers, GleambookMessages
2257WHERE GleambookMessages.authorId = GleambookUsers.id;
2258</pre></div></div>
2259<p>Returns:</p>
2260
2261<div class="source">
2262<div class="source">
2263<pre>[ {
2264 &quot;name&quot;: &quot;MargaritaStoddard&quot;,
2265 &quot;message&quot;: &quot; like ccast the 3G is awesome:)&quot;
2266}, {
2267 &quot;name&quot;: &quot;MargaritaStoddard&quot;,
2268 &quot;message&quot;: &quot; can't stand product-w the touch-screen is terrible&quot;
2269}, {
2270 &quot;name&quot;: &quot;MargaritaStoddard&quot;,
2271 &quot;message&quot;: &quot; can't stand acast its plan is terrible&quot;
2272}, {
2273 &quot;name&quot;: &quot;MargaritaStoddard&quot;,
2274 &quot;message&quot;: &quot; dislike x-phone its touch-screen is horrible&quot;
2275}, {
2276 &quot;name&quot;: &quot;MargaritaStoddard&quot;,
2277 &quot;message&quot;: &quot; can't stand acast the network is horrible:(&quot;
2278}, {
2279 &quot;name&quot;: &quot;IsbelDull&quot;,
2280 &quot;message&quot;: &quot; like product-y the plan is amazing&quot;
2281}, {
2282 &quot;name&quot;: &quot;IsbelDull&quot;,
2283 &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot;
2284} ]
2285</pre></div></div></div>
2286<div class="section">
2287<h5><a name="Example"></a>Example</h5>
2288
2289<div class="source">
2290<div class="source">
2291<pre>SELECT GleambookUsers.name, GleambookMessages.message
2292FROM GleambookUsers,
2293 (
2294 SELECT VALUE GleambookMessages
2295 FROM GleambookMessages
2296 WHERE GleambookMessages.authorId = GleambookUsers.id
2297 );
2298</pre></div></div>
2299<p>Returns:</p>
2300
2301<div class="source">
2302<div class="source">
2303<pre>Error: &quot;Syntax error: Need an alias for the enclosed expression:\n(select element GleambookMessages\n from GleambookMessages as GleambookMessages\n where (GleambookMessages.authorId = GleambookUsers.id)\n )&quot;,
2304 &quot;query_from_user&quot;: &quot;use TinySocial;\n\nSELECT GleambookUsers.name, GleambookMessages.message\n FROM GleambookUsers,\n (\n SELECT VALUE GleambookMessages\n FROM GleambookMessages\n WHERE GleambookMessages.authorId = GleambookUsers.id\n );&quot;
2305</pre></div></div></div></div></div></div>
2306<div class="section">
2307<h2><a name="JOIN_Clauses"></a><a name="Join_clauses" id="Join_clauses">JOIN Clauses</a></h2>
2308<p>The join clause in SQL++ supports both inner joins and left outer joins from standard SQL.</p>
2309<div class="section">
2310<h3><a name="Inner_joins" id="Inner_joins">Inner joins</a></h3>
<p>Using a <tt>JOIN</tt> clause, the inner join intent from the preceding examples can also be expressed as follows:</p>
2312<div class="section">
2313<div class="section">
2314<h5><a name="Example"></a>Example</h5>
2315
2316<div class="source">
2317<div class="source">
2318<pre>SELECT u.name AS uname, m.message AS message
2319FROM GleambookUsers u JOIN GleambookMessages m ON m.authorId = u.id;
2320</pre></div></div></div></div></div>
2321<div class="section">
2322<h3><a name="Left_Outer_Joins"></a><a name="Left_outer_joins" id="Left_outer_joins">Left Outer Joins</a></h3>
2323<p>SQL++ supports SQL&#x2019;s notion of left outer join. The following query is an example:</p>
2324
2325<div class="source">
2326<div class="source">
2327<pre>SELECT u.name AS uname, m.message AS message
2328FROM GleambookUsers u LEFT OUTER JOIN GleambookMessages m ON m.authorId = u.id;
2329</pre></div></div>
2330<p>Returns:</p>
2331
2332<div class="source">
2333<div class="source">
2334<pre>[ {
2335 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
2336 &quot;message&quot;: &quot; like ccast the 3G is awesome:)&quot;
2337}, {
2338 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
2339 &quot;message&quot;: &quot; can't stand product-w the touch-screen is terrible&quot;
2340}, {
2341 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
2342 &quot;message&quot;: &quot; can't stand acast its plan is terrible&quot;
2343}, {
2344 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
2345 &quot;message&quot;: &quot; dislike x-phone its touch-screen is horrible&quot;
2346}, {
2347 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
2348 &quot;message&quot;: &quot; can't stand acast the network is horrible:(&quot;
2349}, {
2350 &quot;uname&quot;: &quot;IsbelDull&quot;,
2351 &quot;message&quot;: &quot; like product-y the plan is amazing&quot;
2352}, {
2353 &quot;uname&quot;: &quot;IsbelDull&quot;,
2354 &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot;
2355}, {
2356 &quot;uname&quot;: &quot;EmoryUnk&quot;
2357} ]
2358</pre></div></div>
2359<p>For non-matching left-side tuples, SQL++ produces <tt>MISSING</tt> values for the right-side binding variables; that is why the last object in the above result doesn&#x2019;t have a <tt>message</tt> field. Note that this is slightly different from standard SQL, which instead would fill in <tt>NULL</tt> values for the right-side fields. The reason for this difference is that, for non-matches in its join results, SQL++ views fields from the right-side as being &#x201c;not there&#x201d; (a.k.a. <tt>MISSING</tt>) instead of as being &#x201c;there but unknown&#x201d; (i.e., <tt>NULL</tt>).</p>
2360<p>The left-outer join query can also be expressed using <tt>LEFT OUTER UNNEST</tt>:</p>
2361
2362<div class="source">
2363<div class="source">
2364<pre>SELECT u.name AS uname, m.message AS message
2365FROM GleambookUsers u
2366LEFT OUTER UNNEST (
2367 SELECT VALUE message
2368 FROM GleambookMessages message
2369 WHERE message.authorId = u.id
2370 ) m;
2371</pre></div></div>
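<p>Because unmatched left-side tuples carry <tt>MISSING</tt> for the right-side variable, a query can filter on that condition to find left-side items that have no match. The following is a sketch that assumes the <tt>MISSING</tt> semantics described above; it is not one of the original examples:</p>

<div class="source">
<div class="source">
<pre>SELECT u.name AS uname
FROM GleambookUsers u LEFT OUTER JOIN GleambookMessages m ON m.authorId = u.id
WHERE m IS MISSING;
</pre></div></div>
<p>With the sample data used in this section, this would be expected to return only the user <tt>EmoryUnk</tt>, who has no messages.</p>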
<p>In general, in SQL++, SQL-style join queries can also be expressed by <tt>UNNEST</tt> clauses and left outer join queries can be expressed by <tt>LEFT OUTER UNNEST</tt> clauses.</p></div></div>
2373<div class="section">
2374<h2><a name="GROUP_BY_Clauses"></a><a name="Group_By_clauses" id="Group_By_clauses">GROUP BY Clauses</a></h2>
2375<p>The SQL++ <tt>GROUP BY</tt> clause generalizes standard SQL&#x2019;s grouping and aggregation semantics, but it also retains backward compatibility with the standard (relational) SQL <tt>GROUP BY</tt> and aggregation features.</p>
2376<div class="section">
2377<h3><a name="Group_variables" id="Group_variables">Group variables</a></h3>
<p>In a <tt>GROUP BY</tt> clause, in addition to the binding variable(s) defined for the grouping key(s), SQL++ allows a user to define a <i>group variable</i> by using the clause&#x2019;s <tt>GROUP AS</tt> extension to denote the resulting group. After grouping, the query&#x2019;s in-scope variables include the grouping key&#x2019;s binding variables as well as this group variable, which is bound to one collection value for each group. This per-group collection (i.e., multiset) value is a collection of nested objects in which each field of the object is the result of a renamed variable defined in parentheses following the group variable&#x2019;s name. The <tt>GROUP AS</tt> syntax is as follows:</p>
2379
2380<div class="source">
2381<div class="source">
2382<pre>&lt;GROUP&gt; &lt;AS&gt; Variable (&quot;(&quot; Variable &lt;AS&gt; VariableReference (&quot;,&quot; Variable &lt;AS&gt; VariableReference )* &quot;)&quot;)?
2383</pre></div></div>
2384<div class="section">
2385<div class="section">
2386<h5><a name="Example"></a>Example</h5>
2387
2388<div class="source">
2389<div class="source">
2390<pre>SELECT *
2391FROM GleambookMessages message
2392GROUP BY message.authorId AS uid GROUP AS msgs(message AS msg);
2393</pre></div></div>
2394<p>This first example query returns:</p>
2395
2396<div class="source">
2397<div class="source">
2398<pre>[ {
2399 &quot;msgs&quot;: [
2400 {
2401 &quot;msg&quot;: {
2402 &quot;senderLocation&quot;: [
2403 38.97,
2404 77.49
2405 ],
2406 &quot;inResponseTo&quot;: 1,
2407 &quot;messageId&quot;: 11,
2408 &quot;authorId&quot;: 1,
2409 &quot;message&quot;: &quot; can't stand acast its plan is terrible&quot;
2410 }
2411 },
2412 {
2413 &quot;msg&quot;: {
2414 &quot;senderLocation&quot;: [
2415 41.66,
2416 80.87
2417 ],
2418 &quot;inResponseTo&quot;: 4,
2419 &quot;messageId&quot;: 2,
2420 &quot;authorId&quot;: 1,
2421 &quot;message&quot;: &quot; dislike x-phone its touch-screen is horrible&quot;
2422 }
2423 },
2424 {
2425 &quot;msg&quot;: {
2426 &quot;senderLocation&quot;: [
2427 37.73,
2428 97.04
2429 ],
2430 &quot;inResponseTo&quot;: 2,
2431 &quot;messageId&quot;: 4,
2432 &quot;authorId&quot;: 1,
2433 &quot;message&quot;: &quot; can't stand acast the network is horrible:(&quot;
2434 }
2435 },
2436 {
2437 &quot;msg&quot;: {
2438 &quot;senderLocation&quot;: [
2439 40.33,
2440 80.87
2441 ],
2442 &quot;inResponseTo&quot;: 11,
2443 &quot;messageId&quot;: 8,
2444 &quot;authorId&quot;: 1,
2445 &quot;message&quot;: &quot; like ccast the 3G is awesome:)&quot;
2446 }
2447 },
2448 {
2449 &quot;msg&quot;: {
2450 &quot;senderLocation&quot;: [
2451 42.5,
2452 70.01
2453 ],
2454 &quot;inResponseTo&quot;: 12,
2455 &quot;messageId&quot;: 10,
2456 &quot;authorId&quot;: 1,
2457 &quot;message&quot;: &quot; can't stand product-w the touch-screen is terrible&quot;
2458 }
2459 }
2460 ],
2461 &quot;uid&quot;: 1
2462}, {
2463 &quot;msgs&quot;: [
2464 {
2465 &quot;msg&quot;: {
2466 &quot;senderLocation&quot;: [
2467 31.5,
2468 75.56
2469 ],
2470 &quot;inResponseTo&quot;: 1,
2471 &quot;messageId&quot;: 6,
2472 &quot;authorId&quot;: 2,
2473 &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot;
2474 }
2475 },
2476 {
2477 &quot;msg&quot;: {
2478 &quot;senderLocation&quot;: [
2479 48.09,
2480 81.01
2481 ],
2482 &quot;inResponseTo&quot;: 4,
2483 &quot;messageId&quot;: 3,
2484 &quot;authorId&quot;: 2,
2485 &quot;message&quot;: &quot; like product-y the plan is amazing&quot;
2486 }
2487 }
2488 ],
2489 &quot;uid&quot;: 2
2490} ]
2491</pre></div></div>
2492<p>As we can see from the above query result, each group in the example query&#x2019;s output has an associated group variable value called <tt>msgs</tt> that appears in the <tt>SELECT *</tt>&#x2019;s result. This variable contains a collection of objects associated with the group; each of the group&#x2019;s <tt>message</tt> values appears in the <tt>msg</tt> field of the objects in the <tt>msgs</tt> collection.</p>
<p>The group variable in SQL++ makes more complex, composable, nested subqueries over a group possible, which is important given the more complex data model of SQL++ (relative to SQL). As a simple example of this, as we really just want the messages associated with each user, we might wish to avoid the &#x201c;extra wrapping&#x201d; of each message as the <tt>msg</tt> field of an object. (That wrapping is useful in more complex cases, but is essentially just in the way here.) We can use a subquery in the <tt>SELECT</tt> clause to tunnel through the extra nesting and produce the desired result.</p></div>
2494<div class="section">
2495<h5><a name="Example"></a>Example</h5>
2496
2497<div class="source">
2498<div class="source">
2499<pre>SELECT uid, (SELECT VALUE g.msg FROM g) AS msgs
2500FROM GleambookMessages gbm
2501GROUP BY gbm.authorId AS uid
2502GROUP AS g(gbm as msg);
2503</pre></div></div>
2504<p>This variant of the example query returns:</p>
2505
2506<div class="source">
2507<div class="source">
2508<pre> [ {
2509 &quot;msgs&quot;: [
2510 {
2511 &quot;senderLocation&quot;: [
2512 38.97,
2513 77.49
2514 ],
2515 &quot;inResponseTo&quot;: 1,
2516 &quot;messageId&quot;: 11,
2517 &quot;authorId&quot;: 1,
2518 &quot;message&quot;: &quot; can't stand acast its plan is terrible&quot;
2519 },
2520 {
2521 &quot;senderLocation&quot;: [
2522 41.66,
2523 80.87
2524 ],
2525 &quot;inResponseTo&quot;: 4,
2526 &quot;messageId&quot;: 2,
2527 &quot;authorId&quot;: 1,
2528 &quot;message&quot;: &quot; dislike x-phone its touch-screen is horrible&quot;
2529 },
2530 {
2531 &quot;senderLocation&quot;: [
2532 37.73,
2533 97.04
2534 ],
2535 &quot;inResponseTo&quot;: 2,
2536 &quot;messageId&quot;: 4,
2537 &quot;authorId&quot;: 1,
2538 &quot;message&quot;: &quot; can't stand acast the network is horrible:(&quot;
2539 },
2540 {
2541 &quot;senderLocation&quot;: [
2542 40.33,
2543 80.87
2544 ],
2545 &quot;inResponseTo&quot;: 11,
2546 &quot;messageId&quot;: 8,
2547 &quot;authorId&quot;: 1,
2548 &quot;message&quot;: &quot; like ccast the 3G is awesome:)&quot;
2549 },
2550 {
2551 &quot;senderLocation&quot;: [
2552 42.5,
2553 70.01
2554 ],
2555 &quot;inResponseTo&quot;: 12,
2556 &quot;messageId&quot;: 10,
2557 &quot;authorId&quot;: 1,
2558 &quot;message&quot;: &quot; can't stand product-w the touch-screen is terrible&quot;
2559 }
2560 ],
2561 &quot;uid&quot;: 1
2562 }, {
2563 &quot;msgs&quot;: [
2564 {
2565 &quot;senderLocation&quot;: [
2566 31.5,
2567 75.56
2568 ],
2569 &quot;inResponseTo&quot;: 1,
2570 &quot;messageId&quot;: 6,
2571 &quot;authorId&quot;: 2,
2572 &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot;
2573 },
2574 {
2575 &quot;senderLocation&quot;: [
2576 48.09,
2577 81.01
2578 ],
2579 &quot;inResponseTo&quot;: 4,
2580 &quot;messageId&quot;: 3,
2581 &quot;authorId&quot;: 2,
2582 &quot;message&quot;: &quot; like product-y the plan is amazing&quot;
2583 }
2584 ],
2585 &quot;uid&quot;: 2
2586 } ]
2587</pre></div></div>
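<p>The same subquery pattern can reach further into each grouped object. As a sketch built only from constructs already shown in this section, the following variant collects just the message text for each author instead of the whole message object:</p>

<div class="source">
<div class="source">
<pre>SELECT uid, (SELECT VALUE g.msg.message FROM g) AS texts
FROM GleambookMessages gbm
GROUP BY gbm.authorId AS uid
GROUP AS g(gbm AS msg);
</pre></div></div>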
<p>The next example shows a more interesting case involving the use of a subquery in the <tt>SELECT</tt> list. Here the subquery further processes the groups. There is no renaming in the declaration of the group variable <tt>g</tt>, so <tt>g</tt> has only one field, <tt>gbm</tt>, which comes from the <tt>FROM</tt> clause.</p></div>
2589<div class="section">
2590<h5><a name="Example"></a>Example</h5>
2591
2592<div class="source">
2593<div class="source">
2594<pre>SELECT uid,
2595 (SELECT VALUE g.gbm
2596 FROM g
2597 WHERE g.gbm.message LIKE '% like%'
2598 ORDER BY g.gbm.messageId
2599 LIMIT 2) AS msgs
2600FROM GleambookMessages gbm
2601GROUP BY gbm.authorId AS uid
2602GROUP AS g;
2603</pre></div></div>
2604<p>This example query returns:</p>
2605
2606<div class="source">
2607<div class="source">
2608<pre>[ {
2609 &quot;msgs&quot;: [
2610 {
2611 &quot;senderLocation&quot;: [
2612 40.33,
2613 80.87
2614 ],
2615 &quot;inResponseTo&quot;: 11,
2616 &quot;messageId&quot;: 8,
2617 &quot;authorId&quot;: 1,
2618 &quot;message&quot;: &quot; like ccast the 3G is awesome:)&quot;
2619 }
2620 ],
2621 &quot;uid&quot;: 1
2622}, {
2623 &quot;msgs&quot;: [
2624 {
2625 &quot;senderLocation&quot;: [
2626 48.09,
2627 81.01
2628 ],
2629 &quot;inResponseTo&quot;: 4,
2630 &quot;messageId&quot;: 3,
2631 &quot;authorId&quot;: 2,
2632 &quot;message&quot;: &quot; like product-y the plan is amazing&quot;
2633 },
2634 {
2635 &quot;senderLocation&quot;: [
2636 31.5,
2637 75.56
2638 ],
2639 &quot;inResponseTo&quot;: 1,
2640 &quot;messageId&quot;: 6,
2641 &quot;authorId&quot;: 2,
2642 &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot;
2643 }
2644 ],
2645 &quot;uid&quot;: 2
2646} ]
2647</pre></div></div></div></div></div>
2648<div class="section">
2649<h3><a name="Implicit_Grouping_Key_Variables"></a><a name="Implicit_group_key_variables" id="Implicit_group_key_variables">Implicit Grouping Key Variables</a></h3>
2650<p>In the SQL++ syntax, providing named binding variables for <tt>GROUP BY</tt> key expressions is optional. If a grouping key is missing a user-provided binding variable, the underlying compiler will generate one. Automatic grouping key variable naming falls into three cases in SQL++, much like the treatment of unnamed projections:</p>
2651
2652<ul>
2653
2654<li>If the grouping key expression is a variable reference expression, the generated variable gets the same name as the referred variable;</li>
2655
2656<li>If the grouping key expression is a field access expression, the generated variable gets the same name as the last identifier in the expression;</li>
2657
2658<li>For all other cases, the compiler generates a unique variable (but the user query is unable to refer to this generated variable).</li>
2659</ul>
2660<p>The next example illustrates a query that doesn&#x2019;t provide binding variables for its grouping key expressions.</p>
2661<div class="section">
2662<div class="section">
2663<h5><a name="Example"></a>Example</h5>
2664
2665<div class="source">
2666<div class="source">
2667<pre>SELECT authorId,
2668 (SELECT VALUE g.gbm
2669 FROM g
2670 WHERE g.gbm.message LIKE '% like%'
2671 ORDER BY g.gbm.messageId
2672 LIMIT 2) AS msgs
2673FROM GleambookMessages gbm
2674GROUP BY gbm.authorId
2675GROUP AS g;
2676</pre></div></div>
2677<p>This query returns:</p>
2678
2679<div class="source">
2680<div class="source">
2681<pre> [ {
2682 &quot;msgs&quot;: [
2683 {
2684 &quot;senderLocation&quot;: [
2685 40.33,
2686 80.87
2687 ],
2688 &quot;inResponseTo&quot;: 11,
2689 &quot;messageId&quot;: 8,
2690 &quot;authorId&quot;: 1,
2691 &quot;message&quot;: &quot; like ccast the 3G is awesome:)&quot;
2692 }
2693 ],
2694 &quot;authorId&quot;: 1
2695}, {
2696 &quot;msgs&quot;: [
2697 {
2698 &quot;senderLocation&quot;: [
2699 48.09,
2700 81.01
2701 ],
2702 &quot;inResponseTo&quot;: 4,
2703 &quot;messageId&quot;: 3,
2704 &quot;authorId&quot;: 2,
2705 &quot;message&quot;: &quot; like product-y the plan is amazing&quot;
2706 },
2707 {
2708 &quot;senderLocation&quot;: [
2709 31.5,
2710 75.56
2711 ],
2712 &quot;inResponseTo&quot;: 1,
2713 &quot;messageId&quot;: 6,
2714 &quot;authorId&quot;: 2,
2715 &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot;
2716 }
2717 ],
2718 &quot;authorId&quot;: 2
2719} ]
2720</pre></div></div>
<p>Based on the three variable generation rules, the generated variable for the grouping key expression <tt>gbm.authorId</tt> is <tt>authorId</tt> (which is how it is referred to in the example&#x2019;s <tt>SELECT</tt> clause).</p></div></div></div>
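<p>As a hedged sketch of the third rule (this query is not part of the original example set, and the alias <tt>friendCount</tt> is introduced here purely for illustration): when the grouping key is a function call, the compiler-generated variable has no name the query can use, so naming the key explicitly is the straightforward way to make it available in the <tt>SELECT</tt> list.</p>

<div class="source">
<div class="source">
<pre>SELECT friendCount, COUNT(*) AS numUsers
FROM GleambookUsers AS user
GROUP BY ARRAY_COUNT(user.friendIds) AS friendCount;
</pre></div></div>
<p>Here the explicit <tt>AS friendCount</tt> plays the role that the generated name <tt>authorId</tt> plays in the example above.</p>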
2722<div class="section">
2723<h3><a name="Implicit_Group_Variables"></a><a name="Implicit_group_variables" id="Implicit_group_variables">Implicit Group Variables</a></h3>
<p>The group variable itself is also optional in SQL++&#x2019;s <tt>GROUP BY</tt> syntax. If a user&#x2019;s query does not declare the name and structure of the group variable using <tt>GROUP AS</tt>, the query compiler will generate a unique group variable whose fields include all of the binding variables defined in the <tt>FROM</tt> clause of the current enclosing <tt>SELECT</tt> statement. In this case the user&#x2019;s query cannot refer to the generated group variable, but it can still call SQL-92 aggregation functions, just as in SQL-92.</p></div>
2725<div class="section">
2726<h3><a name="Aggregation_Functions"></a><a name="Aggregation_functions" id="Aggregation_functions">Aggregation Functions</a></h3>
<p>In traditional SQL, which doesn&#x2019;t support nested data, grouping always involves the use of aggregation to compute properties of the groups (for example, the average number of messages per user rather than the actual set of messages per user). Each aggregation function in SQL++ takes a collection (for example, the group of messages) as its input and produces a scalar value as its output. These aggregation functions, being truly functional in nature (unlike in SQL), can be used anywhere in a query where an expression is allowed. The following table catalogs the SQL++ built-in aggregation functions and also indicates how each one handles <tt>NULL</tt>/<tt>MISSING</tt> values in the input collection or a completely empty input collection:</p>
2728
2729<table border="0" class="table table-striped">
2730 <thead>
2731
2732<tr class="a">
2733
2734<th>Function </th>
2735
2736<th>NULL </th>
2737
2738<th>MISSING </th>
2739
2740<th>Empty Collection </th>
2741 </tr>
2742 </thead>
2743 <tbody>
2744
2745<tr class="b">
2746
2747<td>COLL_COUNT </td>
2748
2749<td>counted </td>
2750
2751<td>counted </td>
2752
2753<td>0 </td>
2754 </tr>
2755
2756<tr class="a">
2757
2758<td>COLL_SUM </td>
2759
2760<td>returns NULL </td>
2761
2762<td>returns NULL </td>
2763
2764<td>returns NULL </td>
2765 </tr>
2766
2767<tr class="b">
2768
2769<td>COLL_MAX </td>
2770
2771<td>returns NULL </td>
2772
2773<td>returns NULL </td>
2774
2775<td>returns NULL </td>
2776 </tr>
2777
2778<tr class="a">
2779
2780<td>COLL_MIN </td>
2781
2782<td>returns NULL </td>
2783
2784<td>returns NULL </td>
2785
2786<td>returns NULL </td>
2787 </tr>
2788
2789<tr class="b">
2790
2791<td>COLL_AVG </td>
2792
2793<td>returns NULL </td>
2794
2795<td>returns NULL </td>
2796
2797<td>returns NULL </td>
2798 </tr>
2799
2800<tr class="a">
2801
2802<td>ARRAY_COUNT </td>
2803
2804<td>not counted </td>
2805
2806<td>not counted </td>
2807
2808<td>0 </td>
2809 </tr>
2810
2811<tr class="b">
2812
2813<td>ARRAY_SUM </td>
2814
2815<td>ignores NULL </td>
2816
<td>ignores MISSING </td>
2818
2819<td>returns NULL </td>
2820 </tr>
2821
2822<tr class="a">
2823
2824<td>ARRAY_MAX </td>
2825
2826<td>ignores NULL </td>
2827
<td>ignores MISSING </td>
2829
2830<td>returns NULL </td>
2831 </tr>
2832
2833<tr class="b">
2834
2835<td>ARRAY_MIN </td>
2836
2837<td>ignores NULL </td>
2838
<td>ignores MISSING </td>
2840
2841<td>returns NULL </td>
2842 </tr>
2843
2844<tr class="a">
2845
2846<td>ARRAY_AVG </td>
2847
2848<td>ignores NULL </td>
2849
<td>ignores MISSING </td>
2851
2852<td>returns NULL </td>
2853 </tr>
2854 </tbody>
2855</table>
2856<p>Notice that SQL++ has twice as many functions listed above as there are aggregate functions in SQL-92. This is because SQL++ offers two versions of each &#x2013; one that handles <tt>UNKNOWN</tt> values in a semantically strict fashion, where unknown values in the input result in unknown values in the output &#x2013; and one that handles them in the ad hoc &#x201c;just ignore the unknown values&#x201d; fashion that the SQL standard chose to adopt.</p>
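<p>The difference between the two families can be seen on a small literal array. The following is a minimal sketch based only on the table above; the field names <tt>strict</tt> and <tt>lenient</tt> are illustrative and not part of the original examples:</p>

<div class="source">
<div class="source">
<pre>{
  &quot;strict&quot;: COLL_SUM([1, 2, null]),
  &quot;lenient&quot;: ARRAY_SUM([1, 2, null])
};
</pre></div></div>
<p>According to the table, <tt>COLL_SUM</tt> should return <tt>NULL</tt> because the input contains a <tt>NULL</tt>, while <tt>ARRAY_SUM</tt> should ignore the <tt>NULL</tt> and return 3.</p>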
2857<div class="section">
2858<div class="section">
2859<h5><a name="Example"></a>Example</h5>
2860
2861<div class="source">
2862<div class="source">
2863<pre>ARRAY_AVG(
2864 (
2865 SELECT VALUE ARRAY_COUNT(friendIds) FROM GleambookUsers
2866 )
2867);
2868</pre></div></div>
2869<p>This example returns:</p>
2870
2871<div class="source">
2872<div class="source">
2873<pre>3.3333333333333335
2874</pre></div></div></div>
2875<div class="section">
2876<h5><a name="Example"></a>Example</h5>
2877
2878<div class="source">
2879<div class="source">
2880<pre>SELECT uid AS uid, ARRAY_COUNT(grp) AS msgCnt
2881FROM GleambookMessages message
2882GROUP BY message.authorId AS uid
2883GROUP AS grp(message AS msg);
2884</pre></div></div>
2885<p>This query returns:</p>
2886
2887<div class="source">
2888<div class="source">
2889<pre>[ {
2890 &quot;uid&quot;: 1,
2891 &quot;msgCnt&quot;: 5
2892}, {
2893 &quot;uid&quot;: 2,
2894 &quot;msgCnt&quot;: 2
2895} ]
2896</pre></div></div>
2897<p>Notice how the query forms groups where each group involves a message author and their messages. (SQL cannot do this because the grouped intermediate result is non-1NF in nature.) The query then uses the collection aggregate function ARRAY_COUNT to get the cardinality of each group of messages.</p>
<p>Each aggregation function in SQL++ supports a <tt>DISTINCT</tt> modifier that removes duplicate values from the input collection.</p></div>
2899<div class="section">
2900<h5><a name="Example"></a>Example</h5>
2901
2902<div class="source">
2903<div class="source">
2904<pre>ARRAY_SUM(DISTINCT [1, 1, 2, 2, 3])
2905</pre></div></div>
2906<p>This query returns:</p>
2907
2908<div class="source">
2909<div class="source">
2910<pre>6
2911</pre></div></div></div></div></div>
2912<div class="section">
2913<h3><a name="SQL-92_Aggregation_Functions"></a><a name="SQL-92_aggregation_functions" id="SQL-92_aggregation_functions">SQL-92 Aggregation Functions</a></h3>
2914<p>For compatibility with the traditional SQL aggregation functions, SQL++ also offers SQL-92&#x2019;s aggregation function symbols (<tt>COUNT</tt>, <tt>SUM</tt>, <tt>MAX</tt>, <tt>MIN</tt>, and <tt>AVG</tt>) as supported syntactic sugar. The SQL++ compiler rewrites queries that utilize these function symbols into SQL++ queries that only use the SQL++ collection aggregate functions. The following example uses the SQL-92 syntax approach to compute a result that is identical to that of the more explicit SQL++ example above:</p>
2915<div class="section">
2916<div class="section">
2917<h5><a name="Example"></a>Example</h5>
2918
2919<div class="source">
2920<div class="source">
2921<pre>SELECT uid, COUNT(*) AS msgCnt
2922FROM GleambookMessages msg
2923GROUP BY msg.authorId AS uid;
2924</pre></div></div>
2925<p>It is important to realize that <tt>COUNT</tt> is actually <b>not</b> a SQL++ built-in aggregation function. Rather, the <tt>COUNT</tt> query above is using a special &#x201c;sugared&#x201d; function symbol that the SQL++ compiler will rewrite as follows:</p>
2926
2927<div class="source">
2928<div class="source">
2929<pre>SELECT uid AS uid, ARRAY_COUNT( (SELECT VALUE 1 FROM `$1` as g) ) AS msgCnt
2930FROM GleambookMessages msg
2931GROUP BY msg.authorId AS uid
2932GROUP AS `$1`(msg AS msg);
2933</pre></div></div>
2934<p>The same sort of rewritings apply to the function symbols <tt>SUM</tt>, <tt>MAX</tt>, <tt>MIN</tt>, and <tt>AVG</tt>. In contrast to the SQL++ collection aggregate functions, these special SQL-92 function symbols can only be used in the same way they are in standard SQL (i.e., with the same restrictions).</p>
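<p>For instance, assuming the rewriting of <tt>MAX</tt> follows the same pattern as the <tt>COUNT</tt> rewriting shown above (the compiler&#x2019;s actual internal form may differ, and the names <tt>maxId</tt> and <tt>grp</tt> are illustrative), a query such as:</p>

<div class="source">
<div class="source">
<pre>SELECT uid, MAX(msg.messageId) AS maxId
FROM GleambookMessages msg
GROUP BY msg.authorId AS uid;
</pre></div></div>
<p>could be written explicitly with the corresponding collection aggregate function as:</p>

<div class="source">
<div class="source">
<pre>SELECT uid AS uid, ARRAY_MAX( (SELECT VALUE g.msg.messageId FROM grp AS g) ) AS maxId
FROM GleambookMessages msg
GROUP BY msg.authorId AS uid
GROUP AS grp(msg AS msg);
</pre></div></div>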
<p>The <tt>DISTINCT</tt> modifier is also supported for these aggregate functions.</p></div></div></div>
2936<div class="section">
2937<h3><a name="SQL-92_Compliant_GROUP_BY_Aggregations"></a><a name="SQL-92_compliant_gby" id="SQL-92_compliant_gby">SQL-92 Compliant GROUP BY Aggregations</a></h3>
2938<p>SQL++ provides full support for SQL-92 <tt>GROUP BY</tt> aggregation queries. The following query is such an example:</p>
2939<div class="section">
2940<div class="section">
2941<h5><a name="Example"></a>Example</h5>
2942
2943<div class="source">
2944<div class="source">
2945<pre>SELECT msg.authorId, COUNT(*)
2946FROM GleambookMessages msg
2947GROUP BY msg.authorId;
2948</pre></div></div>
2949<p>This query outputs:</p>
2950
2951<div class="source">
2952<div class="source">
2953<pre>[ {
2954 &quot;authorId&quot;: 1,
2955 &quot;$1&quot;: 5
2956}, {
2957 &quot;authorId&quot;: 2,
2958 &quot;$1&quot;: 2
2959} ]
2960</pre></div></div>
2961<p>In principle, a <tt>msg</tt> reference in the query&#x2019;s <tt>SELECT</tt> clause would be &#x201c;sugarized&#x201d; as a collection (as described in <a href="#Implicit_group_variables">Implicit Group Variables</a>). However, since the SELECT expression <tt>msg.authorId</tt> is syntactically identical to a GROUP BY key expression, it will be internally replaced by the generated group key variable. The following is the equivalent rewritten query that will be generated by the compiler for the query above:</p>
2962
2963<div class="source">
2964<div class="source">
2965<pre>SELECT authorId AS authorId, ARRAY_COUNT( (SELECT g.msg FROM `$1` AS g) )
2966FROM GleambookMessages msg
2967GROUP BY msg.authorId AS authorId
2968GROUP AS `$1`(msg AS msg);
2969</pre></div></div></div></div></div>
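<p>The <tt>$1</tt> field in the output above is a generated name, used because the <tt>COUNT(*)</tt> projection has no alias. As a small sketch (the alias <tt>msgCnt</tt> is illustrative), giving the aggregate an explicit alias should yield a named field instead:</p>

<div class="source">
<div class="source">
<pre>SELECT msg.authorId, COUNT(*) AS msgCnt
FROM GleambookMessages msg
GROUP BY msg.authorId;
</pre></div></div>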
2970<div class="section">
2971<h3><a name="Column_Aliases"></a><a name="Column_aliases" id="Column_aliases">Column Aliases</a></h3>
2972<p>SQL++ also allows column aliases to be used as <tt>GROUP BY</tt> keys or <tt>ORDER BY</tt> keys.</p>
2973<div class="section">
2974<div class="section">
2975<h5><a name="Example"></a>Example</h5>
2976
2977<div class="source">
2978<div class="source">
2979<pre>SELECT msg.authorId AS aid, COUNT(*)
2980FROM GleambookMessages msg
2981GROUP BY aid;
2982</pre></div></div>
2983<p>This query returns:</p>
2984
2985<div class="source">
2986<div class="source">
2987<pre>[ {
2988 &quot;$1&quot;: 5,
2989 &quot;aid&quot;: 1
2990}, {
2991 &quot;$1&quot;: 2,
2992 &quot;aid&quot;: 2
2993} ]
2994</pre></div></div></div></div></div></div>
2995<div class="section">
2996<h2><a name="WHERE_Clauses_and_HAVING_Clauses"></a><a name="Where_having_clauses" id="Where_having_clauses">WHERE Clauses and HAVING Clauses</a></h2>
<p>Both <tt>WHERE</tt> clauses and <tt>HAVING</tt> clauses are used to filter input data based on a condition expression. Only tuples for which the condition expression evaluates to <tt>TRUE</tt> are propagated. Note that if the condition expression evaluates to <tt>NULL</tt> or <tt>MISSING</tt>, the input tuple will be discarded.</p></div>
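<p>The two clauses differ in when the filter is applied: as in standard SQL, <tt>WHERE</tt> filters tuples before grouping, while <tt>HAVING</tt> filters the groups produced by <tt>GROUP BY</tt>. The following is a minimal sketch against the sample data; the thresholds are arbitrary and chosen only for illustration:</p>

<div class="source">
<div class="source">
<pre>SELECT uid, COUNT(*) AS msgCnt
FROM GleambookMessages msg
WHERE msg.messageId &gt; 2
GROUP BY msg.authorId AS uid
HAVING COUNT(*) &gt; 1;
</pre></div></div>
<p>Here <tt>WHERE</tt> drops individual messages whose <tt>messageId</tt> is 2 or less before any groups are formed, and <tt>HAVING</tt> then keeps only those authors that still have more than one message.</p>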
2998<div class="section">
2999<h2><a name="ORDER_BY_Clauses"></a><a name="Order_By_clauses" id="Order_By_clauses">ORDER BY Clauses</a></h2>
3000<p>The <tt>ORDER BY</tt> clause is used to globally sort data in either ascending order (i.e., <tt>ASC</tt>) or descending order (i.e., <tt>DESC</tt>). During ordering, <tt>MISSING</tt> and <tt>NULL</tt> are treated as being smaller than any other value if they are encountered in the ordering key(s). <tt>MISSING</tt> is treated as smaller than <tt>NULL</tt> if both occur in the data being sorted. The following example returns all <tt>GleambookUsers</tt> in descending order by their number of friends.</p>
3001<div class="section">
3002<div class="section">
3003<div class="section">
3004<h5><a name="Example"></a>Example</h5>
3005
3006<div class="source">
3007<div class="source">
3008<pre> SELECT VALUE user
3009 FROM GleambookUsers AS user
3010 ORDER BY ARRAY_COUNT(user.friendIds) DESC;
3011</pre></div></div>
3012<p>This query returns:</p>
3013
3014<div class="source">
3015<div class="source">
3016<pre> [ {
3017 &quot;userSince&quot;: &quot;2012-08-20T10:10:00.000Z&quot;,
3018 &quot;friendIds&quot;: [
3019 2,
3020 3,
3021 6,
3022 10
3023 ],
3024 &quot;gender&quot;: &quot;F&quot;,
3025 &quot;name&quot;: &quot;MargaritaStoddard&quot;,
3026 &quot;nickname&quot;: &quot;Mags&quot;,
3027 &quot;alias&quot;: &quot;Margarita&quot;,
3028 &quot;id&quot;: 1,
3029 &quot;employment&quot;: [
3030 {
3031 &quot;organizationName&quot;: &quot;Codetechno&quot;,
3032 &quot;start-date&quot;: &quot;2006-08-06&quot;
3033 },
3034 {
3035 &quot;end-date&quot;: &quot;2010-01-26&quot;,
3036 &quot;organizationName&quot;: &quot;geomedia&quot;,
3037 &quot;start-date&quot;: &quot;2010-06-17&quot;
3038 }
3039 ]
3040 }, {
3041 &quot;userSince&quot;: &quot;2012-07-10T10:10:00.000Z&quot;,
3042 &quot;friendIds&quot;: [
3043 1,
3044 5,
3045 8,
3046 9
3047 ],
3048 &quot;name&quot;: &quot;EmoryUnk&quot;,
3049 &quot;alias&quot;: &quot;Emory&quot;,
3050 &quot;id&quot;: 3,
3051 &quot;employment&quot;: [
3052 {
3053 &quot;organizationName&quot;: &quot;geomedia&quot;,
3054 &quot;endDate&quot;: &quot;2010-01-26&quot;,
3055 &quot;startDate&quot;: &quot;2010-06-17&quot;
3056 }
3057 ]
3058 }, {
3059 &quot;userSince&quot;: &quot;2011-01-22T10:10:00.000Z&quot;,
3060 &quot;friendIds&quot;: [
3061 1,
3062 4
3063 ],
3064 &quot;name&quot;: &quot;IsbelDull&quot;,
3065 &quot;nickname&quot;: &quot;Izzy&quot;,
3066 &quot;alias&quot;: &quot;Isbel&quot;,
3067 &quot;id&quot;: 2,
3068 &quot;employment&quot;: [
3069 {
3070 &quot;organizationName&quot;: &quot;Hexviafind&quot;,
3071 &quot;startDate&quot;: &quot;2010-04-27&quot;
3072 }
3073 ]
3074 } ]
3075</pre></div></div></div></div></div></div>
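<p>The treatment of <tt>MISSING</tt> ordering keys can be sketched with the same data set. In the result shown above, the user <tt>EmoryUnk</tt> has no <tt>nickname</tt> field, so under the rule stated earlier that user would be expected to sort first in the following ascending-order query (the output field names are illustrative):</p>

<div class="source">
<div class="source">
<pre>SELECT user.name AS uname, user.nickname AS nickname
FROM GleambookUsers AS user
ORDER BY user.nickname;
</pre></div></div>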
3076<div class="section">
3077<h2><a name="LIMIT_Clauses"></a><a name="Limit_clauses" id="Limit_clauses">LIMIT Clauses</a></h2>
3078<p>The <tt>LIMIT</tt> clause is used to limit the result set to a specified constant size. The use of the <tt>LIMIT</tt> clause is illustrated in the next example.</p>
3079<div class="section">
3080<div class="section">
3081<div class="section">
3082<h5><a name="Example"></a>Example</h5>
3083
3084<div class="source">
3085<div class="source">
3086<pre> SELECT VALUE user
3087 FROM GleambookUsers AS user
3088 ORDER BY len(user.friendIds) DESC
3089 LIMIT 1;
3090</pre></div></div>
3091<p>This query returns:</p>
3092
3093<div class="source">
3094<div class="source">
3095<pre> [ {
3096 &quot;userSince&quot;: &quot;2012-08-20T10:10:00.000Z&quot;,
3097 &quot;friendIds&quot;: [
3098 2,
3099 3,
3100 6,
3101 10
3102 ],
3103 &quot;gender&quot;: &quot;F&quot;,
3104 &quot;name&quot;: &quot;MargaritaStoddard&quot;,
3105 &quot;nickname&quot;: &quot;Mags&quot;,
3106 &quot;alias&quot;: &quot;Margarita&quot;,
3107 &quot;id&quot;: 1,
3108 &quot;employment&quot;: [
3109 {
3110 &quot;organizationName&quot;: &quot;Codetechno&quot;,
3111 &quot;start-date&quot;: &quot;2006-08-06&quot;
3112 },
3113 {
3114 &quot;end-date&quot;: &quot;2010-01-26&quot;,
3115 &quot;organizationName&quot;: &quot;geomedia&quot;,
3116 &quot;start-date&quot;: &quot;2010-06-17&quot;
3117 }
3118 ]
3119 } ]
3120</pre></div></div></div></div></div></div>
3121<div class="section">
3122<h2><a name="WITH_Clauses"></a><a name="With_clauses" id="With_clauses">WITH Clauses</a></h2>
3123<p>As in standard SQL, <tt>WITH</tt> clauses are available to improve the modularity of a query. The next query shows an example.</p>
3124<div class="section">
3125<div class="section">
3126<div class="section">
3127<h5><a name="Example"></a>Example</h5>
3128
3129<div class="source">
3130<div class="source">
3131<pre>WITH avgFriendCount AS (
3132 SELECT VALUE AVG(ARRAY_COUNT(user.friendIds))
3133 FROM GleambookUsers AS user
3134)[0]
3135SELECT VALUE user
3136FROM GleambookUsers user
3137WHERE ARRAY_COUNT(user.friendIds) &gt; avgFriendCount;
3138</pre></div></div>
3139<p>This query returns:</p>
3140
3141<div class="source">
3142<div class="source">
3143<pre>[ {
3144 &quot;userSince&quot;: &quot;2012-08-20T10:10:00.000Z&quot;,
3145 &quot;friendIds&quot;: [
3146 2,
3147 3,
3148 6,
3149 10
3150 ],
3151 &quot;gender&quot;: &quot;F&quot;,
3152 &quot;name&quot;: &quot;MargaritaStoddard&quot;,
3153 &quot;nickname&quot;: &quot;Mags&quot;,
3154 &quot;alias&quot;: &quot;Margarita&quot;,
3155 &quot;id&quot;: 1,
3156 &quot;employment&quot;: [
3157 {
3158 &quot;organizationName&quot;: &quot;Codetechno&quot;,
3159 &quot;start-date&quot;: &quot;2006-08-06&quot;
3160 },
3161 {
3162 &quot;end-date&quot;: &quot;2010-01-26&quot;,
3163 &quot;organizationName&quot;: &quot;geomedia&quot;,
3164 &quot;start-date&quot;: &quot;2010-06-17&quot;
3165 }
3166 ]
3167}, {
3168 &quot;userSince&quot;: &quot;2012-07-10T10:10:00.000Z&quot;,
3169 &quot;friendIds&quot;: [
3170 1,
3171 5,
3172 8,
3173 9
3174 ],
3175 &quot;name&quot;: &quot;EmoryUnk&quot;,
3176 &quot;alias&quot;: &quot;Emory&quot;,
3177 &quot;id&quot;: 3,
3178 &quot;employment&quot;: [
3179 {
3180 &quot;organizationName&quot;: &quot;geomedia&quot;,
3181 &quot;endDate&quot;: &quot;2010-01-26&quot;,
3182 &quot;startDate&quot;: &quot;2010-06-17&quot;
3183 }
3184 ]
3185} ]
3186</pre></div></div>
3187<p>The query is equivalent to the following, more complex, inlined form of the query:</p>
3188
3189<div class="source">
3190<div class="source">
<pre>SELECT VALUE user
3192FROM GleambookUsers user
3193WHERE ARRAY_COUNT(user.friendIds) &gt;
3194 ( SELECT VALUE AVG(ARRAY_COUNT(user.friendIds))
3195 FROM GleambookUsers AS user
3196 ) [0];
3197</pre></div></div>
3198<p>WITH can be particularly useful when a value needs to be used several times in a query.</p>
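<p>As a hedged sketch of such reuse (this variant is not part of the original example set, and the field name <tt>aboveAvgBy</tt> is illustrative), the value bound by <tt>WITH</tt> can appear both in the <tt>SELECT</tt> clause and in the <tt>WHERE</tt> clause without repeating the subquery:</p>

<div class="source">
<div class="source">
<pre>WITH avgFriendCount AS (
  SELECT VALUE AVG(ARRAY_COUNT(user.friendIds))
  FROM GleambookUsers AS user
)[0]
SELECT user.name AS uname,
       ARRAY_COUNT(user.friendIds) - avgFriendCount AS aboveAvgBy
FROM GleambookUsers AS user
WHERE ARRAY_COUNT(user.friendIds) &gt; avgFriendCount;
</pre></div></div>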
<p>Before proceeding further, notice that both the <tt>WITH</tt> query and its equivalent inlined variant include the syntax &#x201c;[0]&#x201d; &#x2013; this is due to a noteworthy difference between SQL++ and SQL-92. In SQL-92, whenever a scalar value is expected and it is being produced by a query expression, the SQL-92 query processor will evaluate the expression, check at runtime that there is only one row and one column in the result, and then coerce the one-row/one-column tabular result into a scalar value. SQL++, being designed to deal with nested data and schema-less data, does not (and should not) do this. Collection-valued data is perfectly legal in most SQL++ contexts, and its data is schema-less, so a query processor rarely knows exactly what kind of value to expect at a given point in a query, and such automatic conversion is often undesirable. Thus, in the queries above, the use of &#x201c;[0]&#x201d; extracts the first (i.e., 0th) element of an array-valued query expression&#x2019;s result; this is needed even though the result is an array of one element, in order to extract the only element of the singleton array and obtain the desired scalar for the comparison.</p></div></div></div></div>
3200<div class="section">
3201<h2><a name="LET_Clauses"></a><a name="Let_clauses" id="Let_clauses">LET Clauses</a></h2>
3202<p>Similar to <tt>WITH</tt> clauses, <tt>LET</tt> clauses can be useful when a (complex) expression is used several times within a query, allowing it to be written once to make the query more concise. The next query shows an example.</p>
3203<div class="section">
3204<div class="section">
3205<div class="section">
3206<h5><a name="Example"></a>Example</h5>
3207
3208<div class="source">
3209<div class="source">
3210<pre>SELECT u.name AS uname, messages AS messages
3211FROM GleambookUsers u
3212LET messages = (SELECT VALUE m
3213 FROM GleambookMessages m
3214 WHERE m.authorId = u.id)
3215WHERE EXISTS messages;
3216</pre></div></div>
3217<p>This query lists <tt>GleambookUsers</tt> that have posted <tt>GleambookMessages</tt> and shows all authored messages for each listed user. It returns:</p>
3218
3219<div class="source">
3220<div class="source">
3221<pre>[ {
3222 &quot;uname&quot;: &quot;MargaritaStoddard&quot;,
3223 &quot;messages&quot;: [
3224 {
3225 &quot;senderLocation&quot;: [
3226 38.97,
3227 77.49
3228 ],
3229 &quot;inResponseTo&quot;: 1,
3230 &quot;messageId&quot;: 11,
3231 &quot;authorId&quot;: 1,
3232 &quot;message&quot;: &quot; can't stand acast its plan is terrible&quot;
3233 },
3234 {
3235 &quot;senderLocation&quot;: [
3236 41.66,
3237 80.87
3238 ],
3239 &quot;inResponseTo&quot;: 4,
3240 &quot;messageId&quot;: 2,
3241 &quot;authorId&quot;: 1,
3242 &quot;message&quot;: &quot; dislike x-phone its touch-screen is horrible&quot;
3243 },
3244 {
3245 &quot;senderLocation&quot;: [
3246 37.73,
3247 97.04
3248 ],
3249 &quot;inResponseTo&quot;: 2,
3250 &quot;messageId&quot;: 4,
3251 &quot;authorId&quot;: 1,
3252 &quot;message&quot;: &quot; can't stand acast the network is horrible:(&quot;
3253 },
3254 {
3255 &quot;senderLocation&quot;: [
3256 40.33,
3257 80.87
3258 ],
3259 &quot;inResponseTo&quot;: 11,
3260 &quot;messageId&quot;: 8,
3261 &quot;authorId&quot;: 1,
3262 &quot;message&quot;: &quot; like ccast the 3G is awesome:)&quot;
3263 },
3264 {
3265 &quot;senderLocation&quot;: [
3266 42.5,
3267 70.01
3268 ],
3269 &quot;inResponseTo&quot;: 12,
3270 &quot;messageId&quot;: 10,
3271 &quot;authorId&quot;: 1,
3272 &quot;message&quot;: &quot; can't stand product-w the touch-screen is terrible&quot;
3273 }
3274 ]
3275}, {
3276 &quot;uname&quot;: &quot;IsbelDull&quot;,
3277 &quot;messages&quot;: [
3278 {
3279 &quot;senderLocation&quot;: [
3280 31.5,
3281 75.56
3282 ],
3283 &quot;inResponseTo&quot;: 1,
3284 &quot;messageId&quot;: 6,
3285 &quot;authorId&quot;: 2,
3286 &quot;message&quot;: &quot; like product-z its platform is mind-blowing&quot;
3287 },
3288 {
3289 &quot;senderLocation&quot;: [
3290 48.09,
3291 81.01
3292 ],
3293 &quot;inResponseTo&quot;: 4,
3294 &quot;messageId&quot;: 3,
3295 &quot;authorId&quot;: 2,
3296 &quot;message&quot;: &quot; like product-y the plan is amazing&quot;
3297 }
3298 ]
3299} ]
3300</pre></div></div>
3301<p>This query is equivalent to the following query that does not use the <tt>LET</tt> clause:</p>
3302
3303<div class="source">
3304<div class="source">
3305<pre>SELECT u.name AS uname, ( SELECT VALUE m
3306 FROM GleambookMessages m
3307 WHERE m.authorId = u.id
3308 ) AS messages
3309FROM GleambookUsers u
3310WHERE EXISTS ( SELECT VALUE m
3311 FROM GleambookMessages m
3312 WHERE m.authorId = u.id
3313 );
3314</pre></div></div></div></div></div></div>
3315<div class="section">
3316<h2><a name="UNION_ALL"></a><a name="Union_all" id="Union_all">UNION ALL</a></h2>
<p><tt>UNION ALL</tt> can be used to combine two input arrays or multisets into one. As in SQL, there is no ordering guarantee on the contents of the output stream. However, unlike SQL, SQL++ does not constrain what the data looks like on the input streams; in particular, it allows heterogeneity on the input and output streams. A type error will be raised if one of the inputs is not a collection. The following odd but legal query is an example:</p>
3318<div class="section">
3319<div class="section">
3320<div class="section">
3321<h5><a name="Example"></a>Example</h5>
3322
3323<div class="source">
3324<div class="source">
3325<pre>SELECT u.name AS uname
3326FROM GleambookUsers u
3327WHERE u.id = 2
3328 UNION ALL
3329SELECT VALUE m.message
3330FROM GleambookMessages m
WHERE m.authorId = 2;
3332</pre></div></div>
3333<p>This query returns:</p>
3334
3335<div class="source">
3336<div class="source">
3337<pre>[
3338 &quot; like product-z its platform is mind-blowing&quot;
3339 , {
3340 &quot;uname&quot;: &quot;IsbelDull&quot;
3341}, &quot; like product-y the plan is amazing&quot;
3342 ]
3343</pre></div></div></div></div></div></div>
3344<div class="section">
3345<h2><a name="Subqueries" id="Subqueries">Subqueries</a></h2>
3346<p>In SQL++, an arbitrary subquery can appear anywhere that an expression can appear. Unlike SQL-92, as was just alluded to, the subqueries in a SELECT list or a boolean predicate need not return singleton, single-column relations. Instead, they may return arbitrary collections. For example, the following query is a variant of the prior group-by query examples; it retrieves an array of up to two &#x201c;dislike&#x201d; messages per user.</p>
3347<div class="section">
3348<div class="section">
3349<div class="section">
3350<h5><a name="Example"></a>Example</h5>
3351
3352<div class="source">
3353<div class="source">
3354<pre>SELECT uid,
3355 (SELECT VALUE m.msg
3356 FROM msgs m
3357 WHERE m.msg.message LIKE '%dislike%'
3358 ORDER BY m.msg.messageId
3359 LIMIT 2) AS msgs
3360FROM GleambookMessages message
3361GROUP BY message.authorId AS uid GROUP AS msgs(message AS msg);
3362</pre></div></div>
3363<p>For our sample data set, this query returns:</p>
3364
3365<div class="source">
3366<div class="source">
3367<pre>[ {
3368 &quot;msgs&quot;: [
3369 {
3370 &quot;senderLocation&quot;: [
3371 41.66,
3372 80.87
3373 ],
3374 &quot;inResponseTo&quot;: 4,
3375 &quot;messageId&quot;: 2,
3376 &quot;authorId&quot;: 1,
3377 &quot;message&quot;: &quot; dislike x-phone its touch-screen is horrible&quot;
3378 }
3379 ],
3380 &quot;uid&quot;: 1
3381}, {
3382 &quot;msgs&quot;: [
3383
3384 ],
3385 &quot;uid&quot;: 2
3386} ]
3387</pre></div></div>
<p>Note that a subquery, like a top-level <tt>SELECT</tt> statement, always returns a collection &#x2013; regardless of where within a query the subquery occurs &#x2013; and again, its result is never automatically cast into a scalar.</p></div></div></div></div>
3389<div class="section">
3390<h2><a name="SQL_vs._SQL-92"></a><a name="Vs_SQL-92" id="Vs_SQL-92">SQL++ vs. SQL-92</a></h2>
3391<p>SQL++ offers the following additional features beyond SQL-92 (hence the &#x201c;++&#x201d; in its name):</p>
3392
3393<ul>
3394
3395<li>Fully composable and functional: A subquery can iterate over any intermediate collection and can appear anywhere in a query.</li>
3396
3397<li>Schema-free: The query language does not assume the existence of a static schema for any data that it processes.</li>
3398
3399<li>Correlated FROM terms: A right-side FROM term expression can refer to variables defined by FROM terms on its left.</li>
3400
3401<li>Powerful GROUP BY: In addition to a set of aggregate functions as in standard SQL, the groups created by the <tt>GROUP BY</tt> clause are directly usable in nested queries and/or to obtain nested results.</li>
3402
3403<li>Generalized SELECT clause: A SELECT clause can return any type of collection, while in SQL-92, a <tt>SELECT</tt> clause has to return a (homogeneous) collection of objects.</li>
3404</ul>
3405<p>The following matrix is a quick &#x201c;SQL-92 compatibility cheat sheet&#x201d; for SQL++.</p>
3406
3407<table border="0" class="table table-striped">
3408 <thead>
3409
3410<tr class="a">
3411
3412<th>Feature </th>
3413
3414<th>SQL++ </th>
3415
3416<th>SQL-92 </th>
3417
3418<th>Why different? </th>
3419 </tr>
3420 </thead>
3421 <tbody>
3422
3423<tr class="b">
3424
3425<td>SELECT * </td>
3426
3427<td>Returns nested objects </td>
3428
3429<td>Returns flattened concatenated objects </td>
3430
3431<td>Nested collections are 1st class citizens </td>
3432 </tr>
3433
3434<tr class="a">
3435
3436<td>SELECT list </td>
3437
3438<td>order not preserved </td>
3439
3440<td>order preserved </td>
3441
<td>Fields in a JSON object are not ordered </td>
3443 </tr>
3444
3445<tr class="b">
3446
3447<td>Subquery </td>
3448
3449<td>Returns a collection </td>
3450
3451<td>The returned collection is cast into a scalar value if the subquery appears in a SELECT list or on one side of a comparison or as input to a function </td>
3452
3453<td>Nested collections are 1st class citizens </td>
3454 </tr>
3455
3456<tr class="a">
3457
3458<td>LEFT OUTER JOIN </td>
3459
3460<td>Fills in <tt>MISSING</tt>(s) for non-matches </td>
3461
3462<td>Fills in <tt>NULL</tt>(s) for non-matches </td>
3463
3464<td>&#x201c;Absence&#x201d; is more appropriate than &#x201c;unknown&#x201d; here. </td>
3465 </tr>
3466
3467<tr class="b">
3468
3469<td>UNION ALL </td>
3470
3471<td>Allows heterogeneous inputs and output </td>
3472
3473<td>Input streams must be UNION-compatible and output field names are drawn from the first input stream </td>
3474
<td>Heterogeneity and nested collections are common </td>
3476 </tr>
3477
3478<tr class="a">
3479
3480<td>IN constant_expr </td>
3481
3482<td>The constant expression has to be an array or multiset, i.e., [..,..,&#x2026;] </td>
3483
3484<td>The constant collection can be represented as comma-separated items in a paren pair </td>
3485
3486<td>Nested collections are 1st class citizens </td>
3487 </tr>
3488
3489<tr class="b">
3490
3491<td>String literal </td>
3492
3493<td>Double quotes or single quotes </td>
3494
3495<td>Single quotes only </td>
3496
3497<td>Double quoted strings are pervasive </td>
3498 </tr>
3499
3500<tr class="a">
3501
3502<td>Delimited identifiers </td>
3503
3504<td>Backticks </td>
3505
3506<td>Double quotes </td>
3507
3508<td>Double quoted strings are pervasive </td>
3509 </tr>
3510 </tbody>
3511</table>
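<p>Several rows of the matrix can be illustrated together. The following is a hedged sketch against the sample data (the output field names <tt>uname</tt> and <tt>startDate</tt> are illustrative): the <tt>IN</tt> list is written as an array rather than in parentheses, the string literal uses double quotes, and the hyphenated field name <tt>start-date</tt> is delimited with backticks. The second <tt>FROM</tt> term is also correlated with the first, as described in the feature list above.</p>

<div class="source">
<div class="source">
<pre>SELECT user.name AS uname, e.`start-date` AS startDate
FROM GleambookUsers AS user, user.employment AS e
WHERE user.id IN [1, 3] AND user.gender = &quot;F&quot;;
</pre></div></div>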
<p>The following SQL-92 features are not implemented yet. However, SQL++ does not conflict with those features:</p>
3513
3514<ul>
3515
3516<li>CROSS JOIN, NATURAL JOIN, UNION JOIN</li>
3517
3518<li>RIGHT and FULL OUTER JOIN</li>
3519
3520<li>INTERSECT, EXCEPT, UNION with set semantics</li>
3521
3522<li>CAST expression</li>
3523
3524<li>NULLIF expression</li>
3525
3526<li>COALESCE expression</li>
3527
3528<li>ALL and SOME predicates for linking to subqueries</li>
3529
3530<li>UNIQUE predicate (tests a collection for duplicates)</li>
3531
3532<li>MATCH predicate (tests for referential integrity)</li>
3533
3534<li>Row and Table constructors</li>
3535
3536<li>DISTINCT aggregates</li>
3537
3538<li>Preserved order for expressions in a SELECT list</li>
3539</ul>
3540<!-- ! Licensed to the Apache Software Foundation (ASF) under one
3541 ! or more contributor license agreements. See the NOTICE file
3542 ! distributed with this work for additional information
3543 ! regarding copyright ownership. The ASF licenses this file
3544 ! to you under the Apache License, Version 2.0 (the
3545 ! "License"); you may not use this file except in compliance
3546 ! with the License. You may obtain a copy of the License at
3547 !
3548 ! http://www.apache.org/licenses/LICENSE-2.0
3549 !
3550 ! Unless required by applicable law or agreed to in writing,
3551 ! software distributed under the License is distributed on an
3552 ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
3553 ! KIND, either express or implied. See the License for the
3554 ! specific language governing permissions and limitations
3555 ! under the License.
3556 ! -->
3557<h1><a name="Errors" id="Errors">4. Errors</a></h1>
3575<p>A SQL++ query can potentially result in one of the following errors:</p>
3576
3577<ul>
3578
3579<li>syntax error,</li>
3580
3581<li>identifier resolution error,</li>
3582
3583<li>type error,</li>
3584
3585<li>resource error.</li>
3586</ul>
3587<p>If the query processor runs into any error, it will terminate the ongoing processing of the query and immediately return an error message to the client.</p></div>
3588<div class="section">
3589<h2><a name="Syntax_Errors"></a><a name="Syntax_errors" id="Syntax_errors">Syntax Errors</a></h2>
<p>A valid SQL++ query must satisfy the SQL++ grammar rules. Otherwise, a syntax error will be raised.</p>
3591<div class="section">
3592<div class="section">
3593<div class="section">
3594<h5><a name="Example"></a>Example</h5>
3595
3596<div class="source">
3597<div class="source">
3598<pre>SELECT *
3599GleambookUsers user
3600</pre></div></div>
<p>Since the query is missing the <tt>FROM</tt> keyword before the dataset <tt>GleambookUsers</tt>, we will get a syntax error as follows:</p>
3602
3603<div class="source">
3604<div class="source">
<pre>Syntax error: In line 2 &gt;&gt;GleambookUsers user;&lt;&lt; Encountered &lt;IDENTIFIER&gt; &quot;GleambookUsers&quot; at column 1.
3606</pre></div></div></div>
3607<div class="section">
3608<h5><a name="Example"></a>Example</h5>
3609
3610<div class="source">
3611<div class="source">
3612<pre>SELECT *
3613FROM GleambookUsers user
3614WHERE type=&quot;advertiser&quot;;
3615</pre></div></div>
3616<p>Since &#x201c;type&#x201d; is a reserved keyword in the SQL++ parser, we will get a syntax error as follows:</p>
3617
3618<div class="source">
3619<div class="source">
3620<pre>Error: Syntax error: In line 3 &gt;&gt;WHERE type=&quot;advertiser&quot;;&lt;&lt; Encountered 'type' &quot;type&quot; at column 7.
3621==&gt; WHERE type=&quot;advertiser&quot;;
3622</pre></div></div></div></div></div></div>
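<p>As a sketch of how the two queries above could be repaired: the first regains its <tt>FROM</tt> keyword, and the second writes the reserved word as a delimited identifier, relying on the backtick convention listed in the comparison table earlier in this document. Qualifying the field with the <tt>user</tt> alias, and the presence of a <tt>type</tt> field in <tt>GleambookUsers</tt> objects, are assumptions made for illustration.</p>

<div class="source">
<div class="source">
<pre>/* First example, with the missing FROM keyword restored. */
SELECT *
FROM GleambookUsers user;

/* Second example, with the reserved word written as a delimited identifier. */
SELECT *
FROM GleambookUsers user
WHERE user.`type` = &quot;advertiser&quot;;
</pre></div></div>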
3623<div class="section">
3624<h2><a name="Identifier_Resolution_Errors"></a><a name="Identifier_resolution_errors" id="Identifier_resolution_errors">Identifier Resolution Errors</a></h2>
<p>Referring to an undefined identifier causes an error if the identifier cannot be successfully resolved as a valid field access.</p>
3626<div class="section">
3627<div class="section">
3628<div class="section">
3629<h5><a name="Example"></a>Example</h5>
3630
3631<div class="source">
3632<div class="source">
3633<pre>SELECT *
3634FROM GleambookUser user;
3635</pre></div></div>
<p>If we misspell the dataset name as &#x201c;GleambookUser&#x201d;, omitting the final &#x201c;s&#x201d;, we will get an identifier resolution error as follows:</p>
3637
3638<div class="source">
3639<div class="source">
3640<pre>Error: Cannot find dataset GleambookUser in dataverse Default nor an alias with name GleambookUser!
3641</pre></div></div></div>
3642<div class="section">
3643<h5><a name="Example"></a>Example</h5>
3644
3645<div class="source">
3646<div class="source">
3647<pre>SELECT name, message
3648FROM GleambookUsers u JOIN GleambookMessages m ON m.authorId = u.id;
3649</pre></div></div>
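<p>If the compiler cannot figure out all possible fields in <tt>GleambookUsers</tt> and <tt>GleambookMessages</tt>, it cannot decide which dataset the unqualified identifiers <tt>name</tt> and <tt>message</tt> refer to, and we will get an identifier resolution error as follows:</p>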
3651
3652<div class="source">
3653<div class="source">
3654<pre>Error: Cannot resolve ambiguous alias reference for undefined identifier name
3655</pre></div></div></div></div></div></div>
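<p>One way to avoid this ambiguity is to qualify each field with the alias of the dataset it comes from. The sketch below assumes that <tt>name</tt> is a field of <tt>GleambookUsers</tt> objects and <tt>message</tt> is a field of <tt>GleambookMessages</tt> objects.</p>

<div class="source">
<div class="source">
<pre>/* Each field is qualified, so no guessing about its origin is needed. */
SELECT u.name, m.message
FROM GleambookUsers u JOIN GleambookMessages m ON m.authorId = u.id;
</pre></div></div>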
3656<div class="section">
3657<h2><a name="Type_Errors"></a><a name="Type_errors" id="Type_errors">Type Errors</a></h2>
3658<p>The SQL++ compiler does type checks based on its available type information. In addition, the SQL++ runtime also reports type errors if a data model instance it processes does not satisfy the type requirement.</p>
3659<div class="section">
3660<div class="section">
3661<div class="section">
3662<h5><a name="Example"></a>Example</h5>
3663
3664<div class="source">
3665<div class="source">
3666<pre>abs(&quot;123&quot;);
3667</pre></div></div>
3668<p>Since function <tt>abs</tt> can only process numeric input values, we will get a type error as follows:</p>
3669
3670<div class="source">
3671<div class="source">
3672<pre>Error: Arithmetic operations are not implemented for string
3673</pre></div></div></div></div></div></div>
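<p>For contrast, a call that passes a numeric value satisfies the type requirement of <tt>abs</tt>. A minimal sketch, with an arbitrary argument value:</p>

<div class="source">
<div class="source">
<pre>/* A numeric argument instead of the string &quot;123&quot;. */
abs(-123);
</pre></div></div>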
3674<div class="section">
3675<h2><a name="Resource_Errors"></a><a name="Resource_errors" id="Resource_errors">Resource Errors</a></h2>
<p>A query can potentially exhaust system resources, such as open file handles or disk space. For instance, the following two resource errors may be seen when running the system:</p>
3677
3678<div class="source">
3679<div class="source">
3680<pre>Error: no space left on device
3681Error: too many open files
3682</pre></div></div>
<p>The &#x201c;no space left on device&#x201d; issue can usually be fixed by freeing up disk space and reserving more disk space for the system. The &#x201c;too many open files&#x201d; issue can usually be fixed by a system administrator, following the instructions <a class="externalLink" href="https://easyengine.io/tutorials/linux/increase-open-files-limit/">here</a>.</p>
3701<h1><a name="DDL_and_DML_statements" id="DDL_and_DML_statements">5. DDL and DML statements</a></h1>
3702
3703<div class="source">
3704<div class="source">
3705<pre>Statement ::= ( ( SingleStatement )? ( &quot;;&quot; )+ )* &lt;EOF&gt;
3706SingleStatement ::= DatabaseDeclaration
3707 | FunctionDeclaration
3708 | CreateStatement
3709 | DropStatement
3710 | LoadStatement
3711 | SetStatement
3712 | InsertStatement
3713 | DeleteStatement
3714 | Query
3715</pre></div></div>
3716<p>In addition to queries, an implementation of SQL++ needs to support statements for data definition and manipulation purposes as well as controlling the context to be used in evaluating SQL++ expressions. This section details the DDL and DML statements supported in the SQL++ language as realized today in Apache AsterixDB.</p>
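<p>As the <tt>Statement</tt> rule above indicates, one request may carry several statements, each terminated by a semicolon. The following minimal sketch assumes that <tt>USE</tt> is the <tt>DatabaseDeclaration</tt> form and that a <tt>TinySocial</tt> dataverse containing a <tt>GleambookUsers</tt> dataset already exists.</p>

<div class="source">
<div class="source">
<pre>/* Statement 1: select the dataverse used to resolve unqualified names. */
USE TinySocial;

/* Statement 2: a query evaluated in that context. */
SELECT VALUE u.name
FROM GleambookUsers u;
</pre></div></div>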
</div>
3734<div class="section">
3735<h2><a name="Lifecycle_Management_Statements"></a><a name="Lifecycle_management_statements" id="Lifecycle_management_statements">Lifecycle Management Statements</a></h2>
3736
3737<div class="source">
3738<div class="source">
3739<pre>CreateStatement ::= &quot;CREATE&quot; ( DatabaseSpecification
3740 | TypeSpecification
3741 | DatasetSpecification
3742 | IndexSpecification
3743 | FunctionSpecification )
3744
3745QualifiedName ::= Identifier ( &quot;.&quot; Identifier )?
3746DoubleQualifiedName ::= Identifier &quot;.&quot; Identifier ( &quot;.&quot; Identifier )?
3747</pre></div></div>
3748<p>The CREATE statement in SQL++ is used for creating dataverses as well as other persistent artifacts in a dataverse. It can be used to create new dataverses, datatypes, datasets, indexes, and user-defined SQL++ functions.</p>
3749<div class="section">
3750<h3><a name="Dataverses" id="Dataverses"> Dataverses</a></h3>
3751
3752<div class="source">
3753<div class="source">
3754<pre>DatabaseSpecification ::= &quot;DATAVERSE&quot; Identifier IfNotExists
3755</pre></div></div>
3756<p>The CREATE DATAVERSE statement is used to create new dataverses. To ease the authoring of reusable SQL++ scripts, an optional IF NOT EXISTS clause is included to allow creation to be requested either unconditionally or only if the dataverse does not already exist. If this clause is absent, an error is returned if a dataverse with the indicated name already exists.</p>
3757<p>The following example creates a new dataverse named TinySocial if one does not already exist.</p>
3758<div class="section">
3759<div class="section">
3760<h5><a name="Example"></a>Example</h5>
3761
3762<div class="source">
3763<div class="source">
3764<pre>CREATE DATAVERSE TinySocial IF NOT EXISTS;
3765</pre></div></div></div></div></div>
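<p>A script that should always start from an empty dataverse can instead combine the DROP statement's IF EXISTS clause with an unconditional CREATE. This is only a sketch of one possible pattern; note that dropping a dataverse discards what it contains, and <tt>USE</tt> is assumed to be the statement that selects the current dataverse.</p>

<div class="source">
<div class="source">
<pre>/* Remove any previous copy, then create the dataverse unconditionally. */
DROP DATAVERSE TinySocial IF EXISTS;
CREATE DATAVERSE TinySocial;
USE TinySocial;
</pre></div></div>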
3766<div class="section">
3767<h3><a name="Types" id="Types"> Types</a></h3>
3768
3769<div class="source">
3770<div class="source">
3771<pre>TypeSpecification ::= &quot;TYPE&quot; FunctionOrTypeName IfNotExists &quot;AS&quot; ObjectTypeDef
3772FunctionOrTypeName ::= QualifiedName
3773IfNotExists ::= ( &lt;IF&gt; &lt;NOT&gt; &lt;EXISTS&gt; )?
3774TypeExpr ::= ObjectTypeDef | TypeReference | ArrayTypeDef | MultisetTypeDef
3775ObjectTypeDef ::= ( &lt;CLOSED&gt; | &lt;OPEN&gt; )? &quot;{&quot; ( ObjectField ( &quot;,&quot; ObjectField )* )? &quot;}&quot;
3776ObjectField ::= Identifier &quot;:&quot; ( TypeExpr ) ( &quot;?&quot; )?
3777NestedField ::= Identifier ( &quot;.&quot; Identifier )*
3778IndexField ::= NestedField ( &quot;:&quot; TypeReference )?
3779TypeReference ::= Identifier
3780ArrayTypeDef ::= &quot;[&quot; ( TypeExpr ) &quot;]&quot;
3781MultisetTypeDef ::= &quot;{{&quot; ( TypeExpr ) &quot;}}&quot;
3782</pre></div></div>
<p>The CREATE TYPE statement is used to create a new named datatype. This type can then be used to create stored collections or utilized when defining one or more other datatypes. Much more information about the data model is available in the <a href="../datamodel.html">data model reference guide</a>. A new type can be an object type, a renaming of another type, an array type, or a multiset type. An object type can be defined as being either open or closed. Instances of a closed object type are not permitted to contain fields other than those specified in the CREATE TYPE statement. Instances of an open object type may carry additional fields, and open is the default for new types if neither option is specified.</p>
<p>The following example creates a new object type called GleambookUserType. Since it is defined as (defaulting to) being an open type, instances will be permitted to contain more than what is specified in the type definition. The first four fields are essentially traditional typed name/value pairs (much like SQL fields). The friendIds field is a multiset of integers. The employment field is an array of instances of another named object type, EmploymentType.</p>
3785<div class="section">
3786<div class="section">
3787<h5><a name="Example"></a>Example</h5>
3788
3789<div class="source">
3790<div class="source">
3791<pre>CREATE TYPE GleambookUserType AS {
3792 id: int,
3793 alias: string,
3794 name: string,
3795 userSince: datetime,
3796 friendIds: {{ int }},
3797 employment: [ EmploymentType ]
3798};
3799</pre></div></div>
3800<p>The next example creates a new object type, closed this time, called MyUserTupleType. Instances of this closed type will not be permitted to have extra fields, although the alias field is marked as optional and may thus be NULL or MISSING in legal instances of the type. Note that the type of the id field in the example is UUID. This field type can be used if you want to have this field be an autogenerated-PK field. (Refer to the Datasets section later for more details on such fields.)</p></div>
3801<div class="section">
3802<h5><a name="Example"></a>Example</h5>
3803
3804<div class="source">
3805<div class="source">
3806<pre>CREATE TYPE MyUserTupleType AS CLOSED {
3807 id: uuid,
3808 alias: string?,
3809 name: string
3810};
3811</pre></div></div></div></div></div>
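<p>The GleambookUserType example refers to a type named EmploymentType, which therefore has to be defined as well. This section does not show its definition, so the sketch below is only one plausible shape; the field names and types are assumptions chosen for illustration. The <tt>?</tt> suffix marks <tt>endDate</tt> as optional.</p>

<div class="source">
<div class="source">
<pre>/* Hypothetical definition of the type referenced by GleambookUserType. */
CREATE TYPE EmploymentType AS {
    organizationName: string,
    startDate: date,
    endDate: date?
};
</pre></div></div>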
3812<div class="section">
3813<h3><a name="Datasets" id="Datasets"> Datasets</a></h3>
3814
3815<div class="source">
3816<div class="source">
3817<pre>DatasetSpecification ::= ( &lt;INTERNAL&gt; )? &lt;DATASET&gt; QualifiedName &quot;(&quot; QualifiedName &quot;)&quot; IfNotExists
3818 PrimaryKey ( &lt;ON&gt; Identifier )? ( &lt;HINTS&gt; Properties )?
3819 ( &quot;USING&quot; &quot;COMPACTION&quot; &quot;POLICY&quot; CompactionPolicy ( Configuration )? )?
3820 ( &lt;WITH&gt; &lt;FILTER&gt; &lt;ON&gt; Identifier )?
3821 |
3822 &lt;EXTERNAL&gt; &lt;DATASET&gt; QualifiedName &quot;(&quot; QualifiedName &quot;)&quot; IfNotExists &lt;USING&gt; AdapterName
3823 Configuration ( &lt;HINTS&gt; Properties )?
3824 ( &lt;USING&gt; &lt;COMPACTION&gt; &lt;POLICY&gt; CompactionPolicy ( Configuration )? )?
3825AdapterName ::= Identifier
3826Configuration ::= &quot;(&quot; ( KeyValuePair ( &quot;,&quot; KeyValuePair )* )? &quot;)&quot;
3827KeyValuePair ::= &quot;(&quot; StringLiteral &quot;=&quot; StringLiteral &quot;)&quot;
3828Properties ::= ( &quot;(&quot; Property ( &quot;,&quot; Property )* &quot;)&quot; )?
3829Property ::= Identifier &quot;=&quot; ( StringLiteral | IntegerLiteral )
3830FunctionSignature ::= FunctionOrTypeName &quot;@&quot; IntegerLiteral
3831PrimaryKey ::= &lt;PRIMARY&gt; &lt;KEY&gt; NestedField ( &quot;,&quot; NestedField )* ( &lt;AUTOGENERATED&gt; )?
3832CompactionPolicy ::= Identifier
3833</pre></div></div>
<p>The CREATE DATASET statement is used to create a new dataset. Datasets are named multisets of object type instances; they are where data lives persistently and are the usual targets for SQL++ queries. Datasets are typed, and the system ensures that their contents conform to their type definitions. An Internal dataset (the default kind) is a dataset whose content lives within and is managed by the system. It is required to have a specified primary key that uniquely identifies the contained objects. (The primary key is also used in secondary indexes to identify the indexed primary data objects.)</p>
3835<p>Internal datasets contain several advanced options that can be specified when appropriate. One such option is that random primary key (UUID) values can be auto-generated by declaring the field to be UUID and putting &#x201c;AUTOGENERATED&#x201d; after the &#x201c;PRIMARY KEY&#x201d; identifier. In this case, unlike other non-optional fields, a value for the auto-generated PK field should not be provided at insertion time by the user since each object&#x2019;s primary key field value will be auto-generated by the system.</p>
<p>Another advanced option, when creating an Internal dataset, is to specify the merge policy that controls which of the underlying LSM storage components are merged. (The system supports Log-Structured Merge tree based physical storage for Internal datasets.) Currently the system supports four different component merging policies that can be chosen per dataset: no-merge, constant, prefix, and correlated-prefix. The no-merge policy simply never merges disk components. The constant policy merges disk components when the number of components reaches a constant number k that can be configured by the user. The prefix policy relies on both component sizes and the number of components to decide which components to merge. It works by first trying to identify the smallest ordered (oldest to newest) sequence of components such that the sequence does not contain a single component that exceeds some threshold size M and that either the sum of the components&#x2019; sizes exceeds M or the number of components in the sequence exceeds another threshold C. If such a sequence exists, the components in the sequence are merged together to form a single component. Finally, the correlated-prefix policy is similar to the prefix policy, but it delegates the decision of merging the disk components of all the indexes in a dataset to the primary index. When the correlated-prefix policy decides that the primary index needs to be merged (using the same decision criteria as for the prefix policy), it issues successive merge requests on behalf of all other indexes associated with the same dataset. The system&#x2019;s default policy is the prefix policy, except when there is a filter on a dataset, in which case the preferred policy is correlated-prefix.</p>
<p>Another advanced option shown in the syntax above, also related to performance, is that a <b>filter</b> can optionally be created on a field to further optimize range queries with predicates on the filter&#x2019;s field. Filters allow some range queries to avoid searching all LSM components when the query conditions match the filter. (Refer to <a href="../filters.html">Filter-Based LSM Index Acceleration</a> for more information about filters.)</p>
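<p>To make the two clauses concrete, the following sketch follows the DatasetSpecification grammar above. The type and dataset names (<tt>TimedMessageType</tt>, <tt>TimedMessages</tt>, <tt>ArchivedMessages</tt>) and their fields are hypothetical, and the optional policy configuration is left out because its parameter names are not listed in this section.</p>

<div class="source">
<div class="source">
<pre>/* Hypothetical type with a declared datetime field to filter on. */
CREATE TYPE TimedMessageType AS {
    messageId: int,
    authorId: int,
    sendTime: datetime,
    message: string
};

/* A filter on sendTime, intended to help range queries on that field. */
CREATE DATASET TimedMessages(TimedMessageType)
    PRIMARY KEY messageId
    WITH FILTER ON sendTime;

/* An explicitly chosen merge policy, relying on its default configuration. */
CREATE DATASET ArchivedMessages(TimedMessageType)
    PRIMARY KEY messageId
    USING COMPACTION POLICY prefix;
</pre></div></div>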
3838<p>An External dataset, in contrast to an Internal dataset, has data stored outside of the system&#x2019;s control. Files living in HDFS or in the local filesystem(s) of a cluster&#x2019;s nodes are currently supported. External dataset support allows SQL++ queries to treat foreign data as though it were stored in the system, making it possible to query &#x201c;legacy&#x201d; file data (for example, Hive data) without having to physically import it. When defining an External dataset, an appropriate adapter type must be selected for the desired external data. (See the <a href="../externaldata.html">Guide to External Data</a> for more information on the available adapters.)</p>
<p>The following example creates an Internal dataset for storing GleambookUserType objects. It specifies that their id field is their primary key.</p>
3840<div class="section">
3841<h4><a name="Example"></a>Example</h4>
3842
3843<div class="source">
3844<div class="source">
3845<pre>CREATE INTERNAL DATASET GleambookUsers(GleambookUserType) PRIMARY KEY id;
3846</pre></div></div>
3847<p>The next example creates another Internal dataset (the default kind when no dataset kind is specified) for storing MyUserTupleType objects. It specifies that the id field should be used as the primary key for the dataset. It also specifies that the id field is an auto-generated field, meaning that a randomly generated UUID value should be assigned to each incoming object by the system. (A user should therefore not attempt to provide a value for this field.) Note that the id field&#x2019;s declared type must be UUID in this case.</p></div>
3848<div class="section">
3849<h4><a name="Example"></a>Example</h4>
3850
3851<div class="source">
3852<div class="source">
3853<pre>CREATE DATASET MyUsers(MyUserTupleType) PRIMARY KEY id AUTOGENERATED;
3854</pre></div></div>
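<p>As an illustration of what the auto-generated key means for writers, the objects inserted below carry no id field; the system is expected to supply a UUID for each of them. This is a sketch only: it uses the INSERT statement from the DML part of this document, and the field values are invented.</p>

<div class="source">
<div class="source">
<pre>/* No id is given; alias is optional in MyUserTupleType and may be omitted. */
INSERT INTO MyUsers ([
    { &quot;alias&quot;: &quot;maggie&quot;, &quot;name&quot;: &quot;Margarita&quot; },
    { &quot;name&quot;: &quot;Isbel&quot; }
]);
</pre></div></div>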
3855<p>The next example creates an External dataset for querying LineItemType objects. The choice of the <tt>hdfs</tt> adapter means that this dataset&#x2019;s data actually resides in HDFS. The example CREATE statement also provides parameters used by the hdfs adapter: the URL and path needed to locate the data in HDFS and a description of the data format.</p></div>
3856<div class="section">
3857<h4><a name="Example"></a>Example</h4>
3858
3859<div class="source">
3860<div class="source">
3861<pre>CREATE EXTERNAL DATASET LineItem(LineItemType) USING hdfs (
3862 (&quot;hdfs&quot;=&quot;hdfs://HOST:PORT&quot;),
3863 (&quot;path&quot;=&quot;HDFS_PATH&quot;),
3864 (&quot;input-format&quot;=&quot;text-input-format&quot;),
3865 (&quot;format&quot;=&quot;delimited-text&quot;),
3866 (&quot;delimiter&quot;=&quot;|&quot;));
3867</pre></div></div></div></div>
3868<div class="section">
3869<h3><a name="Indices" id="Indices">Indices</a></h3>
3870
3871<div class="source">
3872<div class="source">
3873<pre>IndexSpecification ::= &lt;INDEX&gt; Identifier IfNotExists &lt;ON&gt; QualifiedName
3874 &quot;(&quot; ( IndexField ) ( &quot;,&quot; IndexField )* &quot;)&quot; ( &quot;type&quot; IndexType &quot;?&quot;)?
3875 ( (&lt;NOT&gt;)? &lt;ENFORCED&gt; )?
3876IndexType ::= &lt;BTREE&gt; | &lt;RTREE&gt; | &lt;KEYWORD&gt; | &lt;NGRAM&gt; &quot;(&quot; IntegerLiteral &quot;)&quot;
3877</pre></div></div>
3878<p>The CREATE INDEX statement creates a secondary index on one or more fields of a specified dataset. Supported index types include <tt>BTREE</tt> for totally ordered datatypes, <tt>RTREE</tt> for spatial data, and <tt>KEYWORD</tt> and <tt>NGRAM</tt> for textual (string) data. An index can be created on a nested field (or fields) by providing a valid path expression as an index field identifier.</p>
3879<p>An indexed field is not required to be part of the datatype associated with a dataset if the dataset&#x2019;s datatype is declared as open <b>and</b> if the field&#x2019;s type is provided along with its name and if the <tt>ENFORCED</tt> keyword is specified at the end of the index definition. <tt>ENFORCING</tt> an open field introduces a check that makes sure that the actual type of the indexed field (if the optional field exists in the object) always matches this specified (open) field type.</p>
<p>The following example creates a btree index called gbAuthorIdx on the authorId field of the GleambookMessages dataset. This index can be useful for accelerating exact-match queries, range search queries, and joins involving the authorId field.</p>
3881<div class="section">
3882<h4><a name="Example"></a>Example</h4>
3883
3884<div class="source">
3885<div class="source">
3886<pre>CREATE INDEX gbAuthorIdx ON GleambookMessages(authorId) TYPE BTREE;
3887</pre></div></div>
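<p>For illustration, queries of the following shapes are the kind of exact-match and range lookups that an index on authorId is meant to serve. Whether the optimizer actually chooses the index depends on the query and the system, and the literal values here are arbitrary.</p>

<div class="source">
<div class="source">
<pre>/* Exact-match predicate on the indexed field. */
SELECT VALUE m
FROM GleambookMessages m
WHERE m.authorId = 1;

/* Range predicate on the indexed field. */
SELECT VALUE m
FROM GleambookMessages m
WHERE m.authorId &gt;= 1 AND m.authorId &lt; 10;
</pre></div></div>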
3888<p>The following example creates an open btree index called gbSendTimeIdx on the (non-predeclared) sendTime field of the GleambookMessages dataset having datetime type. This index can be useful for accelerating exact-match queries, range search queries, and joins involving the sendTime field. The index is enforced so that records that do not have the &#x201c;sendTime&#x201d; field or have a mismatched type on the field cannot be inserted into the dataset.</p></div>
3889<div class="section">
3890<h4><a name="Example"></a>Example</h4>
3891
3892<div class="source">
3893<div class="source">
3894<pre>CREATE INDEX gbSendTimeIdx ON GleambookMessages(sendTime: datetime?) TYPE BTREE ENFORCED;
3895</pre></div></div>
<p>The following example creates a btree index called crpUserScrNameIdx on screenName, a nested field residing within an object-valued user field in the ChirpMessages dataset. This index can be useful for accelerating exact-match queries, range search queries, and joins involving the nested screenName field. Such nested fields must be singular, i.e., one cannot index through (or on) an array-valued field.</p></div>
3897<div class="section">
3898<h4><a name="Example"></a>Example</h4>
3899
3900<div class="source">
3901<div class="source">
3902<pre>CREATE INDEX crpUserScrNameIdx ON ChirpMessages(user.screenName) TYPE BTREE;
3903</pre></div></div>
<p>The following example creates an rtree index called gbSenderLocIndex on the sender-location field of the GleambookMessages dataset. This index can be useful for accelerating queries that use the <a href="functions.html#spatial-intersect"><tt>spatial-intersect</tt> function</a> in a predicate involving the sender-location field.</p></div>
3905<div class="section">
3906<h4><a name="Example"></a>Example</h4>
3907
3908<div class="source">
3909<div class="source">
3910<pre>CREATE INDEX gbSenderLocIndex ON GleambookMessages(&quot;sender-location&quot;) TYPE RTREE;
3911</pre></div></div>
<p>The following example creates a 3-gram index called fbUserIdx on the name field of the GleambookUsers dataset. This index can be used to accelerate some similarity or substring matching queries on the name field. For details refer to the document on <a href="similarity.html#NGram_Index">similarity queries</a>.</p></div>
3913<div class="section">
3914<h4><a name="Example"></a>Example</h4>
3915
3916<div class="source">
3917<div class="source">
3918<pre>CREATE INDEX fbUserIdx ON GleambookUsers(name) TYPE NGRAM(3);
3919</pre></div></div>
3920<p>The following example creates a keyword index called fbMessageIdx on the message field of the GleambookMessages dataset. This keyword index can be used to optimize queries with token-based similarity predicates on the message field. For details refer to the document on <a href="similarity.html#Keyword_Index">similarity queries</a>.</p></div>
3921<div class="section">
3922<h4><a name="Example"></a>Example</h4>
3923
3924<div class="source">
3925<div class="source">
3926<pre>CREATE INDEX fbMessageIdx ON GleambookMessages(message) TYPE KEYWORD;
3927</pre></div></div>
3945<p>The following example creates an open btree index called gbReadTimeIdx on the (non-predeclared) readTime field of the GleambookMessages dataset having datetime type. This index can be useful for accelerating exact-match queries, range search queries, and joins involving the <tt>readTime</tt> field. The index is not enforced so that records that do not have the <tt>readTime</tt> field or have a mismatched type on the field can still be inserted into the dataset.</p></div>
3946<div class="section">
3947<h4><a name="Example"></a>Example</h4>
3948
3949<div class="source">
3950<div class="source">
3951<pre>CREATE INDEX gbReadTimeIdx ON GleambookMessages(readTime: datetime?);
3952</pre></div></div>
</div></div>
3970<div class="section">
3971<h3><a name="Functions" id="Functions"> Functions</a></h3>
<p>The CREATE FUNCTION statement creates a <b>named</b> function that can then be used and reused in SQL++ queries. The body of a function can be any SQL++ expression involving the function&#x2019;s parameters.</p>
3973
3974<div class="source">
3975<div class="source">
3976<pre>FunctionSpecification ::= &quot;FUNCTION&quot; FunctionOrTypeName IfNotExists ParameterList &quot;{&quot; Expression &quot;}&quot;
3977</pre></div></div>
3978<p>The following is an example of a CREATE FUNCTION statement which is similar to our earlier DECLARE FUNCTION example. It differs from that example in that it results in a function that is persistently registered by name in the specified dataverse (the current dataverse being used, if not otherwise specified).</p>
3979<div class="section">
3980<div class="section">
3981<h5><a name="Example"></a>Example</h5>
3982
3983<div class="source">
3984<div class="source">
3985<pre>CREATE FUNCTION friendInfo(userId) {
3986 (SELECT u.id, u.name, len(u.friendIds) AS friendCount
3987 FROM GleambookUsers u
3988 WHERE u.id = userId)[0]
3989 };
3990</pre></div></div></div></div></div>
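<p>Once registered, such a function can be called like any other function. A minimal usage sketch for the <tt>friendInfo</tt> function defined above, with an arbitrary argument value:</p>

<div class="source">
<div class="source">
<pre>/* Returns the id, name and friendCount object for the user with id 1, if any. */
SELECT VALUE friendInfo(1);
</pre></div></div>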
3991<div class="section">
3992<h3><a name="Removal" id="Removal"> Removal</a></h3>
3993
3994<div class="source">
3995<div class="source">
3996<pre>DropStatement ::= &quot;DROP&quot; ( &quot;DATAVERSE&quot; Identifier IfExists
3997 | &quot;TYPE&quot; FunctionOrTypeName IfExists
3998 | &quot;DATASET&quot; QualifiedName IfExists
3999 | &quot;INDEX&quot; DoubleQualifiedName IfExists
4000 | &quot;FUNCTION&quot; FunctionSignature IfExists )
4001IfExists ::= ( &quot;IF&quot; &quot;EXISTS&quot; )?
4002</pre></div></div>
4003<p>The DROP statement in SQL++ is the inverse of the CREATE statement. It can be used to drop dataverses, datatypes, datasets, indexes, and functions.</p>
4004<p>The following examples illustrate some uses of the DROP statement.</p>
4005<div class="section">
4006<div class="section">
4007<h5><a name="Example"></a>Example</h5>
4008
4009<div class="source">
4010<div class="source">
4011<pre>DROP DATASET GleambookUsers IF EXISTS;
4012
4013DROP INDEX GleambookMessages.gbSenderLocIndex;
4014
4015DROP TYPE TinySocial2.GleambookUserType;
4016
4017DROP FUNCTION friendInfo@1;
4018
4019DROP DATAVERSE TinySocial;
4020</pre></div></div>
<p>When an artifact is dropped, it will be dropped from the current dataverse if none is specified (see the DROP DATASET example above) or from the specified dataverse (see the DROP TYPE example above) if one is specified by fully qualifying the artifact name in the DROP statement. When specifying an index to drop, the index name must be qualified by the dataset that it indexes. When specifying a function to drop, since SQL++ allows functions to be overloaded by their number of arguments, the identifying name of the function to be dropped must explicitly include that information. (<tt>friendInfo@1</tt> above denotes the 1-argument function named friendInfo in the current dataverse.)</p></div></div></div>
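<p>The qualification rules can also be combined with IF EXISTS. The following sketch is derived from the DropStatement grammar above and names each artifact together with its dataverse; it assumes the artifacts were created in a dataverse called TinySocial.</p>

<div class="source">
<div class="source">
<pre>/* Index: dataverse, then dataset, then index name (DoubleQualifiedName). */
DROP INDEX TinySocial.GleambookMessages.gbAuthorIdx IF EXISTS;

/* Dataset: dataverse, then dataset name (QualifiedName). */
DROP DATASET TinySocial.GleambookMessages IF EXISTS;

/* Function: qualified name plus the number of arguments. */
DROP FUNCTION TinySocial.friendInfo@1 IF EXISTS;
</pre></div></div>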
4022<div class="section">
4023<h3><a name="Load_Statement"></a><a name="Load_statement" id="Load_statement">Load Statement</a></h3>
4024
4025<div class="source">
4026<div class="source">
4027<pre>LoadStatement ::= &lt;LOAD&gt; &lt;DATASET&gt; QualifiedName &lt;USING&gt; AdapterName Configuration ( &lt;PRE-SORTED&gt; )?
4028</pre></div></div>
4029<p>The LOAD statement is used to initially populate a dataset via bulk loading of data from an external file. An appropriate adapter must be selected to handle the nature of the desired external data. The LOAD statement accepts the same adapters and the same parameters as discussed earlier for External datasets. (See the <a href="externaldata.html">guide to external data</a> for more information on the available adapters.) If a dataset has an auto-generated primary key field, the file to be imported should not include that field in it.</p>
4030<p>The following example shows how to bulk load the GleambookUsers dataset from an external file containing data that has been prepared in ADM (Asterix Data Model) format.</p>
4031<div class="section">
4032<div class="section">
4033<h5><a name="Example"></a>Example</h5>
4034
4035<div class="source">
4036<div class="source">
4037<pre> LOAD DATASET GleambookUsers USING localfs
4038 ((&quot;path&quot;=&quot;127.0.0.1:///Users/bignosqlfan/tinysocialnew/gbu.adm&quot;),(&quot;format&quot;=&quot;adm&quot;));
4039</pre></div></div>
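<p>The grammar above also admits an optional PRE-SORTED keyword after the adapter configuration, which the example does not exercise. A hedged sketch follows; it reuses the path from the example and assumes, since this section does not spell it out, that the keyword tells the system the file is already ordered by the dataset&#x2019;s primary key.</p>

<div class="source">
<div class="source">
<pre> LOAD DATASET GleambookUsers USING localfs
    ((&quot;path&quot;=&quot;127.0.0.1:///Users/bignosqlfan/tinysocialnew/gbu.adm&quot;),(&quot;format&quot;=&quot;adm&quot;)) PRE-SORTED;
</pre></div></div>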
4040<!-- ! Licensed to the Apache Software Foundation (ASF) under one
4041 ! or more contributor license agreements. See the NOTICE file
4042 ! distributed with this work for additional information
4043 ! regarding copyright ownership. The ASF licenses this file
4044 ! to you under the Apache License, Version 2.0 (the
4045 ! "License"); you may not use this file except in compliance
4046 ! with the License. You may obtain a copy of the License at
4047 !
4048 ! http://www.apache.org/licenses/LICENSE-2.0
4049 !
4050 ! Unless required by applicable law or agreed to in writing,
4051 ! software distributed under the License is distributed on an
4052 ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
4053 ! KIND, either express or implied. See the License for the
4054 ! specific language governing permissions and limitations
4055 ! under the License.
4056 ! --></div></div></div></div>
4057<div class="section">
4058<h2><a name="Modification_statements" id="Modification_statements">Modification statements</a></h2>
4059<div class="section">
4060<h3><a name="INSERTs"></a><a name="Inserts" id="Inserts">INSERTs</a></h3>
4061
4062<div class="source">
4063<div class="source">
4064<pre>InsertStatement ::= &lt;INSERT&gt; &lt;INTO&gt; QualifiedName Query
4065</pre></div></div>
4066<p>The SQL++ INSERT statement is used to insert new data into a dataset. The data to be inserted comes from a SQL++ query expression. This expression can be as simple as a constant expression, or in general it can be any legal SQL++ query. If the target dataset has an auto-generated primary key field, the insert statement should not include a value for that field in it. (The system will automatically extend the provided object with this additional field and a corresponding value.) Insertion will fail if the dataset already has data with the primary key value(s) being inserted.</p>
<p>Inserts are processed transactionally by the system. The transactional scope of each insert transaction is the insertion of a single object plus its affiliated secondary index entries (if any). If the query part of an insert returns a single object, then the INSERT statement will be a single, atomic transaction. If the query part returns multiple objects, each object being inserted will be treated as a separate transaction. The following example illustrates a query-based insertion.</p>
4068<div class="section">
4069<div class="section">
4070<h5><a name="Example"></a>Example</h5>
4071
4072<div class="source">
4073<div class="source">
4074<pre>INSERT INTO UsersCopy (SELECT VALUE user FROM GleambookUsers user)
4075</pre></div></div></div></div></div>
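<p>The query part can also be a constant expression, as noted above. The sketch below inserts one literal object; the field values are made up, and it assumes that the type of UsersCopy requires no fields beyond <tt>id</tt> and <tt>name</tt>. Because the query yields a single object, the statement would run as one atomic transaction, and it would fail if an object with <tt>id</tt> 1001 already existed.</p>

<div class="source">
<div class="source">
<pre>-- Constant-expression insert of a single (hypothetical) object
INSERT INTO UsersCopy ({&quot;id&quot;: 1001, &quot;name&quot;: &quot;SampleUser&quot;});
</pre></div></div>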
4076<div class="section">
4077<h3><a name="UPSERTs"></a><a name="Upserts" id="Upserts">UPSERTs</a></h3>
4078
4079<div class="source">
4080<div class="source">
4081<pre>UpsertStatement ::= &lt;UPSERT&gt; &lt;INTO&gt; QualifiedName Query
4082</pre></div></div>
4083<p>The SQL++ UPSERT statement syntactically mirrors the INSERT statement discussed above. The difference lies in its semantics, which for UPSERT are &#x201c;add or replace&#x201d; instead of the INSERT &#x201c;add if not present, else error&#x201d; semantics. Whereas an INSERT can fail if another object already exists with the specified key, the analogous UPSERT will replace the previous object&#x2019;s value with that of the new object in such cases.</p>
4084<p>The following example illustrates a query-based upsert operation.</p>
4085<div class="section">
4086<div class="section">
4087<h5><a name="Example"></a>Example</h5>
4088
4089<div class="source">
4090<div class="source">
4091<pre>UPSERT INTO UsersCopy (SELECT VALUE user FROM GleambookUsers user)
4092</pre></div></div>
<p>Editor&#x2019;s note: Upserts currently work in AQL but are not yet enabled in SQL++.</p></div></div></div>
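<p>To make the &#x201c;add or replace&#x201d; semantics concrete, here is a minimal sketch that assumes UPSERT is enabled in the deployment at hand and that UsersCopy is keyed on <tt>id</tt>; the object values are hypothetical. The first statement adds the object, and the second replaces it, where a second INSERT with the same key would instead fail.</p>

<div class="source">
<div class="source">
<pre>-- No object with id 1001 exists yet: the object is added
UPSERT INTO UsersCopy ({&quot;id&quot;: 1001, &quot;name&quot;: &quot;SampleUser&quot;});

-- Same key again: the previous object is replaced
UPSERT INTO UsersCopy ({&quot;id&quot;: 1001, &quot;name&quot;: &quot;RenamedUser&quot;});
</pre></div></div>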
4094<div class="section">
4095<h3><a name="DELETEs"></a><a name="Deletes" id="Deletes">DELETEs</a></h3>
4096
4097<div class="source">
4098<div class="source">
4099<pre>DeleteStatement ::= &lt;DELETE&gt; &lt;FROM&gt; QualifiedName ( ( &lt;AS&gt; )? Variable )? ( &lt;WHERE&gt; Expression )?
4100</pre></div></div>
4101<p>The SQL++ DELETE statement is used to delete data from a target dataset. The data to be deleted is identified by a boolean expression involving the variable bound to the target dataset in the DELETE statement.</p>
4102<p>Deletes are processed transactionally by the system. The transactional scope of each delete transaction is the deletion of a single object plus its affiliated secondary index entries (if any). If the boolean expression for a delete identifies a single object, then the DELETE statement itself will be a single, atomic transaction. If the expression identifies multiple objects, then each object deleted will be handled as a separate transaction.</p>
4103<p>The following examples illustrate single-object deletions.</p>
4104<div class="section">
4105<div class="section">
4106<h5><a name="Example"></a>Example</h5>
4107
4108<div class="source">
4109<div class="source">
4110<pre>DELETE FROM GleambookUsers user WHERE user.id = 8;
4111</pre></div></div></div>
4112<div class="section">
4113<h5><a name="Example"></a>Example</h5>
4114
4115<div class="source">
4116<div class="source">
4117<pre>DELETE FROM GleambookUsers WHERE id = 5;
4118</pre></div></div>
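<p>For contrast, a predicate that can match several objects gives a multi-object deletion. The sketch below assumes, as in the join examples elsewhere in this document, that GleambookMessages has an <tt>authorId</tt> field. Each matching message would be deleted in its own transaction, so the statement as a whole is not atomic.</p>

<div class="source">
<div class="source">
<pre>-- May delete many objects; each deletion is a separate transaction
DELETE FROM GleambookMessages m WHERE m.authorId = 8;
</pre></div></div>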
<h2><a name="Reserved_keywords" id="Reserved_keywords">Appendix 1. Reserved keywords</a></h2>
4154<p>All reserved keywords are listed in the following table:</p>
4155
4156<table border="0" class="table table-striped">
4157 <thead>
4158
4159<tr class="a">
4160
4161<th> </th>
4162
4163<th> </th>
4164
4165<th> </th>
4166
4167<th> </th>
4168
4169<th> </th>
4170
4171<th> </th>
4172 </tr>
4173 </thead>
4174 <tbody>
4175
4176<tr class="b">
4177
4178<td>AND </td>
4179
4180<td>ANY </td>
4181
4182<td>APPLY </td>
4183
4184<td>AS </td>
4185
4186<td>ASC </td>
4187
4188<td>AT </td>
4189 </tr>
4190
4191<tr class="a">
4192
4193<td>AUTOGENERATED </td>
4194
4195<td>BETWEEN </td>
4196
4197<td>BTREE </td>
4198
4199<td>BY </td>
4200
4201<td>CASE </td>
4202
4203<td>CLOSED </td>
4204 </tr>
4205
4206<tr class="b">
4207
4208<td>CREATE </td>
4209
4210<td>COMPACTION </td>
4211
4212<td>COMPACT </td>
4213
4214<td>CONNECT </td>
4215
4216<td>CORRELATE </td>
4217
4218<td>DATASET </td>
4219 </tr>
4220
<tr class="a">
<td>COLLECTION </td>
<td>DATAVERSE </td>
<td>DECLARE </td>
<td>DEFINITION </td>
<td>DELETE </td>
<td>DESC </td>
 </tr>
<tr class="b">
<td>DISCONNECT </td>
<td>DISTINCT </td>
<td>DROP </td>
<td>ELEMENT </td>
<td>EXPLAIN </td>
<td>ELSE </td>
 </tr>
<tr class="a">
<td>ENFORCED </td>
<td>END </td>
<td>EVERY </td>
<td>EXCEPT </td>
<td>EXISTS </td>
<td>EXTERNAL </td>
 </tr>
<tr class="b">
<td>FEED </td>
<td>FILTER </td>
<td>FLATTEN </td>
<td>FOR </td>
<td>FROM </td>
<td>FULL </td>
 </tr>
<tr class="a">
<td>FUNCTION </td>
<td>GROUP </td>
<td>HAVING </td>
<td>HINTS </td>
<td>IF </td>
<td>INTO </td>
 </tr>
<tr class="b">
<td>IN </td>
<td>INDEX </td>
<td>INGESTION </td>
<td>INNER </td>
<td>INSERT </td>
<td>INTERNAL </td>
 </tr>
<tr class="a">
<td>INTERSECT </td>
<td>IS </td>
<td>JOIN </td>
<td>KEYWORD </td>
<td>LEFT </td>
<td>LETTING </td>
 </tr>
<tr class="b">
<td>LET </td>
<td>LIKE </td>
<td>LIMIT </td>
<td>LOAD </td>
<td>NODEGROUP </td>
<td>NGRAM </td>
 </tr>
<tr class="a">
<td>NOT </td>
<td>OFFSET </td>
<td>ON </td>
<td>OPEN </td>
<td>OR </td>
<td>ORDER </td>
 </tr>
<tr class="b">
<td>OUTER </td>
<td>OUTPUT </td>
<td>PATH </td>
<td>POLICY </td>
<td>PRE-SORTED </td>
<td>PRIMARY </td>
 </tr>
<tr class="a">
<td>RAW </td>
<td>REFRESH </td>
<td>RETURN </td>
<td>RTREE </td>
<td>RUN </td>
<td>SATISFIES </td>
 </tr>
<tr class="b">
<td>SECONDARY </td>
<td>SELECT </td>
<td>SET </td>
<td>SOME </td>
<td>TEMPORARY </td>
<td>THEN </td>
 </tr>
<tr class="a">
<td>TYPE </td>
<td>UNKNOWN </td>
<td>UNNEST </td>
<td>UPDATE </td>
<td>USE </td>
<td>USING </td>
 </tr>
<tr class="b">
<td>VALUE </td>
<td>WHEN </td>
<td>WHERE </td>
<td>WITH </td>
<td>WRITE </td>
<td> </td>
 </tr>
4445 </tbody>
4446</table>
</div></div></div></div>
4464<div class="section">
4465<h2><a name="Appendix_2._Performance_Tuning"></a><a name="Performance_tuning" id="Performance_tuning">Appendix 2. Performance Tuning</a></h2>
4483<p>The SET statement can be used to override some cluster-wide configuration parameters for a specific request:</p>
4484
4485<div class="source">
4486<div class="source">
4487<pre>SET &lt;IDENTIFIER&gt; &lt;STRING_LITERAL&gt;
4488</pre></div></div>
<p>Because parameter identifiers are qualified names (containing a &#x2018;.&#x2019;), they must be escaped using backticks (``). Note that changing query parameters does not affect query correctness; it only affects performance characteristics such as response time and throughput.</p></div>
4490<div class="section">
4491<h2><a name="Parallelism_Parameter"></a><a name="Parallelism_parameter" id="Parallelism_parameter">Parallelism Parameter</a></h2>
4492<p>The system can execute each request using multiple cores on multiple machines (a.k.a., partitioned parallelism) in a cluster. A user can manually specify the maximum execution parallelism for a request to scale it up and down using the following parameter:</p>
4493
4494<ul>
4495
4496<li>
<p><b>compiler.parallelism</b>: the maximum number of CPU cores that can be used to process a query. There are three cases for the value <i>p</i> of compiler.parallelism:</p>
4498
<ul>

<li><i>p</i> &lt; 0 or <i>p</i> &gt; the total number of cores in a cluster: the system will use all available cores in the cluster;</li>

<li><i>p</i> = 0 (the default): the system will use the storage parallelism (the number of partitions of stored datasets) as the maximum parallelism for query processing;</li>

<li>all other cases: the system will use the user-specified number as the maximum number of CPU cores to use for executing the query.</li>
 </ul></li>
4513</ul>
4514<div class="section">
4515<div class="section">
4516<div class="section">
4517<h5><a name="Example"></a>Example</h5>
4518
4519<div class="source">
4520<div class="source">
4521<pre>SET `compiler.parallelism` &quot;16&quot;;
4522
4523SELECT u.name AS uname, m.message AS message
4524FROM GleambookUsers u JOIN GleambookMessages m ON m.authorId = u.id;
4525</pre></div></div></div></div></div></div>
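<p>The first case listed above (<i>p</i> &lt; 0) can be sketched the same way. Assuming a negative value is written as a string literal like any other setting, the request below would ask the system to use all available cores in the cluster for this one query.</p>

<div class="source">
<div class="source">
<pre>-- p &lt; 0: use all available cores for this request
SET `compiler.parallelism` &quot;-1&quot;;

SELECT msg.authorId, COUNT(*)
FROM GleambookMessages msg
GROUP BY msg.authorId;
</pre></div></div>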
4526<div class="section">
4527<h2><a name="Memory_Parameters"></a><a name="Memory_parameters" id="Memory_parameters">Memory Parameters</a></h2>
<p>In the system, each blocking runtime operator, such as join, group-by, and order-by, works within a fixed memory budget and can gracefully spill to disk if that budget is smaller than the amount of data the operator has to hold. A user can manually configure the memory budget of those operators within a query. The supported configurable memory parameters are:</p>
4529
4530<ul>
4531
4532<li>
4533<p><b>compiler.groupmemory</b>: the memory budget that each parallel group-by operator instance can use; 32MB is the default budget.</p></li>
4534
4535<li>
4536<p><b>compiler.sortmemory</b>: the memory budget that each parallel sort operator instance can use; 32MB is the default budget.</p></li>
4537
4538<li>
4539<p><b>compiler.joinmemory</b>: the memory budget that each parallel hash join operator instance can use; 32MB is the default budget.</p></li>
4540</ul>
4541<p>For each memory budget value, you can use a 64-bit integer value with a 1024-based binary unit suffix (for example, B, KB, MB, GB). If there is no user-provided suffix, &#x201c;B&#x201d; is the default suffix. See the following examples.</p>
4542<div class="section">
4543<div class="section">
4544<div class="section">
4545<h5><a name="Example"></a>Example</h5>
4546
4547<div class="source">
4548<div class="source">
4549<pre>SET `compiler.groupmemory` &quot;64MB&quot;;
4550
4551SELECT msg.authorId, COUNT(*)
4552FROM GleambookMessages msg
4553GROUP BY msg.authorId;
4554</pre></div></div></div>
4555<div class="section">
4556<h5><a name="Example"></a>Example</h5>
4557
4558<div class="source">
4559<div class="source">
4560<pre>SET `compiler.sortmemory` &quot;67108864&quot;;
4561
4562SELECT VALUE user
4563FROM GleambookUsers AS user
4564ORDER BY ARRAY_LENGTH(user.friendIds) DESC;
4565</pre></div></div></div>
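<p>Because the unit suffixes are 1024-based, 64MB equals 64 &#xd7; 1024 &#xd7; 1024 = 67108864 bytes, so the sort-memory example above requests the same budget as the group-by example before it. Under the suffix rule stated earlier, the three settings below should therefore be interchangeable; this is a sketch derived from that rule rather than from tested behavior.</p>

<div class="source">
<div class="source">
<pre>-- Three spellings of the same 64MB budget
SET `compiler.sortmemory` &quot;64MB&quot;;
SET `compiler.sortmemory` &quot;65536KB&quot;;   -- 65536 x 1024 bytes
SET `compiler.sortmemory` &quot;67108864&quot;;  -- no suffix, so bytes
</pre></div></div>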
4566<div class="section">
4567<h5><a name="Example"></a>Example</h5>
4568
4569<div class="source">
4570<div class="source">
4571<pre>SET `compiler.joinmemory` &quot;132000KB&quot;;
4572
4573SELECT u.name AS uname, m.message AS message
4574FROM GleambookUsers u JOIN GleambookMessages m ON m.authorId = u.id;
4575</pre></div></div></div></div></div></div>
4576 </div>
4577 </div>
4578 </div>
4579
4580 <hr/>
4581
4582 <footer>
4583 <div class="container-fluid">
4584 <div class="row span12">Copyright &copy; 2018
4585 <a href="https://www.apache.org/">The Apache Software Foundation</a>.
4586 All Rights Reserved.
4587
4588 </div>
4589
4591<div class="row-fluid">Apache AsterixDB, AsterixDB, Apache, the Apache
4592 feather logo, and the Apache AsterixDB project logo are either
4593 registered trademarks or trademarks of The Apache Software
4594 Foundation in the United States and other countries.
4595 All other marks mentioned may be trademarks or registered
4596 trademarks of their respective owners.</div>
4597
4598
4599 </div>
4600 </footer>
4601 </body>
4602</html>