Add 0.9.1 Documentation
Change-Id: Ib852cde3e959f61fc2c95650535353c0b137c842
Reviewed-on: https://asterix-gerrit.ics.uci.edu/1701
Reviewed-by: Xikui Wang <xkkwww@gmail.com>
diff --git a/docs/0.9.1/aql/fulltext.html b/docs/0.9.1/aql/fulltext.html
new file mode 100644
index 0000000..16ce5d0
--- /dev/null
+++ b/docs/0.9.1/aql/fulltext.html
@@ -0,0 +1,372 @@
+<!DOCTYPE html>
+<!--
+ | Generated by Apache Maven Doxia at 2017-04-24
+ | Rendered using Apache Maven Fluido Skin 1.3.0
+-->
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+ <head>
+ <meta charset="UTF-8" />
+ <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+ <meta name="Date-Revision-yyyymmdd" content="20170424" />
+ <meta http-equiv="Content-Language" content="en" />
+ <title>AsterixDB – AsterixDB Support of Full-text search queries</title>
+ <link rel="stylesheet" href="../css/apache-maven-fluido-1.3.0.min.css" />
+ <link rel="stylesheet" href="../css/site.css" />
+ <link rel="stylesheet" href="../css/print.css" media="print" />
+
+
+ <script type="text/javascript" src="../js/apache-maven-fluido-1.3.0.min.js"></script>
+
+
+
+<script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+ (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+ m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+ })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+ ga('create', 'UA-41536543-1', 'uci.edu');
+ ga('send', 'pageview');</script>
+
+ </head>
+ <body class="topBarDisabled">
+
+
+
+
+ <div class="container-fluid">
+ <div id="banner">
+ <div class="pull-left">
+ <a href=".././" id="bannerLeft">
+ <img src="../images/asterixlogo.png" alt="AsterixDB"/>
+ </a>
+ </div>
+ <div class="pull-right"> </div>
+ <div class="clear"><hr/></div>
+ </div>
+
+ <div id="breadcrumbs">
+ <ul class="breadcrumb">
+
+
+ <li id="publishDate">Last Published: 2017-04-24</li>
+
+
+
+ <li id="projectVersion" class="pull-right">Version: 0.9.1</li>
+
+ <li class="divider pull-right">|</li>
+
+ <li class="pull-right"> <a href="../index.html" title="Documentation Home">
+ Documentation Home</a>
+ </li>
+
+ </ul>
+ </div>
+
+
+ <div class="row-fluid">
+ <div id="leftColumn" class="span3">
+ <div class="well sidebar-nav">
+
+
+ <ul class="nav nav-list">
+ <li class="nav-header">Get Started - Installation</li>
+
+ <li>
+
+ <a href="../ncservice.html" title="Option 1: using NCService">
+ <i class="none"></i>
+ Option 1: using NCService</a>
+ </li>
+
+ <li>
+
+ <a href="../ansible.html" title="Option 2: using Ansible">
+ <i class="none"></i>
+ Option 2: using Ansible</a>
+ </li>
+
+ <li>
+
+ <a href="../aws.html" title="Option 3: using Amazon Web Services">
+ <i class="none"></i>
+ Option 3: using Amazon Web Services</a>
+ </li>
+
+ <li>
+
+ <a href="../yarn.html" title="Option 4: using YARN">
+ <i class="none"></i>
+ Option 4: using YARN</a>
+ </li>
+
+ <li>
+
+ <a href="../install.html" title="Option 5: using Managix (deprecated)">
+ <i class="none"></i>
+ Option 5: using Managix (deprecated)</a>
+ </li>
+ <li class="nav-header">AsterixDB Primer</li>
+
+ <li>
+
+ <a href="../sqlpp/primer-sqlpp.html" title="Option 1: using SQL++">
+ <i class="none"></i>
+ Option 1: using SQL++</a>
+ </li>
+
+ <li>
+
+ <a href="../aql/primer.html" title="Option 2: using AQL">
+ <i class="none"></i>
+ Option 2: using AQL</a>
+ </li>
+ <li class="nav-header">Data Model</li>
+
+ <li>
+
+ <a href="../datamodel.html" title="The Asterix Data Model">
+ <i class="none"></i>
+ The Asterix Data Model</a>
+ </li>
+ <li class="nav-header">Queries - SQL++</li>
+
+ <li>
+
+ <a href="../sqlpp/manual.html" title="The SQL++ Query Language">
+ <i class="none"></i>
+ The SQL++ Query Language</a>
+ </li>
+
+ <li>
+
+ <a href="../sqlpp/builtins.html" title="Builtin Functions">
+ <i class="none"></i>
+ Builtin Functions</a>
+ </li>
+ <li class="nav-header">Queries - AQL</li>
+
+ <li>
+
+ <a href="../aql/manual.html" title="The Asterix Query Language (AQL)">
+ <i class="none"></i>
+ The Asterix Query Language (AQL)</a>
+ </li>
+
+ <li>
+
+ <a href="../aql/builtins.html" title="Builtin Functions">
+ <i class="none"></i>
+ Builtin Functions</a>
+ </li>
+ <li class="nav-header">API/SDK</li>
+
+ <li>
+
+ <a href="../api.html" title="HTTP API">
+ <i class="none"></i>
+ HTTP API</a>
+ </li>
+
+ <li>
+
+ <a href="../csv.html" title="CSV Output">
+ <i class="none"></i>
+ CSV Output</a>
+ </li>
+ <li class="nav-header">Advanced Features</li>
+
+ <li class="active">
+
+ <a href="#"><i class="none"></i>Support of Full-text Queries</a>
+ </li>
+
+ <li>
+
+ <a href="../aql/externaldata.html" title="Accessing External Data">
+ <i class="none"></i>
+ Accessing External Data</a>
+ </li>
+
+ <li>
+
+ <a href="../feeds/tutorial.html" title="Support for Data Ingestion">
+ <i class="none"></i>
+ Support for Data Ingestion</a>
+ </li>
+
+ <li>
+
+ <a href="../udf.html" title="User Defined Functions">
+ <i class="none"></i>
+ User Defined Functions</a>
+ </li>
+
+ <li>
+
+ <a href="../aql/filters.html" title="Filter-Based LSM Index Acceleration">
+ <i class="none"></i>
+ Filter-Based LSM Index Acceleration</a>
+ </li>
+
+ <li>
+
+ <a href="../aql/similarity.html" title="Support of Similarity Queries">
+ <i class="none"></i>
+ Support of Similarity Queries</a>
+ </li>
+ </ul>
+
+
+
+ <hr class="divider" />
+
+ <div id="poweredBy">
+ <div class="clear"></div>
+ <div class="clear"></div>
+ <div class="clear"></div>
+ <a href=".././" title="AsterixDB" class="builtBy">
+ <img class="builtBy" alt="AsterixDB" src="../images/asterixlogo.png" />
+ </a>
+ </div>
+ </div>
+ </div>
+
+
+ <div id="bodyColumn" class="span9" >
+
+ <!-- ! Licensed to the Apache Software Foundation (ASF) under one
+ ! or more contributor license agreements. See the NOTICE file
+ ! distributed with this work for additional information
+ ! regarding copyright ownership. The ASF licenses this file
+ ! to you under the Apache License, Version 2.0 (the
+ ! "License"); you may not use this file except in compliance
+ ! with the License. You may obtain a copy of the License at
+ !
+ ! http://www.apache.org/licenses/LICENSE-2.0
+ !
+ ! Unless required by applicable law or agreed to in writing,
+ ! software distributed under the License is distributed on an
+ ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ ! KIND, either express or implied. See the License for the
+ ! specific language governing permissions and limitations
+ ! under the License.
+ ! --><h1>AsterixDB Support of Full-text search queries</h1>
+<div class="section">
+<h2><a name="Table_of_Contents"></a><a name="toc" id="toc">Table of Contents</a></h2>
+
+<ul>
+
+<li><a href="#Motivation">Motivation</a></li>
+
+<li><a href="#Syntax">Syntax</a></li>
+
+<li><a href="#FulltextIndex">Creating and utilizing a Full-text index</a></li>
+</ul></div>
+<div class="section">
+<h2><a name="Motivation_Back_to_TOC"></a><a name="Motivation" id="Motivation">Motivation</a> <font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
+<p>Full-Text Search (FTS) queries are widely used in applications where users need to find records that satisfy an FTS predicate, i.e., where simple string-based matching is not sufficient. They matter whenever finding documents that contain a certain keyword is crucial. FTS queries differ from substring-matching queries in that an FTS query matches its predicate against whole keywords in the given string, rather than treating the predicate as a sequence of characters. For example, an FTS query for “rain” returns a document only when it contains “rain” as a word. A substring-matching query, however, returns a document whenever it contains “rain” as a substring, so a document with “brain” or “training” would be returned as well.</p></div>
+<div class="section">
+<h2><a name="Syntax_Back_to_TOC"></a><a name="Syntax" id="Syntax">Syntax</a> <font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
+<p>The syntax of AsterixDB FTS follows a portion of the XQuery FullText Search syntax. Two basic forms are as follows:</p>
+
+<div class="source">
+<div class="source">
+<pre> ftcontains(Expression1, Expression2, {FullTextOption})
+ ftcontains(Expression1, Expression2)
+</pre></div></div>
+<p>For example, we can execute the following query to find tweet messages where the <tt>message-text</tt> field includes “voice” as a word. Note that FTS is case-insensitive, so “Voice” and “voice” are treated as the same word.</p>
+
+<div class="source">
+<div class="source">
+<pre> use dataverse TinySocial;
+
+ for $msg in dataset TweetMessages
+ where ftcontains($msg.message-text, "voice", {"mode":"any"})
+ return {"id": $msg.id}
+</pre></div></div>
+<p>The DDL and DML of TinySocial can be found in <a href="primer.html#ADM:_Modeling_Semistructed_Data_in_AsterixDB">ADM: Modeling Semistructured Data in AsterixDB</a>.</p>
+<p>The same query can also be expressed in SQL++.</p>
+
+<div class="source">
+<div class="source">
+<pre> use TinySocial;
+
+ select element {"id":msg.id}
+ from TweetMessages as msg
+ where TinySocial.ftcontains(msg.`message-text`, "voice", {"mode":"any"})
+</pre></div></div>
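+<p>To make the contrast with substring matching concrete, the two AQL predicates below are a minimal sketch. The second one assumes the AQL built-in function <tt>contains</tt> (see <a href="builtins.html">Builtin Functions</a>) for the substring case. The keyword “rain” is only an illustration and is not claimed to occur in the TinySocial data. The first predicate should be satisfied only by messages that contain “rain” as a word, while the second should also be satisfied by messages that contain words such as “brain” or “training”.</p>
+
+<div class="source">
+<div class="source">
+<pre>    /* Full-text match: "rain" must appear as a word. */
+    ... where ftcontains($msg.message-text, "rain")
+
+    /* Substring match (assumed built-in contains): "rain" may appear inside a longer word. */
+    ... where contains($msg.message-text, "rain")
+</pre></div></div>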
+<p><tt>Expression1</tt> is an expression that must evaluate to a string at runtime, as in the above example where <tt>$msg.message-text</tt> is a string field. <tt>Expression2</tt> can be a string, an (un)ordered list of string value(s), or an expression. In the last case, the given expression must evaluate to one of the first two types, i.e., a string value or an (un)ordered list of string value(s).</p>
+<p>The following examples are all valid expressions.</p>
+
+<div class="source">
+<div class="source">
+<pre> ... where ftcontains($msg.message-text, "sound")
+ ... where ftcontains($msg.message-text, "sound", {"mode":"any"})
+ ... where ftcontains($msg.message-text, ["sound", "system"], {"mode":"any"})
+ ... where ftcontains($msg.message-text, {{"speed", "stand", "customization"}}, {"mode":"all"})
+ ... where ftcontains($msg.message-text, let $keyword_list := ["voice", "system"] return $keyword_list, {"mode":"all"})
+ ... where ftcontains($msg.message-text, $keyword_list, {"mode":"any"})
+</pre></div></div>
+<p>In the last example above, <tt>$keyword_list</tt> should evaluate to a string or an (un)ordered list of string value(s).</p>
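+<p>As a minimal sketch of how such a variable might be bound, the query below assumes that <tt>$keyword_list</tt> is defined by a <tt>let</tt> clause placed before the <tt>for</tt> clause. The keyword values are illustrative.</p>
+
+<div class="source">
+<div class="source">
+<pre>    use dataverse TinySocial;
+
+    /* Assumption: the keyword list is bound once, before the for clause. */
+    let $keyword_list := ["sound", "system"]
+    for $msg in dataset TweetMessages
+    where ftcontains($msg.message-text, $keyword_list, {"mode":"any"})
+    return {"id": $msg.id}
+</pre></div></div>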
+<p>The optional last parameter, <tt>FullTextOption</tt>, refines the given FTS request. If you omit it, each option takes its default value. Currently, the only option is <tt>mode</tt>; more options will be added as the FTS feature is extended. Note that <tt>FullTextOption</tt> is a record, so the option(s) must be enclosed in a record <tt>{}</tt>. The <tt>mode</tt> option indicates whether the given FTS query is a conjunctive (AND) or disjunctive (OR) search request. It can be either <tt>“any”</tt> or <tt>“all”</tt>, and the default is <tt>“all”</tt>. If you specify <tt>“any”</tt>, a disjunctive search is conducted. For example, the following query finds documents whose <tt>message-text</tt> field contains “sound” or “system”, so a document is returned if it contains “sound”, “system”, or both.</p>
+
+<div class="source">
+<div class="source">
+<pre> ... where ftcontains($msg.message-text, ["sound", "system"], {"mode":"any"})
+</pre></div></div>
+<p>The other value of the option, <tt>“all”</tt>, specifies a conjunctive search. The following examples find documents whose <tt>message-text</tt> field contains both “sound” and “system”. If a document contains only one of the two keywords, it is not returned.</p>
+
+<div class="source">
+<div class="source">
+<pre> ... where ftcontains($msg.message-text, ["sound", "system"], {"mode":"all"})
+ ... where ftcontains($msg.message-text, ["sound", "system"])
+</pre></div></div>
+<p>Currently, AsterixDB does not support phrase searches, so the following query will not work.</p>
+
+<div class="source">
+<div class="source">
+<pre> ... where ftcontains($msg.message-text, "sound system", {"mode":"any"})
+</pre></div></div>
+<p>As a workaround, the following queries can be used to achieve a roughly similar goal. The difference is that they find documents where <tt>$msg.message-text</tt> contains both “sound” and “system”, but the order and adjacency of “sound” and “system” are not checked, unlike in a phrase search. As a result, the queries below would also return documents with “sound system can be installed.”, “system sound is perfect.”, or “sound is not clear. You may need to install a new system.”</p>
+
+<div class="source">
+<div class="source">
+<pre> ... where ftcontains($msg.message-text, ["sound", "system"], {"mode":"all"})
+ ... where ftcontains($msg.message-text, ["sound", "system"])
+</pre></div></div></div>
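+<p>If order and adjacency matter, one possible refinement is to add a substring check on top of the conjunctive search. The sketch below is an approximation, not a phrase search. It assumes the AQL built-in functions <tt>contains</tt> and <tt>lowercase</tt> (see <a href="builtins.html">Builtin Functions</a>). It would still accept text such as “sound systems”, and it would reject text in which the two words are separated by punctuation or by more than one space.</p>
+
+<div class="source">
+<div class="source">
+<pre>    /* Assumption: contains() and lowercase() are the AQL string built-ins. */
+    ... where ftcontains($msg.message-text, ["sound", "system"], {"mode":"all"})
+          and contains(lowercase($msg.message-text), "sound system")
+</pre></div></div>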
+<div class="section">
+<h2><a name="Creating_and_utilizing_a_Full-text_index_Back_to_TOC"></a><a name="FulltextIndex" id="FulltextIndex">Creating and utilizing a Full-text index</a> <font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
+<p>When there is a full-text index on the field that is being searched, AsterixDB can utilize that index to expedite the execution of an FTS query rather than scanning all records. To create a full-text index, you need to specify the index type as <tt>fulltext</tt> in your DDL statement. For instance, the following DDL statement creates a full-text index on the <tt>TweetMessages.message-text</tt> attribute.</p>
+
+<div class="source">
+<div class="source">
+<pre>create index messageFTSIdx on TweetMessages(message-text) type fulltext;
+</pre></div></div></div>
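+<p>As a sketch of how such an index might be utilized, a query like the following, issued after the index has been created, applies <tt>ftcontains</tt> to the indexed <tt>message-text</tt> field and is therefore a candidate for index-based evaluation. Whether the index is actually chosen is an assumption here, not something this example verifies. The query result should be the same with or without the index.</p>
+
+<div class="source">
+<div class="source">
+<pre>    use dataverse TinySocial;
+
+    /* Assumes messageFTSIdx already exists on TweetMessages(message-text). */
+    for $msg in dataset TweetMessages
+    where ftcontains($msg.message-text, ["sound", "system"], {"mode":"all"})
+    return {"id": $msg.id}
+</pre></div></div>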
+ </div>
+ </div>
+ </div>
+
+ <hr/>
+
+ <footer>
+ <div class="container-fluid">
+ <div class="row span12">Copyright © 2017
+ <a href="https://www.apache.org/">The Apache Software Foundation</a>.
+ All Rights Reserved.
+
+ </div>
+
+<div class="row-fluid">Apache AsterixDB, AsterixDB, Apache, the Apache
+ feather logo, and the Apache AsterixDB project logo are either
+ registered trademarks or trademarks of The Apache Software
+ Foundation in the United States and other countries.
+ All other marks mentioned may be trademarks or registered
+ trademarks of their respective owners.</div>
+
+
+ </div>
+ </footer>
+ </body>
+</html>