finish markdown conversion
diff --git a/asterix-doc/src/site/markdown/AccessingExternalDataInAsterixDB.md b/asterix-doc/src/site/markdown/AccessingExternalDataInAsterixDB.md
index 8e8de3f..7e49a0f 100644
--- a/asterix-doc/src/site/markdown/AccessingExternalDataInAsterixDB.md
+++ b/asterix-doc/src/site/markdown/AccessingExternalDataInAsterixDB.md
@@ -1,4 +1,4 @@
-`<wiki:toc max_depth="4" />`
+# Accessing External Data in AsterixDB #
## Introduction ##
Data that needs to be processed by ASTERIX could be residing outside ASTERIX storage. Examples include data files on a distributed file system such as HDFS or on the local file system of a machine that is part of an ASTERIX cluster. For ASTERIX to process such data, end-user may create a regular dataset in ASTERIX (a.k.a. internal dataset) and load the dataset with the data. ASTERIX supports ''external datasets'' so that it is not necessary to “load” all data prior to using it. This also avoids creating multiple copies of data and the need to keep the copies in sync.
@@ -8,9 +8,9 @@
### Creating an External Dataset ###
-As an example we consider the Lineitem dataset from [http://www.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSTPCHLinkedData/tpch.sql TPCH schema].
+As an example we consider the Lineitem dataset from [TPCH schema](http://www.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSTPCHLinkedData/tpch.sql).
-We assume that you have successfully created an ASTERIX instance following the instructions at [InstallingAsterixUsingManagix Installing Asterix Using Managix].
+We assume that you have successfully created an ASTERIX instance following the instructions at [Installing Asterix Using Managix](InstallingAsterixUsingManagix.html).
_For constructing an example, we assume a single machine setup._
Similar to a regular dataset, an external dataset has an associated datatype. We shall first create the datatype associated with each record in Lineitem data.
@@ -46,7 +46,7 @@
Earlier, we assumed a single machine ASTERIX setup. To satisfy the prerequisite, log-in to the machine running ASTERIX.
- * Download the [https://code.google.com/p/asterixdb/downloads/detail?name=lineitem.tbl&can=2&q= data file] to an appropriate location. We denote this location by SOURCE_PATH.
+ * Download the [data file](https://code.google.com/p/asterixdb/downloads/detail?name=lineitem.tbl&can=2&q=) to an appropriate location. We denote this location by SOURCE_PATH.
ASTERIX provides a built-in adapter for data residing on the local file system. The adapter is referred by its alias- 'localfs'. We create an external dataset named Lineitem and use the 'localfs' adapter.
@@ -102,7 +102,7 @@
In your web-browser, navigate to 127.0.0.1 and paste the above to the query text box. Finally hit 'Execute'.
-Next we move over to the the section [#Writing_queries_against_an_External_Dataset Writing Queries against an External Dataset] and try a sample query against the external dataset.
+Next we move over to the section [Writing Queries against an External Dataset](#Writing_Queries_against_an_External_Dataset) and try a sample query against the external dataset.
#### 2) Data file resides on an HDFS instance ####
Pre-requisite: It is required that the Namenode and atleast one of the HDFS Datanodes are reachable from the hosts that form the ASTERIX cluster. ASTERIX provides a built-in adapter for data residing on HDFS. The HDFS adapter is referred (in AQL) by its alias - 'hdfs'. We create an external dataset named Lineitem and associate the HDFS adapter with it.
@@ -150,7 +150,7 @@
*format*:
The parameter 'format' refers to the type of the data contained in the file. For example data contained in a file could be in json, ADM format or could be delimited-text with fields separated by a delimiting character.
-As an example. consider the [https://code.google.com/p/asterixdb/downloads/detail?name=lineitem.tbl&can=2&q= data file]. The file is a text file with each line representing a record. The fields in each record are separated by the '|' character.
+As an example, consider the [data file](https://code.google.com/p/asterixdb/downloads/detail?name=lineitem.tbl&can=2&q=). The file is a text file with each line representing a record. The fields in each record are separated by the '|' character.
We assume the HDFS URL to be hdfs://host:port. We further assume that the example data file is copied to the HDFS at a path denoted by HDFS_PATH.
diff --git a/asterix-doc/src/site/markdown/AsterixAlphaRelease.md b/asterix-doc/src/site/markdown/AsterixAlphaRelease.md
index c97f43f..c93c271 100644
--- a/asterix-doc/src/site/markdown/AsterixAlphaRelease.md
+++ b/asterix-doc/src/site/markdown/AsterixAlphaRelease.md
@@ -11,7 +11,7 @@
The ASTERIX effort has been targeting a wide range of semi-structured information, ranging from "data" use cases---where information is well-typed and highly regular---to "content" use cases---where data tends to be irregular, much of each datum may be textual, and the ultimate schema for the various data types involved may be hard to anticipate up front.
The ASTERIX project has been addressing technical issues including highly scalable data storage and indexing, semi-structured query processing on very large clusters, and merging time-tested parallel database techniques with modern data-intensive computing techniques to support performant yet declarative solutions to the problem of storing and analyzing semi-structured information effectively.
The first fruits of this labor have been captured in the AsterixDB system that is now being released in preliminary or "Alpha" release form.
-We are hoping that the arrival of AsterixDB will mark the beginning of the "BDMS era", and we hope that both the Big Data community and the database community will find the AsterixDB system to be interesting and useful for a much broader class of problems than can be addressed with any one of today's current Big Data platforms and related technologies (e.g., Hadoop, Pig, Hive, HBase, Cassandra, and so on). One of our project mottos has been "one size fits a bunch"---at least that has been our aim. For more information about the research effort that led to the birth of AsterixDB, please refer to our NSF project web site: http://asterix.ics.uci.edu/.
+We are hoping that the arrival of AsterixDB will mark the beginning of the "BDMS era", and we hope that both the Big Data community and the database community will find the AsterixDB system to be interesting and useful for a much broader class of problems than can be addressed with any one of today's current Big Data platforms and related technologies (e.g., Hadoop, Pig, Hive, HBase, Cassandra, and so on). One of our project mottos has been "one size fits a bunch"---at least that has been our aim. For more information about the research effort that led to the birth of AsterixDB, please refer to our NSF project web site: [http://asterix.ics.uci.edu/](http://asterix.ics.uci.edu/).
In a nutshell, AsterixDB is a full-function BDMS with a rich feature set that distinguishes it from pretty much any other Big Data platform that's out and available today. We believe that its feature set makes it well-suited to modern needs such as web data warehousing and social data storage and analysis. AsterixDB has:
@@ -32,25 +32,25 @@
For the Alpha release, we've got a start; for the Beta release a month or so from now, we will hopefully have much more.
The following is a list of the wiki pages and supporting documents that we have available today:
-1. InstallingAsterixUsingManagix :
+1. [InstallingAsterixUsingManagix](InstallingAsterixUsingManagix.html) :
This is our installation guide, and it is where you should start.
-This document will tell you how to obtain, install, and manage instances of [https://asterixdb.googlecode.com/files/asterix-installer-0.0.4-binary-assembly.zip AsterixDB], including both single-machine setup (for developers) as well as cluster installations (for deployment in its intended form).
+This document will tell you how to obtain, install, and manage instances of [AsterixDB](https://asterixdb.googlecode.com/files/asterix-installer-0.0.4-binary-assembly.zip), including both single-machine setup (for developers) as well as cluster installations (for deployment in its intended form).
-2. AdmAql101 :
+2. [AdmAql101](AdmAql101.html) :
This is a first-timers introduction to the user model of the AsterixDB BDMS, by which we mean the view of AsterixDB as seen from the perspective of an "average user" or Big Data application developer.
The AsterixDB user model consists of its data modeling features (ADM) and its query capabilities (AQL).
This document presents a tiny "social data warehousing" example and uses it as a backdrop for describing, by example, the key features of AsterixDB.
By working through this document, you will learn how to define the artifacts needed to manage data in AsterixDB, how to load data into the system, how to use most of the basic features of its query language, and how to insert and delete data dynamically.
-3. AsterixDataTypesAndFunctions :
+3. [AsterixDataTypesAndFunctions](AsterixDataTypesAndFunctions.html) :
This is a reference document that catalogs the primitive data types and built-in functions available for use in AsterixDB schemas (in ADM) and queries (in AQL).
-4. [https://asterixdb.googlecode.com/files/AQL_Syntax.html AQL Grammar] :
+4. [AQL Grammar](https://asterixdb.googlecode.com/files/AQL_Syntax.html) :
This is a temporary placeholder for a future AQL language reference manual.
It offers a hyperlinked, auto-generated BNF specification of the AQL language syntax.
Our hope is that the combination of documents 2-4 will suffice to enable our adventurous Alpha users to learn and make use of AQL.
-5. [https://code.google.com/p/asterixdb/wiki/AsterixDBRestAPI AsterixDBRestAPI] :
+5. [AsterixDBRestAPI](https://code.google.com/p/asterixdb/wiki/AsterixDBRestAPI) :
Access to data in an AsterixDB instance is provided via a REST-based API.
This is a short document that describes the REST API entry points and their URL syntax.
diff --git a/asterix-doc/src/site/markdown/AsterixDBRestAPI.md b/asterix-doc/src/site/markdown/AsterixDBRestAPI.md
index d0bf8ec..80fbb11 100644
--- a/asterix-doc/src/site/markdown/AsterixDBRestAPI.md
+++ b/asterix-doc/src/site/markdown/AsterixDBRestAPI.md
@@ -1,6 +1,4 @@
-#summary REST API to AsterixDB
-
-`<wiki:toc max_depth="2" />`
+# REST API to AsterixDB #
## DDL API ##
@@ -10,8 +8,18 @@
Parameters:
-|| Parameter || Description || Required? ||
-|| ddl || String containing DDL statements to modify Metadata || Yes ||
+<table>
+<tr>
+ <td>Parameter</td>
+ <td>Description</td>
+ <td>Required?</td>
+</tr>
+<tr>
+ <td>ddl</td>
+ <td>String containing DDL statements to modify Metadata</td>
+ <td>Yes</td>
+</tr>
+</table>
This call does not return any result. If the operations were successful, HTTP OK status code is returned.
@@ -34,10 +42,10 @@
API call for the above DDL statements in the URL-encoded form.
-[http://localhost:19101/ddl?ddl=drop%20dataverse%20company%20if%20exists;create%20dataverse%20company;use%20dataverse%20company;create%20type%20Emp%20as%20open%20{id%20:%20int32,name%20:%20string};create%20dataset%20Employee(Emp)%20primary%20key%20id;]
+[http://localhost:19101/ddl?ddl=drop%20dataverse%20company%20if%20exists;create%20dataverse%20company;use%20dataverse%20company;create%20type%20Emp%20as%20open%20{id%20:%20int32,name%20:%20string};create%20dataset%20Employee(Emp)%20primary%20key%20id;](http://localhost:19101/ddl?ddl=drop%20dataverse%20company%20if%20exists;create%20dataverse%20company;use%20dataverse%20company;create%20type%20Emp%20as%20open%20{id%20:%20int32,name%20:%20string};create%20dataset%20Employee(Emp)%20primary%20key%20id;)
#### Response ####
-*HTTP OK 200* `<br />`
+*HTTP OK 200*
`<NO PAYLOAD>`
## Update API ##
@@ -48,8 +56,18 @@
Parameters:
-|| Parameter || Description || Required? ||
-|| statements || String containing update (insert/delete) statements to execute || Yes ||
+<table>
+<tr>
+ <td>Parameter</td>
+ <td>Description</td>
+ <td>Required?</td>
+</tr>
+<tr>
+ <td>statements</td>
+ <td>String containing update (insert/delete) statements to execute</td>
+ <td>Yes</td>
+</tr>
+</table>
This call does not return any result. If the operations were successful, HTTP OK status code is returned.
@@ -65,11 +83,11 @@
API call for the above update statement in the URL-encoded form.
-[http://localhost:19101/update?statements=use%20dataverse%20company;insert%20into%20dataset%20Employee({%20%22id%22:123,%22name%22:%22John%20Doe%22});]
+[http://localhost:19101/update?statements=use%20dataverse%20company;insert%20into%20dataset%20Employee({%20%22id%22:123,%22name%22:%22John%20Doe%22});](http://localhost:19101/update?statements=use%20dataverse%20company;insert%20into%20dataset%20Employee({%20%22id%22:123,%22name%22:%22John%20Doe%22});)
#### Response ####
-*HTTP OK 200* `<br />`
-`<NO PAYLOAD>` `<br />`
+*HTTP OK 200*
+`<NO PAYLOAD>`
## Query API ##
@@ -79,9 +97,23 @@
Parameters:
-|| Parameter || Description || Required? ||
-|| query || Query string to pass to ASTERIX for execution || Yes ||
-|| mode || Indicate if call should be synchronous or asynchronous. mode = synchronous blocks the call until results are available; mode = asynchronous returns immediately with a handle that can be used later to check the query’s status and to fetch results when available || No. default mode = synchronous ||
+<table>
+<tr>
+ <td>Parameter</td>
+ <td>Description</td>
+ <td>Required?</td>
+</tr>
+<tr>
+ <td>query</td>
+ <td>Query string to pass to ASTERIX for execution</td>
+ <td>Yes</td>
+</tr>
+<tr>
+ <td>mode</td>
+ <td>Indicate if call should be synchronous or asynchronous. mode = synchronous blocks the call until results are available; mode = asynchronous returns immediately with a handle that can be used later to check the query’s status and to fetch results when available</td>
+ <td>No. default mode = synchronous</td>
+</tr>
+</table>
Result: The result is returned as a JSON object as follows
@@ -105,10 +137,10 @@
API call for the above query statement in the URL-encoded form.
-[http://localhost:19101/query?query=use%20dataverse%20company;for%20$l%20in%20dataset('Employee')%20return%20$l;]
+[http://localhost:19101/query?query=use%20dataverse%20company;for%20$l%20in%20dataset('Employee')%20return%20$l;](http://localhost:19101/query?query=use%20dataverse%20company;for%20$l%20in%20dataset('Employee')%20return%20$l;)
#### Response ####
-*HTTP OK 200* `<br />`
+*HTTP OK 200*
Payload
@@ -125,10 +157,10 @@
API call for the above query statement in the URL-encoded form with mode=asynchronous
-[http://localhost:19101/query?query=use+dataverse+company%3B%0A%0Afor+%24l+in+dataset%28%27Employee%27%29+return+%24l%3B%0A&mode=asynchronous]
+[http://localhost:19101/query?query=use%20dataverse%20company;for%20$l%20in%20dataset('Employee')%20return%20$l;&mode=asynchronous](http://localhost:19101/query?query=use%20dataverse%20company;for%20$l%20in%20dataset('Employee')%20return%20$l;&mode=asynchronous)
#### Response ####
-*HTTP OK 200* `<br />`
+*HTTP OK 200*
Payload
@@ -145,8 +177,18 @@
Parameters:
-|| Parameter || Description || Required? ||
-|| handle || Result handle that was returned by a previous call to a /query call with mode = asynchronous || Yes ||
+<table>
+<tr>
+ <td>Parameter</td>
+ <td>Description</td>
+ <td>Required?</td>
+</tr>
+<tr>
+ <td>handle</td>
+ <td>Result handle that was returned by a previous call to a /query call with mode = asynchronous</td>
+ <td>Yes</td>
+</tr>
+</table>
Result: The result is returned as a JSON object as follows:
@@ -173,10 +215,10 @@
API call for reading results from the previous asynchronous query in the URL-encoded form.
-[http://localhost:19101/query/result?handle=%7B%22handle%22%3A+%5B45%2C+0%5D%7D]
+[http://localhost:19101/query/result?handle=%7B%22handle%22%3A+%5B45%2C+0%5D%7D](http://localhost:19101/query/result?handle=%7B%22handle%22%3A+%5B45%2C+0%5D%7D)
#### Response ####
-*HTTP OK 200* `<br />`
+*HTTP OK 200*
Payload
@@ -197,8 +239,18 @@
Parameters:
-|| Parameter || Description || Required? ||
-|| handle || Result handle that was returned by a previous call to a /query call with mode = asynchronous || Yes ||
+<table>
+<tr>
+ <td>Parameter</td>
+ <td>Description</td>
+ <td>Required?</td>
+</tr>
+<tr>
+ <td>handle</td>
+ <td>Result handle that was returned by a previous call to a /query call with mode = asynchronous</td>
+ <td>Yes</td>
+</tr>
+</table>
Result: The result is returned as a JSON object as follows:
@@ -213,7 +265,21 @@
Table of error codes and their types:
-|| Code || Type ||
-|| 1 || Invalid statement ||
-|| 2 || Parse failures ||
-|| 99 || Uncategorized error ||
+<table>
+<tr>
+ <td>Code</td>
+ <td>Type</td>
+</tr>
+<tr>
+ <td>1</td>
+ <td>Invalid statement</td>
+</tr>
+<tr>
+ <td>2</td>
+ <td>Parse failures</td>
+</tr>
+<tr>
+ <td>99</td>
+ <td>Uncategorized error</td>
+</tr>
+</table>
diff --git a/asterix-doc/src/site/markdown/AsterixDataTypes.md b/asterix-doc/src/site/markdown/AsterixDataTypes.md
index 01d0f75..6674006 100644
--- a/asterix-doc/src/site/markdown/AsterixDataTypes.md
+++ b/asterix-doc/src/site/markdown/AsterixDataTypes.md
@@ -1,7 +1,3 @@
-#summary Asterix Data Types
-
-`<wiki:toc max_depth="4" />`
-
# Asterix Data Model (ADM) #
# Basic data types #
@@ -43,25 +39,6 @@
{ "int8": 125i8, "int16": 32765i16, "int32": 294967295, "int64": 1700000000000000000i64 }
-`<wiki:comment>`
-### UInt8 / UInt16 / UInt32 / UInt64 ###
-Unsigned integer types using 8, 16, 32, or 64 bits.
-
- * Example:
-
- let $v8 := uint8("125")
- let $v16 := uint16("32765")
- let $v32 := uint32("4294967295")
- let $v64 := uint64("1700000000000000000")
- return { "int8": $v8, "int16": $v16, "int32": $v32, "int64": $v64}
-
-
- * The expected result is:
-
- { "int8": 125i8, "int16": 32765i16, "int32": 4294967295i64, "int64": 1700000000000000000i64 }
-
-`</wiki:comment>`
-
### Float ###
`Float` represents approximate numeric data values using 4 bytes.
@@ -249,7 +226,7 @@
Negative durations are also supported for the arithmetic operations between time instance types (`Date`, `Time` and `Datetime`), and is used to roll the time back for the given duration. For example `date("2012-01-01") + duration("-P3D")` will return `date("2011-12-29")`.
-Note that a canonical representation of the duration is always returned, regardless whether the duration is in the canonical representation or not from the user's input. More information about canonical representation can be found from [http://www.w3.org/TR/xpath-functions/#canonical-dayTimeDuration XPath dayTimeDuration Canonical Representation] and [http://www.w3.org/TR/xpath-functions/#canonical-yearMonthDuration yearMonthDuration Canonical Representation].
+Note that a canonical representation of the duration is always returned, regardless whether the duration is in the canonical representation or not from the user's input. More information about canonical representation can be found from [XPath dayTimeDuration Canonical Representation](http://www.w3.org/TR/xpath-functions/#canonical-dayTimeDuration) and [yearMonthDuration Canonical Representation](http://www.w3.org/TR/xpath-functions/#canonical-yearMonthDuration).
* Example:
@@ -290,7 +267,7 @@
{ "id": 213508, "name": "Alice Bob" }
-### !OrderedList ###
+### OrderedList ###
An `OrderedList` is a sequence of values for which the order is determined by creation or insertion. OrderedList constructors are denoted by brackets: "[...]".
An example would be
@@ -299,7 +276,7 @@
["alice", 123, "bob", null]
-### !UnorderedList ###
+### UnorderedList ###
An `UnorderedList` is an unordered sequence of values, similar to bags in SQL. UnorderedList constructors are denoted by two opening flower braces followed by data and two closing flower braces, like "{{...}}".
An example would be
diff --git a/asterix-doc/src/site/markdown/AsterixDataTypesAndFunctions.md b/asterix-doc/src/site/markdown/AsterixDataTypesAndFunctions.md
index 1dd4dd6..7576648 100644
--- a/asterix-doc/src/site/markdown/AsterixDataTypesAndFunctions.md
+++ b/asterix-doc/src/site/markdown/AsterixDataTypesAndFunctions.md
@@ -1,7 +1,3 @@
-#summary Asterix Data Types and Functions
-
-`<wiki:toc max_depth="4" />`
-
# Asterix Data Model (ADM) #
# Basic data types #
@@ -43,25 +39,6 @@
{ "int8": 125i8, "int16": 32765i16, "int32": 294967295, "int64": 1700000000000000000i64 }
-`<wiki:comment>`
-### UInt8 / UInt16 / UInt32 / UInt64 ###
-Unsigned integer types using 8, 16, 32, or 64 bits.
-
- * Example:
-
- let $v8 := uint8("125")
- let $v16 := uint16("32765")
- let $v32 := uint32("4294967295")
- let $v64 := uint64("1700000000000000000")
- return { "int8": $v8, "int16": $v16, "int32": $v32, "int64": $v64}
-
-
- * The expected result is:
-
- { "int8": 125i8, "int16": 32765i16, "int32": 4294967295i64, "int64": 1700000000000000000i64 }
-
-`</wiki:comment>`
-
### Float ###
`Float` represents approximate numeric data values using 4 bytes.
@@ -249,7 +226,7 @@
Negative durations are also supported for the arithmetic operations between time instance types (`Date`, `Time` and `Datetime`), and is used to roll the time back for the given duration. For example `date("2012-01-01") + duration("-P3D")` will return `date("2011-12-29")`.
-Note that a canonical representation of the duration is always returned, regardless whether the duration is in the canonical representation or not from the user's input. More information about canonical representation can be found from [http://www.w3.org/TR/xpath-functions/#canonical-dayTimeDuration XPath dayTimeDuration Canonical Representation] and [http://www.w3.org/TR/xpath-functions/#canonical-yearMonthDuration yearMonthDuration Canonical Representation].
+Note that a canonical representation of the duration is always returned, regardless whether the duration is in the canonical representation or not from the user's input. More information about canonical representation can be found from [XPath dayTimeDuration Canonical Representation](http://www.w3.org/TR/xpath-functions/#canonical-dayTimeDuration) and [yearMonthDuration Canonical Representation](http://www.w3.org/TR/xpath-functions/#canonical-yearMonthDuration).
* Example:
@@ -290,7 +267,7 @@
{ "id": 213508, "name": "Alice Bob" }
-### !OrderedList ###
+### OrderedList ###
An `OrderedList` is a sequence of values for which the order is determined by creation or insertion. OrderedList constructors are denoted by brackets: "[...]".
An example would be
@@ -299,7 +276,7 @@
["alice", 123, "bob", null]
-### !UnorderedList ###
+### UnorderedList ###
An `UnorderedList` is an unordered sequence of values, similar to bags in SQL. UnorderedList constructors are denoted by two opening flower braces followed by data and two closing flower braces, like "{{...}}".
An example would be
@@ -1228,7 +1205,7 @@
edit-distance(expression1, expression2)
- * Returns the [http://en.wikipedia.org/wiki/Levenshtein_distance edit distance] of `expression1` and `expression2`.
+ * Returns the [edit distance](http://en.wikipedia.org/wiki/Levenshtein_distance) of `expression1` and `expression2`.
* Arguments:
* `expression1` : A `String` or a homogeneous `OrderedList` of a comparable item type.
* `expression2` : The same type as `expression1`.
@@ -1258,7 +1235,7 @@
edit-distance-check(expression1, expression2, threshold)
- * Checks whether `expression1` and `expression2` have a [http://en.wikipedia.org/wiki/Levenshtein_distance edit distance] `<= `threshold`. The “check” version of edit distance is faster than the "non-check" version because the former can detect whether two items satisfy a given similarity threshold using early-termination techniques, as opposed to computing their real distance. Although possible, it is not necessary for the user to write queries using the “check” versions explicitly, since a rewrite rule can perform an appropriate transformation from a “non-check” version to a “check” version.
+ * Checks whether `expression1` and `expression2` have an [edit distance](http://en.wikipedia.org/wiki/Levenshtein_distance) `<= threshold`. The “check” version of edit distance is faster than the "non-check" version because the former can detect whether two items satisfy a given similarity threshold using early-termination techniques, as opposed to computing their real distance. Although possible, it is not necessary for the user to write queries using the “check” versions explicitly, since a rewrite rule can perform an appropriate transformation from a “non-check” version to a “check” version.
* Arguments:
* `expression1` : A `String` or a homogeneous `OrderedList` of a comparable item type.
@@ -1289,7 +1266,7 @@
similarity-jaccard(list_expression1, list_expression2)
- * Returns the [http://en.wikipedia.org/wiki/Jaccard_index Jaccard similarity] of `list_expression1` and `list_expression2`.
+ * Returns the [Jaccard similarity](http://en.wikipedia.org/wiki/Jaccard_index) of `list_expression1` and `list_expression2`.
* Arguments:
* `list_expression1` : An `UnorderedList` or `OrderedList`.
* `list_expression2` : An `UnorderedList` or `OrderedList`.
@@ -1323,7 +1300,7 @@
similarity-jaccard-check(list_expression1, list_expression2, threshold)
- * Checks whether `list_expression1` and `list_expression2` have a [http://en.wikipedia.org/wiki/Jaccard_index Jaccard similarity] >`= `threshold`. Again, the “check” version of Jaccard is faster than the "non-check" version.
+ * Checks whether `list_expression1` and `list_expression2` have a [Jaccard similarity](http://en.wikipedia.org/wiki/Jaccard_index) `>= threshold`. Again, the “check” version of Jaccard is faster than the "non-check" version.
* Arguments:
* `list_expression1` : An `UnorderedList` or `OrderedList`.
diff --git a/asterix-doc/src/site/markdown/AsterixSimilarityQueries.md b/asterix-doc/src/site/markdown/AsterixSimilarityQueries.md
index 8998e41..4f22fef 100644
--- a/asterix-doc/src/site/markdown/AsterixSimilarityQueries.md
+++ b/asterix-doc/src/site/markdown/AsterixSimilarityQueries.md
@@ -1,8 +1,4 @@
-#summary Similarity Queries and Keyword Queries
-
-`<wiki:toc max_depth="3" />`
-
-# AsterixDB Support of Similarity Queries #
+# AsterixDB Support of Similarity Queries #
## Motivation ##
@@ -10,13 +6,13 @@
## Data Types and Similarity Functions ##
-AsterixDB supports various similarity functions, including [http://en.wikipedia.org/wiki/Levenshtein_distance edit distance] (on strings) and [http://en.wikipedia.org/wiki/Jaccard_index Jaccard] (on sets). For instance, in our [https://code.google.com/p/asterixdb/wiki/AdmAql101#ADM:_Modeling_Semistructed_Data_in_AsterixDB TinySocial] example, the `friend-ids` of a Facebook user forms a set of friends, and we can define a similarity between two sets. We can also convert a string to a set of "q-grams" and define the Jaccard similarity between the two sets of two strings. The "q-grams" of a string are its substrings of length "q". For instance, the 3-grams of the string `schwarzenegger` are `sch`, `chw`, `hwa`, ..., `ger`.
+AsterixDB supports various similarity functions, including [edit distance](http://en.wikipedia.org/wiki/Levenshtein_distance) (on strings) and [Jaccard](http://en.wikipedia.org/wiki/Jaccard_index) (on sets). For instance, in our [TinySocial](AdmAql101.html#ADM:_Modeling_Semistructed_Data_in_AsterixDB) example, the `friend-ids` of a Facebook user forms a set of friends, and we can define a similarity between two sets. We can also convert a string to a set of "q-grams" and define the Jaccard similarity between the two sets of two strings. The "q-grams" of a string are its substrings of length "q". For instance, the 3-grams of the string `schwarzenegger` are `sch`, `chw`, `hwa`, ..., `ger`.
-AsterixDB provides [https://code.google.com/p/asterixdb/wiki/AsterixDataTypesAndFunctions#Tokenizing_Functions tokenization functions] to convert strings to sets, and the [https://code.google.com/p/asterixdb/wiki/AsterixDataTypesAndFunctions#Similarity_Functions similarity functions].
+AsterixDB provides [tokenization functions](AsterixDataTypesAndFunctions.html#Tokenizing_Functions) to convert strings to sets, and the [similarity functions](AsterixDataTypesAndFunctions.html#Similarity_Functions).
## Selection Queries ##
-The following [https://code.google.com/p/asterixdb/wiki/AsterixDataTypesAndFunctions#edit-distance query] asks for all the Facebook users whose name is similar to `Suzanna Tilson`, i.e., their edit distance is at most 2.
+The following [query](AsterixDataTypesAndFunctions.html#edit-distance) asks for all the Facebook users whose name is similar to `Suzanna Tilson`, i.e., their edit distance is at most 2.
use dataverse TinySocial;
@@ -27,7 +23,7 @@
return $user
-The following [https://code.google.com/p/asterixdb/wiki/AsterixDataTypesAndFunctions#similarity-jaccard query] asks for all the Facebook users whose set of friend ids is similar to `[1,5,9]`, i.e., their Jaccard similarity is at least 0.6.
+The following [query](AsterixDataTypesAndFunctions.html#similarity-jaccard) asks for all the Facebook users whose set of friend ids is similar to `[1,5,9]`, i.e., their Jaccard similarity is at least 0.6.
use dataverse TinySocial;
@@ -54,7 +50,7 @@
## Fuzzy Join Queries ##
-AsterixDB supports fuzzy joins between two data sets. The following [https://code.google.com/p/asterixdb/wiki/AdmAql101#Query_5_-_Fuzzy_Join query] finds, for each Facebook user, all Twitter users with names "similar" to their name based on the edit distance.
+AsterixDB supports fuzzy joins between two data sets. The following [query](AdmAql101.html#Query_5_-_Fuzzy_Join) finds, for each Facebook user, all Twitter users with names "similar" to their name based on the edit distance.
use dataverse TinySocial;
diff --git a/asterix-doc/src/site/markdown/InstallingAsterixUsingManagix.md b/asterix-doc/src/site/markdown/InstallingAsterixUsingManagix.md
index 7b55ebf..65aebdf 100644
--- a/asterix-doc/src/site/markdown/InstallingAsterixUsingManagix.md
+++ b/asterix-doc/src/site/markdown/InstallingAsterixUsingManagix.md
@@ -1,14 +1,10 @@
-#summary Installation Instructions
-
-`<wiki:toc max_depth="4" />`
-
# Introduction #
This is a quickstart guide for getting ASTERIX running in a distributed environment. This guide also introduces the ASTERIX installer (nicknamed _*Managix*_) and describes how it can be used to create/manage an ASTERIX instance. By following the simple steps described in this guide, you will get a running instance of ASTERIX. You shall be able to use ASTERIX from its Web interface and manage its lifecycle using Managix. This document assumes that you are running some version of _*Linux*_ or _*MacOS X*_.
## Prerequisites for Installing ASTERIX ##
Prerequisite:
- * [http://www.oracle.com/technetwork/java/javase/downloads/index.html JDK7] (Otherwise known as JDK 1.7).
+ * [JDK7](http://www.oracle.com/technetwork/java/javase/downloads/index.html) (Otherwise known as JDK 1.7).
To know the version of Java installed on your system, execute the following:
@@ -23,10 +19,10 @@
If you need to upgrade or install java, please follow the instructions below.
- * For Linux: [http://docs.oracle.com/javase/7/docs/webnotes/install/linux/linux-jdk.html JDK 7 Linux Install]
+ * For Linux: [JDK 7 Linux Install](http://docs.oracle.com/javase/7/docs/webnotes/install/linux/linux-jdk.html)
JDK would be installed at a path under /usr/lib/jvm/jdk-version .
- * For Mac: [http://docs.oracle.com/javase/7/docs/webnotes/install/mac/mac-jdk.html JDK 7 Mac Install]
+ * For Mac: [JDK 7 Mac Install](http://docs.oracle.com/javase/7/docs/webnotes/install/mac/mac-jdk.html)
JDK would be installed at /Library/Java/JavaVirtualMachines/jdk-version/Contents/Home .
The java installation directory is referred as JAVA_HOME. Since we upgraded/installed Java, we need to ensure JAVA_HOME points to the installation directory of JDK 7. Modify your ~/.bash_profile (or ~/.bashrc) and define JAVA_HOME accordingly. After modifying, execute the following:
@@ -75,7 +71,7 @@
$ echo $JAVA_HOME
### Configuring SSH ###
-If SSH is not enabled on your system, please follow the instruction below to enable/install it or else skip to the section [#Configuring_Password-less_SSH Configuring Password-less SSH].
+If SSH is not enabled on your system, please follow the instruction below to enable/install it or else skip to the section [Configuring Password-less SSH](#Configuring_Password-less_SSH).
#### Enabling SSH on Mac ####
The Apple Mac OS X operating system has SSH installed by default but the SSH daemon is not enabled. This means you can’t login remotely or do remote copies until you enable it. To enable it, go to ‘System Preferences’. Under ‘Internet & Networking’ there is a ‘Sharing’ icon. Run that. In the list that appears, check the ‘Remote Login’ option. Also check the "All users" radio button for "Allow access for". This starts the SSH daemon immediately and you can remotely login using your username. The ‘Sharing’ window shows at the bottom the name and IP address to use. You can also find this out using ‘whoami’ and ‘ifconfig’ from the Terminal application.
@@ -100,7 +96,7 @@
RSA key fingerprint is aa:7b:51:90:74:39:c4:f6:28:a2:9d:47:c2:8d:33:31.
Are you sure you want to continue connecting (yes/no)?
-If you are not prompted for a password, that is if you get an output similar to one shown below, skip to the next section [#Configuring_Managix Configuring Managix].
+If you are not prompted for a password, that is if you get an output similar to one shown below, skip to the next section [Configuring Managix](#Configuring_Managix).
$ ssh 127.0.0.1
@@ -152,7 +148,7 @@
The authenticity of host '127.0.0.1 (127.0.0.1)' can't be established.
RSA key fingerprint is aa:7b:51:90:74:39:c4:f6:28:a2:9d:47:c2:8d:33:31.
- Are you sure you want to continue connecting (yes/no)?Â
+ Are you sure you want to continue connecting (yes/no)?
Type 'yes' and press the enter key. You should see an output similar to one shown below.
@@ -174,7 +170,7 @@
Connection to 127.0.0.1 closed.
### Configuring Managix ###
-You will need the ASTERIX installer (a.k.a Managix). Download Managix from [https://asterixdb.googlecode.com/files/asterix-installer-0.0.5-binary-assembly.zip here]; this includes the bits for Managix as well as ASTERIX.
+You will need the ASTERIX installer (a.k.a Managix). Download Managix from [here](https://asterixdb.googlecode.com/files/asterix-installer-0.0.5-binary-assembly.zip); this includes the bits for Managix as well as ASTERIX.
Unzip the Managix zip bundle to an appropriate location. You may create a sub-directory: asterix-mgmt (short for asterix-management) under your home directory. We shall refer to this location as MANAGIX_HOME.
@@ -249,10 +245,10 @@
let $message := 'Hello World!'
return $message
-Press the "Execute" button. If the query result shows on the output box, then Congratulations You have successfully created an ASTERIX instance
+Press the "Execute" button. If the query result shows on the output box, then Congratulations! You have successfully created an ASTERIX instance!
## Section 2: Single-Machine ASTERIX installation (Advanced) ##
-We assume that you have successfully completed the single-machine ASTERIX installation by following the instructions above in section [#Section_1:_Single-Machine_ASTERIX_installation Single Machine ASTERIX installation]. In this section, we shall cover advanced topics related to ASTERIX configuration. Before we proceed, it is imperative to go through some preliminary concepts related to ASTERIX runtime.
+We assume that you have successfully completed the single-machine ASTERIX installation by following the instructions above in section [Single Machine ASTERIX installation](#Section_1:_Single-Machine_ASTERIX_installation). In this section, we shall cover advanced topics related to ASTERIX configuration. Before we proceed, it is imperative to go through some preliminary concepts related to ASTERIX runtime.
### ASTERIX Runtime ###
An ASTERIX runtime comprises of a ''master node'' and a set of ''worker nodes'', each identified by a unique id. The master node runs a ''Cluster Controller'' service (a.k.a. ''CC''), while each worker node runs a ''Node Controller'' service (a.k.a. ''NC''). Please note that a node in an ASTERIX cluster is a logical concept in the sense that multiple nodes may map to a single physical machine, which is the case for a single-machine ASTERIX installation. This association or mapping between an ASTERIX node and a physical machine is captured in a cluster configuration XML file. In addition, the XML file contains properties and parameters associated with each node.
@@ -303,20 +299,54 @@
The following is a description of the different elements in the cluster configuration xml file.
-|| __Property__ || __Description__ ||
-|| id || A unique id for a node. ||
-|| cluster-ip || IP address of the machine to which a node maps to. This address is used for all internal communication between the nodes. ||
-|| client-ip || Provided for the master node. This IP should be reachable from clients that want to connect with ASTERIX via its web interface. ||
+<table>
+<tr>
+ <th>Property</th>
+ <th>Description</th>
+</tr>
+<tr>
+ <td>id</td>
+ <td>A unique id for a node.</td>
+</tr>
+<tr>
+ <td>cluster-ip</td>
+ <td>IP address of the machine to which a node maps to. This address is used for all internal communication between the nodes.</td>
+</tr>
+<tr>
+ <td>client-ip</td>
+ <td>Provided for the master node. This IP should be reachable from clients that want to connect with ASTERIX via its web interface.</td>
+</tr>
+</table>
#### (2) Properties associated with a worker node (NC) in ASTERIX ####
The following is a list of properties associated with each worker node in an ASTERIX configuration.
-|| __Property__ || __Description__ ||
-|| java_home || Java installation directory at each node. ||
-|| java_opts || JVM arguments passed on to the JVM that represents a node. ||
-|| logdir || A directory where worker node may write logs. ||
-|| io_devices || Comma separated list of IO Device mount points. ||
-|| store || A data directory that ASTERIX uses to store data belonging to dataset(s). ||
+<table>
+<tr>
+ <th>Property</th>
+ <th>Description</th>
+</tr>
+<tr>
+ <td>java_home</td>
+ <td>Java installation directory at each node.</td>
+</tr>
+<tr>
+ <td>java_opts</td>
+ <td>JVM arguments passed on to the JVM that represents a node.</td>
+</tr>
+<tr>
+ <td>logdir</td>
+ <td>A directory where worker node may write logs.</td>
+</tr>
+<tr>
+ <td>io_devices</td>
+ <td>Comma separated list of IO Device mount points.</td>
+</tr>
+<tr>
+ <td>store</td>
+ <td>A data directory that ASTERIX uses to store data belonging to dataset(s).</td>
+</tr>
+</table>
All the above properties can be defined at the global level or a local level. In the former case, these properties apply to all the nodes in an ASTERIX configuration. In the latter case, these properties apply only to the node(s) under which they are defined. A property defined at the local level overrides the definition at the global level.
@@ -352,7 +382,7 @@
## Section 3: Installing ASTERIX on a Cluster of Multiple Machines ##
We assume that you have read the two sections above on single-machine ASTERIX setup. Next we explain how to install ASTERIX in a cluster of multiple machines. As an example, we assume we want to setup ASTERIX on a cluster of three machines, in which we use one machine (called machine A) as the master node and two other machines (called machine B and machine C) as the worker nodes, as shown in the following diagram:
-[https://asterixdb.googlecode.com/files/AsterixCluster.png]
+![AsterixCluster](https://asterixdb.googlecode.com/files/AsterixCluster.png)
Notice that each machine has a ''cluster-ip'' address, which is used by these machines for their intra-cluster communication. Meanwhile, the master machine also has a ''client-ip'' address, using which an end-user outside the cluster can communicate with this machine. The reason we differentiate between these two types of IP addresses is that we can have a cluster of machines using a private network. In this case they have internal ip addresses that cannot be used outside the network. In the case all the machines are on a public network, the "client-ip" and "cluster-ip" of the master machine can share the same address.
@@ -360,7 +390,7 @@
### Step (1): Define the ASTERIX cluster ###
-We first log into the master machine as the user "joe". On this machine, download Managix from [https://asterixdb.googlecode.com/files/asterix-installer-0.0.5-binary-assembly.zip here] (save as above), then do the following steps similar to the single-machine case described above:
+We first log into the master machine as the user "joe". On this machine, download Managix from [here](https://asterixdb.googlecode.com/files/asterix-installer-0.0.5-binary-assembly.zip) (same as above), then do the following steps similar to the single-machine case described above:
machineA> cd ~
@@ -503,26 +533,29 @@
machineA> managix create -n rainbow_asterix -c $MANAGIX_HOME/clusters/rainbow/rainbow.xml
-If the response message does not have warning, then Congratulations You have successfully installed Asterix on this cluster of machines
+If the response message does not contain a warning, then Congratulations! You have successfully installed Asterix on this cluster of machines!
-Please refer to the section [#Section_4:_Managing_an_ASTERIX_Instance Managing an ASTERIX instance] for a detailed description on the set of available commands/operations that let you manage the lifecycle of an ASTERIX instance. Note that the output of the commands varies with the cluster definition and may not apply to the cluster specification you built above.
+Please refer to the section [Managing the Lifecycle of an ASTERIX Instance](#Section_4:_Managing_the_Lifecycle_of_an_ASTERIX_Instance) for a detailed description on the set of available commands/operations that let you manage the lifecycle of an ASTERIX instance. Note that the output of the commands varies with the cluster definition and may not apply to the cluster specification you built above.
## Section 4: Managing the Lifecycle of an ASTERIX Instance ##
Now that we have an ASTERIX instance running, let us use Managix to manage the instance's lifecycle. Managix provides the following set of commands/operations:
#### Managix Commands ####
-|| *Command* || *Description* ||
-|| [#Creating_an_ASTERIX_instance create] || Creates a new asterix instance. ||
-|| [#Describe_Command describe] || Describes an existing asterix instance. ||
-|| [#Stop_Command stop] || Stops an asterix instance that is in the ACTIVE state. ||
-|| [#Start_Command start] || Starts an Asterix instance. ||
-|| [#Backup_Command backup] || Creates a backup for an existing Asterix instance. ||
-|| [#Restore_Command restore] || Restores an Asterix instance. ||
-|| [#Delete_Command delete] || Deletes an Asterix instance. ||
-|| [#Configuring_Managix validate] || Validates the installer/cluster configuration. ||
-|| [#Configuring_Managix configure] || Auto generate configuration for an Asterix instance. ||
-|| [#Shutdown_Command shutdown] || Shutdown the installer service. ||
+
+<table>
+<tr><th>Command</th> <th>Description</th></tr>
+<tr><td><a href="#Creating_an_ASTERIX_instance">create</a></td> <td>Creates a new asterix instance.</td></tr>
+<tr><td><a href="#Describe_Command" >describe</a></td> <td>Describes an existing asterix instance.</td></tr>
+<tr><td><a href="#Stop_Command" >stop</a></td> <td>Stops an asterix instance that is in the ACTIVE state.</td></tr>
+<tr><td><a href="#Start_Command" >start</a></td> <td>Starts an Asterix instance.</td></tr>
+<tr><td><a href="#Backup_Command" >backup</a></td> <td>Creates a backup for an existing Asterix instance.</td></tr>
+<tr><td><a href="#Restore_Command" >restore</a></td> <td>Restores an Asterix instance.</td></tr>
+<tr><td><a href="#Delete_Command" >delete</a></td> <td>Deletes an Asterix instance.</td></tr>
+<tr><td><a href="#Configuring_Managix" >validate</a></td> <td>Validates the installer/cluster configuration.</td></tr>
+<tr><td><a href="#Configuring_Managix" >configure</a></td><td>Auto generate configuration for an Asterix instance.</td></tr>
+<tr><td><a href="#Shutdown_Command" >shutdown</a></td> <td>Shutdown the installer service.</td></tr>
+</table>
You may obtain the above listing by simply executing 'managix' :