<!DOCTYPE html>
<!--
 | Generated by Apache Maven Doxia at 2017-09-14
 | Rendered using Apache Maven Fluido Skin 1.3.0
-->
6<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
7 <head>
8 <meta charset="UTF-8" />
9 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta name="Date-Revision-yyyymmdd" content="20170914" />
    <meta http-equiv="Content-Language" content="en" />
12 <title>AsterixDB &#x2013; Introduction</title>
13 <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
14 <link rel="stylesheet" href="./css/site.css" />
15 <link rel="stylesheet" href="./css/print.css" media="print" />
16
17
18 <script type="text/javascript" src="./js/apache-maven-fluido-1.3.0.min.js"></script>
19
20
21
22<script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
23 (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
24 m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
25 })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
26
27 ga('create', 'UA-41536543-1', 'uci.edu');
28 ga('send', 'pageview');</script>
29
30 </head>
31 <body class="topBarDisabled">
32
33
34
35
36 <div class="container-fluid">
37 <div id="banner">
38 <div class="pull-left">
39 <a href="./" id="bannerLeft">
40 <img src="images/asterixlogo.png" alt="AsterixDB"/>
41 </a>
42 </div>
43 <div class="pull-right"> </div>
44 <div class="clear"><hr/></div>
45 </div>
46
47 <div id="breadcrumbs">
48 <ul class="breadcrumb">
49
50
        <li id="publishDate">Last Published: 2017-09-14</li>

        <li id="projectVersion" class="pull-right">Version: 0.9.2</li>

57 <li class="divider pull-right">|</li>
58
59 <li class="pull-right"> <a href="index.html" title="Documentation Home">
60 Documentation Home</a>
61 </li>
62
63 </ul>
64 </div>
65
66
67 <div class="row-fluid">
68 <div id="leftColumn" class="span3">
69 <div class="well sidebar-nav">
70
71
72 <ul class="nav nav-list">
73 <li class="nav-header">Get Started - Installation</li>
74
75 <li>
76
77 <a href="ncservice.html" title="Option 1: using NCService">
78 <i class="none"></i>
79 Option 1: using NCService</a>
80 </li>
81
82 <li>
83
84 <a href="ansible.html" title="Option 2: using Ansible">
85 <i class="none"></i>
86 Option 2: using Ansible</a>
87 </li>
88
89 <li>
90
91 <a href="aws.html" title="Option 3: using Amazon Web Services">
92 <i class="none"></i>
93 Option 3: using Amazon Web Services</a>
94 </li>
95
96 <li>
97
98 <a href="yarn.html" title="Option 4: using YARN">
99 <i class="none"></i>
100 Option 4: using YARN</a>
101 </li>
102
103 <li class="active">
104
105 <a href="#"><i class="none"></i>Option 5: using Managix (deprecated)</a>
106 </li>
107 <li class="nav-header">AsterixDB Primer</li>
108
109 <li>
110
111 <a href="sqlpp/primer-sqlpp.html" title="Option 1: using SQL++">
112 <i class="none"></i>
113 Option 1: using SQL++</a>
114 </li>
115
116 <li>
117
118 <a href="aql/primer.html" title="Option 2: using AQL">
119 <i class="none"></i>
120 Option 2: using AQL</a>
121 </li>
122 <li class="nav-header">Data Model</li>
123
124 <li>
125
126 <a href="datamodel.html" title="The Asterix Data Model">
127 <i class="none"></i>
128 The Asterix Data Model</a>
129 </li>
130 <li class="nav-header">Queries - SQL++</li>
131
132 <li>
133
134 <a href="sqlpp/manual.html" title="The SQL++ Query Language">
135 <i class="none"></i>
136 The SQL++ Query Language</a>
137 </li>
138
139 <li>
140
141 <a href="sqlpp/builtins.html" title="Builtin Functions">
142 <i class="none"></i>
143 Builtin Functions</a>
144 </li>
145 <li class="nav-header">Queries - AQL</li>
146
147 <li>
148
149 <a href="aql/manual.html" title="The Asterix Query Language (AQL)">
150 <i class="none"></i>
151 The Asterix Query Language (AQL)</a>
152 </li>
153
154 <li>
155
156 <a href="aql/builtins.html" title="Builtin Functions">
157 <i class="none"></i>
158 Builtin Functions</a>
159 </li>
160 <li class="nav-header">API/SDK</li>
161
162 <li>
163
164 <a href="api.html" title="HTTP API">
165 <i class="none"></i>
166 HTTP API</a>
167 </li>
168
169 <li>
170
171 <a href="csv.html" title="CSV Output">
172 <i class="none"></i>
173 CSV Output</a>
174 </li>
175 <li class="nav-header">Advanced Features</li>
176
177 <li>
178
179 <a href="aql/fulltext.html" title="Support of Full-text Queries">
180 <i class="none"></i>
181 Support of Full-text Queries</a>
182 </li>
183
184 <li>
185
186 <a href="aql/externaldata.html" title="Accessing External Data">
187 <i class="none"></i>
188 Accessing External Data</a>
189 </li>
190
191 <li>
192
193 <a href="feeds/tutorial.html" title="Support for Data Ingestion">
194 <i class="none"></i>
195 Support for Data Ingestion</a>
196 </li>
197
198 <li>
199
200 <a href="udf.html" title="User Defined Functions">
201 <i class="none"></i>
202 User Defined Functions</a>
203 </li>
204
205 <li>
206
207 <a href="aql/filters.html" title="Filter-Based LSM Index Acceleration">
208 <i class="none"></i>
209 Filter-Based LSM Index Acceleration</a>
210 </li>
211
212 <li>
213
214 <a href="aql/similarity.html" title="Support of Similarity Queries">
215 <i class="none"></i>
216 Support of Similarity Queries</a>
217 </li>
218 </ul>
219
220
221
222 <hr class="divider" />
223
224 <div id="poweredBy">
225 <div class="clear"></div>
226 <div class="clear"></div>
227 <div class="clear"></div>
228 <a href="./" title="AsterixDB" class="builtBy">
229 <img class="builtBy" alt="AsterixDB" src="images/asterixlogo.png" />
230 </a>
231 </div>
232 </div>
233 </div>
234
235
236 <div id="bodyColumn" class="span9" >
237
238 <!-- ! Licensed to the Apache Software Foundation (ASF) under one
239 ! or more contributor license agreements. See the NOTICE file
240 ! distributed with this work for additional information
241 ! regarding copyright ownership. The ASF licenses this file
242 ! to you under the Apache License, Version 2.0 (the
243 ! "License"); you may not use this file except in compliance
244 ! with the License. You may obtain a copy of the License at
245 !
246 ! http://www.apache.org/licenses/LICENSE-2.0
247 !
248 ! Unless required by applicable law or agreed to in writing,
249 ! software distributed under the License is distributed on an
250 ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
251 ! KIND, either express or implied. See the License for the
252 ! specific language governing permissions and limitations
253 ! under the License.
254 ! --><h1>Introduction</h1>
255<div class="section">
256<h2><a name="Table_of_Contents"></a><a name="toc" id="toc">Table of Contents</a></h2>
257
258<ul>
259
260<li><a href="#PrerequisitesForInstallingAsterixDB">Prerequisites for Installing AsterixDB</a></li>
261
262<li><a href="#Section1SingleMachineAsterixDBInstallation">Section 1: Single-Machine AsterixDB installation</a></li>
263
264<li><a href="#Section2SingleMachineAsterixDBInstallationAdvanced">Section 2: Single-Machine AsterixDB installation (Advanced)</a></li>
265
266<li><a href="#Section3InstallingAsterixDBOnAClusterOfMultipleMachines">Section 3: Installing AsterixDB on a Cluster of Multiple Machines</a></li>
267
268<li><a href="#Section4ManagingTheLifecycleOfAnAsterixDBInstance">Section 4: Managing the Lifecycle of an AsterixDB Instance</a></li>
269
270<li><a href="#Section5FAQ">Section 5: Frequently Asked Questions</a></li>
271</ul>
<p>This is a quickstart guide for getting AsterixDB running in a distributed environment. It also introduces the AsterixDB installer (nicknamed <i>Managix</i>) and describes how to use it to create and manage an AsterixDB instance. By following the steps in this guide, you will get a running instance of AsterixDB that you can use from its Web interface and whose lifecycle you can manage with Managix. This document assumes that you are running some version of <i>Linux</i> or <i>Mac OS X</i>.</p></div>
273<div class="section">
274<h2><a name="Prerequisites_for_Installing_AsterixDB_Back_to_TOC"></a><a name="PrerequisitesForInstallingAsterixDB" id="PrerequisitesForInstallingAsterixDB">Prerequisites for Installing AsterixDB</a> <font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
275<p>Prerequisite:</p>
276
277<ul>
278
279<li><a class="externalLink" href="http://www.oracle.com/technetwork/java/javase/downloads/index.html">JDK&gt;=8</a>.</li>
280</ul>
<p>To check which version of Java is installed on your system, execute the following:</p>
282
283<div class="source">
284<div class="source">
285<pre>$ java -version
286</pre></div></div>
287<p>If your version is at least 1.8.0_x, similar to the output shown below, you are good to proceed.</p>
288
289<div class="source">
290<div class="source">
291<pre>java version &quot;1.8.0_60&quot;
292Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
293Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
294</pre></div></div>
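<p>The version check above can also be scripted. The following is a minimal sketch and not part of the AsterixDB distribution: the function names <tt>java_major</tt>, <tt>java_is_supported</tt>, and <tt>installed_java_version</tt> are hypothetical helpers, and the parsing assumes that <tt>java -version</tt> prints a quoted version string as in the sample output.</p>

```shell
#!/usr/bin/env bash
# Sketch: extract the major Java version from a version string.
# Legacy strings look like "1.8.0_60" (major = 8); newer ones like "11.0.2" (major = 11).
java_major() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;          # drop the legacy "1." prefix
  esac
  printf '%s\n' "${v%%[._-]*}"   # keep everything before the first '.', '_' or '-'
}

# Succeed if the given version string denotes Java 8 or newer.
java_is_supported() {
  [ "$(java_major "$1")" -ge 8 ]
}

# Read the quoted version from `java -version`, which prints to stderr.
installed_java_version() {
  java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

<p>For example, <tt>java_is_supported &quot;$(installed_java_version)&quot;</tt> succeeds when the installed Java is version 8 or newer.</p>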
<p>If you need to upgrade or install Java, please follow <a class="externalLink" href="http://docs.oracle.com/javase/8/docs/technotes/guides/install/install_overview.html">Oracle&#x2019;s instructions</a>. The typical installation directory depends on the platform:</p>

<ul>

<li>
<p>On Linux, it is a path of the form <tt>/usr/lib/jvm/[jdk-version]</tt>.</p></li>

<li>
<p>On Mac, it is <tt>/Library/Java/JavaVirtualMachines/[jdk-version]/Contents/Home</tt>.</p></li>
</ul>
<p>The Java installation directory is referred to as <tt>JAVA_HOME</tt>. After upgrading or installing Java, ensure that <tt>JAVA_HOME</tt> points to the installation directory of the JDK. Modify your ~/.bash_profile (or ~/.bashrc) and define <tt>JAVA_HOME</tt> accordingly. After the modification, execute the following:</p>
306
307<div class="source">
308<div class="source">
309<pre>$ java -version
310</pre></div></div>
<p>If the version information you obtain does not show 1.8 or later, you need to update the PATH variable. To do so, execute the following (use ~/.bashrc instead of ~/.bash_profile if that is the file you are using):</p>

<div class="source">
<div class="source">
<pre>$ echo 'export PATH=$JAVA_HOME/bin:$PATH' &gt;&gt; ~/.bash_profile
$ source ~/.bash_profile
317</pre></div></div></div>
318<div class="section">
319<h2><a name="Section_1:_Single-Machine_AsterixDB_installation_Back_to_TOC"></a><a name="Section1SingleMachineAsterixDBInstallation" id="Section1SingleMachineAsterixDBInstallation">Section 1: Single-Machine AsterixDB installation</a> <font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
320<p>We assume a user called &#x201c;Joe&#x201d; with a home directory as /home/joe. On a Mac, the home directory for user Joe would be /Users/joe.</p>
321<div class="section">
322<h3><a name="Configuring_Environment"></a>Configuring Environment</h3>
<p>Ensure that the <tt>JAVA_HOME</tt> variable is defined and points to the Java installation directory on your machine. To verify, execute the following:</p>
324
325<div class="source">
326<div class="source">
327<pre>$ echo $JAVA_HOME
328</pre></div></div>
<p>If you do not see any output, <tt>JAVA_HOME</tt> is not defined. Add the following line to your profile located at /home/joe/.bash_profile or /home/joe/.bashrc, whichever you are using. If you have neither of these files, create a ~/.bash_profile file.</p>
330
331<div class="source">
332<div class="source">
333<pre>export JAVA_HOME=&lt;Path to Java installation directory&gt;
334</pre></div></div>
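<p>Before relying on a value for <tt>JAVA_HOME</tt>, it can help to confirm that the directory really contains a Java executable. The sketch below is illustrative only: <tt>is_java_home</tt> and <tt>report_java_home</tt> are hypothetical helper names, and the only assumption is that a Java installation has an executable <tt>bin/java</tt> under its root.</p>

```shell
#!/usr/bin/env bash
# Sketch: succeed only if the directory looks like a Java installation,
# i.e. it contains an executable bin/java.
is_java_home() {
  [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Report the state of JAVA_HOME in the current shell.
report_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not defined"
  elif is_java_home "$JAVA_HOME"; then
    echo "JAVA_HOME looks valid: $JAVA_HOME"
  else
    echo "JAVA_HOME is set but has no bin/java: $JAVA_HOME"
  fi
}
```

<p>Running <tt>report_java_home</tt> after sourcing your profile gives a slightly more informative answer than <tt>echo $JAVA_HOME</tt> alone.</p>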
<p>After you have edited ~/.bash_profile (or ~/.bashrc), execute the following to make the changes effective in the current shell (substitute /home/joe/.bashrc if that is the file you edited):</p>

<div class="source">
<div class="source">
<pre>$ source /home/joe/.bash_profile
340</pre></div></div>
341<p>Before proceeding, verify that <tt>JAVA_HOME</tt> is defined by executing the following:</p>
342
343<div class="source">
344<div class="source">
345<pre>$ echo $JAVA_HOME
346</pre></div></div></div>
347<div class="section">
348<h3><a name="Configuring_SSH"></a>Configuring SSH</h3>
<p>If SSH is not enabled on your system, follow the instructions below to install and enable it; otherwise, skip to the section <a href="#Configuring_Password-less_SSH">Configuring Password-less SSH</a>.</p>
350<div class="section">
351<h4><a name="Enabling_SSH_on_Mac"></a>Enabling SSH on Mac</h4>
<p>Mac OS X has SSH installed by default, but the SSH daemon is not enabled. This means you cannot log in remotely or do remote copies until you enable it. To enable it, go to &#x2018;System Preferences&#x2019;. Under &#x2018;Internet &amp; Networking&#x2019; there is a &#x2018;Sharing&#x2019; icon. Open it. In the list that appears, check the &#x2018;Remote Login&#x2019; option. Also select the &#x201c;All users&#x201d; radio button for &#x201c;Allow access for&#x201d;. This starts the SSH daemon immediately, and you can log in remotely using your username. The bottom of the &#x2018;Sharing&#x2019; window shows the name and IP address to use. You can also find these using &#x2018;whoami&#x2019; and &#x2018;ifconfig&#x2019; from the Terminal application.</p></div>
353<div class="section">
354<h4><a name="Enabling_SSH_on_Linux"></a>Enabling SSH on Linux</h4>
355
356<div class="source">
357<div class="source">
358<pre>sudo apt-get install openssh-server
359</pre></div></div>
<p>Assuming that you have enabled SSH on your system, let us proceed.</p></div>
361<div class="section">
362<h4><a name="Configuring_Password-less_SSH"></a>Configuring Password-less SSH</h4>
363<p>For our single-machine setup of AsterixDB, we need to configure password-less SSH access to localhost. We assume that you are on the machine where you want to install AsterixDB. To verify if you already have password-less SSH configured, execute the following:</p>
364
365<div class="source">
366<div class="source">
367<pre>$ ssh 127.0.0.1
368</pre></div></div>
369<p>If you get an output similar to one shown below, type &#x201c;yes&#x201d; and press enter.</p>
370
371<div class="source">
372<div class="source">
373<pre>The authenticity of host '127.0.0.1 (127.0.0.1)' can't be established.
374RSA key fingerprint is aa:7b:51:90:74:39:c4:f6:28:a2:9d:47:c2:8d:33:31.
375Are you sure you want to continue connecting (yes/no)?
376</pre></div></div>
377<p>If you are not prompted for a password, that is if you get an output similar to one shown below, it signifies that you already have password-less SSH configured.</p>
378
379<div class="source">
380<div class="source">
381<pre>$ ssh 127.0.0.1
382Last login: Sat Mar 23 22:52:49 2013
383</pre></div></div>
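<p>The same check can be made without an interactive session. The sketch below relies on two standard OpenSSH client options: <tt>BatchMode=yes</tt>, which makes <tt>ssh</tt> fail instead of prompting, and <tt>ConnectTimeout</tt>, which bounds the wait. The function names and the <tt>SSH_BIN</tt> override are hypothetical and exist only to keep the example self-contained. Note that in batch mode the check also fails if the host key has not yet been accepted, since <tt>ssh</tt> cannot ask the yes/no question.</p>

```shell
#!/usr/bin/env bash
# Sketch: non-interactive test for password-less SSH.
# SSH_BIN is a hypothetical override hook (defaults to the real ssh client).
has_passwordless_ssh() {
  local host="${1:-127.0.0.1}"
  # BatchMode=yes: never prompt for a password or passphrase.
  # ConnectTimeout=5: give up after 5 seconds if the daemon is unreachable.
  "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true
}

# Print a one-line summary for the given host.
report_ssh() {
  local host="${1:-127.0.0.1}"
  if has_passwordless_ssh "$host" >/dev/null 2>&1; then
    echo "password-less SSH to $host: configured"
  else
    echo "password-less SSH to $host: not configured"
  fi
}
```

<p>Calling <tt>report_ssh 127.0.0.1</tt> then tells you whether to continue with the key setup or to skip it.</p>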
384<p>[Important: Password-less SSH requires the use of a (public,private) key-pair. The key-pair is located as a pair of files under $HOME/.ssh directory. It is required that the (public,private) key-pair files have default names (id_rsa.pub, id_rsa) respectively. If you are using different names, please rename the files to use the default names]</p>
<p>If you were not prompted for a password, skip to the next section, <a href="#Configuring_Managix">Configuring Managix</a>.</p>
<p>If you were prompted for a password, you need to configure password-less SSH. We shall generate a (public,private) key-pair as id_rsa.pub and id_rsa respectively. If $HOME/.ssh already contains a (public,private) key-pair, please ensure the files are renamed before proceeding. Follow the instructions below.</p>
387
388<div class="source">
389<div class="source">
390<pre>$ ssh-keygen -t rsa -P &quot;&quot;
391Generating public/private rsa key pair.
392Enter file in which to save the key (/home/joe/.ssh/id_rsa):
393[Important: Please ensure that we use the default value, so simply press enter]
394</pre></div></div>
395<p>If a key already exists, you should get an output similar to what is shown below. Press &#x2018;y&#x2019; to overwrite the existing key. It is required to use the default name. If you wish to not overwrite a pre-existing key, ensure that the pre-existing key is saved with a different name.</p>
396
397<div class="source">
398<div class="source">
399<pre>/home/joe/.ssh/id_rsa already exists.
400Overwrite (y/n)?
401</pre></div></div>
402<p>You should see an output similar to one shown below:</p>
403
404<div class="source">
405<div class="source">
406<pre>The key fingerprint is:
4074d:b0:30:14:45:cc:99:86:15:48:17:0b:39:a0:05:ca joe@joe-machine
408The key's randomart image is:
409+--[ RSA 2048]----+
410| ..o+B@O= |
411|.. o ==*+ |
412|.E. oo . |
413| o |
414| S . |
415| |
416| |
417| |
418| |
419+-----------------+
420</pre></div></div>
421<p>Note: for Linux users, you may not get an image representation of the key, but this is not an error. Next, execute the following:</p>
422
423<div class="source">
424<div class="source">
425<pre>$ cat $HOME/.ssh/id_rsa.pub &gt;&gt; $HOME/.ssh/authorized_keys
426$ chmod 700 $HOME/.ssh/authorized_keys
427</pre></div></div>
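<p>Running the <tt>cat ... &gt;&gt;</tt> command more than once appends the same key repeatedly. As a hedged alternative, the sketch below appends a public key only if it is not already listed. <tt>authorize_key</tt> is a hypothetical helper name, and the sketch sets mode 600 on the file, a common convention for <tt>authorized_keys</tt>, whereas the command above uses 700; both keep the file from being writable by other users.</p>

```shell
#!/usr/bin/env bash
# Sketch: append a public key to authorized_keys only if it is not already there.
# Usage: authorize_key <public-key-file> <authorized_keys-file>
authorize_key() {
  local pub="$1" auth="$2"
  mkdir -p "$(dirname "$auth")"
  touch "$auth"
  # grep -F: fixed string, -x: match the whole line, -q: no output
  if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
    cat "$pub" >> "$auth"
  fi
  chmod 600 "$auth"   # assumption: owner-only read/write (the guide itself uses 700)
}
```

<p>With the default key names used in this guide, the call would be <tt>authorize_key $HOME/.ssh/id_rsa.pub $HOME/.ssh/authorized_keys</tt>.</p>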
428<p>We shall now retry SSH without password.</p>
429
430<div class="source">
431<div class="source">
432<pre>$ ssh 127.0.0.1
433</pre></div></div>
434<p>You may see an output similar to one shown below:</p>
435
436<div class="source">
437<div class="source">
438<pre>The authenticity of host '127.0.0.1 (127.0.0.1)' can't be established.
439RSA key fingerprint is aa:7b:51:90:74:39:c4:f6:28:a2:9d:47:c2:8d:33:31.
440Are you sure you want to continue connecting (yes/no)?
441</pre></div></div>
442<p>Type &#x2018;yes&#x2019; and press the enter key. You should see an output similar to one shown below:</p>
443
444<div class="source">
445<div class="source">
446<pre>Warning: Permanently added '127.0.0.1' (RSA) to the list of known hosts.
447Last login: Thu Mar 28 12:27:10 2013
448</pre></div></div>
449<p>You should now be able to log in without being prompted for a password or a response.</p>
450
451<div class="source">
452<div class="source">
<pre>$ ssh 127.0.0.1
Last login: Sat Mar 23 22:54:40 2013
455</pre></div></div>
456<p>Execute &#x2018;exit&#x2019; to close the session.</p>
457
458<div class="source">
459<div class="source">
460<pre>$ exit
461logout
462Connection to 127.0.0.1 closed.
463</pre></div></div></div></div>
464<div class="section">
465<h3><a name="Configuring_Managix"></a>Configuring Managix</h3>
466<p>You will need the AsterixDB installer (a.k.a. Managix). Download the Standalone Cluster installer from <a class="externalLink" href="https://asterixdb.apache.org/download.html">here</a>; this includes the bits for Managix as well as AsterixDB.</p>
467<p>We will refer to the directory containing the extracted files as MANAGIX_HOME and we assume that MANAGIX_HOME/bin is on your PATH.</p>
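<p>One way to satisfy the PATH assumption is sketched below. It is only an illustration: <tt>path_prepend_once</tt> is a hypothetical helper, and <tt>/home/joe/asterix-mgmt</tt> is assumed to be the directory into which the installer was extracted (it matches the prompts shown in this guide, but your location may differ).</p>

```shell
#!/usr/bin/env bash
# Sketch: prepend a directory to PATH unless it is already present.
path_prepend_once() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already on PATH, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
}

# Assumption: the installer was extracted to /home/joe/asterix-mgmt.
export MANAGIX_HOME="${MANAGIX_HOME:-/home/joe/asterix-mgmt}"
path_prepend_once "$MANAGIX_HOME/bin"
export PATH
```

<p>Placing these lines in ~/.bash_profile (or ~/.bashrc) makes the <tt>managix</tt> command available in new shells without adding duplicate PATH entries each time the profile is sourced.</p>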
<p>To create an AsterixDB instance and manage its lifecycle, Managix requires the following configuration files:</p>
469
470<ul>
471
472<li><tt>conf/managix-conf.xml</tt>: A configuration XML file that contains configuration settings for Managix.</li>
473
474<li>A configuration XML file that describes the nodes in the cluster, e.g., <tt>clusters/local/local.xml</tt>.</li>
475</ul>
476<p>Since we intend to run AsterixDB on a single node, Managix can auto-configure itself and populate the above configuration files. To auto-configure Managix, execute the following in the MANAGIX_HOME directory:</p>
477
478<div class="source">
479<div class="source">
480<pre>/home/joe/asterix-mgmt&gt; $ managix configure
481</pre></div></div>
482<p>Let us do a sample run to validate the set of configuration files auto-generated by Managix.</p>
483
484<div class="source">
485<div class="source">
486<pre>/home/joe/asterix-mgmt&gt; $ managix validate
487 INFO: Environment [OK]
488 INFO: Managix Configuration [OK]
489
490/home/joe/asterix-mgmt&gt; $ managix validate -c clusters/local/local.xml
491 INFO: Environment [OK]
492 INFO: Cluster configuration [OK]
493</pre></div></div></div>
494<div class="section">
495<h3><a name="Creating_an_AsterixDB_instance"></a>Creating an AsterixDB instance</h3>
496<p>Now that we have configured Managix, we shall next create an AsterixDB instance. An AsterixDB instance is identified by a unique name and is created using the <tt>create</tt> command. The usage description for the <tt>create</tt> command can be obtained by executing the following:</p>
497
498<div class="source">
499<div class="source">
500<pre>$ managix help -cmd create
501Creates an AsterixDB instance with a specified name. Post creation, the instance is in ACTIVE state,
502indicating its availability for executing statements/queries.
503Usage arguments/options:
504-n Name of the AsterixDB instance.
505-c Path to the cluster configuration file
506</pre></div></div>
507<p>We shall now use the <tt>create</tt> command to create an AsterixDB instance by the name &#x201c;my_asterix&#x201d;. In doing so, we shall use the cluster configuration file that was auto-generated by Managix.</p>
508
509<div class="source">
510<div class="source">
511<pre>$ managix create -n my_asterix -c clusters/local/local.xml
512</pre></div></div>
513<p>A sample output of the above command is shown below:</p>
514
515<div class="source">
516<div class="source">
517<pre>INFO: Name:my_asterix
518Created:Thu Mar 07 11:14:13 PST 2013
519Web-Url:http://127.0.0.1:19001
520State:ACTIVE
521</pre></div></div>
<p>The third line above shows the web URL <a class="externalLink" href="http://127.0.0.1:19001">http://127.0.0.1:19001</a> of the AsterixDB web interface. The AsterixDB instance is in the &#x2018;ACTIVE&#x2019; state, indicating that you may access the web interface by navigating to that URL.</p>
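<p>If you want to pick up these values in a script, the sketch below reads a single field from output shaped like the sample above. It assumes the <tt>Key:value</tt> layout shown in this guide, with an optional <tt>INFO: </tt> prefix; <tt>instance_field</tt> is a hypothetical helper, not a Managix feature.</p>

```shell
#!/usr/bin/env bash
# Sketch: read one "Key:value" field from status output supplied on stdin.
# Only the first colon after the key is treated as the separator,
# so a URL value keeps its "http://host:port" form.
instance_field() {
  sed -n "s/^\(INFO: \)\{0,1\}$1:\(.*\)\$/\2/p" | head -n 1
}

# Example use, assuming the output of `managix create` was saved in $out:
#   url="$(printf '%s\n' "$out" | instance_field Web-Url)"
#   state="$(printf '%s\n' "$out" | instance_field State)"
```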
523<p>Type in the following &#x201c;Hello World&#x201d; query in the box:</p>
524
525<div class="source">
526<div class="source">
527<pre>let $message := 'Hello World!'
528return $message
529</pre></div></div>
<p>Press the &#x201c;Run&#x201d; button. If the query result appears in the output box, congratulations! You have successfully created an AsterixDB instance!</p></div></div>
531<div class="section">
532<h2><a name="Section_2:_Single-Machine_AsterixDB_installation_Advanced_Back_to_TOC"></a><a name="Section2SingleMachineAsterixDBInstallationAdvanced" id="Section2SingleMachineAsterixDBInstallationAdvanced">Section 2: Single-Machine AsterixDB installation (Advanced)</a> <font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
<p>We assume that you have successfully completed the single-machine AsterixDB installation by following the instructions above in <a href="#Section1SingleMachineAsterixDBInstallation">Section 1: Single-Machine AsterixDB installation</a>. In this section, we shall cover advanced topics related to AsterixDB configuration. Before we proceed, it is important to go through some preliminary concepts related to the AsterixDB runtime.</p>
534<div class="section">
535<h3><a name="AsterixDB_Runtime"></a>AsterixDB Runtime</h3>
<p>An AsterixDB runtime comprises a &#x201c;master node&#x201d; and a set of &#x201c;worker nodes&#x201d;, each identified by a unique id. The master node runs a &#x201c;Cluster Controller&#x201d; service (a.k.a. &#x201c;CC&#x201d;), while each worker node runs a &#x201c;Node Controller&#x201d; service (a.k.a. &#x201c;NC&#x201d;). Please note that a node in an AsterixDB cluster is a logical concept, in the sense that multiple nodes may map to a single physical machine, which is the case for a single-machine AsterixDB installation. This mapping between an AsterixDB node and a physical machine is captured in a cluster configuration XML file. In addition, the XML file contains properties and parameters associated with each node.</p>
537<div class="section">
538<h4><a name="AsterixDB_Runtime_Configuration"></a>AsterixDB Runtime Configuration</h4>
<p>As observed earlier, Managix can auto-configure itself for a single-machine setup. As part of auto-configuration, Managix generated the cluster XML file. Let us understand the components of the generated cluster XML file. If you have configured Managix (via the <tt>configure</tt> command), you can find a similar cluster XML file at $MANAGIX_HOME/clusters/local/local.xml. The following is a sample XML file generated on an Ubuntu (Linux) setup:</p>
540
541<div class="source">
542<div class="source">
543<pre>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;yes&quot;?&gt;
544&lt;cluster xmlns=&quot;cluster&quot;&gt;
545 &lt;name&gt;local&lt;/name&gt;
546 &lt;java_home&gt;/usr/lib/jvm/jdk1.8.0&lt;/java_home&gt;
547 &lt;log_dir&gt;/home/joe/asterix-mgmt/clusters/local/working_dir/logs&lt;/log_dir&gt;
548 &lt;txn_log_dir&gt;/home/joe/asterix-mgmt/clusters/local/working_dir/logs&lt;/txn_log_dir&gt;
549 &lt;iodevices&gt;/home/joe/asterix-mgmt/clusters/local/working_dir&lt;/iodevices&gt;
550 &lt;store&gt;storage&lt;/store&gt;
551 &lt;working_dir&gt;
552 &lt;dir&gt;/home/joe/asterix-mgmt/clusters/local/working_dir&lt;/dir&gt;
553 &lt;NFS&gt;true&lt;/NFS&gt;
554 &lt;/working_dir&gt;
555 &lt;master_node&gt;
556 &lt;id&gt;master&lt;/id&gt;
557 &lt;client_ip&gt;127.0.0.1&lt;/client_ip&gt;
558 &lt;cluster_ip&gt;127.0.0.1&lt;/cluster_ip&gt;
559 &lt;client_port&gt;1098&lt;/client_port&gt;
560 &lt;cluster_port&gt;1099&lt;/cluster_port&gt;
561 &lt;http_port&gt;8888&lt;/http_port&gt;
562 &lt;/master_node&gt;
563 &lt;node&gt;
564 &lt;id&gt;node1&lt;/id&gt;
565 &lt;cluster_ip&gt;127.0.0.1&lt;/cluster_ip&gt;
566 &lt;/node&gt;
567&lt;/cluster&gt;
568</pre></div></div>
569<p>We shall next explain the components of the cluster configuration XML file.</p></div>
570<div class="section">
571<h4><a name="a1_Defining_nodes_in_AsterixDB_runtime"></a>(1) Defining nodes in AsterixDB runtime</h4>
<p>The single-machine AsterixDB instance configuration that is auto-generated by Managix (using the <tt>configure</tt> command) involves a master node (CC) and a worker node (NC). Each node is assigned a unique id and provided with an IP address (called &#x201c;cluster_ip&#x201d;) that maps the node to a physical machine. The following snippet from the above XML file captures the master/worker nodes in our AsterixDB installation.</p>
573
574<div class="source">
575<div class="source">
576<pre>&lt;master_node&gt;
577 &lt;id&gt;master&lt;/id&gt;
578 &lt;client_ip&gt;127.0.0.1&lt;/client_ip&gt;
579 &lt;cluster_ip&gt;127.0.0.1&lt;/cluster_ip&gt;
580 &lt;client_port&gt;1098&lt;/client_port&gt;
581 &lt;cluster_port&gt;1099&lt;/cluster_port&gt;
582 &lt;http_port&gt;8888&lt;/http_port&gt;
583&lt;/master_node&gt;
584&lt;node&gt;
585 &lt;id&gt;node1&lt;/id&gt;
586 &lt;cluster_ip&gt;127.0.0.1&lt;/cluster_ip&gt;
587&lt;/node&gt;
588</pre></div></div>
<p>The following is a description of the different elements in the cluster configuration XML file.</p>
590
591<table border="0" class="table table-striped">
592
593<tr class="a">
594
595<td>Property</td>
596
597<td>Description</td>
598</tr>
599
600<tr class="b">
601
602<td>id</td>
603
604<td>A unique id for a node.</td>
605</tr>
606
607<tr class="a">
608
609<td>cluster_ip</td>
610
<td>IP address of the machine to which a node maps. This address is used for all internal communication between the nodes.</td>
612</tr>
613
614<tr class="b">
615
616<td>client_ip</td>
617
618<td>Provided for the master node. This IP should be reachable from clients that want to connect with AsterixDB via its web interface.</td>
619</tr>
620
621<tr class="a">
622
623<td>client_port</td>
624
625<td>Provided for the master node. This is the port at which the Cluster Controller (CC) service listens for connections from clients.</td>
626</tr>
627
628<tr class="b">
629
630<td>cluster_port</td>
631
632<td>Provided for the master node. This is the port used by the Cluster Controller (CC) service to listen for connections from Node Controllers (NCs). </td>
633</tr>
634
635<tr class="a">
636
<td>http_port</td>
638
639<td>Provided for the master node. This is the http port used by the Cluster Controller (CC) service. </td>
640</tr>
641
642</table></div>
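<p>To make the node definitions concrete, the sketch below prints a <tt>&lt;node&gt;</tt> element in the same shape as the snippet above and checks that a set of node ids is unique. Both helpers are hypothetical, and the id <tt>node2</tt> used in the usage line is an invented example; whether an additional worker node suits your machine is a separate decision.</p>

```shell
#!/usr/bin/env bash
# Sketch: print a <node> element for the cluster configuration file.
# Usage: node_xml <id> <cluster_ip>
node_xml() {
  local id="$1" ip="$2"
  cat <<EOF
<node>
    <id>${id}</id>
    <cluster_ip>${ip}</cluster_ip>
</node>
EOF
}

# Succeed only if every id passed as an argument is distinct.
ids_unique() {
  [ -z "$(printf '%s\n' "$@" | sort | uniq -d)" ]
}
```

<p>For instance, <tt>ids_unique master node1 node2 &amp;&amp; node_xml node2 127.0.0.1</tt> prints an element for a second worker node mapped to the local machine, which you could then paste into the cluster XML file by hand.</p>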
643<div class="section">
644<h4><a name="a2_Properties_associated_with_a_worker_node_NC_in_AsterixDB"></a>(2) Properties associated with a worker node (NC) in AsterixDB</h4>
645<p>The following is a list of properties associated with each worker node in an AsterixDB configuration.</p>
646
647<table border="0" class="table table-striped">
648
649<tr class="a">
650
651<td>Property</td>
652
653<td>Description</td>
654</tr>
655
656<tr class="b">
657
658<td>java_home</td>
659
660<td>Java installation directory at each node.</td>
661</tr>
662
663<tr class="a">
664
665<td>log_dir</td>
666
667<td>A directory where the worker node JVM may write logs.</td>
668</tr>
669
670<tr class="b">
671
672<td>txn_log_dir</td>
673
674<td>A directory where the worker node writes transaction logs.</td>
675</tr>
676
677<tr class="a">
678
679<td>iodevices</td>
680
681<td>Comma separated list of IO Device mount points.</td>
682</tr>
683
684<tr class="b">
685
686<td>store</td>
687
688<td>A data directory (under each iodevice) that AsterixDB uses to store data belonging to dataset(s).</td>
689</tr>
690</table>
691<p>All the above properties can be defined at the global level or a local level. In the former case, these properties apply to all the nodes in an AsterixDB configuration. In the latter case, these properties apply only to the node(s) under which they are defined. A property defined at the local level overrides the definition at the global level.</p></div>
692<div class="section">
693<h4><a name="a3_Working_directory_of_an_AsterixDB_instance"></a>(3) Working directory of an AsterixDB instance</h4>
694<p>Next we explain the following setting in the file $MANAGIX_HOME/clusters/local/local.xml.</p>
695
696<div class="source">
697<div class="source">
<pre>&lt;working_dir&gt;
    &lt;dir&gt;/home/joe/asterix-mgmt/clusters/local/working_dir&lt;/dir&gt;
    &lt;NFS&gt;true&lt;/NFS&gt;
&lt;/working_dir&gt;
702</pre></div></div>
<p>Managix associates a working directory with an AsterixDB instance and uses this directory for transferring binaries to each node. If there is a directory that is readable by each node, Managix can use it to place binaries that can be accessed and used by all the nodes in the AsterixDB setup. A network file system (NFS) provides such functionality for a cluster of physical machines, so that a path on NFS is accessible from each machine in the cluster. In the single-machine setup described above, all nodes correspond to a single physical machine. Each path on the local file system is accessible to all the nodes in the AsterixDB setup, and the boolean value for NFS above is thus set to <tt>true</tt>.</p></div></div>
704<div class="section">
705<h3><a name="Managix_Configuration"></a>Managix Configuration</h3>
<p>Managix allows creation and management of multiple AsterixDB instances and uses Zookeeper as its back-end database to keep track of information related to each instance. We need to provide a set of one or more hosts that Managix can use to run a Zookeeper instance. Zookeeper runs as a daemon process on each of the specified hosts. At each host, Zookeeper stores data under the Zookeeper home directory specified as part of the configuration. The following is an example configuration <tt>$MANAGIX_HOME/conf/managix-conf.xml</tt> that has Zookeeper running on the localhost (127.0.0.1):</p>
707
708<div class="source">
709<div class="source">
710<pre>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;yes&quot;?&gt;
711&lt;configuration xmlns=&quot;installer&quot;&gt;
712 &lt;zookeeper&gt;
713 &lt;homeDir&gt;/home/joe/asterix/.installer/zookeeper&lt;/homeDir&gt;
714 &lt;clientPort&gt;2900&lt;/clientPort&gt;
715 &lt;servers&gt;
716 &lt;server&gt;127.0.0.1&lt;/server&gt;
717 &lt;/servers&gt;
718 &lt;/zookeeper&gt;
719&lt;/configuration&gt;
720</pre></div></div>
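<p>When scripting a setup, it can be handy to read the Zookeeper servers and client port back out of this file as host:port pairs. The following is a minimal sketch and not part of Managix: the function name <tt>zk_endpoints</tt> is hypothetical, and the sketch assumes that each <tt>&lt;clientPort&gt;</tt> and <tt>&lt;server&gt;</tt> element sits on its own line, as in the example above.</p>

```shell
# Print one host:port line per Zookeeper server listed in a Managix
# configuration file. Hypothetical helper, not a Managix command.
# Assumes each <clientPort> and <server> element is on its own line.
zk_endpoints() {
    conf_file=$1
    # Extract the numeric client port
    port=$(sed -n 's:.*<clientPort>\([0-9]*\)</clientPort>.*:\1:p' "$conf_file")
    # Extract every server address and pair it with the port
    sed -n 's:.*<server>\(.*\)</server>.*:\1:p' "$conf_file" |
        while read -r host; do
            printf '%s:%s\n' "$host" "$port"
        done
}

# Usage (path taken from the text above):
#   zk_endpoints "$MANAGIX_HOME/conf/managix-conf.xml"
```

<p>A line-oriented <tt>sed</tt> filter is used here only to keep the sketch dependency-free; a real XML parser would be more robust if the file layout differs from the example.</p>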
<p>A single Zookeeper host is sufficient. With a larger number of hosts, Zookeeper&#x2019;s replication and fault-tolerance features ensure that the failure of a host running Zookeeper does not result in the loss of information about existing AsterixDB instances.</p></div></div>
<div class="section">
<h2><a name="Section_3:_Installing_AsterixDB_on_a_Cluster_of_Multiple_MachinesBack_to_TOC"></a><a name="Section3InstallingAsterixDBOnAClusterOfMultipleMachines" id="Section3InstallingAsterixDBOnAClusterOfMultipleMachines">Section 3: Installing AsterixDB on a Cluster of Multiple Machines</a><font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
<p>We assume that you have read the two sections above on single-machine AsterixDB setup. Next we explain how to install AsterixDB on a cluster of multiple machines. As an example, we assume we want to set up AsterixDB on a cluster of three machines, in which we use one machine (called machine A) as the master node and two other machines (called machine B and machine C) as the worker nodes, as shown in the following diagram:</p>
<p><img src="images/AsterixCluster.png" alt="AsterixCluster" /></p>
<p>Notice that each machine has a &#x201c;cluster_ip&#x201d; address, which these machines use for their intra-cluster communication. The master machine also has a &#x201c;client_ip&#x201d; address, through which an end-user outside the cluster can communicate with it. We differentiate between these two types of IP addresses because a cluster may run on a private network, in which case the machines have internal IP addresses that cannot be used from outside the network. If all the machines are on a public network, the &#x201c;client_ip&#x201d; and &#x201c;cluster_ip&#x201d; of the master machine can be the same address.</p>
<p>Next we describe how to set up AsterixDB in this cluster, assuming Managix has not yet been installed on these machines.</p>
<div class="section">
<h3><a name="Step_1:_Configure_SSH"></a>Step (1): Configure SSH</h3>
<p>The steps for setting up SSH are similar to those in the single-machine setup case. We assume we have a common user account called &#x201c;joe&#x201d; on each machine in the cluster.</p>
<p>On the master machine, do the following:</p>

<div class="source">
<div class="source">
<pre>machineA&gt; ssh 127.0.0.1
</pre></div></div>
<p>If you get output similar to that shown below, type &#x201c;yes&#x201d; and press enter.</p>

<div class="source">
<div class="source">
<pre>The authenticity of host '127.0.0.1 (127.0.0.1)' can't be established.
RSA key fingerprint is aa:7b:51:90:74:39:c4:f6:28:a2:9d:47:c2:8d:33:31.
Are you sure you want to continue connecting (yes/no)?
</pre></div></div>
<p>If you are not prompted for a password, that is, if you get output similar to that shown below, you already have password-less SSH configured.</p>

<div class="source">
<div class="source">
<pre>machineA&gt; ssh 127.0.0.1
Last login: Sat Mar 23 22:52:49 2013
</pre></div></div>
<p>[Important: Password-less SSH requires a (public, private) key-pair, which is stored as a pair of files under the $HOME/.ssh directory. The key-pair files must have the default names (id_rsa.pub, id_rsa) respectively. If you are using different names, please rename the files to use the default names.]</p>
<p>If you are prompted for a password, execute the following:</p>

<div class="source">
<div class="source">
<pre>machineA&gt; ssh-keygen -t rsa -P &quot;&quot;
machineA&gt; cat $HOME/.ssh/id_rsa.pub &gt;&gt; $HOME/.ssh/authorized_keys
machineA&gt; chmod 700 $HOME/.ssh/authorized_keys
</pre></div></div>
<p>If $HOME is not on the NFS, copy id_rsa.pub to the ~/.ssh directory on each machine (logging in with the same account), and then do the following on each machine. (This step is not needed if the &#x201c;.ssh&#x201d; folder is on the NFS and can be accessed by all the nodes.)</p>

<div class="source">
<div class="source">
<pre>cd ~/.ssh
cat id_rsa.pub &gt;&gt; authorized_keys
chmod 700 $HOME/.ssh/authorized_keys
</pre></div></div>
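<p>Running the <tt>cat ... &gt;&gt; authorized_keys</tt> step more than once appends the same key again. The sketch below is one way to make the step repeatable. The function name <tt>install_pubkey</tt> is hypothetical and not part of Managix or OpenSSH, and the sketch assumes the public key file contains a single line.</p>

```shell
# Append a public key to an authorized_keys file only if it is not
# already listed, so the step can be repeated safely.
# Hypothetical helper; arguments: <public key file> <authorized_keys file>.
# Assumes the public key file holds exactly one line.
install_pubkey() {
    pub_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x: match whole lines only, -F: treat the key as a fixed string
    if ! grep -qxF -e "$(cat "$pub_file")" "$auth_file"; then
        cat "$pub_file" >> "$auth_file"
    fi
    # Same mode as in the commands above
    chmod 700 "$auth_file"
}

# Usage on each machine, after copying id_rsa.pub there:
#   install_pubkey "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"
```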
<p>Then run the following step again and type &#x201c;yes&#x201d; if prompted:</p>

<div class="source">
<div class="source">
<pre>machineA&gt; ssh 127.0.0.1
</pre></div></div></div>
<div class="section">
<h3><a name="Step_2:_Define_the_AsterixDB_cluster"></a>Step (2): Define the AsterixDB cluster</h3>
<p>We first log into the master machine as the user &#x201c;joe&#x201d;. On this machine, download the Standalone Cluster installer from <a class="externalLink" href="https://asterixdb.apache.org/download.html">here</a> (same as above), then perform the following steps, which are similar to the single-machine case described above:</p>

<div class="source">
<div class="source">
<pre>machineA&gt; cd ~
machineA&gt; mkdir asterix-mgmt
machineA&gt; cd asterix-mgmt
machineA&gt; unzip &lt;path to the Managix zip bundle&gt;
</pre></div></div>
<p>Note that it is recommended that MANAGIX_HOME not be located on a network file system (NFS). Managix creates artifacts/logs that do not need to be shared, so the overhead of creating them on the NFS should be avoided.</p>
<p>We also need an AsterixDB configuration XML file for the cluster. We give the cluster a name, say &#x201c;rainbow&#x201d;, and create a folder for its configuration:</p>

<div class="source">
<div class="source">
<pre>machineA&gt; mkdir ~/asterix-mgmt/rainbow_cluster
</pre></div></div>
<p>For this cluster we create a configuration file <tt>$MANAGIX_HOME/rainbow_cluster/rainbow.xml</tt>. The following is a sample file with explanations of the properties:</p>

<div class="source">
<div class="source">
<pre>&lt;cluster xmlns=&quot;cluster&quot;&gt;

    &lt;!-- Name of the cluster --&gt;
    &lt;name&gt;rainbow&lt;/name&gt;

    &lt;!-- Username, which should be valid on all three machines --&gt;
    &lt;username&gt;joe&lt;/username&gt;

    &lt;!-- The working directory of Managix. It is recommended for the working
         directory to be on a network file system (NFS) that can be accessed by
         all machines.
         Managix creates the directory if it doesn't exist. --&gt;
    &lt;working_dir&gt;
        &lt;dir&gt;/home/joe/managix-workingDir&lt;/dir&gt;
        &lt;NFS&gt;true&lt;/NFS&gt;
    &lt;/working_dir&gt;

    &lt;!-- Directory for Asterix to store worker log information for each machine.
         Needs to be on the local file system of each machine.
         Managix creates the directory if it doesn't exist.
         This property can be overridden for a node by redefining it at the node level. --&gt;
    &lt;log_dir&gt;/mnt/joe/logs&lt;/log_dir&gt;

    &lt;!-- Directory for Asterix to store transaction log information for each machine.
         Needs to be on the local file system of each machine.
         Managix creates the directory if it doesn't exist.
         This property can be overridden for a node by redefining it at the node level. --&gt;
    &lt;txn_log_dir&gt;/mnt/joe/txn_logs&lt;/txn_log_dir&gt;

    &lt;!-- Mount point of an iodevice. Use a comma-separated list for a machine that
         has multiple iodevices (disks).
         This property can be overridden for a node by redefining it at the node level. --&gt;
    &lt;iodevices&gt;/mnt/joe&lt;/iodevices&gt;

    &lt;!-- Path on each iodevice where Asterix will store its data --&gt;
    &lt;store&gt;storage&lt;/store&gt;

    &lt;!-- Java home for each machine --&gt;
    &lt;java_home&gt;/usr/lib/jvm/jdk1.8.0&lt;/java_home&gt;

    &lt;!-- IP addresses of the master machine A --&gt;
    &lt;master_node&gt;
        &lt;id&gt;master&lt;/id&gt;
        &lt;client_ip&gt;128.195.52.177&lt;/client_ip&gt;
        &lt;cluster_ip&gt;192.168.100.0&lt;/cluster_ip&gt;
        &lt;client_port&gt;1098&lt;/client_port&gt;
        &lt;cluster_port&gt;1099&lt;/cluster_port&gt;
        &lt;http_port&gt;8888&lt;/http_port&gt;
    &lt;/master_node&gt;

    &lt;!-- IP address(es) of machine B --&gt;
    &lt;node&gt;
        &lt;id&gt;nodeB&lt;/id&gt;
        &lt;cluster_ip&gt;192.168.100.1&lt;/cluster_ip&gt;
    &lt;/node&gt;

    &lt;!-- IP address(es) of machine C --&gt;
    &lt;node&gt;
        &lt;id&gt;nodeC&lt;/id&gt;
        &lt;cluster_ip&gt;192.168.100.2&lt;/cluster_ip&gt;
    &lt;/node&gt;
&lt;/cluster&gt;
</pre></div></div>
<p>As noted in the comments above, properties such as <tt>log_dir</tt>, <tt>txn_log_dir</tt> and <tt>iodevices</tt> can be defined at the cluster level, in which case they apply to all the nodes in the system. They can also be overridden for an individual node by redefining them at the node level.</p>
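<p>Every machine named in the cluster file must be reachable from the master over password-less SSH as the configured user. As a rough aid, the following sketch lists the <tt>cluster_ip</tt> values in a cluster file and prints (without running) one SSH check per machine. The function names are hypothetical and not part of Managix, and the sketch assumes each element sits on its own line, as in the sample file. <tt>BatchMode=yes</tt> is a standard OpenSSH client option that makes <tt>ssh</tt> fail instead of prompting for a password.</p>

```shell
# List every <cluster_ip> value found in a cluster configuration file.
# Hypothetical helper; assumes one element per line.
cluster_ips() {
    sed -n 's:.*<cluster_ip>\(.*\)</cluster_ip>.*:\1:p' "$1"
}

# Print the user name from the <username> element.
cluster_user() {
    sed -n 's:.*<username>\(.*\)</username>.*:\1:p' "$1"
}

# Print (but do not run) one password-less SSH check per machine.
print_ssh_checks() {
    user=$(cluster_user "$1")
    cluster_ips "$1" | while read -r ip; do
        printf 'ssh -o BatchMode=yes %s@%s true\n' "$user" "$ip"
    done
}

# Usage:
#   print_ssh_checks "$MANAGIX_HOME/rainbow_cluster/rainbow.xml"
```

<p>The commands are only printed so that they can be reviewed first; each one can then be run by hand from the master machine.</p>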
<p>Once we have formed the cluster XML file, we can validate the configuration by doing the following:</p>

<div class="source">
<div class="source">
<pre>machineA&gt; managix validate -c rainbow_cluster/rainbow.xml
</pre></div></div>
<p>This will verify the contents of the file, and also attempt to ssh to each node in the cluster to ensure that password-less SSH is configured correctly. You may see output like the following:</p>

<div class="source">
<div class="source">
<pre>The authenticity of host '192.168.100.1 (192.168.100.1)' can't be established.
RSA key fingerprint is 89:80:31:1f:be:51:16:d7:2b:f5:e0:b3:2c:bd:83:94.
Are you sure you want to continue connecting (yes/no)?
</pre></div></div>
<p>This output may be repeated for each node in the cluster. Answer &#x201c;yes&#x201d; each time.</p>
<p>If the final output contains the following lines (possibly separated by the RSA prompts mentioned above):</p>

<div class="source">
<div class="source">
<pre>INFO: Environment [OK]
INFO: Cluster configuration [OK]
</pre></div></div>
<p>it means that the XML configuration file is correct.</p></div>
<div class="section">
<h3><a name="Step_3:_Configuring_Managix"></a>Step (3): Configuring Managix</h3>
<p>Managix uses a configuration XML file at <tt>$MANAGIX_HOME/conf/managix-conf.xml</tt> to configure its own properties, such as its Zookeeper service. We can use the <tt>configure</tt> command to auto-generate this configuration file:</p>

<div class="source">
<div class="source">
<pre>machineA&gt; managix configure
</pre></div></div>
<p>We use the <tt>validate</tt> command to validate the Managix configuration. To do so, execute the following:</p>

<div class="source">
<div class="source">
<pre>machineA&gt; managix validate
INFO: Environment [OK]
INFO: Managix Configuration [OK]
</pre></div></div>
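<p>In a scripted setup, the two <tt>[OK]</tt> lines shown above can be checked automatically. The following is a minimal sketch: the function name <tt>validation_ok</tt> is hypothetical, and it relies only on the message text shown in the sample output above. Whether Managix also signals failure through its exit status is not covered here.</p>

```shell
# Succeed only if the text on standard input contains both [OK] lines
# shown in the sample "managix validate" output above.
# Hypothetical helper, not a Managix command.
validation_ok() {
    output=$(cat)
    printf '%s\n' "$output" | grep -q 'Environment \[OK\]' &&
        printf '%s\n' "$output" | grep -q 'Managix Configuration \[OK\]'
}

# Usage (2>&1 captures the messages whichever stream they are written to):
#   managix validate 2>&1 | validation_ok && echo "Managix configuration looks fine"
```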
<p>Note that the <tt>configure</tt> command also generates a cluster configuration XML file at <tt>$MANAGIX_HOME/clusters/local/local.xml</tt>. This file is not needed in the case of a cluster of machines.</p></div>
<div class="section">
<h3><a name="Step_4:_Creating_an_AsterixDB_instance"></a>Step (4): Creating an AsterixDB instance</h3>
<p>Now that we have configured Managix, we shall next create an AsterixDB instance, which is identified by a unique name and is created using the <tt>create</tt> command. The usage description for the <tt>create</tt> command can be obtained by executing the following:</p>

<div class="source">
<div class="source">
<pre>machineA&gt; managix help -cmd create

Creates an AsterixDB instance with a specified name. Post creation, the instance is in ACTIVE state,
indicating its availability for executing statements/queries.
Usage arguments/options:
-n Name of the AsterixDB instance.
-c Path to the cluster configuration file
</pre></div></div>
<p>We shall now use the <tt>create</tt> command to create an AsterixDB instance called &#x201c;rainbow_asterix&#x201d;. In doing so, we shall use the cluster configuration file that we created in Step (2), <tt>rainbow_cluster/rainbow.xml</tt>.</p>

<div class="source">
<div class="source">
<pre>machineA&gt; managix create -n rainbow_asterix -c rainbow_cluster/rainbow.xml
</pre></div></div>
<p>If the response message does not contain a warning, then congratulations! You have successfully installed AsterixDB on this cluster of machines!</p>
<p>Please refer to the section <a href="#Section4ManagingTheLifecycleOfAnAsterixDBInstance">Managing the Lifecycle of an AsterixDB Instance</a> for a detailed description of the commands/operations that let you manage the lifecycle of an AsterixDB instance. Note that the output of the commands varies with the cluster definition, so the sample output shown there may not match the cluster specification you built above.</p></div></div>
<div class="section">
<h2><a name="Section_4:_Managing_the_Lifecycle_of_an_AsterixDB_Instance_Back_to_TOC"></a><a name="Section4ManagingTheLifecycleOfAnAsterixDBInstance" id="Section4ManagingTheLifecycleOfAnAsterixDBInstance">Section 4: Managing the Lifecycle of an AsterixDB Instance</a> <font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
<p>Now that we have an AsterixDB instance running, let us use Managix to manage the instance&#x2019;s lifecycle. Managix provides the following set of commands/operations:</p>
<div class="section">
<div class="section">
<h4><a name="Managix_Commands"></a>Managix Commands</h4>

<table border="0" class="table table-striped">

<tr class="a">
<td>Command</td>
<td>Description</td></tr>

<tr class="b">
<td><a href="#Creating_an_AsterixDB_instance">create</a></td>
<td>Creates a new AsterixDB instance.</td></tr>

<tr class="a">
<td><a href="#Describe_Command">describe</a></td>
<td>Describes an existing AsterixDB instance.</td></tr>

<tr class="b">
<td><a href="#Stop_Command">stop</a></td>
<td>Stops an AsterixDB instance that is in the ACTIVE state.</td></tr>

<tr class="a">
<td><a href="#Start_Command">start</a></td>
<td>Starts an AsterixDB instance.</td></tr>

<tr class="b">
<td><a href="#Backup_Command">backup</a></td>
<td>Creates a backup for an existing AsterixDB instance.</td></tr>

<tr class="a">
<td><a href="#Restore_Command">restore</a></td>
<td>Restores an AsterixDB instance.</td></tr>

<tr class="b">
<td><a href="#Delete_Command">delete</a></td>
<td>Deletes an AsterixDB instance.</td></tr>

<tr class="a">
<td><a href="#Configuring_Managix">validate</a></td>
<td>Validates the installer/cluster configuration.</td></tr>

<tr class="b">
<td><a href="#Configuring_Managix">configure</a></td>
<td>Auto-generates a configuration for an AsterixDB instance.</td></tr>

<tr class="a">
<td><a href="#Log_Command">log</a></td>
<td>Produces a zip archive containing log files from each node in an AsterixDB instance.</td></tr>

<tr class="b">
<td><a href="#Shutdown_Command">shutdown</a></td>
<td>Shuts down the installer service.</td></tr>
</table>
<p>You may obtain the above listing by simply executing <tt>managix</tt>:</p>

<div class="source">
<div class="source">
<pre>$ managix
</pre></div></div>
<p>We have already discussed the <tt>create</tt>, <tt>configure</tt> and <tt>validate</tt> commands. We shall next explain the rest of the commands listed above. The sample output messages shown for these commands assume that we are running an AsterixDB instance on a single machine.</p>
<div class="section">
<h5><a name="Describe_Command"></a>Describe Command</h5>
<p>The <tt>describe</tt> command provides information about an AsterixDB instance. The usage can be looked up by executing the following:</p>

<div class="source">
<div class="source">
<pre>$ managix help -cmd describe

Provides information about an AsterixDB instance.
The following options are available:
[-n] Name of the AsterixDB instance.
[-admin] Provides a detailed description
</pre></div></div>
<p>The brackets indicate optional flags.</p>
<p>The output of the <tt>describe</tt> command when used without the <tt>-admin</tt> flag contains minimal information and is similar to the output of the <tt>create</tt> command. Let us try running the describe command in &#x201c;admin&#x201d; mode.</p>

<div class="source">
<div class="source">
<pre>$ managix describe -n my_asterix -admin
INFO: Name:my_asterix
Created:Thu Mar 07 19:07:00 PST 2013
Web-Url:http://127.0.0.1:19001
State:ACTIVE
Master node:master:127.0.0.1
node1:127.0.0.1

Asterix version:0.0.5
Asterix Configuration
output_dir = /tmp/asterix_output/
Metadata Node:node1
Processes
NC at 127.0.0.1 [ 22195 ]
CC at 127.0.0.1 [ 22161 ]

Asterix Configuration
    nc.java.opts                             :-Xmx1024m
    cc.java.opts                             :-Xmx1024m
    storage.buffercache.pagesize             :32768
    storage.buffercache.size                 :33554432
    storage.buffercache.maxopenfiles         :214748364
    storage.memorycomponent.pagesize         :32768
    storage.memorycomponent.numpages         :1024
    storage.memorycomponent.globalbudget     :536870192
    storage.lsm.mergethreshold               :3
    storage.lsm.bloomfilter.falsepositiverate:0.01
    txn.log.buffer.numpages                  :8
    txn.log.buffer.pagesize                  :131072
    txn.log.partitionsize                    :2147483648
    txn.log.disksectorsize                   :4096
    txn.log.groupcommitinterval              :1
    txn.log.checkpoint.lsnthreshold          :67108864
    txn.log.checkpoint.pollfrequency         :120
    txn.log.checkpoint.history               :0
    txn.lock.escalationthreshold             :1000
    txn.lock.shrinktimer                     :5000
    txn.lock.timeout.waitthreshold           :60000
    txn.lock.timeout.sweepthreshold          :10000
    compiler.sortmemory                      :33554432
    compiler.joinmemory                      :33554432
    compiler.framesize                       :32768
    web.port                                 :19001
    api.port                                 :19002
    log.level                                :INFO
</pre></div></div>
<p>As seen above, the instance &#x2018;my_asterix&#x2019; is configured such that all processes run on the localhost (127.0.0.1). The process id of each process (JVM) is shown next to it.</p></div>
<div class="section">
<h5><a name="Stop_Command"></a>Stop Command</h5>
<p>The <tt>stop</tt> command shuts down an AsterixDB instance. After that, the instance is unavailable for executing queries. The usage can be looked up by executing the following:</p>

<div class="source">
<div class="source">
<pre>$ managix help -cmd stop

Shuts an AsterixDB instance that is in ACTIVE state. After executing the stop command, the AsterixDB instance transits
to the INACTIVE state, indicating that it is no longer available for executing queries.

Available arguments/options
-n name of the AsterixDB instance.
</pre></div></div>
<p>To stop the AsterixDB instance, execute the following:</p>

<div class="source">
<div class="source">
<pre>$ managix stop -n my_asterix
  INFO: Stopped AsterixDB instance: my_asterix

$ managix describe -n my_asterix
  INFO: Name: my_asterix
  Created:Thu Mar 07 19:07:00 PST 2013
  Web-Url:http://127.0.0.1:19001
  State:INACTIVE (Fri Mar 08 09:49:00 PST 2013)
</pre></div></div></div>
<div class="section">
<h5><a name="Start_Command"></a>Start Command</h5>
<p>The <tt>start</tt> command starts an AsterixDB instance that is in the INACTIVE state. The usage can be looked up by executing the following:</p>

<div class="source">
<div class="source">
<pre>$ managix help -cmd start

Starts an AsterixDB instance that is in INACTIVE state. After executing the start command, the AsterixDB instance transits to the ACTIVE state, indicating that it is now available for executing statements/queries.

Available arguments/options
-n name of the AsterixDB instance.
</pre></div></div>
<p>Let us now start the AsterixDB instance.</p>

<div class="source">
<div class="source">
<pre>$ managix start -n my_asterix
  INFO: Name:my_asterix
  Created:Thu Mar 07 19:07:00 PST 2013
  Web-Url:http://127.0.0.1:19001
  State:ACTIVE (Fri Mar 08 09:49:00 PST 2013)
</pre></div></div></div>
<div class="section">
<h5><a name="Backup_Command"></a>Backup Command</h5>
<p>The <tt>backup</tt> command allows you to take a backup of the data stored with an AsterixDB instance. The backup can be taken on the local file system or on an HDFS instance. In either case, the snapshots are stored under a backup directory. You need to make sure the backup directory has appropriate read/write permissions. The backup settings are configured in the Managix configuration file located at <tt>$MANAGIX_HOME/conf/managix-conf.xml</tt>.</p>
<p><i>Configuring backup on the local file system</i></p>
<p>We need to provide a path to a backup directory on the local file system. The backup directory can be configured by editing the Managix configuration XML, found at <tt>$MANAGIX_HOME/conf/managix-conf.xml</tt>.</p>

<div class="source">
<div class="source">
<pre>&lt;backup&gt;
    &lt;backupDir&gt;Provide path to the backup directory here&lt;/backupDir&gt;
&lt;/backup&gt;
</pre></div></div>
<p>Before a backup of an AsterixDB instance can be taken, the instance must be in the INACTIVE state. We bring it into that state by using the <tt>stop</tt> command, as shown below:</p>

<div class="source">
<div class="source">
<pre>$ managix stop -n my_asterix
  INFO: Stopped AsterixDB instance: my_asterix
</pre></div></div>
<p>We can now take the backup by executing the following:</p>

<div class="source">
<div class="source">
<pre>$ managix backup -n my_asterix
  INFO: my_asterix backed up 0_Fri Mar 08 16:16:34 PST 2013 (LOCAL)
</pre></div></div>
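<p>The requirement that the instance be INACTIVE before a backup can be enforced with a small wrapper. The sketch below is illustrative only: the function names <tt>instance_state</tt> and <tt>backup_if_inactive</tt> are hypothetical, and the parsing assumes that <tt>managix describe</tt> prints a <tt>State:</tt> line in the format shown in the sample output in this section.</p>

```shell
# Extract the state (for example ACTIVE or INACTIVE) from
# "managix describe" output read on standard input.
# Assumes a line of the form "State:INACTIVE (...)" as in the samples.
instance_state() {
    sed -n 's/^ *State:\([A-Z]*\).*/\1/p'
}

# Take a backup only when the instance is INACTIVE.
# Hypothetical wrapper; expects "managix" to be on the PATH.
backup_if_inactive() {
    name=$1
    state=$(managix describe -n "$name" 2>&1 | instance_state)
    if [ "$state" != "INACTIVE" ]; then
        echo "Instance $name is in state '$state'; stop it before taking a backup." >&2
        return 1
    fi
    managix backup -n "$name"
}

# Usage:
#   backup_if_inactive my_asterix
```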
<p><i>Configuring backup on an HDFS instance</i></p>
<p>To configure a backup to be taken on an HDFS instance, we need to provide the required information about the running HDFS instance. This information includes the HDFS version and the HDFS URL. Simply edit the Managix configuration file and provide the required information.</p>

<div class="source">
<div class="source">
<pre>&lt;backup&gt;
    &lt;backupDir&gt;Provide path to the backup directory here&lt;/backupDir&gt;
    &lt;hdfs&gt;
        &lt;version&gt;0.20.2&lt;/version&gt;
        &lt;url&gt;&lt;/url&gt;
    &lt;/hdfs&gt;
&lt;/backup&gt;
</pre></div></div>
<p>A sample output when a backup is taken on HDFS is shown below:</p>

<div class="source">
<div class="source">
<pre>$ managix backup -n my_asterix
  INFO: my_asterix backed up 1_Fri Mar 08 17:10:38 PST 2013 (HDFS)
</pre></div></div>
<p>Each time we take a backup, we are provided with a unique id (a monotonically increasing value starting with 0). This id is required when we need to restore from a previously taken backup. Information about all available backup snapshots can be obtained by using the <tt>describe</tt> command in the admin mode, as shown below:</p>

<div class="source">
<div class="source">
<pre>$ managix describe -n my_asterix -admin
INFO: Name:my_asterix
Created:Fri Mar 08 15:11:12 PST 2013
Web-Url:http://127.0.0.1:19001
State:INACTIVE (Fri Mar 08 16:14:20 PST 2013)
Master node:master:127.0.0.1
node1:127.0.0.1

Backup:0 created at Fri Mar 08 16:16:34 PST 2013 (LOCAL)
Backup:1 created at Fri Mar 08 17:10:38 PST 2013 (HDFS)

Asterix version:0.0.5
Asterix Configuration
Metadata Node:node1
Processes
</pre></div></div>
<p>The above output shows the available backups, each identified by its id (0 and 1). We shall next describe the method for restoring an AsterixDB instance from a backup snapshot.</p></div>
<div class="section">
<h5><a name="Restore_Command"></a>Restore Command</h5>
<p>The <tt>restore</tt> command allows you to restore an AsterixDB instance&#x2019;s data from a previously taken backup. The usage description can be obtained as follows:</p>

<div class="source">
<div class="source">
<pre>$ managix help -cmd restore

Restores an AsterixDB instance's data from a previously taken backup.
Available arguments/options

-n name of the AsterixDB instance
-b id of the backup snapshot
</pre></div></div>
<p>The following command restores our AsterixDB instance from the backup snapshot identified by the id (0). Before an instance can be restored from a backup, it must be in the INACTIVE state.</p>

<div class="source">
<div class="source">
<pre>$ managix restore -n my_asterix -b 0
INFO: AsterixDB instance: my_asterix has been restored from backup
</pre></div></div>
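<p>Because backup ids increase monotonically, the most recent snapshot is the one with the highest id. The following sketch picks that id out of the <tt>describe -admin</tt> output so that it can be passed to <tt>-b</tt>. The function name <tt>latest_backup_id</tt> is hypothetical, and the parsing assumes the <tt>Backup:&lt;id&gt; created at ...</tt> line format shown in the Backup Command section.</p>

```shell
# Print the highest backup id found in "managix describe -admin" output
# read on standard input; prints nothing when no backup is listed.
# Assumes lines of the form "Backup:<id> created at ...".
latest_backup_id() {
    sed -n 's/^ *Backup:\([0-9][0-9]*\) .*/\1/p' | sort -n | tail -n 1
}

# Usage:
#   id=$(managix describe -n my_asterix -admin 2>&1 | latest_backup_id)
#   managix restore -n my_asterix -b "$id"
```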
<p>You can then start the AsterixDB instance by using the <tt>start</tt> command.</p></div>
<div class="section">
<h5><a name="Log_Command"></a>Log Command</h5>
<p>The <tt>log</tt> command allows you to collect the log files corresponding to each node of an AsterixDB instance into a zip archive. The zip archive is produced on the local file system of the machine running Managix.</p>

<div class="source">
<div class="source">
<pre>$ managix help -cmd log

Creates a zip archive containing log files corresponding to each worker node (NC) and the master (CC) for an AsterixDB instance

Available arguments/options
-n name of the AsterixDB instance.
-d destination directory for producing the zip archive. Defaults to $MANAGIX_HOME/logdump.
</pre></div></div>
<p>The following is an example showing the use of the log command.</p>

<div class="source">
<div class="source">
<pre>$ managix log -n my_asterix -d /Users/joe/logdump
INFO: Log zip archive created at /Users/joe/logdump/log_Thu_Jun_06_00:53:51_PDT_2013.zip
</pre></div></div></div>
<div class="section">
<h5><a name="Delete_Command"></a>Delete Command</h5>
<p>As the name suggests, the <tt>delete</tt> command permanently removes an AsterixDB instance by cleaning up all associated data/artifacts. The instance must be in the INACTIVE state. The usage can be looked up by executing the following:</p>

<div class="source">
<div class="source">
<pre>$ managix help -cmd delete
Permanently deletes an AsterixDB instance. The instance must be in the INACTIVE state.

Available arguments/options
-n name of the AsterixDB instance.


$ managix delete -n my_asterix
  INFO: AsterixDB instance my_asterix deleted.
</pre></div></div></div>
<div class="section">
<h5><a name="Shutdown_Command"></a>Shutdown Command</h5>
<p>Managix uses the Zookeeper service to store all information about created AsterixDB instances. The Zookeeper service runs in the background and can be shut down using the <tt>shutdown</tt> command.</p>

<div class="source">
<div class="source">
<pre>$ managix shutdown
</pre></div></div></div>
<div class="section">
<h5><a name="Help_Command"></a>Help Command</h5>
<p>The <tt>help</tt> command provides a usage description of a Managix command.</p>

<div class="source">
<div class="source">
<pre>$ managix help -cmd &lt;command name&gt;
</pre></div></div>
<p>As an example, to look up the help for the <tt>configure</tt> command, execute the following:</p>

<div class="source">
<div class="source">
<pre>$ managix help -cmd configure

Auto-generates the AsterixDB installer configruation settings and AsterixDB cluster
configuration settings for a single node setup.
</pre></div></div></div></div></div></div>
<div class="section">
<h2><a name="Section_5:_Frequently_Asked_Questions_Back_to_TOC"></a><a name="Section5FAQ" id="Section5FAQ">Section 5: Frequently Asked Questions</a> <font size="4"><a href="#toc">[Back to TOC]</a></font></h2>
<div class="section">
<div class="section">
<div class="section">
<h5><a name="Question"></a>Question</h5>
<p>What happens if a machine acting as a node in the AsterixDB cluster becomes unreachable for some reason (network partition/machine failure)?</p></div>
<div class="section">
<h5><a name="Answer"></a>Answer</h5>
<p>When a node leaves the AsterixDB cluster, the AsterixDB instance transitions to the &#x2018;UNUSABLE&#x2019; state, indicating that it is no longer available for serving queries. To find out which node(s) left the cluster, run the describe command with the -admin flag.</p>

<div class="source">
<div class="source">
<pre>$ $MANAGIX_HOME/bin/managix describe -n &lt;name of the AsterixDB instance&gt; -admin
</pre></div></div>
<p>The above command shows the state of the AsterixDB instance and lists the nodes that have left the cluster.</p>
<p>The failed node must be brought back so that it can re-join the cluster. Once that is done, you can bring the instance back to the &#x2018;ACTIVE&#x2019; state by executing the following sequence.</p>
<p>1) Stop the AsterixDB processes that are still running on the nodes in the cluster:</p>

<div class="source">
<div class="source">
<pre>managix stop -n &lt;name of your AsterixDB instance&gt;
</pre></div></div>
<p>The processes associated with the instance are terminated and the instance moves to the INACTIVE state.</p>
<p>2) Start the AsterixDB instance using the start command:</p>

<div class="source">
<div class="source">
<pre>managix start -n &lt;name of your AsterixDB instance&gt;
</pre></div></div></div>
<div class="section">
<h5><a name="Question"></a>Question</h5>
<p>Do I need to create all the directories/paths I put into the cluster configuration XML?</p></div>
<div class="section">
<h5><a name="Answer"></a>Answer</h5>
<p>Managix will create a path if it does not exist. It does so using the user account specified in the cluster configuration XML. Please ensure that the user account has appropriate permissions for creating the missing paths.</p></div>
<div class="section">
<h5><a name="Question"></a>Question</h5>
<p>Should MANAGIX_HOME be on the network file system (NFS)?</p></div>
<div class="section">
<h5><a name="Answer"></a>Answer</h5>
<p>It is recommended that MANAGIX_HOME not be on the NFS. Managix produces artifacts/logs on disk that do not need to be shared, so the overhead of creating them on the NFS should be avoided.</p></div>
<div class="section">
<h5><a name="Question"></a>Question</h5>
<p>How do we change the underlying code (apply a code patch) for an &#x2018;active&#x2019; AsterixDB instance?</p></div>
<div class="section">
<h5><a name="Answer"></a>Answer</h5>
<p>At times, an end-user (particularly an AsterixDB developer) may need to alter the underlying code that is being run by an AsterixDB instance. In the current version of Managix, this can be achieved as follows:</p>
<p>Assume that you have an &#x2018;active&#x2019; instance named a1 that is running version v1 of AsterixDB, and that you have a revised version v2 that fixes some bug(s).</p>
<p>To upgrade AsterixDB from v1 to v2:</p>
<p>step 1) managix stop -n a1</p>
<p>step 2) managix shutdown</p>
<p>step 3) copy asterix-server zip (version v2) to asterix/</p>
<p>step 4) managix start -n a1</p>
<p>a1 is now running version v2.</p>
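<p>The four steps above can be collected into one function so that they always run in the same order. This is a sketch under stated assumptions, not an official Managix procedure: the function name <tt>upgrade_instance</tt> is hypothetical, <tt>managix</tt> is assumed to be on the PATH, and the <tt>asterix/</tt> directory from step 3 is assumed to be <tt>$MANAGIX_HOME/asterix/</tt>.</p>

```shell
# Replace the asterix-server zip used by Managix and restart an instance.
# Hypothetical helper; arguments: <instance name> <path to new server zip>.
# Assumes "asterix/" in step 3 means $MANAGIX_HOME/asterix/.
upgrade_instance() {
    instance=$1
    new_zip=$2
    managix stop -n "$instance" || return 1           # step 1
    managix shutdown || return 1                      # step 2
    cp "$new_zip" "$MANAGIX_HOME/asterix/" || return 1  # step 3
    managix start -n "$instance"                      # step 4
}

# Usage:
#   upgrade_instance a1 /path/to/asterix-server-v2.zip
```

<p>Each step stops the sequence if it fails, so the instance is not restarted against a zip that was never copied.</p>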
<p>Limitations:</p>
<p>a) This will not work if v2 makes a change that is incompatible with the earlier version, such as altering the schema.</p>
<p>b) A change to the asterix-server zip applies to all existing instances (after a restart) and to any instances that the user creates subsequently.</p></div></div></div></div>
        </div>
      </div>
    </div>

    <hr/>

    <footer>
      <div class="container-fluid">
        <div class="row span12">Copyright &copy; 2017
          <a href="https://www.apache.org/">The Apache Software Foundation</a>.
          All Rights Reserved.

        </div>

        <div class="row-fluid">Apache AsterixDB, AsterixDB, Apache, the Apache
          feather logo, and the Apache AsterixDB project logo are either
          registered trademarks or trademarks of The Apache Software
          Foundation in the United States and other countries.
          All other marks mentioned may be trademarks or registered
          trademarks of their respective owners.</div>


      </div>
    </footer>
  </body>
</html>