<!DOCTYPE html>
<!--
 | Generated by Apache Maven Doxia Site Renderer 1.8.1 from target/generated-site/markdown/aws.md at 2020-12-06
 | Rendered using Apache Maven Fluido Skin 1.7
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
 <meta charset="UTF-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
 <meta name="Date-Revision-yyyymmdd" content="20201206" />
 <meta http-equiv="Content-Language" content="en" />
 <title>AsterixDB &#x2013; Installation using Amazon Web Services</title>
 <link rel="stylesheet" href="./css/apache-maven-fluido-1.7.min.css" />
 <link rel="stylesheet" href="./css/site.css" />
 <link rel="stylesheet" href="./css/print.css" media="print" />
 <script type="text/javascript" src="./js/apache-maven-fluido-1.7.min.js"></script>

 </head>
 <body class="topBarDisabled">
 <div class="container-fluid">
 <div id="banner">
 <div class="pull-left"><a href="./" id="bannerLeft"><img src="images/asterixlogo.png" alt="AsterixDB"/></a></div>
 <div class="pull-right"></div>
 <div class="clear"><hr/></div>
 </div>

 <div id="breadcrumbs">
 <ul class="breadcrumb">
 <li id="publishDate">Last Published: 2020-12-06</li>
 <li id="projectVersion" class="pull-right">Version: 0.9.6-SNAPSHOT</li>
 <li class="pull-right"><a href="index.html" title="Documentation Home">Documentation Home</a></li>
 </ul>
 </div>
 <div class="row-fluid">
 <div id="leftColumn" class="span2">
 <div class="well sidebar-nav">
 <ul class="nav nav-list">
 <li class="nav-header">Get Started - Installation</li>
 <li><a href="ncservice.html" title="Option 1: using NCService"><span class="none"></span>Option 1: using NCService</a></li>
 <li><a href="ansible.html" title="Option 2: using Ansible"><span class="none"></span>Option 2: using Ansible</a></li>
 <li class="active"><a href="#"><span class="none"></span>Option 3: using Amazon Web Services</a></li>
 <li class="nav-header">AsterixDB Primer</li>
 <li><a href="sqlpp/primer-sqlpp.html" title="Using SQL++"><span class="none"></span>Using SQL++</a></li>
 <li class="nav-header">Data Model</li>
 <li><a href="datamodel.html" title="The Asterix Data Model"><span class="none"></span>The Asterix Data Model</a></li>
 <li class="nav-header">Queries</li>
 <li><a href="sqlpp/manual.html" title="The SQL++ Query Language"><span class="none"></span>The SQL++ Query Language</a></li>
 <li><a href="SQLPP.html" title="Raw SQL++ Grammar"><span class="none"></span>Raw SQL++ Grammar</a></li>
 <li><a href="sqlpp/builtins.html" title="Builtin Functions"><span class="none"></span>Builtin Functions</a></li>
 <li class="nav-header">API/SDK</li>
 <li><a href="api.html" title="HTTP API"><span class="none"></span>HTTP API</a></li>
 <li><a href="csv.html" title="CSV Output"><span class="none"></span>CSV Output</a></li>
 <li class="nav-header">Advanced Features</li>
 <li><a href="aql/externaldata.html" title="Accessing External Data"><span class="none"></span>Accessing External Data</a></li>
 <li><a href="feeds.html" title="Data Ingestion with Feeds"><span class="none"></span>Data Ingestion with Feeds</a></li>
 <li><a href="udf.html" title="User Defined Functions"><span class="none"></span>User Defined Functions</a></li>
 <li><a href="sqlpp/filters.html" title="Filter-Based LSM Index Acceleration"><span class="none"></span>Filter-Based LSM Index Acceleration</a></li>
 <li><a href="sqlpp/fulltext.html" title="Support of Full-text Queries"><span class="none"></span>Support of Full-text Queries</a></li>
 <li><a href="sqlpp/similarity.html" title="Support of Similarity Queries"><span class="none"></span>Support of Similarity Queries</a></li>
 <li><a href="interval_join.html" title="Support of Interval Joins"><span class="none"></span>Support of Interval Joins</a></li>
 <li class="nav-header">Deprecated</li>
 <li><a href="aql/primer.html" title="AsterixDB Primer: Using AQL"><span class="none"></span>AsterixDB Primer: Using AQL</a></li>
 <li><a href="aql/manual.html" title="Queries: The Asterix Query Language (AQL)"><span class="none"></span>Queries: The Asterix Query Language (AQL)</a></li>
 <li><a href="aql/builtins.html" title="Queries: Builtin Functions (AQL)"><span class="none"></span>Queries: Builtin Functions (AQL)</a></li>
</ul>
 <hr />
 <div id="poweredBy">
 <div class="clear"></div>
 <div class="clear"></div>
 <div class="clear"></div>
 <div class="clear"></div>
<a href="./" title="AsterixDB" class="builtBy"><img class="builtBy" alt="AsterixDB" src="images/asterixlogo.png" /></a>
 </div>
 </div>
 </div>
 <div id="bodyColumn" class="span10" >
<!--
 ! Licensed to the Apache Software Foundation (ASF) under one
 ! or more contributor license agreements. See the NOTICE file
 ! distributed with this work for additional information
 ! regarding copyright ownership. The ASF licenses this file
 ! to you under the Apache License, Version 2.0 (the
 ! "License"); you may not use this file except in compliance
 ! with the License. You may obtain a copy of the License at
 !
 ! http://www.apache.org/licenses/LICENSE-2.0
 !
 ! Unless required by applicable law or agreed to in writing,
 ! software distributed under the License is distributed on an
 ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 ! KIND, either express or implied. See the License for the
 ! specific language governing permissions and limitations
 ! under the License.
 !-->
<h1>Installation using Amazon Web Services</h1>
<div class="section">
<h2><a name="Table_of_Contents"></a><a name="atoc" id="#toc">Table of Contents</a></h2>
<ul>

<li><a href="#Introduction">Introduction</a></li>
<li><a href="#Prerequisites">Prerequisites</a></li>
<li><a href="#config">Cluster Configuration</a></li>
<li><a href="#lifecycle">Cluster Lifecycle Management</a></li>
</ul>
</div>
<div class="section">
<h2><a name="Introduction" id="Introduction">Introduction</a></h2>
<p>You can always launch a number of Amazon Web Services (AWS) EC2 instances manually and then run the Ansible cluster installation scripts described <a href="ansible.html">here</a> to manage the lifecycle of an AsterixDB cluster on those EC2 instances.</p>
<p>This installation option instead automates both the AWS EC2 and the AsterixDB steps, so that you run a single script for each of the deploy, start, stop, and terminate steps of an AsterixDB cluster on AWS.</p></div>
<div class="section">
<h2><a name="Prerequisites" id="Prerequisites">Prerequisites</a></h2>
<ul>

<li>

<p>Supported operating systems for the client: <b>Linux</b> and <b>macOS</b></p>
</li>
<li>

<p>Supported operating systems for Amazon Web Services instances: <b>Linux</b></p>
</li>
<li>

<p>Install pip on your client machine:</p>
<p>CentOS</p>

<div>
<div>
<pre class="source"> $ sudo yum install python-pip
</pre></div></div>

<p>Ubuntu</p>

<div>
<div>
<pre class="source"> $ sudo apt-get install python-pip
</pre></div></div>

<p>macOS (Homebrew has no separate <tt>pip</tt> formula; pip is installed together with Python)</p>

<div>
<div>
<pre class="source"> $ brew install python
</pre></div></div>
</li>
<li>

<p>Install Ansible, boto, and boto3 on your client machine:</p>

<div>
<div>
<pre class="source"> $ pip install ansible
 $ pip install boto
 $ pip install boto3
</pre></div></div>

<p>Note that you might need <tt>sudo</tt> depending on your system configuration.</p>
<p><b>Make sure that the version of Ansible is at least 2.2.1.0</b>:</p>

<div>
<div>
<pre class="source"> $ ansible --version
 ansible 2.2.1.0
</pre></div></div>

<p><b>For users with macOS 10.11+</b>, please create a user-level Ansible configuration file at:</p>

<div>
<div>
<pre class="source"> ~/.ansible.cfg
</pre></div></div>

<p>and add the following configuration:</p>

<div>
<div>
<pre class="source"> [ssh_connection]
 control_path = %(directory)s/%%C
</pre></div></div>
</li>
<li>

<p>Download the AsterixDB distribution package, unzip it, and navigate to <tt>opt/aws/</tt>:</p>

<div>
<div>
<pre class="source"> $ cd opt/aws
</pre></div></div>

<p>The directory <tt>opt/aws</tt> contains the following files and directories:</p>

<div>
<div>
<pre class="source"> README bin conf yaml
</pre></div></div>

<p><tt>bin</tt> contains the scripts that deploy, start, stop, and terminate an AWS-based cluster instance according to the configuration files under <tt>conf</tt>. <tt>yaml</tt> contains the internal Ansible scripts that the shell scripts in <tt>bin</tt> use.</p>
</li>
<li>

<p>Create an AWS account and an IAM user.</p>
<p>Set up a security group that you&#x2019;d like to use for your AWS cluster. <b>The security group should at least allow all TCP connections from anywhere.</b> Provide the name of the security group as the value for the <tt>group</tt> field in <tt>conf/aws_settings.yml</tt>.</p>
</li>
<li>

<p>Retrieve your AWS EC2 key pair name and use it as the value of <tt>keypair</tt> in <tt>conf/aws_settings.yml</tt>.</p>
<p>Retrieve your AWS IAM <tt>access key ID</tt> and use it as the value of <tt>access_key_id</tt> in <tt>conf/aws_settings.yml</tt>.</p>
<p>Retrieve your AWS IAM <tt>secret access key</tt> and use it as the value of <tt>secret_access_key</tt> in <tt>conf/aws_settings.yml</tt>.</p>
<p>Note that the AWS console lets you read or download the <tt>secret access key</tt> only once, when the key is created. If you lose it, you have to create a new access key and delete the old one.</p>
</li>
<li>

<p>Configure your SSH settings by editing <tt>~/.ssh/config</tt> and adding the following entry:</p>

<div>
<div>
<pre class="source"> Host *.amazonaws.com
 IdentityFile &lt;path_of_private_key&gt;
</pre></div></div>

<p>Replace <tt>&lt;path_of_private_key&gt;</tt> with the path to the file that stores the private key of the key pair that you uploaded to AWS and named in <tt>conf/aws_settings.yml</tt>. For example:</p>

<div>
<div>
<pre class="source"> Host *.amazonaws.com
 IdentityFile ~/.ssh/id_rsa
</pre></div></div>
</li>
</ul></div>
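<p>The Ansible version requirement listed in the prerequisites can also be checked from a script. The following is a minimal sketch and not part of the AsterixDB distribution: <tt>version_ge</tt> is a hypothetical helper name, and the parsing assumes that the first dotted number printed by <tt>ansible --version</tt> is the installed version, as in the sample output shown in the prerequisites.</p>

```shell
#!/bin/sh
# Hypothetical helper (not shipped with AsterixDB): succeeds when the dotted
# numeric version $1 is greater than or equal to the dotted numeric version $2.
version_ge() {
    _have=$1
    _want=$2
    while [ -n "$_have" ] || [ -n "$_want" ]; do
        _h=${_have%%.*}
        _w=${_want%%.*}
        # Missing fields count as 0, so 2.2 compares like 2.2.0.0.
        [ "${_h:-0}" -gt "${_w:-0}" ] && return 0
        [ "${_h:-0}" -lt "${_w:-0}" ] && return 1
        # Drop the field that was just compared.
        case $_have in *.*) _have=${_have#*.} ;; *) _have= ;; esac
        case $_want in *.*) _want=${_want#*.} ;; *) _want= ;; esac
    done
    return 0
}

# Assumption: the first dotted number in the output is the Ansible version.
if command -v ansible >/dev/null 2>&1; then
    installed=$(ansible --version 2>/dev/null | grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1)
    if version_ge "$installed" 2.2.1.0; then
        echo "Ansible $installed is recent enough"
    else
        echo "Ansible ${installed:-unknown} is older than 2.2.1.0" >&2
    fi
fi
```

<p>The comparison is purely numeric, so version strings with suffixes such as release-candidate tags would need extra handling.</p>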
<div class="section">
<h2><a name="Cluster_Configuration"></a><a name="config" id="config">Cluster Configuration</a></h2>
<ul>

<li>

<p><b>AWS settings</b>. Edit <tt>conf/aws_settings.yml</tt>. The parameters have the following meanings:</p>

<div>
<div>
<pre class="source"> # The OS image id for ec2 instances.
 image: ami-76fa4116

 # The data center region for ec2 instances.
 region: us-west-2

 # The tag for each ec2 machine. Use different tags for isolation.
 tag: scale_test

 # The name of a security group that appears in your AWS console.
 group: default

 # The name of a key pair that appears in your AWS console.
 keypair: &lt;to be filled&gt;

 # The AWS access key id for your IAM user.
 access_key_id: &lt;to be filled&gt;

 # The AWS secret key for your IAM user.
 secret_access_key: &lt;to be filled&gt;

 # The AWS instance type. A full list of available types is at:
 # https://aws.amazon.com/ec2/instance-types/
 instance_type: t2.micro

 # The number of ec2 instances that construct a cluster.
 count: 3

 # The user name.
 user: ec2-user

 # Whether to reuse one slave machine to host the master process.
 cc_on_nc: false
</pre></div></div>

<p><b>As described in <a href="#Prerequisites">prerequisites</a>, the following parameters must be customized:</b></p>

<div>
<div>
<pre class="source"> # The tag for each ec2 machine. Use different tags for isolation.
 tag: scale_test

 # The name of a security group that appears in your AWS console.
 group: default

 # The name of a key pair that appears in your AWS console.
 keypair: &lt;to be filled&gt;

 # The AWS access key id for your IAM user.
 access_key_id: &lt;to be filled&gt;

 # The AWS secret key for your IAM user.
 secret_access_key: &lt;to be filled&gt;
</pre></div></div>
</li>
<li>

<p><b>Remote working directories</b>. Edit <tt>conf/instance_settings.yml</tt> to change the remote binary directory (the variable <tt>binarydir</tt>) when necessary. By default, the binary directory is placed under the home directory (the value of the Ansible built-in variable <tt>ansible_env.HOME</tt>) of the ssh user account on each node.</p>
</li>
</ul></div>
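<p>Before deploying, it can help to confirm that no required setting still carries the <tt>&lt;to be filled&gt;</tt> placeholder. The following is a minimal sketch under stated assumptions: <tt>unfilled_settings</tt> is a hypothetical helper name, the settings file is assumed to be <tt>conf/aws_settings.yml</tt> relative to the current directory, and the file is assumed to use the flat <tt>key: value</tt> layout shown in the listing above.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the name of every setting in the YAML file $1
# whose value is still the literal "<to be filled>" placeholder.
unfilled_settings() {
    sed -n -E 's/^[[:space:]]*([A-Za-z_]+):[[:space:]]*<to be filled>.*$/\1/p' "$1"
}

# Report leftover placeholders only when the settings file is present.
settings=conf/aws_settings.yml
if [ -f "$settings" ]; then
    missing=$(unfilled_settings "$settings")
    if [ -n "$missing" ]; then
        echo "Still to be filled in $settings:" >&2
        echo "$missing" >&2
    fi
fi
```

<p>The check only looks for the placeholder text; it does not validate that the values entered are correct AWS credentials or names.</p>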
<div class="section">
<h2><a name="Cluster_Lifecycle_Management"></a><a name="lifecycle" id="lifecycle">Cluster Lifecycle Management</a></h2>
<ul>

<li>

<p>Allocate AWS EC2 nodes (the number of nodes is specified by <tt>count</tt> in <tt>conf/aws_settings.yml</tt>) and deploy the binary to all allocated EC2 nodes:</p>

<div>
<div>
<pre class="source"> bin/deploy.sh
</pre></div></div>
</li>
<li>

<p>Before starting the AsterixDB cluster, you can modify the instance configuration file <tt>conf/instance/cc.conf</tt>, with the exception of the IP addresses/DNS names, which are generated and cannot be changed. All available parameters and their usage can be found <a href="ncservice.html#Parameters">here</a>.</p>
</li>
<li>

<p>Launch your AsterixDB cluster on EC2:</p>

<div>
<div>
<pre class="source"> bin/start.sh
</pre></div></div>

<p>Now you can use the multi-node AsterixDB cluster on EC2 by opening the master node listed in <tt>conf/instance/inventory</tt> at port <tt>19001</tt> (which can be customized in <tt>conf/instance/cc.conf</tt>) in your browser.</p>
</li>
<li>

<p>If you want to stop the AWS-based AsterixDB cluster, run the following script:</p>

<div>
<div>
<pre class="source"> bin/stop.sh
</pre></div></div>

<p>Note that this only stops AsterixDB but does not stop the EC2 nodes.</p>
</li>
<li>

<p>If you want to terminate the EC2 nodes that run the AsterixDB cluster, run the following script:</p>

<div>
<div>
<pre class="source"> bin/terminate.sh
</pre></div></div>

<p><b>Note that this will destroy everything in the AsterixDB cluster you installed and terminate all EC2 nodes of the cluster.</b></p>
</li>
</ul></div>
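<p>The lifecycle scripts above are meant to be run in order. The following is a minimal sketch of a wrapper that runs several of them in sequence and stops at the first failure. It is not part of the AsterixDB distribution: <tt>run_steps</tt> is a hypothetical function name, and the sketch assumes only that the scripts <tt>deploy.sh</tt>, <tt>start.sh</tt>, <tt>stop.sh</tt>, and <tt>terminate.sh</tt> exist in the given directory and exit with a non-zero status on failure.</p>

```shell
#!/bin/sh
# Hypothetical wrapper: run the named lifecycle scripts from directory $1 in
# the order given, stopping at the first unknown or failing step.
run_steps() {
    _dir=$1
    shift
    for _step in "$@"; do
        case $_step in
            deploy|start|stop|terminate) ;;
            *) echo "unknown step: $_step" >&2; return 2 ;;
        esac
        echo "running $_dir/$_step.sh"
        "$_dir/$_step.sh" || { echo "$_step failed" >&2; return 1; }
    done
}

# Example (not executed here, since it would allocate real EC2 nodes):
#   run_steps bin deploy start
```

<p>Keeping <tt>terminate</tt> as an explicit step rather than a default avoids destroying a cluster by accident.</p>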
 </div>
 </div>
 </div>
 <hr/>
 <footer>
 <div class="container-fluid">
 <div class="row-fluid">
<div class="row-fluid">Apache AsterixDB, AsterixDB, Apache, the Apache
 feather logo, and the Apache AsterixDB project logo are either
 registered trademarks or trademarks of The Apache Software
 Foundation in the United States and other countries.
 All other marks mentioned may be trademarks or registered
 trademarks of their respective owners.
 </div>
 </div>
 </div>
 </footer>
 </body>
</html>