<!DOCTYPE html>
<!--
 | Generated by Apache Maven Doxia Site Renderer 1.8.1 from target/generated-site/markdown/aws.md at 2021-12-13
 | Rendered using Apache Maven Fluido Skin 1.7
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="Date-Revision-yyyymmdd" content="20211213" />
<meta http-equiv="Content-Language" content="en" />
<title>AsterixDB &#x2013; Installation using Amazon Web Services</title>
<link rel="stylesheet" href="./css/apache-maven-fluido-1.7.min.css" />
<link rel="stylesheet" href="./css/site.css" />
<link rel="stylesheet" href="./css/print.css" media="print" />
<script type="text/javascript" src="./js/apache-maven-fluido-1.7.min.js"></script>

</head>
<body class="topBarDisabled">
<div class="container-fluid">
<div id="banner">
<div class="pull-left"><a href="./" id="bannerLeft"><img src="images/asterixlogo.png" alt="AsterixDB"/></a></div>
<div class="pull-right"></div>
<div class="clear"><hr/></div>
</div>

<div id="breadcrumbs">
<ul class="breadcrumb">
<li id="publishDate">Last Published: 2021-12-13</li>
<li id="projectVersion" class="pull-right">Version: 0.9.7.1</li>
<li class="pull-right"><a href="index.html" title="Documentation Home">Documentation Home</a></li>
</ul>
</div>
<div class="row-fluid">
<div id="leftColumn" class="span2">
<div class="well sidebar-nav">
<ul class="nav nav-list">
<li class="nav-header">Get Started - Installation</li>
<li><a href="ncservice.html" title="Option 1: using NCService"><span class="none"></span>Option 1: using NCService</a></li>
<li><a href="ansible.html" title="Option 2: using Ansible"><span class="none"></span>Option 2: using Ansible</a></li>
<li class="active"><a href="#"><span class="none"></span>Option 3: using Amazon Web Services</a></li>
<li class="nav-header">AsterixDB Primer</li>
<li><a href="sqlpp/primer-sqlpp.html" title="Using SQL++"><span class="none"></span>Using SQL++</a></li>
<li class="nav-header">Data Model</li>
<li><a href="datamodel.html" title="The Asterix Data Model"><span class="none"></span>The Asterix Data Model</a></li>
<li class="nav-header">Queries</li>
<li><a href="sqlpp/manual.html" title="The SQL++ Query Language"><span class="none"></span>The SQL++ Query Language</a></li>
<li><a href="SQLPP.html" title="Raw SQL++ Grammar"><span class="none"></span>Raw SQL++ Grammar</a></li>
<li><a href="sqlpp/builtins.html" title="Builtin Functions"><span class="none"></span>Builtin Functions</a></li>
<li class="nav-header">API/SDK</li>
<li><a href="api.html" title="HTTP API"><span class="none"></span>HTTP API</a></li>
<li><a href="csv.html" title="CSV Output"><span class="none"></span>CSV Output</a></li>
<li class="nav-header">Advanced Features</li>
<li><a href="aql/externaldata.html" title="Accessing External Data"><span class="none"></span>Accessing External Data</a></li>
<li><a href="feeds.html" title="Data Ingestion with Feeds"><span class="none"></span>Data Ingestion with Feeds</a></li>
<li><a href="udf.html" title="User Defined Functions"><span class="none"></span>User Defined Functions</a></li>
<li><a href="sqlpp/filters.html" title="Filter-Based LSM Index Acceleration"><span class="none"></span>Filter-Based LSM Index Acceleration</a></li>
<li><a href="sqlpp/fulltext.html" title="Support of Full-text Queries"><span class="none"></span>Support of Full-text Queries</a></li>
<li><a href="sqlpp/similarity.html" title="Support of Similarity Queries"><span class="none"></span>Support of Similarity Queries</a></li>
<li><a href="interval_join.html" title="Support of Interval Joins"><span class="none"></span>Support of Interval Joins</a></li>
<li><a href="sqlpp/arrayindex.html" title="Support of Array Indexes"><span class="none"></span>Support of Array Indexes</a></li>
<li class="nav-header">Deprecated</li>
<li><a href="aql/primer.html" title="AsterixDB Primer: Using AQL"><span class="none"></span>AsterixDB Primer: Using AQL</a></li>
<li><a href="aql/manual.html" title="Queries: The Asterix Query Language (AQL)"><span class="none"></span>Queries: The Asterix Query Language (AQL)</a></li>
<li><a href="aql/builtins.html" title="Queries: Builtin Functions (AQL)"><span class="none"></span>Queries: Builtin Functions (AQL)</a></li>
</ul>
<hr />
<div id="poweredBy">
<div class="clear"></div>
<div class="clear"></div>
<div class="clear"></div>
<div class="clear"></div>
<a href="./" title="AsterixDB" class="builtBy"><img class="builtBy" alt="AsterixDB" src="images/asterixlogo.png" /></a>
</div>
</div>
</div>
<div id="bodyColumn" class="span10" >
<!--
 ! Licensed to the Apache Software Foundation (ASF) under one
 ! or more contributor license agreements. See the NOTICE file
 ! distributed with this work for additional information
 ! regarding copyright ownership. The ASF licenses this file
 ! to you under the Apache License, Version 2.0 (the
 ! "License"); you may not use this file except in compliance
 ! with the License. You may obtain a copy of the License at
 !
 ! http://www.apache.org/licenses/LICENSE-2.0
 !
 ! Unless required by applicable law or agreed to in writing,
 ! software distributed under the License is distributed on an
 ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 ! KIND, either express or implied. See the License for the
 ! specific language governing permissions and limitations
 ! under the License.
 !-->
<h1>Installation using Amazon Web Services</h1>
<div class="section">
<h2><a name="Table_of_Contents"></a><a name="atoc" id="#toc">Table of Contents</a></h2>
<ul>

<li><a href="#Introduction">Introduction</a></li>
<li><a href="#Prerequisites">Prerequisites</a></li>
<li><a href="#config">Cluster Configuration</a></li>
<li><a href="#lifecycle">Cluster Lifecycle Management</a></li>
</ul>
</div>
<div class="section">
<h2><a name="Introduction" id="Introduction">Introduction</a></h2>
<p>You can always launch a number of Amazon Web Services EC2 instances manually and then run the Ansible cluster installation scripts described <a href="ansible.html">here</a> to manage the lifecycle of an AsterixDB cluster on those instances.</p>
<p>This installation option instead automates both the AWS EC2 provisioning and the AsterixDB installation: a single script handles each of the deploy, start, stop, and terminate steps for an AsterixDB cluster on AWS.</p></div>
<div class="section">
<h2><a name="Prerequisites" id="Prerequisites">Prerequisites</a></h2>
<ul>

<li>

<p>Supported operating systems for the client: <b>Linux</b> and <b>macOS</b></p>
</li>
<li>

<p>Supported operating systems for Amazon Web Services instances: <b>Linux</b></p>
</li>
<li>

<p>Install pip on your client machine:</p>
<p>CentOS</p>

<div>
<div>
<pre class="source"> $ sudo yum install python-pip
</pre></div></div>

<p>Ubuntu</p>

<div>
<div>
<pre class="source"> $ sudo apt-get install python-pip
</pre></div></div>

<p>macOS (Homebrew installs pip together with Python)</p>

<div>
<div>
<pre class="source"> $ brew install python
</pre></div></div>
</li>
<li>

<p>Install Ansible, boto, and boto3 on your client machine:</p>

<div>
<div>
<pre class="source"> $ pip install ansible
 $ pip install boto
 $ pip install boto3
</pre></div></div>

<p>Note that you might need <tt>sudo</tt> depending on your system configuration.</p>
<p><b>Make sure that the Ansible version is at least 2.2.1.0</b>:</p>

<div>
<div>
<pre class="source"> $ ansible --version
 ansible 2.2.1.0
</pre></div></div>

<p><b>For users with macOS 10.11+</b>, please create a user-level Ansible configuration file at:</p>

<div>
<div>
<pre class="source"> ~/.ansible.cfg
</pre></div></div>

<p>and add the following configuration:</p>

<div>
<div>
<pre class="source"> [ssh_connection]
 control_path = %(directory)s/%%C
</pre></div></div>
</li>
<li>

<p>Download the AsterixDB distribution package, unzip it, and navigate to <tt>opt/aws/</tt>:</p>

<div>
<div>
<pre class="source"> $ cd opt/aws
</pre></div></div>

<p>The directory <tt>opt/aws</tt> contains the following files and directories:</p>

<div>
<div>
<pre class="source"> README bin conf yaml
</pre></div></div>

<p><tt>bin</tt> contains the scripts that start and terminate an AWS-based cluster instance according to the configuration files under <tt>conf</tt>, and <tt>yaml</tt> contains the internal Ansible scripts that the shell scripts in <tt>bin</tt> use.</p>
</li>
<li>

<p>Create an AWS account and an IAM user.</p>
<p>Set up a security group that you&#x2019;d like to use for your AWS cluster. <b>The security group should at least allow all TCP connections from anywhere.</b> Provide the name of the security group as the value for the <tt>group</tt> field in <tt>conf/aws_settings.yml</tt>.</p>
</li>
<li>

<p>Retrieve your AWS EC2 key pair name and use it as the <tt>keypair</tt> in <tt>conf/aws_settings.yml</tt>.</p>
<p>Retrieve your AWS IAM <tt>access key ID</tt> and use it as the <tt>access_key_id</tt> in <tt>conf/aws_settings.yml</tt>.</p>
<p>Retrieve your AWS IAM <tt>secret access key</tt> and use it as the <tt>secret_access_key</tt> in <tt>conf/aws_settings.yml</tt>.</p>
<p>Note that you can read or download the <tt>access key ID</tt> and <tt>secret access key</tt> only once from your AWS console. If you lose them, you have to create new keys and delete the old ones.</p>
</li>
<li>

<p>Configure your SSH settings by editing <tt>~/.ssh/config</tt> and adding the following entry:</p>

<div>
<div>
<pre class="source"> Host *.amazonaws.com
 IdentityFile &lt;path_of_private_key&gt;
</pre></div></div>

<p>Note that <tt>&lt;path_of_private_key&gt;</tt> should be replaced by the path to the file that stores the private key for the key pair that you uploaded to AWS and used in <tt>conf/aws_settings.yml</tt>. For example:</p>

<div>
<div>
<pre class="source"> Host *.amazonaws.com
 IdentityFile ~/.ssh/id_rsa
</pre></div></div>
</li>
</ul></div>
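<div class="section">
<p>The Ansible version requirement in the prerequisites can be checked before going further. The following is only a minimal sketch and is not part of the AsterixDB distribution: the helper name <tt>version_ge</tt> is hypothetical, and the sketch assumes a <tt>sort</tt> that supports the <tt>-V</tt> (version sort) option, as GNU coreutils does.</p>

<div>
<div>
<pre class="source"># Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    # sort -V orders version strings; $1 is new enough when $2 sorts first (or both are equal).
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum Ansible version stated in the prerequisites.
required="2.2.1.0"

# Take the first line of "ansible --version" and keep only the version number.
# The result is empty when ansible is not installed.
installed="$(ansible --version | head -n 1 | sed 's/[^0-9]*\([0-9][0-9.]*\).*/\1/')"

if [ -z "$installed" ]; then
    echo "ansible was not found on this machine"
elif version_ge "$installed" "$required"; then
    echo "ansible $installed satisfies the minimum version $required"
else
    echo "ansible $installed is older than the minimum version $required"
fi
</pre></div></div>

<p>A version sort is used instead of a plain string comparison because a string comparison would rank <tt>2.10.0</tt> below <tt>2.2.1.0</tt>.</p>
</div>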
<div class="section">
<h2><a name="Cluster_Configuration"></a><a name="config" id="config">Cluster Configuration</a></h2>
<ul>

<li>

<p><b>AWS settings</b>. Edit <tt>conf/aws_settings.yml</tt>. The parameters have the following meanings:</p>

<div>
<div>
<pre class="source"> # The OS image id for ec2 instances.
 image: ami-76fa4116

 # The data center region for ec2 instances.
 region: us-west-2

 # The tag for each ec2 machine. Use different tags for isolation.
 tag: scale_test

 # The name of a security group that appears in your AWS console.
 group: default

 # The name of a key pair that appears in your AWS console.
 keypair: &lt;to be filled&gt;

 # The AWS access key id for your IAM user.
 access_key_id: &lt;to be filled&gt;

 # The AWS secret key for your IAM user.
 secret_access_key: &lt;to be filled&gt;

 # The AWS instance type. A full list of available types is at:
 # https://aws.amazon.com/ec2/instance-types/
 instance_type: t2.micro

 # The number of ec2 instances that make up a cluster.
 count: 3

 # The user name.
 user: ec2-user

 # Whether to reuse one slave machine to host the master process.
 cc_on_nc: false
</pre></div></div>

<p><b>As described in <a href="#Prerequisites">prerequisites</a>, the following parameters must be customized:</b></p>

<div>
<div>
<pre class="source"> # The tag for each ec2 machine. Use different tags for isolation.
 tag: scale_test

 # The name of a security group that appears in your AWS console.
 group: default

 # The name of a key pair that appears in your AWS console.
 keypair: &lt;to be filled&gt;

 # The AWS access key id for your IAM user.
 access_key_id: &lt;to be filled&gt;

 # The AWS secret key for your IAM user.
 secret_access_key: &lt;to be filled&gt;
</pre></div></div>
</li>
<li>

<p><b>Remote working directories</b>. Edit <tt>conf/instance_settings.yml</tt> to change the remote binary directory (the variable &#x201c;binarydir&#x201d;) when necessary. By default, the binary directory is under the home directory (the value of the Ansible built-in variable <tt>ansible_env.HOME</tt>) of the ssh user account on each node.</p>
</li>
</ul></div>
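<div class="section">
<p>Before deploying, it can help to confirm that no setting still carries the <tt>&lt;to be filled&gt;</tt> placeholder. The following is a minimal sketch, not a script shipped with AsterixDB: the helper name <tt>unfilled_settings</tt> is hypothetical, and the default path assumes that the AWS settings live in <tt>conf/aws_settings.yml</tt>, as the prerequisites state, and that the sketch is run from <tt>opt/aws</tt>.</p>

<div>
<div>
<pre class="source"># Path of the AWS settings file; pass a different path as the first argument if needed.
settings="${1:-conf/aws_settings.yml}"

# Hypothetical helper: prints every line (with its line number) that still
# contains the placeholder text, and fails when no such line exists.
unfilled_settings() {
    grep -n 'to be filled' "$1"
}

if [ ! -f "$settings" ]; then
    echo "cannot find $settings; run this from the opt/aws directory"
elif unfilled_settings "$settings"; then
    echo "fill in the settings listed above before running bin/deploy.sh"
else
    echo "no placeholders are left in $settings"
fi
</pre></div></div>

<p>The check only detects leftover placeholders. It cannot tell whether the key pair name or the access keys that you entered are valid.</p>
</div>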
<div class="section">
<h2><a name="Cluster_Lifecycle_Management"></a><a name="lifecycle" id="lifecycle">Cluster Lifecycle Management</a></h2>
<ul>

<li>

<p>Allocate AWS EC2 nodes (the number of nodes is specified by <tt>count</tt> in <tt>conf/aws_settings.yml</tt>) and deploy the binary to all allocated EC2 nodes:</p>

<div>
<div>
<pre class="source"> bin/deploy.sh
</pre></div></div>
</li>
<li>

<p>Before starting the AsterixDB cluster, you can modify the instance configuration file <tt>conf/instance/cc.conf</tt>, with the exception of the IP addresses/DNS names, which are generated and cannot be changed. All available parameters and their usage can be found <a href="ncservice.html#Parameters">here</a>.</p>
</li>
<li>

<p>Launch your AsterixDB cluster on EC2:</p>

<div>
<div>
<pre class="source"> bin/start.sh
</pre></div></div>

<p>Now you can use the multi-node AsterixDB cluster on EC2 by opening the master node listed in <tt>conf/instance/inventory</tt> at port <tt>19001</tt> (which can be customized in <tt>conf/instance/cc.conf</tt>) in your browser.</p>
</li>
<li>

<p>If you want to stop the AWS-based AsterixDB cluster, run the following script:</p>

<div>
<div>
<pre class="source"> bin/stop.sh
</pre></div></div>

<p>Note that this only stops AsterixDB; it does not stop the EC2 nodes.</p>
</li>
<li>

<p>If you want to terminate the EC2 nodes that run the AsterixDB cluster, run the following script:</p>

<div>
<div>
<pre class="source"> bin/terminate.sh
</pre></div></div>

<p><b>Note that this destroys everything in the AsterixDB cluster you installed and terminates all EC2 nodes for the cluster.</b></p>
</li>
</ul></div>
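<div class="section">
<p>The deploy and start steps above are run one after the other, and starting the cluster makes little sense if deployment failed. The following is a minimal sketch of a wrapper that stops at the first failing step. The helper name <tt>run_steps</tt> is hypothetical and is not part of the AsterixDB distribution, and the sketch assumes that each script in <tt>bin</tt> returns a non-zero exit status on failure.</p>

<div>
<div>
<pre class="source"># Hypothetical helper: runs the given scripts in order and stops at the first failure,
# so that bin/start.sh is never attempted after a failed bin/deploy.sh.
run_steps() {
    for step in "$@"; do
        echo "running $step"
        if ! "$step"; then
            echo "$step failed; skipping the remaining steps"
            return 1
        fi
    done
}

# Example (commented out because it allocates real EC2 instances, which may incur charges):
# run_steps bin/deploy.sh bin/start.sh
</pre></div></div>

<p>To use it, uncomment the last line and run the sketch from <tt>opt/aws</tt>. <tt>bin/stop.sh</tt> and <tt>bin/terminate.sh</tt> are deliberately left out of the example, since terminating destroys the cluster and should remain an explicit, separate action.</p>
</div>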
</div>
</div>
</div>
<hr/>
<footer>
<div class="container-fluid">
<div class="row-fluid">
<div class="row-fluid">Apache AsterixDB, AsterixDB, Apache, the Apache
 feather logo, and the Apache AsterixDB project logo are either
 registered trademarks or trademarks of The Apache Software
 Foundation in the United States and other countries.
 All other marks mentioned may be trademarks or registered
 trademarks of their respective owners.
</div>
</div>
</div>
</footer>
</body>
</html>