| <!DOCTYPE html> |
| <!-- |
| | Generated by Apache Maven Doxia Site Renderer 1.8.1 from target/generated-site/markdown/aws.md at 2021-07-16 |
| | Rendered using Apache Maven Fluido Skin 1.7 |
| --> |
| <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> |
| <head> |
| <meta charset="UTF-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
| <meta name="Date-Revision-yyyymmdd" content="20210716" /> |
| <meta http-equiv="Content-Language" content="en" /> |
| <title>AsterixDB – Installation using Amazon Web Services</title> |
| <link rel="stylesheet" href="./css/apache-maven-fluido-1.7.min.css" /> |
| <link rel="stylesheet" href="./css/site.css" /> |
| <link rel="stylesheet" href="./css/print.css" media="print" /> |
| <script type="text/javascript" src="./js/apache-maven-fluido-1.7.min.js"></script> |
| |
| </head> |
| <body class="topBarDisabled"> |
| <div class="container-fluid"> |
| <div id="banner"> |
| <div class="pull-left"><a href="./" id="bannerLeft"><img src="images/asterixlogo.png" alt="AsterixDB"/></a></div> |
| <div class="pull-right"></div> |
| <div class="clear"><hr/></div> |
| </div> |
| |
| <div id="breadcrumbs"> |
| <ul class="breadcrumb"> |
| <li id="publishDate">Last Published: 2021-07-16</li> |
| <li id="projectVersion" class="pull-right">Version: 0.9.7</li> |
| <li class="pull-right"><a href="index.html" title="Documentation Home">Documentation Home</a></li> |
| </ul> |
| </div> |
| <div class="row-fluid"> |
| <div id="leftColumn" class="span2"> |
| <div class="well sidebar-nav"> |
| <ul class="nav nav-list"> |
| <li class="nav-header">Get Started - Installation</li> |
| <li><a href="ncservice.html" title="Option 1: using NCService"><span class="none"></span>Option 1: using NCService</a></li> |
| <li><a href="ansible.html" title="Option 2: using Ansible"><span class="none"></span>Option 2: using Ansible</a></li> |
| <li class="active"><a href="#"><span class="none"></span>Option 3: using Amazon Web Services</a></li> |
| <li class="nav-header">AsterixDB Primer</li> |
| <li><a href="sqlpp/primer-sqlpp.html" title="Using SQL++"><span class="none"></span>Using SQL++</a></li> |
| <li class="nav-header">Data Model</li> |
| <li><a href="datamodel.html" title="The Asterix Data Model"><span class="none"></span>The Asterix Data Model</a></li> |
| <li class="nav-header">Queries</li> |
| <li><a href="sqlpp/manual.html" title="The SQL++ Query Language"><span class="none"></span>The SQL++ Query Language</a></li> |
| <li><a href="SQLPP.html" title="Raw SQL++ Grammar"><span class="none"></span>Raw SQL++ Grammar</a></li> |
| <li><a href="sqlpp/builtins.html" title="Builtin Functions"><span class="none"></span>Builtin Functions</a></li> |
| <li class="nav-header">API/SDK</li> |
| <li><a href="api.html" title="HTTP API"><span class="none"></span>HTTP API</a></li> |
| <li><a href="csv.html" title="CSV Output"><span class="none"></span>CSV Output</a></li> |
| <li class="nav-header">Advanced Features</li> |
| <li><a href="aql/externaldata.html" title="Accessing External Data"><span class="none"></span>Accessing External Data</a></li> |
| <li><a href="feeds.html" title="Data Ingestion with Feeds"><span class="none"></span>Data Ingestion with Feeds</a></li> |
| <li><a href="udf.html" title="User Defined Functions"><span class="none"></span>User Defined Functions</a></li> |
| <li><a href="sqlpp/filters.html" title="Filter-Based LSM Index Acceleration"><span class="none"></span>Filter-Based LSM Index Acceleration</a></li> |
| <li><a href="sqlpp/fulltext.html" title="Support of Full-text Queries"><span class="none"></span>Support of Full-text Queries</a></li> |
| <li><a href="sqlpp/similarity.html" title="Support of Similarity Queries"><span class="none"></span>Support of Similarity Queries</a></li> |
| <li><a href="interval_join.html" title="Support of Interval Joins"><span class="none"></span>Support of Interval Joins</a></li> |
| <li><a href="sqlpp/arrayindex.html" title="Support of Array Indexes"><span class="none"></span>Support of Array Indexes</a></li> |
| <li class="nav-header">Deprecated</li> |
| <li><a href="aql/primer.html" title="AsterixDB Primer: Using AQL"><span class="none"></span>AsterixDB Primer: Using AQL</a></li> |
| <li><a href="aql/manual.html" title="Queries: The Asterix Query Language (AQL)"><span class="none"></span>Queries: The Asterix Query Language (AQL)</a></li> |
| <li><a href="aql/builtins.html" title="Queries: Builtin Functions (AQL)"><span class="none"></span>Queries: Builtin Functions (AQL)</a></li> |
| </ul> |
| <hr /> |
| <div id="poweredBy"> |
| <div class="clear"></div> |
| <div class="clear"></div> |
| <div class="clear"></div> |
| <div class="clear"></div> |
| <a href="./" title="AsterixDB" class="builtBy"><img class="builtBy" alt="AsterixDB" src="images/asterixlogo.png" /></a> |
| </div> |
| </div> |
| </div> |
| <div id="bodyColumn" class="span10" > |
| <!-- |
| ! Licensed to the Apache Software Foundation (ASF) under one |
| ! or more contributor license agreements. See the NOTICE file |
| ! distributed with this work for additional information |
| ! regarding copyright ownership. The ASF licenses this file |
| ! to you under the Apache License, Version 2.0 (the |
| ! "License"); you may not use this file except in compliance |
| ! with the License. You may obtain a copy of the License at |
| ! |
| ! http://www.apache.org/licenses/LICENSE-2.0 |
| ! |
| ! Unless required by applicable law or agreed to in writing, |
| ! software distributed under the License is distributed on an |
| ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| ! KIND, either express or implied. See the License for the |
| ! specific language governing permissions and limitations |
| ! under the License. |
| !--> |
| <h1>Installation using Amazon Web Services</h1> |
| <div class="section"> |
| <h2><a name="Table_of_Contents"></a><a name="atoc" id="#toc">Table of Contents</a></h2> |
| <ul> |
| |
| <li><a href="#Introduction">Introduction</a></li> |
| <li><a href="#Prerequisites">Prerequisites</a></li> |
| <li><a href="#config">Cluster Configuration</a></li> |
| <li><a href="#lifecycle">Cluster Lifecycle Management</a></li> |
| </ul> |
| </div> |
| <div class="section"> |
| <h2><a name="Introduction" id="Introduction">Introduction</a></h2> |
| <p>You can always launch a number of Amazon Web Services (AWS) EC2 instances manually and then run the Ansible cluster installation scripts described <a href="ansible.html">here</a> to manage the lifecycle of an AsterixDB cluster on those instances.</p> |
| <p>This installation option instead automates both the EC2 provisioning and the AsterixDB installation: a single script handles each of deploying, starting, stopping, and terminating an AsterixDB cluster on AWS.</p></div> |
| <div class="section"> |
| <h2><a name="Prerequisites" id="Prerequisites">Prerequisites</a></h2> |
| <ul> |
| |
| <li> |
| |
| <p>Supported operating systems for the client: <b>Linux</b> and <b>macOS</b></p> |
| </li> |
| <li> |
| |
| <p>Supported operating systems for Amazon Web Services instances: <b>Linux</b></p> |
| </li> |
| <li> |
| |
| <p>Install pip on your client machine:</p> |
| <p>CentOS</p> |
| |
| <div> |
| <div> |
| <pre class="source"> $ sudo yum install python-pip |
| </pre></div></div> |
| |
| <p>Ubuntu</p> |
| |
| <div> |
| <div> |
| <pre class="source"> $ sudo apt-get install python-pip |
| </pre></div></div> |
| |
| <p>macOS</p> |
| |
| <div> |
| <div> |
| <pre class="source"> $ brew install pip |
| </pre></div></div> |
| </li> |
| <li> |
| |
| <p>Install Ansible, boto, and boto3 on your client machine:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> $ pip install ansible |
| $ pip install boto |
| $ pip install boto3 |
| </pre></div></div> |
| |
| <p>Note that you might need <tt>sudo</tt> depending on your system configuration.</p> |
| <p><b>Make sure that the version of Ansible is no less than 2.2.1.0</b>:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> $ ansible --version |
| ansible 2.2.1.0 |
| </pre></div></div> |
| |
| <p><b>For users with macOS 10.11+</b>, please create a user-level Ansible configuration file at:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> ~/.ansible.cfg |
| </pre></div></div> |
| |
| <p>and add the following configuration:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> [ssh_connection] |
| control_path = %(directory)s/%%C |
| </pre></div></div> |
| </li> |
| <li> |
| |
| <p>Download the AsterixDB distribution package, unzip it, and navigate to <tt>opt/aws/</tt>:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> $ cd opt/aws |
| </pre></div></div> |
| |
| <p>The following files and directories are in the directory <tt>opt/aws</tt>:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> README bin conf yaml |
| </pre></div></div> |
| |
| <p><tt>bin</tt> contains scripts that start and terminate an AWS-based cluster instance, according to the configuration specified in files under <tt>conf</tt>, and <tt>yaml</tt> contains internal Ansible scripts that the shell scripts in <tt>bin</tt> use.</p> |
| </li> |
| <li> |
| |
| <p>Create an AWS account and an IAM user.</p> |
| <p>Set up a security group that you’d like to use for your AWS cluster. <b>The security group should at least allow all TCP connections from anywhere.</b> Provide the name of the security group as the value for the <tt>group</tt> field in <tt>conf/aws_settings.yml</tt>.</p> |
| </li> |
| <li> |
| |
| <p>Retrieve your AWS EC2 key pair name and use it as the value of <tt>keypair</tt> in <tt>conf/aws_settings.yml</tt>.</p> |
| <p>Retrieve your AWS IAM <tt>access key ID</tt> and use it as the value of <tt>access_key_id</tt> in <tt>conf/aws_settings.yml</tt>.</p> |
| <p>Retrieve your AWS IAM <tt>secret access key</tt> and use it as the value of <tt>secret_access_key</tt> in <tt>conf/aws_settings.yml</tt>.</p> |
| <p>Note that you can only read or download <tt>access key ID</tt> and <tt>secret access key</tt> once from your AWS console. If you forget them, you have to create new keys and delete the old ones.</p> |
| </li> |
| <li> |
| |
| <p>Configure your SSH settings by editing <tt>~/.ssh/config</tt> and adding the following entry:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> Host *.amazonaws.com |
| IdentityFile &lt;path_of_private_key&gt; |
| </pre></div></div> |
| |
| <p>Note that <tt>&lt;path_of_private_key&gt;</tt> should be replaced by the path to the file that stores the private key for the key pair that you uploaded to AWS and used in <tt>conf/aws_settings.yml</tt>. For example:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> Host *.amazonaws.com |
| IdentityFile ~/.ssh/id_rsa |
| </pre></div></div> |
| </li> |
| </ul></div> |
| <div class="section"> |
| <h2><a name="Cluster_Configuration"></a><a name="config" id="config">Cluster Configuration</a></h2> |
| <ul> |
| |
| <li> |
| |
| <p><b>AWS settings</b>. Edit <tt>conf/aws_settings.yml</tt>. Each parameter is described below:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> # The OS image id for ec2 instances. |
| image: ami-76fa4116 |
| |
| # The data center region for ec2 instances. |
| region: us-west-2 |
| |
| # The tag for each ec2 machine. Use different tags for isolation. |
| tag: scale_test |
| |
| # The name of a security group that appears in your AWS console. |
| group: default |
| |
| # The name of a key pair that appears in your AWS console. |
| keypair: &lt;to be filled&gt; |
| |
| # The AWS access key id for your IAM user. |
| access_key_id: &lt;to be filled&gt; |
| |
| # The AWS secret key for your IAM user. |
| secret_access_key: &lt;to be filled&gt; |
| |
| # The AWS instance type. A full list of available types is at: |
| # https://aws.amazon.com/ec2/instance-types/ |
| instance_type: t2.micro |
| |
| # The number of ec2 instances that construct a cluster. |
| count: 3 |
| |
| # The user name. |
| user: ec2-user |
| |
| # Whether to reuse one slave machine to host the master process. |
| cc_on_nc: false |
| </pre></div></div> |
| |
| <p><b>As described in <a href="#Prerequisites">prerequisites</a>, the following parameters must be customized:</b></p> |
| |
| <div> |
| <div> |
| <pre class="source"> # The tag for each ec2 machine. Use different tags for isolation. |
| tag: scale_test |
| |
| # The name of a security group that appears in your AWS console. |
| group: default |
| |
| # The name of a key pair that appears in your AWS console. |
| keypair: &lt;to be filled&gt; |
| |
| # The AWS access key id for your IAM user. |
| access_key_id: &lt;to be filled&gt; |
| |
| # The AWS secret key for your IAM user. |
| secret_access_key: &lt;to be filled&gt; |
| </pre></div></div> |
| </li> |
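| <li> |
| |
| <p><b>Example</b>. As a minimal sketch, the customized part of <tt>conf/aws_settings.yml</tt> might look as follows once it is filled in. All values here are hypothetical placeholders rather than defaults shipped with AsterixDB: the tag, security group, and key pair names are made up, and the two keys are the dummy credentials that AWS uses in its own documentation. Substitute the names and keys from your own AWS account:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> # Hypothetical tag that isolates this cluster from your other EC2 machines. |
| tag: asterixdb_demo |
| |
| # Hypothetical security group name; it must already exist in your AWS console. |
| group: asterixdb-sg |
| |
| # Hypothetical key pair name; it must match a key pair in your AWS console. |
| keypair: my-ec2-keypair |
| |
| # Dummy access key id taken from the AWS documentation, not a working key. |
| access_key_id: AKIAIOSFODNN7EXAMPLE |
| |
| # Dummy secret key taken from the AWS documentation, not a working key. |
| secret_access_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
| </pre></div></div> |
| |
| <p>Because the file then holds a working secret key, it is advisable to keep it out of version control and shared directories.</p> |
| </li> |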
| <li> |
| |
| <p><b>Remote working directories</b>. Edit <tt>conf/instance_settings.yml</tt> to change the remote binary directory (the variable <tt>binarydir</tt>) when necessary. By default, the binary directory is under the home directory of the SSH user account on each node (the value of the Ansible built-in variable <tt>ansible_env.HOME</tt>).</p> |
| </li> |
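| <li> |
| |
| <p><b>Example</b>. A sketch of overriding the remote binary directory in <tt>conf/instance_settings.yml</tt>. The path below is a hypothetical example, and the sketch assumes that the file accepts a plain YAML key-value entry for <tt>binarydir</tt>; check the comments in the shipped file before editing it:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> # Hypothetical directory on each EC2 node, used instead of the default |
| # location under the home directory of the SSH user. |
| binarydir: /mnt/data/asterixdb |
| </pre></div></div> |
| </li> |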
| </ul></div> |
| <div class="section"> |
| <h2><a name="Cluster_Lifecycle_Management"></a><a name="lifecycle" id="lifecycle">Cluster Lifecycle Management</a></h2> |
| <ul> |
| |
| <li> |
| |
| <p>Allocate AWS EC2 nodes (the number of nodes is specified by the <tt>count</tt> parameter in the AWS settings described above) and deploy the binary to all allocated EC2 nodes:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> bin/deploy.sh |
| </pre></div></div> |
| </li> |
| <li> |
| |
| <p>Before starting the AsterixDB cluster, you can modify the instance configuration file <tt>conf/instance/cc.conf</tt>, with the exception of the IP addresses/DNS names, which are generated and cannot be changed. All available parameters and their usage can be found <a href="ncservice.html#Parameters">here</a>.</p> |
| </li> |
| <li> |
| |
| <p>Launch your AsterixDB cluster on EC2:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> bin/start.sh |
| </pre></div></div> |
| |
| <p>Now you can use the multi-node AsterixDB cluster on EC2 by opening the master node listed in <tt>conf/instance/inventory</tt> at port <tt>19001</tt> (which can be customized in <tt>conf/instance/cc.conf</tt>) in your browser.</p> |
| </li> |
| <li> |
| |
| <p>If you want to stop the AWS-based AsterixDB cluster, run the following script:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> bin/stop.sh |
| </pre></div></div> |
| |
| <p>Note that this only stops AsterixDB but does not stop the EC2 nodes.</p> |
| </li> |
| <li> |
| |
| <p>If you want to terminate the EC2 nodes that run the AsterixDB cluster, run the following script:</p> |
| |
| <div> |
| <div> |
| <pre class="source"> bin/terminate.sh |
| </pre></div></div> |
| |
| <p><b>Note that it will destroy everything in the AsterixDB cluster you installed and terminate all EC2 nodes for the cluster.</b></p> |
| </li> |
| </ul></div> |
| </div> |
| </div> |
| </div> |
| <hr/> |
| <footer> |
| <div class="container-fluid"> |
| <div class="row-fluid"> |
| <div class="row-fluid">Apache AsterixDB, AsterixDB, Apache, the Apache |
| feather logo, and the Apache AsterixDB project logo are either |
| registered trademarks or trademarks of The Apache Software |
| Foundation in the United States and other countries. |
| All other marks mentioned may be trademarks or registered |
| trademarks of their respective owners. |
| </div> |
| </div> |
| </div> |
| </footer> |
| </body> |
| </html> |