Revised Managix documentation
diff --git a/asterix-doc/src/site/markdown/install.md b/asterix-doc/src/site/markdown/install.md
index 65aebdf..fb60988 100644
--- a/asterix-doc/src/site/markdown/install.md
+++ b/asterix-doc/src/site/markdown/install.md
@@ -96,21 +96,35 @@
RSA key fingerprint is aa:7b:51:90:74:39:c4:f6:28:a2:9d:47:c2:8d:33:31.
Are you sure you want to continue connecting (yes/no)?
-If you are not prompted for a password, that is if you get an output similar to one shown below, skip to the next section [Configuring Managix](#Configuring_Managix).
+If you are not prompted for a password, that is, if you get output similar to that shown below, you already
+have password-less SSH configured.
$ ssh 127.0.0.1
Last login: Sat Mar 23 22:52:49 2013
-You are here because you were prompted for a password. You need to configure password less SSH. Follow the instructions below.
+
+[Important: Password-less SSH requires the use of a (public, private) key-pair. The key-pair is stored as a pair of files under the
+ $HOME/.ssh directory, and the files must have the default names (id_rsa.pub, id_rsa) respectively.
+ If you are using different names, please rename the files to the default names.]
+
+Skip to the next section [Configuring Managix](#Configuring_Managix).
+
+
+You are here because you were prompted for a password. You need to configure password-less SSH.
+We shall generate a (public, private) key-pair as id_rsa.pub and id_rsa respectively. If $HOME/.ssh already
+contains a (public, private) key-pair, please rename those files before proceeding.
+Follow the instructions below.
$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
- Enter file in which to save the key (/home/joe/.ssh/id_rsa): [We shall use the default value, so simply press enter]
+ Enter file in which to save the key (/home/joe/.ssh/id_rsa):
+   [Important: Use the default value, so simply press enter]
+
If a key already exists, you should get an output similar to what is shown below. Press 'y' to overwrite the existing key.
-
+The default name must be used. If you do not wish to overwrite a pre-existing key, first save that key under a different name, as sketched below.
/home/joe/.ssh/id_rsa already exists.
Overwrite (y/n)?
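+
+A minimal sketch of preserving an existing key-pair under different names before re-running ssh-keygen (standard shell usage; the backup names below are arbitrary):
+
+        $ mv $HOME/.ssh/id_rsa $HOME/.ssh/id_rsa.backup
+        $ mv $HOME/.ssh/id_rsa.pub $HOME/.ssh/id_rsa.pub.backup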
@@ -261,39 +275,39 @@
<cluster xmlns="cluster">
<name>local</name>
<java_home>/usr/lib/jvm/jdk1.7.0</java_home>
- <java_opts>-Xmx1048m</java_opts>
- <logdir>/home/joe/asterix-mgmt/clusters/local/working_dir/logs</logdir>
+  <log_dir>/home/joe/asterix-mgmt/clusters/local/working_dir/logs</log_dir>
+ <txn_log_dir>/home/joe/asterix-mgmt/clusters/local/working_dir/logs</txn_log_dir>
<iodevices>/home/joe/asterix-mgmt/clusters/local/working_dir</iodevices>
<store>storage</store>
- <workingDir>
+ <working_dir>
<dir>/home/joe/asterix-mgmt/clusters/local/working_dir</dir>
<NFS>true</NFS>
- </workingDir>
- <master-node>
+ </working_dir>
+ <master_node>
<id>master</id>
- <client-ip>127.0.0.1</client-ip>
- <cluster-ip>127.0.0.1</cluster-ip>
- </master-node>
+ <client_ip>127.0.0.1</client_ip>
+ <cluster_ip>127.0.0.1</cluster_ip>
+ </master_node>
<node>
<id>node1</id>
- <cluster-ip>127.0.0.1</cluster-ip>
+ <cluster_ip>127.0.0.1</cluster_ip>
</node>
</cluster>
We shall next explain the components of the cluster configuration XML file.
#### (1) Defining nodes in ASTERIX runtime ####
-The single-machine ASTERIX instance configuration that is auto-generated by Managix (using the "configure" command) involves a master node (CC) and a worker node (NC). Each node is assigned a unique id and provided with an ip address (called ''cluster-ip'') that maps a node to a physical machine. The following snippet from the above XML file captures the master/worker nodes in our ASTERIX installation.
+The single-machine ASTERIX instance configuration that is auto-generated by Managix (using the "configure" command) involves a master node (CC) and a worker node (NC). Each node is assigned a unique id and provided with an IP address (called ''cluster_ip'') that maps the node to a physical machine. The following snippet from the above XML file captures the master/worker nodes in our ASTERIX installation.
- <master-node>
+ <master_node>
<id>master</id>
- <client-ip>127.0.0.1</client-ip>
- <cluster-ip>127.0.0.1</cluster-ip>
- </master-node>
+ <client_ip>127.0.0.1</client_ip>
+ <cluster_ip>127.0.0.1</cluster_ip>
+ </master_node>
<node>
<id>node1</id>
- <cluster-ip>127.0.0.1</cluster-ip>
+ <cluster_ip>127.0.0.1</cluster_ip>
</node>
@@ -309,11 +323,11 @@
<td>A unique id for a node.</td>
</tr>
<tr>
- <td>cluster-ip</td>
+ <td>cluster_ip</td>
  <td>IP address of the machine to which a node maps. This address is used for all internal communication between the nodes.</td>
</tr>
<tr>
- <td>client-ip</td>
+ <td>client_ip</td>
<td>Provided for the master node. This IP should be reachable from clients that want to connect with ASTERIX via its web interface.</td>
</tr>
</table>
@@ -331,20 +345,20 @@
<td>Java installation directory at each node.</td>
</tr>
<tr>
- <td>java_opts</td>
- <td>JVM arguments passed on to the JVM that represents a node.</td>
-</tr>
-<tr>
- <td>logdir</td>
+ <td>log_dir</td>
  <td>A directory where the worker node may write logs.</td>
</tr>
<tr>
- <td>io_devices</td>
+ <td>txn_log_dir</td>
+  <td>A directory where the worker node may write transaction logs.</td>
+</tr>
+<tr>
+ <td>iodevices</td>
<td>Comma separated list of IO Device mount points.</td>
</tr>
<tr>
<td>store</td>
- <td>A data directory that ASTERIX uses to store data belonging to dataset(s).</td>
+ <td>A data directory (under each iodevice) that ASTERIX uses to store data belonging to dataset(s).</td>
</tr>
</table>
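+
+To illustrate how these directory properties fit together, here is a sketch (all paths hypothetical) of a configuration with two IO devices; the directory named by `store` is created under each iodevice:
+
+        <log_dir>/mnt/joe/logs</log_dir>
+        <txn_log_dir>/mnt/joe/txn_logs</txn_log_dir>
+        <iodevices>/mnt/joe/io1,/mnt/joe/io2</iodevices>
+        <store>storage</store>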
@@ -354,10 +368,10 @@
Next we explain the following setting in the file $MANAGIX_HOME/clusters/local/local.xml.
- <workingDir>
+ <working_dir>
<dir>/Users/joe/asterix-mgmt/clusters/local/working_dir</dir>
<NFS>true</NFS>
- </workingDir>
+ </working_dir>
Managix associates a working directory with an ASTERIX instance and uses this directory for transferring binaries to each node. If there exists a directory that is readable by each node, Managix can use it to place binaries that can be accessed and used by all the nodes in the ASTERIX set up. A network file system (NFS) provides such a functionality for a cluster of physical machines such that a path on NFS is accessible from each machine in the cluster. In the single-machine set up described above, all nodes correspond to a single physical machine. Each path on the local file system is accessible to all the nodes in the ASTERIX setup and the boolean value for NFS above is thus set to `true`.
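+
+Conversely, when the nodes of a cluster do not share a file system, the NFS flag would be set to `false`; the binaries then have to be placed in each node's local working directory individually. A sketch (the path is hypothetical):
+
+        <working_dir>
+            <dir>/tmp/asterix-working-dir</dir>
+            <NFS>false</NFS>
+        </working_dir>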
@@ -384,13 +398,13 @@
![AsterixCluster](https://asterixdb.googlecode.com/files/AsterixCluster.png)
-Notice that each machine has a ''cluster-ip'' address, which is used by these machines for their intra-cluster communication. Meanwhile, the master machine also has a ''client-ip'' address, using which an end-user outside the cluster can communicate with this machine. The reason we differentiate between these two types of IP addresses is that we can have a cluster of machines using a private network. In this case they have internal ip addresses that cannot be used outside the network. In the case all the machines are on a public network, the "client-ip" and "cluster-ip" of the master machine can share the same address.
+Notice that each machine has a ''cluster_ip'' address, which these machines use for their intra-cluster communication. Meanwhile, the master machine also has a ''client_ip'' address, which an end-user outside the cluster uses to communicate with this machine. The reason we differentiate between these two types of IP addresses is that a cluster of machines may use a private network; in that case the machines have internal IP addresses that cannot be used outside the network. In the case where all the machines are on a public network, the "client_ip" and "cluster_ip" of the master machine can share the same address.
Next we describe how to set up ASTERIX in this cluster, assuming no Managix has been installed on these machines.
### Step (1): Define the ASTERIX cluster ###
-We first log into the master machine as the user "joe". On this machine, download Managix from [here](https://asterixdb.googlecode.com/files/asterix-installer-0.0.5-binary-assembly.zip) (save as above), then do the following steps similar to the single-machine case described above:
+We first log into the master machine as the user "joe". On this machine, download Managix from [here](https://asterixdb.googlecode.com/files/asterix-installer-0.0.5-binary-assembly.zip) (same as above), then do the following steps similar to the single-machine case described above:
machineA> cd ~
@@ -418,42 +432,49 @@
<username>joe</username>
<!-- The working directory of Managix. It should be on a network file system (NFS) that
- can accessed by all the machine. Need to create it before running Managix. -->
- <workingDir>
+         can be accessed by all the machines. -->
+ <working_dir>
<dir>/home/joe/managix-workingDir</dir>
<NFS>true</NFS>
- </workingDir>
+ </working_dir>
- <!-- Directory for Asterix to store log information for each machine. Needs
- to be a local file system. Needs to create it before running Managix. -->
- <logdir>/mnt/joe/logs</logdir>
+ <!-- Directory for Asterix to store log information for each node. Needs
+ to be on the local file system. -->
+ <log_dir>/mnt/joe/logs</log_dir>
- <!-- Directory used by each worker node to store data files. Needs
- to be a local file system. Needs to create it before running Managix. -->
+    <!-- Directory for Asterix to store transaction log information for each node. Needs
+ to be on the local file system. -->
+ <txn_log_dir>/mnt/joe/txn-logs</txn_log_dir>
+
<iodevices>/mnt/joe</iodevices>
+
+    <!-- Directory (under each iodevice) used by each worker node to store data files. Needs
+ to be on the local file system. -->
<store>storage</store>
- <!-- Java home for each machine with its JVM options -->
+    <!-- Java home for each node. Can be overridden at the node level. -->
<java_home>/usr/lib/jvm/jdk1.7.0</java_home>
- <java_opts>-Xmx1024m</java_opts>
<!-- IP addresses of the master machine A -->
- <master-node>
+ <master_node>
<id>master</id>
- <client-ip>128.195.52.177</client-ip>
- <cluster-ip>192.168.100.0</cluster-ip>
- </master-node>
+ <client_ip>128.195.52.177</client_ip>
+ <cluster_ip>192.168.100.0</cluster_ip>
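+      <!-- Ports used by the master node: client_port for client connections,
+           cluster_port for connections from the worker nodes, and http_port for
+           the web interface (roles assumed from the element names) -->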
+ <client_port>1098</client_port>
+ <cluster_port>1099</cluster_port>
+ <http_port>8888</http_port>
+ </master_node>
<!-- IP address(es) of machine B -->
<node>
<id>nodeB</id>
- <cluster-ip>192.168.100.1</cluster-ip>
+ <cluster_ip>192.168.100.1</cluster_ip>
</node>
<!-- IP address(es) of machine C -->
<node>
<id>nodeC</id>
- <cluster-ip>192.168.100.2</cluster-ip>
+ <cluster_ip>192.168.100.2</cluster_ip>
</node>
</cluster>
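+
+Before creating an instance, the cluster description can optionally be sanity-checked using Managix's `validate` command (listed in Section 4). A usage sketch, assuming the file above is saved as $MANAGIX_HOME/clusters/rainbow.xml:
+
+        machineA> managix validate -c $MANAGIX_HOME/clusters/rainbow.xml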
@@ -530,7 +551,7 @@
We shall now use the `create` command to create an ASTERIX instance called "rainbow_asterix". In doing so, we shall use the cluster configuration file that was auto-generated by Managix.
- machineA> managix create -n rainbow_asterix -c $MANAGIX_HOME/clusters/rainbow/rainbow.xml
+ machineA> managix create -n rainbow_asterix -c $MANAGIX_HOME/clusters/rainbow.xml
If the response message does not contain a warning, then congratulations! You have successfully installed Asterix on this cluster of machines!
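+
+You can inspect the state of the newly created instance at any time using the `describe` command (covered in Section 4). A usage sketch:
+
+        machineA> managix describe -n rainbow_asterix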
@@ -554,6 +575,7 @@
<tr><td><a href="#Delete_Command" >delete</a></td> <td>Deletes an Asterix instance.</td></tr>
<tr><td><a href="#Configuring_Managix" >validate</a></td> <td>Validates the installer/cluster configuration.</td></tr>
<tr><td><a href="#Configuring_Managix" >configure</a></td><td>Auto generate configuration for an Asterix instance.</td></tr>
+<tr><td><a href="#Log_Command" >log</a></td><td>Produces a zip archive containing log files from each node in an AsterixDB instance.</td></tr>
<tr><td><a href="#Shutdown_Command" >shutdown</a></td> <td>Shutdown the installer service.</td></tr>
</table>
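+
+Detailed usage for each of the above commands can be looked up via the help command, as also illustrated by the log example below. A usage sketch:
+
+        $ managix help -cmd describe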
@@ -735,6 +757,25 @@
You can start the ASTERIX instance by using the start command.
+##### Log Command #####
+
+The `log` command allows you to collect the log files corresponding to each node of an AsterixDB instance into a zip archive.
+The zip archive is produced on the local file system of the machine running Managix.
+
+ $ managix help -cmd log
+
+ Creates a zip archive containing log files corresponding to each worker node (NC) and the master (CC) for an AsterixDB instance
+
+ Available arguments/options
+ -n name of the AsterixDB instance.
+   -d  destination directory for producing the zip archive (defaults to $MANAGIX_HOME/logdump)
+
+The following is an example showing the use of the log command.
+
+ $ managix log -n my_asterix -d /Users/joe/logdump
+ INFO: Log zip archive created at /Users/joe/logdump/log_Thu_Jun_06_00:53:51_PDT_2013.zip
+
+
##### Delete Command #####
As the name suggests, the `delete` command permanently removes an ASTERIX instance by cleaning up all associated data/artifacts. The usage can be looked up by executing the following:
@@ -775,34 +816,44 @@
## Section 5: Frequently Asked Questions ##
-*Question*
-What is meant by the "UNUSABLE" state in the lifecycle of an ASTERIX instance ?
+##### Question #####
+What happens if a machine acting as a node in the Asterix cluster becomes unreachable for some reason (network partition/machine failure)?
+##### Answer #####
+When a node leaves the Asterix cluster, the AsterixDB instance transitions to an 'UNUSABLE' state, indicating that it is no longer
+available for serving queries. To find out which node(s) left the cluster, run the describe command with the -admin flag.
-*Answer*
-When Managix fails to start a required process (CC/NC), the instance transits to an UNUSABLE state.
-The reason for the failure needs to be looked up in the logs.
-Before we attempt to start the instance again, any processes that got launched
-as part of failed attempt must be stopped. No other operation except "stop" is supported in the UNUSABLE state.
+        $ $MANAGIX_HOME/bin/managix describe -n <name of the AsterixDB instance> -admin
+
+The above command shows the state of the AsterixDB instance and lists the set of nodes that have left the cluster.
-Get rid of the started processes:-
+The failed node must be brought back to re-join the cluster. Once done, you may bring the
+instance back to the 'ACTIVE' state by executing the following sequence.
+
+1) Stop the Asterix processes running on the nodes in the cluster:
$MANAGIX_HOME/bin/managix stop -n my_asterix
-Any processes associated with the instance are killed and the instance moves to the INACTIVE state.
-You may now delete the instance by executing the following
+The processes associated with the instance are terminated and the instance moves to the INACTIVE state.
+
+2) Start the AsterixDB instance using the start command.
+
+ $MANAGIX_HOME/bin/managix start -n <name of your AsterixDB instance>
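+
+3) Finally, you may verify that the instance has returned to the ACTIVE state by re-running the describe command shown earlier:
+
+   $MANAGIX_HOME/bin/managix describe -n <name of your AsterixDB instance>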
- $MANAGIX_HOME/bin/managix delete -n <name of your ASTERIX instance>
+##### Question #####
+Do I need to create all the directories/paths I put into the cluster configuration XML?
+
+##### Answer #####
+Managix will create a path if it does not exist. It does so using the user account mentioned in the cluster configuration XML.
+Please ensure that the user account has the appropriate permissions to create any missing paths.
-Note that above would remove all traces of the instance including the logs and thus the reason for the failed attempt.
+##### Question #####
-OR
+Should MANAGIX_HOME be on the network file system (NFS)?
-make a subsequent attempt to start the instance if you realized a mistake in the cluster configuration XML and have corrected it. To start the instance, we execute the following.
-
-
- $MANAGIX_HOME/bin/managix start -n <name of your ASTERIX instance>
-
+##### Answer #####
+It is recommended that MANAGIX_HOME not be on the NFS. Managix produces artifacts/logs on disk that do not need to be shared,
+so the overhead of creating these artifacts/logs on the NFS should be avoided.