Asterix NCs Fault Tolerance
This change includes the following:
- Adapt replication to unique partitions storage.
- Implement auto failover for failing NCs.
- Implement auto failover for metadata node.
- Fix ASTERIXDB-1251 by reporting a proper error message.
- Add basic replication test cases, run on a Vagrant virtual cluster, for:
  1. Replication of LSM bulk-loaded components.
  2. Replication and recovery of LSM memory components.
  3. Metadata node takeover.
These test cases will be part of the cluster test profile.
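
For illustration only, a replication section of a cluster configuration that
follows the updated schema might look like the sketch below. The five child
element names and their order come from cluster.xsd as changed in this commit;
the wrapper element name and all values are assumptions and are not taken from
this change.

```xml
<!-- Hypothetical wrapper element; the diff does not show the parent's name. -->
<data_replication>
  <enabled>true</enabled>
  <!-- Port, factor, and timeout values are placeholders. -->
  <replication_port>2000</replication_port>
  <replication_factor>2</replication_factor>
  <!-- New boolean element; takes the place of the removed replication_store. -->
  <auto_failover>true</auto_failover>
  <replication_time_out>30</replication_time_out>
</data_replication>
```

Because the elements sit in an xs:sequence, a configuration that still
contains replication_store would be expected to fail validation against the
new schema.
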
Change-Id: Ice26d980912a315fcb3efdd571d6ce88717cfea4
Reviewed-on: https://asterix-gerrit.ics.uci.edu/573
Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Reviewed-by: Till Westmann <tillw@apache.org>
Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>
diff --git a/asterix-common/src/main/resources/schema/cluster.xsd b/asterix-common/src/main/resources/schema/cluster.xsd
index 872c959..e0605f0 100644
--- a/asterix-common/src/main/resources/schema/cluster.xsd
+++ b/asterix-common/src/main/resources/schema/cluster.xsd
@@ -47,7 +47,7 @@
<xs:element name="enabled" type="xs:boolean" />
<xs:element name="replication_port" type="xs:integer" />
<xs:element name="replication_factor" type="xs:integer" />
- <xs:element name="replication_store" type="xs:string" />
+ <xs:element name="auto_failover" type="xs:boolean" />
<xs:element name="replication_time_out" type="xs:integer" />
<!-- definition of complex elements -->
@@ -82,7 +82,7 @@
<xs:element ref="cl:enabled" />
<xs:element ref="cl:replication_port" />
<xs:element ref="cl:replication_factor" />
- <xs:element ref="cl:replication_store" />
+ <xs:element ref="cl:auto_failover" />
<xs:element ref="cl:replication_time_out" />
</xs:sequence>
</xs:complexType>