[NO ISSUE][ING] Set state to stop on failure of recovery
- user model changes: no
- storage format changes: no
- interface changes: no
Details:
- Before this change, if a rebalance arrived just as the active
recovery retry policy decided to give up, the recovery task
would finish while the entity was still in the suspended state.
When the rebalance completed, the resume call would set the
entity back to the temporary failure state even though recovery
had already given up.
- To fix this, the recovery task no longer returns when the entity
is suspended and instead waits for the resume call to set the
state to stopped.
Change-Id: I5a60a4f547a5b7f2c4c5199d48bcc83f5a2e9ccd
Reviewed-on: https://asterix-gerrit.ics.uci.edu/2757
Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Reviewed-by: Till Westmann <tillw@apache.org>
diff --git a/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/active/RecoveryTask.java b/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/active/RecoveryTask.java
index 9531b63..5d722e7 100644
--- a/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/active/RecoveryTask.java
+++ b/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/active/RecoveryTask.java
@@ -154,7 +154,12 @@
// Recovery task is essntially over now either through failure or through cancellation(stop)
synchronized (listener) {
listener.notifyAll();
- if (listener.getState() != ActivityState.TEMPORARILY_FAILED) {
+ if (listener.getState() != ActivityState.TEMPORARILY_FAILED
+ // Suspend can happen at the same time, the recovery policy decides to stop... in that case, we
+ // must still do two things:
+ // 1. set the state to permanent failure.
+ // 2. set the entity to not running to avoid auto recovery attempt
+ && listener.getState() != ActivityState.SUSPENDED) {
return null;
}
}
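
The effect of the widened guard can be illustrated with a minimal sketch. It is not the AsterixDB code: the class RecoveryGuardSketch, its methods, and the reduced ActivityState enum are hypothetical stand-ins, and the RUNNING and STOPPED constants are assumed for illustration. Only the two conditions are taken from the hunk above.

```java
// Hypothetical, simplified stand-in for the guard in RecoveryTask; not the AsterixDB classes.
final class RecoveryGuardSketch {
    // Assumed subset of states. TEMPORARILY_FAILED and SUSPENDED appear in the diff;
    // RUNNING and STOPPED are included only to have something else to compare against.
    enum ActivityState { RUNNING, SUSPENDED, TEMPORARILY_FAILED, STOPPED }

    // Guard before the change: every state except TEMPORARILY_FAILED returns early,
    // so a task that gives up while the entity is SUSPENDED exits without further handling.
    static boolean returnsEarlyBefore(ActivityState state) {
        return state != ActivityState.TEMPORARILY_FAILED;
    }

    // Guard after the change: SUSPENDED no longer returns early, so the task
    // continues past the guard instead of finishing during a rebalance.
    static boolean returnsEarlyAfter(ActivityState state) {
        return state != ActivityState.TEMPORARILY_FAILED
                && state != ActivityState.SUSPENDED;
    }

    public static void main(String[] args) {
        // Print both guard results for each state to show where they differ.
        for (ActivityState s : ActivityState.values()) {
            System.out.println(s + ": before=" + returnsEarlyBefore(s)
                    + ", after=" + returnsEarlyAfter(s));
        }
    }
}
```

In this sketch the two guards differ only for SUSPENDED, which is the case the commit message describes.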