Add split brain protection documentation for vector collection [AI-23…

…6] (hazelcast#1429) Co-authored-by: Rob Swain <[email protected]>
Rob-Hazelcast · Dec 20, 2024 · commit 65d86c2 · 1 parent 5d8ba53
Showing 8 changed files with 130 additions and 18 deletions.
diff --git a/docs/modules/data-structures/pages/cardinality-estimator-service.adoc b/docs/modules/data-structures/pages/cardinality-estimator-service.adoc
@@ -47,8 +47,8 @@ include::ROOT:example$/dds/ExampleCardinalityEstimator.java[tag=ces]
 == Split-Brain Protection for Cardinality Estimator
 
 Cardinality Estimator can be configured to check for a minimum number of
-available members before applying its operations (see the xref:network-partitioning:split-brain-protection.adoc[Split-Brain Protection section]). This is a check to avoid performing successful queue
-operations on all parts of a cluster during a network partition.
+available members before applying its operations (see the xref:network-partitioning:split-brain-protection.adoc[Split-Brain Protection section]).
+This is a check to avoid performing successful cardinality estimator operations on all parts of a cluster during a network partition.
 
 The following is a list of methods, grouped by the protection types, that support
 split-brain protection checks:

diff --git a/docs/modules/data-structures/pages/list.adoc b/docs/modules/data-structures/pages/list.adoc
@@ -395,8 +395,8 @@ See the <<split-brain-protection-for-ilist-and-transactionallist, Split-Brain Pr
 == Split-Brain Protection for IList and TransactionalList
 
 IList & TransactionalList can be configured to check for a minimum
-number of available members before applying queue operations (see the xref:network-partitioning:split-brain-protection.adoc[Split-Brain Protection section]).
-This is a check to avoid performing successful queue operations on all parts of a cluster during a network partition.
+number of available members before applying list operations (see the xref:network-partitioning:split-brain-protection.adoc[Split-Brain Protection section]).
+This is a check to avoid performing successful list operations on all parts of a cluster during a network partition.
 
 The following is a list of methods, grouped by the protection types, that support split-brain protection checks:
 

diff --git a/docs/modules/data-structures/pages/multimap.adoc b/docs/modules/data-structures/pages/multimap.adoc
@@ -425,7 +425,7 @@ See the xref:network-partitioning:split-brain-protection.adoc#split-brain-protec
 
 MultiMap & TransactionalMultiMap can be configured to check for a minimum number of
 available members before applying their operations (see the xref:network-partitioning:split-brain-protection.adoc[Split-Brain Protection section]).
-This is a check to avoid performing successful queue operations on all parts of a cluster during a network partition.
+This is a check to avoid performing successful multimap operations on all parts of a cluster during a network partition.
 
 The following is a list of methods that support split-brain protection checks. The list is grouped by the protection types.
 

diff --git a/docs/modules/data-structures/pages/replicated-map.adoc b/docs/modules/data-structures/pages/replicated-map.adoc
@@ -260,7 +260,7 @@ include::ROOT:example$/dds/replicatedmap/ListeningMember.java[tag=lm]
 
 Replicated Map can be configured to check for a minimum number of available
 members before applying its operations (see the xref:network-partitioning:split-brain-protection.adoc[Split-Brain Protection section]).
-This is a check to avoid performing successful queue operations on all parts of a
+This is a check to avoid performing successful replicated map operations on all parts of a
 cluster during a network partition.
 
 The following is a list of methods, grouped by the protection types, that support split-brain

diff --git a/docs/modules/data-structures/pages/set.adoc b/docs/modules/data-structures/pages/set.adoc
@@ -371,8 +371,8 @@ See the <<split-brain-protection-for-iset-and-transactionalset, Split-Brain Prot
 == Split-Brain Protection for ISet and TransactionalSet
 
 ISet & TransactionalSet can be configured to check for a minimum number of
-available members before applying queue operations (see the xref:network-partitioning:split-brain-protection.adoc[Split-Brain Protection section]).
-This is a check to avoid performing successful queue operations on all parts of a cluster during a network partition.
+available members before applying set operations (see the xref:network-partitioning:split-brain-protection.adoc[Split-Brain Protection section]).
+This is a check to avoid performing successful set operations on all parts of a cluster during a network partition.
 
 The following is a list of methods, grouped by the protection types, that support
 split-brain protection checks:

diff --git a/docs/modules/data-structures/pages/vector-collections.adoc b/docs/modules/data-structures/pages/vector-collections.adoc
@@ -61,6 +61,17 @@ Can include letters, numbers, and the symbols `-`, `_`, `*`.
 |Optional
 |`0`
 
+|merge-policy
+|Configuration of the merge policy for this vector collection. See the <<merge-policy>> section.
+|Optional
+|`PutIfAbsentMergePolicy` with `batchSize`=`100`
+
+|split-brain-protection-ref
+|Name of the split-brain protection configuration that you want this vector collection to use.
+See the <<split-brain-protection>> section.
+|Optional
+|`NULL`
+
 |===
 
 .Index configuration options
@@ -151,6 +162,8 @@ XML::
                 <use-deduplication>false</use-deduplication>
             </index>
         </indexes>
+        <merge-policy batch-size="200">PutIfAbsentMergePolicy</merge-policy>
+        <split-brain-protection-ref>splitbrainprotection-name</split-brain-protection-ref>
     </vector-collection>
 </hazelcast>
 ----
@@ -175,6 +188,10 @@ hazelcast:
           max-degree: 32
           ef-construction: 256
           use-deduplication: false
+      merge-policy:
+        batch-size: 200
+        class-name: PutIfAbsentMergePolicy
+      split-brain-protection-ref: splitbrainprotection-name
 ----
 --
 Java::
@@ -199,7 +216,11 @@ VectorCollectionConfig collectionConfig = new VectorCollectionConfig("books")
                 .setMaxDegree(32)
                 .setEfConstruction(256)
                 .setUseDeduplication(false)
-    );
+    ).setMergePolicyConfig(
+            new MergePolicyConfig()
+                .setBatchSize(200)
+                .setPolicy(PutIfAbsentMergePolicy.class.getName())
+    ).setSplitBrainProtectionName("splitbrainprotection-name");
 config.addVectorCollectionConfig(collectionConfig);
 ----
 --
@@ -212,11 +233,88 @@ client.create_vector_collection_config("books", backup_count=1, async_backup_cou
     IndexConfig(name="word2vec-index", metric=Metric.DOT, dimension=6),
     IndexConfig(name="glove-index", metric=Metric.DOT, dimension=10,
                 max_degree=32, ef_construction=256, use_deduplication=False),
-])
+], merge_policy="PutIfAbsentMergePolicy", merge_batch_size=200, split_brain_protection_name="splitbrainprotection-name")
+----
+--
+====
+
+
+[[split-brain-protection]]
+=== Split-Brain Protection
+Vector collection can be configured to check for a minimum number of
+available members before applying vector collection operations (see the xref:network-partitioning:split-brain-protection.adoc[Split-Brain Protection section]).
+This is a check to avoid performing successful vector collection operations on all parts of a cluster during a network partition.
+
+The following methods support split-brain protection checks:
+
+* `WRITE`, `READ_WRITE`:
+** `putAsync`
+** `setAsync`
+** `putIfAbsentAsync`
+** `putAllAsync`
+** `removeAsync`
+** `deleteAsync`
+** `clearAsync`
+** `optimizeAsync`
+* `READ`, `READ_WRITE`:
+** `getAsync`
+** `size`
+** `searchAsync`
+
+The value of `split-brain-protection-ref` should be the split-brain protection configuration name which you
+configured under the `split-brain-protection` element as explained in the xref:network-partitioning:split-brain-protection.adoc[Split-Brain Protection documentation].
+
+[[merge-policy]]
+=== Configuring Merge Policy
+
+While recovering from a split-brain scenario, the vector collection
+in the smaller cluster merges into the larger cluster based on a configured
+merge policy. The merge policy resolves conflicts with different out-of-the-box strategies.
+It can be configured programmatically using the method
+https://docs.hazelcast.org/docs/{full-version}/javadoc/com/hazelcast/config/vector/VectorCollectionConfig.html#setMergePolicyConfig(com.hazelcast.config.MergePolicyConfig)[setMergePolicyConfig()^],
+or declaratively using the element `merge-policy`.
+The following example shows declarative configuration:
+
+[tabs]
+====
+XML::
++
+--
+[source,xml]
+----
+<hazelcast>
+    ...
+    <vector-collection name="books">
+        <merge-policy batch-size="200">PutIfAbsentMergePolicy</merge-policy>
+    </vector-collection>
+    ...
+</hazelcast>
 ----
 --
+
+YAML::
++
+[source,yaml]
+----
+hazelcast:
+  vector-collection:
+    books:
+      merge-policy:
+        batch-size: 200
+        class-name: PutIfAbsentMergePolicy
+----
 ====
 
+Vector collection supports the following policies:
+
+* `DiscardMergePolicy`: The entry from the smaller cluster is discarded.
+* `PassThroughMergePolicy`: The entry from the smaller cluster wins.
+* `PutIfAbsentMergePolicy`: The entry from the smaller cluster wins if it doesn't exist in the cluster.
+
+Additionally, you can develop a custom merge policy by implementing
+the `SplitBrainMergePolicy` interface, as explained in
+xref:network-partitioning:split-brain-recovery.adoc#custom-merge-policies[Custom merge policies].
+
 == Create collection
 
 You can use either of the `VectorCollection` static methods to get the vector collection. Both methods either create a vector collection, or return an existing one that corresponds to the requested name.
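
Reviewer's illustration (not part of the commit): the method list in the new Split-Brain Protection section groups operations under `WRITE`/`READ_WRITE` and `READ`/`READ_WRITE`. The plain-Java sketch below shows one way to read that grouping. It does not use the Hazelcast API; `ProtectionCheckSketch`, `OpKind`, `isGuarded`, and `isAllowed` are hypothetical names, and the member counts are made-up values. In Hazelcast a failed check would surface as an exception rather than a boolean.

```java
public class ProtectionCheckSketch {
    /** Protection types named in the method list: READ, WRITE, READ_WRITE. */
    enum ProtectOn { READ, WRITE, READ_WRITE }

    /** Hypothetical classification of an operation; not a Hazelcast type. */
    enum OpKind { READ, WRITE }

    /** Returns true if the configured protection type covers this kind of operation. */
    static boolean isGuarded(ProtectOn protectOn, OpKind op) {
        switch (protectOn) {
            case READ_WRITE:
                return true;
            case READ:
                return op == OpKind.READ;
            default: // WRITE
                return op == OpKind.WRITE;
        }
    }

    /** An operation proceeds if it is not guarded, or if enough members are visible. */
    static boolean isAllowed(ProtectOn protectOn, OpKind op,
                             int visibleMembers, int minimumClusterSize) {
        return !isGuarded(protectOn, op) || visibleMembers >= minimumClusterSize;
    }

    public static void main(String[] args) {
        // Assumed scenario: a member sees 2 members, and the minimum cluster size is 3.
        System.out.println(isAllowed(ProtectOn.WRITE, OpKind.WRITE, 2, 3));     // false
        System.out.println(isAllowed(ProtectOn.WRITE, OpKind.READ, 2, 3));      // true
        System.out.println(isAllowed(ProtectOn.READ_WRITE, OpKind.READ, 3, 3)); // true
    }
}
```

Under this reading, a call such as `putAsync` falls under `OpKind.WRITE` and a call such as `searchAsync` under `OpKind.READ`, so a `WRITE` configuration blocks the former on the minority side while still serving the latter.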

diff --git a/docs/modules/network-partitioning/pages/split-brain-protection.adoc b/docs/modules/network-partitioning/pages/split-brain-protection.adoc
@@ -28,15 +28,15 @@ applications continuing in error with stale data, they are prevented from doing
 
 Split-brain protection is supported for the following Hazelcast data structures:
 
-* IMap (for Hazelcast 3.5 and higher versions)
-* Transactional Map (for Hazelcast 3.5 and higher versions)
-* ICache (for Hazelcast 3.5 and higher versions)
-* ILock (for Hazelcast 3.8 and higher versions)
-* IQueue (for Hazelcast 3.8 and higher versions)
+* IMap
+* Transactional Map
+* ICache
+* ILock
+* IQueue
 * IExecutorService, DurableExecutorService, IScheduledExecutorService,
 MultiMap, ISet, IList, Ringbuffer, Replicated Map, Cardinality Estimator,
 IAtomicLong, IAtomicReference, ISemaphore, ICountDownLatch
-(for Hazelcast 3.10 and higher versions)
+* VectorCollection [.enterprise]*{enterprise-product-name}*
 
 Each data structure to be protected should have the configuration added to
 it as explained in the <<configuring-split-brain-protection, Configuring Split-Brain Protection section>>.

diff --git a/docs/modules/network-partitioning/pages/split-brain-recovery.adoc b/docs/modules/network-partitioning/pages/split-brain-recovery.adoc
@@ -40,8 +40,7 @@ For more information, see the xref:consistency-and-replication:consistency.adoc[
 
 == Merge Policies
 
-Since Hazelcast 3.10 all merge policies implement
-the unified interface `com.hazelcast.spi.SplitBrainMergePolicy`.
+All merge policies implement the unified interface `com.hazelcast.spi.SplitBrainMergePolicy`.
 We provide the following out-of-the-box implementations:
 
 * `DiscardMergePolicy`: The entry from the smaller cluster is discarded.
@@ -76,6 +75,7 @@ The following data structures support split-brain recovery:
 * `Ringbuffer`
 * `CardinalityEstimator`
 * `ScheduledExecutorService`
+* `VectorCollection` [.enterprise]*{enterprise-product-name}*
 
 The statistic based out-of-the-box merge policies are only supported by
 `IMap`, `ICache`, `ReplicatedMap` and `MultiMap`.
@@ -136,6 +136,10 @@ XML::
     <atomic-long name="default">
         <merge-policy>PutIfAbsentMergePolicy</merge-policy>
     </atomic-long>
+
+    <vector-collection name="default">
+        <merge-policy batch-size="100">PutIfAbsentMergePolicy</merge-policy>
+    </vector-collection>
     ...
 </hazelcast>
 ----
@@ -170,6 +174,11 @@ hazelcast:
     default:
       merge-policy:
         class-name: PutIfAbsentMergePolicy
+  vector-collection:
+    default:
+      merge-policy:
+        batch-size: 100
+        class-name: PutIfAbsentMergePolicy
 ----
 ====
 
@@ -226,6 +235,7 @@ Config config = new Config()
   .addMapConfig(mapConfig);
 ----
 
+[#custom-merge-policies]
 == Custom Merge Policies
 
 To implement a custom merge policy you have to implement `com.hazelcast.spi.SplitBrainMergePolicy`:
@@ -385,6 +395,10 @@ And the following table shows the merge types provided by each data structure:
 | `ScheduledExecutorService`
 |
 
+* `MergingEntry`
+| `VectorCollection`
+|
+
 * `MergingEntry`
 |===
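
Reviewer's illustration (not part of the commit): the three out-of-the-box policies listed for vector collections differ only in how an entry from the smaller cluster is treated. The plain-Java sketch below simulates those rules with ordinary maps, assuming "absent" means that no entry with the same key exists on the larger side. It does not use the Hazelcast API, it ignores `batch-size`, and `MergePolicySketch`, `Policy`, and `merge` are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

public class MergePolicySketch {
    /** Stand-ins for DiscardMergePolicy, PassThroughMergePolicy, PutIfAbsentMergePolicy. */
    enum Policy { DISCARD, PASS_THROUGH, PUT_IF_ABSENT }

    /** Merges the smaller cluster's entries into a copy of the larger cluster's entries. */
    static Map<String, String> merge(Map<String, String> larger,
                                     Map<String, String> smaller,
                                     Policy policy) {
        Map<String, String> result = new TreeMap<>(larger);
        for (Map.Entry<String, String> entry : smaller.entrySet()) {
            switch (policy) {
                case DISCARD:
                    // The entry from the smaller cluster is dropped.
                    break;
                case PASS_THROUGH:
                    // The entry from the smaller cluster always wins.
                    result.put(entry.getKey(), entry.getValue());
                    break;
                case PUT_IF_ABSENT:
                    // The entry from the smaller cluster wins only if the key is missing.
                    result.putIfAbsent(entry.getKey(), entry.getValue());
                    break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up data: the same key exists on both sides, plus one key only on the smaller side.
        Map<String, String> larger = Map.of("book-1", "larger");
        Map<String, String> smaller = Map.of("book-1", "smaller", "book-2", "smaller");

        System.out.println(merge(larger, smaller, Policy.DISCARD));       // {book-1=larger}
        System.out.println(merge(larger, smaller, Policy.PASS_THROUGH));  // {book-1=smaller, book-2=smaller}
        System.out.println(merge(larger, smaller, Policy.PUT_IF_ABSENT)); // {book-1=larger, book-2=smaller}
    }
}
```

The documented default for vector collections, `PutIfAbsentMergePolicy`, corresponds to the last case: conflicting keys keep the larger cluster's value, while keys written only on the smaller side survive the merge.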