This repository has been archived by the owner on May 27, 2020. It is now read-only.

Improve partitioners documentation
adelapena committed Dec 9, 2016
1 parent 1c4d617 commit 133c5f5
Showing 4 changed files with 20 additions and 10 deletions.
@@ -25,7 +25,9 @@
  * An index partitioner to split the index in multiple partitions.
  *
  * Index partitioning is useful to speed up some searches to the detriment of others, depending on the implementation.
- * It is also useful to overcome the Lucene's hard limit of 2147483519 documents per index.
+ *
+ * It is also useful to overcome the Lucene's hard limit of 2147483519 documents per local index.
+ * However, queries involving partitions with more than 2147483519 total documents will still fail.
  *
  * @author Andres de la Pena {@literal <[email protected]>}
  */
@@ -41,10 +43,12 @@ public static class None extends Partitioner {
 }

 /**
- * {@link Partitioner} based on the Cassandra's partitioning token.
+ * A {@link Partitioner} based on the partition key token. Partitioning on token guarantees a good load balancing
+ * between partitions while speeding up partition-directed searches to the detriment of token range searches
+ * performance. It allows to efficiently run partition directed queries in nodes indexing more than 2147483519 rows.
+ * However, token range searches in nodes with more than 2147483519 rows will fail.
  *
- * Partitioning on token guarantees a good load balancing between partitions while speeding up partition-directed
- * searches to the detriment of token range searches.
+ * The number of partitions per node should be specified.
  */
 public static class OnToken extends Partitioner {

9 changes: 6 additions & 3 deletions doc/documentation.rst
@@ -601,7 +601,8 @@ Partitioners
 Lucene indexes can be partitioned on a per-node basis. This means that the local index in each node
 can be split in multiple smaller fragments. Index partitioning is useful to speed up some searches
 to the detriment of others, depending on the implementation. It is also useful to overcome the
-Lucene's hard limit of 2147483519 documents per local index.
+Lucene's hard limit of 2147483519 documents per local index. However, queries involving partitions
+with more than 2147483519 total documents will still fail.

 Partitioning is disabled by default, and it can be activated specifying a partitioner implementation
 in the index creation statement.
@@ -629,8 +630,10 @@ Token partitioner
 _________________

 A partitioner based on the partition key token. Partitioning on token guarantees a good load
-balancing between partitions while speeding up partition-directed searches to the detriment of any
-other searches. The number of partitions per node should be specified.
+balancing between partitions while speeding up partition-directed searches to the detriment of token
+range searches performance. It allows to efficiently run partition directed queries in nodes
+indexing more than 2147483519 rows. However, token range searches in nodes with more than 2147483519
+rows will fail. The number of partitions per node should be specified.

 .. code-block:: sql
@@ -24,7 +24,8 @@ import org.apache.cassandra.db.{DecoratedKey, ReadCommand}
  * Index partitioning is useful to speed up some searches to the detriment of others, depending on
  * the implementation.
  *
- * It is also useful to overcome the Lucene's hard limit of 2147483519 documents per index.
+ * It is also useful to overcome the Lucene's hard limit of 2147483519 documents per local index.
+ * However, queries involving partitions with more than 2147483519 total documents will still fail.
  *
  * @author Andres de la Pena `[email protected]`
  */
@@ -20,10 +20,12 @@ import com.stratio.cassandra.lucene.IndexException
 import org.apache.cassandra.db._
 import org.apache.cassandra.dht.Token

-/** [[Partitioner]] based on the partition key token.
+/** [[Partitioner]] partitioner based on the partition key token.
  *
  * Partitioning on token guarantees a good load balancing between partitions while speeding up
- * partition-directed searches to the detriment of token range searches.
+ * partition-directed searches to the detriment of token range searches performance. It allows to
+ * efficiently run partition directed queries in nodes indexing more than 2147483519 rows. However,
+ * token range searches in nodes with more than 2147483519 rows will fail.
  *
  * @param partitions the number of partitions
  * @author Andres de la Pena `[email protected]`
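
The behavior documented by this commit can be illustrated with a small, self-contained Java sketch. It is not the plugin's code: the class `TokenPartitionSketch`, its method names, and the modulo-based token-to-partition mapping are all assumptions made for illustration. The sketch only shows why a partition-directed query, which reads a single partition, can stay under the 2147483519 document limit on a node whose token range queries, which may read every partition, exceed it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; names and mapping are assumptions, not the plugin's API. */
final class TokenPartitionSketch {

    /** Per-index document limit quoted in the documentation changed by this commit. */
    static final long MAX_DOCS = 2_147_483_519L;

    /** Assumed mapping: a 64-bit token is reduced to a partition in [0, partitions). */
    static int partitionFor(long token, int partitions) {
        return (int) Math.floorMod(token, (long) partitions);
    }

    /** A partition-directed query knows its token, so it reads exactly one partition. */
    static List<Integer> partitionsForKey(long token, int partitions) {
        return Collections.singletonList(partitionFor(token, partitions));
    }

    /** A token range query may match tokens stored in any partition, so it reads all of them. */
    static List<Integer> partitionsForTokenRange(int partitions) {
        List<Integer> all = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            all.add(p);
        }
        return all;
    }

    /** Returns true if the documents in the partitions read by a query fit under the limit. */
    static boolean withinLimit(long[] docsPerPartition, List<Integer> touched) {
        long total = 0;
        for (int p : touched) {
            total += docsPerPartition[p];
        }
        return total <= MAX_DOCS;
    }

    public static void main(String[] args) {
        // Four partitions of one billion documents each: four billion documents on the node.
        long[] docs = {1_000_000_000L, 1_000_000_000L, 1_000_000_000L, 1_000_000_000L};
        System.out.println(withinLimit(docs, partitionsForKey(42L, 4)));   // true
        System.out.println(withinLimit(docs, partitionsForTokenRange(4))); // false
    }
}
```

In this sketch the node holds about four billion documents, so no single index could contain them, yet each partition stays well under the limit. Only the token range query, which spans all four partitions, crosses it.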
