Insert Strategy
The Manager.crud().insert() operation generates INSERT statements with all fields of the entity. If some fields are not set (i.e. they have a null value), Achilles simply sets the corresponding columns to null in Cassandra.
This sounds reasonable but can have a significant impact on performance. In CQL semantics, setting a column to null means deleting it, which creates a tombstone for that column.
Let's say you have a User entity with around 10 fields representing user details. On user account creation not all of them will be provided; probably only the login/name/password fields are filled.
When inserting this user in Cassandra you'll create 3 columns for the login/name/password fields and around 7 tombstones. Later on, during compaction, Cassandra will need to clean up those 7 tombstones.
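The arithmetic above can be sketched in a few lines of plain Java. This is only an illustration and does not use Achilles: the map stands in for a User entity, and every field name other than login/name/password/age is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TombstoneCount {

    // Number of fields that an all-fields INSERT would write as null,
    // i.e. the number of tombstones created for this row.
    static long countTombstones(Map<String, Object> fields) {
        return fields.values().stream().filter(v -> v == null).count();
    }

    // A freshly created user: only login/name/password are provided.
    // The detail field names below (except age) are made up for this example.
    static Map<String, Object> newUser() {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("login", "johndoe");
        fields.put("name", "John DOE");
        fields.put("password", "iamjohndoe!");
        for (String detail : new String[] {
                "age", "email", "phone", "address", "city", "country", "bio"}) {
            fields.put(detail, null);
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, Object> user = newUser();
        long tombstones = countTombstones(user);
        System.out.println((user.size() - tombstones) + " columns, "
                + tombstones + " tombstones");
    }
}
```

Every account created this way pays the same cost, so the number of tombstones grows linearly with the number of sparse inserts.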
Why not skip the null fields in the first place, and so avoid creating useless tombstones?
The insert strategies below address this issue.
The insert all fields strategy (InsertStrategy.ALL_FIELDS) is the default behavior for Achilles. Although it can create a lot of tombstones, this strategy has a big advantage with regard to data consistency.
Let's suppose you have the following sequence of code:
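// First insert: all fields set, including age = 32
manager
.crud()
.insert(new User(10,"johndoe","John DOE","iamjohndoe!",32))
.execute();
...
// Second insert for the same id, this time without an age
manager
.crud()
.insert(new User(10,"johndoe","John DOE","iamjohndoe!"))
.execute();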
What we would expect is that the 2nd insert() operation wipes all the data from the first insert(), and that is the case with the default insert all fields strategy.
Indeed, the second insert() will generate an INSERT INTO user(id,login,name,password,age,...) VALUES(10,'johndoe','John DOE','iamjohndoe!',null,null....) statement, which erases the previous value of the age field and saves the day.
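To make the generated statement concrete, here is a minimal sketch of how an all-fields INSERT could be rendered from an entity's fields. It is an illustration only: the AllFieldsInsert class and its methods are hypothetical, the entity is modelled as a plain map, and values are inlined as literals for readability, which is not necessarily how Achilles builds its statements.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class AllFieldsInsert {

    // Renders a value as a CQL literal: null stays null, strings are single-quoted.
    static String literal(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof String) {
            return "'" + ((String) value).replace("'", "''") + "'";
        }
        return String.valueOf(value);
    }

    // Lists every field, set or not, so unset fields are written as null.
    static String buildInsert(String table, Map<String, Object> fields) {
        StringJoiner columns = new StringJoiner(",");
        StringJoiner values = new StringJoiner(",");
        for (Map.Entry<String, Object> field : fields.entrySet()) {
            columns.add(field.getKey());
            values.add(literal(field.getValue()));
        }
        return "INSERT INTO " + table + "(" + columns + ") VALUES(" + values + ")";
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("id", 10L);
        user.put("login", "johndoe");
        user.put("name", "John DOE");
        user.put("password", "iamjohndoe!");
        user.put("age", null); // not provided on the second insert
        System.out.println(buildInsert("user", user));
    }
}
```

Because age appears in the column list with a null value, the write both removes the old age and leaves a tombstone behind.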
The not null fields strategy (InsertStrategy.NOT_NULL_FIELDS) only inserts the non-null fields of an entity. If we take the previous example:
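// First insert: all fields set, including age = 32
manager
.crud()
.insert(new User(10,"johndoe","John DOE","iamjohndoe!",32))
.execute();
...
// Second insert for the same id, this time without an age
manager
.crud()
.insert(new User(10,"johndoe","John DOE","iamjohndoe!"))
.execute();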
The second insert() will now generate an INSERT INTO user(id,login,name,password) VALUES(10,'johndoe','John DOE','iamjohndoe!') statement, overwriting the existing values for id/login/name/password but leaving the old age column intact.
If we fetch all data from Cassandra for this user, we will end up with inconsistent data: the row still carries the age 32 from the first insert.
To avoid that, you'll need to issue a delete() operation before inserting again.
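The stale column problem can be reproduced with a small in-memory simulation. This sketch makes several assumptions: a HashMap stands in for the stored row, the NotNullFieldsDemo class and its helpers are hypothetical, and clearing the map plays the role of the delete() call. It does not talk to Cassandra or Achilles.

```java
import java.util.HashMap;
import java.util.Map;

class NotNullFieldsDemo {

    // Writes every field; a null value removes the stored column.
    static void insertAllFields(Map<String, Object> row, Map<String, Object> entity) {
        for (Map.Entry<String, Object> field : entity.entrySet()) {
            if (field.getValue() == null) {
                row.remove(field.getKey());
            } else {
                row.put(field.getKey(), field.getValue());
            }
        }
    }

    // Writes only non-null fields; columns absent from the write are untouched.
    static void insertNotNullFields(Map<String, Object> row, Map<String, Object> entity) {
        for (Map.Entry<String, Object> field : entity.entrySet()) {
            if (field.getValue() != null) {
                row.put(field.getKey(), field.getValue());
            }
        }
    }

    // Builds the example user; age may be null when it is not provided.
    static Map<String, Object> user(Integer age) {
        Map<String, Object> entity = new HashMap<>();
        entity.put("id", 10L);
        entity.put("login", "johndoe");
        entity.put("name", "John DOE");
        entity.put("password", "iamjohndoe!");
        entity.put("age", age);
        return entity;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        insertNotNullFields(row, user(32));
        insertNotNullFields(row, user(null));
        System.out.println("without delete: age = " + row.get("age"));

        row.clear(); // stand-in for issuing delete() first
        insertNotNullFields(row, user(null));
        System.out.println("with delete:    age = " + row.get("age"));
    }
}
```

Swapping the second call for insertAllFields removes the stale age without a prior delete, which is the trade-off between the two strategies.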
The insert strategy can be customized for each entity, or globally using the @CompileTimeConfig annotation. To choose a strategy for a single entity, annotate it with @Strategy(insert = ...). There are 2 possible values:
- info.archinnov.achilles.type.InsertStrategy.ALL_FIELDS
  - Pros: data consistency
  - Cons: may generate a lot of tombstones
- info.archinnov.achilles.type.InsertStrategy.NOT_NULL_FIELDS
  - Pros: does not create useless tombstones
  - Cons: may introduce data inconsistency when overwriting an existing partition with new values
Insert Strategy priority
Priority (ascending order) | Description |
---|---|
1 (lowest priority) | Global insert strategy defined at compile time on @CompileTimeConfig |
2 | Locally on each entity using the @Strategy annotation |
3 (highest priority) | At runtime, using the .withInsertStrategy() method on the manager.crud().insert() DSL |
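The precedence in the table can be expressed as a simple fallback chain. The sketch below is an assumption about the resolution order only, not Achilles source code: the InsertStrategyResolver class is hypothetical, the nested enum merely mirrors the names of the real InsertStrategy values, and a null argument means "not configured at that level".

```java
class InsertStrategyResolver {

    // Local stand-in that mirrors the two strategy names from the text.
    enum InsertStrategy { ALL_FIELDS, NOT_NULL_FIELDS }

    // Highest priority first: runtime, then entity, then global,
    // falling back to ALL_FIELDS, which the text describes as the default.
    static InsertStrategy resolve(InsertStrategy global,
                                  InsertStrategy entity,
                                  InsertStrategy runtime) {
        if (runtime != null) {
            return runtime;
        }
        if (entity != null) {
            return entity;
        }
        if (global != null) {
            return global;
        }
        return InsertStrategy.ALL_FIELDS;
    }

    public static void main(String[] args) {
        // The entity-level setting overrides the global one when no runtime value is given.
        System.out.println(resolve(InsertStrategy.ALL_FIELDS,
                InsertStrategy.NOT_NULL_FIELDS, null));
    }
}
```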