Functions Mapping
Achilles supports the new Cassandra 3.0 UDF (User Defined Function) and UDA (User Defined Aggregate) features. Achilles does not distinguish between UDF and UDA and treats both as plain function calls.
Before you can declare a UDF / UDA, you need to create a function registry. To do so, create a class, abstract class or interface and annotate it with @FunctionRegistry
@FunctionRegistry
public interface MyFunctionRegistry {
...
}
Once created, you can register your functions using the registry class:
@FunctionRegistry
public interface MyFunctionRegistry {
Long string_to_long(String input);
String long_to_string(Long input);
}
Remark: because of a limitation of the JDK annotation processing API, function return types cannot be declared as primitives. Use the corresponding boxed types instead.
The types used in the method signatures can be any of the types defined in Supported Java Types. Furthermore, function declarations can leverage the Achilles Codec System for both parameter types and the return type!
@FunctionRegistry
public interface FunctionsWithCodecSystemRegistry {
// CQL function signature = list_to_json(consistencylevels list<text>), returns text
String list_to_json(List<@Enumerated ConsistencyLevel> consistencyLevels);
// CQL function signature = get_int_value(input text), returns text
@Codec(IntToString.class) String get_int_value(String input);
}
In Cassandra, a function declaration and its usage are scoped to a keyspace. Therefore the @FunctionRegistry annotation exposes a keyspace attribute to specify which keyspace the declared functions belong to.
@FunctionRegistry(keyspace = "production")
public interface ProductionFunctionRegistry {
...
}
It is possible to declare many functions in the same registry class, as well as to have many function registry classes; you just need to annotate each of them with @FunctionRegistry. However, duplicate function declarations are not allowed: Achilles will issue a compilation error if it finds 2 functions with the same name and the same signature (parameter types and return type) declared in the same keyspace, whether or not they are declared in the same registry class.
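The duplicate rule above can be sketched in plain Java. This is only an illustration of the rule as stated (same keyspace, same name, same parameter types and return type), not the code of the Achilles annotation processor; the `FunctionDecl` and `DuplicateFunctionCheck` classes are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of one declared function; this is NOT an Achilles class.
final class FunctionDecl {
    final String keyspace;
    final String name;
    final List<String> parameterTypes;
    final String returnType;

    FunctionDecl(String keyspace, String name, List<String> parameterTypes, String returnType) {
        this.keyspace = keyspace;
        this.name = name;
        this.parameterTypes = parameterTypes;
        this.returnType = returnType;
    }

    // Identity used for the duplicate check: keyspace + name + full signature.
    String key() {
        return keyspace + "." + name + "(" + String.join(",", parameterTypes) + ")->" + returnType;
    }
}

final class DuplicateFunctionCheck {
    // Returns the keys declared more than once, in first-seen order.
    // The registry class a declaration comes from is deliberately ignored.
    static List<String> findDuplicates(List<FunctionDecl> declarations) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (FunctionDecl declaration : declarations) {
            String key = declaration.key();
            if (!seen.add(key) && !duplicates.contains(key)) {
                duplicates.add(key);
            }
        }
        return duplicates;
    }
}
```

With this model, the same function declared twice in keyspace `production` is reported once, while an identical declaration in another keyspace is not a duplicate.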
In order to guarantee type-safe function calls in its API, Achilles generates a class hierarchy covering all Supported Java Types as well as all types used in user entities. Those types are parsed at compile time by the annotation processor, and the corresponding classes are generated inside the package info.archinnov.achilles.generated.function.
An example of generated type classes:
// Native supported types
Array_Byte_Type
Array_Primitive_byte_Type
...
BigDecimal_Type
BigInteger_Type
Boolean_Type
...
ZonedDateTime_Type
// Non native types present in user entities, views & function registries
List_String_Type
Optional_String_Type
Tuple2_Integer_List_String_Type
...
The naming convention for those type classes is quite simple: they all end with the _Type suffix, and the type name is normalized by replacing generic delimiters such as < and > with _. For example, if the type Map<Integer, List<String>> is used in an entity, the type class Map_Integer_List_String_Type will be generated.
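The naming convention can be approximated with a few lines of Java. This is a sketch inferred from the examples on this page, not the generator's actual code: the `TypeClassNames` class and its method are hypothetical, and arrays (for example `Array_Primitive_byte_Type`) are not handled.

```java
// Hypothetical helper that mimics the naming convention described above.
final class TypeClassNames {

    // Replaces every run of '<', '>', ',' or whitespace with a single '_',
    // drops any trailing '_' and appends the "_Type" suffix.
    static String toTypeClassName(String javaType) {
        String normalized = javaType.trim().replaceAll("[<>,\\s]+", "_");
        normalized = normalized.replaceAll("_+$", "");
        return normalized + "_Type";
    }

    public static void main(String[] args) {
        // prints Map_Integer_List_String_Type
        System.out.println(toTypeClassName("Map<Integer, List<String>>"));
    }
}
```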
We need all these type classes (not to be confused with type classes in Scala) to generate the type-safe function call API.
At compile time, Achilles parses all the methods and their signatures to generate 2 classes:
- info.archinnov.achilles.generated.function.SystemFunctions: this class contains all the pre-defined system functions in Cassandra, like now(), ttl(), writetime() ...
- info.archinnov.achilles.generated.function.FunctionsRegistry: this class contains the compiled version of all the functions declared in all registries (classes annotated with @FunctionRegistry)
Those classes will be useful for the Select DSL API. Example:
manager
.dsl()
.select()
.id()
.function(SystemFunctions.toUnixTimestamp(SimpleEntity_AchillesMeta.COLUMNS.DATE), "dateAsLong")
.function(SystemFunctions.writetime(SimpleEntity_AchillesMeta.COLUMNS.VALUE), "writetimeOfValue")
.fromBaseTable()
...
The SystemFunctions.toUnixTimestamp(Date_Type input) signature accepts a Date_Type type class as input. To apply this function, we have to provide a column of the same type. Fortunately, for each entity XXX, Achilles generates a class XXX_AchillesMeta.COLUMNS which lists all annotated fields of the entity, and the type of each of those columns is one of the generated type classes.
Please note that it is possible to have nested function calls as long as the input type class and returned type class match:
manager
.dsl()
.select()
.id()
.function(SystemFunctions.max(SystemFunctions.writetime(SimpleEntity_AchillesMeta.COLUMNS.VALUE)), "maxWritetimeOfValue")
.fromBaseTable()
...
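To see why nested calls are checked by the compiler, here is a minimal stand-alone model of the idea. None of these classes are the ones generated by Achilles: `TypeRef`, the stand-in `String_Type` / `Long_Type` and `MockSystemFunctions` are assumptions made for the sketch, and the real signatures may differ.

```java
// Stand-in base class for the generated type classes; carries the CQL fragment.
abstract class TypeRef {
    private final String cql;

    TypeRef(String cql) {
        this.cql = cql;
    }

    String cql() {
        return cql;
    }
}

// Stand-ins for two generated type classes.
final class String_Type extends TypeRef {
    String_Type(String cql) {
        super(cql);
    }
}

final class Long_Type extends TypeRef {
    Long_Type(String cql) {
        super(cql);
    }
}

// Mock of a function holder: each function declares the type class it
// accepts and the type class it returns.
final class MockSystemFunctions {

    // In this sketch writetime() accepts any column and returns a Long_Type.
    static Long_Type writetime(TypeRef column) {
        return new Long_Type("writetime(" + column.cql() + ")");
    }

    // In this sketch max() only accepts a Long_Type, so it can wrap writetime().
    static Long_Type max(Long_Type input) {
        return new Long_Type("max(" + input.cql() + ")");
    }
}
```

In this model, `MockSystemFunctions.max(MockSystemFunctions.writetime(new String_Type("value")))` compiles because the returned `Long_Type` matches the parameter of `max`, whereas `MockSystemFunctions.max(new String_Type("value"))` is rejected by the compiler.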
You can also call a UDF / UDA using the class info.archinnov.achilles.generated.function.FunctionsRegistry:
manager
.dsl()
.select()
.id()
.function(FunctionsRegistry.list_to_json(SimpleEntity_AchillesMeta.COLUMNS.CONSISTENCY_LIST), "consistency_list_to_json")
.fromBaseTable()
...
The function(FunctionCall functionCall, String alias) method requires a 2nd parameter, which is the alias for this function call. Cassandra does not make an alias mandatory: you can always retrieve the function call value on a row using the canonical name function_name(column_name *), but it is much easier and less error-prone to require an explicit alias for every function call.
Please note that when you use a function call at runtime through the SELECT DSL API, Achilles can only return a TypedMap, because it cannot map the value of this function call from the CQL row to an entity field.
If you want to map the result of a function call to a field of an entity, use the @ComputedColumn annotation instead.
Example:
TypedMap row = manager
.dsl()
.select()
.id()
.function(FunctionsRegistry.list_to_json(SimpleEntity_AchillesMeta.COLUMNS.CONSISTENCY_LIST), "consistency_list_to_json")
.fromBaseTable()
.where()
.id().Eq(id)
.getTypedMap();
// Retrieve the function call result using the alias
String json = row.getTyped("consistency_list_to_json");
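The reason no cast is needed when reading the aliased value can be illustrated with a small stand-in. `SimpleTypedMap` below is a hypothetical class written for this example and is not the Achilles TypedMap; it only assumes that values are stored by column name or alias and read back through a generic accessor.

```java
import java.util.LinkedHashMap;

// Hypothetical stand-in for a typed row map, keyed by column name or alias.
final class SimpleTypedMap extends LinkedHashMap<String, Object> {

    private static final long serialVersionUID = 1L;

    // The target type is inferred from the assignment at the call site.
    // A wrong target type is not caught at compile time; it fails with a
    // ClassCastException when the value is assigned.
    @SuppressWarnings("unchecked")
    <T> T getTyped(String key) {
        return (T) get(key);
    }
}
```

The trade-off of this design is that type safety ends at the map boundary: the alias and the expected Java type are only checked at runtime.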
Since CASSANDRA-10783 it is possible to call functions or aggregates with literal values as arguments.
Suppose you have defined a function maxOf(param1 int, param2 int) and you want to use it to compare the value of some columns against a literal.
From Cassandra 3.10, you will be able to do:
cqlsh> SELECT maxOf(my_int, 3) FROM keyspace.table WHERE id=xxx;
To support this feature in Achilles with type-safe function calls, every generated type class (BigDecimal_Type, Long_Type, ...) now exposes a wrap(TYPE literal) method to build an instance of this type wrapping the provided literal value.
With the maxOf(param1 int, param2 int) function above:
manager
.dsl()
.select()
.function(FunctionsRegistry.maxOf(COLUMNS.MY_INT, Integer_Type.wrap(10)), "maxOf")
...
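The idea behind wrap() can be modeled as follows. This is a sketch under assumptions: the `Integer_Type` below is a stand-in, not the generated class, `column(...)` and `MockFunctionsRegistry` are hypothetical, and how Achilles actually renders or binds the literal is not described here (the sketch simply inlines it).

```java
// Stand-in for a generated type class that can represent a column or a literal.
final class Integer_Type {
    private final String cql;

    Integer_Type(String cql) {
        this.cql = cql;
    }

    // Hypothetical factory for a column reference (in Achilles this role is
    // played by the constants in XXX_AchillesMeta.COLUMNS).
    static Integer_Type column(String name) {
        return new Integer_Type(name);
    }

    // Wraps a literal so it can be passed wherever an Integer_Type is expected.
    static Integer_Type wrap(Integer literal) {
        return new Integer_Type(String.valueOf(literal));
    }

    String cql() {
        return cql;
    }
}

// Mock of the generated registry exposing the user-defined maxOf function.
final class MockFunctionsRegistry {
    static Integer_Type maxOf(Integer_Type param1, Integer_Type param2) {
        return new Integer_Type("maxOf(" + param1.cql() + ", " + param2.cql() + ")");
    }
}
```

Because a wrapped literal and a column share the same type class, `MockFunctionsRegistry.maxOf(Integer_Type.column("my_int"), Integer_Type.wrap(10))` goes through the same compile-time check as a call with two columns.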
Pretty cool, eh?
Best of all, this also works for aggregates, opening the door to custom aggregation functions that take literal values provided at runtime for filtering.