Feast integration WIP (hazelcast#1163)
Making this available for review - I'll be back on the 22nd, and we can
pick this up again; I have not started the tutorials, but this covers
the majority of the other work identified in the doc plan

Sorry, @oliverhowell with no sprint active, I couldn't find a way to
create a ticket on the doc board for this; we can jimmy rig it to
connect the two later
rebekah-lawrence authored Jul 25, 2024
1 parent 45adac6 commit ff84d11
Showing 6 changed files with 271 additions and 1 deletion.
5 changes: 4 additions & 1 deletion docs/modules/ROOT/nav.adoc
@@ -43,7 +43,7 @@
** xref:migrate:upgrading-from-imdg-3.adoc[]
** xref:migrate:migration-tool-imdg.adoc[]
*** xref:migrate:dmt-command-reference.adoc[]
* xref:release-notes:releases.adoc[Release notes]
//* xref:release-notes:releases.adoc[Release notes]
// * xref:placeholder.adoc[Troubleshooting]
// * xref:placeholder.adoc[FAQ]
@@ -179,6 +179,9 @@ include::wan:partial$nav.adoc[]
** xref:spring:hibernate.adoc[]
** xref:spring:transaction-manager.adoc[]
** xref:spring:best-practices.adoc[]
* xref:integrate:integrate-with-feast.adoc[]
** xref:integrate:install-connect.adoc[]
** xref:integrate:feast-config.adoc[]
* xref:integrate:kafka-connect-connectors.adoc[]
* Messaging System Connectors
** xref:integrate:messaging-system-connectors.adoc[Overview]
127 changes: 127 additions & 0 deletions docs/modules/integrate/pages/feast-config.adoc
@@ -0,0 +1,127 @@
= Configure Feature Store
:page-enterprise: true
:description: To use a Feast project, you must configure the feature store. The configuration is defined in a YAML configuration file.

{description}

The feature store forms part of the feature repository. The feature repository stores the configuration to run Feast on your infrastructure and the feature definitions in a central location. It provides the declarative source of truth for the desired state of the feature store. The feature repository consists of the following:

* Feature declarations in Python files
* Infrastructural configuration in the _feature_store.yaml_ file
* Paths to ignore in the feature repository in the _.feastignore_ file

By default, the feature repository is the current directory. This can be changed using the Feast CLI, as described in the link:https://docs.feast.dev/reference/feast-cli-commands[Feast CLI reference, window=_blank] documentation.

For further information on the feature repository, refer to the link:https://docs.feast.dev/reference/feature-repository[Feature repository, window=_blank] topic of the Feast documentation.

The _feature_store.yaml_ file defines the following:

* The environment in which Feast deploys and operates. This is defined in the `provider` section
* The location of the feature registry. This is defined in the `registry` section
* The online store. This is defined in the `online_store` section
* The offline store. This is defined in the `offline_store` section
* The namespace to use for the feature store. This can be used to isolate multiple deployments in a single Feast installation, and can contain only letters, numbers, and underscores. This is defined in the `project` section
* The materialization engine. This is defined in the `batch_engine` section
* The serialization scheme for entity keys. Feast serializes each entity key to a byte string so that it can be used as a lookup key in a hash table. The version of the scheme to use is defined in the `entity_key_serialization_version` section

== Configure Hazelcast as Online Store

The `online_store` section configures the online store and initializes a config object, which Feast passes to the `OnlineStore` interface to create the connection.

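To show where the `online_store` section sits in relation to the other sections, the following is a minimal sketch of a complete _feature_store.yaml_ file. The project name, registry path, local provider, and file-based offline store are illustrative assumptions rather than values required by the Hazelcast integration, and the serialization version you need depends on your Feast release.

[source,yaml]
----
# Hypothetical values for illustration only; replace them with your own.
project: my_feature_repo              # namespace: letters, numbers, and underscores only
registry: data/registry.db            # location of the feature registry (assumed local file)
provider: local                       # assumed local deployment environment
online_store:
  type: hazelcast                     # selects Hazelcast as the online store
  cluster_name: dev
  cluster_members: ["localhost:5701"]
offline_store:
  type: file                          # assumed file-based offline store
entity_key_serialization_version: 2   # assumed version; check your Feast release
----
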
The available options are as follows:

[cols="1,1,1,1"]
|===
|Name |Type |Description |Default Value

|type
|String
|Online store type selector
|"hazelcast"

|cluster_name
|String
|Name of the cluster to which you want to connect
|"dev"

|cluster_members
|List[String]
|List of member addresses connected to your cluster
|["localhost:5701"]

|discovery_token
|String
|Discovery token used by a Hazelcast Cloud cluster
|""

|ssl_cafile_path
|String
|Absolute path of CA certificates in PEM format
|""

|ssl_certfile_path
|String
|Absolute path of the client certificate in PEM format
|""

|ssl_keyfile_path
|String
|Absolute path of the private key file for the client certificate in PEM format
|""

|ssl_password
|String
|Password for decrypting the keyfile, if encrypted
|""

|key_ttl_seconds
|Integer
|Hazelcast key bin TTL for expiring entities, in seconds
|0

|creds_username
|String
|Enterprise only.
Username for credential authentication
|""

|creds_password
|String
|Enterprise only.
Password for credential authentication
|""
|===

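As a rough sketch, an `online_store` section that spells out the default values listed in the table would look like the following. Omitting an option should have the same effect as setting it to its default, although that equivalence is an assumption drawn from the table and has not been checked against a running cluster.

[source,yaml]
----
online_store:
  type: hazelcast
  cluster_name: dev                    # default cluster name
  cluster_members: ["localhost:5701"]  # default member address
  key_ttl_seconds: 0                   # default TTL value from the table
----
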
=== Example Local Cluster Configuration

You can configure the connection to a local cluster as shown in the following example:

[source,yaml]
----
online_store:
  type: hazelcast
  cluster_name: <YOUR_CLUSTER_NAME>
  cluster_members: [<YOUR_CLUSTER_MEMBERS>]
  ssl_cafile_path: <PATH_TO_CA_CERT>
  ssl_certfile_path: <PATH_TO_CLIENT_CERT>
  ssl_keyfile_path: <PATH_TO_PRIVATE_KEY_FILE>
  ssl_password: <YOUR_SSL_PASSWORD>
  key_ttl_seconds: <TTL_IN_SECONDS>
----

=== Example SSL-enabled Cluster Configuration

You can configure an SSL-enabled connection as shown in the following example:

[source,yaml]
----
online_store:
  type: hazelcast
  cluster_name: <YOUR_CLUSTER_NAME>
  cluster_members: [<YOUR_CLUSTER_MEMBERS>]
  ssl_cafile_path: <PATH_TO_CA_CERT>
  ssl_certfile_path: <PATH_TO_CLIENT_CERT>
  ssl_keyfile_path: <PATH_TO_PRIVATE_KEY_FILE>
  ssl_password: <YOUR_SSL_PASSWORD>
----

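=== Example Hazelcast Cloud Configuration

The options table also lists a `discovery_token` for Hazelcast Cloud clusters. A connection of that kind might be sketched as follows. This assumes that the token replaces the `cluster_members` list; your cluster may also require the SSL options shown in the previous example.

[source,yaml]
----
online_store:
  type: hazelcast
  cluster_name: <YOUR_CLUSTER_NAME>
  discovery_token: <YOUR_DISCOVERY_TOKEN>  # assumed to be used instead of cluster_members
----

=== Example Credential Authentication Configuration

For an Enterprise cluster that uses credential authentication, a sketch based on the `creds_username` and `creds_password` options in the table could look like this:

[source,yaml]
----
online_store:
  type: hazelcast
  cluster_name: <YOUR_CLUSTER_NAME>
  cluster_members: [<YOUR_CLUSTER_MEMBERS>]
  creds_username: <YOUR_USERNAME>  # Enterprise only
  creds_password: <YOUR_PASSWORD>  # Enterprise only
----
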
13 changes: 13 additions & 0 deletions docs/modules/integrate/pages/install-connect.adoc
@@ -0,0 +1,13 @@
= Install Feast and Connect to Hazelcast
:page-enterprise: true
:description: Before you can use Feast with Hazelcast as an online store, you must install Feast and connect to Hazelcast.

{description}

== Install Feast

include::partial$feast-install.adoc[]

== Connect to Hazelcast

include::partial$feast-connect.adoc[]
119 changes: 119 additions & 0 deletions docs/modules/integrate/pages/integrate-with-feast.adoc
@@ -0,0 +1,119 @@
= Integrate with Feast
:page-enterprise: true
:description: pass:q[Feast (**Fea**ture **St**ore) is a customizable operational data system, which uses your existing infrastructure to manage and serve machine learning features to real-time models. When integrated with Hazelcast, you can benefit from an online store that supports materializing feature values in a running Hazelcast cluster.]

{description}

This approach provides the following to support your real-world machine learning (ML) requirements with Feast:

* One-to-one mapping of each feature view to a specific IMap data structure
+
Feast creates a new IMap for each feature view, so every feature view corresponds to an IMap in the Hazelcast cluster, and the entries in that IMap correspond to the features of entities. Each feature value is stored separately and can be retrieved individually.
* Hazelcast's inherent strengths, such as high availability, fault tolerance, and data distribution
* Support for a secure TLS/SSL connection to your Hazelcast online store
* The ability to set Time-to-Live (TTL) for features in your Hazelcast cluster

== What is Feast?

Feast is an open-source feature store that can be used on your existing infrastructure to manage and serve ML features to real-time models.

A feature store is a central repository where features can be stored and processed for reuse or sharing. A feature store is used for ML pipelines in much the same way as data warehousing is used for analytics. A feature store supports the discovery, documentation, and reuse of features while ensuring their correctness. This means that you can use exactly the same features in your model training and production, which helps to prevent skewed model predictions.

Feast can help to decouple ML from your data infrastructure. This can be useful in a number of ways; for example, to allow you to move between model types, such as training models to serving models, or from one data infrastructure system to another.

When creating a Feast project, you can define one or more feature views. A feature view is an object that represents a logical group of time-series feature data from the provided raw underlying data, which is known as the data source. If related to a specific object, the feature view has one or more entities.

An entity is a collection of semantically related features. Each entity in a feature store consists of the entity name and an associated value, known as a join key. The name uniquely identifies the entity, and the join key identifies the physical primary key on which feature values are joined during retrieval.

With Feast, you use the following in your ML models:

* Historical data to support predictions that allow scaling to improve model performance
* Real-time data to support data-driven insights
* Pre-computed features that can be served online

For further information on Feast, refer to the link:https://docs.feast.dev/v/master[Feast, window=_blank] documentation.

== Why Integrate with Hazelcast?

Feast supports the following:

* Batch feature creation. The data in the batch - or offline - store can be transformed, for example, using SQL
* Stream feature creation. Streaming services, such as Kafka, supply the data from which stream features are created
* Materialization. Data is loaded from the offline store to the online store

Integrating Feast with Hazelcast allows you to do the following:

* Use a Hazelcast cluster for your online data store. An IMap in the cluster contains the feature values, identified by the project and feature view names. You can push data to, and retrieve data from, the IMap to provide a real-time online store
* Connect a Feast feature server to stream data. This avoids the need for a temporary IMap sink, provides a real-time solution, and ensures that your features are always fresh

See the <<what-next,tutorials>> for some worked examples of using Hazelcast with Feast.

== Functionality Matrix

The following table shows which Feast functionality is supported by the Hazelcast integration:

[cols="2,1"]
|===
|Feast functionality | Supported by Hazelcast?

|Write feature values to the online store
|Yes

|Read feature values from the online store
|Yes

|Update infrastructure, such as tables, in the online store
|Yes

|Tear down infrastructure, such as tables, in the online store
|Yes

|Generate a plan of infrastructure changes
|No

|Support for online transforms
|Yes

|Readable by Python SDK
|Yes

|Readable by Java
|No

|Readable by Go
|No

|Support for entityless feature views
|Yes

|Support for concurrent writing to the same key
|Yes

|Support for TTL at retrieval
|Yes

|Support for deleting expired data
|Yes

|Collocated by feature view
|No

|Collocated by feature service
|No

|Collocated by entity key
|Yes
|===

== What Next?

To use Feast with Hazelcast, you must do the following:

. xref:integrate:install-connect.adoc[Install Feast and connect to Hazelcast]
. xref:integrate:feast-config.adoc[Configure Feast]

You can also work through the following tutorials:

* Get Started with Feature Store
* Feature Compute and Transformation
4 changes: 4 additions & 0 deletions docs/modules/integrate/partials/feast-connect.adoc
@@ -0,0 +1,4 @@
To connect Feast to Hazelcast, enter the following command in your terminal:

[source,console]
feast init REPO_NAME -t hazelcast
4 changes: 4 additions & 0 deletions docs/modules/integrate/partials/feast-install.adoc
@@ -0,0 +1,4 @@
To install Feast, enter the following command in your terminal:

[source,console]
pip install 'feast[hazelcast]'
