Add documentation for using DataConnectionService [SUP-523] (hazelcast#1361)

Related platform PR https://github.com/hazelcast/hazelcast-mono/pull/3533

Co-authored-by: Tomasz Gawęda <[email protected]>
Co-authored-by: Rob Swain <[email protected]>
Rob-Hazelcast · Dec 4, 2024 · 82107c7 · 82107c7
1 parent c0d95b9
commit 82107c7
Showing 4 changed files with 567 additions and 0 deletions.
diff --git a/docs/antora.yml b/docs/antora.yml
@@ -47,5 +47,6 @@ asciidoc:
     hazelcast-cloud: Cloud
     ucn: User Code Namespaces
     ucd: User Code Deployment
+    minimum-java-version: 17
 nav:
   - modules/ROOT/nav.adoc
diff --git a/docs/modules/data-connections/pages/data-connection-service.adoc b/docs/modules/data-connections/pages/data-connection-service.adoc
@@ -0,0 +1,74 @@
+= Using Data Connections in custom components
+:description: Using the Data Connection Service gives access to the configured xref:data-connections-configuration.adoc[data connections] in custom components.
+
+{description}
+
+== Typical Usage
+
+The typical steps to use a data connection are as follows:
+
+1. Obtain the data connection from the data connection service.
+2. Retrieve the underlying resource from the `DataConnection` instance. This step varies based on the specific implementation of `DataConnection` (e.g., `JdbcDataConnection` provides `getConnection()` which returns a `java.sql.Connection`; `HazelcastDataConnection` provides `getClient()` which returns a `HazelcastInstance`).
+3. Use the resource to perform the required operations.
+4. Dispose of the resource (e.g., by calling `Connection#close` or `HazelcastInstance#shutdown`).
+5. Release the `DataConnection` instance (by calling `DataConnection#release()`).
+
+Steps 2, 3, and 4 should be completed as quickly as possible to maximize the efficiency of connection pooling.
+
+[source,java]
+----
+JdbcDataConnection jdbcDataConnection = instance.getDataConnectionService()
+                .getAndRetainDataConnection("my_data_connection", JdbcDataConnection.class); <1>
+
+try (Connection connection = jdbcDataConnection.getConnection()) { <2>
+    // ... work with connection <3>
+
+    // try-with-resources statement closes the connection <4>
+} catch (SQLException e) {
+    throw new RuntimeException("Failed to use the data connection", e);
+} finally {
+    // release in finally so the data connection is released even if the work above fails
+    jdbcDataConnection.release(); <5>
+}
+----
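+
+The ordering of the five steps, including the failure path, can be sketched with plain JDK code. This is a minimal illustration only: `FakeDataConnection` and `FakeResource` are hypothetical stand-ins that record the call order and are not part of the Hazelcast API.
+

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins, NOT the Hazelcast API: they only record the call order. */
final class ReleaseOrderSketch {

    /** Stand-in for the underlying resource of step 2 (e.g. a java.sql.Connection). */
    static final class FakeResource implements AutoCloseable {
        private final List<String> log;

        FakeResource(List<String> log) {
            this.log = log;
            log.add("get resource");
        }

        void use(boolean fail) {
            log.add("use resource");
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
        }

        @Override
        public void close() {
            log.add("close resource");
        }
    }

    /** Stand-in for a retained data connection (steps 1 and 5). */
    static final class FakeDataConnection {
        private final List<String> log;

        FakeDataConnection(List<String> log) {
            this.log = log;
            log.add("retain");
        }

        FakeResource getResource() {
            return new FakeResource(log);
        }

        void release() {
            log.add("release");
        }
    }

    /** Runs steps 1 to 5 and returns the recorded call order. */
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        FakeDataConnection dataConnection = new FakeDataConnection(log); // step 1
        try {
            try (FakeResource resource = dataConnection.getResource()) { // step 2
                resource.use(fail);                                      // step 3
            }                                                            // step 4: closed here
        } catch (IllegalStateException e) {
            log.add("handle failure");
        } finally {
            dataConnection.release();                                    // step 5: always runs
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

+
+In both runs the resource is closed before the data connection is released, and the release happens even when the work in step 3 throws.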
+
+== Retrieve Data Connection Service
+
+Before working with data connections, obtain the `DataConnectionService` by calling
+https://docs.hazelcast.org/docs/{full-version}/javadoc/com/hazelcast/core/HazelcastInstance.html#getDataConnectionService()[`HazelcastInstance#getDataConnectionService()`].
+
+To get access to the `HazelcastInstance` in listeners, entry processors, tasks, and similar components,
+implement `HazelcastInstanceAware`.
+
+In the pipeline API you can use
+https://docs.hazelcast.org/docs/{full-version}/javadoc/com/hazelcast/jet/core/ProcessorMetaSupplier.Context.html#dataConnectionService()[`ProcessorMetaSupplier.Context#dataConnectionService()`].
+
+NOTE: The Data Connection Service is only available on the member side. Calling `getDataConnectionService()` on a client throws an `UnsupportedOperationException`.
+
+== Retrieve Configured DataConnection
+
+Use the `DataConnectionService` to get an instance of a previously configured data connection by calling https://docs.hazelcast.org/docs/{full-version}/javadoc/com/hazelcast/dataconnection/DataConnectionService.html#getAndRetainDataConnection(java.lang.String,java.lang.Class)[`DataConnectionService#getAndRetainDataConnection(String, Class)`]. For details on how to configure a data connection, refer
+to the xref:data-connections-configuration.adoc[Configuring Data Connections] page.
+
+== Data Connection Scope
+
+The data connection configuration is per-member. For example, when a data connection is created
+with maximum pool size of 10 and the cluster has 3 members, there will be up to 30 connections
+created.
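+
+The arithmetic behind this upper bound can be written down as a small helper. This is a sketch only; `maxClusterConnections` is a hypothetical name, not a Hazelcast method.
+

```java
/** Sketch of the per-member pool arithmetic; the method name is hypothetical. */
final class PoolSizeSketch {

    /** Upper bound on connections opened across the cluster for one data connection. */
    static int maxClusterConnections(int maxPoolSizePerMember, int memberCount) {
        // Every member creates its own pool, so the bounds multiply.
        return Math.multiplyExact(maxPoolSizePerMember, memberCount);
    }

    public static void main(String[] args) {
        // Pool size 10 on each of 3 members
        System.out.println(maxClusterConnections(10, 3)); // prints 30
    }
}
```

+
+Keep this product in mind when sizing the connection limit of the external system.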
+
+== Data Connection Sharing
+
+A data connection is shared by default: when the data connection is requested in multiple places, the same
+underlying resource (e.g., JDBC pool, remote client) is used.
+If you want to share the data connection configuration but use a different instance of the underlying resource,
+call `DataConnectionConfig#setShared(false)`.
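+
+A non-shared data connection could be configured programmatically along these lines. This is a configuration fragment, assuming the Hazelcast dependency is on the classpath; the connection name and JDBC URL are placeholder values.
+

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.DataConnectionConfig;

// Configuration sketch only; the name and JDBC URL below are placeholders.
class NonSharedDataConnectionConfigSketch {
    public static void main(String[] args) {
        Config config = new Config();

        DataConnectionConfig dcc = new DataConnectionConfig("my_data_connection");
        dcc.setType("JDBC");
        dcc.setProperty("jdbcUrl", "jdbc:postgresql://localhost:5432/postgres");
        // Opt out of the default sharing of the underlying resource.
        dcc.setShared(false);

        config.addDataConnectionConfig(dcc);
    }
}
```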
+
+== Configuration Considerations
+
+If the data connection is defined in the Hazelcast configuration, it remains immutable for the entire lifespan of the Hazelcast member. In this case, whether you retrieve the `DataConnection` instance once or each time before accessing the underlying resource, the result will be the same.
+
+However, if the data connection is created dynamically via SQL, it can be replaced using `CREATE OR REPLACE DATA CONNECTION`
+(see xref:sql.adoc).
+In such cases, the `DataConnection` instance will stay valid until you release it, allowing you to retrieve the underlying resource as needed. This approach can be useful for adapting to changes in data connection configuration.
+
+For example, if you are running a batch job and want to use the same data connection throughout, request the connection at the start of the job. For a streaming job that may need updated configurations, retrieve both the data connection and the underlying resource just before use (e.g., when processing each item in the pipeline).
diff --git a/docs/modules/data-connections/pages/how-to-map-loader-data-connection.adoc b/docs/modules/data-connections/pages/how-to-map-loader-data-connection.adoc
@@ -0,0 +1,278 @@
+= Map Loader using Data Connection
+
+:description: In this tutorial you build a custom map loader that uses a configured data connection to load data not present in an IMap.
+
+{description}
+
+NOTE: This tutorial builds a custom implementation of MapLoader. For the most common use cases we also provide an out-of-the-box implementation xref:mapstore:configuring-a-generic-maploader.adoc[GenericMapLoader].
+
+== Before you begin
+
+To complete this tutorial, you need the following:
+
+[cols="1a,1a"]
+|===
+|Prerequisites|Useful resources
+
+|Java {minimum-java-version} or newer
+|
+|Maven or Gradle
+| https://maven.apache.org/install.html or https://gradle.org/install/
+|Docker
+|https://docs.docker.com/get-started/[Get Started on docker.com]
+
+|===
+
+== Step 1. Create and Populate the Database
+
+This tutorial uses Docker to run the Postgres database.
+
+Run the following command to start Postgres:
+
+[source, bash]
+----
+docker run --name postgres --rm -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres
+----
+
+Start the `psql` client:
+
+[source, bash]
+----
+docker exec -it postgres psql -U postgres
+----
+
+Create a table `my_table` and populate it with data:
+
+[source,sql]
+----
+CREATE TABLE my_table(id INTEGER PRIMARY KEY, value VARCHAR(128));
+
+INSERT INTO my_table VALUES (0, 'zero');
+INSERT INTO my_table VALUES (1, 'one');
+INSERT INTO my_table VALUES (2, 'two');
+INSERT INTO my_table VALUES (3, 'three');
+INSERT INTO my_table VALUES (4, 'four');
+INSERT INTO my_table VALUES (5, 'five');
+INSERT INTO my_table VALUES (6, 'six');
+INSERT INTO my_table VALUES (7, 'seven');
+INSERT INTO my_table VALUES (8, 'eight');
+INSERT INTO my_table VALUES (9, 'nine');
+----
+
+== Step 2. Create a New Java Project
+
+Create a blank Java project named `maploader-data-connection-example` and copy the following Maven `pom.xml` into it:
+
+[source,xml]
+----
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+
+    <groupId>org.example</groupId>
+    <artifactId>maploader-data-connection-example</artifactId>
+    <version>1.0-SNAPSHOT</version>
+
+    <name>maploader-data-connection-example</name>
+
+    <properties>
+        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+        <maven.compiler.release>17</maven.compiler.release>
+    </properties>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.hazelcast</groupId>
+            <artifactId>hazelcast</artifactId>
+            <version>6.0.0-SNAPSHOT</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.logging.log4j</groupId>
+            <artifactId>log4j-core</artifactId>
+            <version>2.24.1</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.logging.log4j</groupId>
+            <artifactId>log4j-slf4j2-impl</artifactId>
+            <version>2.24.1</version>
+        </dependency>
+        <dependency>
+            <groupId>org.postgresql</groupId>
+            <artifactId>postgresql</artifactId>
+            <version>42.7.4</version>
+        </dependency>
+    </dependencies>
+</project>
+----
+
+== Step 3. MapLoader
+
+The following map loader implements the `com.hazelcast.map.MapLoader` and `com.hazelcast.map.MapLoaderLifecycleSupport`
+interfaces.
+
+[source,java]
+----
+public class SimpleMapLoader implements MapLoader<Integer, String>, MapLoaderLifecycleSupport {
+
+    private JdbcDataConnection jdbcDataConnection;
+
+    // ...
+}
+----
+
+To implement the `MapLoaderLifecycleSupport` interface we need the following methods:
+
+[source,java]
+----
+    // ...
+
+    @Override
+    public void init(HazelcastInstance instance, Properties properties, String mapName) {
+        jdbcDataConnection = instance.getDataConnectionService()
+                .getAndRetainDataConnection("my_data_connection", JdbcDataConnection.class);
+    }
+
+    @Override
+    public void destroy() {
+        jdbcDataConnection.release();
+    }
+
+    // ...
+----
+
+To implement the `MapLoader` interface we need the following methods:
+
+[source,java]
+----
+    @Override
+    public String load(Integer key) {
+        try (Connection connection = jdbcDataConnection.getConnection();
+             PreparedStatement statement = connection.prepareStatement("SELECT value FROM my_table WHERE id = ?")) {
+
+            statement.setInt(1, key);
+            ResultSet resultSet = statement.executeQuery();
+            String value = null;
+            if (resultSet.next()) {
+                value = resultSet.getString("value");
+            }
+            return value;
+        } catch (SQLException e) {
+            throw new RuntimeException("Failed to load value for key=" + key, e);
+        }
+    }
+
+    @Override
+    public Map<Integer, String> loadAll(Collection<Integer> keys) {
+        Map<Integer, String> resultMap = new HashMap<>();
+        StringBuilder queryBuilder = new StringBuilder("SELECT id, value FROM my_table WHERE id IN (");
+
+        // Construct query for batch retrieval
+        keys.forEach(key -> queryBuilder.append("?,"));
+        queryBuilder.setLength(queryBuilder.length() - 1); // Remove last comma
+        queryBuilder.append(")");
+
+        try (Connection connection = jdbcDataConnection.getConnection();
+             PreparedStatement statement = connection.prepareStatement(queryBuilder.toString())) {
+
+            int index = 1;
+            for (Integer key : keys) {
+                statement.setInt(index++, key);
+            }
+
+            ResultSet resultSet = statement.executeQuery();
+            while (resultSet.next()) {
+                resultMap.put(resultSet.getInt("id"), resultSet.getString("value"));
+            }
+            return resultMap;
+        } catch (SQLException e) {
+            throw new RuntimeException("Failed to load values", e);
+        }
+    }
+
+    @Override
+    public Iterable<Integer> loadAllKeys() {
+        List<Integer> keys = new ArrayList<>();
+        try (Connection connection = jdbcDataConnection.getConnection();
+             PreparedStatement statement = connection.prepareStatement("SELECT id FROM my_table");
+             ResultSet resultSet = statement.executeQuery()) {
+
+            while (resultSet.next()) {
+                keys.add(resultSet.getInt("id"));
+            }
+            return keys;
+        } catch (SQLException e) {
+            throw new RuntimeException("Failed to load all keys", e);
+        }
+    }
+----
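+
+The placeholder construction used in `loadAll` can be isolated and checked without a database. The following is a minimal sketch; `InClauseSketch` and `buildQuery` are hypothetical names. It also makes the empty-input case explicit: with zero keys, the `StringBuilder` approach above would strip the opening parenthesis and produce invalid SQL.
+

```java
import java.util.Collections;

/** Sketch: builds the batch query with one '?' placeholder per key. */
final class InClauseSketch {

    static String buildQuery(int keyCount) {
        if (keyCount <= 0) {
            // An empty IN () list is not valid SQL; a caller could return an empty map instead.
            throw new IllegalArgumentException("keyCount must be positive, got " + keyCount);
        }
        // nCopies + join avoids appending and then trimming a trailing comma.
        String placeholders = String.join(",", Collections.nCopies(keyCount, "?"));
        return "SELECT id, value FROM my_table WHERE id IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        // prints: SELECT id, value FROM my_table WHERE id IN (?,?,?)
        System.out.println(buildQuery(3));
    }
}
```

+
+The values are still bound through `PreparedStatement#setInt`, so only the number of placeholders depends on the input.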
+
+== Step 4. MapLoader Example App
+
+Configure the data connection:
+
+[source,java]
+----
+public class MapLoaderExampleApp {
+    public static void main(String[] args) {
+        Config config = new Config();
+
+        DataConnectionConfig dcc = new DataConnectionConfig("my_data_connection");
+        dcc.setType("JDBC");
+        // Step 1 publishes the Postgres port 5432 on the host, so localhost works here
+        dcc.setProperty("jdbcUrl", "jdbc:postgresql://localhost:5432/postgres");
+        dcc.setProperty("user", "postgres");
+        dcc.setProperty("password", "postgres");
+        config.addDataConnectionConfig(dcc);
+
+    }
+}
+----
+
+Configure an IMap named `my_map` with the map loader:
+
+[source,java]
+----
+public class MapLoaderExampleApp {
+    public static void main(String[] args) {
+        // ...
+
+        MapStoreConfig mapStoreConfig = new MapStoreConfig();
+        mapStoreConfig.setClassName(SimpleMapLoader.class.getName());
+
+        MapConfig mapConfig = new MapConfig("my_map");
+        mapConfig.setMapStoreConfig(mapStoreConfig);
+        config.addMapConfig(mapConfig);
+
+
+    }
+}
+----
+
+Create a `HazelcastInstance` with the `Config`, get the IMap and read some data:
+
+[source,java]
+----
+public class MapLoaderExampleApp {
+    public static void main(String[] args) {
+        // ...
+
+        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
+        IMap<Integer, String> map = hz.getMap("my_map");
+
+        System.out.println("1 maps to " + map.get(1));
+        System.out.println("42 maps to " + map.get(42));
+    }
+}
+----
+
+When you run this class you should see the following output:
+
+[source,text]
+----
+1 maps to one
+42 maps to null
+----
+
+== Next steps
+
+Read through the xref:configuration:dynamic-config.adoc[Dynamic Configuration] section to find out how to add the
+`DataConnection` config and new `IMap` config with `MapStore` dynamically.