From fd0e71d4824a785968d61e923d47503a49c7bebd Mon Sep 17 00:00:00 2001 From: Natalia Ivakina Date: Wed, 2 Apr 2025 11:49:20 +0200 Subject: [PATCH 1/4] Add a page on how to remove secondary from the cluster on AWS --- modules/ROOT/content-nav.adoc | 1 + .../neo4j-cluster-cloud.adoc | 109 ++++++++++++++++++ 2 files changed, 110 insertions(+) create mode 100644 modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc diff --git a/modules/ROOT/content-nav.adoc b/modules/ROOT/content-nav.adoc index 3e130954b..897bac6d9 100644 --- a/modules/ROOT/content-nav.adoc +++ b/modules/ROOT/content-nav.adoc @@ -15,6 +15,7 @@ ** xref:cloud-deployments/neo4j-aws.adoc[] ** xref:cloud-deployments/neo4j-gcp.adoc[] ** xref:cloud-deployments/neo4j-azure.adoc[] +** xref:cloud-deployments/neo4j-cluster-cloud.adoc[] * xref:docker/index.adoc[] ** xref:docker/introduction.adoc[] diff --git a/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc b/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc new file mode 100644 index 000000000..ee04bb0d3 --- /dev/null +++ b/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc @@ -0,0 +1,109 @@ +:description: Tha page describes how to manage the Neo4j cluster on AWS. +:page-role: enterprise-edition + +[[neo4j-cluster-cloud-deployments]] += Neo4j cluster on self-managed cloud deployments + +Before diving into the topic, it is important to understand basics about Neo4j's clustering. + +Neo4j cluster consists of a homogenous pool of servers that collectively run a number of databases. +The servers can operate in two different database-hosting modes: _primary_ and _secondary_. +A server can simultaneously act as a primary host for one or more databases and as a secondary host for other databases. + +For more details on operational and application aspects of Neo4j's clustering, refer to the xref::clustering/index.adoc[Clustering in Neo4j]. 
+ +For information on how to manage databases and servers in a cluster, see respectively xref::clustering/databases.adoc[] and xref::clustering/servers.adoc[]. + + +== Neo4j cluster on AWS + +Neo4j does not provide Amazon Machine Images (AMIs) with a pre-installed version of the product. +The Neo4j AWS Marketplace listings (and listings on GitHub) use CloudFormation templates that deploy and configure Neo4j dynamically with a shell script. + + +// === Neo4j cluster and auto-scaling groups on AWS + + +=== Removing a (secondary constrained) server from the cluster + +Imagine you have a cluster consisting of three primary constrained servers and two secondary constrained servers. +This means that three servers host primary databases and the other two host secondary databases. + +When performing rolling updates on Amazon Machine Images (AMIs) for secondary servers, it is important to follow a structured approach. +Rotating AMIs is a common practice in such environments. + +However, simply removing secondary servers from the target Network Load Balancer (NLB) one by one does not prevent read requests from being routed to them. +This occurs because the NLB and Neo4j server-side routing operate independently and do not share awareness of server availability. + +To correctly remove a secondary server from the cluster and reintroduce it after the update: + +. Remove the server from the NLB to stop traffic routing. +. Shut down the server before proceeding with the AMI update. + + +Here are the steps: + +. Remove the secondary from the AWS NLB. + This prevents external clients from sending requests to the secondary. + +. Since Neo4j's cluster routing (server-side routing) does not use the NLB, you need to ensure that queries are not routed to the secondary server. +To do this, you have to cleanly shut down the secondary. + +.. Run the following query to check servers are hosting all their assigned databases. 
+The query should return no results: ++ +[source, cypher, role=noplay] +---- +SHOW SERVERS YIELD name, hosting, requestedHosting, serverId WHERE requestedHosting <> hosting +---- + +.. Use the following query to check all databases are in their expected state. +The query should return no results: ++ +[source, cypher, role=noplay] +---- +SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, statusMessage WHERE currentStatus <> requestedStatus RETURN name, address, currentStatus, requestedStatus, statusMessage +---- + +.. To stop the Neo4j service, run the following command: ++ +[source, shell, role=copy] +---- +sudo systemctl stop neo4j +---- ++ +To configure the timeout period for waiting on active transactions to either complete or be terminated during shutdown, you can modify the environment variable `NEO4J_SHUTDOWN_TIMEOUT` using `systemctl edit neo4j.service` +or the setting xref::configuration/configuration-settings.adoc#config_db.shutdown_transaction_end_timeout[`db.shutdown_transaction_end_timeout`] in _neo4j.conf_ file. ++ +By default, `NEO4J_SHUTDOWN_TIMEOUT` is set to 120 seconds and `db.shutdown_transaction_end_timeout` -- to 10 seconds. ++ +If the shutdown process exceeds these limits, it is considered failed. +You may need to increase the values if the system serves long-running transactions. + +.. Verify that the shutdown process has finished successfully by checking the _neo4j.log_ for relevant log messages confirming the shutdown. + + +. When everything is updated or fixed, start the secondaries one by one again. +.. Run `systemctl start neo4j`. +.. Once the server has been restarted, confirm it is running successfully. ++ +Run the following command and check the server has state `Enabled` and health `Available`. ++ +[source, cypher, role=noplay] +---- +SHOW SERVERS WHERE name = [server-id]; +---- + +.. Confirm that the server has started all the databases that it should. 
++ +This command shows any databases that are not in their expected state: ++ +[source, cypher, role=noplay] +---- +SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, serverID WHERE currentStatus <> requestedStatus AND serverID = [server-id] RETURN name, address, currentStatus, requestedStatus +---- + +. Reattach the secondary to the NLB. +Once the secondary server is stable and caught up, add it back to the AWS NLB target group. + + From 03675565e85fdabbab573459eefabb8464119290 Mon Sep 17 00:00:00 2001 From: Natalia Ivakina Date: Tue, 8 Apr 2025 12:26:00 +0200 Subject: [PATCH 2/4] Rewrite the steps --- .../neo4j-cluster-cloud.adoc | 50 ++++++++----------- 1 file changed, 20 insertions(+), 30 deletions(-) diff --git a/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc b/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc index ee04bb0d3..7c85de56c 100644 --- a/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc +++ b/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc @@ -2,7 +2,7 @@ :page-role: enterprise-edition [[neo4j-cluster-cloud-deployments]] -= Neo4j cluster on self-managed cloud deployments += Neo4j cluster in self-managed cloud deployments Before diving into the topic, it is important to understand basics about Neo4j's clustering. @@ -24,30 +24,19 @@ The Neo4j AWS Marketplace listings (and listings on GitHub) use CloudFormation t // === Neo4j cluster and auto-scaling groups on AWS -=== Removing a (secondary constrained) server from the cluster +=== Remove a server from the Neo4j cluster -Imagine you have a cluster consisting of three primary constrained servers and two secondary constrained servers. -This means that three servers host primary databases and the other two host secondary databases. +Rolling updates on Amazon Machine Images (AMIs) often involve their rotating. 
+However, simply removing Neo4j servers from the target Network Load Balancer (NLB) one by one does not prevent requests from being routed to them. +This occurs because the NLB and Neo4j server-side routing operate independently and do not share awareness of a server availability. -When performing rolling updates on Amazon Machine Images (AMIs) for secondary servers, it is important to follow a structured approach. -Rotating AMIs is a common practice in such environments. +To correctly remove a server from the cluster and reintroduce it after the update, follow the steps outlined below: -However, simply removing secondary servers from the target Network Load Balancer (NLB) one by one does not prevent read requests from being routed to them. -This occurs because the NLB and Neo4j server-side routing operate independently and do not share awareness of server availability. +. Remove the server from the AWS NLB. + This prevents external clients from sending requests to the server. -To correctly remove a secondary server from the cluster and reintroduce it after the update: - -. Remove the server from the NLB to stop traffic routing. -. Shut down the server before proceeding with the AMI update. - - -Here are the steps: - -. Remove the secondary from the AWS NLB. - This prevents external clients from sending requests to the secondary. - -. Since Neo4j's cluster routing (server-side routing) does not use the NLB, you need to ensure that queries are not routed to the secondary server. -To do this, you have to cleanly shut down the secondary. +. Since Neo4j's cluster routing (server-side routing) does not use the NLB, you need to ensure that queries are not routed to the server. +To do this, you have to cleanly shut down the server. .. Run the following query to check servers are hosting all their assigned databases. 
The query should return no results: @@ -72,18 +61,19 @@ SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, statusMessag sudo systemctl stop neo4j ---- + -To configure the timeout period for waiting on active transactions to either complete or be terminated during shutdown, you can modify the environment variable `NEO4J_SHUTDOWN_TIMEOUT` using `systemctl edit neo4j.service` -or the setting xref::configuration/configuration-settings.adoc#config_db.shutdown_transaction_end_timeout[`db.shutdown_transaction_end_timeout`] in _neo4j.conf_ file. -+ -By default, `NEO4J_SHUTDOWN_TIMEOUT` is set to 120 seconds and `db.shutdown_transaction_end_timeout` -- to 10 seconds. +To configure the timeout period for waiting on active transactions to either complete or be terminated before the shutdown, modify the setting xref::configuration/configuration-settings.adoc#config_db.shutdown_transaction_end_timeout[`db.shutdown_transaction_end_timeout`] in the _neo4j.conf_ file. +`db.shutdown_transaction_end_timeout` defaults to 10 seconds. + -If the shutdown process exceeds these limits, it is considered failed. -You may need to increase the values if the system serves long-running transactions. +The environment variable `NEO4J_SHUTDOWN_TIMEOUT` determines how long the system will wait for Neo4j to stop before forcefully terminating the process. +You can change this using `systemctl edit neo4j.service`. +By default, `NEO4J_SHUTDOWN_TIMEOUT` is set to 120 seconds. +If the shutdown process exceeds this limit, it is considered failed. +You may need to increase the value if the system serves long-running transactions. .. Verify that the shutdown process has finished successfully by checking the _neo4j.log_ for relevant log messages confirming the shutdown. -. When everything is updated or fixed, start the secondaries one by one again. +. When everything is updated or fixed, start the servers one by one again. .. Run `systemctl start neo4j`. .. 
Once the server has been restarted, confirm it is running successfully. + @@ -103,7 +93,7 @@ This command shows any databases that are not in their expected state: SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, serverID WHERE currentStatus <> requestedStatus AND serverID = [server-id] RETURN name, address, currentStatus, requestedStatus ---- -. Reattach the secondary to the NLB. -Once the secondary server is stable and caught up, add it back to the AWS NLB target group. +. Reattach the server to the NLB. +Once the server is stable and caught up, add it back to the AWS NLB target group. From 1856eb7ce9ad83c37ef150a15294187a41834a56 Mon Sep 17 00:00:00 2001 From: NataliaIvakina <82437520+NataliaIvakina@users.noreply.github.com> Date: Thu, 10 Apr 2025 15:53:28 +0200 Subject: [PATCH 3/4] Apply suggestions from code review --- modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc b/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc index 7c85de56c..ebf1f61a6 100644 --- a/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc +++ b/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc @@ -1,4 +1,4 @@ -:description: Tha page describes how to manage the Neo4j cluster on AWS. +:description: The page describes how to manage the Neo4j cluster on AWS. :page-role: enterprise-edition [[neo4j-cluster-cloud-deployments]] @@ -26,7 +26,7 @@ The Neo4j AWS Marketplace listings (and listings on GitHub) use CloudFormation t === Remove a server from the Neo4j cluster -Rolling updates on Amazon Machine Images (AMIs) often involve their rotating. +Rolling updates on Amazon Machine Images (AMIs) often involve rotating the images. However, simply removing Neo4j servers from the target Network Load Balancer (NLB) one by one does not prevent requests from being routed to them. 
This occurs because the NLB and Neo4j server-side routing operate independently and do not share awareness of a server availability. From ae6b13eb613face6dee860c927b9b1f0dffa5021 Mon Sep 17 00:00:00 2001 From: Natalia Ivakina Date: Mon, 14 Apr 2025 14:10:27 +0200 Subject: [PATCH 4/4] Move the content --- modules/ROOT/content-nav.adoc | 1 - .../pages/cloud-deployments/neo4j-aws.adoc | 78 ++++++++++++++- .../neo4j-cluster-cloud.adoc | 99 ------------------- 3 files changed, 77 insertions(+), 101 deletions(-) delete mode 100644 modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc diff --git a/modules/ROOT/content-nav.adoc b/modules/ROOT/content-nav.adoc index 897bac6d9..3e130954b 100644 --- a/modules/ROOT/content-nav.adoc +++ b/modules/ROOT/content-nav.adoc @@ -15,7 +15,6 @@ ** xref:cloud-deployments/neo4j-aws.adoc[] ** xref:cloud-deployments/neo4j-gcp.adoc[] ** xref:cloud-deployments/neo4j-azure.adoc[] -** xref:cloud-deployments/neo4j-cluster-cloud.adoc[] * xref:docker/index.adoc[] ** xref:docker/introduction.adoc[] diff --git a/modules/ROOT/pages/cloud-deployments/neo4j-aws.adoc b/modules/ROOT/pages/cloud-deployments/neo4j-aws.adoc index e2eaf7148..ddc14812b 100644 --- a/modules/ROOT/pages/cloud-deployments/neo4j-aws.adoc +++ b/modules/ROOT/pages/cloud-deployments/neo4j-aws.adoc @@ -126,12 +126,88 @@ After the installation finishes successfully, the CloudFormation template provid |=== -== Cluster version consistency +[role=label--enterprise-edition] +== Neo4j cluster on AWS + +=== Cluster version consistency When the CloudFormation template creates a new Neo4j cluster, an Auto Scaling group (ASG) is created and tagged with the monthly version of the installed Neo4j database. If you add more EC2 instances to your ASG, they will be installed with the same monthly version, ensuring that all Neo4j cluster servers are installed with the same version, regardless of when the EC2 instances were created. 
+=== Remove a server from the Neo4j cluster + +Rolling updates on Amazon Machine Images (AMIs) often involve rotating the images. +However, simply removing Neo4j servers from the target Network Load Balancer (NLB) one by one does not prevent requests from being routed to them. +This occurs because the NLB and Neo4j server-side routing operate independently and do not share awareness of server availability. + +To correctly remove a server from the cluster and reintroduce it after the update, follow the steps outlined below: + +. Remove the server from the AWS NLB. + This prevents external clients from sending requests to the server. + +. Because Neo4j's cluster routing (server-side routing) does not use the NLB, you also need to ensure that queries are not routed to the server. +To do this, cleanly shut down the server. + +.. Run the following query to check that servers are hosting all their assigned databases. +The query should return no results: ++ +[source, cypher, role=noplay] +---- +SHOW SERVERS YIELD name, hosting, requestedHosting, serverId WHERE requestedHosting <> hosting +---- + +.. Use the following query to check that all databases are in their expected state. +The query should return no results: ++ +[source, cypher, role=noplay] +---- +SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, statusMessage WHERE currentStatus <> requestedStatus RETURN name, address, currentStatus, requestedStatus, statusMessage +---- + +.. To stop the Neo4j service, run the following command: ++ +[source, shell, role=copy] +---- +sudo systemctl stop neo4j +---- ++ +To configure how long Neo4j waits for active transactions to either complete or be terminated before the shutdown, modify the setting xref::configuration/configuration-settings.adoc#config_db.shutdown_transaction_end_timeout[`db.shutdown_transaction_end_timeout`] in the _neo4j.conf_ file. +`db.shutdown_transaction_end_timeout` defaults to 10 seconds. 
++ +The environment variable `NEO4J_SHUTDOWN_TIMEOUT` determines how long the system waits for Neo4j to stop before forcefully terminating the process. +You can change this value using `systemctl edit neo4j.service`. +By default, `NEO4J_SHUTDOWN_TIMEOUT` is set to 120 seconds. +If the shutdown process exceeds this limit, it is considered failed. +You may need to increase the value if the system serves long-running transactions. + +.. Verify that the shutdown process has finished successfully by checking the _neo4j.log_ for relevant log messages confirming the shutdown. + + +. Once the update or fix is complete, start the servers again, one at a time. +.. Run `sudo systemctl start neo4j`. +.. Once the server has been restarted, confirm it is running successfully. ++ +Run the following query and check that the server has state `Enabled` and health `Available`. ++ +[source, cypher, role=noplay] +---- +SHOW SERVERS WHERE name = [server-id]; +---- + +.. Confirm that the server has started all the databases that it should. ++ +The following query returns any databases that are not in their expected state: ++ +[source, cypher, role=noplay] +---- +SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, serverID WHERE currentStatus <> requestedStatus AND serverID = [server-id] RETURN name, address, currentStatus, requestedStatus +---- + +. Reattach the server to the NLB. +Once the server is stable and caught up, add it back to the AWS NLB target group. + + [role=label--enterprise-edition] == Licensing diff --git a/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc b/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc deleted file mode 100644 index ebf1f61a6..000000000 --- a/modules/ROOT/pages/cloud-deployments/neo4j-cluster-cloud.adoc +++ /dev/null @@ -1,99 +0,0 @@ -:description: The page describes how to manage the Neo4j cluster on AWS. 
-:page-role: enterprise-edition - -[[neo4j-cluster-cloud-deployments]] -= Neo4j cluster in self-managed cloud deployments - -Before diving into the topic, it is important to understand basics about Neo4j's clustering. - -Neo4j cluster consists of a homogenous pool of servers that collectively run a number of databases. -The servers can operate in two different database-hosting modes: _primary_ and _secondary_. -A server can simultaneously act as a primary host for one or more databases and as a secondary host for other databases. - -For more details on operational and application aspects of Neo4j's clustering, refer to the xref::clustering/index.adoc[Clustering in Neo4j]. - -For information on how to manage databases and servers in a cluster, see respectively xref::clustering/databases.adoc[] and xref::clustering/servers.adoc[]. - - -== Neo4j cluster on AWS - -Neo4j does not provide Amazon Machine Images (AMIs) with a pre-installed version of the product. -The Neo4j AWS Marketplace listings (and listings on GitHub) use CloudFormation templates that deploy and configure Neo4j dynamically with a shell script. - - -// === Neo4j cluster and auto-scaling groups on AWS - - -=== Remove a server from the Neo4j cluster - -Rolling updates on Amazon Machine Images (AMIs) often involve rotating the images. -However, simply removing Neo4j servers from the target Network Load Balancer (NLB) one by one does not prevent requests from being routed to them. -This occurs because the NLB and Neo4j server-side routing operate independently and do not share awareness of a server availability. - -To correctly remove a server from the cluster and reintroduce it after the update, follow the steps outlined below: - -. Remove the server from the AWS NLB. - This prevents external clients from sending requests to the server. - -. Since Neo4j's cluster routing (server-side routing) does not use the NLB, you need to ensure that queries are not routed to the server. 
-To do this, you have to cleanly shut down the server. - -.. Run the following query to check servers are hosting all their assigned databases. -The query should return no results: -+ -[source, cypher, role=noplay] ----- -SHOW SERVERS YIELD name, hosting, requestedHosting, serverId WHERE requestedHosting <> hosting ----- - -.. Use the following query to check all databases are in their expected state. -The query should return no results: -+ -[source, cypher, role=noplay] ----- -SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, statusMessage WHERE currentStatus <> requestedStatus RETURN name, address, currentStatus, requestedStatus, statusMessage ----- - -.. To stop the Neo4j service, run the following command: -+ -[source, shell, role=copy] ----- -sudo systemctl stop neo4j ----- -+ -To configure the timeout period for waiting on active transactions to either complete or be terminated before the shutdown, modify the setting xref::configuration/configuration-settings.adoc#config_db.shutdown_transaction_end_timeout[`db.shutdown_transaction_end_timeout`] in the _neo4j.conf_ file. -`db.shutdown_transaction_end_timeout` defaults to 10 seconds. -+ -The environment variable `NEO4J_SHUTDOWN_TIMEOUT` determines how long the system will wait for Neo4j to stop before forcefully terminating the process. -You can change this using `systemctl edit neo4j.service`. -By default, `NEO4J_SHUTDOWN_TIMEOUT` is set to 120 seconds. -If the shutdown process exceeds this limit, it is considered failed. -You may need to increase the value if the system serves long-running transactions. - -.. Verify that the shutdown process has finished successfully by checking the _neo4j.log_ for relevant log messages confirming the shutdown. - - -. When everything is updated or fixed, start the servers one by one again. -.. Run `systemctl start neo4j`. -.. Once the server has been restarted, confirm it is running successfully. 
-+ -Run the following command and check the server has state `Enabled` and health `Available`. -+ -[source, cypher, role=noplay] ----- -SHOW SERVERS WHERE name = [server-id]; ----- - -.. Confirm that the server has started all the databases that it should. -+ -This command shows any databases that are not in their expected state: -+ -[source, cypher, role=noplay] ----- -SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, serverID WHERE currentStatus <> requestedStatus AND serverID = [server-id] RETURN name, address, currentStatus, requestedStatus ----- - -. Reattach the server to the NLB. -Once the server is stable and caught up, add it back to the AWS NLB target group. - -