Move the content on seed from URI to a new page #2203

Merged 4 commits on Apr 7, 2025

1 change: 1 addition & 0 deletions modules/ROOT/content-nav.adoc
@@ -108,6 +108,7 @@
** Standard databases
*** xref:database-administration/standard-databases/naming-databases.adoc[]
*** xref:database-administration/standard-databases/create-databases.adoc[]
+*** xref:database-administration/standard-databases/seed-from-uri.adoc[]
*** xref:database-administration/standard-databases/listing-databases.adoc[]
*** xref:database-administration/standard-databases/alter-databases.adoc[]
*** xref:database-administration/standard-databases/delete-databases.adoc[]
2 changes: 1 addition & 1 deletion modules/ROOT/pages/backup-restore/planning.adoc
@@ -92,7 +92,7 @@ See xref:clustering/monitoring/show-databases-monitoring.adoc#show-databases-mon

However, _restoring_ a database in a cluster is different since it is not known in advance how a database is going to be allocated to the servers in a cluster.
This method relies on the seed already existing on one of the servers.
-The recommended way to restore a database in a cluster is to xref:clustering/databases.adoc#cluster-seed-uri[seed from URI].
+The recommended way to restore a database in a cluster is to xref::database-administration/standard-databases/seed-from-uri.adoc[seed from URI].

[NOTE]
====
232 changes: 7 additions & 225 deletions modules/ROOT/pages/clustering/databases.adoc
@@ -299,7 +299,7 @@ See <<undefined-servers-backup, Undefined servers with fallback backup>> for mor

If you provide a URI to a backup or a dump, the stores on all allocations will be replaced by the backup or the dump at the given URI.
The new allocations can be put on any `ENABLED` server in the cluster.
-See <<cluster-seed-uri, Seed from URI>> for more details.
+See xref::database-administration/standard-databases/seed-from-uri.adoc[Seed from URI] for more details.


[source, shell]
@@ -371,9 +371,12 @@ CALL dbms.cluster.recreateDatabase("neo4j", {seedingServers: [], primaries: 3, s
[[cluster-seed]]
== Seed a cluster

-There are two different ways to seed a cluster with data.
-The first option is to use a _designated seeder_, where a designated server is used to create a backed-up database on other servers in the cluster.
-The other options is to seed the cluster from URI, where all servers to host a database are seeded with an identical seed from an external source specified by the URI.
+There are two different ways to seed a cluster with data:
+
+* The first option is to use a _designated seeder_, where a designated server is used to create a backed-up database on other servers in the cluster.
+* The other option is to seed the cluster from a URI, where all servers to host the database are seeded with an identical seed from an external source specified by that URI.
+For more details, see xref:database-administration/standard-databases/seed-from-uri.adoc[Create a database from a URI].
+
Keep in mind that using a designated seeder can be problematic in some situations as it is not known in advance how a database is going to be allocated to the servers in a cluster.
Also, this method relies on the seed already existing on one of the servers.

@@ -450,227 +453,6 @@ SHOW DATABASE foo;
9 rows available after 3 ms, consumed after another 1 ms
----

[[cluster-seed-uri]]
=== Seed from URI

This method seeds all servers with an identical seed from an external source, specified by the URI.
The seed can either be a full backup, a differential backup (see xref:clustering/databases.adoc#cloud-seed-provider[`CloudSeedProvider`]), or a dump from an existing database.
The sources of seeds are called _seed providers_.

The mechanism is pluggable, allowing new sources of seeds to be supported (see link:https://www.neo4j.com/docs/java-reference/current/extending-neo4j/project-setup/#extending-neo4j-plugin-seed-provider[Java Reference -> Implement custom seed providers] for more information).
The product has built-in support for seeding from a mounted file system (file), an FTP server, an HTTP/HTTPS server, Amazon S3, Google Cloud Storage, and Azure Cloud Storage.

[NOTE]
====
Amazon S3, Google Cloud Storage, and Azure Cloud Storage are supported by default, but the other providers require configuration of xref:configuration/configuration-settings.adoc#config_dbms.databases.seed_from_uri_providers[`dbms.databases.seed_from_uri_providers`].
====

The URI of the seed is specified when the `CREATE DATABASE` command is issued:

[source, cypher, role="noplay"]
----
CREATE DATABASE foo OPTIONS {existingData: 'use', seedURI:'s3://myBucket/myBackup.backup'}
----

The seed is downloaded and validated only when the new database is started.
If this fails, the database is not available and the `statusMessage` column of the `SHOW DATABASES` command shows `Unable to start database`.

[source, cypher, role="noplay"]
----
neo4j@neo4j> SHOW DATABASES;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| name | type | aliases | access | address | role | writer | requestedStatus | currentStatus | statusMessage | default | home | constituents |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "seed3" | "standard" | [] | "read-write" | "localhost:7682" | "unknown" | FALSE | "online" | "offline" | "Unable to start database `DatabaseId{3fe1a59b[seed3]}`" | FALSE | FALSE | [] |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
----

To determine the cause of the problem, it is recommended to look at the `debug.log`.
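
The check described above can be sketched in Python. This is a minimal illustration, assuming the rows of `SHOW DATABASES` have already been fetched as dictionaries keyed by the column names shown in the output above. The helper name `not_online` is hypothetical and not part of any Neo4j API, and a database that is still starting would also be reported, so treat the result as a list of candidates to look up in `debug.log`.

```python
def not_online(rows):
    """Return (name, statusMessage) for databases requested online but not online."""
    candidates = []
    for row in rows:
        if row["requestedStatus"] == "online" and row["currentStatus"] != "online":
            candidates.append((row["name"], row["statusMessage"]))
    return candidates


# Sample rows modelled on the SHOW DATABASES output above (only the relevant columns).
rows = [
    {
        "name": "seed3",
        "requestedStatus": "online",
        "currentStatus": "offline",
        "statusMessage": "Unable to start database `DatabaseId{3fe1a59b[seed3]}`",
    },
    {
        "name": "neo4j",
        "requestedStatus": "online",
        "currentStatus": "online",
        "statusMessage": "",
    },
]

for name, message in not_online(rows):
    print(f"{name}: {message}")
```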

[NOTE]
====
Starting from Neo4j 2025.01, seed from URI can also be used in combination with xref:database-administration/standard-databases/create-databases.adoc[`CREATE OR REPLACE DATABASE`].
====


[[file-seed-provider]]
==== FileSeedProvider

The `FileSeedProvider` supports:

** `file:`

[[url-connection-seed-provider]]
==== URLConnectionSeedProvider

The `URLConnectionSeedProvider` supports the following:

** `ftp:`
** `http:`
** `https:`

Starting from Neo4j 2025.01, the `URLConnectionSeedProvider` does not support `file`.
// This is true for both Cypher 5 and Cypher 25.

[[cloud-seed-provider]]
==== CloudSeedProvider

The `CloudSeedProvider` supports:

** `s3:`
** `gs:`
** `azb:`

The `CloudSeedProvider` supports using xref:backup-restore/modes.adoc#differential-backup[differential backup] files as seeds.
With the provided differential backup file, the `CloudSeedProvider` searches the directory containing differential backup files for a xref:backup-restore/online-backup.adoc#backup-chain[backup chain] ending at the specified differential backup, and then seeds using this backup chain.

[.tabbed-example]
=====
[role=include-with-AWS-S3]
======

include::partial$/aws-s3-overrides.adoc[]

include::partial$/aws-s3-credentials.adoc[]

. Create database from `myBackup.backup`.
+
[source,cypher, role="nocopy"]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup' }
----

======
[role=include-with-Google-cloud-storage]
======

include::partial$/gcs-credentials.adoc[]

. Create database from `myBackup.backup`.
+
[source,cypher]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 'gs://myBucket/myBackup.backup' }
----
======
[role=include-with-Azure-cloud-storage]
======

include::partial$/azb-credentials.adoc[]

. Create database from `myBackup.backup`.
+
[source,cypher]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 'azb://myStorageAccount/myContainer/myBackup.backup' }
----
======
=====

Starting from Neo4j 2025.01, the `CloudSeedProvider` supports seeding up to a specific date or transaction ID using the `seedRestoreUntil` option.

[role=label--new-2025.01]
Seed up to a specific date::

To seed up to a specific date, you need to pass the differential backup that contains the data up to that date.
+
[source,cypher]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedRestoreUntil: datetime("2019-06-01T18:40:32.142+0100") }
----
+
This will seed the database with transactions committed before the provided timestamp.

[role=label--new-2025.01]
Seed up to a specific transaction ID::

To seed up to a specific transaction ID, you need to pass the differential backup that contains the data up to that transaction ID.
+
[source,cypher]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedRestoreUntil: 123 }
----
+
This will seed the database with transactions up to, but not including, transaction 123.
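
The two cut-off rules above can be illustrated with a small Python sketch. It only models the selection semantics stated in the text (strictly before the timestamp, or up to but not including the transaction ID). The function name `transactions_to_seed` and the `(tx_id, commit_time)` tuple shape are assumptions made for illustration, not how Neo4j represents its transaction log.

```python
from datetime import datetime


def transactions_to_seed(transactions, until):
    """Select the transactions that fall before the cut-off.

    transactions: list of (tx_id, commit_time) tuples (hypothetical shape).
    until: an int transaction ID or a timezone-aware datetime.
    """
    if isinstance(until, datetime):
        # Date cut-off: keep transactions committed before the timestamp.
        return [tx for tx in transactions if tx[1] < until]
    # Transaction ID cut-off: keep IDs up to, but not including, `until`.
    return [tx for tx in transactions if tx[0] < until]


FMT = "%Y-%m-%dT%H:%M:%S.%f%z"
log = [
    (121, datetime.strptime("2019-06-01T18:40:30.000+0100", FMT)),
    (122, datetime.strptime("2019-06-01T18:40:31.000+0100", FMT)),
    (123, datetime.strptime("2019-06-01T18:40:33.000+0100", FMT)),
]

cutoff = datetime.strptime("2019-06-01T18:40:32.142+0100", FMT)
print([tx_id for tx_id, _ in transactions_to_seed(log, 123)])
print([tx_id for tx_id, _ in transactions_to_seed(log, cutoff)])
```

In this sample both calls select transactions 121 and 122, because transaction 123 is excluded by ID in the first call and was committed after the timestamp in the second.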

[role=label--deprecated]
[[s3-seed-provider]]
==== S3SeedProvider

// When Cypher 25 is released, we have to label this section 'Cypher 5' as this functionality is only available in Cypher 5.

The `S3SeedProvider` supports:

** `s3:` label:deprecated[Deprecated in 5.26]


[NOTE]
====
Neo4j comes bundled with necessary libraries for AWS S3 connectivity.
Therefore, if you use `S3SeedProvider`, `aws cli` is not required, but it can be used with the `CloudSeedProvider`.
====

The `S3SeedProvider` requires additional configuration.
This is specified with the `seedConfig` option.
This option expects a comma-separated list of configurations.
Each configuration value is specified as a name followed by `=` and the value, as follows:

[source, cypher, role="noplay"]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedConfig: 'region=eu-west-1' }
----
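
The `seedConfig` format described above (a comma-separated list of `name=value` entries) can be sketched as a small Python parser. This is an illustration of the format only, not Neo4j's own parsing code: the function name `parse_seed_config` is hypothetical, and trimming surrounding whitespace and rejecting entries without `=` are assumptions.

```python
def parse_seed_config(seed_config):
    """Parse a 'name=value,name=value' string into a dict."""
    config = {}
    for item in seed_config.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {item!r}")
        config[name.strip()] = value.strip()
    return config


# 'region' comes from the example above; 'exampleKey' is a made-up second entry.
print(parse_seed_config("region=eu-west-1"))
print(parse_seed_config("region=eu-west-1,exampleKey=exampleValue"))
```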

`S3SeedProvider` also requires passing in credentials.
These are specified with the `seedCredentials` option.
Seed credentials are securely passed from the Cypher command to each server hosting the database.
For this to work, Neo4j on each server in the cluster must be configured with identical keystores.
This is identical to the configuration required by remote aliases; see xref:database-administration/aliases/remote-database-alias-configuration.adoc#remote-alias-config-DBMS_admin-A[Configuration of DBMS with remote database alias].
If this configuration is not performed, the `seedCredentials` option fails.

[source, cypher, role="noplay"]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedConfig: 'region=eu-west-1', seedCredentials: [accessKey];[secretKey] }
----
Where `accessKey` and `secretKey` are provided by AWS.

==== Seed provider reference

[cols="1,2,2",options="header"]
|===
| URL scheme
| Seed provider
| URI example

| `file:`
| `FileSeedProvider`
| `file://tmp/backup1.backup`

| `ftp:`
| `URLConnectionSeedProvider`
| `ftp://myftp.com/backups/backup1.backup`

| `http:`
| `URLConnectionSeedProvider`
| `\http://myhttp.com/backups/backup1.backup`

| `https:`
| `URLConnectionSeedProvider`
| `\https://myhttp.com/backups/backup1.backup`

| `s3:`
| `S3SeedProvider` label:deprecated[Deprecated in 5.26], +
`CloudSeedProvider`
| `s3://mybucket/backups/backup1.backup`

| `gs:`
| `CloudSeedProvider`
| `gs://mybucket/backups/backup1.backup`

| `azb:`
| `CloudSeedProvider`
| `azb://mystorageaccount.blob/backupscontainer/backup1.backup`
|===
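
The table above, together with the rule for `dbms.databases.seed_from_uri_providers` that the first matching provider in the list is used, can be sketched in Python as follows. This is a simplified model for illustration, not the server's implementation: the function name `resolve_provider` is hypothetical, and the scheme sets are copied from the table.

```python
from urllib.parse import urlparse

# URL schemes per seed provider, as listed in the reference table.
SUPPORTED_SCHEMES = {
    "FileSeedProvider": {"file"},
    "URLConnectionSeedProvider": {"ftp", "http", "https"},
    "CloudSeedProvider": {"s3", "gs", "azb"},
    "S3SeedProvider": {"s3"},
}


def resolve_provider(seed_uri, enabled_providers):
    """Return the first enabled provider that supports the URI scheme, or None."""
    scheme = urlparse(seed_uri).scheme
    for provider in enabled_providers:
        if scheme in SUPPORTED_SCHEMES.get(provider, set()):
            return provider
    return None  # no enabled provider handles this scheme


uri = "s3://mybucket/backups/backup1.backup"
print(resolve_provider(uri, ["S3SeedProvider", "CloudSeedProvider"]))
print(resolve_provider(uri, ["CloudSeedProvider", "S3SeedProvider"]))
print(resolve_provider(uri, []))
```

Because both `S3SeedProvider` and `CloudSeedProvider` handle `s3:`, the order of the configured list decides which one is chosen, and an empty list means no URI can be resolved.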

[[cluster-allow-deny-db]]
== Controlling locations with allowed/denied databases

@@ -2143,7 +2143,7 @@ The following values are available: `CloudSeedProvider`, `FileSeedProvider`, `S3
This list specifies enabled seed providers.
If a seed source (URI scheme) is supported by multiple providers in the list, the first matching provider will be used.
If the list is set to empty, the seed from URI functionality is effectively disabled.
-See xref:/clustering/databases.adoc#cluster-seed-uri[Seed from URI] for more information.
+See xref::database-administration/standard-databases/seed-from-uri.adoc[Seed from a URI] for more information.
|Valid values
a|A comma-separated list where each element is a string.
|Default value
@@ -91,18 +91,18 @@ Replaced by `existingDataSeedServer`.
| URI to a backup or a dump from an existing database.
|
Defines an identical seed from an external source which will be used to seed all servers.
For more information, see xref::database-administration/standard-databases/seed-from-uri.adoc[Seed from a URI].

| `seedConfig`
| Comma-separated list of configuration values.
|
For more information see xref::clustering/databases.adoc#cluster-seed-uri[Seed from URI].

| `seedCredentials` label:deprecated[Deprecated in 5.26]
| credentials
|
Defines credentials that need to be passed into certain seed providers.
It is recommended to use the `CloudSeedProvider` seed provider, which does not require this configuration when seeding from cloud storage.
-For more information see xref::clustering/databases.adoc#cloud-seed-provider[CloudSeedProvider].
+For more information see xref::database-administration/standard-databases/seed-from-uri.adoc#cloud-seed-provider[CloudSeedProvider].

| `txLogEnrichment`
| `FULL` \| `DIFF` \| `OFF`
@@ -86,7 +86,7 @@ DROP DATABASE movies DUMP DATA
----

In Neo4j, dumps can be stored in the directory specified by the xref:configuration/configuration-settings.adoc#config_server.directories.dumps.root[`server.directories.dumps.root`] setting (by default, the path for storing dumps is xref:configuration/file-locations.adoc#data[`<neo4j-home>/data/dumps`]).
-You can use dumps to create databases through the xref:clustering/databases.adoc#cluster-seed-uri[Seed from URI approach].
+You can use dumps to create databases using the xref::database-administration/standard-databases/seed-from-uri.adoc[seed from a URI] approach.

The option `DESTROY DATA` explicitly requests the default behavior of the command.
