diff --git a/docs/reference/reference-architectures/elastic-cloud-architecture.asciidoc b/docs/reference/reference-architectures/elastic-cloud-architecture.asciidoc index 8694bcf8b117..b6a974f485f0 100644 --- a/docs/reference/reference-architectures/elastic-cloud-architecture.asciidoc +++ b/docs/reference/reference-architectures/elastic-cloud-architecture.asciidoc @@ -1,12 +1,9 @@ [[elastic-cloud-architecture]] -== Elasticsearch Service - High Availability - Single Region -++++ -Cloud: High Availability - Single Region -++++ +== Hot / Frozen - High Availability The Hot-Frozen Elasticsearch cluster architecture is cost optimized for large time-series datasets while keeping all of the data **fully searchable**. There is no need to "re-hydrate" archived data. In this architecture, the hot tier is primarily used for indexing and immediate searching (1-3 days) with a majority of the search being handled by the frozen tier. Since the data is moved to searchable snapshots in an object store, the cost of keeping all of the data searchable is dramatically reduced. -TIP: This architecture includes all the essential components of the Elastic Stack. It's designed to ensure your deployment has a stable foundation, based on expert recommendations, but is not intended for sizing workloads. +This architecture is ideal for observability use cases. It includes all the necessary components of the Elastic Stack and is not intended for sizing workloads; rather, it is a basis for ensuring that the architecture you deploy is foundationally ready to handle any desired workload with resiliency. This architecture shows a high-level representation of data flow. For more details on data flow, see our https://www.elastic.co/guide/en/ingest/current/use-case-arch.html[Ingest Architectures]. The most important foundational step to any architecture is designing your deployment to be responsive to production workloads. 
For more information on planning for production, see https://www.elastic.co/guide/en/elasticsearch/reference/current/scalability.html[Get ready for production]. @@ -16,9 +13,12 @@ The most important foundational step to any architecture is designing your deplo This architecture is intended for organizations that need to do the following: -* Monitor the performance and health of their applications in real time, including the creation and tracking of SLOs (Service Level Objectives). -* Provide insights and alerts to ensure optimal performance and quick issue resolution for applications. -* Apply machine learning and artificial intelligence to assist engineers and application teams in dealing with terabytes of new data per day. +* Monitor the performance and health of their applications in real time, including the creation and tracking of SLOs (Service Level Objectives). +* Provide insights and alerts using logs, metrics, traces, or events to ensure optimal performance and quick issue resolution for applications. +* Apply machine learning and artificial intelligence to assist SREs and application teams in dealing with the large amount of data in this type of use case. +* Ensure resilience to hardware failures, and maintain availability during operational maintenance by defining zones or pods to enable smooth failure handling. +* Deploy the most cost-effective architecture model that allows for maximum flexibility between storage cost and performance. 
+ [discrete] @@ -27,124 +27,75 @@ This architecture is intended for organizations that need to do the following: image::images/elastic-cloud-architecture.png["An Elastic Cloud Architecture"] -[discrete] -[[cloud-hot-frozen-configuration]] -=== Example configuration - -The following is a sample configuration with the following specifications: - -* An ingest rate of 1TB/day -* 1 day in the hot tier -* 89 days in the frozen tier -* A total of 90 days of searchable data +TIP: The architecture above uses the concept of availability zones (AZs). When running in your own data center (DC), you can equate AZs to racks or even separate physical machines. -[discrete] -[[cloud-hot-frozen-aws]] -==== AWS +The diagram illustrates an Elasticsearch cluster deployed in Elastic Cloud across 3 availability zones. For production, we recommend a minimum of 2 availability zones, and 3 availability zones for mission-critical applications. See https://www.elastic.co/guide/en/cloud/current/ec-planning.html[Plan for Production] for more details. Note that even if the cluster is deployed across only two availability zones, a third master node is still required for quorum voting and will be created automatically in the third availability zone. 
-* Hot tier: 120G RAM (2 60G RAM node x 3 pods x 2 availability zones) -* Frozen tier: 120G RAM (1 60G RAM node x 3 pods x 2 availability zones) -* Machine learning: 128G RAM (1 64G node x 3 pods x 2 availability zones) -* Master nodes: 24G RAM (8G node x 3 pods x 2 availability zones) -* Kibana: 16G RAM (16G node x 3 pods x 2 availability zones) - -[discrete] -[[cloud-hot-frozen-azure]] -==== Azure - -* Hot tier: 120G RAM (2 60G RAM node x 3 pods x 2 availability zones) -* Frozen tier: 120G RAM (1 60G RAM node x 3 pods x 2 availability zones) -* Machine learning: 128G RAM (1 64G node x 3 pods x 2 availability zones) -* Master nodes: 24G RAM (8G node x 3 pods x 2 availability zones) -* Kibana: 16G RAM (16G node x 3 pods x 2 availability zones) - -[discrete] -[[cloud-hot-frozen-gcp]] -==== GCP +The number of data nodes shown for each tier (hot and frozen) is illustrative and would be scaled up depending on ingest volume and retention period (see the example below). Hot nodes contain both primary and replica shards. By default, primary and replica shards are always guaranteed to be in different availability zones. Frozen nodes rely on a large high-speed cache and retrieve data from the Snapshot Store as needed. -* Hot tier: 120G RAM (2 60G RAM node x 3 pods x 2 availability zones) -* Frozen tier: 120G RAM (1 60G RAM node x 3 pods x 2 availability zones) -* Machine learning: 128G RAM (1 64G node x 3 pods x 2 availability zones) -* Master nodes: 24G RAM (8G node x 3 pods x 2 availability zones) -* Kibana: 16G RAM (16G node x 3 pods x 2 availability zones) +Machine learning nodes are optional but highly recommended for large-scale time-series use cases, since the amount of data quickly becomes too difficult to analyze without applying techniques such as machine learning-based anomaly detection. 
[discrete] -[[cloud-hot-frozen-recommended-instance-types]] -==== Recommended instance types per cloud provider +[[cloud-hot-frozen-configuration]] +=== Recommended Hardware Specifications -The following table details our recommended node types for this architecture, based on the hardware configurations described previously. +Elastic Cloud allows you to deploy clusters in AWS, Azure and Google Cloud. Available hardware types and configurations vary across all three cloud providers but each provides instance types that meet our recommendations for the node types used in this architecture: For more details on these instance types, see our documentation on Elastic Cloud hardware for https://www.elastic.co/guide/en/cloud/current/ec-default-aws-configurations.html[AWS], https://www.elastic.co/guide/en/cloud/current/ec-default-azure-configurations.html[Azure], and https://www.elastic.co/guide/en/cloud/current/ec-default-gcp-configurations.html[GCP]. -[cols="10, 30, 30, 30"] +[cols="10, 10, 10, 10, 10"] |=== -| *Type* | *AWS Instance/Type* | *Azure Instance/Type* | *GCP Instance/Type* -|image:images/hot.png["An Elastic Cloud Architecture"] | aws.es.datahot.c6gd -c6gd |azure.es.datahot.fsv2 -f32sv2|gcp.es.datahot.n2.68x32x45 - +| **Type** | **AWS** | **Azure** | **GCP** | **Physical** +|image:images/hot.png["An Elastic Cloud Architecture"] | +c6gd | +f32sv2| -N2 -|image:images/frozen.png["An Elastic Cloud Architecture"] -| aws.es.datafrozen.i3en +N2| +32 vCPU + +64 GB RAM + +2-5 NVMe SSD +|image:images/frozen.png["An Elastic Cloud Architecture"] +| i3en | -azure.es.datafrozen.edsv4 - - e8dsv4 | -gcp.es.datafrozen.n2.68x10x95 - - -N2 +N2| +8 vCPU + +64 GB RAM + +2-5 NVMe SSD |image:images/machine-learning.png["An Elastic Cloud Architecture"] -| aws.es.ml.m6gd - - +| m6gd | -azure.es.ml.fsv2 - - f32sv2 | -gcp.es.ml.n2.68x32x45 - - -N2 +N2| +32 vCPU + +32 GB RAM + +2-5 NVMe SSD |image:images/master.png["An Elastic Cloud Architecture"] -| aws.es.master.c6gd - - +| c6gd | 
-azure.es.master.fsv2 - - f32sv2 | -gcp.es.master.n2.68x32x45 - - -N2 +N2| +8 vCPU + +64 GB RAM + +2-5 NVMe SSD |image:images/kibana.png["An Elastic Cloud Architecture"] -| aws.kibana.c6gd - - +| c6gd | -azure.kibana.fsv2 - - f32sv2 | -gcp.kibana.n2.68x32x45 - - N2| +8 vCPU + +64 GB RAM + +2-5 NVMe SSD |=== [discrete] @@ -160,10 +111,14 @@ The following are important considerations for this architecture: * This architecture uses a Hot/Frozen architecture. If you require https://www.elastic.co/guide/en/security/current/about-rules.html[detection rule lookback] or complex dashboards you may need to leverage a https://www.elastic.co/guide/en/elasticsearch/reference/current/data-tiers.html#cold-tier[cold tier]. +* During maintenance windows, only a single copy exists of some data, namely the most recently written data that is not yet part of a snapshot. This could be addressed by adding data nodes to pod 3 and setting the sharding strategy to 1 primary and 2 replicas. + +* Maintenance should be performed one availability zone at a time. + [discrete] [[cloud-architecture-limitations]] === Limitations of this architecture -* This architecture is not intended for Disaster Recovery, because it is deployed across Availability Zones in a single cloud region. To make this architecture disaster proof, add a second deployment in another cloud region. Learn more at, https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-ccr.html#ccr-disaster-recovery[disaster recovery]. +* This architecture is not intended as a https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-ccr.html#ccr-disaster-recovery[disaster recovery] architecture, since it is deployed across Availability Zones in a single cloud region. 
[discrete] [[cloud-hot-frozen-resources]] diff --git a/docs/reference/reference-architectures/images/elastic-cloud-architecture.png b/docs/reference/reference-architectures/images/elastic-cloud-architecture.png index 505220e2cc56..556b06d77714 100644 Binary files a/docs/reference/reference-architectures/images/elastic-cloud-architecture.png and b/docs/reference/reference-architectures/images/elastic-cloud-architecture.png differ diff --git a/docs/reference/reference-architectures/images/single-datacenter.png b/docs/reference/reference-architectures/images/single-datacenter.png deleted file mode 100644 index 15210d6c4382..000000000000 Binary files a/docs/reference/reference-architectures/images/single-datacenter.png and /dev/null differ diff --git a/docs/reference/reference-architectures/images/three-availability-zone.png b/docs/reference/reference-architectures/images/three-availability-zone.png deleted file mode 100644 index f9ea160a9403..000000000000 Binary files a/docs/reference/reference-architectures/images/three-availability-zone.png and /dev/null differ diff --git a/docs/reference/reference-architectures/index.asciidoc b/docs/reference/reference-architectures/index.asciidoc index 82907f758ebb..86b8dce64a65 100644 --- a/docs/reference/reference-architectures/index.asciidoc +++ b/docs/reference/reference-architectures/index.asciidoc @@ -29,15 +29,6 @@ a| * You need long retention periods with the ability to search indices in an object store cost-effectively. * Use cloud provider's highly available object stores for data integrity so you don't have to depend on your own. -| <> - -This architecture is derived from the Elasticsearch Service - High Availability - Single Region architecture. It defines additional considerations required when self-deploying. It uses multi-availability zone architecture and is optimized for time-series. 
- -a| -* When you need an architecture that is resilient to unplanned outages -| |=== include::elastic-cloud-architecture.asciidoc[] - -include::three-availability-zones.asciidoc[] diff --git a/docs/reference/reference-architectures/three-availability-zones.asciidoc b/docs/reference/reference-architectures/three-availability-zones.asciidoc deleted file mode 100644 index c2561c7baaf4..000000000000 --- a/docs/reference/reference-architectures/three-availability-zones.asciidoc +++ /dev/null @@ -1,68 +0,0 @@ -[[three-availability-zones]] -== Self Managed - High Availability - Single Region -++++ -Self Managed - High Availability - Single Region -++++ - -This article outlines a scalable and highly available architecture for Elasticsearch using three availability zones. - -TIP: This architecture includes all the essential components of the Elastic Stack. It's designed to ensure your deployment has a stable foundation, based on expert recommendations, but is not intended for sizing workloads. - -The most important foundational step to any architecture is designing your deployment to be responsive to production workloads. For more information on planning for production, see https://www.elastic.co/guide/en/elasticsearch/reference/current/scalability.html[Get ready for production]. 
- -[discrete] -[[three-availability-zones-use-case]] -=== Use case - -This architecture is intended for organizations that need to do the following: - -* Be resilient to unplanned outages -* Ensure availability during operational maintenance of any given availability zone - -[discrete] -[[three-availability-zones-architecture]] -=== Architecture - -image::images/three-availability-zone.png["A three-availability-zones time-series architecture"] - -[discrete] -[[three-availability-zones-configuration]] -=== Example configuration - -The following is a sample configuration with the following specifications: - -* An ingest rate of 1TB/day -* 1 day in the hot tier -* 89 days in the frozen tier -* A total of 90 days of searchable data - -* Hot tier: 120G RAM (2 60G RAM node x 3 availability zones) -* Frozen tier: 120G RAM (1 60G RAM node x 3 availability zones) -* Machine learning: 128G RAM (1 64G node x 3 availability zones) -* Master nodes: 24G RAM (8G node x 3 availability zones) -* Kibana: 16G RAM (16G node x 3 availability zone) - -[discrete] -[[three-availability-zones-considerations]] -=== Important considerations - -The following are important considerations for this architecture: - -* You may require more than one copy of the most recently written data to be available. To achieve this, add data nodes to pod 3 and set the https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html#create-a-sharding-strategy[sharding strategy] to 1 primary and 2 replicas. -* Maintenance should be performed one pod at a time. -* A yellow cluster state is acceptable during maintenance. This is due to the replica shards being unassigned. - -[discrete] -[[three-zone-limitations]] -=== Limitations of this architecture -* No region resilience -* During maintenance windows, only a single copy of the latest data not yet captured in a snapshot is available. -* This design assumes the data is written once and not updated. 
- -[discrete] -[[three-availability-zones-resources]] - -=== Resources and references - - -
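
Reviewer sketch (not part of the patch): the maintenance consideration in `elastic-cloud-architecture.asciidoc` suggests a sharding strategy of 1 primary and 2 replicas so that more than one copy of recent data stays available while one zone is offline. The snippet below is a minimal illustration of that arithmetic, assuming each shard copy is placed in its own availability zone. `index.number_of_shards` and `index.number_of_replicas` are standard Elasticsearch index settings; the helper names `sharding_settings` and `copies_during_maintenance` are hypothetical and exist only for this example.

```python
import json


def sharding_settings(primaries: int = 1, replicas: int = 2) -> dict:
    """Build an index settings body (hypothetical helper, for illustration only)."""
    if primaries < 1 or replicas < 0:
        raise ValueError("need at least one primary and a non-negative replica count")
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,
                "number_of_replicas": replicas,
            }
        }
    }


def copies_during_maintenance(replicas: int, zones_offline: int = 1) -> int:
    """Shard copies still available while zones are offline.

    Assumption: every copy (primary or replica) lives in a different
    availability zone, so each offline zone removes at most one copy.
    """
    return max(replicas + 1 - zones_offline, 0)


if __name__ == "__main__":
    # Settings body for the "1 primary and 2 replicas" strategy.
    print(json.dumps(sharding_settings(), indent=2))
    # Copies left with 1 replica versus 2 replicas while one zone is offline.
    print(copies_during_maintenance(replicas=1))
    print(copies_during_maintenance(replicas=2))
```

Under this assumption, one replica leaves a single copy during maintenance of one zone, which matches the consideration in the patch, while two replicas leave two copies.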