updates
georgewallace committed Nov 5, 2024
1 parent 93b38e2 commit fe34296
Showing 6 changed files with 49 additions and 171 deletions.
@@ -1,12 +1,9 @@
[[elastic-cloud-architecture]]
== Elasticsearch Service - High Availability - Single Region
++++
<titleabbrev>Cloud: High Availability - Single Region</titleabbrev>
++++
== Hot / Frozen - High Availability

The Hot-Frozen Elasticsearch cluster architecture is cost-optimized for large time-series datasets while keeping all of the data **fully searchable**. There is no need to "re-hydrate" archived data. In this architecture, the hot tier is used primarily for indexing and for searching the most recent data (1-3 days), while the frozen tier handles the majority of searches. Because the data is moved to searchable snapshots in an object store, the cost of keeping all of the data searchable is dramatically reduced.

TIP: This architecture includes all the essential components of the Elastic Stack. It's designed to ensure your deployment has a stable foundation, based on expert recommendations, but is not intended for sizing workloads.
This architecture is ideal for observability use cases. It includes all the necessary components of the Elastic Stack. It is not intended for sizing workloads; rather, it serves as a basis for ensuring that the architecture you deploy is foundationally ready to handle any desired workload with resiliency. This architecture shows a very high-level representation of data flow. For more details, see our https://www.elastic.co/guide/en/ingest/current/use-case-arch.html[Ingest Architectures].

The most important foundational step to any architecture is designing your deployment to be responsive to production workloads. For more information on planning for production, see https://www.elastic.co/guide/en/elasticsearch/reference/current/scalability.html[Get ready for production].

@@ -16,9 +13,12 @@ The most important foundational step to any architecture is designing your deplo

This architecture is intended for organizations that need to do the following:

* Monitor the performance and health of their applications in real time, including the creation and tracking of SLOs (Service Level Objectives).
* Provide insights and alerts to ensure optimal performance and quick issue resolution for applications.
* Apply machine learning and artificial intelligence to assist engineers and application teams in dealing with terabytes of new data per day.
* Monitor the performance and health of their applications in real time, including the creation and tracking of SLOs (Service Level Objectives).
* Provide insights and alerts using logs, metrics, traces, or events to ensure optimal performance and quick issue resolution for applications.
* Apply machine learning and artificial intelligence to help SREs and application teams handle the large volume of data typical of this use case.
* Ensure resilience to hardware failures, and maintain availability during operational maintenance, by defining zones or pods to enable smooth failure handling.
* Deploy the most cost-effective architecture model, one that allows maximum flexibility in trading storage cost against performance.



[discrete]
@@ -27,124 +27,75 @@ This architecture is intended for organizations that need to do the following:

image::images/elastic-cloud-architecture.png["An Elastic Cloud Architecture"]

[discrete]
[[cloud-hot-frozen-configuration]]
=== Example configuration

The following is a sample configuration with the following specifications:

* An ingest rate of 1TB/day
* 1 day in the hot tier
* 89 days in the frozen tier
* A total of 90 days of searchable data
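
The sample figures above can be turned into rough per-tier data volumes. The following is a minimal sketch, not a sizing tool: it assumes one replica per primary in the hot tier, a single copy of frozen data in the snapshot store, and an on-disk size equal to the raw ingest size (no compression or indexing overhead). The `RetentionPlan` class and its field names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionPlan:
    """Hypothetical helper holding the sample configuration's figures."""

    ingest_tb_per_day: float  # raw ingest rate
    hot_days: int             # days data stays in the hot tier
    frozen_days: int          # days data stays in the frozen tier
    hot_replicas: int = 1     # assumption: one replica per primary in hot

    @property
    def searchable_days(self) -> int:
        # Total retention is the sum of the time spent in each tier.
        return self.hot_days + self.frozen_days

    @property
    def hot_tb(self) -> float:
        # Hot nodes hold primaries plus replicas.
        return self.ingest_tb_per_day * self.hot_days * (1 + self.hot_replicas)

    @property
    def frozen_tb(self) -> float:
        # Assumption: frozen data is stored once, in the snapshot store.
        return self.ingest_tb_per_day * self.frozen_days


plan = RetentionPlan(ingest_tb_per_day=1.0, hot_days=1, frozen_days=89)
print(plan.searchable_days, plan.hot_tb, plan.frozen_tb)
```

Under these assumptions the hot tier holds about 2 TB and the snapshot store about 89 TB, for 90 days of searchable data. Real deployments will differ once compression, indexing overhead, and headroom are taken into account.
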
TIP: The architecture above uses the concept of availability zones (AZs). When running in your own data center (DC), you can equate AZs to racks or even to separate physical machines.

[discrete]
[[cloud-hot-frozen-aws]]
==== AWS
The diagram illustrates an Elasticsearch cluster deployed in Elastic Cloud across 3 availability zones. For production, we recommend a minimum of 2 availability zones, and 3 availability zones for mission-critical applications. See https://www.elastic.co/guide/en/cloud/current/ec-planning.html[Plan for Production] for more details. Note that even if the cluster is deployed across only two availability zones, a third master node is still required for quorum voting and will be created automatically in the third availability zone.
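
The need for a third master node follows from simple majority arithmetic. The sketch below uses a simplified strict-majority model and hypothetical function names; Elasticsearch manages its voting configuration automatically, so treat this as an illustration of the idea rather than a description of the actual implementation.

```python
def quorum_size(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """Number of master-eligible nodes that can be lost while a majority remains."""
    return master_eligible - quorum_size(master_eligible)


for nodes in (2, 3):
    print(nodes, quorum_size(nodes), tolerated_failures(nodes))
```

With two master-eligible nodes, a majority needs both of them, so losing either one blocks voting. With three, a majority is two, so the cluster can lose one node (or the zone hosting it) and still elect a master.
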

* Hot tier: 120G RAM (2 60G RAM node x 3 pods x 2 availability zones)
* Frozen tier: 120G RAM (1 60G RAM node x 3 pods x 2 availability zones)
* Machine learning: 128G RAM (1 64G node x 3 pods x 2 availability zones)
* Master nodes: 24G RAM (8G node x 3 pods x 2 availability zones)
* Kibana: 16G RAM (16G node x 3 pods x 2 availability zones)

[discrete]
[[cloud-hot-frozen-azure]]
==== Azure

* Hot tier: 120G RAM (2 60G RAM node x 3 pods x 2 availability zones)
* Frozen tier: 120G RAM (1 60G RAM node x 3 pods x 2 availability zones)
* Machine learning: 128G RAM (1 64G node x 3 pods x 2 availability zones)
* Master nodes: 24G RAM (8G node x 3 pods x 2 availability zones)
* Kibana: 16G RAM (16G node x 3 pods x 2 availability zones)

[discrete]
[[cloud-hot-frozen-gcp]]
==== GCP
The number of data nodes shown for each tier (hot and frozen) is illustrative and would be scaled up depending on ingest volume and retention period (see the example below). Hot nodes contain both primary and replica shards. By default, primary and replica shards are always guaranteed to be in different availability zones. Frozen nodes rely on a large high-speed cache and retrieve data from the Snapshot Store as needed.
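
The hot-to-frozen data movement described here is typically expressed as an index lifecycle management (ILM) policy. The following is a hedged sketch that only builds a request body for `PUT _ilm/policy/<name>`. The 1-day and 90-day values are illustrative, the helper function is hypothetical, and the repository name `found-snapshots` is an assumption that should be replaced with the repository configured for your deployment.

```python
import json

# Assumption: name of the snapshot repository backing the frozen tier.
SNAPSHOT_REPOSITORY = "found-snapshots"


def hot_frozen_policy(hot_days: int, total_days: int, repository: str) -> dict:
    """Build an ILM policy body with hot, frozen, and delete phases."""
    return {
        "policy": {
            "phases": {
                # Roll over to a new index daily (illustrative value).
                "hot": {"actions": {"rollover": {"max_age": "1d"}}},
                # Convert the index to a searchable snapshot in the frozen tier.
                "frozen": {
                    "min_age": f"{hot_days}d",
                    "actions": {
                        "searchable_snapshot": {"snapshot_repository": repository}
                    },
                },
                # Delete the index once the total retention period has passed.
                "delete": {"min_age": f"{total_days}d", "actions": {"delete": {}}},
            }
        }
    }


policy = hot_frozen_policy(hot_days=1, total_days=90, repository=SNAPSHOT_REPOSITORY)
print(json.dumps(policy, indent=2))
```

ILM measures `min_age` from the time of rollover, so the actual time an index spends in each tier will differ somewhat from the nominal figures.
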

* Hot tier: 120G RAM (2 60G RAM node x 3 pods x 2 availability zones)
* Frozen tier: 120G RAM (1 60G RAM node x 3 pods x 2 availability zones)
* Machine learning: 128G RAM (1 64G node x 3 pods x 2 availability zones)
* Master nodes: 24G RAM (8G node x 3 pods x 2 availability zones)
* Kibana: 16G RAM (16G node x 3 pods x 2 availability zones)
Machine learning nodes are optional but highly recommended for large-scale time-series use cases, because the volume of data quickly becomes too large to analyze without techniques such as machine-learning-based anomaly detection.

[discrete]
[[cloud-hot-frozen-recommended-instance-types]]
==== Recommended instance types per cloud provider
[[cloud-hot-frozen-configuration]]
=== Recommended Hardware Specifications

The following table details our recommended node types for this architecture, based on the hardware configurations described previously.
Elastic Cloud allows you to deploy clusters in AWS, Azure, and Google Cloud. Available hardware types and configurations vary across the three cloud providers, but each provides instance types that meet our recommendations for the node types used in this architecture:

For more details on these instance types, see our documentation on Elastic Cloud hardware for https://www.elastic.co/guide/en/cloud/current/ec-default-aws-configurations.html[AWS], https://www.elastic.co/guide/en/cloud/current/ec-default-azure-configurations.html[Azure], and https://www.elastic.co/guide/en/cloud/current/ec-default-gcp-configurations.html[GCP].

[cols="10, 30, 30, 30"]
[cols="10, 10, 10, 10, 10"]
|===
| *Type* | *AWS Instance/Type* | *Azure Instance/Type* | *GCP Instance/Type*
|image:images/hot.png["An Elastic Cloud Architecture"] | aws.es.datahot.c6gd
c6gd |azure.es.datahot.fsv2
f32sv2|gcp.es.datahot.n2.68x32x45

| **Type** | **AWS** | **Azure** | **GCP** | **Physical**
|image:images/hot.png["An Elastic Cloud Architecture"] |
c6gd |
f32sv2|

N2
|image:images/frozen.png["An Elastic Cloud Architecture"]
| aws.es.datafrozen.i3en

N2|
32 vCPU +
64 GB RAM +
2-5 NVMe SSD

|image:images/frozen.png["An Elastic Cloud Architecture"]
|
i3en
|
azure.es.datafrozen.edsv4


e8dsv4
|
gcp.es.datafrozen.n2.68x10x95


N2
N2|
8 vCPU +
64 GB RAM +
2-5 NVMe SSD
|image:images/machine-learning.png["An Elastic Cloud Architecture"]
| aws.es.ml.m6gd


|
m6gd
|
azure.es.ml.fsv2


f32sv2
|
gcp.es.ml.n2.68x32x45


N2
N2|
32 vCPU +
32 GB RAM +
2-5 NVMe SSD
|image:images/master.png["An Elastic Cloud Architecture"]
| aws.es.master.c6gd


|
c6gd
|
azure.es.master.fsv2


f32sv2
|
gcp.es.master.n2.68x32x45


N2
N2|
8 vCPU +
64 GB RAM +
2-5 NVMe SSD
|image:images/kibana.png["An Elastic Cloud Architecture"]
| aws.kibana.c6gd


|
c6gd
|
azure.kibana.fsv2


f32sv2
|
gcp.kibana.n2.68x32x45


N2|
8 vCPU +
64 GB RAM +
2-5 NVMe SSD
|===

[discrete]
@@ -160,10 +111,14 @@ The following are important considerations for this architecture:

* This is a hot/frozen architecture. If you require https://www.elastic.co/guide/en/security/current/about-rules.html[detection rule lookback] or complex dashboards, you may need to add a https://www.elastic.co/guide/en/elasticsearch/reference/current/data-tiers.html#cold-tier[cold tier].

* During maintenance windows, only a single copy exists of some of the data, namely the most recently written data that is not yet part of a snapshot. This could be addressed by adding data nodes to POD 3 and setting the sharding strategy to 1 primary and 2 replicas.

* Maintenance should be performed one availability zone at a time.
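
The effect of the replica count on maintenance windows can be sketched as follows. This is a simplified model with hypothetical helper names: it assumes every copy of a shard sits in a distinct availability zone and that exactly one zone is offline at a time. The `number_of_shards` and `number_of_replicas` keys are standard index settings.

```python
def index_settings(primaries: int, replicas: int) -> dict:
    """Build an index settings body for the given sharding strategy."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,
                "number_of_replicas": replicas,
            }
        }
    }


def copies_during_maintenance(replicas: int, zones_offline: int = 1) -> int:
    """Copies of each shard still online while some zones are down.

    Assumption: each copy (primary or replica) lives in a different zone.
    """
    return max((1 + replicas) - zones_offline, 0)


print(copies_during_maintenance(replicas=1))  # one copy left
print(copies_during_maintenance(replicas=2))  # two copies left
print(index_settings(primaries=1, replicas=2))
```

With 1 replica, taking one zone offline leaves a single copy of recently written data. With 2 replicas spread over three zones, two copies remain available during the same maintenance window.
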

[discrete]
[[cloud-architecture-limitations]]
=== Limitations of this architecture
* This architecture is not intended for disaster recovery, because it is deployed across availability zones in a single cloud region. To protect against the loss of a region, add a second deployment in another cloud region. To learn more, see https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-ccr.html#ccr-disaster-recovery[disaster recovery].
* This architecture is not intended as a https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-ccr.html#ccr-disaster-recovery[disaster recovery] architecture since it is deployed across Availability Zones in a single cloud region.

[discrete]
[[cloud-hot-frozen-resources]]
Binary file not shown.
Binary file not shown.
9 changes: 0 additions & 9 deletions docs/reference/reference-architectures/index.asciidoc
@@ -29,15 +29,6 @@ a|
* You need long retention periods with the ability to search indices in an object store cost-effectively.
* You want to use your cloud provider's highly available object stores for data integrity so you don't have to depend on your own.

| <<three-availability-zones>>

This architecture is derived from the Elasticsearch Service - High Availability - Single Region architecture. It defines additional considerations required when self-deploying. It uses multi-availability zone architecture and is optimized for time-series.

a|
* When you need an architecture that is resilient to unplanned outages
|
|===

include::elastic-cloud-architecture.asciidoc[]

include::three-availability-zones.asciidoc[]

This file was deleted.
