diff --git a/docs/reference/reference-architectures/elastic-cloud-architecture.asciidoc b/docs/reference/reference-architectures/elastic-cloud-architecture.asciidoc
index 793c7a19bd7c5..f8c74b6ce713d 100644
--- a/docs/reference/reference-architectures/elastic-cloud-architecture.asciidoc
+++ b/docs/reference/reference-architectures/elastic-cloud-architecture.asciidoc
@@ -7,13 +7,13 @@ This architecture is ideal for observability use cases. The architecture include
[discrete]
[[cloud-hot-use-case]]
-==== Use Case
+==== Use case

-This architecture is intended for organizations that need to:
+This architecture is intended for organizations that need to do the following:

-* Monitor the performance and health of their applications in real-time, including the creation and tracking of SLOs (Service Level Objectives).
+* Monitor the performance and health of their applications in real time, including the creation and tracking of SLOs (Service Level Objectives).
* Provide insights and alerts to ensure optimal performance and quick issue resolution for applications.
-* Apply Machine Learning and Artificial Intelligence to assist SREs and Application Teams in dealing with the large amount of data in this type of use case.
+* Apply machine learning and artificial intelligence to assist engineers and application teams in dealing with the large amount of data in this type of use case.

[discrete]
@@ -24,27 +24,18 @@ image::images/elastic-cloud-architecture.png["An Elastic Cloud Architecture"]
[discrete]
[[cloud-hot-frozen-considerations]]
-==== Important Considerations
+==== Important considerations

-The following list are important conderations for this architecture:
+The following are important considerations for this architecture:

-* **Time Series Data Updates:**
-** Typically, time series use cases are append only and there is rarely a need to update documents once they have been ingested into Elasticsearch. The frozen tier is read-only so once data rolls over to the frozen tier documents can no longer be updated. If there is a need to update documents for some part of the data lifecycle, that will require either a larger hot tier or the introduction of a warm tier to cover the time period needed for document updates.
-* **Multi-AZ Frozen Tier:**
-** When using the frozen tier for storing data for regulatory purposes (e.g. one or more years), we typically recommend a single availability zone. However, since this architecture relies on the frozen tier for most of the search capabilities, we recommend at least two availability zones to ensure that there will be data nodes available in the event of an AZ failure.

-* **Architecture Variant - adding a Cold Tier**
-** The hot-frozen architecture works well for most time-series use cases. However, when there is a need for more frequent, low-latency searches, introducing a cold tier may be required. Some common examples include detection rule lookback for security use cases or complex custom dashboards. The ILM policy for the example Hot-Frozen architecture above could be modified from 1 day in hot, 89 in frozen to 1 day in hot, 7 days in cold, and 82 days in frozen. Cold nodes fully mount a searchable snapshot for primary shards; replica shards are not needed for reliability. In the event of a failure, cold tier nodes can recover data from the underlying snapshot instead. See https://www.elastic.co/guide/en/elasticsearch/reference/current/data-tiers.html[Data tiers] for more details on Elasticsearch data tiers. Note: our Data tiers docs may be slightly at odds with the concept of hot/frozen or hot/cold/frozen.
Should they be updated?
-* **Limitations of this architecture**
-** This architecture is a high-availability Elasticsearch architecture. It is not intended as a Disaster Recovery architecture since it is deployed across Availability Zones in a single cloud region. This architecture can be enhanced for Disaster Recovery by adding a second deployment in another cloud region. Details on Disaster Recovery for Elasticsearch can be found https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-ccr.html#ccr-disaster-recovery[here].
+[discrete]
+[[cloud-architecture-limitations]]
+==== Limitations of this architecture
+* This is a high-availability Elasticsearch architecture. It is not intended as a disaster recovery architecture because it is deployed across availability zones within a single cloud region. It can be enhanced for disaster recovery by adding a second deployment in another cloud region. For details, see https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-ccr.html#ccr-disaster-recovery[Cross-cluster replication for disaster recovery].

[discrete]
[[cloud-hot-frozen-resources]]
==== Resources and references

-* <>
* https://www.elastic.co/guide/en/cloud/current/ec-getting-started.html[Elastic Cloud (Elasticsearch Service)]
-* https://www.elastic.co/guide/en/cloud/current/ec-prepare-production.html[Elastic Cloud - Preparing a deployment for production]
-* https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html[Elasticsearch Documentation]
-* https://www.elastic.co/guide/en/kibana/current/index.html[Kibana Documentation]
-* https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html[Size your shards]
\ No newline at end of file
+* https://www.elastic.co/guide/en/cloud/current/ec-prepare-production.html[Elastic Cloud - Preparing a deployment for production]
\ No newline at end of file
diff --git a/docs/reference/reference-architectures/general-cluster-guidance.asciidoc b/docs/reference/reference-architectures/general-cluster-guidance.asciidoc
index 9bd62d71d9b9c..bdd7e11c11e7b 100644
--- a/docs/reference/reference-architectures/general-cluster-guidance.asciidoc
+++ b/docs/reference/reference-architectures/general-cluster-guidance.asciidoc
@@ -1,11 +1,37 @@
-[[reference-architecture-components]]
-== General cluster guidance
+[[reference-architecture-general-guidance]]
+== General architecture guidance

This page provides prescriptive guidance on key concepts to take into account when building out an Elastic Architecture. This includes components, sharding strategy, hardware recommendations, and index lifecycle.

-[discrete]
-[[component-types]]
-=== Component types
+[[arch-index-strategy]]
+=== Index strategy
+Use index lifecycle management with index templates for consistent index-level settings. For more detail, see https://www.elastic.co/guide/en/elasticsearch/reference/current/set-up-lifecycle-policy.html[Configure a lifecycle policy]. A minimal example policy follows the tier descriptions below.
+
+* *Hot:* Use this tier for ingestion and the fastest reads on the most current data. This architecture assumes no updates to the data once written.
+* *Warm:* Use this tier for time series data that is accessed less frequently and rarely needs to be updated.
+* *Cold:* Use this tier for data that is accessed infrequently and not normally updated.
+* *Frozen:* Data is persisted in a repository and accessed from the node's cache. It may not be as fast as the hot tier, but it can still be fast depending on the caching strategy. Frozen does not mean slow - it means immutable and saved in durable storage.
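+
+The following is a minimal sketch of a hot-to-frozen lifecycle policy for this strategy. The policy name, rollover thresholds, repository name, and retention period are illustrative assumptions and should be tuned to your workload.
+
+[source,console]
+----
+PUT _ilm/policy/hot-frozen-example
+{
+  "policy": {
+    "phases": {
+      "hot": {
+        "actions": {
+          "rollover": {
+            "max_primary_shard_size": "50gb",
+            "max_age": "1d"
+          }
+        }
+      },
+      "frozen": {
+        "min_age": "1d",
+        "actions": {
+          "searchable_snapshot": {
+            "snapshot_repository": "my-snapshot-repo" <1>
+          }
+        }
+      },
+      "delete": {
+        "min_age": "90d",
+        "actions": {
+          "delete": {}
+        }
+      }
+    }
+  }
+}
+----
+<1> Assumes a snapshot repository named `my-snapshot-repo` has already been registered.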
+
+* **Time series data updates:**
+** Typically, time series use cases are append only, and there is rarely a need to update documents once they have been ingested into Elasticsearch. The frozen tier is read-only, so once data rolls over to the frozen tier, documents can no longer be updated. If there is a need to update documents for some part of the data lifecycle, that will require either a larger hot tier or the introduction of a warm tier to cover the time period needed for document updates.
+* **Multi-AZ frozen tier:**
+** When using the frozen tier for storing data for regulatory purposes (e.g. one or more years), we typically recommend a single availability zone. However, since this architecture relies on the frozen tier for most of the search capabilities, we recommend at least two availability zones to ensure that there will be data nodes available in the event of an AZ failure.
+
+* **Architecture variant - adding a cold tier:**
+** The hot-frozen architecture works well for most time series use cases. However, when there is a need for more frequent, low-latency searches, introducing a cold tier may be required. Some common examples include detection rule lookback for security use cases or complex custom dashboards. The ILM policy for the example hot-frozen architecture above could be modified from 1 day in hot and 89 days in frozen to 1 day in hot, 7 days in cold, and 82 days in frozen. Cold nodes fully mount a searchable snapshot for primary shards; replica shards are not needed for reliability. In the event of a failure, cold tier nodes can recover data from the underlying snapshot instead. See https://www.elastic.co/guide/en/elasticsearch/reference/current/data-tiers.html[Data tiers] for more details on Elasticsearch data tiers.
+
+
+[[arch-sharding-strategy]]
+=== Sharding strategy
+
+The most important foundational step to maintaining performance as you scale is proper shard sizing, location, count, and distribution. For a complete understanding of what shards are and how they should be used, please review https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html[Size your shards].
+
+* *Sizing:* Maintain shard sizes within https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html#shard-size-recommendation[recommended ranges] and aim for an optimal number of shards.
+* *Distribution:* In a distributed system, any distributed process is only as fast as the slowest node. As a result, it is optimal to maintain indexes with a primary shard count that is a multiple of the node count in a given tier. This creates even distribution of processing and prevents hotspots.
+** Shard distribution should be enforced using the https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html#avoid-node-hotspots[‘total shards per node’] index-level setting.
+* *Shard allocation awareness:* To prevent both a primary and a replica from being copied to the same zone, or in this case the same pod, you can use https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cluster.html#shard-allocation-awareness[shard allocation awareness] and define a simple attribute in the elasticsearch.yml file on a per-node basis to make Elasticsearch aware of the physical topology and route shards appropriately. In deployment models with multiple availability zones, AZs would be used in place of pod location.
+
+[[cluster-topology]]
+=== Cluster topology

Each component serves a specific function within the Elasticsearch cluster, contributing to its overall performance and reliability. Understanding these node types is crucial for designing, managing, and optimizing your Elasticsearch deployment.
@@ -16,11 +42,14 @@ Each component serves a specific function within the Elasticsearch cluster, cont
|===
| Master Node | image:images/master.png[Image showing a master node]
| Responsible for cluster-wide settings and state, including index metadata and node information. Ensures cluster health by managing node joining and leaving.
-|Storage is not a key factor for master nodes, however CPU and memory are important considerations. Each of our recommended instance types for master nodes have a vCPU:RAM ratio of at least 0.500.
+a|
+* Storage is not a key factor for master nodes; however, CPU and memory are important considerations.
+* Each of our recommended instance types for master nodes has a vCPU:RAM ratio of at least 0.500.

| Data Node - Hot | image:images/hot.png[Image showing a hot data node]
-| Stores data and performs CRUD, search, and aggregations. High I/O, CPU, and memory requirements.
-|Since the hot tier is responsible for ingest, search and force-merge (when creating the searchable snapshots to roll data over to the frozen tier), cpu-optimized nodes with solid state drives are strongly recommended. Hot nodes should have a disk:memory ratio no higher than 45:1 and the vCPU:RAM ratio should be a minimum of 0.500.
+| Stores data and performs CRUD, search, and aggregations. High I/O, CPU, and memory requirements. The hot tier is responsible for ingest, search, and force-merge when creating searchable snapshots to roll data over to the frozen tier.
+a|* CPU-optimized nodes with solid-state drives are strongly recommended.
+* Hot nodes should have a disk:memory ratio no higher than 45:1, and the vCPU:RAM ratio should be a minimum of 0.500.

| Data Node - Warm | Need Warm Image | Stores data and performs CRUD, search, and aggregations. High I/O, CPU, and memory requirements.
@@ -31,16 +60,18 @@ Each component serves a specific function within the Elasticsearch cluster, cont
|
-| Data Node - Frozen | image:images/frozen.png[Image showing a hot data node]
-| Stores data and performs CRUD, search, and aggregations. High I/O, CPU, and memory requirements.
-|The frozen tier uses a local cache to hold data from the Snapshot Store in the cloud providers' object store. For the best query performance in the frozen tier, frozen nodes should use solid state drives with a disk:memory ratio of at least 75:1 and a vCPU:RAM ratio of at least 0.133.
+| Data Node - Frozen | image:images/frozen.png[Image showing a frozen data node]
+| Stores data and performs CRUD, search, and aggregations. High I/O, CPU, and memory requirements. The frozen tier uses a local cache to hold data from the snapshot store in the cloud provider's object store.
+| For the best query performance in the frozen tier, frozen nodes should use solid-state drives with a disk:memory ratio of at least 75:1 and a vCPU:RAM ratio of at least 0.133.

| Machine Learning Node | image:images/machine-learning.png[Image showing a machine learning node]
| Executes machine learning jobs, including anomaly detection, data frame analysis, and inference.
-|Storage is not a key factor for ML nodes, however CPU and memory are important considerations. Each of our recommended instance types for machine learning have a vCPU:RAM ratio of at least 0.250.
+a|* Storage is not a key factor for ML nodes; however, CPU and memory are important considerations.
+* Each of our recommended instance types for machine learning has a vCPU:RAM ratio of at least 0.250.

| Kibana | image:images/kibana.png[Image showing a kibana node]
| Provides the front-end interface for visualizing data stored in Elasticsearch. Essential for creating dashboards and managing visualizations.
-|Storage is not a key factor for kibana nodes, however CPU and memory are important considerations. Each of our recommended instance types for kibana nodes have a vCPU:RAM ratio of at least 0.500.
+a|* Storage is not a key factor for Kibana nodes; however, CPU and memory are important considerations.
+* Each of our recommended instance types for Kibana nodes has a vCPU:RAM ratio of at least 0.500.

| Snapshot Storage | image:images/snapshot.png[Image showing snapshot storage]
| Serves as the repository for storing snapshots of Elasticsearch indices. Critical for backup and disaster recovery.
@@ -124,40 +155,14 @@ N2|
For more details on these instance types, see our documentation on Elastic Cloud hardware for https://www.elastic.co/guide/en/cloud/current/ec-default-aws-configurations.html[AWS], https://www.elastic.co/guide/en/cloud/current/ec-default-azure-configurations.html[Azure] and https://www.elastic.co/guide/en/cloud/current/ec-default-gcp-configurations.html[GCP].

[discrete]
-[[component-HA-guidance]]
-=== High-availability guidance
+[[component-HA]]
+=== High availability

-For production we recommend a minimum of 2 availability zones and 3 availability zones for mission critical applications. See https://www.elastic.co/guide/en/cloud/current/ec-planning.html[Plan for Production] for more details.
+For production, we recommend a minimum of two availability zones, and three availability zones for mission-critical applications. See https://www.elastic.co/guide/en/cloud/current/ec-planning.html[Plan for Production] for more details.

-TIP: Even if the cluster is deployed across only two AZ, a third master node is still required for quorum voting and will be created automatically in the third AZ.
+TIP: Even if the cluster is deployed across only two availability zones, a third master node is still required for quorum voting and will be created automatically in the third availability zone.

-The number of data nodes for each tier be scaled up depending on ingest volume and retention period. Hot nodes can contain both primary and replica shards. By default, primary and replica shards are always guaranteed to be in different availability zones. Frozen nodes rely on a large high-speed cache and retrieve data from the Snapshot Store as needed. Machine learning nodes are optional but highly recommended for large scale time series use cases since the amount of data quickly becomes too difficult to analyze without applying techniques such as machine learning based anomaly detection.
+The number of data nodes for each tier can be scaled up depending on ingest volume and retention period. Hot nodes can contain both primary and replica shards. By default, primary and replica shards are always guaranteed to be in different availability zones. Frozen nodes rely on a large high-speed cache and retrieve data from the snapshot store as needed. Machine learning nodes are optional but highly recommended for large-scale time series use cases, since the amount of data quickly becomes too difficult to analyze without applying techniques such as machine-learning-based anomaly detection.

-[discrete]
-=== Shard Management
-
-The most important foundational step to maintaining performance as you scale is proper shard sizing, location, count, and shard distribution. For a complete understanding of what shards are and how they should be used please review https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html[Size your shards].
-
-* *Sizing:* Maintain shard sizes within https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html#shard-size-recommendation[recommended ranges] and aim for an optimal number of shards.
-* *Distribution:* In a distributed system, any distributed process is only as fast as the slowest node. As a result, it is optimal to maintain indexes with a primary shard count that is a multiple of the node count in a given tier.
This creates even distribution of processing and prevents hotspots. -** Shard distribution should be enforced using the https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html#avoid-node-hotspots[‘total shards per node’] index level setting. -* **Shard allocation awareness:** To prevent both a primary and a replica from being copied to the same zone, or in this case the same pod, you can use https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cluster.html#shard-allocation-awareness[shard allocation awareness] and define a simple attribute in the elaticsearch.yaml file on a per-node basis to make Elasticsearch aware of the physical topology and route shards appropriately. In deployment models with multiple availability zones, AZ's would be used in place of pod location. - -[discrete] -=== Index lifecyle -Use index lifecycle management with index templates for consistent index level settings, please see, https://www.elastic.co/guide/en/elasticsearch/reference/current/set-up-lifecycle-policy.html[Configure a lifecycle policy] for more detail. - -* *Hot:* Use this tier for ingestion and the fastest reads on the most current data. This architecture assumes no updates to the data once written. -* **Warm: ** - Add information on when to use warm tier. -* **Cold** - Add information on when to use cold tier. -* **Frozen:** Data is persisted in a repository; however, it is accessed from the node's cache. It may not be as fast as the Hot tier; however, it can still be fast depending on the caching strategy. Frozen does not mean slow - it means immutable and saved in durable storage. - -* **Time Series Data Updates:** -** Typically, time series use cases are append only and there is rarely a need to update documents once they have been ingested into Elasticsearch. The frozen tier is read-only so once data rolls over to the frozen tier documents can no longer be updated. If there is a need to update documents for some part of the data lifecycle, that will require either a larger hot tier or the introduction of a warm tier to cover the time period needed for document updates. -* **Multi-AZ Frozen Tier:** -** When using the frozen tier for storing data for regulatory purposes (e.g. one or more years), we typically recommend a single availability zone. However, since this architecture relies on the frozen tier for most of the search capabilities, we recommend at least two availability zones to ensure that there will be data nodes available in the event of an AZ failure. - -* **Architecture Variant - adding a Cold Tier** -** The hot-frozen architecture works well for most time-series use cases. However, when there is a need for more frequent, low-latency searches, introducing a cold tier may be required. Some common examples include detection rule lookback for security use cases or complex custom dashboards. The ILM policy for the example Hot-Frozen architecture above could be modified from 1 day in hot, 89 in frozen to 1 day in hot, 7 days in cold, and 82 days in frozen. Cold nodes fully mount a searchable snapshot for primary shards; replica shards are not needed for reliability. In the event of a failure, cold tier nodes can recover data from the underlying snapshot instead. See https://www.elastic.co/guide/en/elasticsearch/reference/current/data-tiers.html[Data tiers] for more details on Elasticsearch data tiers. Note: our Data tiers docs may be slightly at odds with the concept of hot/frozen or hot/cold/frozen. Should they be updated? 
\ No newline at end of file
diff --git a/docs/reference/reference-architectures/high-availability.asciidoc b/docs/reference/reference-architectures/high-availability.asciidoc
index 73fa0aed5fe7f..1076bbf0b056a 100644
--- a/docs/reference/reference-architectures/high-availability.asciidoc
+++ b/docs/reference/reference-architectures/high-availability.asciidoc
@@ -1,5 +1,5 @@
[[reference-architecture-high-availability]]
-== Time Series Highly Available Architectures
+== Time series highly available architectures

-This page outlines reference architectures designed to ensure high availability for time series data in Elasticsearch deployments. These architectures leverage Elasticsearch's features—such as shard allocation, and replication to provide resilient, scalable solutions.
+This page outlines reference architectures designed to ensure high availability for time series data in Elasticsearch deployments. These architectures leverage Elasticsearch features, such as shard allocation and replication, to provide resilient, scalable solutions.

diff --git a/docs/reference/reference-architectures/images/snapshot.png b/docs/reference/reference-architectures/images/snapshot.png
index 0f44ff81f83fc..968378139abbb 100644
Binary files a/docs/reference/reference-architectures/images/snapshot.png and b/docs/reference/reference-architectures/images/snapshot.png differ
diff --git a/docs/reference/reference-architectures/index.asciidoc b/docs/reference/reference-architectures/index.asciidoc
index ac63af89b9c44..aad49c7c7cae0 100644
--- a/docs/reference/reference-architectures/index.asciidoc
+++ b/docs/reference/reference-architectures/index.asciidoc
@@ -1,7 +1,7 @@
[[reference-architectures]]
-= Reference Architectures
+= Reference architectures

-Elasticsearch Reference Architectures serve as essential blueprints for deploying, managing, and optimizing Elasticsearch clusters tailored to different use cases. These architectures provide standardized, proven solutions that help users with best practices for infrastructure setup, data ingestion, indexing, search performance, and high availability. Whether you're handling logs, metrics, or sophisticated search applications, these reference architectures ensure scalability, reliability, and efficient resource utilization. By leveraging these guidelines, organizations can confidently deploy Elasticsearch, achieving optimal performance while minimizing risks and complexities.
+Elasticsearch Reference Architectures serve as blueprints for deploying, managing, and optimizing Elasticsearch clusters tailored to different use cases. These architectures are designed by Solutions Architects to provide standardized, proven solutions that help users with best practices for infrastructure setup, data ingestion, indexing, search performance, and high availability. Whether you're handling logs, metrics, or sophisticated search applications, these reference architectures ensure scalability, reliability, and efficient resource utilization. By leveraging these guidelines, organizations can confidently deploy Elasticsearch, achieving optimal performance while minimizing risks and complexities.

TIP: You can host {es} on your own hardware or send your data to {es} on {ecloud} or serverless.

@@ -13,15 +13,13 @@ These reference architectures are recommendations and should be adapted to fit y
[cols="50, 50"]
|===
-| *Architecture* | *Use when*
+| *Architecture* | *Use case*

| <>

image:images/multi-region-two-datacenter.png[Image showing a multi-region two datacenter architecture]

a|
-You want to:
-
-* Monitor the performance and health of their applications in real-time
-* Provide insights and alerts to ensure optimal performance and quick issue resolution.
+* Monitor the performance and health of applications in real time
+* Provide insights and alerts to ensure optimal performance and quick issue resolution.
@@ -30,8 +28,6 @@ You want to:

-image:images/elastic-cloud-architecture.png[Image showing a Elastic Cloud Hot-Frozen Architecture]
+image:images/elastic-cloud-architecture.png[Image showing an Elastic Cloud hot-frozen architecture]

a|
-You want to:
-
-* Ipsum lorem
-* Lorem ipsum
+* Monitor the performance and health of applications in real time
+* Apply machine learning and artificial intelligence to large volumes of observability data

@@ -40,8 +36,6 @@ You want to:

image:images/single-datacenter.png[Image showing a single datacenter architecture]

a|
-You want to:
-
-* TBD
-* TBD.
+* Store data that is written once and not updated
+* Remain resilient to hardware failures and available during operational maintenance

@@ -50,8 +44,6 @@ You want to:

image:images/three-availability-zone.png[Image showing a three Availability zone architecture]

a|
-You want to:
-
-* TBD
-* TBD
+* Be resilient to hardware failures
+* Ensure availability during operational maintenance of any given availability zone
|

@@ -59,9 +51,9 @@ You want to:
[discrete]
[[reference-architectures-ingest-architectures]]
-=== Ingest Architectures
+=== Ingest architectures

-Additionally, we have architectures specifically tailored to the ingestion portion of your architecture and these can be found at, https://www.elastic.co/guide/en/ingest/current/use-case-arch.html[Ingest Architectures]
+Additionally, we have architectures specifically tailored to the ingest portion of your deployment. These can be found in https://www.elastic.co/guide/en/ingest/current/use-case-arch.html[Ingest architectures].

include::general-cluster-guidance.asciidoc[]

diff --git a/docs/reference/reference-architectures/multi-region-two-datacenter-architecture.asciidoc b/docs/reference/reference-architectures/multi-region-two-datacenter-architecture.asciidoc
index b659100934283..ee12aa9e5333a 100644
--- a/docs/reference/reference-architectures/multi-region-two-datacenter-architecture.asciidoc
+++ b/docs/reference/reference-architectures/multi-region-two-datacenter-architecture.asciidoc
@@ -1,15 +1,17 @@
[[multi-region-two-datacenter-architecture]]
=== Multi-Region - Two Datacenters

-This article defines a scalable and highly available architecture for Elasticsearch using two datacenters in separate geographical regions. The architecture includes all the necessary components of the Elastic Stack and is not intended for sizing workloads, but rather as a basis to ensure the architecture you deploy is foundationally ready to handle any desired workload with resiliency. This architecture does include very high level representations of data flow, but the implementation of which will be included in subsequent documentation.
+This article defines a scalable and highly available architecture for Elasticsearch using two datacenters in separate geographical regions.
+
+TIP: This architecture includes all the necessary components of the Elastic Stack and is not intended for sizing workloads, but rather as a basis to ensure the architecture you deploy is foundationally ready to handle any desired workload with resiliency.

[discrete]
[[multi-region-use-case]]
-==== Use Case
+==== Use case

-This architecture is intended for organizations that need to:
+This architecture is intended for organizations that need to do the following:

-* Monitor the performance and health of their applications in real-time
+* Monitor the performance and health of their applications in real time
* Provide insights and alerts to ensure optimal performance and quick issue resolution for applications

[discrete]
@@ -20,25 +22,14 @@ image::images/multi-region-two-datacenter.png["A multi-region time-series archit
[discrete]
[[multi-region-considerations]]
-==== Important Considerations
+==== Important considerations

-The following list are important conderations for this architecture:
+The following are important considerations for this architecture:

-* **Shard Management:** The most important foundational step to maintaining performance as you scale is proper shard sizing, location, count, and shard distribution.
For a complete understanding of what shards are and how they should be used please review https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html[Size your shards]. -** *Sizing:* Maintain shard sizes within https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html#shard-size-recommendation[recommended ranges] and aim for an optimal number of shards. -** *Distribution:* In a distributed system, any distributed process is only as fast as the slowest node. As a result, it is optimal to maintain indexes with a primary shard count that is a multiple of the node count in a given tier. This creates even distribution of processing and prevents hotspots. -**** Shard distribution should be enforced using the https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html#avoid-node-hotspots[‘total shards per node’] index level setting. -** **Shard allocation awareness:** To prevent both a primary and a replica from being copied to the same zone, or in this case the same pod, you can use https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cluster.html#shard-allocation-awareness[shard allocation awareness] and define a simple attribute in the elaticsearch.yaml file on a per-node basis to make Elasticsearch aware of the physical topology and route shards appropriately. In deployment models with multiple availability zones, AZ's would be used in place of pod location. - - -* **Index lifecyle:** Use index lifecycle management with index templates for consistent index level settings, please see, https://www.elastic.co/guide/en/elasticsearch/reference/current/set-up-lifecycle-policy.html[Configure a lifecycle policy] for more detail. -** *Hot:* Use this tier for ingestion and the fastest reads on the most current data. This architecture assumes no updates to the data once written. -** **Warm / Cold** - This tier is not considered for this pattern. -** **Frozen:** Data is persisted in a repository; however, it is accessed from the node's cache. It may not be as fast as the Hot tier; however, it can still be fast depending on the caching strategy. Frozen does not mean slow - it means immutable and saved in durable storage. - - -* **Limitations of this architecture** -** No region resilience +[discrete] +[[multi-region-limitations]] +==== Limitations of this architecture +* No region resilience [discrete] [[multi-region-resources]] diff --git a/docs/reference/reference-architectures/self-managed-single-datacenter.asciidoc b/docs/reference/reference-architectures/self-managed-single-datacenter.asciidoc index d973f70b96f51..c234d0e098f2f 100644 --- a/docs/reference/reference-architectures/self-managed-single-datacenter.asciidoc +++ b/docs/reference/reference-architectures/self-managed-single-datacenter.asciidoc @@ -1,24 +1,26 @@ [[self-managed-single-datacenter]] === Self Managed - Single Datacenter -This architecture ensures high availability during normal operations and node maintenance. It includes all necessary Elastic Stack components but is not intended for workload sizing. Use this pattern as a foundation and extend it to meet your specific needs. While it represents data flow for context, implementations may vary. Key design elements include the number and location of master nodes, data nodes, zone awareness, and shard allocation strategy. 
For more details, see https://www.elastic.co/guide/en/elasticsearch/reference/current/high-availability-cluster-design-large-clusters.html#high-availability-cluster-design-two-zones[Resilience in larger clusters - Two-zone clusters]. This design does not cover cross-region (geographically diverse) disaster recovery.
+This architecture ensures high availability during normal operations and node maintenance.

-While this architecture does include a representation of a data flow, this is being provided for contextual understanding and may differ from implementation to implementation. The critical portion of the design is the number and location of master nodes, the location of data nodes, zone awareness and the shard allocation strategy. For additional information see this https://www.elastic.co/guide/en/elasticsearch/reference/current/high-availability-cluster-design-large-clusters.html#high-availability-cluster-design-two-zones[reference].
-This design does not address cross region (i.e. geographically diverse) disaster recovery.
+TIP: This architecture includes all the necessary components of the Elastic Stack and is not intended for sizing workloads, but rather as a basis to ensure the architecture you deploy is foundationally ready to handle any desired workload with resiliency.
+
+Key design elements include the number and location of master nodes, data nodes, zone awareness, and shard allocation strategy. For more details, see https://www.elastic.co/guide/en/elasticsearch/reference/current/high-availability-cluster-design-large-clusters.html#high-availability-cluster-design-two-zones[Resilience in larger clusters - Two-zone clusters].
+While this architecture includes a representation of a data flow, it is provided for contextual understanding and may differ from implementation to implementation.

[discrete]
[[single-datacenter-use-case]]
-==== Use Case
+==== Use case

-This architecture is intended for organizations that need to:
+This architecture is intended for organizations that need to do the following:

* Store data that is written once and not updated (e.g. logs, metrics or even an accounting ledger where balance updates are done via additional offsetting entries)
* Be resilient to hardware failures
-* Ensure availability during operational maintenance of any given (zone i.e. POD in the diagram)
+* Ensure availability during operational maintenance of any given zone (i.e. a pod in the diagram)
* Maintain a single copy of the data during maintenance
-* Leverage a Frozen Data tier as part of the Information Lifecycle
-* Leverage a Snapshot Repository for additional recovery options
+* Leverage a Frozen Data tier as part of the https://www.elastic.co/guide/en/elasticsearch/reference/current/ilm-index-lifecycle.html[Information Lifecycle]
+* Leverage a https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshots-register-repository.html[Snapshot Repository] for additional recovery options

[discrete]
[[single-datacenter-architecture]]
@@ -28,13 +30,13 @@ image::images/single-datacenter.png["A self hosted single datacenter deployment"

[discrete]
[[single-datacenter-considerations]]
-==== Important Considerations
+==== Important considerations

-The following list are important conderations for this architecture:
+The following are important considerations for this architecture:

* **Operate**
-** Maintenance will be done only on one POD at a time.
+** Maintenance will be done only on one pod at a time.
** A yellow cluster state is acceptable during maintenance. (This will be due to replica shards being unassigned.)

@@ -57,11 +59,17 @@ The following list are important conderations for this architecture:

-* **Forced Awareness:** This should be set in In order to prevent Elastic from trying to create replica shards when a given POD is down for maintenance.
+* **Forced Awareness:** This should be set in order to prevent Elasticsearch from trying to create replica shards when a given pod is down for maintenance.
* https://www.elastic.co/guide/en/elasticsearch/reference/8.16/snapshots-take-snapshot.html#automate-snapshots-slm[SLM (Snapshot Lifecycle Management): Considerations]

-* **Limitations of this architecture**
-** No region resilience
-** Only a single copy of (some of … i.e. the most recently written data that is not yet part of a snapshot) data exists during maintenance windows - (Note: This could be addressed by adding data nodes to POD 3 and setting the sharding strategy to 1 Primary and 2 Replicas)
-** Assumes write once (no updating of documents)
-* **Benefits of this architecture**
-** Reduces cost by leveraging the Frozen tier as soon as that makes sense from an ingest and most frequently read documents perspective
-** Significantly reduces the likelihood of hot-spotting due to the sharding strategy
-** Eliminates network and disk overhead caused by rebalancing attempts that would occur during maintenance due to setting forced awareness.
+
+[discrete]
+[[single-datacenter-limitations]]
+==== Limitations of this architecture
+* This design does not address cross-region disaster recovery.
+* Only a single copy of the most recently written data that is not yet part of a snapshot exists during maintenance windows. This can be addressed by adding data nodes to pod 3 and setting the sharding strategy to one primary and two replicas.
+* This design assumes the data is read only and not updated.
+
+[discrete]
+[[single-datacenter-benefits]]
+==== Benefits of this architecture
+* Reduces cost by leveraging the frozen tier as soon as ingest and read-frequency patterns allow
+* Significantly reduces the likelihood of hot-spotting due to the sharding strategy
+* Eliminates network and disk overhead caused by rebalancing attempts that would occur during maintenance due to setting forced awareness, as shown in the configuration sketch below.
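+
+The forced awareness and shard allocation awareness settings referenced above are defined in each node's `elasticsearch.yml`. The following is a minimal sketch; the attribute name `pod` and the values `pod-1` and `pod-2` are illustrative assumptions and should match your actual topology.
+
+[source,yaml]
+----
+# Tag each node with its physical location (set per node).
+node.attr.pod: pod-1
+
+# Make Elasticsearch route primary and replica shards to different pods.
+cluster.routing.allocation.awareness.attributes: pod
+
+# Forced awareness: because only pod-1 and pod-2 are listed, Elasticsearch
+# will not attempt to re-create replicas on the surviving pod while the
+# other pod is down for maintenance.
+cluster.routing.allocation.awareness.force.pod.values: pod-1,pod-2
+----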
diff --git a/docs/reference/reference-architectures/three-availability-zones.asciidoc b/docs/reference/reference-architectures/three-availability-zones.asciidoc
index adb44b6ae49fb..8bf98d9c58577 100644
--- a/docs/reference/reference-architectures/three-availability-zones.asciidoc
+++ b/docs/reference/reference-architectures/three-availability-zones.asciidoc
@@ -5,9 +5,9 @@ This article outlines a scalable and highly available architecture for Elasticse

[discrete]
[[three-availability-zones-use-case]]
-==== Use Case
+==== Use case

-This architecture is intended for organizations that need to:
+This architecture is intended for organizations that need to do the following:

* Be resilient to hardware failures
* Ensure availability during operational maintenance of any given availability zone
@@ -24,7 +24,7 @@ image::images/three-availability-zone.png["A three-availability-zones time-serie

[discrete]
[[three-availability-zones-considerations]]
-==== Important Considerations
+==== Important considerations

-The following list are important conderations for this architecture:
+The following are important considerations for this architecture:

@@ -38,11 +38,20 @@ The following list are important conderations for this architecture:

** Set up a repository for the frozen tier.
-* https://www.elastic.co/guide/en/elasticsearch/reference/8.16/snapshots-take-snapshot.html#automate-snapshots-slm[SLM (Snapshot Lifecycle Management): Considerations]
+* Automate snapshots with https://www.elastic.co/guide/en/elasticsearch/reference/8.16/snapshots-take-snapshot.html#automate-snapshots-slm[SLM (Snapshot Lifecycle Management)]. A sketch of an example policy is shown at the end of this section.

-* **Limitations of this architecture**
-** No region resilience
-** Only a single copy of (some of … i.e. the most recently written data that is not yet part of a snapshot) data exists during maintenance windows - (Note: This could be addressed by adding data nodes to POD 3 and setting the sharding strategy to 1 Primary and 2 Replicas)
-** Assumes write once (no updating of documents)
-
-* **Benefits of this architecture**
-** Reduces cost by leveraging the Frozen tier as soon as that makes sense from an ingest and most frequently read documents perspective
-** Significantly reduces the likelihood of hot-spotting due to the sharding strategy
-** Eliminates network and disk overhead caused by rebalancing attempts that would occur during maintenance due to setting forced awareness.
+
+[discrete]
+[[three-zone-limitations]]
+==== Limitations of this architecture
+
+* This design does not address cross-region disaster recovery.
+* Only a single copy of the most recently written data that is not yet part of a snapshot exists during maintenance windows. This can be addressed by adding data nodes to the third availability zone and setting the sharding strategy to one primary and two replicas.
+* This design assumes the data is written once and not updated.
+
+
+[discrete]
+[[three-zone-benefits]]
+==== Benefits of this architecture
+
+* Reduces cost by leveraging the frozen tier as soon as ingest and read-frequency patterns allow
+* Significantly reduces the likelihood of hot-spotting due to the sharding strategy
+* Eliminates network and disk overhead caused by rebalancing attempts that would occur during maintenance due to setting forced awareness.
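+
+As a sketch of the SLM consideration above, the following policy takes a daily snapshot and prunes old ones. The policy name, schedule, repository name, and retention values are illustrative assumptions; the repository must already be registered.
+
+[source,console]
+----
+PUT _slm/policy/daily-snapshots
+{
+  "schedule": "0 30 1 * * ?", <1>
+  "name": "<daily-snap-{now/d}>", <2>
+  "repository": "my-snapshot-repo", <3>
+  "config": {
+    "indices": "*",
+    "include_global_state": true
+  },
+  "retention": {
+    "expire_after": "30d",
+    "min_count": 5,
+    "max_count": 50
+  }
+}
+----
+<1> Take a snapshot every day at 1:30 a.m. UTC.
+<2> Snapshot names include the current date, for example `daily-snap-2024.10.21`.
+<3> Assumes a registered snapshot repository named `my-snapshot-repo`.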