From 54a397fd8e68b9742bb8d254b7d86301e212b47f Mon Sep 17 00:00:00 2001 From: Chris Topaloudis Date: Fri, 13 Mar 2020 16:28:08 +0100 Subject: [PATCH] docs: restore and typos --- docs/operations.rst | 55 +++++++++++++++++++++++++++++++-------- invenio_stats/__init__.py | 6 ++--- 2 files changed, 47 insertions(+), 14 deletions(-) diff --git a/docs/operations.rst b/docs/operations.rst index 96193dec..d8217c05 100644 --- a/docs/operations.rst +++ b/docs/operations.rst @@ -8,8 +8,8 @@ Operations ========== -Since our only copy of statts is stored in the indices of Elasticsearch in case -of a cluster error or failure we will lose our stats data. Thus it is adviced +Since our only copy of stats is stored in the indices of Elasticsearch, in case +of a cluster error or failure we will lose our stats data. Thus it is advised to setup a backup/restore mechanism for projects in production. We have several options when it comes down to tooling and methods for preserving @@ -42,7 +42,9 @@ Backup with elasticdump .. note:: Apart from the data, you will also have to backup the mappings, so you are - able to restore data properly. + able to restore data properly. The following example backs up only the stats + for record-views (not the events); you can go through your indices and + select which ones make sense to back up. Save our mappings and our index data to record_view_mapping_backup.json and @@ -51,7 +53,7 @@ record_view_index_backup.json files respectively. .. code-block:: console $ elasticdump \ - > --input=http://production.es.com:9200/stats-record-view \ + > --input=http://localhost:9200/stats-record-view-2020-03 \ > --output=record_view_mapping_backup.json \ > --type=mapping @@ -63,7 +65,7 @@ record_view_index_backup.json files respectively. 
Fri, 13 Mar 2020 13:13:01 GMT | dump complete $ elasticdump \ - > --input=http://production.es.com:9200/stats-record-view \ + > --input=http://localhost:9200/stats-record-view-2020-03 \ > --output=record_view_index_backup.json \ > --type=data @@ -74,15 +76,46 @@ record_view_index_backup.json files respectively. Fri, 13 Mar 2020 13:13:13 GMT | Total Writes: 5 Fri, 13 Mar 2020 13:13:13 GMT | dump complete +To test the restore functionality below, I will deliberately delete the index +we just backed up from my instance. + +.. code-block:: console + + $ curl -XDELETE http://localhost:9200/stats-record-view-2020-03 + {"acknowledged":true} + Restore with elasticdump ~~~~~~~~~~~~~~~~~~~~~~~~ -There is a saying that goes "A backup worked only when it got restored." This -section will take us through the restore process of the previous step. We will -have to bring our application close to the state it was before the ES cluster -failure. +As we are all aware, a backup has not worked until it has been restored. Note that +before importing our data, we need to import the mappings to re-create the index. +The process is identical to the backup, with the ``--input`` and ``--output`` +sources reversed. + + +.. code-block:: console -Some data loss is possible, from the time we notice the issue and restore -our cluster and its data to the last valid backed up dataset. 
+ $ elasticdump \ + > --input=record_view_mapping_backup.json \ + > --output=http://localhost:9200/stats-record-view-2020-03 \ + > --type=mapping + + Fri, 13 Mar 2020 15:22:17 GMT | starting dump + Fri, 13 Mar 2020 15:22:17 GMT | got 1 objects from source file (offset: 0) + Fri, 13 Mar 2020 15:22:17 GMT | sent 1 objects to destination elasticsearch, wrote 4 + Fri, 13 Mar 2020 15:22:17 GMT | got 0 objects from source file (offset: 1) + Fri, 13 Mar 2020 15:22:17 GMT | Total Writes: 4 + Fri, 13 Mar 2020 15:22:17 GMT | dump complete + + $ elasticdump \ + > --input=record_view_index_backup.json \ + > --output=http://localhost:9200/stats-record-view-2020-03 \ + > --type=data + Fri, 13 Mar 2020 15:23:01 GMT | starting dump + Fri, 13 Mar 2020 15:23:01 GMT | got 5 objects from source file (offset: 0) + Fri, 13 Mar 2020 15:23:01 GMT | sent 5 objects to destination elasticsearch, wrote 5 + Fri, 13 Mar 2020 15:23:01 GMT | got 0 objects from source file (offset: 5) + Fri, 13 Mar 2020 15:23:01 GMT | Total Writes: 5 + Fri, 13 Mar 2020 15:23:01 GMT | dump complete diff --git a/invenio_stats/__init__.py b/invenio_stats/__init__.py index a94264e3..4b5fe657 100644 --- a/invenio_stats/__init__.py +++ b/invenio_stats/__init__.py @@ -230,7 +230,7 @@ def register_events(): cluster. Thus Invenio-Stats provides a way to *compress* those events by pre-aggregating them into meaningful statistics. -*Example: individual file downoalds events can be aggregated into the number of +*Example: individual file download events can be aggregated into the number of file download per day and per file.* Aggregations are registered in the same way as events, under the entrypoint @@ -270,7 +270,7 @@ def register_aggregations(): ] An aggregator class must be specified. The dictionary ``params`` -contains all the arguments given to its construtor. An Aggregator class is +contains all the arguments given to its constructor. An Aggregator class is just required to have a ``run()`` method. 
The default one is :py:class:`~invenio_stats.aggregations.StatAggregator` @@ -300,7 +300,7 @@ def register_aggregations(): ] } -Again the registering function returns the configuraton for the query: +Again the registering function returns the configuration for the query: .. code-block:: python