Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add action and state stores migrations #4305

Conversation

AndersonQ
Copy link
Member

@AndersonQ AndersonQ commented Feb 21, 2024

What does this PR do?

Add migration for the action/state stores and the necessary tests.

Now it migrates:

  • from the action store
  • from the deprecated YAML state store

It also removes all action store unused code and creates the internal/pkg/agent/storage/store/internal/migrations/ package. This package should contain all necessary and deprecated code to read the old stores data into current models. internal/pkg/agent/storage/store/migrations.go orchestrates the migrations.

Why is it important?

The new State Store uses a different schema and is serialised in JSON instead of YAML

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added an entry in ./changelog/fragments using the changelog tool
  • [ ] I have added an integration test or an E2E test

How to test this PR locally

Testing action store -> new JSON state Store:

  • install a 7.17 agent
  • perform a policy update
  • ensure the action is persisted in the action store.
  • upgrade to this version
  • ensure the action store file is deleted and a new store.enc is present
  • decrypt the encrypted store and check the same action is present.

Testing current state store -> new JSON state Store:

  • same as above, but start with the latest agent release
  • you might use a scheduled upgrade action to ensure there is an action in the action queue
  • the store will be encrypted by default, so:
  • decrypt it
  • check the content
  • upgrade the agent
  • check the store content again. The information should be the same, but in JSON

Related issues

Questions to ask yourself

  • How are we going to support this in production?
  • How are we going to measure its adoption?
  • How are we going to debug this?
  • What are the metrics I should take care of?
  • ...

@AndersonQ AndersonQ self-assigned this Feb 21, 2024
@AndersonQ AndersonQ changed the base branch from main to bugfix/3912-borken-stat-store February 21, 2024 14:19
@AndersonQ AndersonQ force-pushed the 3912-state-store-migrations branch from 8acc5da to 6edaacf Compare February 21, 2024 14:22
@AndersonQ AndersonQ changed the title [WIP] state store migrations Add action and state stores migrations Feb 22, 2024
@AndersonQ AndersonQ marked this pull request as ready for review February 22, 2024 16:48
@AndersonQ AndersonQ requested a review from a team as a code owner February 22, 2024 16:48
@AndersonQ AndersonQ requested review from michalpristas and blakerouse and removed request for a team February 22, 2024 16:48
@elasticmachine
Copy link
Contributor

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@AndersonQ AndersonQ requested review from pchila and removed request for michalpristas February 22, 2024 16:49
Copy link
Contributor

mergify bot commented Feb 23, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b 3912-state-store-migrations upstream/3912-state-store-migrations
git merge upstream/bugfix/3912-borken-stat-store
git push upstream 3912-state-store-migrations

@AndersonQ AndersonQ force-pushed the 3912-state-store-migrations branch from d7c4e40 to dae2b26 Compare February 23, 2024 06:16
@AndersonQ AndersonQ mentioned this pull request Mar 5, 2024
3 tasks
@AndersonQ AndersonQ force-pushed the 3912-state-store-migrations branch from 88f1d98 to de84eab Compare March 5, 2024 14:48
Copy link
Contributor

@blakerouse blakerouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is missing unit tests for the migrations.

@AndersonQ
Copy link
Member Author

I think this is missing unit tests for the migrations.

@blakerouse could you be more specific? I didn't add explicit test for the function on store/migrations.go and store/internal/migrations/migrations.go because they're tested together with the whole store.

It's odd, GoLand and SonarQube show different code coverage.

Anyway, I can spot a few cases the test did not cover, there are tests for the migrations

@AndersonQ AndersonQ force-pushed the 3912-state-store-migrations branch from 53405cf to 7424ddb Compare March 7, 2024 10:08

// Action is a struct to read an Action Policy Change from its old
// YAML format.
type Action struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action has quite a broad meaning in the rest of the agent.
This looks like a very specific action store: it represents a Policy Action.
What about calling this PolicyActionStore or something more specific?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also a private type actionQueue []fleetapi.ScheduledAction in state_store.go that makes following this even more confusion.

Why isn't this the general fleetapi.Action? Or one of the concrete Fleet action API implementations?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This package is only for action and state store migrations. It cannot be imported outside the strore package. That's why I thought it would not be confusing, at this level there is only one meaning for Action.

Why isn't this the general fleetapi.Action? Or one of the concrete Fleet action API implementations?

because it's a code for migration, the fleetapi models do not work, they changed. There was a whole PR just to fix the schema: #4240

Copy link
Member Author

@AndersonQ AndersonQ Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it was copied from the old store code. I kept it as it was, even thought it has properties to read different actions, as far as I can tell, there should only be 'Policy Change' actions saved

}

// ActionQueue is a struct to read the action queue from its old YAML format.
type ActionQueue struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This queue is quite peculiar: it's the same as Action above with ExpirationTime and Version attributes added.
I assume that the structure follows the old structure we had in the store, but maybe we can add some comments about why it's defined this way 😅

Comment on lines 48 to 59
// LoadActionStore loads an action store from loader.
func LoadActionStore(loader interface{ Load() (io.ReadCloser, error) }) (*Action, error) {
return LoadStore[Action](loader)
}

// LoadYAMLStateStore loads the old YAML state store from loader.
func LoadYAMLStateStore(loader interface{ Load() (io.ReadCloser, error) }) (*StateStore, error) {
return LoadStore[StateStore](loader)
}

// LoadStore loads a YAML file.
func LoadStore[Store any](loader interface{ Load() (io.ReadCloser, error) }) (store *Store, err error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason to use an anonymous interface type inline here? Since it already appears 3 times in these few lines we may define something like:

Suggested change
// LoadActionStore loads an action store from loader.
func LoadActionStore(loader interface{ Load() (io.ReadCloser, error) }) (*Action, error) {
return LoadStore[Action](loader)
}
// LoadYAMLStateStore loads the old YAML state store from loader.
func LoadYAMLStateStore(loader interface{ Load() (io.ReadCloser, error) }) (*StateStore, error) {
return LoadStore[StateStore](loader)
}
// LoadStore loads a YAML file.
func LoadStore[Store any](loader interface{ Load() (io.ReadCloser, error) }) (store *Store, err error) {
type loader interface {
Load() (io.ReadCloser, error)
}
// LoadActionStore loads an action store from loader.
func LoadActionStore(loader loader) (*Action, error) {
return LoadStore[Action](loader)
}
// LoadYAMLStateStore loads the old YAML state store from loader.
func LoadYAMLStateStore(loader loader) (*StateStore, error) {
return LoadStore[StateStore](loader)
}
// LoadStore loads a YAML file.
func LoadStore[Store any](loader loader) (store *Store, err error) {

return fmt.Errorf("store state migrated from YAML failed: %w", err)
}

err = store.Save(jsonReader)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should make a copy of the old store before rewriting the contents?
I am mostly thinking of having a way to restore the old contents in case of botched migration or rollback of an upgrade...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we need it. We don't do it when saving the store i the normal, non-migration flow, why would we need it here we well? Another thing, it uses a encrypted store to save, which saves by saving a .tmp file and renaming it.

Copy link
Member

@pchila pchila Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was more about having a backup in case of problems with the migration... but in that case we would also need to save it in the diagnostic bundle. It's just a nice to have, not blocking

internal/pkg/agent/storage/store/state_store.go Outdated Show resolved Hide resolved
}

t.Run("action store file does not exists", func(t *testing.T) {
ctx := context.Background()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use a context that expires after a while, just in case the test gets stuck...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do that, but those tests do not wait on anything, there isn't any expectation they might get stuck.

Copy link

@AndersonQ AndersonQ merged commit 9f7ccc2 into elastic:bugfix/3912-borken-stat-store Mar 20, 2024
9 checks passed
@AndersonQ AndersonQ deleted the 3912-state-store-migrations branch March 20, 2024 06:09
@AndersonQ AndersonQ mentioned this pull request Apr 2, 2024
4 tasks
AndersonQ added a commit that referenced this pull request Jun 27, 2024
Fix action store -> state store migration tests (#4235)
* use golden file for store migration test
* remove withFile function
* migrate take in a storage.Store instead of the storage's path. It's needed so we can set the encrypted store vault's path

refactor state store (#4253)
It modifies the state store API to match the current needs.

update action model to match fleet-server schema (#4240)
* simplify fleetapi.Actions.UnmarshalJSON
* add test to ensure the state store is correctly loaded from disk
* skip state store migration tests, they will be fixes on a follow-up PR as part of #3912

add migrations for action and state stores (#4305)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants