Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: add wait_for_destroy option when destroying and environment #946

Merged
merged 6 commits into from
Oct 14, 2024

Conversation

TomerHeber
Copy link
Collaborator

Issue & Steps to Reproduce / Feature Request

resolves #686

Solution

  1. Added the option in the schema.
  2. When an environment is 'destroyed', added the 'wait' code if the option is enabled.
  3. Added acceptance tests.
  4. Some misc changes to enable (in the future) a more modern linter checkers.

@@ -63,5 +63,4 @@ func TestAgentValues(t *testing.T) {
},
)
})

Copy link
Collaborator Author

@TomerHeber TomerHeber Sep 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignore all these changes... there are thousands of changes required to successfully add a new linter, so I slowly fix these with each PR I make.

@@ -368,6 +370,12 @@ func resourceEnvironment() *schema.Resource {
Description: "variable set id",
},
},
"wait_for_destroy": {
Type: schema.TypeBool,
Description: "during destroy, waits for the environment status to be 'INACTIVE'. Times out after 30 minutes.",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope the description is clear enough.

if err != nil {
if frerr, ok := err.(*http.FailedResponseError); ok && frerr.BadRequest() {
tflog.Warn(ctx, "Could not delete environment. Already deleted?", map[string]interface{}{"id": d.Id(), "error": frerr.Error()})
return nil
}
return diag.Errorf("could not delete environment: %v", err)
}

if environment.Status != "INACTIVE" && d.Get("wait_for_destroy").(bool) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the wait will only be applied if 'wait_for_destroy' is enabled.

timer := time.NewTimer(timeout) // When invoked - time out
results := make(chan error)

go func() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is some golang specific stuff (channels) not very common in other languages.
https://go.dev/tour/concurrency/2

@TomerHeber TomerHeber requested a review from yaronya September 1, 2024 16:22

environment = &latestEnvironment

if environment.Status == "INACTIVE" {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

destroy successful...

timeout := time.Minute * 30
if os.Getenv("TF_ACC") == "1" { // For acceptance tests.
waitInteval = time.Second
timeout = time.Second * 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean that it's 10 seconds?
Would it fail if environment couldn't destroy in 10 seconds?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for unit tests...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but yes...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's for UTs - does it mean that test will hang for 10s?

What about harness tests?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes... but unit tests run in parallel so it doesn't really cause much problems. (max timeout it 10 seconds)
Harness tests - at least 10 seconds (time between intervals). But harness also run in parallel, so no issues IMHO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having 10 seconds timeout for harness tests is too low. Usually deployments take about 5-10 seconds only to start (i.e schedule a worker that handles the deployment), so I suggest we keep it as 3 minutes.

Copy link
Collaborator Author

@TomerHeber TomerHeber Oct 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 10 seconds timeout is for acceptance tests, not harness tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify - for harness tests it'd keep the original behavior, correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct!

return
}

environment = &latestEnvironment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this extra assignment needed?

Copy link
Collaborator Author

@TomerHeber TomerHeber Sep 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoids a few lines of code for the first iteration... but I can check if it can be avoided.

@@ -929,17 +937,69 @@ func resourceEnvironmentDelete(ctx context.Context, d *schema.ResourceData, meta
return diag.Errorf(`must enable "force_destroy" safeguard in order to destroy`)
}

_, err := apiClient.EnvironmentDestroy(d.Id())
environment, err := apiClient.EnvironmentDestroy(d.Id())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The destroy API returns the deployment log of the new destroy deployment, and not the environment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm if that's the case - sounds like a long standing bug...
I'll fix it....

env0/resource_environment.go Outdated Show resolved Hide resolved
env0/resource_environment_scheduling_test.go Outdated Show resolved Hide resolved
@@ -66,7 +66,7 @@ type ApiClientInterface interface {
Environment(id string) (Environment, error)
EnvironmentCreate(payload EnvironmentCreate) (Environment, error)
EnvironmentCreateWithoutTemplate(payload EnvironmentCreateWithoutTemplate) (Environment, error)
EnvironmentDestroy(id string) (Environment, error)
EnvironmentDestroy(id string) (*EnvironmentDestroyResponse, error)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an old 'bug'.

@@ -91,6 +93,7 @@ type DeploymentLog struct {
Output json.RawMessage `json:"output,omitempty"`
Error json.RawMessage `json:"error,omitempty"`
Type string `json:"type"`
Status string `json:"status"`
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added Status to the DeploymentLog sruct.

@@ -283,13 +299,13 @@ func (client *ApiClient) EnvironmentCreateWithoutTemplate(payload EnvironmentCre
return result, nil
}

func (client *ApiClient) EnvironmentDestroy(id string) (Environment, error) {
var result Environment
func (client *ApiClient) EnvironmentDestroy(id string) (*EnvironmentDestroyResponse, error) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

old bug fixed...

results <- nil
return
}
if slices.Contains([]string{"WAITING_FOR_USER", "TIMEOUT", "FAILURE", "CANCELLED", "INTERNAL_FAILURE", "ABORTING", "ABORTED", "SKIPPED", "NEVER_DEPLOYED"}, deployment.Status) {
Copy link
Collaborator Author

@TomerHeber TomerHeber Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(fast) fail on any of these status.

if environment.Status == "DESTROY_IN_PROGRESS" {
continue
}
if deployment.Status == "SUCCESS" {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

happy path. After waiting...

return
case <-ticker.C:
continue
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep waiting...

@@ -49,6 +49,7 @@ resource "env0_environment" "auto_glob_envrironment" {
approve_plan_automatically = true
deploy_on_push = true
force_destroy = true
wait_for_destroy = true
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the flag to one of the harness tests...

@TomerHeber TomerHeber requested a review from yaronya September 3, 2024 13:29
@yaronya
Copy link
Contributor

yaronya commented Sep 24, 2024

@TomerHeber can you please resolve all conflicts before I review?

@TomerHeber TomerHeber force-pushed the feat-wait_for_destory-#686 branch from 8988667 to e12d69b Compare September 24, 2024 12:49
@TomerHeber
Copy link
Collaborator Author

@TomerHeber can you please resolve all conflicts before I review?

@yaronya - done!

@TomerHeber TomerHeber force-pushed the feat-wait_for_destory-#686 branch from e12d69b to bc20d5b Compare October 2, 2024 01:22
@@ -256,6 +263,15 @@ func (client *ApiClient) Environment(id string) (Environment, error) {
return result, nil
}

func (client *ApiClient) EnvironmentDeployment(id string) (*DeploymentLog, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to call it EnvironmentDeploymentLog so it'd be clearer to the dev that it's different from EnvironmentDeploy

client/environment.go Show resolved Hide resolved
return
}

if slices.Contains([]string{"WAITING_FOR_USER", "TIMEOUT", "FAILURE", "CANCELLED", "INTERNAL_FAILURE", "ABORTING", "ABORTED", "SKIPPED", "NEVER_DEPLOYED"}, deployment.Status) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to not fail on WAITING_FOR_USER but rather keep on waiting, and maybe write something back to the user that the deployment is waiting for an approval

This is actually something to consider for other statuses as well, like QUEUED, so how about we print the current status on every iteration?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added additional messages, and no longer fast failing with WAITING_FOR_USER

timeout := time.Minute * 30
if os.Getenv("TF_ACC") == "1" { // For acceptance tests.
waitInteval = time.Second
timeout = time.Second * 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having 10 seconds timeout for harness tests is too low. Usually deployments take about 5-10 seconds only to start (i.e schedule a worker that handles the deployment), so I suggest we keep it as 3 minutes.

@TomerHeber TomerHeber requested a review from yaronya October 8, 2024 13:08
@TomerHeber
Copy link
Collaborator Author

@yaronya - this is the relevant commit 351976c

timeout := time.Minute * 30
if os.Getenv("TF_ACC") == "1" { // For acceptance tests.
waitInteval = time.Second
timeout = time.Second * 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify - for harness tests it'd keep the original behavior, correct?

@github-actions github-actions bot added ready to merge PR approved - can be merged once the PR owner is ready and removed pending final review labels Oct 14, 2024
@TomerHeber TomerHeber merged commit af33272 into main Oct 14, 2024
7 checks passed
@TomerHeber TomerHeber deleted the feat-wait_for_destory-#686 branch October 14, 2024 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-client feature integration-tests provider ready to merge PR approved - can be merged once the PR owner is ready
Projects
None yet
Development

Successfully merging this pull request may close these issues.

failed destroy dropping resources (environments) on the floor
2 participants