Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow disabling destination #128

Open
lijok opened this issue Nov 28, 2024 · 15 comments
Open

Allow disabling destination #128

lijok opened this issue Nov 28, 2024 · 15 comments
Labels
enhancement New feature or request priority:medium

Comments

@lijok
Copy link

lijok commented Nov 28, 2024

We need a way to disable destinations via terraform. I see there's a disabled_at option present on hookdeck_destination resource, but does not appear to do anything?

@leggetter leggetter added the enhancement New feature or request label Nov 28, 2024
@leggetter
Copy link
Collaborator

With Terraform, you'd expect to be able to set a disabled property on a resource to disable it. However, the Destination, Connection, and Source resources do not have a disabled / enabled or paused property, so instead, we need to trigger a side effect of calling the /disable/enable/pause/unpause RPC method.

@alexluong - what are your thoughts on having a property on the Terraform resource but not on the API resource? Also, could we then hook into the flow, and have a side effect of calling the RPC endpoints?

@alexluong
Copy link
Collaborator

@lijok hi, any chance you can elaborate on your use case a bit? Currently the TF doesn't support this operation as it's not the intended use case for TF. We can certainly support it but would be curious to hear of your use case if possible. Any reason you prefer managing this with TF over the dashboard?

As for the disabled_at property, it is a read-only field so you can't really set it. If we were to support this behavior, we would need to expose a separate field disabled since the disabled_at shall be the timestamp from the server.


@leggetter

However, the Destination, Connection, and Source resources do not have a disabled / enabled or paused property

Technically, this is not the case. From the API perspective, there's no disabled | enabled | paused property either. There's the timestamps, and whether it's disabled is derived from whether the timestamp exists.

cc destination ref

Also, could we then hook into the flow, and have a side effect of calling the RPC endpoints?

Yes, that's possible but I'm not sure if it's an antipattern or not.

@lijok
Copy link
Author

lijok commented Dec 2, 2024

@lijok hi, any chance you can elaborate on your use case a bit? Currently the TF doesn't support this operation as it's not the intended use case for TF. We can certainly support it but would be curious to hear of your use case if possible. Any reason you prefer managing this with TF over the dashboard?

As for the disabled_at property, it is a read-only field so you can't really set it. If we were to support this behavior, we would need to expose a separate field disabled since the disabled_at shall be the timestamp from the server.

@leggetter

However, the Destination, Connection, and Source resources do not have a disabled / enabled or paused property

Technically, this is not the case. From the API perspective, there's no disabled | enabled | paused property either. There's the timestamps, and whether it's disabled is derived from whether the timestamp exists.

cc destination ref

Also, could we then hook into the flow, and have a side effect of calling the RPC endpoints?

Yes, that's possible but I'm not sure if it's an antipattern or not.

We have two Hookdeck orgs, prod and dev, in line with our DR strategy of everything in 2's.
One Terraform root module deploys to both, with a config specific to prod and dev.
In dev, we don't want to send billing events to Hyperline, as one example.
So, in dev.tfvars we have a enable_hyperline_integration var set to false.
The path of least friction to implement that var, would be to toggle a disabled flag on the hyperline destination.
Because it doesn't exist, we currently have to optionally create connections, which are more difficult to grok.

EDIT: We also have a number of kill switches installed in our IaC which can be toggled in the case of an emergency. This feature would be utilized for them as well.

@alexbouchardd
Copy link
Contributor

Thanks, @lijok@alexluong. I think @leggetter's proposition to expose a disabled property on Terraform makes sense, as it doesn't fit the API's REST assumptions. In line with that, the same would be true for paused. Now, the issue, though, is that a source and destination can also be disabled/paused, but that has the side effect of also updating the associated connections, which would break the Terraform state. Unclear to me what the best practice from a TF standpoint is for this.

@lijok
Copy link
Author

lijok commented Dec 2, 2024

Being able to pause and disable a connection instead would also be a good way to address this requirement, if not better

@alexluong
Copy link
Collaborator

alexluong commented Dec 3, 2024

Now, the issue, though, is that a source and destination can also be disabled/paused, but that has the side effect of also updating the associated connections, which would break the Terraform state.

Ah that's tough. I don't think there's a clear TF best practice because we ventured into a use case where Terraform is not specifically designed for. With that said, this is a common pattern, so I've looked at how other providers approach this.

Now, none of them has the cascading effect of our source -> connection. However, I think as long as we have clear documentation that users should read the disabled_at timestamp to get the correct state, then we should be fine-ish in terms of the cascading issue.

When our source is disabled, upon state refresh at the end of the TF lifecycle, it will correctly update the disabled_at state of affected connections.

Personally, I would prefer a dedicated resource to update state because it may cause less confusion to the end users. If there's a connection.state or connection.disabled property, people may be inclined to read it as the source of truth over connection.disabled_at.

Problem

The main issue with this approach is this flow:

Let's say we have 1 source with 5 connections

1: disable source -> cascade & disable all 5 connections
2: enable source

End state: source is enabled; all 5 connections are disabled

Depending on our approach, there's either no way to fix the disabled state, or users will have to be explicit about re-enabling them.

Maybe this is not an issue that we need to address (or not yet) while trying to deliver this feature. I've explored a few approaches but nothing cleanly solves this problem just yet.

Potential solution

Requiring explicit disabled connections when disabling source or destination:

resource "hookdeck_source_disable" "disable_source" {
  source_id = hookdeck_source.my_source.id
  connection_disable_ids = [
    hookdeck_connection_disable.disable_connection_1.id,
    hookdeck_connection_disable.disable_connection_2.id
  ]
}

The problem with this is we still need to consider whether this array is an exhaustive list of all affected connections or not. And technically, we can query the API and validate, but what if there are connections not managed by Terraform? So we may need to allow non-exhaustive list, but at least it would still encourage a more explicit workflow.

So edge case would be users fail to provide a full list of hookdeck_connection_disable resources, and therefore those connections states are stuck afterwards.

Hopefully, this gives a fair idea of some of the potential issues we should consider when it comes to this feature. Let me know if you have any input or what we should consider here.

@leggetter
Copy link
Collaborator

Personally, I would prefer a dedicated resource to update state because it may cause less confusion to the end users.

I hear what you're saying. But then you add a level of separation between the resource having its state changed and the action of applying that change. I think that would provide a poorer DX from a Terraform management perspective and that adding the virtual properties directly to the resources is a better DX.

google_cloud_scheduler_job also has a paused boolean property.

The Google Cloud Scheduler is also an RPC method, so it is relevant to this scenario:
https://cloud.google.com/scheduler/docs/reference/rest/v1beta1/projects.locations.jobs/pause

Clearly, Google is also working around this with their Terraform provider.

Let's say we have 1 source with 5 connections

1: disable source -> cascade & disable all 5 connections
2: enable source

End state: source is enabled; all 5 connections are disabled

If this is how the API behaves, then that's outside of the Terraform provider. i.e., not something we're trying to solve here.

My suggestion would be:

  1. Add disabled and paused properties to the various resources
  2. On apply, catch those properties and map them to the appropriate RPC API calls (assuming some state change logic would also be applied)
  3. (not sure if this is possible or relevant in TF): when retrieving the value of disabled and paused, use the value of the {action}_at timestamp from the API to determine the value.
  4. Leave the cascading effects of disabling and enabling Connections to the API and read the resulting values after those have taken effect.

@alexluong
Copy link
Collaborator

@leggetter I hear you, let me explore this approach a bit with code and see how things come together!

@leggetter
Copy link
Collaborator

leggetter commented Dec 3, 2024

@leggetter I hear you, let me explore this approach a bit with code and see how things come together!

@alexluong Good call. It may flag other challenges and identify what is and is not possible.

@alexluong
Copy link
Collaborator

alexluong commented Dec 3, 2024

Question: what about this scenario: in Terraform, the user has source.disabled = true and connected connection.disabled = false explicitly?

when retrieving the value of disabled and paused, use the value of the {action}_at timestamp from the API to determine the value.

This is the behavior of a computed field. I'm not sure the exact behavior of how a computed field and a explicit & conflicting value would work.

resource "hookdeck_source" "source" {
  name = "source"
  disabled = true
}

resource "hookdeck_connection" "connection" {
  destination_id = hookdeck_destination.destination.id
  source_id = hookdeck_source.source.id
  disabled = false
}

@leggetter
Copy link
Collaborator

@alexluong, what would the state across the Source and Connection be if you were to do the following via the API:

  1. Disable the Source
  2. Enable the Connection

?

As mentioned earlier, we should leave the Source/Destination/Connection disable logic to the API and just translate the results to the resource properties.

I'm not too sure of the TF lifecycle so maybe is causes a problem?

@alexluong
Copy link
Collaborator

alexluong commented Dec 3, 2024

what would the state across the Source and Connection be if you were to do the following via the API: 1. Disable the Source 2. Enable the Connection?

Then both would become enabled.

As mentioned earlier, we should leave the Source/Destination/Connection disable logic to the API and just translate the results to the resource properties.

My concern is when someone sets the field explicitly, i.e. hard-coding the value in the Terraform code themselves. What if they set an impossible state?

So our idea is if someone selects set the field disabled = true, we'll make an API call to make that property disabled in the API. But what if specifically set the source to be disabled but connection enabled at the same time? We cannot make it so that state exists.

Consider this code snippet that I sent above:

resource "hookdeck_source" "source" {
  name = "source"
  disabled = true
}

resource "hookdeck_connection" "connection" {
  destination_id = hookdeck_destination.destination.id
  source_id = hookdeck_source.source.id
  disabled = false
}

What happens if someone tries to apply this? Each resource works independently so we won't be able to take into account other resource values for validation.

@leggetter
Copy link
Collaborator

leggetter commented Dec 3, 2024

Maybe we should only allow setting Connection state within TF and not Source state? Would this avoid the problem?

If someone changes the state of a Connection, does it cascade down to the Source? Even if that Source is used within other Connections?

@alexbouchardd
Copy link
Contributor

Maybe we should only allow setting Connection state within TF and not Source state? Would this avoid the problem?

This ^

@alexluong
Copy link
Collaborator

alexluong commented Dec 4, 2024

Maybe we should only allow setting Connection state within TF and not Source state? Would this avoid the problem?

Yes, this should avoid the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request priority:medium
Projects
None yet
Development

No branches or pull requests

4 participants