Add validator flow #1711

Merged
otherview merged 1 commit into main from pedro/add_validator
Dec 20, 2023
Conversation

otherview
Contributor

Why this change is needed

Please provide a description and a link to the underlying ticket

What changes were made as part of this PR

Please provide a high level list of the changes made

PR checks pre-merging

Please indicate below by ticking the checkbox that you have read and performed the required
PR checks

  • PR checks reviewed and performed


coderabbitai bot commented Dec 20, 2023

Walkthrough

The update introduces a GitHub Actions workflow aimed at streamlining the deployment of a Testnet Validator on Azure. This workflow is manually triggered and utilizes user inputs to execute a sequence of jobs: building the validator, deploying it to Azure, and updating the load balancer settings to accommodate the new validator. This automation simplifies the process of setting up and maintaining a Testnet Validator in the cloud environment.

Changes

File | Change Summary
.github/workflows/.../manual-deploy-testnet-validator.yml | Added a new GitHub Actions workflow for manual deployment of a Testnet Validator on Azure, including jobs for building, deploying, and updating the load balancer.



Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • You can reply to a review comment made by CodeRabbit.
  • You can tag CodeRabbit on specific lines of code or files in the PR by tagging @coderabbitai in a comment.
  • You can tag @coderabbitai in a PR comment and ask one-off questions about the PR and the codebase. Use quoted replies to pass the context for follow-up questions.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

@otherview otherview merged commit 63ba499 into main Dec 20, 2023
2 checks passed
@otherview otherview deleted the pedro/add_validator branch December 20, 2023 11:44

@coderabbitai coderabbitai bot left a comment


Review Status

Actionable comments generated: 3

Configuration used: CodeRabbit UI

Commits: Files that changed from the base of the PR and between 98e0913 and 10373a0.
Files selected for processing (1)
  • .github/workflows/manual-deploy-testnet-validator.yml (1 hunks)
Additional comments: 2
.github/workflows/manual-deploy-testnet-validator.yml (2)
  • 60-97: The Docker build and push steps seem to be using Docker BuildKit, which is good for performance and security. However, ensure that the --build-arg TESTMODE=true is intended for production deployment and does not enable any insecure test features.

  • 188-215: The update to the load balancer should be done with minimal downtime. Verify that the scripts used (testnet-clear-loadbalancer.sh and the Azure CLI commands) handle the update atomically and do not result in service disruption.

Comment on lines +9 to +57
on:
  workflow_dispatch:
    inputs:
      testnet_type:
        description: 'Testnet Type'
        required: true
        default: 'dev-testnet'
        type: choice
        options:
          - 'dev-testnet'
          - 'uat-testnet'
          - 'sepolia-testnet'
      log_level:
        description: 'Log Level 1-Error 5-Trace'
        required: true
        default: 3
        type: number
      node_private_key:
        description: 'Node Private Key'
        required: true
        type: string
      node_account_address:
        description: 'Node Account Address'
        required: true
        type: string
      node_l1_ws_url:
        description: 'Node L1 Connection String'
        required: true
        type: string
      MGMT_CONTRACT_ADDR:
        description: 'Management Contract Addr'
        required: true
        type: string
      MSG_BUS_CONTRACT_ADDR:
        description: 'Message bus Contract Addr'
        required: true
        type: string
      L1_START_HASH:
        description: 'L1 Starting Hash'
        required: true
        type: string
      HOC_ERC20_ADDR:
        description: 'HOC ERC20 Contract Addr'
        required: true
        type: string
      POC_ERC20_ADDR:
        description: 'POC ERC20 Contract Addr'
        required: true
        type: string

Ensure that all input parameters, especially sensitive ones like node_private_key, are handled securely and are not logged or exposed inappropriately. It's good practice to avoid echoing sensitive information in logs.
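
One way to act on this comment is to register sensitive values with the runner's log masker before they are used. The sketch below relies on the `::add-mask::` workflow command; the helper name `mask_secret` is hypothetical and not part of this PR. Masking only affects log output, and `workflow_dispatch` inputs are not treated as secrets by default, so moving the private key into an environment secret is likely the more robust option.

```shell
#!/usr/bin/env bash
# Sketch: register a value with the GitHub Actions log masker.
# mask_secret is a hypothetical helper name, not taken from the workflow.
mask_secret() {
  local value="$1"
  # Refuse empty values so a missing input fails loudly instead of silently.
  if [ -z "$value" ]; then
    echo "mask_secret: empty value" >&2
    return 1
  fi
  # Printing this workflow command to stdout asks the runner to replace
  # later occurrences of the value in the job log with ***.
  printf '::add-mask::%s\n' "$value"
}
```

In a step, the value would typically be passed through `env:` and masked first, for example `mask_secret "$NODE_PRIVATE_KEY"`, where `NODE_PRIVATE_KEY` is an assumed variable name.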

Comment on lines +99 to +185
  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: ${{ github.event.inputs.testnet_type }}

    steps:
      - name: 'Extract branch name'
        shell: bash
        run: |
          echo "Branch Name: ${GITHUB_REF_NAME}"
          echo "BRANCH_NAME=${GITHUB_REF_NAME}" >> $GITHUB_ENV

      - name: 'Login via Azure CLI'
        uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      - name: 'Create VM for Obscuro node-${{ matrix.host_id }} on Azure'
        uses: azure/CLI@v1
        with:
          inlineScript: |
            az vm create -g Testnet -n "${{ vars.AZURE_RESOURCE_PREFIX }}-${{ matrix.host_id }}-${{ GITHUB.RUN_NUMBER }}" \
            --admin-username obscurouser --admin-password "${{ secrets.OBSCURO_NODE_VM_PWD }}" \
            --public-ip-address-dns-name "obscuronode-${{ matrix.host_id }}-${{ github.event.inputs.testnet_type }}-${{ GITHUB.RUN_NUMBER }}" \
            --tags deploygroup=ObscuroNode-${{ github.event.inputs.testnet_type }}-${{ GITHUB.RUN_NUMBER }} ${{ vars.AZURE_DEPLOY_GROUP_L2 }}=true \
            --vnet-name ${{ github.event.inputs.testnet_type }}-virtual-network --subnet ${{ github.event.inputs.testnet_type }}-sub-network \
            --size Standard_DC8_v2 --storage-sku StandardSSD_LRS --image ObscuroConfUbuntu \
            --public-ip-sku Standard --authentication-type password

      - name: 'Open Obscuro node-${{ matrix.host_id }} ports on Azure'
        uses: azure/CLI@v1
        with:
          inlineScript: |
            az vm open-port -g Testnet -n "${{ vars.AZURE_RESOURCE_PREFIX }}-${{ matrix.host_id }}-${{ GITHUB.RUN_NUMBER }}" --port 80,81,6060,6061,10000

      # To overcome issues with critical VM resources being unavailable, we need to wait for the VM to be ready
      - name: 'Allow time for VM initialization'
        shell: bash
        run: sleep 60

      - name: 'Start Obscuro node-${{ matrix.host_id }} on Azure'
        uses: azure/CLI@v1
        with:
          inlineScript: |
            az vm run-command invoke -g Testnet -n "${{ vars.AZURE_RESOURCE_PREFIX }}-${{ matrix.host_id }}-${{ GITHUB.RUN_NUMBER }}" \
            --command-id RunShellScript \
            --scripts 'mkdir -p /home/obscuro \
            && git clone --depth 1 -b ${{ env.BRANCH_NAME }} https://github.com/ten-protocol/go-ten.git /home/obscuro/go-obscuro \
            && docker network create --driver bridge node_network || true \
            && docker run -d --name datadog-agent \
            --network node_network \
            -e DD_API_KEY=${{ secrets.DD_API_KEY }} \
            -e DD_LOGS_ENABLED=true \
            -e DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true \
            -e DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION=true \
            -e DD_CONTAINER_EXCLUDE_LOGS="name:datadog-agent" \
            -e DD_SITE="datadoghq.eu" \
            -v /var/run/docker.sock:/var/run/docker.sock:ro \
            -v /proc/:/host/proc/:ro \
            -v /opt/datadog-agent/run:/opt/datadog-agent/run:rw \
            -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
            datadog/agent:latest \
            && cd /home/obscuro/go-obscuro/ \
            && sudo go run /home/obscuro/go-obscuro/go/node/cmd \
            -is_genesis=false \
            -node_type=validator \
            -is_sgx_enabled=true \
            -host_id=${{ github.event.inputs.node_account_address }} \
            -l1_ws_url=${{ github.event.inputs.node_l1_ws_url }} \
            -management_contract_addr=${{ github.event.inputs.MGMT_CONTRACT_ADDR }} \
            -message_bus_contract_addr=${{ github.event.inputs.MSG_BUS_CONTRACT_ADDR }} \
            -l1_start=${{ github.event.inputs.L1_START_HASH }} \
            -private_key=${{ github.event.inputs.node_private_key }} \
            -sequencer_id=${{ vars.ACCOUNT_ADDR_NODE_0 }} \
            -host_public_p2p_addr=obscuronode-${{ matrix.host_id }}-${{ github.event.inputs.testnet_type }}-${{ GITHUB.RUN_NUMBER }}.uksouth.cloudapp.azure.com:10000 \
            -host_p2p_port=10000 \
            -enclave_docker_image=${{ vars.L2_ENCLAVE_DOCKER_BUILD_TAG }} \
            -host_docker_image=${{ vars.L2_HOST_DOCKER_BUILD_TAG }} \
            -is_debug_namespace_enabled=true \
            -log_level=${{ github.event.inputs.log_level }} \
            -batch_interval=${{ vars.L2_BATCH_INTERVAL }} \
            -max_batch_interval=${{ vars.L2_MAX_BATCH_INTERVAL }} \
            -rollup_interval=${{ vars.L2_ROLLUP_INTERVAL }} \
            -l1_chain_id=${{ vars.L1_CHAIN_ID }} \
            start'

The VM creation step uses a password for authentication (--authentication-type password). It's recommended to use SSH keys instead of passwords for better security. Additionally, ensure that the VM image ObscuroConfUbuntu is up-to-date with security patches.

- --authentication-type password
+ --authentication-type ssh
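
Switching the flag alone is probably not enough: with key-based login, `az vm create` also needs a public key (for example through `--ssh-key-values`), and the `--admin-password` argument would no longer apply. The sketch below assembles such an argument list without calling Azure. The helper `build_vm_create_args` and its two parameters are hypothetical; the resource group, user name, size, and image are copied from the workflow, and the networking and tag flags are omitted for brevity.

```shell
#!/usr/bin/env bash
# Sketch: build `az vm create` arguments for SSH key authentication.
# build_vm_create_args is a hypothetical helper; it only prints arguments,
# one per line, and never contacts Azure.
build_vm_create_args() {
  local vm_name="$1" ssh_pub_key_file="$2"
  local args=(
    vm create -g Testnet -n "$vm_name"
    --admin-username obscurouser
    --size Standard_DC8_v2 --storage-sku StandardSSD_LRS
    --image ObscuroConfUbuntu
    --public-ip-sku Standard
    # Key-based login: no --admin-password is passed.
    --authentication-type ssh
    --ssh-key-values "$ssh_pub_key_file"
  )
  printf '%s\n' "${args[@]}"
}
```

A caller could read the list back with `mapfile -t args < <(build_vm_create_args my-vm ./id_rsa.pub)` and then run `az "${args[@]}"`; both the VM name and the key path here are placeholders.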

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: ${{ github.event.inputs.testnet_type }}
    steps:
      - name: 'Extract branch name'
        shell: bash
        run: |
          echo "Branch Name: ${GITHUB_REF_NAME}"
          echo "BRANCH_NAME=${GITHUB_REF_NAME}" >> $GITHUB_ENV
      - name: 'Login via Azure CLI'
        uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - name: 'Create VM for Obscuro node-${{ matrix.host_id }} on Azure'
        uses: azure/CLI@v1
        with:
          inlineScript: |
            az vm create -g Testnet -n "${{ vars.AZURE_RESOURCE_PREFIX }}-${{ matrix.host_id }}-${{ GITHUB.RUN_NUMBER }}" \
            --admin-username obscurouser --admin-password "${{ secrets.OBSCURO_NODE_VM_PWD }}" \
            --public-ip-address-dns-name "obscuronode-${{ matrix.host_id }}-${{ github.event.inputs.testnet_type }}-${{ GITHUB.RUN_NUMBER }}" \
            --tags deploygroup=ObscuroNode-${{ github.event.inputs.testnet_type }}-${{ GITHUB.RUN_NUMBER }} ${{ vars.AZURE_DEPLOY_GROUP_L2 }}=true \
            --vnet-name ${{ github.event.inputs.testnet_type }}-virtual-network --subnet ${{ github.event.inputs.testnet_type }}-sub-network \
            --size Standard_DC8_v2 --storage-sku StandardSSD_LRS --image ObscuroConfUbuntu \
            --public-ip-sku Standard --authentication-type ssh
      - name: 'Open Obscuro node-${{ matrix.host_id }} ports on Azure'
        uses: azure/CLI@v1
        with:
          inlineScript: |
            az vm open-port -g Testnet -n "${{ vars.AZURE_RESOURCE_PREFIX }}-${{ matrix.host_id }}-${{ GITHUB.RUN_NUMBER }}" --port 80,81,6060,6061,10000
      # To overcome issues with critical VM resources being unavailable, we need to wait for the VM to be ready
      - name: 'Allow time for VM initialization'
        shell: bash
        run: sleep 60
      - name: 'Start Obscuro node-${{ matrix.host_id }} on Azure'
        uses: azure/CLI@v1
        with:
          inlineScript: |
            az vm run-command invoke -g Testnet -n "${{ vars.AZURE_RESOURCE_PREFIX }}-${{ matrix.host_id }}-${{ GITHUB.RUN_NUMBER }}" \
            --command-id RunShellScript \
            --scripts 'mkdir -p /home/obscuro \
            && git clone --depth 1 -b ${{ env.BRANCH_NAME }} https://github.com/ten-protocol/go-ten.git /home/obscuro/go-obscuro \
            && docker network create --driver bridge node_network || true \
            && docker run -d --name datadog-agent \
            --network node_network \
            -e DD_API_KEY=${{ secrets.DD_API_KEY }} \
            -e DD_LOGS_ENABLED=true \
            -e DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true \
            -e DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION=true \
            -e DD_CONTAINER_EXCLUDE_LOGS="name:datadog-agent" \
            -e DD_SITE="datadoghq.eu" \
            -v /var/run/docker.sock:/var/run/docker.sock:ro \
            -v /proc/:/host/proc/:ro \
            -v /opt/datadog-agent/run:/opt/datadog-agent/run:rw \
            -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
            datadog/agent:latest \
            && cd /home/obscuro/go-obscuro/ \
            && sudo go run /home/obscuro/go-obscuro/go/node/cmd \
            -is_genesis=false \
            -node_type=validator \
            -is_sgx_enabled=true \
            -host_id=${{ github.event.inputs.node_account_address }} \
            -l1_ws_url=${{ github.event.inputs.node_l1_ws_url }} \
            -management_contract_addr=${{ github.event.inputs.MGMT_CONTRACT_ADDR }} \
            -message_bus_contract_addr=${{ github.event.inputs.MSG_BUS_CONTRACT_ADDR }} \
            -l1_start=${{ github.event.inputs.L1_START_HASH }} \
            -private_key=${{ github.event.inputs.node_private_key }} \
            -sequencer_id=${{ vars.ACCOUNT_ADDR_NODE_0 }} \
            -host_public_p2p_addr=obscuronode-${{ matrix.host_id }}-${{ github.event.inputs.testnet_type }}-${{ GITHUB.RUN_NUMBER }}.uksouth.cloudapp.azure.com:10000 \
            -host_p2p_port=10000 \
            -enclave_docker_image=${{ vars.L2_ENCLAVE_DOCKER_BUILD_TAG }} \
            -host_docker_image=${{ vars.L2_HOST_DOCKER_BUILD_TAG }} \
            -is_debug_namespace_enabled=true \
            -log_level=${{ github.event.inputs.log_level }} \
            -batch_interval=${{ vars.L2_BATCH_INTERVAL }} \
            -max_batch_interval=${{ vars.L2_MAX_BATCH_INTERVAL }} \
            -rollup_interval=${{ vars.L2_ROLLUP_INTERVAL }} \
            -l1_chain_id=${{ vars.L1_CHAIN_ID }} \
            start'

Comment on lines +217 to +230
  check-obscuro-is-healthy:
    needs:
      - build
      - deploy
    runs-on: ubuntu-latest
    environment:
      name: ${{ github.event.inputs.testnet_type }}
    steps:
      - uses: actions/checkout@v3

      - name: "Wait until obscuro node is healthy"
        shell: bash
        run: |
          ./.github/workflows/runner-scripts/wait-node-healthy.sh --host=obscuronode-0-${{ github.event.inputs.testnet_type }}-${{ GITHUB.RUN_NUMBER }}.uksouth.cloudapp.azure.com

The health check steps should have a timeout to prevent indefinite waiting in case the nodes do not become healthy. Ensure that the wait-node-healthy.sh script has a reasonable timeout and error handling mechanism.
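
The contents of wait-node-healthy.sh are not shown in this PR, so the following is only a sketch of the kind of bounded polling loop the comment asks for. The helper name `wait_until_healthy` and its arguments are hypothetical; the check command is supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch: poll a health check until it succeeds or a deadline passes.
# wait_until_healthy is a hypothetical helper; the real script may differ.
wait_until_healthy() {
  local timeout_secs="$1" interval_secs="$2"
  shift 2
  # SECONDS is a bash builtin counting seconds since the shell started.
  local deadline=$(( SECONDS + timeout_secs ))
  # "$@" is the check command, e.g. a curl call against the node.
  until "$@"; do
    if (( SECONDS >= deadline )); then
      echo "health check timed out after ${timeout_secs}s" >&2
      return 1
    fi
    sleep "$interval_secs"
  done
}
```

For example, `wait_until_healthy 600 10 curl -fsS "http://$HOST/"` would retry every 10 seconds for up to 10 minutes, where the URL is a placeholder rather than the node's actual health endpoint. Setting `timeout-minutes` on the step or job is a complementary safeguard at the workflow level.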

2 participants