THORChain Bare-Metal Validator  —  Migration to Cosmos-Operator

In May 2025, THORChain announced a migration to the new Cosmos-Operator to help automate and coordinate future Thornode releases. This should be a straightforward migration for anyone following the out-of-the-box THORChain Validator setup. However, many Node Operators following this Bare-Metal Multi-Validator Configuration were concerned about whether the migration of their Validators would go smoothly, and about bad surprises down the road. In this guide, I will go through my understanding of the Cosmos-Operator, as well as the steps I took to complete the migration of my validators.

What is Cosmos-Operator

Cosmos-Operator is a product from Strangelove that helps manage Cosmos nodes; see their repository for details.

There are two components. The first is the Cosmos-Operator itself, a pod that runs in its own namespace; there is only one per Kubernetes cluster. The second is the cosmosfullnode, which is like an ‘advanced’ deployment with more features than a standard Kubernetes deployment, and which replaces the thornode and bifrost deployments.

They work together to change the thornode and bifrost pod images based on block height. The configuration below contains a list of block heights with the corresponding image for thornode and bifrost. Node Operators can update that list manually (via git pull) in advance, so that a new version takes effect at the desired height. Cosmosfullnode will also downgrade to the image configured for an earlier version if the Node Operator restores a pre-upgrade snapshot.

versions:
    - height: 0
      image: registry.gitlab.com/thorchain/thornode:mainnet-3.6.1@sha256:a40c63b1d2c3523aeb9bc2a42f92094ffddce1ffd00d5e888863302448f8cace
      containers:
        bifrost: registry.gitlab.com/thorchain/thornode:mainnet-3.6.1@sha256:a40c63b1d2c3523aeb9bc2a42f92094ffddce1ffd00d5e888863302448f8cace
    # Add chain versions here for future scheduled upgrades and historical sync
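
To make the height-based selection concrete, here is a minimal shell sketch of the rule as I understand it: the image in effect is the one attached to the highest listed height that is not above the current block height. This is an illustration only, not the operator's actual code. The function name pick_image, the upgrade height 21000000 and the shortened image tags are made up for the example.

```shell
# pick_image CURRENT_HEIGHT
# Reads "height image" pairs on stdin and prints the image attached to the
# highest listed height that is <= CURRENT_HEIGHT (assumed selection rule).
pick_image() {
  current="$1"
  best_height=-1
  best_image=""
  while read -r height image; do
    [ -n "$height" ] || continue
    if [ "$height" -le "$current" ] && [ "$height" -gt "$best_height" ]; then
      best_height="$height"
      best_image="$image"
    fi
  done
  printf '%s\n' "$best_image"
}

# Hypothetical list: 3.6.1 from height 0, 3.7.0 scheduled at height 21000000.
versions='0 thornode:mainnet-3.6.1
21000000 thornode:mainnet-3.7.0'

printf '%s\n' "$versions" | pick_image 20999999   # one block before the upgrade
printf '%s\n' "$versions" | pick_image 21000000   # at the upgrade height
```

The first call prints the 3.6.1 image and the second prints the 3.7.0 image. The same rule would explain why restoring a pre-upgrade snapshot brings back the older image.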

What is NOT Cosmos-Operator

Let’s rule out what Cosmos-Operator is NOT:

Cosmos-Operator will NOT git pull any new configuration or download a new thornode version automatically. It works only from the list of heights and images above.
Cosmos-Operator will not scale up a thornode pod when the chain reaches the upgrade height if thornode is scaled down.
Cosmos-Operator will have no effect on any node that was not converted to cosmosfullnode, even if some of the validators on the same Kubernetes cluster were already converted.
Cosmos-Operator will not download a chain snapshot.
Cosmos-Operator will not interfere with other pods running on the same cluster, even if they have similar names (e.g. the Bifrost pod from a Maya install).
Cosmos-Operator will not spin up new external chain daemons automatically when they are released.
Cosmos-Operator will not do anything related to external chain daemons; it ONLY deals with the thornode and bifrost pods.

Side Effects

Thornode is no longer a deployment:
One side effect is that thornode can no longer be referred to as deploy/thornode (e.g. kubectl scale --replicas=0 deploy/thornode -n node1), so this could affect custom scripts.

Here are a few examples of the new kubectl commands for working with thornode manually, such as calling set-ip-address or scaling thornode down and up:

# Scale Down Thornode
kubectl -n node1 patch cosmosfullnodes thornode -p '{"spec":{"instanceOverrides":{"thornode-0":{"disable":"Pod"}}}}' --type=merge

# Scale Up Thornode
kubectl -n node1 patch cosmosfullnodes thornode -p '{"spec":{"instanceOverrides":{"thornode-0":{"disable":null}}}}' --type=merge

# Call set-ip-address with custom ip
kubectl exec -it -n node1 -c node thornode-0 -- /kube-scripts/set-ip-address.sh "<ProxyExternalIP>"
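
For custom scripts that used to call kubectl scale on deploy/thornode, the two patch commands above can be wrapped in a small helper. The following is a sketch under the assumption that the pod is named thornode-0 as shown above; the function names build_override_patch and set_thornode are my own and not part of node-launcher.

```shell
# build_override_patch POD on|off
# Prints the merge-patch JSON used above: "off" disables the pod,
# "on" clears the override so the pod is scheduled again.
build_override_patch() {
  pod="$1"
  state="$2"
  case "$state" in
    off) value='"Pod"' ;;
    on)  value='null' ;;
    *)   echo "usage: build_override_patch <pod> on|off" >&2; return 1 ;;
  esac
  printf '{"spec":{"instanceOverrides":{"%s":{"disable":%s}}}}\n' "$pod" "$value"
}

# set_thornode NAMESPACE on|off
# Applies the patch to the thornode cosmosfullnode in the given namespace.
set_thornode() {
  ns="$1"
  patch="$(build_override_patch thornode-0 "$2")" || return 1
  kubectl -n "$ns" patch cosmosfullnodes thornode --type=merge -p "$patch"
}

# Usage:
#   set_thornode node1 off   # scale down
#   set_thornode node1 on    # scale up
```

Keeping the JSON construction in its own function makes it easy to print and inspect the payload before anything is sent to the cluster.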

Thornode pod is now called thornode-0:
The pod is now called thornode-0, but it is still reachable via thornode.node1.svc.cluster.local:27147.

In addition to the thornode and bifrost services, there are now thornode-rpc and thornode-p2p-0 services.

Prepare Migration — Backup

Let’s be safe and ensure we have a fresh backup:

make backup
-> thornode
make backup
-> bifrost

make mnemonic
-> save somewhere
make password
-> save somewhere

And take a copy of our Git repository, just in case we corrupt something along the way:

cp -r ~/node1 ~/node1_backup20250529

Prepare Migration — Review existing Customizations

If you are like me, you have been carrying some custom configs around for a few years now and may not remember them all. Let’s review all the changes to get familiar with them.

List all customized files

cd ~/node1/node-launcher/

git diff --name-only origin/master

git --no-pager diff origin/master

Review each file’s customisations

git diff origin/master -- bifrost/templates/deployment.yaml
git diff origin/master -- bifrost/values.yaml
git diff origin/master -- gateway/templates/service.yaml
git diff origin/master -- thornode-stack/mainnet.yaml
git diff origin/master -- thornode/templates/deployment.yaml
git diff origin/master -- thornode/values.yaml
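
The list of files above is specific to my setup. As a sketch, the per-file review commands can be generated from the output of git diff --name-only, so that no customized file is skipped. The function name print_review_commands is hypothetical.

```shell
# print_review_commands [BASE]
# Reads file paths on stdin (one per line) and prints one
# "git diff BASE -- path" command per file. BASE defaults to origin/master.
print_review_commands() {
  base="${1:-origin/master}"
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    printf 'git diff %s -- %s\n' "$base" "$path"
  done
}

# Usage (run inside ~/node1/node-launcher):
#   git diff --name-only origin/master | print_review_commands
```

The function only prints the commands, so they can be read first and then run one at a time.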

Save for future reference

git --no-pager diff origin/master > ../gitdiff_20250529.txt
git --no-pager diff -U1000000 origin/master > ../gitdiff_20250529_full.txt
git --no-pager diff --color=always origin/master > ../gitdiff_20250529_color.txt

The file gitdiff_20250529.txt lets us copy and paste things easily, gitdiff_20250529_full.txt gives us the wider context in case we need to see the structure around a config, and gitdiff_20250529_color.txt keeps the colors so we can easily check that we didn’t miss anything.

Look at running services for any Custom Patched External IP

kubectl get svc -n node1
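
To spot patched services quickly, the listing can be filtered down to the services that actually carry an external IP. This sketch assumes the default kubectl get svc column layout, where EXTERNAL-IP is the fourth column; the function name list_external_ips is my own.

```shell
# list_external_ips
# Reads `kubectl get svc` output on stdin, skips the header line, and prints
# "name external-ip" for every service whose EXTERNAL-IP column is set.
list_external_ips() {
  awk 'NR > 1 && $4 != "<none>" && $4 != "<pending>" { print $1, $4 }'
}

# Usage:
#   kubectl get svc -n node1 | list_external_ips
```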

Pull Changes and resolve Conflicts

Let’s go ahead and pull the latest changes from the repository.

cd ~/node1/node-launcher/
git pull --rebase --autostash

Look at Conflicting changes

git diff

* Unmerged path bifrost/templates/deployment.yaml
* Unmerged path bifrost/values.yaml
* Unmerged path thornode/templates/deployment.yaml

git status

On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
 modified:   gateway/templates/service.yaml
 modified:   thornode-stack/mainnet.yaml
 modified:   thornode/values.yaml

Unmerged paths:
  (use "git restore --staged <file>..." to unstage)
  (use "git add/rm <file>..." as appropriate to mark resolution)
 deleted by us:   bifrost/templates/deployment.yaml
 deleted by us:   bifrost/values.yaml
 deleted by us:   thornode/templates/deployment.yaml

Remove the deprecated files

git rm bifrost/templates/deployment.yaml
git rm bifrost/values.yaml
git rm thornode/templates/deployment.yaml
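
If your set of conflicting files differs from mine, the "deleted by us" entries can be read from git status --porcelain instead of being typed by hand. In the porcelain format, those entries carry the status code DU. The function name list_deleted_by_us is hypothetical, and this sketch assumes paths without spaces or special characters.

```shell
# list_deleted_by_us
# Reads `git status --porcelain` output on stdin and prints the paths
# marked DU (unmerged, deleted by us).
list_deleted_by_us() {
  sed -n 's/^DU //p'
}

# Usage (review the list before removing anything):
#   git status --porcelain | list_deleted_by_us
#   git status --porcelain | list_deleted_by_us | while IFS= read -r f; do git rm "$f"; done
```

Before removing these files, make sure their customisations are captured in the diff files saved earlier, since they have to be re-applied elsewhere in the steps that follow.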

Re-Apply Customisations — External_IP

The External_IP that was set in bifrost/templates/deployment.yaml and thornode/templates/deployment.yaml is now set in thornode/templates/cosmosfullnode.yaml.

nano thornode/templates/cosmosfullnode.yaml

Set the Thornode’s External_IP on line 95.

Set the Bifrost’s External_IP on line 280.
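
The line numbers quoted in this guide reflect the repository at the time of writing and may drift as upstream changes. As a sketch, a case-insensitive search with line numbers finds the right spot regardless. The function name find_setting is my own, and the search term EXTERNAL_IP is an assumption about how the variable is spelled in the template.

```shell
# find_setting PATTERN FILE
# Prints every line of FILE that matches PATTERN (case-insensitive),
# prefixed with its line number.
find_setting() {
  grep -n -i -- "$1" "$2"
}

# Usage:
#   find_setting external_ip thornode/templates/cosmosfullnode.yaml
```

The same helper works for the resources, GRPC and End-Point settings in the sections that follow.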

Re-Apply Customisations — Resources

The CPU/RAM resources that were set in bifrost/values.yaml and thornode/values.yaml are now both set in thornode/values.yaml.

nano thornode/values.yaml

Set the Thornode’s Resources on line 97. (should have been merged already)

Set the Bifrost’s Resources on line 284.

Re-Apply Customisation — Thornode’s GRPC

Used when a Thornode Validator is used as a Thornode End-Point.

nano thornode/templates/cosmosfullnode.yaml

Set the Thornode’s GRPC Config on line 130.

Re-Work Customisation — thornode-stack/mainnet

The Shared Daemon End-Points that used to be at the bottom of thornode-stack/mainnet.yaml under the bifrost: key now need to be nested under the thornode: key at the beginning of the file and indented accordingly.

nano thornode-stack/mainnet.yaml

Here is an example of the new layout:

thornode:
  statesync:
    auto: false
    snapshotInterval: 0
  versions:
    - height: 0
      image: registry.gitlab.com/thorchain/thornode:mainnet-3.6.1@sha256:a40c63b1d2c3523aeb9bc2a42f92094ffddce1ffd00d5e888863302448f8cace
      containers:
        bifrost: registry.gitlab.com/thorchain/thornode:mainnet-3.6.1@sha256:a40c63b1d2c3523aeb9bc2a42f92094ffddce1ffd00d5e888863302448f8cace
    # Add chain versions here for future scheduled upgrades and historical sync

# ============================================================================================
# POINT TO SHARED DAEMONS
# ============================================================================================

  bifrost:

  # Binance
    binanceDaemon:
      mainnet: http://192.168.111.222:27147

  # Ethereum
    ethereumDaemon:
      mainnet: http://192.168.111.222:8545

  # Bitcoin
    bitcoinDaemon:
      mainnet: 192.168.111.222:8332

  # Bitcoin-Cash
    bitcoinCashDaemon:
      mainnet: 192.168.111.222:18332

  # Litecoin
    litecoinDaemon:
      mainnet: 192.168.111.222:9332

  # Dogecoin
    dogecoinDaemon:
      mainnet: 192.168.111.222:22555
      
  # Avalanche
    avaxDaemon:
      mainnet: http://192.168.111.222:9650/ext/bc/C/rpc
      
  # Gaia
    gaiaDaemon:
      enabled: true
      mainnet:
        rpc: http://192.168.111.222:26657
        grpc: 192.168.111.222:9090
        grpcTLS: false

  # New Style End-Point
    env:

      # Binance-Smart
      BIFROST_CHAINS_BSC_RPC_HOST: http://192.168.111.222:18545/
      BIFROST_CHAINS_BSC_BLOCK_SCANNER_RPC_HOST: http://192.168.111.222:18545/
      BSC_HOST: http://192.168.111.222:18545/

      # Do not compact the DB at init time; compaction will still run at runtime when needed.
      BIFROST_CHAINS_ETH_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_GAIA_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_DOGE_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_BTC_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_AVAX_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_BSC_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_BNB_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_LTC_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"

# ============================================================================================
# ============================================================================================

bitcoin-daemon:
  enabled: false

litecoin-daemon:
  enabled: false

bitcoin-cash-daemon:
  enabled: false

ethereum-daemon:
  enabled: false

dogecoin-daemon:
  enabled: false

gaia-daemon:
  enabled: false

avalanche-daemon:
  enabled: false

binance-smart-daemon:
  enabled: false

base-daemon:
  enabled: false

xrp-daemon:
  enabled: false

cardano-daemon:
  enabled: false
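
A mistyped octet in one of the shared daemon addresses is easy to miss in a file this long and would leave bifrost unable to reach that chain. As a sketch, the IPv4 addresses in the file can be extracted and sanity-checked before installing. The function name is_valid_ipv4 is my own.

```shell
# is_valid_ipv4 ADDRESS
# Succeeds only if ADDRESS is four dot-separated groups of 1-3 digits,
# each with a value of 255 or less.
is_valid_ipv4() {
  ip="$1"
  printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
  old_ifs="$IFS"
  IFS=.
  set -- $ip          # split the address into its four octets
  IFS="$old_ifs"
  for octet in "$@"; do
    [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# Usage: report every dotted-number address in the file that is not valid.
#   grep -Eo '[0-9]+(\.[0-9]+){3}' thornode-stack/mainnet.yaml | sort -u |
#     while read -r ip; do is_valid_ipv4 "$ip" || echo "suspicious address: $ip"; done
```

This only checks the shape of the address. It does not confirm that the daemon is reachable on that host and port.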

Confirm Customisation — External Chain Daemon

Confirm which daemon to run in this namespace in thornode-stack/mainnet.yaml. (should have been merged already).

nano thornode-stack/mainnet.yaml

Re-Apply Customisation — BASE End-Point

The BASE End-Point that was set in bifrost/values.yaml is now set in thornode/values.yaml.

nano thornode/values.yaml

Set the BASE End-Point on line 157.

(Not required if already in thornode-stack/mainnet)

Re-Apply Customisation — BSC End-Point

The BSC End-Point that was set in thornode-stack/mainnet.yaml is now set in thornode/values.yaml.

nano thornode/values.yaml

Set the BSC End-Point on line 150.

(Not required if already in thornode-stack/mainnet)

Re-Apply Customisation — Level DB Compact on Init

The Level DB Compact on Init setting that was set in thornode-stack/mainnet.yaml is now set in thornode/values.yaml.

nano thornode/values.yaml

Set the Level DB Compact on Init on line 148.

(Not required if already in thornode-stack/mainnet)

Confirm Customisation — MetalLb in Gateway

The Gateway MetalLb IP Mapping is set in gateway/templates/service.yaml.

nano gateway/templates/service.yaml

Confirm Gateway’s MetalLb on line 17. (should have been merged already).

Confirm Customisation — Midgard

Midgard is enabled in thornode-stack/values.yaml.

nano thornode-stack/values.yaml

Confirm midgard enabled on line 26. (should have been merged already).

Cosmos-Operator Prerequisite — Go version 1.23+

Check the currently installed version, if applicable

go version

Remove the previous version (if applicable)

sudo rm -rf /usr/local/go

Install go 1.24.3

wget https://go.dev/dl/go1.24.3.linux-amd64.tar.gz
sudo tar -xvf go1.24.3.linux-amd64.tar.gz
sudo mv go /usr/local
rm go1.24.3.linux-amd64.tar.gz

Add Path (if new installation)

echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc
source ~/.bashrc

Confirm Install

go version
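
For scripted setups across several servers, the version check can be automated. This is a sketch that parses the usual go version output format (for example "go version go1.24.3 linux/amd64") and compares it against the 1.23 minimum stated above. The function name go_version_ok is my own.

```shell
# go_version_ok "OUTPUT_OF_GO_VERSION"
# Succeeds if the reported Go version is 1.23 or newer.
go_version_ok() {
  # Extract "major minor" from text such as "go version go1.24.3 linux/amd64".
  ver="$(printf '%s\n' "$1" | sed -n 's/^go version go\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')"
  [ -n "$ver" ] || return 1
  major="${ver% *}"
  minor="${ver#* }"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 23 ]; }
}

# Usage:
#   if go_version_ok "$(go version 2>/dev/null)"; then
#     echo "Go is recent enough"
#   else
#     echo "Go is missing or older than 1.23"
#   fi
```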

Installation of Cosmos-Operator

Note that for newly created nodes, this will now be part of make tools.

cd ~/node1/node-launcher/
make install-cosmos-operator

Monitor the output for errors. They are not obvious, and the pod will start and look ready even if there are blocking errors, such as go not found, E0529 or E0531 errors, or others. If there is any error, just re-run make install-cosmos-operator.

In addition to confirming that the cosmos-operator pod was created, take the time to confirm that make install-cosmos-operator didn’t output any errors. I had to run it multiple times on most of my servers.
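
Since these errors are easy to overlook in a long output, one option is to save the output to a file and search it afterwards. The sketch below assumes that prefixes such as E0529 are Kubernetes-style error log lines (an E followed by four digits, which look like the month and day), and also flags any line mentioning "error" or "not found". The function name find_install_errors and the log file name are my own.

```shell
# find_install_errors
# Reads install output on stdin and prints, with line numbers, every line that
# starts with an E#### prefix or mentions "error" or "not found".
# Exits non-zero when nothing suspicious is found.
find_install_errors() {
  grep -E -i -n '^E[0-9]{4} |error|not found'
}

# Usage:
#   make install-cosmos-operator 2>&1 | tee install-cosmos-operator.log
#   find_install_errors < install-cosmos-operator.log && echo "errors found, re-run the install"
```

The match is deliberately broad, so expect some false positives; the goal is only to avoid missing a real failure.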

Confirm Pod creation

kubectl -n cosmos-operator-system get pods

Scale down existing thornode pod

Scale down Thornode to avoid having two thornode pods running at once, which would result in heavy slashing.

kubectl scale --replicas=0 deploy/thornode -n node1 --timeout=5m

Confirm the Thornode pod is no longer running

kubectl get pods -n node1
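
Because running two thornode pods at once is so costly, this check is worth scripting as a guard before the install step. The sketch below reads the default kubectl get pods listing and succeeds only if a pod whose name starts with thornode is still listed. The function name thornode_pods_listed is my own.

```shell
# thornode_pods_listed
# Reads `kubectl get pods` output on stdin. Prints "name status" for each pod
# whose name starts with "thornode" and succeeds only if at least one exists.
thornode_pods_listed() {
  awk 'NR > 1 && $1 ~ /^thornode/ { found = 1; print $1, $3 }
       END { exit found ? 0 : 1 }'
}

# Usage:
#   if kubectl get pods -n node1 | thornode_pods_listed; then
#     echo "thornode pod still present, do not run make install yet"
#   fi
```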

Install Changes

Apply all the changes: this removes the old thornode and bifrost deployments and replaces them with cosmosfullnode.

make install

Verify that the Thornode and Bifrost pods scale up.

kubectl get pods -n node1

Verify status

make status

Other Validators

Go through the same process for all other validators.