THORChain Bare-Metal Validator — Migration to Cosmos-Operator
In May 2025, THORChain announced a migration to a new Cosmos-Operator to help automate and coordinate future Thornode releases. While this should be a straightforward migration for anyone following the out-of-the-box THORChain validator setup, many Node Operators following this bare-metal multi-validator configuration were concerned about whether the migration of their validators would go smoothly, and about potential bad surprises down the road. In this guide, I will go through my understanding of the Cosmos-Operator, as well as the steps I took to complete the migration of my validators.
What is Cosmos-Operator
Cosmos-Operator is a product from Strangelove that helps manage Cosmos nodes; see their repo here.
There are two components: first the Cosmos-Operator, a pod that runs in its own namespace (there is only one per Kubernetes cluster), and second the cosmosfullnode, which is like an ‘advanced’ deployment with more features than a standard Kubernetes Deployment, and which replaces the thornode and bifrost deployments.
They work together to change the thornode and bifrost pod images based on block height. As you can see below, the configuration contains a list of block heights with the corresponding image for thornode and bifrost, and it allows Node Operators to manually update that list (via git pull) ahead of time so that the new version takes effect at the desired height. Cosmosfullnode will also downgrade to the previously configured image if the Node Operator downloads a pre-upgrade snapshot.
versions:
  - height: 0
    image: registry.gitlab.com/thorchain/thornode:mainnet-3.6.1@sha256:a40c63b1d2c3523aeb9bc2a42f92094ffddce1ffd00d5e888863302448f8cace
    containers:
      bifrost: registry.gitlab.com/thorchain/thornode:mainnet-3.6.1@sha256:a40c63b1d2c3523aeb9bc2a42f92094ffddce1ffd00d5e888863302448f8cace
  # Add chain versions here for future scheduled upgrades and historical sync
What is NOT Cosmos-Operator
Let’s rule out what Cosmos-Operator is NOT:
Cosmos-Operator is NOT something that will git pull any new configuration nor download a new thornode version automatically. It works only based on the list of heights and images above.
Cosmos-Operator will not scale up the thornode pod when the chain reaches the upgrade height if thornode is scaled down.
Cosmos-Operator will have no effect on any node that was not converted to cosmosfullnode, even if other validators on the same Kubernetes cluster were already converted.
Cosmos-Operator will not download a chain snapshot.
Cosmos-Operator will not interfere with other pods running on the same cluster, even if they have similar names (e.g. the Bifrost pod from a Maya install).
Cosmos-Operator will not spin up new external chain daemons automatically when they are released.
Cosmos-Operator will not do anything related to external chain daemons; it ONLY deals with the thornode and bifrost pods.
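To make the height-based switching concrete, here is a toy stand-alone sketch of the selection rule (illustrative shell only, not the operator's actual Go code): the image used is the one attached to the highest configured height that is less than or equal to the current block height.

```shell
# Toy model of the operator's version selection: given height/image pairs
# in ascending height order, pick the image for the highest height that is
# less than or equal to the current block height.
pick_image() {
  current=$1; shift
  chosen=""
  while [ "$#" -ge 2 ]; do
    h=$1; img=$2; shift 2
    [ "$current" -ge "$h" ] && chosen=$img
  done
  echo "$chosen"
}

# With the list from the example above plus a hypothetical 3.7.0 entry:
pick_image 12345678 0 "thornode:mainnet-3.6.1" 21500000 "thornode:mainnet-3.7.0"
# -> thornode:mainnet-3.6.1
```

This is also why a node that misses the git pull keeps running the old image: nothing in the cluster ever fetches a newer list on its own.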
Side Effects
Thornode is no longer a deployment:
One side effect is that thornode can no longer be referred to as deploy/thornode (e.g. kubectl scale --replicas=0 deploy/thornode -n node1), so this could affect custom scripts.
Here are a few examples of the new kubectl commands to manually operate thornode, such as calling set-ip-address or manually scaling thornode down and up:
# Scale Down Thornode
kubectl -n node1 patch cosmosfullnodes thornode -p '{"spec":{"instanceOverrides":{"thornode-0":{"disable":"Pod"}}}}' --type=merge
# Scale Up Thornode
kubectl -n node1 patch cosmosfullnodes thornode -p '{"spec":{"instanceOverrides":{"thornode-0":{"disable":null}}}}' --type=merge
# Call set-ip-address with custom ip
kubectl exec -it -n node1 -c node thornode-0 -- /kube-scripts/set-ip-address.sh "<ProxyExternalIP>"
Thornode pod is now called thornode-0:
The pod is now called thornode-0, but it is still reachable via thornode.node1.svc.cluster.local:27147.
In addition to the thornode and bifrost services, there are now thornode-rpc and thornode-p2p-0 services.
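Since custom scripts may need to handle both migrated and not-yet-migrated namespaces, a small guard can pick the right command. This is a sketch only, built from the commands shown in this guide; scale_down_thornode is a hypothetical helper, not part of node-launcher:

```shell
# Scale thornode down whether or not the namespace was migrated yet:
# if a cosmosfullnode named thornode exists, patch it with an
# instanceOverride; otherwise fall back to the old deployment command.
scale_down_thornode() {
  ns=$1
  if kubectl -n "$ns" get cosmosfullnodes thornode >/dev/null 2>&1; then
    kubectl -n "$ns" patch cosmosfullnodes thornode --type=merge \
      -p '{"spec":{"instanceOverrides":{"thornode-0":{"disable":"Pod"}}}}'
  else
    kubectl -n "$ns" scale --replicas=0 deploy/thornode
  fi
}
```

Usage would be e.g. `scale_down_thornode node1` before maintenance on that namespace.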
Prepare Migration — Backup
Let’s be safe and ensure we have a fresh backup:
make backup
-> thornode
make backup
-> bifrost
make mnemonic
-> save somewhere
make password
-> save somewhere
And take a copy of our Git repository, in case we corrupt something along the way:
cp -r ~/node1 ~/node1_backup20250529
Prepare Migration — Review existing Customizations
If you are like me, you have been dragging along some custom configs for a few years now, and may not remember them all. Let’s review all changes to become familiar with them.
List all customized files
cd ~/node1/node-launcher/
git diff --name-only origin/master
git --no-pager diff origin/master
Review each file’s customisations
git diff origin/master -- bifrost/templates/deployment.yaml
git diff origin/master -- bifrost/values.yaml
git diff origin/master -- gateway/templates/service.yaml
git diff origin/master -- thornode-stack/mainnet.yaml
git diff origin/master -- thornode/templates/deployment.yaml
git diff origin/master -- thornode/values.yaml
Save for future reference
git --no-pager diff origin/master > ../gitdiff_20250529.txt
git --no-pager diff -U1000000 origin/master > ../gitdiff_20250529_full.txt
git --no-pager diff --color=always origin/master > ../gitdiff_20250529_color.txt
The file gitdiff_20250529.txt allows us to copy-paste things easily, gitdiff_20250529_full.txt gives us the larger scope in case we need to see the structure a config sits in, and gitdiff_20250529_color.txt gives us colored output so we don’t miss anything.
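As a quick illustration of what the huge -U1000000 context window buys you, here is a self-contained demo in a throwaway repository (the file name and values are made up for the demo):

```shell
# Demo: with -U1000000 the entire file is shown as context around the
# change, so each customization is visible inside its surrounding structure.
d=$(mktemp -d)
cd "$d"
git init -q demo && cd demo
printf 'a: 1\nb: 2\nc: 3\n' > cfg.yaml
git add cfg.yaml
git -c user.email=demo@example.com -c user.name=demo commit -qm init
printf 'a: 1\nb: 99\nc: 3\n' > cfg.yaml
git --no-pager diff -U1000000 -- cfg.yaml
```

The unchanged lines `a: 1` and `c: 3` appear as context around the `b:` change, which is exactly what makes the _full variant useful for re-applying customisations later.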
Look at running services for any Custom Patched External IP
kubectl get svc -n node1
Pull Changes and resolve Conflicts
Let’s go ahead and pull the latest changes from the repository.
cd ~/node1/node-launcher/
git pull --rebase --autostash
Look at Conflicting changes
git diff
* Unmerged path bifrost/templates/deployment.yaml
* Unmerged path bifrost/values.yaml
* Unmerged path thornode/templates/deployment.yaml
git status
On branch master
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: gateway/templates/service.yaml
modified: thornode-stack/mainnet.yaml
modified: thornode/values.yaml
Unmerged paths:
(use "git restore --staged <file>..." to unstage)
(use "git add/rm <file>..." as appropriate to mark resolution)
deleted by us: bifrost/templates/deployment.yaml
deleted by us: bifrost/values.yaml
deleted by us: thornode/templates/deployment.yaml
Remove the deprecated files
git rm bifrost/templates/deployment.yaml
git rm bifrost/values.yaml
git rm thornode/templates/deployment.yaml
Re-Apply Customisations — External_IP
The External_IP that was set in bifrost/templates/deployment.yaml and thornode/templates/deployment.yaml is now set in thornode/templates/cosmosfullnode.yaml.
nano thornode/templates/cosmosfullnode.yaml
Set the Thornode’s External_IP on line 95.
Set the Bifrost’s External_IP on line 280.
Re-Apply Customisations — Resources
The CPU/RAM settings that were in bifrost/values.yaml and thornode/values.yaml are now both set in thornode/values.yaml.
nano thornode/values.yaml
Set the Thornode’s Resources on line 97 (should have been merged already).
Set the Bifrost’s Resources on line 284.
Re-Apply Customisation — Thornode’s GRPC
Used when a Thornode validator is also used as a Thornode endpoint.
nano thornode/templates/cosmosfullnode.yaml
Set the Thornode’s GRPC config on line 130.
Re-Work Customisation — thornode-stack/mainnet
The Shared Daemon End-Points that used to be at the bottom of thornode-stack/mainnet.yaml under the bifrost: tag now need to be nested under the thornode: tag at the beginning of the file, and indented accordingly.
nano thornode-stack/mainnet.yaml
Here is an example of the new layout:
thornode:
  statesync:
    auto: false
    snapshotInterval: 0
  versions:
    - height: 0
      image: registry.gitlab.com/thorchain/thornode:mainnet-3.6.1@sha256:a40c63b1d2c3523aeb9bc2a42f92094ffddce1ffd00d5e888863302448f8cace
      containers:
        bifrost: registry.gitlab.com/thorchain/thornode:mainnet-3.6.1@sha256:a40c63b1d2c3523aeb9bc2a42f92094ffddce1ffd00d5e888863302448f8cace
    # Add chain versions here for future scheduled upgrades and historical sync
  # ============================================================================================
  # POINT TO SHARED DAEMONS
  # ============================================================================================
  bifrost:
    # Binance
    binanceDaemon:
      mainnet: http://192.168.111.222:27147
    # Ethereum
    ethereumDaemon:
      mainnet: http://192.168.111.222:8545
    # Bitcoin
    bitcoinDaemon:
      mainnet: 192.168.111.222:8332
    # Bitcoin-Cash
    bitcoinCashDaemon:
      mainnet: 192.168.111.222:18332
    # Litecoin
    litecoinDaemon:
      mainnet: 192.168.111.222:9332
    # Dogecoin
    dogecoinDaemon:
      mainnet: 192.168.111.222:22555
    # Avalanche
    avaxDaemon:
      mainnet: http://192.168.111.222:9650/ext/bc/C/rpc
    # Gaia
    gaiaDaemon:
      enabled: true
      mainnet:
        rpc: http://192.168.111.222:26657
        grpc: 192.168.111.222:9090
        grpcTLS: false
    # New Style End-Point
    env:
      # Binance-Smart
      BIFROST_CHAINS_BSC_RPC_HOST: http://192.168.111.222:18545/
      BIFROST_CHAINS_BSC_BLOCK_SCANNER_RPC_HOST: http://192.168.111.222:18545/
      BSC_HOST: http://192.168.111.222:18545/
      # Do not db-compact at init time; it will still run randomly at runtime when needed.
      BIFROST_CHAINS_ETH_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_GAIA_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_DOGE_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_BTC_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_AVAX_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_BSC_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_BNB_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
      BIFROST_CHAINS_LTC_SCANNER_LEVELDB_COMPACT_ON_INIT: "false"
# ============================================================================================
# ============================================================================================
bitcoin-daemon:
  enabled: false
litecoin-daemon:
  enabled: false
bitcoin-cash-daemon:
  enabled: false
ethereum-daemon:
  enabled: false
dogecoin-daemon:
  enabled: false
gaia-daemon:
  enabled: false
avalanche-daemon:
  enabled: false
binance-smart-daemon:
  enabled: false
base-daemon:
  enabled: false
xrp-daemon:
  enabled: false
cardano-daemon:
  enabled: false
Confirm Customisation — External Chain Daemon
Confirm which daemons to run in this namespace in thornode-stack/mainnet.yaml (should have been merged already).
nano thornode-stack/mainnet.yaml
Re-Apply Customisation — BASE End-Point
The BASE End-Point that was set in bifrost/values.yaml is now set in thornode/values.yaml.
nano thornode/values.yaml
Set the BASE End-Point on line 157.
(Not required if already in thornode-stack/mainnet)
Re-Apply Customisation — BSC End-Point
The BSC End-Point that was set in thornode-stack/mainnet.yaml is now set in thornode/values.yaml.
nano thornode/values.yaml
Set the BSC End-Point on line 150.
(Not required if already in thornode-stack/mainnet)
Re-Apply Customisation — Level DB Compact on Init
The Level DB Compact on Init setting that was in thornode-stack/mainnet.yaml is now set in thornode/values.yaml.
nano thornode/values.yaml
Set the Level DB Compact on Init on line 148.
(Not required if already in thornode-stack/mainnet)
Confirm Customisation — MetalLB in Gateway
The Gateway MetalLB IP mapping is set in gateway/templates/service.yaml.
nano gateway/templates/service.yaml
Confirm the Gateway’s MetalLB mapping on line 17 (should have been merged already).
Confirm Customisation — Midgard
Midgard enabling is set in thornode-stack/values.yaml.
nano thornode-stack/values.yaml
Confirm midgard is enabled on line 26 (should have been merged already).
Cosmos-Operator Prerequisite — Go version 1.23+
Check the current installation version, if applicable
go version
Remove the previous version (if applicable)
sudo rm -rf /usr/local/go
Install Go 1.24.3
wget https://go.dev/dl/go1.24.3.linux-amd64.tar.gz
sudo tar -xvf go1.24.3.linux-amd64.tar.gz
sudo mv go /usr/local
rm go1.24.3.linux-amd64.tar.gz
Add Path (if new installation)
echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc
source ~/.bashrc
Confirm Install
go version
Installation of Cosmos-Operator
Note that for newly created nodes, this is now part of make tools.
cd ~/node1/node-launcher/
make install-cosmos-operator
Monitor the output for errors: they are not obvious, and the pod will start and look ready even when there are blocking errors, such as "go not found", E0529 or E0531 errors, or others. If there is any error, just re-run make install-cosmos-operator.
In addition to confirming the cosmos-operator pod creation, take the time to confirm that make install-cosmos-operator didn’t output any errors. I had to run it multiple times on most of my servers.
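To make that error check repeatable, the log scan can be wrapped in a small helper. This is a sketch only: the control-plane=controller-manager label selector is the kubebuilder default and an assumption on my part, so adjust it to match how your operator pods are actually labelled.

```shell
# Scan the operator's recent logs for the error patterns mentioned above
# (E05xx-style errors, or anything containing "error").
check_operator_logs() {
  kubectl -n cosmos-operator-system logs --tail=200 \
    -l control-plane=controller-manager \
    | grep -iE "error|E05[0-9]{2}" \
    || echo "no matching errors in recent logs"
}
```

Running `check_operator_logs` after each make install-cosmos-operator attempt makes it obvious whether a re-run is needed.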
Confirm Pod creation
kubectl -n cosmos-operator-system get pods
Scale down existing thornode pod
Scale down thornode to avoid having two thornode pods running, which would result in a heavy slash.
kubectl scale --replicas=0 deploy/thornode -n node1 --timeout=5m
Confirm Thornode pod is no longer running
kubectl get pods -n node1
Install Changes
Apply all the changes, removing the old thornode and bifrost deployments and replacing them with cosmosfullnode.
make install
Verify Thornode and Bifrost pods scale up.
kubectl get pods -n node1
Verify status
make status
Other Validators
Go through the same process for all other validators.