Tools: Ceph Public Network Migration (No Downtime)

Source: Dev.to

# Ceph Public Network Migration (Proxmox)

## 📌 Context

**Migration:** `172.16.0.0/16` → `10.50.0.0/24`
**Constraints:** No service downtime, no data loss

This procedure documents a live Ceph public network migration performed on a Proxmox-backed Ceph cluster. The goal was to eliminate management-network congestion while maintaining cluster availability and data integrity.

## 🎯 Objective

Migrate all Ceph traffic (MON, MGR, MDS, OSD front + back) from a congested management network to a dedicated Ceph fabric (e.g. a 2.5 GbE switch), while keeping the cluster healthy and online.

## 🧱 Key Concepts (Read Once)

### public_network

- Client ↔ OSD traffic
- MON / MGR control plane
- CephFS metadata traffic

### cluster_network

- OSD ↔ OSD replication & recovery (data plane)

### Important behaviours

- MON & MGR enforce address validation
- OSDs bind addresses at restart
- `/etc/pve/ceph.conf` is not authoritative on its own; Ceph also keeps an internal config database

## 1️⃣ Prepare the New Ceph Network

Create a dedicated bridge on each node (example: `vmbr-ceph`):

```
vim /etc/network/interfaces
```

```
auto vmbr-ceph
iface vmbr-ceph inet static
    address 10.50.0.20/24
    bridge-ports eno2
    bridge-stp off
    bridge-fd 0
# Ceph (Fabric)
```

Assign IPs on the new subnet:

- pve2 → 10.50.0.20/24
- pve3 → 10.50.0.30/24
- pve4 → 10.50.0.40/24

Ensure this network is isolated (no gateway required).

### Verify connectivity

```
ping 10.50.0.30
iperf3 -s / -c <peer>
```

## 2️⃣ Add the New Public Network (Dual-Network Phase)

NOTE: Back up the file first:

```
cp /etc/pve/ceph.conf /etc/pve/ceph.conf.bak
```

Edit `/etc/pve/ceph.conf`:

```
public_network = 10.50.0.0/24, 172.16.0.0/16
cluster_network = 10.50.0.0/24, 172.16.0.0/16
```

⚠️ Do NOT remove the old network yet: MONs enforce network validation.

## 3️⃣ Recreate MONs (One by One)

```
pveceph mon destroy <node>
pveceph mon create
ceph -s
```

✔ Ensure quorum after each step.

## 4️⃣ Recreate MGRs (One by One)

- Recreate standby managers first
- Leave the active manager for last

```
pveceph mgr destroy <node>
pveceph mgr create
```

Check the result with:

```
ceph mgr dump
```

## 🔧 Recovery Tip

If a manager fails to start:

```
systemctl reset-failed ceph-mgr@<node>
systemctl start ceph-mgr@<node>
```

## 5️⃣ Recreate CephFS Metadata Servers (MDS)

MDS binds its address at creation time:

```
pveceph mds destroy <node>
pveceph mds create
```

✔ Verify CephFS health before proceeding.

## 6️⃣ Remove the Old Public Network

Edit `/etc/pve/ceph.conf` and remove `172.16.0.0/16`:

```
public_network = 10.50.0.0/24
cluster_network = 10.50.0.0/24
```

## 7️⃣ Recreate MONs, MGRs, and MDS (Again)

This ensures all control-plane daemons bind exclusively to the new network:

- MONs (one by one)
- MGRs (standbys first, active last)
- MDS (one by one)

## 8️⃣ Protect the Cluster Before Touching OSDs

```
ceph osd set noout
```

## 9️⃣ Restart OSDs (Data Plane Migration)

Restart one OSD at a time:

```
systemctl restart ceph-osd@<id>
ceph -s
```

Wait until all placement groups report:

```
PGs: active+clean
```

## 🔟 Remove Protection

```
ceph osd unset noout
```

## 🔎 Verification (Critical)

### 1️⃣ Verify Ceph daemon addresses

- Proxmox UI → Ceph → Nodes
- `ceph config dump`

```
ceph osd metadata <id> | egrep 'front_addr|back_addr'
```

Expected:

- ✅ front_addr → 10.50.0.x
- ✅ back_addr → 10.50.0.x
- ❌ No 172.16.x.x

### 2️⃣ Verify traffic is using the Ceph fabric

While Ceph is under load:

```
ip -s link show vmbr-ceph
```

RX/TX counters should increase, confirming traffic is not using the management network.

### 3️⃣ Verify raw network performance (iperf3)

⚠️ Important: iperf3 must be installed on all Ceph nodes to test the fabric correctly.

```
apt install iperf3
```

Correct testing method:

- Server on one node:

```
iperf3 -s
```

- Client on a different node:

```
iperf3 -c <peer_ip> -P 4
```

Expected for a 2.5 GbE Ceph fabric:

- ~2.1–2.4 Gbit/s
- Minimal or zero retransmits
- Stable throughput across multiple streams

## 🚨 Troubleshooting: “OSDs Not Reachable / Wrong Subnet”

### Symptom

```
osd.X's public address is not in '172.16.x.x/16' subnet
```

Cause: the Ceph config DB or MON/MGR cache still references the old network.

### Fix (Critical)

#### Restart ALL MONs (mandatory)

```
systemctl restart ceph-mon@pve2
systemctl restart ceph-mon@pve3
systemctl restart ceph-mon@pve4
```

#### Restart ALL MGRs (mandatory)

```
systemctl restart ceph-mgr@pve2
systemctl restart ceph-mgr@pve3
systemctl restart ceph-mgr@pve4
```

#### (Optional) Clean config DB

```
ceph config rm global public_network
ceph config rm global cluster_network
ceph config set global public_network 10.50.0.0/24
ceph config set global cluster_network 10.50.0.0/24
```

Then restart OSDs again (one by one).

✔ This should resolve any “OSDs missing / wrong subnet” cases.

## ⚠️ Risks Considered

### Why this change is risky

Changing Ceph cluster networking affects quorum, OSD availability, replication traffic, and client IO. Incorrect sequencing can cause data unavailability or permanent loss.

### Failure modes considered

- MON quorum loss
- OSD flapping
- Client IO stalls
- Backfill storms
- Split-brain conditions

### Assumptions

- Single Ceph cluster
- Dedicated replication network (fabric)
- Change executed during a low-IO window

## ✅ Final State

- Dedicated Ceph fabric (2.5 GbE)
- No Ceph traffic on the management NIC
- MON / MGR / MDS / OSD fully migrated
- No data loss
- Stable cluster

## 🙏 Acknowledgements

This migration approach was heavily informed by the following Proxmox forum discussion, which proved critical in resolving address-binding and daemon recreation issues during the Ceph public network transition:

- Proxmox Forum – “Ceph: changing public network”
  https://forum.proxmox.com/threads/ceph-changing-public-network.119116/

In particular, the guidance around:

- Temporarily running dual public networks
- Recreating MON, MGR, and MDS daemons to force address rebinding
- Avoiding full cluster downtime during network migration

was instrumental in achieving a clean, no-data-loss migration. Many thanks to the contributors in that thread for sharing real-world operational experience.
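As a convenience for the address verification step, the `front_addr`/`back_addr` strings from `ceph osd metadata` can also be checked offline with Python's stdlib `ipaddress` module. This is a minimal sketch, not part of the original procedure: the field names match Ceph's metadata output, but the sample address values below are illustrative, and real output uses msgr v2/v1 address lists of the form `[v2:IP:PORT/NONCE,v1:IP:PORT/NONCE]`.

```python
import ipaddress
import re

# The subnet the migration moves to (from the procedure above)
NEW_NET = ipaddress.ip_network("10.50.0.0/24")

def addrs_in_subnet(addr_field: str, net=NEW_NET) -> bool:
    """Return True if every IP in a Ceph msgr address string,
    e.g. '[v2:10.50.0.20:6802/4042,v1:10.50.0.20:6803/4042]',
    belongs to the expected subnet."""
    ips = re.findall(r"v[12]:([0-9.]+):", addr_field)
    return bool(ips) and all(ipaddress.ip_address(ip) in net for ip in ips)

# Illustrative metadata, shaped like `ceph osd metadata <id>` output
osd_meta = {
    "front_addr": "[v2:10.50.0.20:6802/4042,v1:10.50.0.20:6803/4042]",
    "back_addr":  "[v2:172.16.0.20:6804/4042,v1:172.16.0.20:6805/4042]",
}

for field in ("front_addr", "back_addr"):
    ok = addrs_in_subnet(osd_meta[field])
    print(f"{field}: {'OK' if ok else 'STILL ON OLD NETWORK'}")
```

On a live node the same check could be fed from `ceph osd metadata <id> --format json`, looping over `ceph osd ls`; any `False` result means that OSD has not rebound yet and needs another restart.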
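The one-OSD-at-a-time rule in the OSD restart step hinges on waiting for every PG to return to `active+clean` before touching the next daemon. The gate can be expressed as a small parser over a `ceph pg stat`-style summary line; this is a hedged sketch (the example input strings are illustrative, not captured from the cluster in this write-up).

```python
import re

def all_pgs_active_clean(pg_stat: str) -> bool:
    """True only if every PG in a `ceph pg stat` summary line is
    active+clean. Example input:
    '129 pgs: 129 active+clean; 2.1 GiB data, 6.2 GiB used'"""
    m = re.match(r"\s*(\d+)\s+pgs:\s+([^;]+)", pg_stat)
    if not m:
        return False
    total = int(m.group(1))
    # Each entry is '<count> <state>', e.g. '10 active+clean+scrubbing'
    states = re.findall(r"(\d+)\s+([a-z+]+)", m.group(2))
    return (sum(int(n) for n, _ in states) == total
            and all(s == "active+clean" for _, s in states))
```

In a real restart loop, the output of `ceph pg stat` would be polled (e.g. via `subprocess.run`) and the next `systemctl restart ceph-osd@<id>` only issued once this function returns `True`.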
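Finally, because forgetting to drop `172.16.0.0/16` from the config is exactly what produces the "wrong subnet" troubleshooting case, the dual-to-single network transition can be sanity-checked by parsing the `[global]` section of a ceph.conf-style snippet. A minimal sketch, assuming an INI-style file as Proxmox writes it (a real `/etc/pve/ceph.conf` has additional sections this ignores):

```python
import configparser
import ipaddress

# Dual-network phase config, as in the migration step above (illustrative)
DUAL_PHASE_CONF = """
[global]
public_network = 10.50.0.0/24, 172.16.0.0/16
cluster_network = 10.50.0.0/24, 172.16.0.0/16
"""

def networks(conf_text: str, key: str):
    """Parse a comma-separated network list from [global]."""
    cp = configparser.ConfigParser()
    cp.read_string(conf_text)
    raw = cp.get("global", key, fallback="")
    return [ipaddress.ip_network(n.strip()) for n in raw.split(",") if n.strip()]

def old_net_removed(conf_text: str, old: str = "172.16.0.0/16") -> bool:
    """True once neither public_network nor cluster_network
    still lists the old subnet."""
    old_net = ipaddress.ip_network(old)
    return all(old_net not in networks(conf_text, key)
               for key in ("public_network", "cluster_network"))
```

During the dual-network phase `old_net_removed(DUAL_PHASE_CONF)` is `False`; it should only flip to `True` after the old-network removal step, before the daemons are recreated for the final time.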