Solved: When Do You Decide to Stop a PPC Campaign?
Posted on Feb 10
• Originally published at wp.me
TL;DR: Unidentified, costly “zombie microservices” (metaphorically, ‘PPC campaigns’ burning cash) often consume significant cloud resources with unknown dependencies, leading to high bills and fear of shutdown. Safely decommission these services using methods like gradual resource reduction (‘Strangle and Observe’), thorough dependency mapping (‘Archaeological Dig’), or a controlled, reversible ‘Scream Test’ during low-traffic periods.
Struggling with ‘zombie’ services and legacy processes racking up your cloud bill? Learn when and how to safely decommission infrastructure without causing a production outage.
I remember staring at the monthly cloud bill. It was a five-figure number that made my stomach turn, and one line item stood out: a fleet of massive EC2 instances under a service named ‘DataAggregator-PROD’. They were costing us nearly $4,000 a month, just humming along. I asked around. The new product manager had never heard of it. The junior devs thought it was “some legacy thing we don’t touch.” It was a ghost in the machine, a technical ‘PPC campaign’ burning cash with zero measurable ROI. The problem? No one knew for sure what would happen if we turned it off. This is a story I’ve seen play out at nearly every company I’ve worked for.
This isn’t about blaming people. It’s a natural consequence of growth, changing priorities, and team turnover. A project that was critical two years ago gets superseded. The original developers move on. The documentation, if it ever existed, is now a dead link in a forgotten Confluence space. That is how we end up with these zombie services.
So you’re stuck with this expensive, mysterious process. You know it’s probably useless, but the risk of shutting it down feels too high. Let’s walk through how we, in the trenches, actually solve this.
‘Strangle and Observe’ is my go-to first step when there is little political capital or time for a full investigation. It’s a bit hacky, but it’s effective. You don’t kill the service; you starve it. The goal is to make it cheap and see who screams.
If it’s an auto-scaling group of servers, scale the desired/min/max count down to one, on the smallest instance type possible. If it’s a data pipeline, change its cron schedule from every hour to once a day at 3 AM. The service is still “running,” which satisfies the nervous stakeholders, but your costs plummet. Now, you watch your monitoring dashboards like a hawk.
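Here is a rough sketch of what that starving step could look like for the auto-scaling case. It assumes Python with a boto3 "autoscaling" client passed in; the function names and the group name in the usage comment are made up for illustration and are not from the original setup. It records the old sizes first so the change can be undone quickly if someone does scream. The two cron strings show the hourly-to-3-AM schedule change for the pipeline case.

```python
from typing import Any, Dict

# Cron expressions for the data pipeline case.
HOURLY = "0 * * * *"     # minute 0 of every hour
DAILY_3AM = "0 3 * * *"  # once a day at 03:00

SIZE_KEYS = ("MinSize", "MaxSize", "DesiredCapacity")


def starve_asg(client: Any, group_name: str) -> Dict[str, int]:
    """Scale an Auto Scaling group down to one instance.

    `client` is expected to behave like boto3.client("autoscaling").
    Returns the previous sizes so the change can be reverted.
    """
    resp = client.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    groups = resp["AutoScalingGroups"]
    if not groups:
        raise ValueError(f"no Auto Scaling group named {group_name!r}")

    # Remember the current sizes before touching anything.
    previous = {key: groups[0][key] for key in SIZE_KEYS}

    client.update_auto_scaling_group(
        AutoScalingGroupName=group_name,
        MinSize=1,
        MaxSize=1,
        DesiredCapacity=1,
    )
    return previous


def restore_asg(client: Any, group_name: str, previous: Dict[str, int]) -> None:
    """Put the group back to the sizes captured by starve_asg."""
    client.update_auto_scaling_group(
        AutoScalingGroupName=group_name, **previous
    )


# Hypothetical usage (group name is an assumption):
#   import boto3
#   client = boto3.client("autoscaling")
#   saved = starve_asg(client, "data-aggregator-prod-asg")
#   ... watch the dashboards ...
#   restore_asg(client, "data-aggregator-prod-asg", saved)
```

Note that this only changes the instance count. Moving to a smaller instance type means updating the group's launch template or launch configuration, which is a separate change. Store the returned sizes somewhere durable (a ticket, a file) so the rollback does not depend on your shell session.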
Source: Dev.to