
Why My Home Lab Stopped Crashing After I Reduced Container Count by 70%
Last February, my home lab started doing something annoying: random services would go unresponsive around 2 AM, Portainer would show containers in a perpetual restart loop, and my wife would mention, with thinning patience, that the Plex queue had frozen again mid-episode. I had 34 Docker containers running on a single Proxmox node. The fix turned out to be embarrassingly simple, but getting there took a full weekend audit and some honest conversations with myself about why I was running half that stuff in the first place.
How I Got to 34 Containers in the First Place
If you run a home lab long enough, you know exactly how this happens. You see a project on Reddit or a self-hosted forum, spin it up in twenty minutes, tell yourself you’ll evaluate it properly later, and then “later” never comes. The container just sits there, consuming memory and CPU cycles, doing approximately nothing useful.
My stack had grown organically over about two years. What started as Plex, Pi-hole, and a basic Nginx reverse proxy became — through steady weekend tinkering — a sprawling mess that included:
- Three different dashboard solutions (Homarr, Heimdall, and Homer — I never committed to one)
- Two Minecraft servers neither of my kids had logged into in six months
- A Gitea instance for personal repos that I was mirroring to GitHub anyway
- Vaultwarden running alongside a nearly identical test instance I’d spun up “temporarily”
- Uptime Kuma monitoring containers that were themselves unstable
- A Calibre-Web install with about forty books that I never actually read through the interface
- FreshRSS, which I logged into maybe four times total
- A Flame dashboard I set up after abandoning Heimdall
- Watchtower running on two separate stacks for reasons I genuinely could not remember
This is not a complete list. The honest truth is that I had containers I couldn’t immediately name when I ran `docker ps`. That alone should have been a signal.
What the Actual Problem Was
Memory pressure, not CPU
My Proxmox node is a repurposed Dell OptiPlex 9020 with a Core i7-4790 and 16 GB of RAM. Not a powerhouse, but plenty capable for a reasonable workload. The issue wasn’t processing power — average CPU across the day hovered around 18%. The problem was memory.
With 34 containers running, I was consistently sitting at 13.8 to 14.2 GB of RAM used. That sounds like it leaves a couple of gigabytes free, but in practice Linux’s OOM killer was getting invoked regularly whenever anything spiked. Plex transcoding a 4K file would push memory hard, the OOM killer would reap something else, the killed container would restart, and the cascade would occasionally take Nginx down with it, meaning nothing external was reachable until I noticed and intervened.
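If you suspect the same cascade on your own node, the kernel log records every OOM kill. A quick way to confirm it, assuming a systemd-based host (with a fallback for hosts without one):

```bash
# Show recent OOM-killer activity from the kernel log (systemd hosts)
journalctl -k --since "2 days ago" | grep -i "out of memory"

# Equivalent check without systemd, with human-readable timestamps
dmesg -T | grep -i oom
```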
Docker networking overhead
Something I hadn’t fully appreciated: each Docker network bridge carries overhead. I had eleven separate Docker networks defined. Most of those existed because I’d followed compose tutorials that created new networks without thinking about consolidation. Fewer, well-defined networks with explicit container membership are significantly cleaner than the organic sprawl I’d built.
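To gauge your own sprawl, Docker will tell you exactly which containers sit on which network. A small loop like this (plain docker CLI, nothing exotic) makes the audit quick:

```bash
# List every network, then print the containers attached to each one
docker network ls
for net in $(docker network ls --format '{{.Name}}'); do
  echo "== $net =="
  docker network inspect "$net" --format '{{range .Containers}}{{.Name}} {{end}}'
  echo
done
```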
Log volume nobody was watching
Running `docker system df` and then checking the container log files directly showed me something uncomfortable: I had accumulated over 22 GB in container logs. Twenty-two gigabytes. Some containers were logging verbosely with no rotation configured, and I had just never looked. That’s disk space and I/O that was doing absolutely nothing for me.
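If you’re on the default json-file log driver, the raw logs live under /var/lib/docker/containers, and a one-liner will surface the worst offenders (path assumes a standard Docker install):

```bash
# Largest container log files first (default json-file driver)
sudo du -h /var/lib/docker/containers/*/*-json.log | sort -rh | head -20
```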
The Audit Process
Three questions for every container
I sat down on a Saturday morning with coffee and a spreadsheet. For each container, I asked three questions:
- Did I use this in the last 30 days? Not “could I use it” or “might I want to.” Did I actually open it?
- Does someone in my household depend on it? My wife uses Plex daily. My kids use nothing I’m running. I use Vaultwarden, Nextcloud, and the VPN. Everything else was speculative.
- Does it justify its resource cost against alternatives? Gitea is a fine product, but if I’m mirroring everything to GitHub anyway, what am I actually gaining from the local instance?
The answers were clarifying and a little humbling. About two-thirds of what I was running failed at least one of those tests immediately.
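If you want a head start on that spreadsheet, `docker ps` can dump the raw inventory for you, one row per container:

```bash
# Name, image, state, and uptime for every container, running or not
docker ps -a --format 'table {{.Names}}\t{{.Image}}\t{{.State}}\t{{.Status}}' > container-audit.txt
```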
What got cut
Everything listed above got removed, plus a handful more. The Minecraft servers went away — I archived the world data to a backup drive, but the active containers were just burning memory for nostalgia. All three dashboard containers went. I picked Homarr, committed to it, deleted the others. The duplicate Vaultwarden test instance was shut down. Calibre-Web, FreshRSS, the duplicate Watchtower instance, a Speedtest Tracker I had forgotten was running — all gone.
Gitea was the one I debated longest. I do like having local version control. But with Forgejo available and my actual usage pattern being almost entirely through GitHub anyway, I couldn’t justify the overhead for my current setup. I exported everything and closed it out. If my usage patterns change, I can spin it back up in fifteen minutes.
What stayed
Nine containers made the cut:
- Plex — used daily, non-negotiable
- Pi-hole — network-level ad blocking, essential to how the whole house functions
- Nginx Proxy Manager — reverse proxy handling external access
- Vaultwarden — password manager, single instance
- Nextcloud — family photo sync and shared documents
- WireGuard — remote access when I’m away from home, particularly useful given Canadian hotel Wi-Fi tends to be terrible
- Homarr — single dashboard, finally committing to one
- Uptime Kuma — monitoring, though I also cleaned up all the dead checks that were pointing at services I’d just killed
- Portainer — container management
Everything on that list gets used. Everything justifies being there.
What Changed After the Cleanup
The results were not subtle. RAM usage dropped to a stable 6.2 to 7.4 GB depending on what Plex is doing. The OOM killer has not fired a single time in the four months since the cleanup. The 2 AM crashes stopped immediately — they were memory-related, and the memory problem went away.
Startup time after a reboot went from about four minutes before all services were healthy to under ninety seconds. Nginx has not gone down once. Plex has been rock solid in a way it genuinely wasn’t before.
There’s also a maintenance benefit I hadn’t fully anticipated. With nine containers instead of thirty-four, I actually review my compose files and update cadence. Before, updates felt overwhelming: there were always things to update, always things potentially broken by updates, and the mental overhead made me defer it constantly. Now I can actually stay on top of things. I know exactly what every container in my stack does and why it’s there.
The Honest Tradeoffs
I want to be clear that this is not a “less is always more” sermon. There are real costs to what I did.
I lost Gitea, and there are days I miss having fully local version control with no dependency on an external service. GitHub is a corporation with its own priorities, and I’m aware that self-hosting my repos had genuine value beyond just tinkering. I made a pragmatic call based on my actual usage, but it wasn’t a cost-free decision.
I also lost the Calibre-Web install, and my ebook management situation is genuinely worse now. I’ve been meaning to revisit that and stand it back up on the NAS instead, where it makes more sense than on the primary compute node. It’s on the list.
The more important tradeoff is philosophical. Home labs exist partly for learning. Spinning up a new service, breaking it, figuring out why, fixing it — that’s how you actually build skills. If you’re running a home lab purely as a production environment, you’re probably using the wrong tool. There’s tension between “keep it stable” and “keep it interesting,” and 34 containers represented me leaning hard into interesting at the cost of stable. Nine containers might be leaning too far the other way for some people.
My compromise: I have a separate, isolated Proxmox VM I use specifically for testing new things. It’s deliberately not on my main Docker host. When something proves itself useful and stable, it’s a candidate for the main stack. When it doesn’t, it gets wiped without any impact on services my family depends on.
Also worth mentioning: 16 GB of RAM is not a lot. Someone running the same number of containers on a machine with 32 or 64 GB might never have hit the pressure I did. Hardware matters, and the right answer for your stack depends heavily on what you’re actually running it on. In Canada, where used server hardware — even something like a basic Supermicro workstation — can run $400 to $800 CAD on Kijiji, the economics of “just add more RAM” are real. Sometimes the right fix is an audit, not a hardware purchase.
Practical Steps if You’re in the Same Situation
Start with visibility
Run `docker stats` and leave it open for ten minutes. You will immediately see which containers are consuming significant resources and which ones are sitting nearly idle. Follow that with `docker system df` to see log and volume accumulation. These two commands told me more about my actual situation in five minutes than months of vague dissatisfaction had.
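For reference, the exact invocations I mean (the --format string is just one way to slice the output):

```bash
# Live per-container resource usage; watch it for a few minutes
docker stats

# One-shot snapshot you can paste into notes
docker stats --no-stream --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}'

# Disk usage across images, containers, local volumes, and build cache
docker system df -v
```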
Export before you delete
For anything with data — Gitea repos, Nextcloud content, Vaultwarden vaults, game worlds — export and back up before removing the container. I keep a cold backup drive specifically for “things I removed from active service but want to be able to restore.” It has lived up to its value twice already.
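The exact steps depend on the app, but for anything that stores data in a named volume, a generic pattern looks like this. The service and volume names here (gitea, gitea_data) are placeholders for whatever yours are called:

```bash
# Stop the service first so the data isn't changing mid-archive
docker compose stop gitea

# Archive the named volume to a dated tarball via a throwaway Alpine container
docker run --rm \
  -v gitea_data:/source:ro \
  -v "$(pwd)":/backup \
  alpine tar czf "/backup/gitea_data-$(date +%F).tar.gz" -C /source .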
Consolidate networks deliberately
Review your Docker networks. Most home lab setups can operate on two or three named networks: one for internal-only services, one for services that need proxy access, and maybe one isolated one for anything that shouldn’t talk to the rest. Eleven networks was not serving me.
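As a sketch of what that consolidation can look like in a compose file — service names and images here are illustrative, and the shared networks are created once up front with `docker network create`:

```yaml
# Shared, pre-created networks instead of one auto-generated bridge per stack
networks:
  internal:
    external: true   # created once: docker network create internal
  proxied:
    external: true   # created once: docker network create proxied

services:
  vaultwarden:
    image: vaultwarden/server:latest
    networks:
      - proxied      # reachable by the reverse proxy
  pihole:
    image: pihole/pihole:latest
    networks:
      - internal     # never exposed through the proxy
```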
Set log rotation as infrastructure, not an afterthought
Add log rotation to your Docker daemon config or per-container in your compose files. The default is no rotation, which means logs grow until disk is full. This is a trivial fix that I should have done on day one.
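The daemon-wide version is a few lines in /etc/docker/daemon.json. Note it only applies to containers created after a daemon restart, so existing containers need to be recreated to pick it up:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

Or per container in compose, which overrides the daemon default (the image shown is just an example):

```yaml
services:
  plex:
    image: lscr.io/linuxserver/plex:latest
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
```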
If your home lab is behaving the way mine was — random failures, services you can’t quite trust, maintenance that feels endless — the answer might not be better hardware or a different orchestration tool. Start with the audit. Figure out what’s actually earning its place, remove what isn’t, and then see what you’re dealing with before spending money or time on anything else.
