Abstract Complex mixes of monoliths, micro-services, databases, data centers, networking, and cloud providers provide a dizzying array of opportunities for your services to fail. No one has perfect failover, so you have to be prepared to play defense.
We will look at three categories of failures, ways to recognize them coming, and ways to avoid spreading the carnage they cause to the other services you provide.
Zombies: Long-running but abandoned requests that eat up memory and crash the system long after the users who conjured them gave up.
Black Holes: Services you depend on that take connections but never give them up, or perform so poorly for a time that all your attention eventually gets focused on that one thing.
Tribbles: Requests of a kind you normally invite, but arriving too many and too fast for your service to handle, taking your attention and eating up all your resources.
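To make the three categories concrete, here is a minimal Python sketch of one commonly used defense for each. The abstract does not say which techniques the talk presents, so everything here is an assumption for illustration: the class names (Deadline, CircuitBreaker, LoadShedder), their parameters, and the pairing of each defense with a category are hypothetical and are not taken from the talk.

```python
import threading
import time


class Deadline:
    """Zombie defense (assumed): a request carries an expiry that
    long-running work can check, so abandoned work can stop early."""

    def __init__(self, seconds, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + seconds

    def expired(self):
        return self._clock() >= self._expires_at


class CircuitBreaker:
    """Black-hole defense (assumed): stop calling a dependency after
    repeated failures, then allow a trial call after a cool-down."""

    def __init__(self, max_failures, reset_after, clock=time.monotonic):
        self._max_failures = max_failures
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the breaker is closed

    def allow(self):
        if self._opened_at is None:
            return True
        # Open: only allow a call once the cool-down has elapsed.
        return self._clock() - self._opened_at >= self._reset_after

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self._max_failures:
            self._opened_at = self._clock()


class LoadShedder:
    """Tribble defense (assumed): cap in-flight requests and reject
    the overflow immediately instead of queueing it."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def try_acquire(self):
        # Non-blocking: returns False when the service is already full.
        return self._slots.acquire(blocking=False)

    def release(self):
        self._slots.release()
```

In a request handler, one plausible arrangement is to call try_acquire() first and reject the request if it returns False, check allow() before each call to a dependency, and check expired() between units of work so that a request whose caller has given up stops consuming memory.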
You can expect to see:
Bio Avery has been building and supporting large systems for two decades. His recent focus has been on finding themes for why systems fail, and building practical solutions.