Ad Serving Interruption
Incident Report for Kevel
Resolved
We have resolved the issue that caused our downtime, and ad serving should be back to normal. We deployed a new release of the engine code today, and the deployment itself went as planned. When we enabled a new feature, however, it was missing a configuration value, which caused all of the engines to crash at the same time. This caused a cascading failure: each engine that came back up was immediately hit with 100x more traffic than it could handle. We brought up additional engines to cover the load, and within 20 minutes we had enough to handle our production load.

We have identified the bug in the code that allowed this missing configuration value to crash the engine, and a fix is going out right now.
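
The general shape of a guard against this class of failure can be sketched as follows. This is a minimal illustration in Python, not the actual engine code: the function `resolve_feature`, the `FeatureState` type, and the configuration keys are hypothetical names chosen for the example. The idea is to check that every value a feature needs is present before turning it on, and to leave the feature off rather than crash when one is missing.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class FeatureState:
    """Result of deciding whether a feature may be turned on."""
    enabled: bool
    reason: str


def resolve_feature(
    config: Mapping[str, str],
    flag_key: str,
    required_keys: Sequence[str],
) -> FeatureState:
    """Enable a feature only if its flag is on AND all required values exist.

    A missing or empty required value disables the feature instead of
    raising, so a bad configuration degrades one feature rather than
    taking the whole process down.
    """
    # The flag itself must be explicitly set to "true" (hypothetical convention).
    if config.get(flag_key, "").lower() != "true":
        return FeatureState(False, "flag off")

    # Collect every required key that is absent or empty.
    missing = [key for key in required_keys if not config.get(key)]
    if missing:
        return FeatureState(False, "missing config: " + ", ".join(missing))

    return FeatureState(True, "ok")
```

With this kind of check, a call such as `resolve_feature({"new_feature.enabled": "true"}, "new_feature.enabled", ["new_feature.endpoint"])` returns a disabled state with the missing key named in `reason`, which can be logged and alerted on while the process keeps serving.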

We hate nothing more than having downtime - please accept our apologies for this issue and know that we will be working to ensure that we prevent it in the future.
Posted Apr 07, 2015 - 17:49 EDT
Identified
Ad serving is currently experiencing downtime. We are working quickly to resolve it and will provide a complete post-mortem.
Posted Apr 07, 2015 - 17:03 EDT