Surviving Black Friday - a Resilience Engineering Tale

DevconTLV March 2016 Conference, Tuesday, March 22, 2016, 14:20

The 'Black Friday fail' is the greatest fear of every major online retailer: downtime equals lost money, and on Black Friday that means quite a lot of money.

But the sad truth is that service failures are inevitable, especially in a large distributed system. So how can we survive a service failure when it inevitably happens?
* In this lecture I will show how failures in large systems differ from failures in small systems.
* I will show examples of resilience engineering.
* Why simulate failures, and how to do it in your system.
* How to use gradual rollout, circuit breakers and automatic fallback to protect your system.
* The importance of failing fast, and failing silently.
* And the misconceptions we all have about how a large-scale website failure unfolds.
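
To make the circuit breaker and automatic fallback idea from the list above concrete, here is a minimal sketch in Python. It is not taken from the talk: the class name `CircuitBreaker`, its parameters (`max_failures`, `reset_after`, `clock`) and the `call` method are all hypothetical, chosen only for illustration. After a few consecutive failures the breaker "opens" and answers from the fallback right away (failing fast) instead of waiting on a broken service; once a cool-down period has passed it lets a trial call through again.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Hypothetical minimal circuit breaker with an automatic fallback."""

    def __init__(
        self,
        max_failures: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so the breaker is testable
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def is_open(self) -> bool:
        """Return True while calls should skip the primary service."""
        if self.opened_at is None:
            return False
        # After the cool-down the breaker is "half-open": allow a trial call.
        return self.clock() - self.opened_at < self.reset_after

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        """Run primary, or fall back if it fails or the circuit is open."""
        if self.is_open():
            return fallback()  # fail fast: do not touch the broken service
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller might wrap a personalized recommendation request as `breaker.call(fetch_recommendations, lambda: default_items)`, where both names are placeholders: when the recommendation service is down, shoppers silently get a generic list instead of an error page.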

Omri Fima


Technical Lead


Omri is a Tech Lead at Sears Israel by day, and a Maker by night.
Omri is responsible for designing the user profiling, personalization and recommendation capabilities for Sears Israel. He is experienced in large-scale system architecture and Agile methodologies, as well as in integrating hardware and software to create exciting new experiences.

