SREcon23 Americas - We're Still Down: A Metastable Failure Tale
April 25, 2023, 10:01 p.m. | USENIX
USENIX www.youtube.com
Kyle Lexmond
"The status? The system has been down for hours, and we haven't been able to get it back up yet." These are words on an incident conference call that you probably don't want to hear.
This talk explores how a globally distributed CDN experienced a metastable failure, design changes that make future failures less likely, and the unorthodox fix that made a recovery possible (and can hopefully apply to future metastable failures—maybe even yours). …