April 25, 2023, 10:01 p.m. | USENIX

Turning an Incident Report into a Design Issue with TLA+

A. Finn Hackett, University of British Columbia; Markus Alexander Kuppe, Microsoft Research

This talk will discuss our experience using modeling-driven techniques as part of a postmortem deep dive into a long-lasting, high-impact outage at Microsoft. We built a precise specification of the micro-service architecture, most notably its foundational distributed database service CosmosDB. Modeling allowed us to go beyond the standard postmortem analysis and accurately determine the outage's root cause. The …
