Dec. 15, 2023, 2:24 a.m. | Veysel Kocaman, Hasham Ul Haq, David Talby

cs.CR updates on arXiv.org arxiv.org

Recent research advances achieve human-level accuracy for de-identifying
free-text clinical notes on research datasets, but gaps remain in reproducing
this in large real-world settings. This paper summarizes lessons learned from
building a fully automated system that has been used to de-identify over one
billion real clinical notes and was independently certified by multiple
organizations for production use. A fully automated solution requires accuracy
high enough that no manual review is needed. A hybrid
context-based model architecture is …

