all InfoSec news
Beyond Accuracy: Automated De-Identification of Large Real-World Clinical Text Datasets. (arXiv:2312.08495v1 [cs.CL])
cs.CR updates on arXiv.org arxiv.org
Recent research advances achieve human-level accuracy for de-identifying
free-text clinical notes on research datasets, but gaps remain in reproducing
this in large real-world settings. This paper summarizes lessons learned from
building a system used to de-identify over one billion real clinical notes, in
a fully automated way, that was independently certified by multiple
organizations for production use. A fully automated solution requires a very
high level of accuracy that does not require manual review. A hybrid
context-based model architecture is …
accuracy automated beyond building datasets de-identification free human identification identify large lessons learned real research settings system text world