Global Medical Technology Company
A global medical technology company needed to unlock the value of its health data by anonymizing terabytes worth of medical images and clinical notes.
We created a multi-part machine learning (ML) system to identify and blur all personally identifiable information (PII) or protected health information (PHI) in image and text files.
Data anonymization integrated into new and existing solutions
Faster, safer data sharing with development partners
New data monetization opportunities
To unlock the value in its terabytes of medical images and clinical notes, one of the world’s largest medical technology companies had to anonymize and format all of its data while maintaining HIPAA compliance. That meant identifying and anonymizing every instance of PII and PHI in every image and text file.
But identifying PII and PHI is difficult because sensitive information gets embedded in unpredictable ways (e.g., a physician’s handwritten notes about a patient, machine metadata, a half-visible hospital or manufacturer logo). Moreover, the sensitive and confidential nature of medical data prohibited using publicly available large language models (LLMs) to develop a solution.
The medical technology company needed a partner with expertise in machine learning (ML) data recognition models, optical character recognition (OCR) algorithms, and custom AI models. They chose WillowTree.
Custom ML Models Detect & Blur Sensitive Information in Medical Images & Text Files
Our healthcare client needed an ML system sophisticated enough to identify and protect sensitive information while preserving its medical data’s scientific value. We helped them develop a single system combining three powerful components:
A human-in-the-loop mechanism deepens safeguards while also optimizing performance. The system tags each identification with a confidence score, signaling when manual review may be needed. Developer feedback then helps the system learn, which further improves performance over time.
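To make the confidence-score idea concrete, here is a minimal Python sketch of how detections might be routed to manual review. It is an illustration only, not the client system's implementation: the `Detection` fields, the `route` function, the sample file names and labels, and the 0.85 threshold are all hypothetical assumptions.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the actual threshold used by the system is not published.
REVIEW_THRESHOLD = 0.85


@dataclass
class Detection:
    """One PII/PHI identification produced by a model (illustrative fields)."""
    file_id: str
    label: str         # e.g. "PATIENT_NAME"
    confidence: float  # model score in [0, 1]


def route(detections, threshold=REVIEW_THRESHOLD):
    """Split detections into those accepted automatically and those
    flagged for a human reviewer because the score is below the threshold."""
    accepted, needs_review = [], []
    for detection in detections:
        if detection.confidence >= threshold:
            accepted.append(detection)
        else:
            needs_review.append(detection)
    return accepted, needs_review


# Made-up sample data for demonstration.
detections = [
    Detection("scan_001.png", "PATIENT_NAME", 0.97),
    Detection("scan_001.png", "HOSPITAL_LOGO", 0.62),
    Detection("note_014.txt", "DATE_OF_BIRTH", 0.91),
]

accepted, needs_review = route(detections)
print(f"{len(accepted)} accepted automatically, {len(needs_review)} flagged for review")
```

In a privacy-sensitive pipeline, a team might reasonably choose to blur low-confidence detections as well and use the review queue only to confirm them. Reviewer decisions could then be stored as labeled examples for later retraining, which is one way feedback can help a system learn.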
Accelerated R&D + New Sources of Revenue
By successfully developing a HIPAA-compliant data anonymization system for our client’s image and text medical files, we made it possible for them to integrate data anonymization into new and existing healthcare solutions. This also allows them to share valuable health data with research partners faster and more securely, accelerating innovation while protecting patient privacy.
Data anonymization also creates new potential revenue opportunities for our client. For instance, our client could use their anonymized data to:
“The future of healthcare innovation lies in how well organizations manage and use the vast quantity of data generated every day. With the right partner, medical technology companies can turn ‘byproduct’ data into innovation and commercialization opportunities, leveraging advanced data management and analytics to power research and deliver real-world impact.”