University of Michigan professor W. Nicholson Price, who also has affiliations with Harvard Law School and the University of Copenhagen Faculty of Law, suggests in a Focus piece published in Science Translational Medicine that the time has come to set up a way to validate and integrate deep learning medical systems. He claims that the medical community is already facing serious questions about how to properly implement this new kind of technology.
Deep learning algorithms are set to make a major impact on the practice of medicine. Price notes that areas such as prognosis, radiology and pathology are already being affected. Next up will be diagnosis. Deep learning algorithms allow for the swift retrieval and analysis of vast amounts of disparate information, and they can learn more as they are fed more data. They are expected to revolutionize some areas of medicine, for example by diagnosing very rare diseases or identifying tumors more quickly than a human ever could.
But deep learning algorithms suffer from a serious problem: they are not transparent. You cannot ask a deep learning algorithm why it flagged a given cell sample as cancerous, because it has no way to explain how it arrived at that answer. Doctors must complete rigorous training before they are allowed to diagnose, and medicines and medical devices must go through extensive testing and clinical trials before they are prescribed or approved. But what about deep learning algorithms? How should they be tested and certified as safe for use with human patients? They could be tested in clinical trials, but what happens when the data or the algorithm is updated? Also, how can such systems be tested in tried-and-true ways if the means by which they arrive at answers are unknown?
Price claims the main issue is validation, and suggests a three-step process that would allow for the safe integration of deep learning algorithms into modern medical practice. The first step involves adopting procedural measures to ensure that medical algorithms are created in ways that work with previously tested vetting techniques and that they are trained on high-quality data. The second involves developing a reliability factor to ensure that independent test data is used and that results are verified. The third involves developing performance standards that stand up to real-world evidence.