Artificial intelligence outperforms dermatologists in detection of melanoma: study

Study data published in the Annals of Oncology showed that a deep learning convolutional neural network (CNN) trained to identify skin cancer missed fewer melanomas, and less often misdiagnosed benign nevi as malignant, than a group of dermatologists.

Lead author Holger Haenssle explained that to train Google's Inception v4 CNN architecture, "we showed [it] more than 100 000 images of malignant and benign skin cancers and moles and indicated the diagnosis for each image." He added that "with each training image, the CNN improved its ability to differentiate between benign and malignant lesions."

The CNN's performance was compared with that of a group of 58 dermatologists, just over half of whom had more than five years of experience, while 19 percent had between two and five years of experience and 29 percent had less than two years of experience.

The dermatologists were initially shown only dermoscopic images (level I) and asked to make a diagnosis of either melanoma or benign nevus, as well as to indicate their management decision, such as whether surgery was required. Four weeks later, they were asked again for diagnoses and management decisions, this time based on dermoscopy plus clinical information about the patient and close-up images of the same 100 cases (level II).

When the doctors were provided with dermoscopic images only, they achieved a mean sensitivity and specificity for lesion classification of 86.6 percent and 71.3 percent, respectively, but were "significantly outperformed" by the CNN, the authors said. At level II, more clinical information improved the dermatologists' diagnostic performance, with the group achieving a mean sensitivity of 88.9 percent and specificity of 75.7 percent. However, despite this improvement, dermatologists still showed a specificity that was lower than the CNN specificity of 82.5 percent.
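For readers less familiar with the two measures reported above, sensitivity is the share of melanomas correctly identified as melanoma, and specificity is the share of benign nevi correctly identified as benign. The short Python sketch below illustrates the arithmetic only. The counts are hypothetical, chosen for illustration, and are not taken from the study; the function names are likewise assumptions.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of actual melanomas that were classified as melanoma."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of actual benign nevi that were classified as benign."""
    return true_neg / (true_neg + false_pos)


# Hypothetical reader results (illustrative only, NOT study data):
# 20 melanomas, of which 18 were caught and 2 were missed;
# 80 benign nevi, of which 60 were called benign and 20 were called malignant.
tp, fn = 18, 2
tn, fp = 60, 20

print(f"sensitivity = {sensitivity(tp, fn):.1%}")  # 90.0%
print(f"specificity = {specificity(tn, fp):.1%}")  # 75.0%
```

In these terms, a higher sensitivity means fewer missed melanomas, while a higher specificity means fewer benign nevi misdiagnosed as malignant.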

"Our data clearly show that a CNN algorithm may be a suitable tool to aid physicians in melanoma detection irrespective of their individual level of experience and training," the authors said. However, they noted that in level I of the study, 13 of the 58 dermatologists "showed a slightly higher diagnostic performance than the CNN."

In an accompanying editorial, Victoria Mar and Peter Soyer said the results suggest artificial intelligence "promises a more standardised level of diagnostic accuracy, such that all people, regardless of where they live or which doctor they see, will be able to access reliable diagnostic assessment." They added, "we envisage that sooner than later, automated diagnosis will change the diagnostic paradigm in dermatology. Still, there is much more work to be done to implement this exciting technology safely into routine clinical care."
