However, researchers from the Mahmood Lab at Brigham and Women’s Hospital, a founding institution of the Mass General Brigham healthcare system, have created a deep learning algorithm that learns specific features, which can later be used to find similar cases in large pathology image databases.
The new tool, known as SISH (Self-Supervised Image search for Histology), functions as a search engine for pathology images and has vast potential for a wide array of applications, such as finding unusual diseases and assisting clinicians in selecting which patients are most likely to respond to similar therapies.
The self-teaching method is described in a study that was published in the journal Nature Biomedical Engineering.
We show that our system can assist with the diagnosis of rare diseases and find cases with similar morphologic patterns without the need for manual annotations, and large datasets for supervised training. This system has the potential to improve pathology training, disease subtyping, tumor identification, and rare morphology identification.
Faisal Mahmood, Study Senior Author and Computational Pathologist, Department of Pathology, Brigham and Women’s Hospital
Modern electronic databases have the capacity to store a large number of digital records and reference images, particularly in pathology, through whole slide images (WSIs).
However, since each WSI is gigapixel-sized and these libraries hold so many images, searching for and retrieving WSIs can be slow and difficult. As a result, scalability remains a significant barrier to their effective use.
To address this problem, researchers at the Brigham developed SISH, which trains itself to learn feature representations that can be used to find pathology cases with similar features at a constant speed, regardless of the size of the database.
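The article does not describe how SISH is implemented. As a rough illustration only, the sketch below shows one generic way that search over learned feature vectors can avoid scanning the whole repository: each vector is reduced to a short binary code, and cases are stored in a table keyed by that code. This is not the SISH method; the random-hyperplane encoding and the names make_hyperplanes, encode, and CodeIndex are assumptions made for this example.

```python
import random
from collections import defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (hypothetical encoder, not SISH's)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def encode(vector, hyperplanes):
    """Map a feature vector to an integer code, one bit per hyperplane."""
    code = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


class CodeIndex:
    """Toy index: slide identifiers grouped by the code of their features."""

    def __init__(self, hyperplanes):
        self.hyperplanes = hyperplanes
        self.buckets = defaultdict(list)

    def add(self, slide_id, vector):
        # Store the slide under the code computed from its feature vector.
        self.buckets[encode(vector, self.hyperplanes)].append(slide_id)

    def query(self, vector):
        # Encode the query once and read a single bucket; the cost of this
        # step does not depend on how many slides the index holds overall.
        return list(self.buckets.get(encode(vector, self.hyperplanes), []))


if __name__ == "__main__":
    planes = make_hyperplanes(n_bits=16, dim=4)
    index = CodeIndex(planes)
    index.add("slide_a", [0.9, 0.1, -0.3, 0.5])
    index.add("slide_b", [-0.9, -0.1, 0.3, -0.5])
    # A query pointing in the same direction as slide_a lands in its bucket.
    print(index.query([1.8, 0.2, -0.6, 1.0]))
```

In this toy version only exact code matches are returned, so a real system would also need a way to reach near-matching codes and to rank the candidates it finds.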
In this study, the researchers tested the speed of SISH and its ability to retrieve interpretable disease subtype information for both common and rare cancers.
The algorithm was able to successfully retrieve images with speed and accuracy from a database of tens of thousands of whole slide images from over 22,000 patient cases, with over 50 different disease types and over a dozen anatomical sites.
In several situations, including retrieval of disease subtypes, the speed of retrieval exceeded alternative techniques, especially as the size of the image database grew to include thousands of images. SISH was able to keep up a steady search speed even as the repositories grew in size.
However, the method has certain drawbacks: it requires a large amount of memory, has limited context awareness within large tissue slides, and is limited to a single imaging modality.
Overall, the algorithm showed that it could efficiently retrieve images from a variety of datasets, regardless of the size of the repository. Additionally, it showed that it was adept at diagnosing specific rare disease types and could act as a search engine by identifying specific areas of images that would be important for diagnosis.
This work could considerably improve future disease diagnosis, prognosis, and analysis.
Mahmood added, “As the sizes of image databases continue to grow, we hope that SISH will be useful in making the identification of diseases easier. We believe one important future direction in this area is multimodal case retrieval which involves jointly using pathology, radiology, genomic and electronic medical record data to find similar patient cases.”
This research was funded in part by NIGMS grant R35GM138216 (to F.M.), the Brigham President's Fund, BWH and MGH Pathology, the Google Cloud Research Grant, and the Nvidia GPU Grant Program.
Additional funding was provided by the Tau Beta Pi Fellowship and the NIH National Cancer Institute (NCI) Ruth L. Kirschstein National Service Award, T32CA251062.
Journal Reference
Chen, C., et al. (2022) Fast and scalable search of whole-slide images via self-supervised deep learning. Nature Biomedical Engineering. doi:10.1038/s41551-022-00929-8.