Artificial intelligence has entered daily life. First, it was ChatGPT. Now, it is beer commercials and AI-generated pizza. While we know that AI is not always perfect, it appears that our interactions with AI are not perfect either.
Peter Koo, an Assistant Professor at Cold Spring Harbor Laboratory (CSHL), has found that researchers using popular computational tools to interpret AI predictions are picking up too much “noise,” or extra information, when analyzing DNA. He has also found a way to fix it.
Now, with only a couple of new lines of code, researchers can get more reliable explanations out of powerful AIs called deep neural networks. This means they can keep pursuing genuine DNA features, which may signal the next discovery in health and medicine. Researchers will not see those signals, however, if they are drowned out by noise.
So, what causes the interfering noise? It is a puzzling and invisible source, like digital “dark matter.” Astronomers and physicists believe most of the universe is filled with dark matter, a material that exerts gravitational effects but that no one has yet observed.
In the same way, Koo and his team found that the data AI is trained on lacks critical information, which leads to considerable blind spots. Worse, those blind spots get factored in when interpreting AI predictions of DNA function.
The deep neural network is incorporating this random behavior because it learns a function everywhere. But DNA is only in a small subspace of that. And it introduces a lot of noise. And so we show that this problem actually does introduce a lot of noise across a wide variety of prominent AI models.
Peter Koo, Assistant Professor, Cold Spring Harbor Laboratory
Digital dark matter is a result of researchers borrowing computational methods from computer vision AI. DNA data, unlike images, is confined to a combination of four nucleotide letters: A, C, G, and T.
Image data in the form of pixels, by contrast, can be long and continuous. Simply put, researchers are feeding AI an input it does not know how to handle correctly.
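To make the contrast with pixels concrete, here is a minimal Python sketch of one common way to turn DNA letters into numbers for a neural network, known as one-hot encoding. The article does not say which encoding the models in the study use, so the encoding, the `one_hot` helper, and the example sequence are all assumptions chosen for illustration.

```python
# Illustrative sketch only: one-hot encoding is assumed here, not taken from the article.
NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Return one row per position, with a single 1.0 in the column of the observed letter."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq.upper()]


sequence = "GATTACA"  # hypothetical example sequence
encoded = one_hot(sequence)
for base, row in zip(sequence, encoded):
    print(base, row)

# Every row contains exactly one 1.0, so each row sums to 1. Real DNA can only
# ever occupy these few points, whereas a pixel can take any intensity in a
# continuous range.
row_sums = [sum(row) for row in encoded]
```

Under this encoding, a network is free to assign outputs to inputs such as `[0.3, 0.3, 0.2, 0.2]` that never occur in real DNA, which is one way to read the remark that DNA occupies only a small subspace.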
By applying Koo's computational correction, researchers can interpret AI's DNA analyses more accurately.
We end up seeing sites that become much more crisp and clean, and there is less spurious noise in other regions. One-off nucleotides that are deemed to be very important all of a sudden disappear.
Peter Koo, Assistant Professor, Cold Spring Harbor Laboratory
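The article does not spell out the correction. One way to picture a fix of this kind, assuming the importance scores are gradients with respect to a one-hot encoded sequence (an assumption, not something stated above), is to subtract the average of the four nucleotide scores at each position. A score shared equally by A, C, G, and T does not favor any letter, so removing it leaves only the part that distinguishes between nucleotides. The sketch below is a hedged illustration in plain Python; the `correct_gradients` name and the example scores are hypothetical, and it should not be read as the authors' implementation.

```python
def correct_gradients(grads):
    """Subtract, at each position, the mean of the four nucleotide scores.

    `grads` is a list of rows, one per sequence position, with one score per
    nucleotide (columns assumed to be A, C, G, T).
    """
    corrected = []
    for row in grads:
        mean = sum(row) / len(row)
        corrected.append([g - mean for g in row])
    return corrected


# Hypothetical importance scores for three positions (columns: A, C, G, T).
raw = [
    [0.9, 0.1, 0.1, 0.1],   # A stands out, but every letter shares an offset
    [2.0, 2.0, 2.0, 2.0],   # identical scores: no preference for any letter
    [-1.0, 0.0, 1.0, 0.0],  # already centered on zero
]
fixed = correct_gradients(raw)
for before, after in zip(raw, fixed):
    print(before, "->", after)
```

In this toy example the second position, which looked important in the raw scores, drops to zero after the adjustment, while the third position is left unchanged because its scores already average to zero.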
Koo suspects that noise disturbance affects more than AI-powered DNA analyzers. He thinks it is a widespread affliction among computational processes involving similar types of data. Remember, dark matter is everywhere. Luckily, Koo’s new tool could help bring researchers out of the darkness and into the light.
The study was financially supported by the National Institutes of Health and the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory.
Journal Reference:
Majdandzic, A., et al. (2023) Correcting gradient-based interpretations of deep neural networks for genomics. Genome Biology. doi.org/10.1186/s13059-023-02956-3.