DNAstack today announced Viral AI, a federated network for genomic variant surveillance and infectious disease research. Viral AI was designed to deliver equitable access to software infrastructure, accelerate international data sharing, and empower scientists and public health officials with the globally representative datasets they need to mitigate current and future infectious disease outbreaks.
Genomic surveillance is required to detect new variants that can threaten the global COVID-19 pandemic response by being more transmissible, pathogenic, evasive of diagnostics, and resistant to therapies. Regional genome sequencing, analysis, and rapid international data sharing are critical to inform public health decisions to help slow the devastating impacts of COVID-19.
Viral AI introduces a new way to share and analyze genomics, clinical, administrative, and related data, facilitating insights about transmission, severity, diagnostics, and vaccine escape. As an alternative to the centralized model, where data is uploaded to a single vendor-managed database, Viral AI adopts a federated architecture to connect, analyze, and share data without moving it. This model enables faster, more efficient, regulatory-compliant, and regionally sovereign data management, making viral surveillance efforts more equitable, scalable, and sustainable.
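To illustrate the federated model described above, the following minimal Python sketch shows one way a query can be answered without moving the underlying data: each site computes a summary locally, and only the summaries are combined. It is a simplified illustration, not DNAstack's implementation. The `Node` class, the `federated_lineage_counts` function, the site names, and the sample records are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical data custodian that keeps its records local."""
    name: str
    # Each record is (sample_id, lineage). Raw records stay on the node.
    _records: list = field(default_factory=list)

    def count_lineages(self) -> Counter:
        # Only aggregate counts are returned to the caller.
        return Counter(lineage for _, lineage in self._records)


def federated_lineage_counts(nodes):
    """Send the same query to every node and merge the summaries."""
    total = Counter()
    for node in nodes:
        total.update(node.count_lineages())
    return total


# Made-up example data for two hypothetical sites.
nodes = [
    Node("site_a", [("a1", "B.1.1.7"), ("a2", "B.1.617.2")]),
    Node("site_b", [("b1", "B.1.617.2"), ("b2", "B.1.617.2")]),
]
print(federated_lineage_counts(nodes))
```

In this sketch the coordinating function never sees individual sample records, only per-site counts, which is the property that distinguishes a federated query from uploading everything to a single central database.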
"We envision a future where global pathogen surveillance is powered by a real-time digital network of local datasets," said Dr. Marc Fiume, CEO at DNAstack. "Viral AI democratizes access to software that follows international best practices in bioinformatics and data sharing, empowering any country or organization looking to implement local surveillance while participating in a global data sharing network."
Viral AI accelerates science by making data uniformly accessible through a user-friendly graphical interface and powerful programmatic interfaces, integrating data across different sources from around the world, such as the NCBI Sequence Read Archive (SRA), the Canadian COVID-19 Genomics Network (CanCOGeN), and the European Centre for Disease Prevention and Control, among others. Over one million viral sequences have been added with corresponding assemblies, variant calls, and lineage assignments, all harmonized through an open-source bioinformatics pipeline.
Researchers can use Developer Tools to analyze Viral AI data from scientific computing environments like Terra, a software platform for collaborative biomedical research co-developed by the Broad Institute of MIT and Harvard, Microsoft, and Verily Life Sciences. Terra extends native cloud services with researcher-focused capabilities, including shared workspaces to run computational workflows reproducibly and at scale. Researchers can clone a public Terra workspace that demonstrates how to run analyses across data from Viral AI. "DNAstack is using Terra to offer researchers powerful new ways to analyze Viral AI data, increasing the data's value, expanding Terra's ecosystem and enabling new research frontiers," said David Glazer, Engineering Director and Terra CTO at Verily Life Sciences. "This integration aligns with our shared vision for creating a modular, community-driven biomedical data ecosystem enabled by open, interoperable standards."
The Viral AI network is powered by enterprise implementations of open standards created by the Global Alliance for Genomics & Health (GA4GH). The GA4GH develops technical, ethical, regulatory, and security protocols for responsible sharing of genomics and clinical datasets, designed in collaboration with world-leading organizations and precision health initiatives. "The GA4GH is working with the international community to strengthen standards that will help enable rapid and timely responses to new variants of concern domestically and around the globe," said Peter Goodhand, CEO of GA4GH. "With Viral AI, DNAstack is demonstrating how GA4GH standards can be applied to accelerate international data sharing for pathogen surveillance and infectious disease research."
DNAstack software enables data custodians to set up independent, locally controlled infrastructure with capabilities to process, interpret, and responsibly share data with attribution. Viral AI includes a secondary analysis pipeline for genome assembly, variant calling, and lineage assignment on data from Illumina, Oxford Nanopore, and PacBio sequencers; a visual analytics dashboard for monitoring variants of concern; and a data publication tool for real-time sharing to the network. Private instances can be set up quickly within customer-managed environments on major cloud platforms and on-premises.
DNAstack is collaborating with Amazon Web Services (AWS) to harmonize, process, and share viral genome sequences deposited into NCBI SRA as part of the AWS Diagnostic Development Initiative (DDI). Sequencing data is uniformly re-processed by a secondary analysis pipeline executed through Amazon Genomics CLI and made freely available with support from the AWS Open Data Sponsorship Program. "Our goal is to accelerate innovations that can advance the collective understanding of COVID-19 and other infectious diseases," said Maggie Carter, Global Lead, Social Impact at AWS. "We're pleased to work with DNAstack to make it easy for the global community to tap into one of the largest open collections of SARS-CoV-2 genomes in the world."
Viral AI is supporting the Canadian COVID-19 Genomics Network (CanCOGeN) VirusSeq initiative, part of the national genomics program led by Genome Canada. "The need for genomic surveillance will not end with this pandemic or with coronaviruses," said Dr. Catalina Lopez-Correa, Chief Scientific Officer at Genome Canada and Executive Director of CanCOGeN. "By democratizing digital infrastructure for genomic variant surveillance and rapid data sharing, Viral AI will help create more inclusive, globally representative datasets that we can use to make better public policy decisions, diagnostics, and treatments." The Government of Ontario is also using Viral AI to inform pandemic response and planning using insights from genomics, epidemiology, public policy, and other data.
Viral AI builds on technology created in collaboration with a national consortium supported by Canada's Digital Technology Supercluster, a federal program that invests in the development and adoption of digital technologies. The COVID Cloud project, led by DNAstack, included experts in infectious disease, ethics, policy, cloud computing, and artificial intelligence. "We applaud the leadership shown by this Canadian consortium, working together with the international community to demonstrate the power of collaboration in developing new technologies to fight big challenges in health," said Sue Paish, CEO of Canada's Digital Technology Supercluster. "The virus that causes COVID-19 knows no boundaries, and neither should our approach to innovation and collaboration."