Researchers have developed Aardvark Weather, the first machine learning system that generates accurate global and local weather forecasts directly from observational data—eliminating the need for traditional numerical weather prediction (NWP) models.
In a new study published in Nature, the team presents Aardvark, a fully end-to-end weather forecasting system that doesn’t rely on NWP at any stage. While most existing tools still depend on NWP for initialization or regional refinement, Aardvark processes raw observations directly to produce global and local forecasts with comparable or better accuracy, lead times of up to 10 days, and significantly reduced computational costs.
Why Forecasting Needs a Rethink
Weather forecasting is critical across a range of sectors, including agriculture, aviation, emergency management, and energy. For decades, forecasts have been powered by numerical weather prediction systems—complex models that simulate physical processes like fluid dynamics and thermodynamics.
These systems ingest massive volumes of data from satellites, radar, and surface instruments. Observations are merged into a coherent starting point via data assimilation, after which supercomputers run physical simulations to model future weather conditions. Over the years, this approach has improved in scope and accuracy—today’s models can predict conditions up to 15 days out—but the method remains computationally demanding, hard to optimize, and slow to adapt.
Machine learning has started to ease some of these burdens by replacing individual components of the forecasting pipeline. But until now, no model has fully replaced NWP systems from start to finish. Aardvark is the first to do so—delivering accurate forecasts using raw observations alone, without any NWP products.
Aardvark’s Approach: An End-to-End ML Forecasting Pipeline
The team behind Aardvark designed a neural process-based system for weather forecasting that sidesteps traditional NWP inputs. Instead, it integrates a wide range of observational datasets to generate accurate, real-time forecasts. These include remote sensing and in-situ sources like land stations, marine platforms, and radiosondes. Satellite inputs—from scatterometers, microwave and infrared sounders, and geostationary composites—are processed at a 1.5° resolution. To better capture local and temporal patterns, the system also incorporates diurnal, seasonal, and topographic influences.
At the heart of Aardvark is a modular architecture made up of three key components:
- Encoder – Uses set convolution (SetConv) layers and a vision transformer backbone to transform sparse observational data into structured gridded initial states.
- Processor – A transformer-based model that refines forecasts iteratively at six-hour intervals, trained on residuals from the ERA5 reanalysis dataset (1979–present).
- Decoder – Translates global forecasts down to station-level predictions using a hybrid of UNet and SetConv layers.
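As a rough illustration of the encoder and processor steps listed above, the sketch below shows a one-dimensional set convolution that spreads scattered observations onto a regular grid, followed by an autoregressive rollout in six-hour steps. It is a simplified toy, not the authors' implementation. The Gaussian kernel, the function names (setconv_1d, normalized, rollout), the length_scale parameter, and the placeholder step function are all assumptions made for illustration. The real system works on global two-dimensional fields with learned neural networks.

```python
import math


def setconv_1d(obs_x, obs_y, grid_x, length_scale):
    """Project scattered observations onto a regular grid (toy 1-D SetConv).

    Returns two channels per grid point:
      density - how much observational evidence lies nearby
      signal  - the kernel-weighted sum of the observed values
    The Gaussian kernel and its length scale are illustrative assumptions.
    """
    density, signal = [], []
    for g in grid_x:
        weights = [math.exp(-0.5 * ((g - x) / length_scale) ** 2) for x in obs_x]
        density.append(sum(weights))
        signal.append(sum(w * y for w, y in zip(weights, obs_y)))
    return density, signal


def normalized(density, signal, eps=1e-8):
    """Divide signal by density to get a kernel-weighted average on the grid."""
    return [s / (d + eps) for d, s in zip(density, signal)]


def rollout(state, step_fn, lead_time_hours, step_hours=6):
    """Apply step_fn repeatedly, feeding each output back in as the next input."""
    n_steps = lead_time_hours // step_hours
    states = []
    for _ in range(n_steps):
        state = step_fn(state)
        states.append(state)
    return states


# Two hypothetical station readings placed on a five-point grid.
density, signal = setconv_1d(
    [0.2, 3.7], [10.0, 20.0], [0.0, 1.0, 2.0, 3.0, 4.0], length_scale=0.5
)
gridded = normalized(density, signal)

# Identity step as a stand-in for the learned processor.
forecasts = rollout(gridded, lambda s: s, lead_time_hours=240)
print(len(forecasts))  # 40
```

With six-hour steps, a 10-day lead time corresponds to 240 / 6 = 40 processor steps, which is why the rollout above returns 40 states.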
The system is evaluated against established baselines, using latitude-weighted root mean square error (RMSE) for global forecasts and mean absolute error (MAE) for station-level accuracy. End-to-end fine-tuning ensures predictions are tailored to individual stations, improving performance where it matters most.
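The two metrics named above can be sketched as follows. The article does not give the formulas, so this assumes the common convention in which each grid row is weighted by the cosine of its latitude, normalized so the weights average to one. That compensates for grid cells shrinking toward the poles. The function names and the list-of-rows data layout are illustrative choices, not taken from the study.

```python
import math


def latitude_weights(lats_deg):
    """cos(latitude) weights, normalized so that they average to 1."""
    cos_lats = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_cos = sum(cos_lats) / len(cos_lats)
    return [c / mean_cos for c in cos_lats]


def lat_weighted_rmse(forecast, truth, lats_deg):
    """Latitude-weighted RMSE over a grid given as one row per latitude."""
    weights = latitude_weights(lats_deg)
    total, count = 0.0, 0
    for w, f_row, t_row in zip(weights, forecast, truth):
        for f, t in zip(f_row, t_row):
            total += w * (f - t) ** 2
            count += 1
    return math.sqrt(total / count)


def mae(pred, obs):
    """Mean absolute error over paired station predictions and observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(pred)


# Hypothetical 3 x 2 grid: rows at 60S, the equator, and 60N.
lats = [-60.0, 0.0, 60.0]
truth = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
equator_error = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
high_lat_error = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]

# The same-sized error counts for more at the equator than at 60 degrees.
print(lat_weighted_rmse(equator_error, truth, lats))
print(lat_weighted_rmse(high_lat_error, truth, lats))
```

Under this weighting, an error that is uniform across the grid yields the same value as an unweighted RMSE, while errors concentrated near the poles are discounted.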
Training draws from the comprehensive ERA5 dataset, with models fine-tuned on key surface-level variables like 2-meter temperature and 10-meter wind speed. Training runs on a four-GPU setup and totals around 100 GPU hours, which underscores the system's efficiency and scalability. Its design is purpose-built for handling sparse, heterogeneous data while delivering performance that rivals traditional NWP, especially at coarser resolutions.
Performance Highlights
When benchmarked globally, Aardvark holds its own against the Global Forecast System (GFS) across a range of variables, including temperature, wind, and humidity, and even approaches the performance of the higher-resolution HRES model. Notably, it achieves this while using just 8% of the observational data that conventional NWP systems rely on. Surface-level forecasts are especially strong, although forecasts tend to lose sharpness at longer lead times. In targeted case studies, such as tropical cyclone tracking, Aardvark effectively captures mesoscale dynamics.
At the station level, Aardvark performs competitively with post-processed HRES forecasts and closely matches the operational National Digital Forecast Database (NDFD) across the contiguous US. For regions with limited forecasting infrastructure—like parts of West Africa and the Pacific—it not only holds up but often outperforms HRES, highlighting its potential for global reach.
Ablation studies also revealed how different data types contribute to performance: low-Earth-orbit (LEO) sounder data are critical for initializing accurate states, while in-situ data significantly enhance surface and geopotential height forecasts. End-to-end fine-tuning amplifies these strengths, trimming temperature error by 3–6% and wind speed error by 1–2%. These gains show how adaptable the system is to regional demands.
Why It Matters
Aardvark offers a compelling alternative to traditional forecasting approaches—not just by matching operational models in skill, but by excelling in environments where resources are scarce. Its modular structure supports flexible fine-tuning, enabling customization for specific variables, regions, or operational goals. While it doesn't yet match the resolution or ensemble capabilities of top-tier NWP systems, its design allows for easy integration of new data sources and emerging applications, from extreme weather forecasting to seasonal climate outlooks.
By significantly lowering the barriers to high-quality forecasting, Aardvark opens the door for broader use in sectors like agriculture, renewable energy, and disaster preparedness—areas where timely, localized forecasts can have major impacts.
Looking Ahead
Aardvark is changing the way we think about weather forecasting. Instead of leaning on the traditional, resource-intensive NWP pipelines that have dominated the field for decades, it takes a fresh, data-driven approach. Using machine learning, it delivers forecasts that are not only faster and more cost-effective but often just as accurate—even when working with a fraction of the data.
What really sets Aardvark apart is how well it performs in places that conventional systems tend to overlook. In regions with limited forecasting infrastructure, it shines—bringing reliable, high-quality predictions to areas that have long gone without.
The system is still far from perfect. There's room to grow in areas like resolution and ensemble forecasting. But the beauty of Aardvark lies in its modular design: it’s built to evolve. Whether you're monitoring hurricanes, planning a crop cycle, or preparing for extreme weather events, Aardvark offers a flexible, scalable solution that feels like the future of weather forecasting.
Journal Reference
Allen, A., Markou, S., Tebbutt, W., Requeima, J., Bruinsma, W. P., Andersson, T. R., Herzog, M., Lane, N. D., Chantry, M., Hosking, J. S., & Turner, R. E. (2025). End-to-end data-driven weather prediction. Nature. DOI:10.1038/s41586-025-08897-0. https://www.nature.com/articles/s41586-025-08897-0