AI Understands Human Actions Like Never Before

Researchers from the University of Virginia have developed an AI-driven intelligent video analyzer capable of identifying human actions in video footage with unprecedented accuracy and precision, according to a study published in IEEE Transactions on Pattern Analysis and Machine Intelligence.

Professor and chair of the Department of Electrical and Computer Engineering, Scott T. Acton. Image Credit: University of Virginia

What if a security camera could not only record footage but also understand the activity it captures, making a real-time distinction between normal activity and potentially harmful behavior? Researchers at the University of Virginia’s School of Engineering and Applied Science are working towards this future with their recent development of the Semantic and Motion-Aware Spatiotemporal Transformer Network (SMAST). This system could enhance public safety, surveillance, healthcare motion tracking, and autonomous vehicle navigation in complex environments.

This AI technology opens doors for real-time action detection in some of the most demanding environments. It is the kind of advancement that can help prevent accidents, improve diagnostics, and even save lives.

Scott T. Acton, Study Lead Researcher, Professor and Chair, Department of Electrical and Computer Engineering, University of Virginia

AI-Driven Innovation for Complex Video Analysis

How does SMAST work? At its core, it relies on artificial intelligence to identify and interpret complex human behaviors. The system is powered by two key elements.

First, a multi-feature selective attention model helps the AI filter out irrelevant details, focusing on the most important parts of a scene, like specific actions or objects. This enables more accurate event detection, such as recognizing when someone is throwing a ball rather than just moving their arm.
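To make the idea of selective attention concrete, here is a minimal sketch of generic scaled dot-product attention in plain Python. It is not SMAST's multi-feature selective attention model, whose details the article does not give; the region names, feature values, and query vector are all made up for illustration. It only shows how attention weights can concentrate on the most relevant part of a scene.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Generic scaled dot-product attention for a single query.

    Returns the attention weights and the weighted sum of the values.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, pooled

# Hypothetical 2-D features for three scene regions (made-up numbers).
regions = {
    "hand_with_ball": [0.9, 0.8],
    "background_wall": [0.1, -0.2],
    "passing_car": [-0.5, 0.3],
}
query = [1.0, 1.0]  # hypothetical "is a throw happening?" query vector

features = list(regions.values())
weights, pooled = attend(query, features, features)
for name, weight in zip(regions, weights):
    print(f"{name}: {weight:.2f}")
```

In this toy setup the region whose features align best with the query receives the largest weight, so it dominates the pooled representation while the less relevant regions are down-weighted rather than discarded.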

Second, a motion-aware 2D positional encoding algorithm allows the AI to track movement over time. For instance, in videos where people change positions frequently, this tool helps the AI remember and understand those movements and their relationships.
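As a rough illustration of what a positional encoding that accounts for motion might look like, the sketch below uses the standard sinusoidal encoding from transformer models for a 2D position and then, as an assumption made purely for this example, appends an encoding of the displacement since the previous frame. The function names and the displacement scheme are hypothetical; the article does not describe how SMAST's algorithm is actually constructed.

```python
import math

def sinusoid(value, dim):
    """Standard sinusoidal encoding of one scalar into `dim` numbers (dim even)."""
    out = []
    for i in range(dim // 2):
        freq = 1.0 / (10000 ** (2 * i / dim))
        out.append(math.sin(value * freq))
        out.append(math.cos(value * freq))
    return out

def encode_position_2d(x, y, dim=8):
    """Concatenate sinusoidal encodings of x and y (half of `dim` each)."""
    half = dim // 2
    return sinusoid(x, half) + sinusoid(y, half)

def encode_with_motion(prev_xy, curr_xy, dim=8):
    """Illustrative assumption: encode the current position, then append an
    encoding of the displacement since the previous frame."""
    dx = curr_xy[0] - prev_xy[0]
    dy = curr_xy[1] - prev_xy[1]
    return encode_position_2d(curr_xy[0], curr_xy[1], dim) + encode_position_2d(dx, dy, dim)

# A person who stays put versus one who moves 4 pixels to the right.
still = encode_with_motion((10, 20), (10, 20))
moving = encode_with_motion((10, 20), (14, 20))
print(len(still), still[8:] == moving[8:])
```

The first half of each vector says where the person is now and the second half says how they moved, so two people at the same location but with different motion would receive different encodings.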

By combining these capabilities, SMAST can accurately identify complex movements in real time, making it highly effective in critical areas like autonomous driving, healthcare diagnostics, and surveillance.

SMAST represents a new approach to how machines recognize and understand human behavior. Current systems often struggle to capture the context of events in chaotic, continuous video footage. However, SMAST’s innovative design, powered by AI components that learn and adapt from data, allows it to track the dynamic interactions between people and objects with exceptional precision.

Setting New Standards in Action Detection Technology

This breakthrough enables the AI system to identify actions such as a runner crossing the street or a doctor performing a complex procedure, and to detect a security threat in a crowded area. SMAST has already surpassed leading solutions in key academic benchmarks, including AVA, UCF101-24, and EPIC-Kitchens, setting new standards for accuracy and efficiency.

The societal impact could be huge. We are excited to see how this AI technology might transform industries, making video-based systems more intelligent and capable of real-time understanding.

Matthew Korban, Postdoctoral Research Associate, University of Virginia

Matthew Korban, Peter Youngs, and Scott T. Acton from the University of Virginia are the study authors.

The National Science Foundation (NSF), under Grant 2000487 and Grant 2322993, funded the study.

Journal Reference:

Korban, M. et al. (2024) A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence. doi.org/10.1109/TPAMI.2024.3377192