
AI Breakthrough in Image Recognition with Lp-Convolution

A collaborative team from the Institute for Basic Science (IBS), Yonsei University, and the Max Planck Institute has developed a new artificial intelligence (AI) technique called Lp-Convolution. This advancement brings machine vision closer to the image processing capabilities of the human brain, improving the accuracy and efficiency of image recognition systems while reducing the computational demands of existing AI models.

Figure: Brain-inspired design of Lp-Convolution. The brain processes visual information through a Gaussian-shaped connectivity structure that gradually spreads from the center outward, flexibly integrating a wide range of information. In contrast, traditional CNNs face issues where expanding the filter size dilutes information or reduces accuracy (d, e). To overcome these structural limitations, the research team developed Lp-Convolution, inspired by the brain's connectivity (a–c). This design spatially distributes weights to preserve key information even over large receptive fields, addressing the shortcomings of conventional CNNs. Image Credit: Institute for Basic Science (IBS)

Bridging the Gap Between CNNs and the Human Brain

The human brain is particularly adept at identifying key details within complex scenes, a capability that traditional AI systems have struggled to replicate. Convolutional Neural Networks (CNNs), the most commonly used AI models for image recognition, process images using small, square filters. While effective, this approach limits their ability to capture broader patterns when relevant information is spread across a scene.
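To make the filter idea concrete, here is a minimal single-channel 2-D convolution with a fixed square kernel, the basic operation a conventional CNN layer performs. This is an illustrative sketch, not the researchers' code; the function name and the example edge filter are chosen for demonstration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Minimal 2-D convolution (valid padding) with a fixed square
    kernel, as in a conventional CNN layer (single channel,
    illustrative only)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output value "sees" only a small kh x kw window
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 kernel covers only a 3x3 neighborhood at a time, which is
# the locality constraint the article describes.
image = np.arange(25, dtype=float).reshape(5, 5)
edge = np.array([[1.0, 0.0, -1.0]] * 3)  # simple vertical-edge filter
out = conv2d_valid(image, edge)
```

Capturing patterns wider than the window requires either stacking many layers or enlarging the kernel, which is where the difficulties discussed below arise.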

Vision Transformers (ViTs) have shown superior performance by analyzing entire images at once, but they require significant computational resources and large datasets, making them impractical for many real-world applications.

Inspired by the brain's visual cortex, which processes information through selective, circular, sparse connections, the research team explored whether a brain-inspired approach could make CNNs both more efficient and powerful.

Introducing Lp-Convolution: A Smarter Way to See

To address this challenge, the team developed Lp-Convolution, a novel method that uses a multivariate p-generalized normal distribution (MPND) to dynamically reshape CNN filters. Unlike conventional CNNs, which use fixed square filters, Lp-Convolution allows AI models to adapt the shape of their filters, stretching them horizontally or vertically based on the task, similar to how the human brain selectively focuses on relevant details.

This development addresses the long-standing issue in AI known as the large kernel problem. Increasing filter sizes in CNNs (e.g., using 7×7 or larger kernels) typically does not improve performance, despite adding more parameters. Lp-Convolution overcomes this limitation by introducing flexible, biologically inspired connectivity patterns.
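The idea can be sketched with a simple mask built from a 2-D p-generalized Gaussian envelope. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function name and parameters (`p`, `sigma_x`, `sigma_y`) are hypothetical. With `p = 2` the mask is Gaussian-like and concentrates weight at the center; as `p` grows it flattens toward a uniform square, recovering a conventional CNN kernel; unequal sigmas stretch it horizontally or vertically.

```python
import numpy as np

def lp_mask(size, p, sigma_x=1.0, sigma_y=1.0):
    """Illustrative Lp-style mask for a size x size kernel
    (hypothetical sketch, not the published implementation)."""
    half = size // 2
    # Coordinates scaled to [-1, 1] across the kernel
    y, x = np.mgrid[-half:half + 1, -half:half + 1] / half
    # p-generalized Gaussian envelope: p = 2 is circular and
    # Gaussian-like; large p flattens toward a uniform square
    r = np.abs(x / sigma_x) ** p + np.abs(y / sigma_y) ** p
    mask = np.exp(-r)
    return mask / mask.sum()  # normalize the weights

# Reweighting a large kernel with the mask keeps a wide receptive
# field while concentrating influence near the center.
kernel = np.ones((7, 7))
masked = kernel * lp_mask(7, p=2.0)
```

Because the mask varies smoothly with `p` and the sigmas, a model can in principle adapt the effective filter shape to the task rather than committing to a fixed square.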

Real-World Performance: Stronger, Smarter, and More Robust AI

In tests using standard image classification datasets (CIFAR-100, TinyImageNet), Lp-Convolution significantly improved accuracy on both established models like AlexNet and modern architectures like RepLKNet. The method also demonstrated high robustness against corrupted data, a critical challenge in practical AI applications.

Additionally, the researchers observed that when the Lp-masks used in their method resembled a Gaussian distribution, the AI's internal processing patterns closely aligned with biological neural activity, as confirmed by comparisons with mouse brain data.

We humans quickly spot what matters in a crowded scene. Our Lp-Convolution mimics this ability, allowing AI to flexibly focus on the most relevant parts of an image—just like the brain does.

Dr. C. Justin Lee, Director, Center for Cognition and Sociality, Institute for Basic Science

Impact and Future Applications

Unlike earlier approaches that either used small, inflexible filters or relied on computationally intensive transformers, Lp-Convolution offers a viable and efficient alternative. This innovation has the potential to impact several fields, including:

  • Autonomous driving: Enabling AI to rapidly detect obstacles in real time.
  • Medical imaging: Enhancing AI-driven diagnoses by emphasizing subtle details.
  • Robotics: Facilitating smarter and more adaptable machine vision in dynamic environments.

This work is a powerful contribution to both AI and neuroscience. By aligning AI more closely with the brain, we’ve unlocked new potential for CNNs, making them smarter, more adaptable, and more biologically realistic.

Dr. C. Justin Lee, Director, Center for Cognition and Sociality, Institute for Basic Science 

Moving forward, the research team plans to further develop this technology, exploring its potential in complex reasoning tasks such as puzzle-solving (e.g., Sudoku) and real-time image processing.

The study will be presented at the International Conference on Learning Representations (ICLR) 2025, and the research team has made their code and models publicly available.
