
AI Breakthrough in Image Recognition with Lp-Convolution

A collaborative team from the Institute for Basic Science (IBS), Yonsei University, and the Max Planck Institute has developed a new artificial intelligence (AI) technique called Lp-Convolution. This advancement brings machine vision closer to the image processing capabilities of the human brain, improving the accuracy and efficiency of image recognition systems while reducing the computational demands of existing AI models.

Figure: Brain-inspired design of Lp-Convolution. The brain processes visual information through a Gaussian-shaped connectivity structure that gradually spreads from the center outward, flexibly integrating a wide range of information. In contrast, traditional CNNs face issues where expanding the filter size dilutes information or reduces accuracy (d, e). To overcome these structural limitations, the research team developed Lp-Convolution, inspired by the brain's connectivity (a–c). This design spatially distributes weights to preserve key information even over large receptive fields, addressing the shortcomings of conventional CNNs. Image Credit: Institute for Basic Science (IBS)

Bridging the Gap Between CNNs and the Human Brain

The human brain is particularly adept at identifying key details within complex scenes, a capability that traditional AI systems have struggled to replicate. Convolutional Neural Networks (CNNs), the most commonly used AI models for image recognition, process images using small, square filters. While effective, this approach limits their ability to capture broader patterns when relevant information is spread across a scene.
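To make the filter idea concrete, here is a minimal single-channel 2-D convolution with a fixed square kernel, the basic operation a conventional CNN layer performs. This is an illustrative sketch, not the researchers' code; the function name and the example edge filter are chosen for demonstration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Minimal 2-D convolution (valid padding) with a fixed square
    kernel, as in a conventional CNN layer (single channel,
    illustrative only)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output value "sees" only a small kh x kw window
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 kernel covers only a 3x3 neighborhood at a time, which is
# the locality constraint the article describes.
image = np.arange(25, dtype=float).reshape(5, 5)
edge = np.array([[1.0, 0.0, -1.0]] * 3)  # simple vertical-edge filter
out = conv2d_valid(image, edge)
```

Capturing patterns wider than the window requires either stacking many layers or enlarging the kernel, which is where the difficulties discussed below arise.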

Vision Transformers (ViTs) have shown superior performance by analyzing entire images at once, but they require significant computational resources and large datasets, making them impractical for many real-world applications.

Inspired by the brain's visual cortex, which processes information through selective, circular, sparse connections, the research team explored whether a brain-inspired approach could make CNNs both more efficient and powerful.

Introducing Lp-Convolution: A Smarter Way to See

To address this challenge, the team developed Lp-Convolution, a novel method that uses a multivariate p-generalized normal distribution (MPND) to dynamically reshape CNN filters. Unlike conventional CNNs, which use fixed square filters, Lp-Convolution allows AI models to adapt the shape of their filters, stretching them horizontally or vertically based on the task, similar to how the human brain selectively focuses on relevant details.

This development addresses the long-standing issue in AI known as the large kernel problem. Increasing filter sizes in CNNs (e.g., using 7×7 or larger kernels) typically does not improve performance, despite adding more parameters. Lp-Convolution overcomes this limitation by introducing flexible, biologically inspired connectivity patterns.
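The idea can be sketched with a simple mask built from a 2-D p-generalized Gaussian envelope. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function name and parameters (`p`, `sigma_x`, `sigma_y`) are hypothetical. With `p = 2` the mask is Gaussian-like and concentrates weight at the center; as `p` grows it flattens toward a uniform square, recovering a conventional CNN kernel; unequal sigmas stretch it horizontally or vertically.

```python
import numpy as np

def lp_mask(size, p, sigma_x=1.0, sigma_y=1.0):
    """Illustrative Lp-style mask for a size x size kernel
    (hypothetical sketch, not the published implementation)."""
    half = size // 2
    # Coordinates scaled to [-1, 1] across the kernel
    y, x = np.mgrid[-half:half + 1, -half:half + 1] / half
    # p-generalized Gaussian envelope: p = 2 is circular and
    # Gaussian-like; large p flattens toward a uniform square
    r = np.abs(x / sigma_x) ** p + np.abs(y / sigma_y) ** p
    mask = np.exp(-r)
    return mask / mask.sum()  # normalize the weights

# Reweighting a large kernel with the mask keeps a wide receptive
# field while concentrating influence near the center.
kernel = np.ones((7, 7))
masked = kernel * lp_mask(7, p=2.0)
```

Because the mask varies smoothly with `p` and the sigmas, a model can in principle adapt the effective filter shape to the task rather than committing to a fixed square.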

Real-World Performance: Stronger, Smarter, and More Robust AI

In tests using standard image classification datasets (CIFAR-100, TinyImageNet), Lp-Convolution significantly improved accuracy on both established models like AlexNet and modern architectures like RepLKNet. The method also demonstrated high robustness against corrupted data, a critical challenge in practical AI applications.

Additionally, the researchers observed that when the Lp-masks used in their method resembled a Gaussian distribution, the AI's internal processing patterns closely aligned with biological neural activity, as confirmed by comparisons with mouse brain data.

We humans quickly spot what matters in a crowded scene. Our Lp-Convolution mimics this ability, allowing AI to flexibly focus on the most relevant parts of an image—just like the brain does.

Dr. C. Justin Lee, Director, Center for Cognition and Sociality, Institute for Basic Science

Impact and Future Applications

Unlike earlier approaches that either used small, inflexible filters or relied on computationally intensive transformers, Lp-Convolution offers a viable and efficient alternative. This innovation has the potential to impact several fields, including:

  • Autonomous driving: Enabling AI to rapidly detect obstacles in real time.
  • Medical imaging: Enhancing AI-driven diagnoses by emphasizing subtle details.
  • Robotics: Facilitating smarter and more adaptable machine vision in dynamic environments.

This work is a powerful contribution to both AI and neuroscience. By aligning AI more closely with the brain, we’ve unlocked new potential for CNNs, making them smarter, more adaptable, and more biologically realistic.

Dr. C. Justin Lee, Director, Center for Cognition and Sociality, Institute for Basic Science 

Moving forward, the research team plans to further develop this technology, exploring its potential in complex reasoning tasks such as puzzle-solving (e.g., Sudoku) and real-time image processing.

The study will be presented at the International Conference on Learning Representations (ICLR) 2025, and the research team has made their code and models publicly available.
