A Lightweight Collaborative Attention Residual Network with Depthwise Convolutions for Visual Feature Representation


Fenna Marcelline

Abstract

This paper proposes a lightweight convolutional neural network that integrates depthwise separable convolutions and a collaborative attention mechanism to enhance classification accuracy and robustness in complex visual environments. The proposed CAM-DResNet model improves computational efficiency by replacing standard convolutions with depthwise separable convolutions, and it inserts a Collaborative Channel-Spatial-Pixel Attention (CCSPA) module after each network stage to strengthen both global and local feature perception. This design enables the network to capture fine-grained texture features while maintaining strong generalization under diverse lighting, scale, and noise conditions. Extensive experiments on benchmark image datasets show that CAM-DResNet outperforms classical models such as ResNet50, DenseNet, and MobileNet, achieving higher accuracy with fewer parameters. Specifically, the proposed model reaches over 91% classification accuracy while reducing computational complexity by one-third relative to ResNet50. These results indicate that combining multi-level attention with a lightweight residual design provides a robust and efficient solution for high-precision image classification in modern computer vision applications.
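The efficiency gain from replacing a standard convolution with a depthwise separable one can be illustrated by comparing parameter counts. The sketch below is a generic illustration, not the paper's implementation; the channel sizes (64 in, 128 out) and 3x3 kernel are assumed values chosen only to show the factoring of a full convolution into a depthwise step plus a pointwise (1x1) step.

```python
def standard_conv_params(c_in, c_out, k):
    # A standard conv: each of c_out filters spans all c_in channels.
    return c_out * c_in * k * k

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise step: one k x k filter per input channel (no channel mixing).
    depthwise = c_in * k * k
    # Pointwise step: 1x1 conv that mixes channels into c_out outputs.
    pointwise = c_in * c_out
    return depthwise + pointwise

# Assumed example sizes, not taken from the paper:
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)        # 73728 weights
dws = depthwise_separable_params(c_in, c_out, k)  # 576 + 8192 = 8768 weights
print(f"standard: {std}, separable: {dws}, ratio: {dws / std:.3f}")
```

For these sizes the separable form uses roughly 12% of the weights of the standard convolution, which is the kind of saving that lets a model add attention modules (such as CCSPA blocks after each stage) while still reducing overall complexity.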
