A Lightweight Collaborative Attention Residual Network with Depthwise Convolutions for Visual Feature Representation


Fenna Marcelline

Abstract

This paper proposes a lightweight convolutional neural network that integrates depthwise separable convolutions and a collaborative attention mechanism to enhance classification accuracy and robustness in complex visual environments. The proposed CAM-DResNet model improves computational efficiency by replacing standard convolutions with depthwise separable convolutions, and it inserts a Collaborative Channel-Spatial-Pixel Attention (CCSPA) module after each network stage to strengthen both global and local feature perception. This design enables the network to capture fine-grained texture features while maintaining strong generalization under diverse lighting, scale, and noise conditions. Extensive experiments on benchmark image datasets show that CAM-DResNet outperforms classical models such as ResNet50, DenseNet, and MobileNet, achieving higher accuracy with fewer parameters. Specifically, the proposed model reaches over 91% classification accuracy while reducing computational complexity by one-third relative to ResNet50. These results indicate that combining multi-level attention with a lightweight residual design provides a robust and efficient solution for high-precision image classification in modern computer vision applications.
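The efficiency gain from replacing a standard convolution with a depthwise separable one can be illustrated by comparing parameter counts. The sketch below is a generic illustration, not the paper's implementation; the channel sizes (64 in, 128 out) and 3x3 kernel are assumed values chosen only to show the factoring of a full convolution into a depthwise step plus a pointwise (1x1) step.

```python
def standard_conv_params(c_in, c_out, k):
    # A standard conv: each of c_out filters spans all c_in channels.
    return c_out * c_in * k * k

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise step: one k x k filter per input channel (no channel mixing).
    depthwise = c_in * k * k
    # Pointwise step: 1x1 conv that mixes channels into c_out outputs.
    pointwise = c_in * c_out
    return depthwise + pointwise

# Assumed example sizes, not taken from the paper:
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)        # 73728 weights
dws = depthwise_separable_params(c_in, c_out, k)  # 576 + 8192 = 8768 weights
print(f"standard: {std}, separable: {dws}, ratio: {dws / std:.3f}")
```

For these sizes the separable form uses roughly 12% of the weights of the standard convolution, which is the kind of saving that lets a model add attention modules (such as CCSPA blocks after each stage) while still reducing overall complexity.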
