Monitoring crowd density matters for both energy infrastructure and public safety, where it supports efficient management and risk assessment. Researchers from the National Taiwan University, including Mas Nurul Achmadiah, Chi-Chia Sun, Wen-Kai Kuo, and Jun-Wei Hsieh, have developed a new approach to crowd counting. Their work, published in the IEEE Transactions on Circuits and Systems for Video Technology, introduces RepSFNet, a lightweight neural network designed for accurate, real-time crowd estimation.
Crowd counting presents significant challenges due to varying densities, scale variations, occlusions, and the high computational cost of existing models. To tackle these issues, the researchers proposed RepSFNet, which stands for Reparameterized Single Fusion Network. This architecture is designed to provide accurate crowd estimates while being computationally efficient, making it suitable for real-time applications.
RepSFNet leverages a RepLK-ViT backbone, which uses large reparameterized kernels to efficiently extract multi-scale features from images. This allows the network to capture both fine details and broader context, which is essential for accurate crowd counting. The network also integrates a Feature Fusion module that combines Atrous Spatial Pyramid Pooling (ASPP) and Context-Aware Network (CAN) to model the context of the scene adaptively. This helps the network to handle varying crowd densities and complex backgrounds.
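To make the idea of a reparameterized kernel more concrete, the sketch below shows the general technique of structural reparameterization in one dimension. It is an illustration under stated assumptions, not the RepSFNet implementation. During training, a large kernel and a parallel small kernel are applied separately and their outputs summed. For inference, the small kernel is zero-padded and added into the large one, so a single convolution gives the same result. The weights, the input signal, and the helper names `conv1d_same` and `merge_kernels` are hypothetical.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate `signal` with an odd-length `kernel`, zero-padding the edges."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def merge_kernels(large, small):
    """Fold a small kernel into a large one by zero-padding it to the same width."""
    pad = (len(large) - len(small)) // 2
    padded_small = [0.0] * pad + list(small) + [0.0] * pad
    return [a + b for a, b in zip(large, padded_small)]


# Hypothetical weights and input, chosen only for illustration.
large_kernel = [0.1, -0.2, 0.3, 0.5, 0.3, -0.2, 0.1]  # 7 taps
small_kernel = [0.25, 0.5, 0.25]                       # 3 taps
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 4.0, 2.0, 1.0]

# Training-time form: two parallel branches whose outputs are summed.
two_branch = [
    a + b
    for a, b in zip(conv1d_same(signal, large_kernel),
                    conv1d_same(signal, small_kernel))
]

# Inference-time form: one merged kernel, one convolution.
merged_kernel = merge_kernels(large_kernel, small_kernel)
single_branch = conv1d_same(signal, merged_kernel)

# The two forms agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(two_branch, single_branch))
print(f"largest difference between the two forms: {max_diff:.2e}")
```

Because convolution is linear, the merge is exact. The deployed network pays for one large-kernel pass while keeping the extra branch that was present during training.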
One of the key advantages of RepSFNet is its simplicity and efficiency. By avoiding attention mechanisms and multi-branch designs, the network significantly reduces the number of parameters and computational complexity. This makes it more suitable for edge computing applications, where computational resources are often limited.
The researchers also introduced a novel training objective that combines Mean Squared Error and Optimal Transport loss. This approach improves both the accuracy of the count estimates and the spatial distribution of the predicted density maps. The effectiveness of RepSFNet was demonstrated through extensive experiments on several benchmark datasets, including ShanghaiTech, NWPU, and UCF-QNRF. The results showed that RepSFNet achieves competitive accuracy while reducing inference latency by up to 34 percent compared to recent state-of-the-art methods.
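The following sketch illustrates why pairing a pixel-wise error with a transport-based term can help. It uses toy one-dimensional density maps, and it assumes a simple weighted sum of the two terms. The transport cost is computed with the closed-form 1-D Wasserstein distance rather than whatever solver the paper uses. The weight `ot_weight` and all function names are hypothetical, and the article does not specify how the two terms are balanced.

```python
def mse(pred, target):
    """Mean squared error between two density maps, pixel by pixel."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def wasserstein_1d(pred, target):
    """1-D optimal transport cost between the normalized maps (unit pixel spacing).

    Assumes both maps have a non-zero sum, i.e. at least one person present.
    """
    p_sum, t_sum = sum(pred), sum(target)
    cost, cdf_gap = 0.0, 0.0
    for p, t in zip(pred, target):
        cdf_gap += p / p_sum - t / t_sum  # running difference of the two CDFs
        cost += abs(cdf_gap)
    return cost


def combined_loss(pred, target, ot_weight=0.1):
    """MSE plus a weighted transport term; ot_weight is a hypothetical value."""
    return mse(pred, target) + ot_weight * wasserstein_1d(pred, target)


# Toy 1-D "density maps": each sums to the crowd count (3 people here).
target = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0]
near_miss = [0.0, 0.0, 1.0, 0.0, 2.0, 0.0]  # one person placed 1 pixel off
far_miss = [0.0, 0.0, 0.0, 0.0, 2.0, 1.0]   # same person placed 4 pixels off

loss_near = combined_loss(near_miss, target)
loss_far = combined_loss(far_miss, target)

print("MSE near vs far:", mse(near_miss, target), mse(far_miss, target))
print("combined near vs far:", loss_near, loss_far)
```

In this toy case both predictions have the correct total count and the same MSE, so MSE alone cannot tell them apart. The transport term penalizes the prediction that places a person farther from the true location, which is the kind of spatial sensitivity the combined objective is meant to add.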
For the energy sector, particularly in smart cities and infrastructure management, RepSFNet offers a practical option for real-time crowd monitoring. The model can be integrated into existing surveillance systems to provide accurate and timely information about crowd densities. That information can help optimize energy usage in public spaces, manage crowd flow to prevent overloading of facilities, and enhance public safety by identifying potential risks in real time. Because the network is lightweight and efficient, it is well suited to deployment on edge devices, where low power consumption and low cost matter.
This article is based on research available at arXiv.

