Researchers from the Technical University of Denmark, including Agrippina Mwangi, León Navarro-Hilfiker, Lukasz Brewka, Mikkel Gryning, Elena Fumagalli, and Madeleine Gibescu, have developed a novel approach to improve the resilience of industrial networks, particularly in wind power plants. Their work, published in the IEEE Internet of Things Journal, focuses on addressing intermittent service degradation caused by stochastic disruptions.
In industrial networks, sudden traffic bursts and thermal fluctuations can lead to service degradation, violating quality-of-service requirements and service-level agreements. This can result in delayed or lost control signals, reduced operational efficiency, and increased risk of wind turbine generator downtime. To tackle these challenges, the researchers propose a threshold-triggered Deep Q-Network (DQN) self-healing agent. This agent autonomically detects, analyzes, and mitigates network disruptions while adapting routing behavior and resource allocation in real time.
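As a rough illustration of the threshold-triggered idea, the sketch below shows an agent that stays idle until a monitored metric crosses a limit, then picks a mitigation action and learns from the outcome. It is not the researchers' implementation: it swaps the deep Q-network for a small tabular Q-learning table so that it runs with the standard library alone, and the threshold values, action names, observation fields, and reward are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; the article gives no numeric values.
LATENCY_THRESHOLD_MS = 10.0
TEMP_THRESHOLD_C = 70.0

# Hypothetical mitigation actions, loosely mirroring rerouting and rack cooling.
ACTIONS = ("keep_route", "reroute", "start_rack_cooling")


@dataclass(frozen=True)
class Observation:
    latency_ms: float
    switch_temp_c: float


def discretize(obs: Observation):
    """Reduce an observation to (latency_exceeded, temperature_exceeded)."""
    return (obs.latency_ms > LATENCY_THRESHOLD_MS,
            obs.switch_temp_c > TEMP_THRESHOLD_C)


def is_disrupted(obs: Observation) -> bool:
    """The trigger: true when any monitored metric exceeds its threshold."""
    return any(discretize(obs))


class ThresholdTriggeredAgent:
    """Tabular Q-learning stand-in for the DQN described in the article."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = {}  # maps (state, action) to an estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, obs: Observation) -> Optional[str]:
        if not is_disrupted(obs):
            return None  # below all thresholds: the agent is not triggered
        state = discretize(obs)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)  # explore
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))  # exploit

    def learn(self, obs, action, reward, next_obs):
        """Standard one-step Q-learning update."""
        s, s2 = discretize(obs), discretize(next_obs)
        best_next = max(self.q.get((s2, a), 0.0) for a in ACTIONS)
        old = self.q.get((s, action), 0.0)
        self.q[(s, action)] = old + self.alpha * (reward + self.gamma * best_next - old)


agent = ThresholdTriggeredAgent(epsilon=0.0)
calm = Observation(latency_ms=2.0, switch_temp_c=40.0)
hot = Observation(latency_ms=2.0, switch_temp_c=85.0)

print(agent.act(calm))  # no threshold crossed, so no action is taken
agent.learn(hot, "start_rack_cooling", reward=1.0, next_obs=calm)
print(agent.act(hot))   # the rewarded action is now preferred in this state
```

A real deployment would replace the lookup table with a neural network over continuous network and thermal measurements, and would derive the reward from measured quality-of-service recovery rather than a hand-set value.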
The DQN self-healing agent was trained, validated, and tested on an emulated tri-clustered switch network deployed in a cloud-based proof-of-concept testbed. It improved disruption recovery performance by 53.84% over a baseline shortest-path and load-balanced routing approach. In a super-spine leaf data-plane architecture, it also outperformed two state-of-the-art methods: the Adaptive Network-based Fuzzy Inference System by 13.1%, and a DQN and traffic prediction-based routing optimization method by 21.5%.
One of the key benefits of this approach is its ability to maintain switch thermal stability by proactively initiating external rack cooling when required. This proactive management of thermal conditions can help prevent potential failures and ensure the reliable operation of wind turbines.
The research has practical implications for the energy sector. By improving the resilience of industrial networks, the DQN self-healing agent could enhance the reliability and efficiency of wind power plants, which in turn could mean increased energy production, reduced downtime, and lower maintenance costs. The approach could also be applied to other mission-critical, time-sensitive applications in the energy sector, such as smart grids and other renewable energy systems.
In conclusion, the research highlights the potential of deep reinforcement learning in building resilience in software-defined industrial networks. The proposed DQN self-healing agent offers a promising solution to the challenges posed by stochastic disruptions, ensuring the reliable and timely delivery of control, monitoring, and best-effort traffic in industrial networks.
This article is based on research available at arXiv.

