Post

[Reading Note] A Survey on Network Management for xANET

[Reading Note] A Survey on Network Management for xANET

Original Link

Since I’m a Production Engineer in the Backbone Release Engineering team, I’m curious about how AI could assist in my job and whether PE will be replaced by AI. After reading this survey on the evolution and future of xANET management, I gained knowledge about the management challenges of the highly dynamic xANET and how AI can optimize the autonomy and automation of network topologies.

Follow me on Facebook to get an update when my new post comes out.

The Characteristics and Applications of xANET

The xANET is a self-organizing, decentralized, distributed, and multi-hop wireless network without relying on fixed infrastructure.

The network allows end systems like personal computers, smartphones, and smart home devices to communicate with each other. To manage wireless services among different systems, we need to understand the concept of the x ad hoc network (xANET) — a family of ad hoc networks. The xANET include: mobile ad hoc network (MANET) vehicular ad hoc network (VANET) flying ad hoc network (FANET) satellite ad hoc network (SANET)

A certain end system can utilize multiple communication networks. For example, an Embodied Intelligence may use BAN (Body Area Network), which allows sensors and executors to communicate, WAN/LAN, which allows the robots to connect to edge nodes through Wi-Fi or Ethernet, and WWAN (Wireless Wide Area Network), which provides remote communication or cloud services through 4G/5G network.

Ad hoc networks are independent from infrastructure.

  • Infrastructure-dependent communication modes: These require a central device or network infrastructure to establish communication paths—for example, cellular networks or Wi-Fi.
  • Infrastructure-independent communication modes (e.g., Ad Hoc networks): These are dynamically self-organizing networks where nodes communicate directly or via intermediate nodes. They are well-suited for scenarios like disaster recovery, military operations, and drone swarms.

Basic Concepts of Network Management

We should focus on these aspects of network management:

Fault Management:

  • Fault identification and recovery
  • Fault indicators (KPIs): time required to identify faults, the network downtime length, and the fault recovery effectiveness.
  • Solutions: Automatic routing recalculation, resource reassignment from fault nodes to nearby nodes. (侧重临时恢复,“救火”)

Performance Management:

  • Efficient data transmission and resource utilization that affect quality of services (QoS).
  • Performance indicators (KPIs): Throughput, delay/latency, and packet delivery rates, Packet Loss Rate(丢包率), Jitter(抖动),Congestion(拥堵).
  • Solutions: Energy-aware routing protocols, Network topology monitoring to track node locations and connectivity, resource allocation. (侧重持续调优,“监控”)

Configuration Management:

  • Ensures that nodes are set up to meet specific tasks and environmental conditions.
  • Evaluating metrics (KPIs): the duration of configuration processes, adaptability to dynamic conditions, and efficient resource usage.
  • Solutions: Autoconfiguration protocols, Resource-aware configuration protocols

Security Management:

  • Protects against unauthorized access and malicious activities, such as disinformation and denial-of-service (DoS)
  • KPIs: response time to security breaches, data integrity rates, system availability, false positive rates (误报率)
  • Solutions: Real-time traffic monitoring, Security policies, lightweight encryption algorithms (加密算法), integrating advanced technologies such as blockchain and AI.

The Challenges of xANET Management

xANET faces a couple of challenges including:

  • Dynamic network management: promptly update network topologies according to the changes of node status. The dynamic topology imposes challenges on link stability, fast fault detection and recovery mechanisms.
  • Scalability: bandwidth and power are limited. Minimizing network management overhead is essential to save energy.
  • Security: handle eavesdropping, sabotage, and intrusion, etc. Methods include integrating encryption and authentication mechanisms.
  • Intelligent and automated management: for flexible scheduling of network resources.

To sum up, compared with traditional networks, xANETs are more complex, unpredictable, highly dynamic with limited resources and poor security. Luckily, the development of AI, software-defined network (SDN), blockchain offers new approaches to manage complex networks.

Approaches to address the challenges - The best is IDNM

With the challenges of xANET in mind, let’s take a look at some common network management approaches.

The conventional simple network management protocol (SNMP) adopts centralized management, but fails to meet the scalability requirements (too many queries generate signaling overhead).

  • Manager-Agent Model: The management system manages multiple agents (一对多的主从结构)
  • Periodic Polling (周期性查询): can introduce latency
  • MIB (Management Information Base): include parameters and status
  • Suitable for simple, static networks.

The policy-based network management (PBNM) uses static and complex management policies, which cannot adapt to dynamic changes.

  • Suitable for medium-sized enterprise networks.

The intent driven network management (IDNM) introduces automated and intelligent management, which adapts to scalable and high dynamic requirements of xANET. (类似面向对象中的抽象和封装)

  • User specifies high-level objectives without caring about the low-level implementation.
  • Intent: can be technical or non-technical. i.e.specified path of data packets, data packet classification, processing functions, QoS intents, and set bandwidth requirements.
  • IDN automatically completes intent translation, policy mapping, configuration, and fault monitoring based on real-time network changes.
  • IDNM challenge: more intelligent intent translation and policy generation mechanisms, optimizing the architecture’s generalizability and dynamic responsiveness.
  • Suitable for dynamic, complex, large-scale networks.

The article then explores the usage of the three management approaches in different types of xANET, supported by references from other related papers.

Future Directions of xANET

This is the part I’m most interested in. It proposes the future research directions of network management, especially how it integrates with Gen-AI and blockchain.

Let’s throw back to Current Challenges of xANET management:

  • Fully automated network management systems
    • Challenges: delays in recovery (could lead to severe service degradation), dynamic topologies, unpredictable failures, and limited resources.
    • Direction: direct, diagnose, recover failures across all layers of the network (physical, application, etc.)
  • Network congestion due to increasing nodes (control messages)
    • Challenges: The control traffic overload can result in network congestion, which in turn degrades the overall service quality, causing delays in routing decisions, increased packet loss, and diminished throughput.
    • Direction: find a balance between timely network awareness and minimize traffic overhead
  • QoS Assurance in dynamic environments
    • Challenges: Traditional QoS mechanisms such as RSVP and DiffServ face limitations
    • Direction: rapid adjustment to network change, QoS policy generation, routing decisions, resource reallocation and automation

To address these challenges in the future, we can resort to AI, edge computing, or blockchain to increase the autonomy and automation of network management:

  • Based on the current network state, use AI algorithms to update network topologies and optimize performance, i.e.: Maximize throughput, Minimize latency
  • AI models to predict network failures, adjust routing protocols or reallocate resources to prevent disruptions.
  • Gen-AI such as generative pre-trained transformers (GPT) to automate policy generation and automation.
  • Edge-AI for Localized Decision Making, local network health monitoring, and autonomous configuration.
  • Cloud-Assisted Global Network Orchestration
  • Blockchain for Trust and Security

So here comes a question - Will Production Engineer (PE) be replaced by AI in the future? My answer is no. Here are my opinions:

  • AI models have boundaries. PEs know the boundaries and to what extent they could integrate AI models in networks.
  • PEs are responsible for “outliers” in network errors. For example, when an unexpected error occurs, AI may overreact and force all traffic to reroute and cause network oscillation (route jitter). In this situation, manual intervention is required.
  • AI has no intent. Therefore, when it comes to overall architecture design, PEs play the role in defining the overarching strategies and boundaries, while AI determines the corresponding parameters. This is how “intent” works in “Intent-Driven Network Management (IDNM)”.

For next steps, I will read more papers on:

  • The challenges of traditional end2end networks and how AI can address these challenges.
  • The bottlenecks of communications between GPUs in the AGI era.
    • AI Networking Requirements: Low-latency, high-bandwidth multicast communication is required between thousands of GPUs.
    • Tools: ROCKET、PARAM、Chakra
  • How ML and Deep Learning enable systems to adapt to network changes, and optimize decisions in real-time.

I researched some case studies on the collaboration between PE and AI. I’ll list them here and maybe I’ll write a survey on these case studies. Stay tuned to my blogs, my friends.

This post is licensed under CC BY 4.0 by the author.

Trending Tags