Next-generation networks are expected to compute, learn, reason and respond to business needs almost autonomously, while managing the constant growth of data generated by the ever-growing number of connected, intelligent devices. Artificial intelligence (AI) plays a crucial role in automating processes, managing complexity and scale, and using information from distributed systems in real time. This article explores the technical challenges the R&D community has to tackle to help communications service providers (CSPs) and other industry players fully benefit from the capabilities of AI.
The combination of millions of connected devices, petascale computing, and advanced communication technologies that allow for real-time interactions creates systems whose complexity is far beyond what humans can fully comprehend and manage. Operating and managing these systems requires an extremely high level of intelligent automation.
One benefit of the growing size of these systems, and of the many interactions that drive their complexity, is that massive quantities of data can be collected from the components that make up the system. The collected data can be integrated into models and used to gain deep insights that improve and customize the system's behavior.
Ultimately, the system will be able to learn from the results of its own behavior.
Machine learning and other AI techniques are the most effective way to attain the higher levels of automation needed to handle this complexity and optimize system performance. Some of this work is already in progress in AI-related initiatives in standards development organizations such as the 3rd Generation Partnership Project (3GPP) and the Open Radio Access Network (O-RAN) Alliance.
AI will likely play an important role in automating new-generation systems across a range of industries. It is especially relevant for increasing the autonomy of telecom networks, which in turn enables transformation in other sectors such as transportation and manufacturing.
Around the globe, CSPs are deploying the fifth generation of 3GPP mobile networks (5G). They are also preparing for the next generation of networks, which will sense, compute, learn, reason and act on business requirements in a completely autonomous manner, moving toward fully zero-touch cognitive networks. These networks will handle the continued explosion of data generated by the ever-growing number of smart, connected devices and by an array of innovative applications at the network edge and in the cloud, in addition to complex network topologies. This will pose new challenges in complexity, scale, security and reliability for network design, deployment and operations. Traditional mobile networks are designed and operated by telecom experts who rely heavily on their extensive knowledge of network topology, subscriber mobility and usage patterns, as well as on radio propagation models, to design and configure the policies that govern the network.
5G topologies are becoming more complicated because of smaller cell sizes and advances in radio technology. Usage patterns have become harder for humans alone to predict, and radio propagation models are more difficult to compute because of additional radio spectrum bands and denser topologies. This is why AI is crucial in helping CSPs design and run 5G networks. To advance toward a zero-touch framework of cognitive networks that can reason and make decisions with a high degree of automation across complex dependencies and wide areas, AI will have to become a more integral part of the network.
Every local site provides a wealth of information on the condition of its components, time series of events, and details about the context. This data can be used to build models of local behavior, and reasoning is needed to process the data collected across different sites and derive general, system-wide insights. Ideally, the knowledge and insights gathered at one site could be used at other sites to provide better predictions. Network infrastructures are constantly evolving to accommodate the ever-growing demand for real-time capabilities, with software-controlled data pipelines that adapt to the volume, velocity and variety of real-time data, and with algorithms capable of real-time decision-making.
Adding more intelligence to services, networks and business processes will enable the transition to a data-driven approach, allowing for a greater level of automation, performance and efficiency. As autonomy increases, the CSP's role in managing the network will shift toward managing it through intents and overseeing the automation, rather than exercising direct control. Intelligent features are tailored to each service, allowing services to operate in a more robust, secure, reliable and trustworthy way and bringing mobile networks to a new level of technological innovation that benefits society and industry.
The transformation toward Industry 4.0 is well underway. The data generated by connected production units and by the items used in production enables efficiency and flexibility that were impossible before. Anomaly detection and root cause analysis are used to eliminate obstacles and boost production yield. Predictive algorithms inform maintenance plans and improve process efficiency.
5G mobile systems are being developed against requirements tailored to the needs of industry, including support for bounded latency with high reliability. This will aid the transformation.
For instance, control logic for production units that sits in ruggedized machines on the factory floor can, using a 5G connection, be moved to an existing IT infrastructure and become an integral component of manufacturing operations control.
By offloading software (SW) from devices, wireless connectivity allows for greater flexibility, improves efficiency, and eliminates the need for on-device SW management. Mobile connected devices such as automated guided vehicles (AGVs) and robots, managed from an edge-cloud infrastructure, could enable complex automated production in real time, supported by cloud-based AI. Another example is the introduction of wirelessly connected AR/VR devices. Specialists can remotely assist local maintenance and even supervise delicate processes using haptic feedback. Furthermore, with SW offloaded from the device, the battery lasts longer and the user benefits from the most powerful AR/VR applications.
Even though AI promises to automate and connect industries, enabling disruptive innovations, it comes with a host of problems: data is usually diverse and heterogeneous, and autonomous real-time decision-making is subject to hard requirements. In addition, the safety-critical nature of many applications that involve human-machine collaboration raises the requirements and increases the difficulty.
Innovations in the transportation and automotive industries have led to vehicles with increasing levels of automation. They are designed for different uses, including the transport of goods and people and the management of utilities and infrastructure. They also operate in a range of operational domains, from restricted, controlled areas (for instance harbors, mines and urban zones) to the wider public road network.
Connectivity plays a variety of roles depending on the time and geographic scale. From the perspective of a single vehicle, sharing real-time data can enhance situational awareness and enable agreement-seeking interactions between vehicles. Additionally, (edge-)cloud-based AI instances can analyze real-time data from road infrastructure and off-board sensors to monitor and augment the capabilities that a vehicle derives from its on-board sensors. Finally, the massive amount of sensor data produced by vehicles can be used centrally, even offline, to continuously improve operational safety and efficiency.
Data analytics and connectivity are essential for system-level traffic optimization. This involves optimization at various temporal and geographical scales, ranging from local traffic adjustments in response to unexpected local events to long-term flow optimization. These optimizations benefit from the ever-growing availability of real-time information within the connected IoT ecosystem, drawing on data from different domains to assess the state of connectivity, mobility and infrastructure.
Safety is a major issue for the automotive and transportation industries. While on-board sensors will continue to play a vital role in ensuring safety on public roads, connectivity and central analytics are becoming critical components for safe vehicle operation in confined areas and for the road safety of vehicles and systems on public roads. AI-based prediction of network, off-board sensor and system performance will be an essential component of automated vehicle operations. Automation is critical to fast failure detection and mitigation throughout the system. In essence, vehicle safety design should take cloud- and network-based functions into account, and data-driven techniques will be part of the solution, creating an efficient and safe system.
AI is useful to nearly all industries, helping them achieve efficiency through intelligent and adaptive automation. The examples above illustrate aspects of AI that are shared across large industries.
Five major challenges must be resolved before AI is widely accepted as a practical option for the intelligent automation of complex systems that operate in near real time. Solutions will require strong domain expertise and a deep understanding of fundamental connectivity and communication issues.
Zero-touch operations imply less human involvement and therefore greater network autonomy, which is needed to handle the ever-growing complexity and scale of the system and the shrinking time available to make decisions. One of the most important advantages of zero-touch operations is self-adapting, self-learning automation, in which unforeseen situations, intents and demands are handled without human intervention.
Transforming network operations from a mode in which decisions are made by humans and carried out by machines to one in which decisions are made entirely by machines requires the machines to understand the situation and relate it to background knowledge. Machines must be given greater autonomy by moving from imperative (what to do) to declarative (what to achieve) objective specifications, and by providing them with the relevant domain knowledge. These declarative objective specifications are referred to as intents. Intents may be used to express functional and nonfunctional requirements, objectives and constraints.
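As an illustration of what a declarative intent might look like in practice, the following minimal Python sketch states only what should be achieved and leaves the "how" to the system. The `Intent` class, its fields, the KPI names and the `unmet_targets` helper are all hypothetical and chosen for this example; they do not correspond to any standardized intent model.

```python
from dataclasses import dataclass, field
import operator

# Comparison operators that a KPI target in this sketch may use (an assumption).
_OPS = {"<=": operator.le, ">=": operator.ge}


@dataclass(frozen=True)
class Intent:
    """Hypothetical declarative intent: what to achieve, not how to achieve it."""
    objective: str                                  # functional requirement
    targets: dict = field(default_factory=dict)     # KPI name -> (operator, bound)


def unmet_targets(intent, observed):
    """Return the KPIs whose observed value violates the intent.

    A KPI that is missing from the observations is treated as unmet.
    """
    unmet = []
    for kpi, (op, bound) in intent.targets.items():
        value = observed.get(kpi)
        if value is None or not _OPS[op](value, bound):
            unmet.append(kpi)
    return unmet


# Illustrative values only.
intent = Intent(
    objective="video service for factory cameras",
    targets={"latency_ms": ("<=", 10.0), "availability": (">=", 0.999)},
)
observed = {"latency_ms": 14.2, "availability": 0.9995}
print(unmet_targets(intent, observed))  # ['latency_ms']
```

The intent carries no instructions about which parameters to change; it only lets the system detect that the latency target is currently not met.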
Knowledge-intensive systems employ formal models to understand situations and act on autonomously made decisions. These models can be obtained in several ways: learned from data through machine learning (ML), provided by domain experts, or derived automatically through an inference process based on logical reasoning.
The traditional data science approach of training a single model on all available data has drawbacks. First, it might not scale and can become infeasible: the state vector can have many dimensions, and the number of scenario combinations to cover in training and at inference time can be very large. Second, it cannot handle unforeseen situations, since there is no data to train on. Instead, we must split the problem into smaller ones and orchestrate a range of more specialized methods, that is, employ an agent-based approach. Agents could be conventional models, ML components, expert rules and so on. Agent orchestration can be controlled by a continuous optimization loop that tries to close the gap between the actual and desired (specified through intent) states, based on the system's understanding of the state of the network.
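The orchestration loop described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a reference design: the state is reduced to a `capacity` and a `demand` value, the intent to a single hypothetical `max_utilization` bound, and the two agents (`rule_agent`, `model_agent`) are stand-ins for an expert rule and an ML component.

```python
def gap(state, intent):
    """Distance between actual and desired state: utilization above the target."""
    utilization = state["demand"] / state["capacity"]
    return max(0.0, utilization - intent["max_utilization"])


def rule_agent(state):
    """Expert rule (hypothetical): add a single unit of capacity."""
    return {**state, "capacity": state["capacity"] + 1}


def model_agent(state):
    """Stand-in for an ML component (hypothetical): double the capacity."""
    return {**state, "capacity": state["capacity"] * 2}


def orchestrate(state, intent, agents, max_steps=10):
    """Closed loop: keep applying the best agent proposal until the gap is closed."""
    history = []
    for _ in range(max_steps):
        if gap(state, intent) == 0.0:
            break
        proposals = [(agent.__name__, agent(state)) for agent in agents]
        # Prefer the smallest remaining gap; break ties by the lowest capacity (cost).
        name, state = min(
            proposals, key=lambda p: (gap(p[1], intent), p[1]["capacity"])
        )
        history.append(name)
    return state, history


final_state, used = orchestrate(
    {"capacity": 6, "demand": 10},
    {"max_utilization": 0.8},
    [rule_agent, model_agent],
)
print(final_state, used)
# {'capacity': 13, 'demand': 10} ['model_agent', 'rule_agent']
```

In this run the coarse agent closes most of the gap and the fine-grained rule finishes the job, which illustrates why orchestrating several specialized agents can work better than relying on a single one.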
Hybrid approaches can be beneficial in the next generation of intelligent systems. Robust learning of complex models can be combined with symbolic logic, which offers knowledge representation, reasoning and explanation capabilities. The knowledge could include universal laws of physics or the best-known methods in a particular domain.
Intelligent systems should be able to make decisions independently to meet their goals, solve a problem in more than one way, and decide flexibly by drawing on a variety of pre-loaded and acquired knowledge.
In a large network, decisions have to be taken at various levels and locations. Some decisions are based on local data and are governed by tight, low-latency control loops. Others are more strategic, affect the entire system, and use data gathered from many sources. Such global decisions may still require immediate responses in critical situations such as power grid failures or cascading node failures. The technology used to automate these large and complex systems should reflect their distributed nature and support the management topology.
Data generated at the edge, either in a device or in a network edge node, may need to be processed locally. It is not always possible to transfer data to a centralized cloud: there could be laws regarding where data may reside, as well as security or privacy concerns about data transfers. The scope of the decisions made in these cases is restricted to a limited area, so the algorithms and computing capabilities required are typically lightweight and fast. However, local models may be based on insufficient and biased data, which can lead to reduced performance. It is necessary to take advantage of the scale of distribution, create appropriate abstractions of local models, and transfer the insights to other local models.
It is also possible to learn global data patterns across multiple networked devices or nodes without direct access to the underlying data. Federated learning has paved the way in this direction, and more distributed training patterns such as vertical federated learning and split learning have emerged. These architectures allow machine-learning models to adapt their deployment to the constraints they face in terms of data transfer, computation, memory and network resource consumption, while still providing good performance guarantees. However, further research is required, particularly to accommodate different types of models and model combinations and to provide stronger privacy guarantees.
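A minimal sketch of the federated averaging idea, assuming a deliberately simple one-parameter model y = w * x, may help make this concrete. Each site trains on its own data and shares only the resulting weight; the server averages the weights, weighted by the number of local samples. The function names, learning rate and data are illustrative assumptions, and real deployments add secure aggregation, client sampling and much more.

```python
def local_update(w, data, lr=0.05, epochs=5):
    """Gradient descent on one site's private (x, y) pairs for the model y = w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by local sample count."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total


def train(clients, rounds=20):
    """Run several rounds; only weights leave the sites, never the raw data."""
    w = 0.0
    for _ in range(rounds):
        local_weights = [local_update(w, data) for data in clients]
        w = federated_average(local_weights, [len(data) for data in clients])
    return w


# Two sites whose private data both follow y = 2x (illustrative values).
clients = [
    [(1.0, 2.0), (2.0, 4.0)],  # site A
    [(3.0, 6.0)],              # site B
]
global_w = train(clients)
print(round(global_w, 6))  # 2.0
```

The global weight converges to the shared underlying slope even though neither the server nor the other site ever sees a site's raw samples.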
A common, decentralized and distributed paradigm is needed to make the most efficient use of global and local models and data, and to determine how best to distribute learning and reasoning across nodes to meet extreme latency requirements. Such paradigms can themselves be developed with machine learning and other AI techniques so that they incorporate self-management, self-optimization and self-evolution.
AI-based autonomous systems comprise complex models and algorithms and, in addition, evolve over time, incorporating new data and knowledge without manual intervention. The dependence on data, the complexity of the algorithms and the risk of unanticipated behavior call for new approaches to ensure transparency, explainability, technical robustness and safety, privacy and data governance, fairness and nondiscrimination, human oversight, societal and environmental well-being, and accountability. These aspects are vital for humans to understand AI-based systems and thereby build calibrated trust in them.
Explainable AI (XAI) is used to make AI-based systems transparent by explaining to the user why the AI algorithm reached a specific conclusion. XAI methods apply to a variety of AI techniques, such as supervised learning, reinforcement learning (RL), machine reasoning and others. XAI is recognized as an essential feature when deploying AI models in systems, to protect the basic rights of users affected by AI decision-making. It is crucial in telecommunications, where standardization organizations such as ETSI and IEEE stress the need for XAI to ensure the security of intelligent communication systems.
The constantly evolving nature of AI models demands either novel methods or enhancements to existing processes to ensure the robustness and safety of AI models during both training and deployment in the real world. In addition to the statistical guarantees provided by adversarial robustness, formal verification methods could be adapted to give deterministic safety guarantees for safety-critical AI-based systems. Security is one of the main contributors to robustness: both data and models must be protected from malicious attacks. The privacy of the data, with respect to both its source and its intended use, should be preserved. The algorithms themselves must not leak private information. In addition, data must be validated for fairness and against domain expectations, because of the bias it may introduce into AI decision-making.
Since the users of AI systems are ultimately human, methods such as those based on causality and data provenance need to be developed to make decisions accountable. AI systems must be designed to continuously learn and refine the stakeholder requirements they are expected to meet, and to escalate to a higher level of automated decision-making, or to a human, when they lack sufficient confidence in certain decisions.
Connected, intelligent machines of all kinds, including collaborative robots (cobots), are becoming more and more prevalent in our lives. For effective collaboration, these machines must understand human intentions and needs in detail. Additionally, all knowledge related to these machines needs to be readily available to enable context awareness. AI is essential throughout this process to enhance the capabilities of, and the cooperation between, machines and humans.
Recent advances in natural language processing and computer vision have enabled machines to interpret human input more precisely. This is achieved by taking nonverbal communication into account, for example body language and tone of voice. Accurate emotion detection is maturing and may help identify more complex states, such as fatigue and distraction. Additionally, advances in areas like scene understanding and semantic information extraction are essential for fully understanding the surrounding environment. The machine should use all the perceived information to determine the course of action that maximizes collaboration. Reinforcement learning (RL), in which a system is trained to choose the most appropriate action given the current state and its observations of the environment, is gaining increasing attention, and approaches such as safe AI are being investigated to ensure safety throughout the RL model's life cycle and avoid dangerous situations. RL is discussed in more detail later in this article.
AI has also made it possible to better understand how machines work through digital twins. Extended reality (XR) devices are increasingly used in mixed-reality systems to visualize detailed information about machines and, at the same time, interact directly with their digital twins. This helps humans understand how machines operate and anticipate their actions. Combined with the XR interface, XAI can explain any given decision taken by a machine.
To make collaboration possible, machines must be able to interact with and respond to human beings promptly. Since the AI techniques involved in a collaborative system may be computationally intensive and devices may have limited hardware resources, a distributed intelligence solution is required to deliver real-time responses. This means that the communication infrastructure plays an essential role in the entire process by providing highly reliable, low-latency networks.
RL is a machine-learning method that builds a decision-making model in a data-driven way. An RL agent learns by taking actions in an environment. It both exploits its existing knowledge and explores the environment through random actions to acquire new knowledge. By observing the outcomes of these actions, such as rewards, the agent adjusts its policy to maximize some notion of cumulative reward.
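The interplay of exploitation, random exploration and reward-driven updates can be sketched with the simplest possible case: a single-state problem (a multi-armed bandit) solved with an epsilon-greedy policy. The reward values, step count, exploration rate `epsilon`, learning rate `alpha` and the function name are assumptions made for this illustration; a full RL problem would also involve state transitions and discounting.

```python
import random


def train_bandit(rewards, steps=2000, epsilon=0.2, alpha=0.1, seed=0):
    """Epsilon-greedy value learning for a single-state problem.

    rewards[a] is the (here deterministic) reward for taking action a.
    """
    rng = random.Random(seed)
    q = [0.0] * len(rewards)  # estimated value of each action
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(rewards))             # explore: random action
        else:
            action = max(range(len(q)), key=q.__getitem__)   # exploit: best known action
        reward = rewards[action]
        q[action] += alpha * (reward - q[action])            # move estimate toward reward
    return q


q_values = train_bandit([0.2, 1.0, 0.5])
best_action = max(range(len(q_values)), key=q_values.__getitem__)
print(best_action)
```

Without the exploration branch the agent would keep choosing the first action it tried; occasional random actions are what allow it to discover that another action pays more.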
One of the main reasons for the success of RL in virtual settings is that agents can be trained by competing against each other and by freely exploring the range of available actions. More research is needed to determine how to make the most of RL and other cutting-edge algorithms and to tailor them to the specific needs and constraints of industrial settings. One approach is to design a safe exploration method that explores the environment in a controlled and secure way, for instance by identifying the regions of the state-action space or the time slots in which certain actions are permitted and restricting exploration to them. Another alternative is to train an agent in a virtual environment such as a simulator or emulator, in which the agent is free to explore without restrictions, and then transfer what it has learned to the real environment, an approach known as Sim2Real. Two common Sim2Real techniques are domain randomization, in which the parameters of the simulation are randomized so that training generalizes, and domain adaptation, in which the model trained in simulation is adapted to the real-world domain. A further option is offline RL, where the agent is trained from a static dataset. The dataset should contain trajectories of states, actions and rewards generated by a logging policy. Because the agent cannot interact with the environment during training, one of the major challenges is to ensure that the learned policy is not biased by the dataset, which typically contains limited and partial information.
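The first of these approaches, restricting exploration to permitted parts of the state-action space, is often implemented as action masking. The sketch below assumes a hypothetical safety rule (`is_safe`) and hypothetical action names for a radio cell; it is meant only to show where the restriction enters an epsilon-greedy policy, not how safety rules should be derived.

```python
import random

ACTIONS = ["power_down", "hold", "power_up"]  # hypothetical action set


def is_safe(state, action):
    """Hypothetical safety rule: never power down a cell while its load is high."""
    return not (state == "high" and action == "power_down")


def safe_epsilon_greedy(q, state, actions, is_safe, epsilon, rng):
    """Epsilon-greedy choice restricted to the actions permitted in this state."""
    candidates = [a for a in actions if is_safe(state, a)]
    if not candidates:
        raise ValueError("no safe action available in this state")
    if rng.random() < epsilon:
        return rng.choice(candidates)                              # explore, safely
    return max(candidates, key=lambda a: q.get((state, a), 0.0))  # exploit, safely


rng = random.Random(0)
# Even if the value estimates favor the unsafe action, it is never selected.
q = {("high", "power_down"): 5.0, ("low", "power_down"): 5.0}
print(safe_epsilon_greedy(q, "high", ACTIONS, is_safe, 0.0, rng))  # hold
print(safe_epsilon_greedy(q, "low", ACTIONS, is_safe, 0.0, rng))   # power_down
```

Because the mask is applied before both the random and the greedy choice, the agent can keep learning in the live system without ever trying an action the rule forbids.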
In industrial systems such as telecom networks, it is not always feasible to explore every action while the system is in operation, because an unfavorable state-action combination could degrade performance. The methods above are not mutually exclusive but complementary. With further research on them, the exploration problem can be solved and the potential of RL can be realized in industrial processes.
Based on our research into the intelligent mobile networks of the near future and on ongoing developments in industrial automation, this article has identified five technical challenges that need to be tackled to make the most of the opportunities offered by AI technologies. The technical challenges are:
systems that can make decisions autonomously,
decentralized and distributed intelligence,
trustworthy AI,
human-machine interaction, and
bringing learning from games and simulations to industrial scale.
These issues are best solved through strategies built on solid domain expertise, which includes understanding fundamental connectivity and communication issues.