What is “AI edge” and what is it used for?

To analyze the enormous volumes of data generated by the many sensors that now populate our lives, from dishwashers to cars, not to mention our phones, we typically send that data to the cloud. Edge computing is developing to enable faster and more private processing. Its AI counterpart is edge AI, a way of doing AI without relying on the cloud. An expert explains.


Sensors are everywhere we go: in homes, offices, hospitals, transport systems, and farms. They offer the potential to improve public safety and quality of life.

The “Internet of Things” (IoT) includes temperature and air quality sensors that aim to improve indoor comfort, wearable sensors to monitor health, lidars and radars to improve traffic flow, and detectors that allow for rapid intervention in the event of a fire.

These devices generate enormous volumes of data, which are used to train artificial intelligence models. These models learn a representation of the sensor’s operating environment in order to improve the system’s performance.

For example, connectivity data from Wi-Fi access points or Bluetooth beacons deployed in large buildings can be analyzed using AI algorithms to identify occupancy and movement patterns at different times of the year and for different types of events, depending on the building type (e.g., office, hospital, or university). These patterns can then be used to optimize heating, ventilation, exhaust systems, and so on.
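As a toy illustration of this kind of analysis, the sketch below aggregates hypothetical Wi-Fi association counts per hour and flags above-average periods as candidates for adjusting heating or ventilation. All access-point names and counts are invented for the example:

```python
from collections import defaultdict

# Hypothetical Wi-Fi association log: (hour_of_day, access_point, device_count).
log = [
    (8, "AP-lobby", 12), (9, "AP-lobby", 45), (9, "AP-floor2", 30),
    (12, "AP-cafeteria", 80), (18, "AP-lobby", 10),
]

# Aggregate device counts per hour to expose an occupancy profile.
occupancy = defaultdict(int)
for hour, _ap, count in log:
    occupancy[hour] += count

# Flag hours whose occupancy exceeds the daily mean: candidate periods
# for boosting ventilation or pre-heating the space.
mean = sum(occupancy.values()) / len(occupancy)
busy_hours = sorted(h for h, c in occupancy.items() if c > mean)
print(busy_hours)  # → [9, 12]
```

A real deployment would use far richer features (day of week, event calendars, building type) and a learned model rather than a mean threshold, but the pipeline shape is the same: aggregate, profile, act.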

Combining the Internet of Things and artificial intelligence comes with technical challenges.

Artificial Intelligence of Things (AIoT) combines AI and IoT. Its aim is to better optimize and automate interconnected systems, paving the way for intelligent decision-making. Indeed, AIoT systems rely on real-world, large-scale data to improve the accuracy and robustness of their predictions.

But to extract information from the data collected by IoT sensors, this data must be collected, processed, and managed effectively.

To do this, we generally use “cloud platforms” (such as Amazon Web Services or Google Cloud Platform), which host computationally intensive AI models, including recent foundation models.

What are foundation models?

  • Foundation models are a type of machine learning model trained on general-purpose data and designed to adapt to a variety of downstream tasks. They include large language models (LLMs), which primarily deal with textual data, as well as so-called “multimodal” models that can work with images, audio, video, and time-series data.
  • In generative AI, foundation models serve as the basis for generating content (text, images, audio, or code).
  • Unlike conventional AI systems that rely on task-specific datasets and extensive preprocessing, foundation models have the ability to learn from few or no examples (known as “few-shot learning” and “zero-shot learning,” respectively). This allows them to adapt to new tasks and domains with minimal customization.
  • Although foundation models are still in their early stages, they have great potential for creating value for companies in all sectors: their rise therefore marks a paradigm shift in the field of applied artificial intelligence.

The limitations of the cloud for processing IoT data

Hosting heavy AI systems or foundation models on cloud platforms offers the advantage of abundant computing resources, but it also has several limitations.

In particular, transmitting large volumes of IoT data to the cloud can significantly increase the response times of AIoT applications, with delays ranging from a few hundred milliseconds to several seconds depending on network conditions and data volume.
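A back-of-the-envelope estimate shows where those delays come from. The figures below (payload size, uplink rate, round-trip time) are illustrative assumptions, not measurements:

```python
def cloud_round_trip_ms(payload_bytes, uplink_mbps, rtt_ms):
    """Rough end-to-end latency: network round trip plus transmission time.
    Ignores cloud-side inference time and queuing, so it is a lower bound."""
    transmit_ms = payload_bytes * 8 / (uplink_mbps * 1e6) * 1000
    return rtt_ms + transmit_ms

# A hypothetical 500 kB sensor frame over a 10 Mbit/s uplink
# with a 50 ms round trip already costs ~450 ms before any inference runs.
print(cloud_round_trip_ms(500_000, 10, 50))  # → 450.0
```

Even this optimistic lower bound lands in the “few hundred milliseconds” range the article mentions; congestion, retransmissions, and model inference push it toward seconds.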

Furthermore, transferring data to the cloud, especially sensitive or private information, raises privacy concerns. It is generally considered that the ideal approach for privacy is to process data locally, close to end users, to limit data transfers.

For example, in a smart home, data from smart meters or lighting controls can reveal occupancy patterns or enable location tracking within the home (for example, detecting that Helen is usually in the kitchen at 8:30 a.m. preparing breakfast). It is best if these inferences are made close to the data source to minimize delays related to communication between the edge and the cloud, and to reduce the exposure of private information on third-party cloud platforms.

What are “edge computing” and “edge AI”?

To reduce latency and improve data privacy, edge computing is a good option because it provides computing resources (i.e., devices with memory and processing capabilities) closer to IoT devices and end users, typically in the same building, on local gateways, or in nearby micro data centers.

However, these so-called “edge” resources are significantly more limited in terms of processing power, memory, and storage than centralized cloud platforms, which poses challenges for deploying complex AI models in distributed environments.

The emerging field of edge AI, particularly active in Europe, seeks to address this by developing methods for the efficient execution of AI tasks on these more frugal resources.

One of these methods is split computing, which partitions a deep learning model across several nodes within the same space (e.g., a building), or even across different neighborhoods or cities.
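A minimal sketch of the idea, with a toy three-stage “model” standing in for a real deep network: the device runs the first layers, ships only the intermediate result over the network, and an edge node finishes the computation. All layers and values here are invented for illustration:

```python
# Toy "network": three stages applied in sequence (stand-ins for real layers).
layers = [lambda x: 2 * x + 1, lambda x: x * x, lambda x: x - 3]

def run_on_device(x, split):
    """Run the head of the model (layers before the split point) on the device."""
    for layer in layers[:split]:
        x = layer(x)
    return x  # this intermediate activation is what crosses the network

def run_on_edge(x, split):
    """Run the tail of the model (layers after the split point) on an edge node."""
    for layer in layers[split:]:
        x = layer(x)
    return x

intermediate = run_on_device(3, split=1)       # device computes 2*3 + 1 = 7
result = run_on_edge(intermediate, split=1)    # edge computes 7*7 - 3 = 46
print(result)  # → 46
```

Choosing the split point is the hard part in practice: it trades the device’s compute budget against the size of the intermediate activation that must be transmitted.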

The complexity increases further with the integration of foundation models, which make the design and execution of split computing strategies even more difficult.

What changes does this imply in terms of energy consumption, privacy, and speed?

Edge computing significantly improves response times by processing data closer to end users, eliminating the need to transmit information to distant cloud data centers. This approach also enhances privacy, especially with the advent of edge AI techniques.

For example, federated learning allows machine learning models to be trained directly on local devices, including newer IoT devices with sufficient processing capabilities. Raw data remains on the device, while only updates to the AI models are transmitted to edge or cloud platforms, where they can be aggregated and undergo the final training phase.
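The core aggregation step used by many federated learning systems, federated averaging (FedAvg), can be sketched in a few lines. The weight vectors and sample counts below are invented; only these updates, never the raw data, would leave each device:

```python
def fed_avg(updates):
    """Average local model weights, weighted by each device's dataset size.

    updates: list of (local_weights, num_samples) pairs reported by devices.
    """
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]

# Three hypothetical devices report locally trained weight vectors.
global_weights = fed_avg([
    ([1.0, 2.0], 100),   # device A, trained on 100 samples
    ([3.0, 0.0], 300),   # device B, trained on 300 samples
    ([2.0, 2.0], 100),   # device C, trained on 100 samples
])
print(global_weights)  # → [2.4, 0.8]
```

Device B’s update dominates because it saw three times as much data; the server then redistributes the averaged weights for the next training round.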

Confidentiality is also preserved during inference. Indeed, once trained, AI models can be deployed on distributed computing resources (at the edge), which allows data to be processed locally without exposing it to cloud infrastructure.

This is particularly useful for companies that want to leverage large language models within their own infrastructures. For example, these models can be used to answer queries about the operating status of industrial machinery, or to predict maintenance needs based on sensor data: use cases that involve sensitive and confidential information. Keeping queries and responses within the organization helps protect that information and meet confidentiality and compliance requirements.

How does it work?

Unlike mature cloud platforms, such as Amazon Web Services and Google Cloud, there is currently no well-established platform to support the large-scale deployment of edge applications and services.

However, telecommunications providers are beginning to leverage existing local resources at antenna sites to offer computing capacity closer to end users. Managing these distributed resources remains challenging due to their variability and heterogeneity, often involving numerous servers and low-capacity devices.

In my opinion, the complexity of maintenance is a major obstacle to the deployment of edge AI services. But the field is progressing rapidly, with promising avenues for improving the use and management of distributed resources.

Resource allocation across the IoT-Edge-Cloud continuum for secure and efficient AIoT applications

To enable reliable and efficient deployment of AIoT systems in smart spaces (such as homes, offices, industrial sites, and hospitals), our research group, in collaboration with partners across Europe, is developing an AI-based framework as part of the Horizon Europe PANDORA project.

PANDORA provides AI-as-a-Service (AIaaS) models tailored to end-user needs (e.g., latency, accuracy, power consumption). These models can be trained either at design time or at runtime, using data collected from IoT devices deployed in smart spaces.

PANDORA also offers computing as a service (CaaS) across the entire IoT-Edge-Cloud continuum to support the deployment of AI models. The framework manages the complete lifecycle of the AI model, ensuring continuous, robust, and intent-driven operation of AIoT applications for end users.

At runtime, AIoT applications are dynamically deployed across the entire IoT-Edge-Cloud continuum, based on performance metrics such as energy efficiency, latency, and compute capacity. The CaaS intelligently allocates workloads to the most appropriate resources at the appropriate level (IoT, Edge, or Cloud), maximizing resource utilization. Models are selected based on domain-specific requirements (e.g., minimizing power consumption or reducing inference time) and are continuously monitored and updated to maintain optimal performance.
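As an illustrative sketch only, and not the actual PANDORA scheduler, placement across the continuum can be thought of as intent-driven filtering and scoring: keep the tiers that satisfy the application’s latency and compute intents, then pick the cheapest of those by power. The tier figures below are made-up placeholders:

```python
# Hypothetical per-tier characteristics of the IoT-Edge-Cloud continuum.
tiers = {
    "iot":   {"latency_ms": 5,   "power_w": 2,   "compute_gflops": 1},
    "edge":  {"latency_ms": 20,  "power_w": 50,  "compute_gflops": 100},
    "cloud": {"latency_ms": 150, "power_w": 400, "compute_gflops": 10000},
}

def place(max_latency_ms, min_gflops):
    """Pick the lowest-power tier meeting the latency and compute intents."""
    feasible = [
        (name, spec) for name, spec in tiers.items()
        if spec["latency_ms"] <= max_latency_ms
        and spec["compute_gflops"] >= min_gflops
    ]
    if not feasible:
        return None  # no tier satisfies the intents
    return min(feasible, key=lambda item: item[1]["power_w"])[0]

print(place(max_latency_ms=50, min_gflops=10))     # → edge
print(place(max_latency_ms=500, min_gflops=1000))  # → cloud
```

A latency-sensitive inference task that exceeds the IoT device’s compute lands on the edge; a heavyweight but latency-tolerant job falls through to the cloud, mirroring the intent-driven allocation described above.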

Author Bio: Georgios Bouloukakis is Assistant Professor at the University of Patras; Institut Mines-Télécom (IMT)
