July 2021 Guest Opinion: To Cloud or not to Cloud; A Crossroad for Executives Managing OT Networks

July 28, 2021
Where compute functionality resides has shifted back and forth over the past few decades, from centralized (remember platforms like mainframes and thin clients?) to distributed. Of late, there has been a lot of discussion about the rise of Edge Computing. Rather than an end state, this is just another milestone in the ongoing evolution of system-of-systems architectures. In this article, I am going to make an argument for a shift, over time, back towards the Cloud.

When Industry 4.0 started, the primary concept was really about software-defined everything. The fourth industrial revolution was expected to deliver even more automation than the third by bridging the physical and digital worlds. Accomplishing this required a shift from centralized, fixed industrial controls to controls that could adapt to changing market needs and/or feedback from the environment itself. What this meant was a shift toward software-defined systems. PLCs, originally conceived as physical devices with dedicated I/O, will now become container workloads on a larger platform. At a much higher level, this inherently changes how the physical world, the digital world and humans interact. Machines built from dedicated controllers that are never updated or changed will now be driven by software-defined industrial PCs that can both drive the machines and understand and adapt to their surroundings.

There is great innovation underway among cloud providers right now. One might say we are in the Cloud Wars. If I focus on North America and Europe for a moment (i.e. exclude the ecosystem in China with Alibaba, Baidu and Tencent), the three leading cloud providers are pushing forward with increasingly innovative and complete products. They have also recognized end customers' concern about being tied to a single cloud provider. As an example, Google’s Anthos software platform, announced in 2019, offers a single, consistent way of managing Kubernetes workloads across on-prem and public cloud environments.

For the OT executive, connectivity to this type of functionality offers tantalizing prospects for system effectiveness through access to various services, including data lakes, streaming analytics, data storage, IoT security management, and monitoring. We are hearing from customers that the implementation of similar functionality on-prem can be two to three times more expensive. We believe that the cost gap will continue to grow.

IT organizations in almost every industry are transitioning or have transitioned to leveraging cloud services. OT operators have, however, been slow to adopt cloud-based techniques. Even though moving to the Cloud relieves the OT operator of maintenance tasks such as provisioning, installation, updates, and patches, they still want to keep control and limit their exposure to cybersecurity vulnerabilities. In part, this is because the conversation clashes with a culture ingrained in OT leaders: stay away from the influence of IT organizations and remain independent in the procurement, support and management of their technology infrastructure.

Some operators realize that, in the face of increasing cost pressures, moving to the Cloud could simplify their operations and give them more flexibility to scale up and down. In the manufacturing industry, we have seen more public activity from Microsoft and its customer base, which builds on a foundation of decades of business and supplier familiarity with Windows® technology. This activity has initially focused on predictive maintenance and quality improvement use cases.

  1. In the food industry, packaging pioneer Tetra Pak employs new digital tools that enable its cloud-connected machines to predict when equipment needs maintenance. By connecting packaging lines to the Microsoft Azure Cloud, Tetra Pak can collect operational data that informs the timing of maintenance.

  2. Manufacturers have a new approach for maintaining quality in high-volume manufacturing environments thanks to the arrival of competent and cost-effective artificial intelligence (AI). Operators can analyze camera feeds in real time to have faulty widgets identified and tagged either physically or virtually. It has potentially become possible to inspect every part coming off the line, something that was neither economical nor practical using human operators. This solution is particularly valuable in the manufacture of complex automotive components, which are price-sensitive, high-volume and frequently safety-critical.

The cloud operators have offered various IoT strategies intended to address these concerns, but the OT operators still see a chasm between the available architectures and what is needed to meet their requirements. Fortunately, new architectures can allow the operators to have their cake and eat it too. Choosing the right system architecture will ensure that their current operations are not impacted while they still benefit from data-driven optimization, namely:

  1. By decoupling software and hardware, the cost of maintenance and upgrades decreases significantly.
  2. Systems can be much more flexible and respond to changing requirements with significantly lower cost, risk and time.
  3. Systems become observable, which opens up the ability to collect data, deliver unique insights and enable closed-loop optimization.

The challenge is to deliver these capabilities while maintaining the essential attributes of the OT network, including system uptime, deterministic real-time functionality and immunity to cyberattacks.

The architecture that this type of system requires is what we refer to as the “Mission Critical Edge”, which securely combines the scaling benefits of IT infrastructure with the reliability and deterministic real-time behavior of embedded platforms. Its attributes include:

  1. Airgapping: System architects must precisely define and dedicate CPU, memory and I/O resources to specific virtual machines (VMs). These VMs need to be isolated from each other, including their northbound and southbound connectivity. This enables OT and cloud applications to reside on the same system.
  2. OT Manageability: The system should be flexible in how its configuration and setup are managed and controlled. While the system itself should be managed locally, specific workloads should be updated and managed from the Cloud.
  3. Performance: Real-time performance must be guaranteed for workloads such as PLCs, PACs and ECUs. This means that the system hosting the cloud workloads on the shop floor can also have a dedicated partition that serves as the backup for a physical PLC.
  4. High Availability: High availability should be implemented at different levels: within a single system, across two systems in a cell, and across an entire production line.
  5. Orchestration Framework Integration: The edge systems need to work with either a local or a cloud-based management framework. For example, systems across a factory should dedicate a portion of their capacity to form a Kubernetes cluster.
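
To make the airgapping attribute more concrete, the short Python sketch below checks a partition plan for one edge system: every CPU core and I/O device must belong to exactly one VM, and memory must not be over-committed. It is only an illustration under stated assumptions. The `Partition` fields, the `validate` function and the workload and device names are hypothetical and do not represent any particular hypervisor's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """One isolated VM on the edge system (hypothetical model)."""
    name: str
    cores: frozenset        # CPU core indices dedicated to this VM
    memory_mb: int          # memory reserved for this VM
    io_devices: frozenset   # I/O devices assigned exclusively to this VM
    real_time: bool = False


def validate(partitions, total_cores, total_memory_mb):
    """Return a list of conflicts; an empty list means nothing is shared."""
    problems = []
    core_owner = {}
    io_owner = {}
    for p in partitions:
        for core in sorted(p.cores):
            if core < 0 or core >= total_cores:
                problems.append(f"{p.name}: core {core} does not exist")
            elif core in core_owner:
                problems.append(
                    f"core {core} shared by {core_owner[core]} and {p.name}")
            else:
                core_owner[core] = p.name
        for dev in sorted(p.io_devices):
            if dev in io_owner:
                problems.append(
                    f"device {dev} shared by {io_owner[dev]} and {p.name}")
            else:
                io_owner[dev] = p.name
    used_mb = sum(p.memory_mb for p in partitions)
    if used_mb > total_memory_mb:
        problems.append(
            f"memory over-committed: {used_mb} MB > {total_memory_mb} MB")
    return problems


# Illustrative plan: a real-time soft PLC next to a Kubernetes worker node.
plan = [
    Partition("soft-plc", frozenset({0, 1}), 2048,
              frozenset({"fieldbus0"}), real_time=True),
    Partition("k8s-node", frozenset({2, 3}), 8192, frozenset({"eth0"})),
]

print(validate(plan, total_cores=4, total_memory_mb=16384))  # prints []
```

In a plan like this, the real-time partition keeps its own cores and fieldbus interface, while the Kubernetes partition uses a separate network interface for its northbound connection. Adding a third VM that reuses core 3 or `eth0` would be reported as a conflict rather than silently accepted.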

In conclusion, the mission-critical edge architecture can allow OT operators to deploy Cloud-connected services and workloads on their factory floors without affecting their current operations. This is achieved by enabling the edge systems on the factory floor to run multiple airgapped workloads, including real-time, AI/ML and security workloads. In addition, some of the airgapped workloads can be combined to run Kubernetes-orchestrated container workloads.

Pavan Singh, VP Product Management, Lynx Software Technologies