What data does a ‘digital twin’ run on?
Soon, every piece of equipment will have a digital twin. Image Credit: DNV.GL

So-called ‘digital twins’ have been hailed as one of the top Internet of Things (IoT) technologies to track in 2017. They’re software-based ‘representations’ of physical objects, processes, factories or complete working systems (including production lines) that exist in the real world.

Internet of Business spoke to Dr. William L. Bain, CEO and founder of ScaleOut Software, on the birth of the digital twin. The company is a vendor of in-memory data storage and computing, real-time Hadoop MapReduce tools, disaster recovery and global data integration services.

A digital twin, he explained, captures real-time data, allowing (in theory) smarter maintenance and service schedules to be applied to physical objects. This raises a whole bunch of questions. For example, as this model starts to surface across the industrial web, what technologies will characterize its data backbone? Can we simply use existing approaches to database structure, or is some re-engineering called for? In other words: what data does a ‘digital twin’ run on?
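As a rough sketch of what such a representation might hold (illustrative only, not ScaleOut’s implementation; every class name and field below is hypothetical), a digital twin can be modelled as an object that pairs the static properties of a physical asset with its most recent telemetry:

```java
import java.time.Instant;

// Illustrative sketch only: a digital twin pairs static asset data with live telemetry.
public class TurbineTwin {
    // Static properties captured when the asset is commissioned.
    private final String assetId;
    private final String model;
    private final double ratedPowerKw;

    // Dynamic state, refreshed each time telemetry arrives.
    private double lastVibrationMmPerSec;
    private double lastRotorSpeedRpm;
    private Instant lastUpdated;

    public TurbineTwin(String assetId, String model, double ratedPowerKw) {
        this.assetId = assetId;
        this.model = model;
        this.ratedPowerKw = ratedPowerKw;
    }

    // Apply one telemetry reading to the twin's in-memory state.
    public void update(double vibrationMmPerSec, double rotorSpeedRpm) {
        this.lastVibrationMmPerSec = vibrationMmPerSec;
        this.lastRotorSpeedRpm = rotorSpeedRpm;
        this.lastUpdated = Instant.now();
    }

    // A human-readable snapshot of the twin: static identity plus latest readings.
    public String summary() {
        return assetId + " (" + model + ", " + ratedPowerKw + " kW) vibration="
                + lastVibrationMmPerSec + " mm/s, rotor=" + lastRotorSpeedRpm
                + " rpm at " + lastUpdated;
    }
}
```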

ScaleOut’s Bain points out that as businesses now track data from live systems, such as patient monitoring networks or wind turbine farms, they need insights within seconds to react to fast-changing conditions, make mission-critical decisions and capitalize upon new opportunities.

Looking inside live systems

“Traditional software techniques for streaming analytics can no longer provide the insights required by today’s fast, data-driven systems. As streaming analytics seek to gain deeper introspection into the dynamic behavior of live systems and provide sub-second feedback, the focus is shifting from incoming data streams, to the combination of data streams and the data sources that generate them. This enables these streams to be analyzed in a richer context and provide more effective insights,” explained Bain.

He points to an example of a patient monitoring system receiving telemetry from remote pacemakers.

Combined with each patient’s medical history, lifestyle and medications, that telemetry can yield far more valuable information than the raw readings alone. Likewise, knowing the specific characteristics and condition of a wind turbine helps streaming analytics interpret its telemetry and predict whether a blade failure is imminent.
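A minimal sketch of that idea, using made-up field names, weightings and thresholds rather than anything from ScaleOut or a real turbine model: the same vibration reading is interpreted differently depending on the turbine’s stored characteristics, which is exactly the context a bare stream of readings cannot supply on its own.

```java
// Hypothetical sketch: interpreting a telemetry reading in the context of the
// asset it came from, rather than looking at the stream value in isolation.
public class BladeCheck {

    // Static context the digital twin holds about one turbine (assumed fields).
    record TurbineContext(String assetId, int bladeAgeYears, double vibrationBaselineMmPerSec) {}

    // One reading arriving on the telemetry stream.
    record VibrationReading(String assetId, double vibrationMmPerSec) {}

    // The same reading can be benign on a new turbine and alarming on an old one.
    static boolean bladeFailureLikely(VibrationReading reading, TurbineContext context) {
        double deviation = reading.vibrationMmPerSec() - context.vibrationBaselineMmPerSec();
        double ageFactor = context.bladeAgeYears() > 10 ? 1.5 : 1.0; // made-up weighting
        return deviation * ageFactor > 4.0;                          // made-up threshold
    }

    public static void main(String[] args) {
        TurbineContext oldTurbine = new TurbineContext("WT-042", 14, 2.0);
        TurbineContext newTurbine = new TurbineContext("WT-311", 2, 2.0);
        VibrationReading reading = new VibrationReading("WT-042", 5.0);

        System.out.println(bladeFailureLikely(reading, oldTurbine)); // true:  3.0 * 1.5 = 4.5 > 4.0
        System.out.println(bladeFailureLikely(reading, newTurbine)); // false: 3.0 * 1.0 = 3.0 <= 4.0
    }
}
```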

Analysts at Gartner also use the term digital twin to refer to software-based representations of real-world entities (such as the patients or wind turbines in the examples above). The company projects that by using digital twins, real-time feedback from IoT sensors could save $1 trillion a year in maintenance, services and consumables by 2022.

Why IMDGs help digital twins

“While it is cumbersome and often inefficient to implement digital twins with traditional stream processing technologies, in-memory data grids (IMDGs) provide an effective platform for incorporating digital twins into streaming analytics,” said Bain.

IMDGs combine object-oriented, in-memory data storage with fast data access and integrated computing to (in theory) simplify development and ensure scalable performance.

We also know that IMDGs are designed to meet the stringent high-availability requirements of live systems.

IMDGs can now take on stream-processing workloads that were previously the domain of purpose-built architectures such as Apache Storm, Spark Streaming and commercial complex event processing (CEP) platforms, while delivering the benefits offered by digital twins.
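Bain’s point can be sketched very loosely as follows. This single-process stand-in (a plain ConcurrentHashMap rather than a real grid such as ScaleOut StateServer, which partitions objects across a cluster and runs updates on the server holding each twin) only illustrates the access pattern: incoming stream events are routed by key to the in-memory twin object and applied in place. All names and the alerting rule are invented for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Loose, single-process stand-in for the IMDG pattern described above:
// twin objects live in memory, keyed by ID, and each incoming stream event
// is applied to its twin where it sits, rather than re-queried from a database.
public class TwinGrid {

    // Hypothetical digital twin for the pacemaker example.
    record PatientTwin(String patientId, int priorEpisodes, double lastHeartRate) {}

    private final Map<String, PatientTwin> twins = new ConcurrentHashMap<>();

    public void register(PatientTwin twin) {
        twins.put(twin.patientId(), twin);
    }

    // Handle one pacemaker telemetry event: update the twin and flag it if the
    // reading is abnormal *for this patient* (made-up rule, for illustration only).
    public boolean onTelemetry(String patientId, double heartRate) {
        PatientTwin updated = twins.computeIfPresent(patientId,
                (id, t) -> new PatientTwin(id, t.priorEpisodes(), heartRate));
        return updated != null && heartRate > 100 && updated.priorEpisodes() > 2;
    }

    public static void main(String[] args) {
        TwinGrid grid = new TwinGrid();
        grid.register(new PatientTwin("P-17", 3, 72.0));
        System.out.println(grid.onTelemetry("P-17", 118.0)); // true: high rate plus prior history
    }
}
```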

Your takeaway moment of Zen

So then, we can’t necessarily reinvent stream processing and CEP overnight.

Any in-memory technology will always ‘suffer’ from (or at least be weighed down by) one hard constraint: how fast the Random Access Memory (RAM) holding its state can actually be read and updated. That gets messy when lots of changes are being made to continuously running systems, which the ones discussed here almost certainly would be. So don’t buy all the hype here in one mouthful.