Talend: GDPR compliance threats in the IoT
The IoT needs 'privacy by design' principles; otherwise, alarm bells will ring. Image Source: Talend

The forthcoming General Data Protection Regulation (GDPR) has already been well documented, but how will the IoT be affected by (and respond to) this wide-ranging new regulation from the European Union?

As we know, European businesses (including international organizations in the USA and elsewhere with a European presence) are now in a race to become compliant with the European Union’s General Data Protection Regulation (GDPR).

Organizations around the world will need to make certain they are adequately capturing, integrating, certifying, publishing, monitoring and protecting their data to ensure compliance when GDPR enters into application in May 2018.

Data privacy knee-jerk

Jean-Michel Franco is director of product marketing for data governance products at open source software integration specialist Talend. Pointing to a GDPR-based knee-jerk reaction by most firms focused on ‘data security’, Franco has bemoaned the lack of focus on ‘data privacy’ issues surrounding the new regulation. This, he asserts, is a serious concern for two main reasons.

The following article was written for Internet of Business by Talend’s Jean-Michel Franco.

Security versus privacy

My serious security versus privacy concern is twofold.

First, GDPR has a broad definition of data privacy. It places far-reaching responsibilities on organizations, imposes a specific ‘privacy by design’ requirement, and expands the need to implement appropriate technical and organizational measures, so that data privacy and data protection are no longer an afterthought.

Second, the emergence and growing prevalence of the IoT exacerbates these issues. At the heart of IoT is the concept of the always-connected customer. Businesses are looking to generate and capture large volumes of data about customer preferences and behaviors to drive a competitive edge.

For most organizations, neither of these two areas has, as of now, been addressed properly.

Even though much of this data is related to products, rather than data subjects, there are still privacy implications to consider. Information provided by a connected car, for example, is likely to affect the privacy of the car owner if his or her ownership of that vehicle is known by the car’s retailer or manufacturer, even if the data itself is not specifically linked to the individual.

Privacy by design

Retailers of connected products are aware that once a product is in a customer’s hands, all data broadcast through their product could qualify as personal data, which means that they need to apply ‘privacy by design’ principles together with all their suppliers involved in gathering, storing, and processing the data.

Consumer electronics maker Vizio was recently fined $2.2 million after the US consumer watchdog found that it had been using content recognition software to track users without obtaining their permission. The company reportedly installed software on 11 million Internet-connected TV sets to track customers’ detailed viewing habits. It then linked that data with specific household demographics and sold the information to third-party marketers. In its defense, Vizio said its televisions “never paired viewing data with personally identifiable information such as name or contact information”.

The punishment meted out to Vizio sounds like a significant penalty. But consider that Vizio (now part of LeEco, a Chinese company with $7.3 billion in revenue) sells its HDTVs and soundbars in Europe, and so faces similar privacy issues there. Under GDPR rules, it could be exposed to a fine of up to $292 million (4% of that revenue) in this case.
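
To make that arithmetic explicit: GDPR caps its most serious fines at the greater of 4% of annual worldwide turnover or €20 million. The sketch below is only an illustration of that rule. The function name is hypothetical, and it assumes turnover and the fixed cap are expressed in the same currency unit, which glosses over the dollar/euro difference in the Vizio example.

```python
# Hypothetical helper: upper bound on a GDPR fine in the higher tier.
# Assumption: turnover and the fixed cap are in the same currency unit;
# no exchange-rate conversion is attempted here.

FIXED_CAP = 20_000_000    # the EUR 20 million statutory alternative
TURNOVER_PERCENT = 4      # 4% of annual worldwide turnover


def max_gdpr_fine(annual_turnover: int) -> int:
    """Return the greater of 4% of turnover or the fixed cap."""
    # Integer arithmetic avoids floating-point rounding on large amounts.
    percent_based = annual_turnover * TURNOVER_PERCENT // 100
    return max(percent_based, FIXED_CAP)


print(max_gdpr_fine(7_300_000_000))  # 292000000
```

For a smaller business with, say, 100 million in turnover, the same function returns the fixed cap of 20 million, because 4% of turnover is lower than that.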

Knowing where your data is

Another big challenge organizations face is knowing both where all of the private, sensitive data within their organization resides and who is responsible for taking care of it. Many businesses are unclear about this, because their data is siloed in different departments such as sales, marketing, finance, services and so on.

Under GDPR, the data controller (the organization or person that decides how and why personal data is processed) must respond to subject access requests within a month, with the possibility of extending this period for particularly complex requests. This is typically more stringent than existing regulations.

Under the UK’s Data Protection Act, for example, the response time is 40 days. In addition, the rights of data subjects are not restricted to data access: GDPR also mandates the right to rectification, the right to erasure (also known as the right to be forgotten), the right to restrict data processing, the right to object to data processing and the right not to be evaluated on the basis of automated processing. All of those rights have a significant impact on data management practices.

Putting a response in place

So, given the issues outlined above, how can organizations best respond to the challenge with respect to their data management practices? In our view, they should start by carrying out an inventory of their data, so that they at least know exactly what they have and where it is located.

Once a clear map of the data has been developed, companies will be better placed to start assigning responsibility for looking after it. This is, in a sense, the minimum requirement. However, it can then act as the foundation for a stronger data governance policy, which is a key element of what GDPR requires.
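
As a rough illustration of what such an inventory might look like in practice, here is a minimal sketch. The `DataAsset` record, its fields, and the sample entries are all hypothetical; a real inventory would live in a data catalog or governance tool, not a Python list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataAsset:
    """One entry in a (hypothetical) data inventory."""
    name: str                 # e.g. a table, topic or file share
    location: str             # system where the data resides
    department: str           # silo that produced it
    owner: Optional[str]      # None if nobody is responsible yet
    holds_personal_data: bool


def unassigned_personal_data(assets: List[DataAsset]) -> List[str]:
    """Names of assets that hold personal data but have no owner."""
    return [a.name for a in assets
            if a.holds_personal_data and a.owner is None]


# Illustrative entries only.
inventory = [
    DataAsset("connected_car_telemetry", "iot-data-lake", "services", None, True),
    DataAsset("newsletter_subscribers", "marketing-crm", "marketing", "j.doe", True),
    DataAsset("product_catalogue", "erp", "sales", "a.smith", False),
]

print(unassigned_personal_data(inventory))  # ['connected_car_telemetry']
```

Even a list this simple answers the two questions raised above: where personal data sits, and which of it nobody is yet accountable for.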

Closely linked to data governance is the issue of data quality, an especially pressing concern when organizations are building out their IoT capability. That’s because the desire to keep costs down in the IoT world often means that organizations are forced to work with low-quality networks, and data quality may suffer as a result.

In the context of GDPR, data quality and harmonization can be a critical concern, particularly if poor-quality or inconsistent data makes it difficult for the organization to achieve ‘a single view’ of the customer, something which is mandated by the regulation. One of the most significant data quality issues in this context derives from the business keeping separate, siloed pools of data which are not readily integrated.

Take, for example, a scenario where the business knows a customer partly through IoT and partly through its marketing applications. If a customer then wants to know what private data the business holds on them and the organization ends up revealing only a fraction of that data, due to the existence of these separate data pools, then ultimately it is the organization’s responsibility that a full set of data has not been provided.

That, in turn, is likely to be a breach of GDPR. It’s a stark warning that, in order to comply, organizations need to effectively reconcile the information they get from different parts of their organization, including data nodes belonging to or residing in the IoT.
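
The scenario above can be sketched in a few lines. This is an illustration under a strong assumption, namely that every data pool already identifies the customer with the same `customer_id`; in practice, establishing that shared key is exactly the harmonization work described here. The function name, silo names and records are hypothetical.

```python
from typing import Dict, List


def collect_subject_data(customer_id: str,
                         silos: Dict[str, List[dict]]) -> Dict[str, List[dict]]:
    """Gather every record held on one customer across all data pools."""
    report = {}
    for silo_name, records in silos.items():
        matches = [r for r in records if r.get("customer_id") == customer_id]
        if matches:
            report[silo_name] = matches
    return report


# Illustrative data pools; a real deployment would query each system.
silos = {
    "iot": [
        {"customer_id": "c-42", "device": "thermostat", "last_seen": "2018-01-03"},
    ],
    "marketing": [
        {"customer_id": "c-42", "email": "pat@example.com"},
        {"customer_id": "c-77", "email": "sam@example.com"},
    ],
    "finance": [],
}

report = collect_subject_data("c-42", silos)
print(sorted(report))  # ['iot', 'marketing']
```

If the lookup only ever consulted the marketing pool, the IoT records would be silently missing from the response, which is the partial disclosure the scenario warns about.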

Scoping out the IoT data challenge

IoT is set to bring a raft of benefits to organizations across the world as they generate vast volumes of new data that they can subsequently leverage to help drive the decision-making process. Because IoT enables companies to connect the physical and the digital world, it provides them with the potential to shape the future of customer experiences. However, as this article suggests, this data brings challenges, not least in its implications for data privacy and the consequent difficulties that businesses will face in achieving GDPR compliance.

With May 2018 fast approaching, time is rapidly running out for businesses. If they want to take advantage of the IoT and ensure they comply with GDPR, they need to put these issues on their boardroom agenda and start actively addressing them right away.

Talend cloud and big data integration software runs natively in Hadoop using the Apache ecosystem. Talend has native Spark support and uses Spark Streaming and machine learning with a visual Eclipse-based designer. Image Source: Talend