The US economy contracted in the first three months of 2022, but the data indicated consumer spending remained healthy – Copyright AFP Noel Celis
Given the high-profile data breaches and misuse of consumer data, consumers are very conscious of what they share. However, this data is crucial for companies to innovate and offer a better, more personalized customer experience. To strike a balance, businesses need a solution that uses advanced cryptographic technology to let multiple data providers share consumer data privately and securely while avoiding the reidentification of individual consumers.
To gain insight into current trends, Digital Journal touched base with Kevin Chen, SVP and chief data scientist for Experian DataLabs in North America.
Digital Journal: How are businesses balancing the use of data to deliver personalized services while protecting that data?
Kevin Chen: To start, businesses should employ rigorous privacy-preserving technologies to protect the anonymity of their customers. Analysing data is very important in delivering the personalized product or service recommendations that most of us appreciate and value when engaging with a brand.
In fact, over the past few years, we have come to expect that level of personalized engagement. However, we also understand that the increasingly digitized economy creates new opportunities for fraud. Not surprisingly, keeping personal data private is a huge priority, and customers expect the companies they do business with to regard it as imperative.
DJ: What can companies do to protect data without reducing their capability to make personal recommendations and otherwise improve customer service?
Chen: Many companies employ data anonymization, in which personally identifiable information (PII) is encrypted or removed to reduce the chance of identity theft and fraud. However, to ensure data privacy and security, companies should also consider new privacy-preserving technologies that enable the delivery of personalized services while keeping data safely anonymous.
Technologies such as data clean rooms and synthetic data can be used to perform analysis and modeling to optimize the customer experience. Plus, they protect data without compromising a brand’s capability to deliver a personalized experience.
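As a rough illustration of the anonymization step Chen describes, where PII is removed or replaced before analysis, the Python sketch below drops direct identifiers from a record and substitutes a keyed-hash token. The field names, the `pseudonymize` helper, and the demo key are all hypothetical, and a keyed hash is only one of several ways this could be done. Note that this kind of pseudonymization does not by itself rule out reidentification from the attributes that remain.

```python
import hashlib
import hmac

# Hypothetical field names; a real customer schema would differ.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymize(record: dict, secret_key: bytes, id_field: str = "email") -> dict:
    """Return a copy of `record` without direct identifiers, plus a keyed-hash token."""
    # Normalize the identifier so "Ada@Example.com" and "ada@example.com" match.
    normalized = record[id_field].strip().lower().encode("utf-8")
    # HMAC-SHA256 with a secret key: the token is stable for the same person,
    # but cannot be recomputed by someone who does not hold the key.
    token = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["customer_token"] = token
    return cleaned


if __name__ == "__main__":
    key = b"demo-key-not-for-production"  # assumption: real keys live in a key vault
    row = {"name": "Ada Example", "email": "Ada@example.com",
           "phone": "555-0100", "segment": "travel", "spend": 120.5}
    print(pseudonymize(row, key))
```

The token lets records about the same customer be joined for modeling, while the name, email address, and phone number never leave the source system.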
DJ: What are data clean rooms? How do they work to protect information?
Chen: Data clean rooms are platforms that companies use to share data with other entities without violating user privacy. Data clean room providers allow the participants to set parameters governing how the others can view or access their data. They leverage access control, encryption, and sometimes secure hardware such as secure enclaves to ensure data is stored and processed privately and securely.
By combining their data, companies can safely employ modeling and analytic techniques to identify the distinct characteristics of consumer populations. They can then take that knowledge outside of the data clean room and provide more effective and precisely targeted offerings to current and prospective customers.
DJ: How common is the use of data clean rooms among businesses and other institutions?
Chen: Data clean rooms are already used in a wide variety of industries. For example, brands and advertising agencies are applying clean room-derived analytics to enhance their capability to target specific audiences. In other industries, banks and financial services companies can collaborate on fraud detection and anti-money laundering efforts to improve customers’ digital banking experience. In the healthcare field, clinical researchers can use anonymized patient data to perform complex studies, such as patient journey and drug adherence, without linking data to specific individuals.
DJ: What are companies doing to strengthen technological protections?
Chen: The next generation of data clean room solutions can benefit from the deployment of more advanced privacy-preserving technology. By leveraging advanced cryptographic technology such as secure multi-party computation (SMPC), data providers can securely aggregate their data with one another while maintaining complete control of their own respective data. SMPC ensures private data remains on the premises of each data provider; only fragmented, encrypted data is exchanged throughout an entire computation. As a result, sensitive private data is never exposed to any other parties, not even the clean room providers themselves.
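To make the idea of exchanging only fragments more concrete, here is a minimal Python sketch of additive secret sharing, one common building block of SMPC. It is not necessarily the protocol Experian or any clean room provider uses. The modulus, the function names, and the three example values are assumptions, and all parties are simulated in a single process rather than over a network.

```python
import secrets

PRIME = 2**61 - 1  # assumed modulus; all share arithmetic is done modulo this prime


def make_shares(value: int, n_parties: int) -> list[int]:
    """Split `value` into n random-looking shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_values: list[int]) -> int:
    """Simulate parties jointly computing a total without revealing their inputs."""
    n = len(private_values)
    # Each party i splits its own value; share j is sent to party j.
    all_shares = [make_shares(v, n) for v in private_values]
    # Each party j adds the shares it received. A single share, or a single
    # partial sum, reveals nothing about any one party's input.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME
                    for j in range(n)]
    # Only the partial sums are combined to reveal the aggregate.
    return sum(partial_sums) % PRIME


if __name__ == "__main__":
    # Hypothetical per-provider counts of flagged accounts.
    print(secure_sum([120, 340, 75]))  # prints 535
```

Each provider's raw value stays local; what crosses the wire is a set of random-looking numbers that only become meaningful once every party's contribution is added together.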
DJ: Do such fraud prevention efforts limit a company’s capability to perform the analytics needed to gain a better understanding of their customers and their preferences?
Chen: Because of the built-in privacy-preserving feature, SMPC can enable highly sophisticated analytics by using machine learning in a truly anonymous fashion, thus allowing the resulting models to be deployed beyond the clean room — without compromising the privacy of the data.
DJ: What other technologies (beyond SMPC), if any, are contributing to this effort?
Chen: Two other privacy-preserving technologies are playing an increasingly important role: 1) differential privacy, which enables the sharing of information about a dataset by describing the patterns of groups within it while withholding information about specific individuals; and 2) generative AI technology such as generative adversarial networks (GANs), which train two competing deep-learning models in an adversarial fashion. The generator model learns to produce very plausible examples that are indiscernible from real data, to the point that they can fool a separate discriminator model whose objective is to detect fake examples. Working hand-in-hand with SMPC, these privacy-preserving technologies will revolutionize clean room solutions.
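For differential privacy, the classic textbook construction is the Laplace mechanism: publish a group-level statistic with calibrated random noise so that no single individual's presence can be inferred from it. The Python sketch below applies it to a simple count. The `private_count` helper, the toy records, and the choice of epsilon are illustrative assumptions, not details from the interview.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(records, predicate, epsilon: float, rng=None) -> float:
    """Return a noisy count of the records matching `predicate`."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # A count changes by at most 1 when one person is added or removed
    # (sensitivity 1), so the Laplace noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical customer records.
    customers = [{"age": age} for age in range(18, 90)]
    noisy = private_count(customers, lambda r: r["age"] >= 65, epsilon=0.5)
    print(f"Noisy count of customers aged 65+: {noisy:.1f}")
```

A smaller epsilon adds more noise and gives stronger privacy; a larger one gives more accurate answers. Choosing that trade-off is a policy decision as much as a technical one.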
DJ: Are finance companies taking these steps to protect data this way?
Chen: The use of data clean rooms is likely to continue extending deeply into the banking and financial services world as the methodology is applied to the safe, anonymized sharing of data.
A key to using banking and financial services data in an anonymized fashion is its conversion into synthetic data for the purposes of modeling and analytics. While synthetic data will look like the real data and contain most of its attributes, it will not link to any specific person.
GAN technology can generate synthetic data that is indistinguishable from the sensitive data it is modeled on. By using synthetic data, companies can collaborate more effectively, build models and analyse performance, expand their business, reduce risk, and prevent fraud without exposing the identity of real consumers.
DJ: Bottom line: should the development of these technologies give us confidence that our data is being scrupulously protected by the businesses and institutions we engage with?
Chen: In short, yes. SMPC, differential privacy, GANs, and synthetic data generation are just some of the privacy-preserving technologies that enable companies to connect digital touchpoints and access powerful insights in an anonymized way, thereby reducing the risk of fraud. Ultimately, these technologies should play a vital role in giving us the confidence that our data is being protected.