Machine learning is opening massive business opportunities in all sectors, but applying it to encrypted data has remained challenging. The international, Sweden-based information technology provider Ericsson recently outlined the emergent machine learning technologies that may be used with encrypted data.
Ericsson, a long-standing information technology manufacturer, is also at the forefront of commercialized artificial intelligence (AI) innovation. Machine learning is a key AI technology that the company is investing in.
As part of this investment, the company is working to introduce machine learning to encrypted data such as employees’ and customers’ personal information, communications, and sensitive digital assets.
What Is Encryption?
Although early methods differed from modern ones, data encryption existed long before computers were invented. From the twentieth century onwards, however, computer processing and encryption grew hand in hand.
Computers are particularly well suited to the jobs of encrypting and decrypting data. Algorithms instruct computers on how to jumble up data so that it cannot be read if intercepted, and tell other computers how to decipher the encrypted data.
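As a rough illustration of the jumble-and-decipher idea, the sketch below uses a one-time pad, in which each byte of the message is XORed with a random key byte. This is a teaching example only and is not the AES cipher mentioned in this article; the function names `generate_key` and `xor_bytes` and the sample message are assumptions made for illustration.

```python
import secrets


def generate_key(length):
    """Return `length` random bytes to use as a one-time key."""
    return secrets.token_bytes(length)


def xor_bytes(data, key):
    """XOR each data byte with the matching key byte.

    Applying the same key twice restores the original data, so this one
    function both encrypts and decrypts.
    """
    if len(data) != len(key):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = "Salary review scheduled".encode("utf-8")
key = generate_key(len(message))

ciphertext = xor_bytes(message, key)    # unreadable without the key
recovered = xor_bytes(ciphertext, key)  # the key holder can decipher it

print(recovered.decode("utf-8"))
```

The key must be as long as the message and never reused, which is why practical systems rely on ciphers such as AES instead.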
Although modern encryption protocols were developed by the world’s militaries to enable secret communication, they are now used in nearly every sector of the commercial world.
WhatsApp and other messaging platforms, cloud storage services, virtual private networks (VPNs), and bespoke business software all make use of “military-grade” encryption such as the 256-bit Advanced Encryption Standard (AES). According to the Computer Security Institute’s (CSI) 2008 Computer Crime & Security Survey, 71% of companies in the United States have utilized encryption to protect their data.
What Is Machine Learning?
Machine learning develops computer algorithms that automatically optimize themselves by using data. It is a key branch of the study of AI.
Machine learning algorithms use sample, or “training,” data to build a model. They then use this model to make predictions or decisions without being explicitly programmed to do so.
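A minimal sketch of this train-then-predict pattern is shown below, using ordinary least squares to fit a straight line. The function names `fit_line` and `predict` and the training values are made up for illustration; real applications typically use far richer models and much more data.

```python
def fit_line(xs, ys):
    """Build a model (slope, intercept) from training data by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the trained model to predict a value for an unseen input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: the rule y = 2x + 1 is never written in the
# code; the model recovers it from the examples alone.
train_x = [1, 2, 3, 4]
train_y = [3, 5, 7, 9]

model = fit_line(train_x, train_y)
print(predict(model, 5))
```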
Today, machine learning algorithms feature in a wide variety of applications, including medicine, email filtering, speech recognition, and computer vision. This branch of AI technology enables computers to perform tasks that just a couple of decades ago were thought impossible for their mathematical, command-and-control way of working.
Challenges of Encrypting “In-Use” Data
The majority of encryption techniques are used for data that is “at rest” (stored in a stable database) or “in transit” (moving from one database to another). Encrypted cloud storage and secure USB drives are examples of encrypted at-rest data, while encrypted Bluetooth connections and WhatsApp messages are examples of encrypted in-transit data.
Encrypting data that is “in use” has remained challenging, however. In-use data is either stored in a non-persistent (changing) digital state, processed, or both. Data in a machine learning model’s life cycle is classed as in-use data.
Many opportunities for machine learning remain unexplored, mainly because private data has so far been difficult to encrypt while it is in use.
Ericsson’s engineers outlined potential opportunities for machine learning to be used with encrypted data in human resources (HR) functions.
Machine learning could, they proposed, be used to infer employee satisfaction by examining the conditions under which employees leave. Algorithms can then find commonalities in the mass of HR data and make recommendations to better retain and attract workers to the company.
Natural language processing, a machine learning technique, could also be used in an HR setting. Applicant-vacancy matchmaking might pair people with roles better than recruitment managers could, or it might help recruiters make better-informed decisions.
In each case, the machine learning model would have to process sensitive personal data typically governed by regulations while in storage, transit, and use.
Emerging Technologies for Machine Learning with Encrypted Data
The two most promising technologies for encrypting in-use data, according to Ericsson engineers, are secure multiparty computation (SMPC) and homomorphic encryption (HE).
Secure Multiparty Computation (SMPC)
SMPC lets several parties jointly compute a function over their inputs while keeping those inputs private. Data scientists use it to work on distributed data without having to expose it.
First, a data scientist external to the company writes the function that will be performed on the data (a mathematical function, like the functions in an Excel spreadsheet). The data scientist also selects data sources provided by the company (the data owner) in a virtual container. The data scientist sees only the headers (descriptions of what data is measured) and the metadata (when it was entered, where, how large it is, and so on).
Then, the functions are compiled by an algorithm into binary instructions for the computer to follow. Computers in the virtual container with the data execute the instructions and communicate with one another as required.
The results of the computation are then encrypted and sent back to the data scientist or analyst.
SMPC is a technique for keeping in-use data behind a wall of encrypted in-transit or at-rest data.
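One common building block for SMPC is additive secret sharing, sketched below for a simple sum. This is an illustrative stand-in and not necessarily the protocol Ericsson's engineers describe; the names `share` and `secure_sum`, the modulus, and the input values are all assumptions. Each private value is split into random-looking shares, each computing party adds up only the shares it receives, and only the final total is reconstructed.

```python
import secrets

# Assumed public modulus; all arithmetic on shares is done modulo this value.
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n_parties random shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_inputs):
    """Compute the sum of the inputs without any party seeing a raw input."""
    n = len(private_inputs)
    # Each data owner splits its input and sends share j to party j.
    all_shares = [share(value, n) for value in private_inputs]
    # Party j sees one share from every owner and adds them up locally.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Only the partial sums are combined, which reveals the total alone.
    return sum(partial_sums) % MODULUS


# Hypothetical satisfaction scores held by three separate data owners.
print(secure_sum([3, 5, 4]))  # prints 12
```

Any single share is a uniformly random number, so a party holding it learns nothing about the underlying input unless all parties pool their shares.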
Homomorphic Encryption (HE)
HE is a method for computing analytical functions on encrypted data without having to decrypt it first. Many modern HE schemes base their security on the Ring Learning with Errors (RLWE) problem, a lattice problem that is believed to be hard to solve even with a quantum computer.
In HE, a trusted zone is defined where the plaintext data is stored. Information is encrypted in the trusted zone using an HE scheme like Cheon-Kim-Kim-Song (CKKS), which enables it to be “used” without being decrypted first.
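The idea of computing on data that stays encrypted can be sketched with the Paillier cryptosystem, which supports addition on ciphertexts. Note the swap: Paillier is not CKKS, is not based on RLWE, and is not quantum-safe; it is used here only because it fits in a few lines of standard-library Python. The tiny primes, the function names, and the salary figures are assumptions for illustration and offer no real security.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy Paillier key pair from two small primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, needs Python 3.8+
    return n, (lam, mu)   # public key n, private key (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover the plaintext; only the trusted zone holds private_key."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts yields an encryption of the plaintext sum."""
    return (c1 * c2) % (n * n)


# Trusted zone: generate keys and encrypt two hypothetical salaries.
n, private_key = keygen()
c1 = encrypt(n, 52000)
c2 = encrypt(n, 61000)

# Untrusted zone: works on ciphertexts only, never sees the salaries.
c_total = add_encrypted(n, c1, c2)

# Trusted zone: decrypt the result.
print(decrypt(n, private_key, c_total))  # prints 113000
```

Schemes such as CKKS go further by supporting approximate addition and multiplication on encrypted real numbers, which is what makes them attractive for machine learning workloads.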
References and Further Reading
Kessler, G.C. (2021) An Overview of Cryptography [online] GaryKessler.net. Available at: https://www.garykessler.net/library/crypto.html
Nilsson, N.J. (2015) Introduction to Machine Learning [online] Stanford University. Available at: https://ai.stanford.edu/people/nilsson/mlbook.html
Richardson, R. (2008) CSI Computer Crime & Security Survey [online] CSI. Available at: http://i.cmpnet.com/v2.gocsi.com/pdf/CSIsurvey2008.pdf
Wasa, G. et al. (2021) AI Confidential: How can machine learning on encrypted data improve privacy protection? [online] Ericsson. Available at: https://www.ericsson.com/en/blog/2021/9/machine-learning-on-encrypted-data
Disclaimer: The views expressed here are those of the author expressed in their private capacity and do not necessarily represent the views of AZoM.com Limited T/A AZoNetwork the owner and operator of this website. This disclaimer forms part of the Terms and conditions of use of this website.