Efficient security for cloud-based machine learning

MIT researchers have developed an encryption method that secures data used in online neural networks without dramatically slowing their runtimes, which could be useful for cloud-based neural networks and other applications that use sensitive data.

08/31/2018


MIT researchers have developed an encryption method that secures data used in online neural networks without dramatically slowing their runtimes. Courtesy: Chelsea Turner, MIT

A novel encryption method devised by MIT researchers secures data used in online neural networks without dramatically slowing their runtimes. This approach holds promise for using cloud-based neural networks for medical-image analysis and other applications that use sensitive data.

Outsourcing machine learning is a rising trend in industry. Major tech firms have launched cloud platforms that conduct computation-heavy tasks, such as running data through a convolutional neural network (CNN) for image classification. Resource-strapped small businesses and other users can upload data to those services for a fee and get back results in several hours.

But what if there are leaks of private data? In recent years, researchers have explored various secure-computation techniques to protect such sensitive data. But those methods have performance drawbacks that make neural network evaluation (testing and validating) sluggish, sometimes as much as a million times slower, which limits their wider adoption.

MIT researchers developed a system that blends two conventional techniques—homomorphic encryption and garbled circuits—in a way that helps the networks run orders of magnitude faster than they do with conventional approaches.

The researchers tested the system, called GAZELLE, on two-party image-classification tasks. A user sends encrypted image data to an online server evaluating a CNN running on GAZELLE. After this, both parties share encrypted information back and forth in order to classify the user's image. Throughout the process, the system ensures that the server never learns any uploaded data, while the user never learns anything about the network parameters. In these tests, GAZELLE ran 20 to 30 times faster than state-of-the-art models, while reducing the required network bandwidth by an order of magnitude.

One promising application for the system is training CNNs to diagnose diseases. Hospitals could, for instance, train a CNN to learn characteristics of certain medical conditions from magnetic resonance images (MRI) and identify those characteristics in uploaded MRIs. The hospital could make the model available in the cloud for other hospitals. But the model is trained on, and further relies on, private patient data. Because there are no efficient encryption models, this application isn't quite ready for prime time.

"In this work, we show how to efficiently do this kind of secure two-party communication by combining these two techniques in a clever way," said first author Chiraag Juvekar, a PhD student in the Department of Electrical Engineering and Computer Science (EECS). "The next step is to take real medical data and show that, even when we scale it for applications real users care about, it still provides acceptable performance."

Maximizing performance

CNNs process image data through multiple linear and nonlinear layers of computation. Linear layers do the complex math, called linear algebra, and assign some values to the data. At a certain threshold, the data is passed to nonlinear layers that do some simpler computation, make decisions (such as identifying image features), and send the data to the next linear layer. The end result is an image with an assigned class, such as vehicle, animal, person, or anatomical feature.
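The alternation between linear and nonlinear layers can be shown with a minimal Python sketch. It is an illustration only: it uses small fully connected layers rather than real convolutions, takes ReLU (which cuts values off at a threshold of zero) as the nonlinear step, and every function name, weight, and label in it is a hypothetical choice, not something taken from GAZELLE.

```python
def linear_layer(weights, bias, x):
    """Linear step: one weighted sum (dot product plus bias) per output value."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Nonlinear step: values below the zero threshold are cut off."""
    return [max(0.0, v) for v in x]


def classify(x, layers, labels):
    """Alternate linear and nonlinear layers, then pick the top-scoring class."""
    for weights, bias in layers[:-1]:
        x = relu(linear_layer(weights, bias, x))
    weights, bias = layers[-1]          # final linear layer produces class scores
    scores = linear_layer(weights, bias, x)
    return labels[scores.index(max(scores))]


# Made-up weights, purely for illustration.
example_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.1, 0.0]),
]
example_labels = ["vehicle", "animal"]
print(classify([1.0, 2.0], example_layers, example_labels))
```

The linear step is where almost all of the multiplications happen, while the nonlinear step is a cheap per-value comparison. That difference in cost is what the rest of the article turns on.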

Recent approaches to securing CNNs have involved applying homomorphic encryption or garbled circuits to process data throughout an entire network. These techniques are effective at securing data. "On paper, this looks like it solves the problem," Juvekar said. But they render complex neural networks inefficient, "so you wouldn't use them for any real-world application."

Homomorphic encryption, used in cloud computing, receives and executes computation entirely on encrypted data, called ciphertext, and generates an encrypted result that can then be decrypted by a user. When applied to neural networks, this technique is particularly fast and efficient at computing linear algebra. However, it must introduce a little noise into the data at each layer. Over multiple layers, noise accumulates, and the computation needed to filter that noise grows increasingly complex, slowing computation speeds.
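To make "computing on ciphertext" concrete, here is a toy Python sketch of a weighted sum evaluated on encrypted values. It swaps in the textbook Paillier scheme with tiny fixed primes, which is an assumption made for readability: the article does not say which scheme GAZELLE uses, Paillier does not involve the per-layer noise described above, and parameters this small offer no security. All function names are hypothetical.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier keys from two tiny primes (insecure, illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)            # modular inverse, needs Python 3.8+
    return n, (phi, mu)             # public key n, private key (phi, mu)


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under public key n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + message * n) * pow(r, n, n2)) % n2


def decrypt(n, private_key, ciphertext):
    phi, mu = private_key
    n2 = n * n
    return ((pow(ciphertext, phi, n2) - 1) // n * mu) % n


def encrypted_dot(n, ciphertexts, weights):
    """Weighted sum computed on ciphertexts only; weights are non-negative ints.

    Multiplying ciphertexts adds the hidden values, and raising a ciphertext
    to a power multiplies the hidden value by that plaintext weight.
    """
    n2 = n * n
    result = encrypt(n, 0)
    for c, w in zip(ciphertexts, weights):
        result = (result * pow(c, w, n2)) % n2
    return result


public_key, private_key = keygen()
pixels = [3, 1, 4]                                   # user's private input
encrypted_pixels = [encrypt(public_key, v) for v in pixels]
server_result = encrypted_dot(public_key, encrypted_pixels, [2, 5, 1])
print(decrypt(public_key, private_key, server_result))
```

The party running `encrypted_dot` never sees the values in `pixels`, yet the user who holds the private key can decrypt the weighted sum. A linear layer is many such weighted sums.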

Garbled circuits are a form of secure two-party computation. The technique takes an input from both parties, does some computation, and sends a separate output to each party. In that way, the parties send data to one another, but they never see the other party's data, only the relevant output on their side. The bandwidth needed to communicate data between parties, however, scales with computation complexity, not with the size of the input.

In an online neural network, this technique works well in the nonlinear layers, where computation is minimal, but the bandwidth becomes unwieldy in math-heavy linear layers.
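A single garbled gate gives a feel for why bandwidth tracks the amount of computation. The Python sketch below garbles one AND gate in the textbook style; it is a simplified illustration, not GAZELLE's construction. It assumes SHA-256 as the row cipher, uses a block of zero bytes to recognize the correct row, and skips the oblivious-transfer step through which the evaluating party would normally obtain the label for its own input. All names are hypothetical.

```python
import hashlib
import random
import secrets

LABEL_BYTES = 16
TAG = b"\x00" * 16      # zero padding that marks a correctly decrypted row


def _pad(label_a, label_b):
    """Derive a 32-byte one-time pad from a pair of input labels."""
    return hashlib.sha256(label_a + label_b).digest()


def _xor(left, right):
    return bytes(x ^ y for x, y in zip(left, right))


def garble_and_gate():
    """Garbler: pick random labels for each wire value and encrypt the truth table."""
    wires = {name: (secrets.token_bytes(LABEL_BYTES), secrets.token_bytes(LABEL_BYTES))
             for name in ("a", "b", "out")}
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out_label = wires["out"][bit_a & bit_b]
            pad = _pad(wires["a"][bit_a], wires["b"][bit_b])
            table.append(_xor(pad, out_label + TAG))
    random.shuffle(table)   # hide which row belongs to which inputs (toy shuffle)
    return wires, table


def evaluate(table, label_a, label_b):
    """Evaluator: holds one label per input wire and can open exactly one row."""
    pad = _pad(label_a, label_b)
    for row in table:
        plain = _xor(pad, row)
        if plain[LABEL_BYTES:] == TAG:
            return plain[:LABEL_BYTES]
    raise ValueError("no row decrypted with these labels")


wires, table = garble_and_gate()
result_label = evaluate(table, wires["a"][1], wires["b"][1])
print(result_label == wires["out"][1])   # the label standing for 1 AND 1
```

Every gate needs its own table of encrypted rows sent across the network, so a circuit with many gates means a great deal of traffic regardless of how small the inputs are.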

The MIT researchers, instead, combined the two techniques in a way that gets around their inefficiencies.

In their system, a user will upload ciphertext to a cloud-based CNN. The user must have the garbled circuits technique running on their own computer. The CNN does all the computation in the linear layer, then sends the data to the nonlinear layer. At that point, the CNN and user share the data. The user does some computation on garbled circuits, and sends the data back to the CNN.

By splitting and sharing the workload, the system restricts the homomorphic encryption to doing complex math one layer at a time, so data doesn't become too noisy. It also limits the communication of the garbled circuits to just the nonlinear layers, where it performs optimally.

"We're only using the techniques for where they're most efficient," Juvekar said.

Secret sharing

The final step was ensuring both homomorphic and garbled circuit layers maintained a common randomization scheme, called "secret sharing." In this scheme, data is divided into separate parts that are given to separate parties. All parties synch their parts to reconstruct the full data.

In GAZELLE, when a user sends encrypted data to the cloud-based service, it's split between both parties. Added to each share is a secret key (random numbers) that only the owning party knows. Throughout computation, each party will always have some portion of the data, plus random numbers, so it appears fully random. At the end of computation, the two parties synch their data. Only then does the user ask the cloud-based service for its secret key. The user can then subtract the secret key from all the data to get the result.
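The masking idea can be sketched in a few lines of Python. This is a simplified illustration of additive secret sharing under stated assumptions: the modulus, the function names, and which party holds which piece are all hypothetical choices, not details taken from GAZELLE.

```python
import secrets

MODULUS = 2**61 - 1     # assumed working modulus; any sufficiently large one would do


def split(value):
    """Hide a value behind a random key; each piece alone looks random."""
    key = secrets.randbelow(MODULUS)          # kept by one party
    masked = (value + key) % MODULUS          # kept by the other party
    return masked, key


def reconstruct(masked, key):
    """Subtract the key from the masked data to recover the value."""
    return (masked - key) % MODULUS


def add_shared(first, second):
    """Add two shared values piece by piece, without ever unmasking them."""
    (masked_1, key_1), (masked_2, key_2) = first, second
    return (masked_1 + masked_2) % MODULUS, (key_1 + key_2) % MODULUS


shared_a = split(20)
shared_b = split(22)
print(reconstruct(*add_shared(shared_a, shared_b)))
```

Because sums can be computed on the pieces directly, the data can stay masked from one layer to the next and be unmasked only once, at the end, when the key is handed over.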

"At the end of the computation, we want the first party to get the classification results and the second party to get absolutely nothing," Juvekar said. Additionally, "The first party learns nothing about the parameters of the model."

Massachusetts Institute of Technology (MIT)

www.mit.edu 

- Edited by Chris Vavra, production editor, Control Engineering, CFE Media, cvavra@cfemedia.com. See more Control Engineering virtualization and cloud stories.


