Eye-tracking system uses ordinary cellphone camera

Researchers at MIT and the University of Georgia have developed software that can turn any smartphone into an eye-tracking device, which can be used for discrete sensing and vision as well as medical applications.


For the past 40 years, eye-tracking technology, which can determine where in a visual scene people are directing their gaze, has been widely used in psychological experiments and marketing research, but it has required pricey hardware that has kept it from finding consumer applications. That could be changing soon, however.

Researchers at MIT's Computer Science and Artificial Intelligence Laboratory and the University of Georgia have developed software that can turn any smartphone into an eye-tracking device. They describe their new system in a paper they're presenting at the Computer Vision and Pattern Recognition conference.

In addition to making existing applications of eye-tracking technology more accessible, the system could enable new computer interfaces or help detect signs of incipient neurological disease or mental illness.

"The field is kind of stuck in this chicken-and-egg loop," said Aditya Khosla, an MIT graduate student in electrical engineering and computer science and co-first author on the paper. "Since few people have the external devices, there's no big incentive to develop applications for them. Since there are no applications, there's no incentive for people to buy the devices. We thought we should break this circle and try to make an eye tracker that works on a single mobile device, using just your front-facing camera."

Khosla and his colleagues—co-first author Kyle Krafka of the University of Georgia, MIT professors of electrical engineering and computer science Wojciech Matusik and Antonio Torralba, and three others—built their eye tracker using machine learning, a technique in which computers learn to perform tasks by looking for patterns in large sets of training examples.

Strength in numbers

Khosla and his colleagues' advantage over previous research was the amount of data they had to work with. Currently, Khosla says, their training set includes examples of gaze patterns from 1,500 mobile-device users. Previously, the largest data sets used to train experimental eye-tracking systems had topped out at about 50 users.

To assemble data sets, "most other groups tend to call people into the lab," Khosla said. "It's really hard to scale that up. Calling 50 people in itself is already a fairly tedious process. But we realized we could do this through crowdsourcing."

In the paper, the researchers report an initial round of experiments, using training data drawn from 800 mobile-device users. On that basis, they were able to get the system's margin of error down to 1.5 cm, a twofold improvement over previous experimental systems.
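The article does not say how the margin of error is computed. One plausible reading, assumed here, is the average straight-line distance in centimeters between where the system predicts the user is looking and where the user is actually looking. The function name and the sample coordinates below are hypothetical.

```python
import math

def mean_gaze_error_cm(predicted, actual):
    """Average Euclidean distance (cm) between predicted and true gaze points."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    return sum(distances) / len(distances)

# Hypothetical (x, y) gaze positions in centimeters, not data from the paper.
predicted = [(1.0, 2.0), (4.0, 6.0)]
actual = [(1.0, 1.0), (1.0, 2.0)]
error_cm = mean_gaze_error_cm(predicted, actual)
print(error_cm)
```

Under this definition, a 1.5 cm margin of error would mean the predicted gaze point lands, on average, 1.5 cm from the true one.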

Since the paper was submitted, however, they've acquired data on another 700 people, and the additional training data has reduced the margin of error to about one centimeter.

To get a sense of how larger training sets might improve performance, the researchers trained and retrained their system using different-sized subsets of their data. Those experiments suggest that about 10,000 training examples should be enough to lower the margin of error to a half-centimeter, which Khosla estimates will be good enough to make the system commercially viable.
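The article does not describe how the researchers extrapolated from their subset experiments. A common approach, assumed here purely for illustration, is to fit a power-law curve to the error measured at each training-set size and then solve for the size that reaches a target error. The sizes and errors below are made-up placeholder values chosen to lie on an exact power law; they are not the researchers' measurements.

```python
import math

def fit_power_law(sizes, errors):
    """Least-squares fit of error = a * size**(-b), done in log-log space."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)

def size_for_target(a, b, target_error):
    """Training-set size at which the fitted curve reaches target_error."""
    return (a / target_error) ** (1.0 / b)

# Placeholder measurements: (training-set size, error in cm).
sizes = [100, 400, 1600]
errors = [4.0, 2.0, 1.0]

a, b = fit_power_law(sizes, errors)
needed = size_for_target(a, b, 0.5)
print(round(needed))
```

An extrapolation like this is only as good as the assumption that the same curve keeps holding at larger sizes.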

To collect their training examples, the researchers developed a simple application for devices that use Apple's iOS operating system. The application flashes a small dot somewhere on the device's screen, attracting the user's attention, then briefly replaces it with either an "R" or an "L," instructing the user to tap either the right or left side of the screen. Correctly executing the tap ensures that the user has actually shifted his or her gaze to the intended location. During this process, the device camera continuously captures images of the user's face.
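The role of the tap can be sketched as a filter on the collected data: frames from a trial are kept, and labeled with the dot's position, only if the user tapped the side that the letter called for. This is a minimal sketch of that idea, not the researchers' application; the `Trial` fields, the normalized screen coordinates, and the string stand-ins for camera frames are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    dot_x: float      # where the dot appeared (hypothetical normalized units)
    dot_y: float
    cue: str          # letter shown after the dot: "L" or "R"
    tapped_side: str  # side of the screen the user tapped: "L" or "R"
    frames: list      # camera frames captured during the trial

def labeled_samples(trials):
    """Pair each frame with the dot position, skipping trials with a wrong tap."""
    samples = []
    for trial in trials:
        if trial.tapped_side != trial.cue:
            continue  # wrong tap: no evidence the user was looking at the dot
        for frame in trial.frames:
            samples.append((frame, (trial.dot_x, trial.dot_y)))
    return samples

trials = [
    Trial(0.2, 0.8, cue="R", tapped_side="R", frames=["f1", "f2"]),
    Trial(0.6, 0.1, cue="L", tapped_side="R", frames=["f3"]),
]
samples = labeled_samples(trials)
print(samples)
```

Only the first trial contributes samples here, because the second tap did not match its cue.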

The researchers recruited application users through Amazon's Mechanical Turk crowdsourcing site and paid them a small fee for each successfully executed tap. The data set contains, on average, 1,600 images for each user.

Tightening the net

The researchers' machine-learning system was a neural network, which is a software abstraction but can be thought of as a huge network of very simple information processors arranged into discrete layers. Training modifies the settings of the individual processors so that a data item—in this case, a still image of a mobile-device user—fed to the bottom layer will be processed by the subsequent layers. The output of the top layer will be the solution to a computational problem—in this case, an estimate of the direction of the user's gaze.
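As a toy illustration of that layered processing, the sketch below passes a short list of numbers through two layers and returns a two-number output that stands in for a gaze estimate. It is not the researchers' network, which takes images as input and is far larger; the weights, layer sizes, and input values are invented for the example.

```python
def relu(values):
    """Simple nonlinearity: negative values become zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Feed the inputs through each layer in turn, bottom to top."""
    activations = inputs
    for i, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if i < len(layers) - 1:  # no nonlinearity on the final output
            activations = relu(activations)
    return activations

# Invented weights: 3 inputs -> 2 hidden units -> 2 outputs (x, y).
layers = [
    ([[1.0, -1.0, 0.0], [0.5, 0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 2.0]], [0.1, -0.1]),
]
features = [2.0, 1.0, 1.0]  # stand-in for image-derived inputs
gaze = forward(features, layers)
print(gaze)
```

Training, in these terms, means adjusting the numbers in `layers` until the outputs match known gaze positions as closely as possible.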

Neural networks are large, however, so the MIT and Georgia researchers used a technique called "dark knowledge" to shrink theirs. Dark knowledge involves taking the outputs of a fully trained network, which are generally approximate solutions, and using those as well as the real solutions to train a much smaller network. The technique reduced the size of the researchers' network by roughly 80%, enabling it to run much more efficiently on a smartphone. With the reduced network, the eye tracker can operate at about 15 frames per second (fps), which is fast enough to record even brief glances.
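The training signal described above can be sketched as a loss that penalizes the small network both for disagreeing with the large network's output and for disagreeing with the true answer. The article does not give the actual loss, so the squared-error form and the `alpha` weighting below are assumptions made for illustration.

```python
def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def distillation_loss(student, teacher, truth, alpha=0.5):
    """Blend of the student's error against the teacher and against the truth.

    alpha is a hypothetical mixing weight: 1.0 trusts only the large
    network's outputs, 0.0 trusts only the real solutions.
    """
    return (alpha * squared_error(student, teacher)
            + (1.0 - alpha) * squared_error(student, truth))

# Invented (x, y) gaze estimates in centimeters.
student_out = (1.0, 1.0)  # small network's prediction
teacher_out = (1.0, 3.0)  # fully trained large network's prediction
true_gaze = (2.0, 1.0)    # where the user was actually looking

loss = distillation_loss(student_out, teacher_out, true_gaze)
print(loss)
```

Minimizing a loss of this kind over the whole training set is what lets the smaller network inherit the behavior of the larger one.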

Massachusetts Institute of Technology (MIT)


- Edited by Chris Vavra, production editor, Control Engineering, CFE Media, cvavra@cfemedia.com.
