Traditional algorithms, Ng said in an interview, begin to stutter, slow and eventually flatten out as you feed them increasing amounts of data. The beauty of deep-learning algorithms is that they don't. The more data you feed them, the better they function.
The human brain works so well because it is jam-packed with a huge number of neurons that communicate through electrical impulses. Deep-learning algorithms, mimicking the brain, are based on simulated neural networks.
"As we build larger and larger simulations of the brain, these models are relatively efficient at absorbing huge amounts of data," Ng explained. "These are very high-capacity learning algorithms."
Work is progressing quickly.
About four years ago, the largest neural network, or set of deep-learning algorithms, had about 10 million connections. Ng noted that in early 2011, when he started the Google Brain project, that model had jumped to 1 billion connections. Last year, he worked with a team at Stanford to build a model with about 10 billion connections.
Part of Ng's work is to advance the algorithm, but he and his teammates also are working on using GPUs, or graphics processing units, instead of more traditional CPUs, or central processing units. The chips, designed for handling computer graphics, have turned out to be much better for building large neural networks because they're better at handling those kinds of calculations.
"We're building a new deep-learning platform with GPU hardware to help us scale better," said Ng. "My collaborators and I are the first ones to do this at scale. Other companies are starting to follow, but as far as I know, Baidu was the first company to build a large-scale GPU cluster for deep learning."
Giving these algorithms even higher capacity should mean big advances in voice recognition and visual search. That's going to be critical, according to Ng.
As an increasing number of people from poor, and sometimes uneducated, areas come online, a growing number of users will speak their search queries instead of typing them. An increasing number also are expected to take pictures of what they're searching for, instead of typing in a description.
"Within five years, 50% of our queries will be through speech and images, so this is a technology we are investing heavily in," Ng said.
Improved speech recognition means that a driver might be able to speak aloud and have his phone, sitting on the passenger seat, send a text to a friend saying he'll be late.
"Even as the world moves to mobile, I think no one has figured out a good user interface for the mobile devices, which is why it's so slow to type on these tiny little keyboards on our smartphones," Ng said. "Speech recognition has gotten much better, but it doesn't work as well as we'd like. I'd love, when it gets better, to redesign the user interface on our cell phones around speech recognition."