The company tested the TPUs against hardware released at around the same time to get an apples-to-apples performance comparison. It's possible that newer hardware would at least narrow the performance gap.
There’s still room for TPUs to improve, too. Pairing the TPU with the GDDR5 memory found in an Nvidia K80 GPU should improve performance over the configuration that Google tested. According to the company’s research, the performance of several applications was constrained by memory bandwidth.
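To see why faster memory could help a bandwidth-constrained application, consider a rough roofline-style estimate, in which attainable throughput is the lesser of the chip's peak compute rate and what the memory system can feed it. The sketch below is illustrative only: the function name and every number in it are hypothetical placeholders, not figures from Google's paper.

```python
def attainable_ops_per_sec(peak_ops_per_sec, mem_bytes_per_sec, ops_per_byte):
    """Roofline-style estimate: throughput is capped either by peak compute
    or by memory bandwidth times the operations done per byte fetched."""
    return min(peak_ops_per_sec, mem_bytes_per_sec * ops_per_byte)


# Hypothetical placeholder values, chosen only to illustrate the effect.
PEAK = 90e12       # peak operations per second of the accelerator
SLOW_MEM = 30e9    # bytes per second with the slower memory
FAST_MEM = 150e9   # bytes per second with the faster memory
INTENSITY = 100    # operations performed per byte read from memory

slow = attainable_ops_per_sec(PEAK, SLOW_MEM, INTENSITY)
fast = attainable_ops_per_sec(PEAK, FAST_MEM, INTENSITY)

# Both results sit below PEAK, so the workload is memory-bound and
# throughput scales directly with memory bandwidth.
speedup = fast / slow
print(f"slow memory: {slow:.2e} ops/s, fast memory: {fast:.2e} ops/s")
print(f"estimated speedup: {speedup:.1f}x")
```

Under these assumed values, the workload never reaches the chip's compute ceiling, so every gain in memory bandwidth shows up as a gain in throughput. An application that performs far more operations per byte would hit the compute ceiling instead and see no benefit from faster memory.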
Furthermore, the authors of Google’s paper claim that there’s room for additional software optimization to increase performance. The authors called out one of the tested convolutional neural network applications (referred to in the paper as CNN1) as a candidate. However, because of the performance gains that TPUs already deliver, it’s not clear whether those optimizations will take place.
While neural networks mimic the way neurons transmit information in humans, CNNs are modeled specifically on how the brain processes visual information.
“As CNN1 currently runs more than 70 times faster on the TPU than the CPU, the CNN1 developers are already very happy, so it’s not clear whether or when such optimizations would be performed,” the authors wrote.
TPUs are what’s known in chip lingo as application-specific integrated circuits (ASICs). They’re custom silicon built for one task, with an instruction set hard-coded into the chip itself. Jouppi said that he wasn’t overly concerned by that specialization, and pointed out that the TPUs are flexible enough to handle changes in machine learning models.
“It’s not like it was designed for one model, and if someone comes up with a new model, we’d have to junk our chips or anything like that,” he said.
Google isn’t the only company focused on using dedicated hardware for machine learning. Jouppi said that he knows of several startups working in the space, and Microsoft has deployed a fleet of field-programmable gate arrays in its data centers to accelerate networking and machine learning applications.