The Cloud AI 100, as Qualcomm calls it, is a dedicated server inference accelerator that the company plans to release in 2020. Qualcomm’s first entry into the AI accelerator market has been designed from the ground up with machine learning in mind. That means the chip isn’t suitable for other tasks: unless you plan to run deep neural networks on it, this isn’t the product for you.
The Cloud AI 100 is aimed mainly at the fast-growing data centre market.
So, what is inference, and how does it fit into machine learning?
Machine learning 'trains' an AI system to perform a task reliably: by feeding a neural network large amounts of data, we teach it to generalize from the examples we provide, all without explicitly programming the system to do so.
In simplified terms, working with neural nets can be divided into two steps:
Training: we first feed the neural network large amounts of labelled input examples along with our desired results. The neural net then learns to reproduce those results by adjusting its own parameters until its outputs match ours as closely as possible.
Inference: once we have a trained model, we use its trained parameters to process new, unlabelled inputs. The system takes small batches of data that it has never seen and 'infers' the predicted output; in other words, it executes the pre-trained neural net.
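The two steps above can be sketched in a few lines of Python. This is only a toy illustration, not how any production system (or the Cloud AI 100) works: the "model" is a single weight and bias rather than a neural network, and the function names train and infer, the data, and the learning rate are all assumptions made up for the example.

```python
# Toy illustration of the two phases: train once, then run inference repeatedly.
# The model (one weight, one bias) and the data are hypothetical.

def train(examples, epochs=1000, lr=0.05):
    """Fit y = w*x + b to labelled (x, y) pairs using plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in examples:
            error = (w * x + b) - y        # how far the prediction is from the label
            grad_w += 2 * error * x / n    # gradient of mean squared error w.r.t. w
            grad_b += 2 * error / n        # gradient of mean squared error w.r.t. b
        w -= lr * grad_w                   # adjust parameters to reduce the error
        b -= lr * grad_b
    return w, b

def infer(params, x):
    """Apply the already-trained parameters to a new, unlabelled input."""
    w, b = params
    return w * x + b

# Training: labelled examples that follow y = 2x + 1.
labelled = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
params = train(labelled)

# Inference: inputs the model has never seen; no parameters change here.
predictions = [infer(params, x) for x in (4.0, 10.0)]
```

Note that train loops over the data many times while infer is a single cheap evaluation; the cost of inference comes from how often it is called, which is why dedicated hardware for it matters.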
Training is the more resource-intensive step, but it is typically done once per model. Inference, however, has to be run over and over again, so the hardware used for it plays an important role.
At first, neural networks were run on CPUs, but GPUs quickly proved to be both faster and more efficient, in large part thanks to their parallel computing capabilities, which suit workloads such as pattern and object recognition.
However, in the same way that CPUs were replaced by GPUs for use in machine learning, ASIC (application-specific integrated circuit) chips are threatening to do the same to GPUs. ASICs are dedicated to a narrow set of functions and offer very good parallel computing capabilities while being less expensive and less energy-intensive, which is very important for data centre servers.
Qualcomm isn’t the first company to realise the importance of the AI data centre market. NVIDIA, with its market-leading Tesla accelerators, and Intel, with its forthcoming Xe GPUs, have both been in this space longer and are already big players in the AI race.
The Qualcomm Cloud AI 100
So, what makes this upcoming Qualcomm chip a big deal?
It is built from the ground up on a 7nm process and is promised to deliver more than 15 times the inference performance of Qualcomm’s current, non-AI-focused flagship chip, the Snapdragon 855, while maintaining similar efficiency.
That high power efficiency is the main selling point of this chip. It is supposed to offer 10 times the performance per watt of the AI inference solutions currently on the market.
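To make the performance-per-watt claim concrete, here is a small arithmetic sketch. Every number in it is hypothetical and chosen only to show what a tenfold improvement can mean; none of the figures comes from Qualcomm, and the helper perf_per_watt is made up for this example.

```python
def perf_per_watt(inferences_per_second, watts):
    """Efficiency metric: how much inference throughput each watt buys."""
    return inferences_per_second / watts

# Hypothetical baseline accelerator: 1,000 inferences/s at 100 W.
baseline = perf_per_watt(1000.0, 100.0)

# A 10x gain in performance per watt can show up in two ways:
same_power = perf_per_watt(10000.0, 100.0)  # ten times the throughput at the same power
same_speed = perf_per_watt(1000.0, 10.0)    # the same throughput at a tenth of the power

ratio_same_power = same_power / baseline
ratio_same_speed = same_speed / baseline
```

Both ratios come out to 10, which is why the metric matters to data centre operators: the gain can be spent on more throughput per rack or on a lower power bill.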
The Cloud AI 100 isn’t going to be one single product, but a family of cards with different form factors and TDPs that will probably all use the same processor.
Qualcomm’s entry into the inference accelerator market is an exciting development that will surely shake up this fledgling market, and it marks the beginning of a new chapter for the California-based company.