Machine Learning: MLPerf and AI Benchmark 4

Although it is a relative newcomer in this space, MLPerf runs representative ML workloads on devices and takes advantage of both common frameworks and APIs such as NNAPI as well as each vendor's own chip libraries. In our testing on retail phones to date, Qualcomm has led in almost all of the tests, but given that the company is promoting a 4x increase in AI performance, it will be interesting to see whether that gain shows up across all of MLPerf's testing scenarios.
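
To give an idea of what that NNAPI path looks like in practice, the short Kotlin sketch below shows how an app can route the same TensorFlow Lite model either through the NNAPI delegate (and from there to whatever accelerator driver the vendor exposes) or keep it on the CPU kernels. This is purely illustrative and not MLPerf's actual harness; the function name and thread count are our own choices.

```kotlin
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.nnapi.NnApiDelegate
import java.nio.MappedByteBuffer

// Illustrative only: build a TFLite interpreter for a given model buffer,
// either routed through NNAPI (and thus the vendor's accelerator driver)
// or kept on the plain CPU kernels as a baseline.
fun buildInterpreter(model: MappedByteBuffer, useNnapi: Boolean): Interpreter {
    val options = Interpreter.Options()
    if (useNnapi) {
        // NNAPI decides which block (DSP, NPU, GPU, or CPU fallback) runs the graph.
        options.addDelegate(NnApiDelegate())
    } else {
        // CPU execution with a handful of threads.
        options.setNumThreads(4)
    }
    return Interpreter(model, options)
}
```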

It should be noted that Apple’s CoreML is currently not supported, hence the lack of Apple numbers here.

MLPerf 1.0.1 - Image Classification
MLPerf 1.0.1 - Object Detection
MLPerf 1.0.1 - Image Segmentation
MLPerf 1.0.1 - Image Classification (Offline)

Across the board in these first four tests Qualcomm builds a sizable lead, going well beyond what the Snapdragon 888 can do. We're seeing up to a 2.2x result, for an average gain of around 75%. That's not quite the 4x Qualcomm promoted in its materials, but it opens a sizable gap over the other high-end silicon we've tested to date.

MLPerf 1.0.1 - Language Processing

The only test Qualcomm doesn't lead is language processing, where Google's Tensor SoC scores almost 2x what the S8g1 does. This test is based on a MobileBERT model, and whether for software or architectural reasons, it maps far better onto the Google chip than onto any other. As smartphones increase their ML capabilities, we might see some vendors optimizing for specific workloads over others, as Google has, or offering different accelerator blocks for different models. The ML space is also fast-paced, so optimizing for one type of model might not be a great long-term strategy. We will see.
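
Frameworks already expose a degree of control over this workload-to-accelerator mapping. As a rough sketch, assuming the options the TFLite NNAPI delegate has offered in recent releases (the accelerator name below is a placeholder, since real names are vendor-specific), a language model could be steered toward a particular block like this:

```kotlin
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.nnapi.NnApiDelegate
import java.nio.MappedByteBuffer

// Sketch: ask NNAPI to favor sustained throughput and, optionally, pin the
// model to a named accelerator. "example-npu" is a placeholder; real names
// are vendor-specific and discoverable through NNAPI device enumeration.
fun buildBertInterpreter(model: MappedByteBuffer, acceleratorName: String? = null): Interpreter {
    val nnapiOptions = NnApiDelegate.Options()
        .setExecutionPreference(NnApiDelegate.Options.EXECUTION_PREFERENCE_SUSTAINED_SPEED)
    if (acceleratorName != null) {
        nnapiOptions.setAcceleratorName(acceleratorName) // e.g. "example-npu"
    }
    return Interpreter(model, Interpreter.Options().addDelegate(NnApiDelegate(nnapiOptions)))
}
```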

AI Benchmark 4 - NNAPI (CPU+GPU+NPU)

In AI Benchmark 4, running in pure NNAPI mode, the Qualcomm S8g1 takes a comfortable lead. Andrei has noted in previous reviews that the power consumed during this test can be quite high, up to 14 W, and this is where some chips might be able to pull ahead with an efficiency advantage. Unfortunately we didn't record power alongside this run, but it would be good to monitor it in the future.
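
As a rough way to approximate that kind of monitoring in software, Android's BatteryManager can report instantaneous current and voltage while a workload runs. The sketch below is illustrative only; readings depend heavily on each device's fuel gauge and are nowhere near as accurate as an external power rig.

```kotlin
import android.content.Context
import android.content.Intent
import android.content.IntentFilter
import android.os.BatteryManager
import kotlin.math.abs

// Rough sketch: estimate instantaneous battery power draw (in watts) so it can
// be logged alongside a benchmark run. Sign conventions and update rates of the
// current reading vary by vendor firmware.
fun estimatePowerWatts(context: Context): Double {
    val bm = context.getSystemService(Context.BATTERY_SERVICE) as BatteryManager
    // Instantaneous current in microamperes.
    val currentUa = bm.getIntProperty(BatteryManager.BATTERY_PROPERTY_CURRENT_NOW)
    // Battery voltage in millivolts from the sticky ACTION_BATTERY_CHANGED broadcast.
    val voltageMv = context.registerReceiver(null, IntentFilter(Intent.ACTION_BATTERY_CHANGED))
        ?.getIntExtra(BatteryManager.EXTRA_VOLTAGE, 0) ?: 0
    return abs(currentUa) / 1_000_000.0 * (voltageMv / 1000.0)
}
```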

169 Comments

  • nucc1 - Thursday, December 16, 2021 - link

    I have a desktop and a laptop; I don't need a phone that can do desktop duties.
  • michael2k - Tuesday, December 14, 2021 - link

    You do realize that's exactly what Apple does with its CPUs, right? Use them for desktop/laptop parts?
  • eastcoast_pete - Thursday, December 16, 2021 - link

    Actually, what Apple is doing is both annoying (for iPhone owners) and logical (from Apple's bottom-line POV). This and the prior-generation iPhone certainly have the hardware oomph to drive a desktop setup akin to DeX, but that would, of course, mean fewer iPad Pro and Mac mini sales. The ability to run a desktop-type setup on an iPhone used to be limited by the lower RAM of older generations, but that has changed. Being able to run a desktop environment on a $1,500 iPhone would really add value.
  • Raqia - Tuesday, December 14, 2021 - link

    That said, the lagging performance of Apple's CPU+GPU in AI benchmarks proves most sites overstate the usefulness of CPUs in phone use cases when headlining with CPU-specific performance metrics. Yes, it's not an apples-to-apples comparison, but it's proof that you should care about more than CPU benchmarks (particularly the consumer-oriented Geekbench suite), even for Apple products, when comparing mobile phones.

    CPU performance will matter a lot more for notebook form factors, but on phones the CPU-bottlenecked use cases are typically web browsing / apps using JavaScript and app compilation, and even in most of those cases your bottleneck will be connectivity rather than local processing. Heavy lifting is much more often done by the ISP and various DSPs, which are harder to benchmark.

    As Andrei stated in his introduction to the S8G1:

    "Qualcomm gave examples such as concurrent processing optimizations that are meant to give large boosts in performance to real-world workloads that might not directly show up in benchmarks."

    This seems to be borne out by a reviewer of an anonymous device here:

    https://youtu.be/IpQRiM5F370?t=1002

    despite some seeming inefficiencies for the other IP blocks when individually pinned by a benchmark. It also seems like SPEC17 is showing better efficiency whilst Geekbench is showing worse, which indicates that Geekbench may need to optimize better for this year's ARMv9 implementations. Still a modest improvement for the CPU this year, though, when all's considered.
  • name99 - Tuesday, December 14, 2021 - link

    "That said, the lagging performance of Apple's CPU+GPU in AI benchmarks proves most sites overstate the usefulness of CPUs in phones use cases when headlining with CPU specific performance metrics. "

    Uh, no!
    It proves that a dedicated NPU does better than a CPU for these tasks.
    The point is that the Android tests go through Android APIs; the Apple tests are probably raw C that runs on the CPU (perhaps the GPU, but that's unlikely without special APIs).
    Your complaint is as silly as comparing 3D software running on a GPU against 3D emulated on the CPU.

    But if you prefer to compare browser benchmarks, go right ahead:
    https://www.imore.com/iphone-12-takes-1-spot-ipad-...

    A much better complaint is that I'm guessing all these tests were not compiled with SVE2 -- which could have substantial effects.
    But of course that requires the dev tools and OS to catch up, which means we have to wait for the official release.
  • Raqia - Tuesday, December 14, 2021 - link

    And other companies have decided to dedicate more of their die area to NPUs and other processing blocks than to the CPU. This is usually neglected in cursory reviews of the SoC, and the pinned-to-the-CPU benches are overemphasized by some reviewers with PC gaming hardware review pedigrees, fixated on what's easy to benchmark and what they know rather than on what's impactful to actual phone use cases.

    All that said, the CPU on the iPhones is a big step ahead of the competition for now, but Apple has consciously used more die area for it and emphasizes this in their marketing. Note that Apple could market just the performance of the phones and devices themselves, but they unusually (for a consumer-electronics-oriented company) market the SoC separately in a slide in presentations and product specs. They de-emphasize the modem they use, however, in favor of stating what seem like phone-level performance metrics. This is notable given that this is the same company with the marketing chops to morph "Made in China" into "Designed in Cupertino. Assembled in China." (Glances at own iPhone. Hmm, really...)
  • ChrisGX - Thursday, December 16, 2021 - link

    >> Qualcomm gave examples such as concurrent processing optimizations that are meant to give large boosts in performance to real-world workloads that might not directly show up in benchmarks.

    This seems to be borne out by a reviewer of an anonymous device here:

    https://youtu.be/IpQRiM5F370?t=1002 <<

    That video review can't be used to make a case for hidden virtues of the Snapdragon 8 Gen 1 that for some reason have failed to show up in benchmark results. The reviewer castigated Qualcomm for producing a poor chip with many serious shortcomings. His comments about the X2 core suggest that he saw it as little better than a joke - a power hog that barely improves on earlier generation performance cores.

    The reviewer did acknowledge that the new GPU was fast, but he underscored that the performance gain came at a high cost in power consumption. On the occasions that the SD8 Gen 1 showed any substantial performance advantage over the Apple A15 in the reviewer's tests (some games), the advantage had disappeared after 10 minutes as progressive throttling took its toll.

    The reviewer did look at modem performance (I'm not sure whether I understood the full context of the test), and once again the conclusion was that the modem is fast but power hungry.

    I don't think the reviewer conducted any AI tests, which I suspect would have been where the SD8 Gen 1 excelled.
  • Meteor2 - Friday, December 17, 2021 - link

    What's mind-boggling is the performance Apple has achieved using the Arm ISA. It's taken an equally mind-boggling amount of money to do it, more than anyone else can afford.
  • defaultluser - Tuesday, December 14, 2021 - link

    Decent little preview, but if I had to pick a "best core test," given the two-hour limit, I would have chosen the little cores!
  • Ian Cutress - Tuesday, December 14, 2021 - link

    I tried running SPEC on the little cores. After 30 mins we were less than 10% complete.
