Last week, Apple made industry news by announcing new Mac products based upon the company’s new Apple Silicon M1 SoC chip, marking the first move of a planned 2-year roadmap to transition over from Intel-based x86 CPUs to the company’s own in-house designed microprocessors running on the Arm instruction set.

During the launch we had prepared an extensive article based on the company’s closely related Apple A14 chip, found in the new generation iPhone 12 phones. This includes a rather extensive microarchitectural deep-dive into Apple’s new Firestorm cores, which power both the A14 and the new Apple Silicon M1. I would recommend a read if you haven’t had the opportunity yet.

Over the past few days, we’ve been able to get our hands on one of the first Apple Silicon M1 devices: the new Mac mini 2020 edition. While in our analysis article last week we had based our numbers on the A14, this time around we’ve measured the real performance on the actual new higher-power design. We haven’t had much time, but we’ll be bringing you the key datapoints relevant to the new Apple Silicon M1.

Apple Silicon M1: Firestorm cores at 3.2GHz & ~20-24W TDP?

During the launch event, one thing that was, in typical Apple fashion, missing from the presentation was any actual detail on the clock frequencies of the design, as well as the TDP it can sustain at maximum performance.

We can confirm that in single-threaded workloads, Apple’s Firestorm cores now clock in at 3.2GHz, a roughly 6.7% increase over the 3GHz frequency of the Apple A14. As long as there's thermal headroom, this clock also applies to all-core loads, where in addition to the 4x 3.2GHz performance cores we’re also seeing the 4x Icestorm efficiency cores at 2064MHz, also quite a lot higher than the 1823MHz on the A14.
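The clock uplifts quoted above are easy to sanity-check. The snippet below is a minimal sketch of that arithmetic; the function and variable names are our own, and the A14 performance-core clock is taken as the rounded 3GHz figure from the text, so the result is approximate.

```python
def percent_increase(new_mhz: float, old_mhz: float) -> float:
    """Relative clock increase of new_mhz over old_mhz, in percent."""
    return (new_mhz / old_mhz - 1.0) * 100.0


# Clock figures as quoted in the text. 3000MHz is the rounded "3GHz"
# A14 figure, which is an approximation rather than an exact clock.
performance_uplift = percent_increase(3200, 3000)
efficiency_uplift = percent_increase(2064, 1823)

print(f"Performance cores: +{performance_uplift:.1f}%")
print(f"Efficiency cores:  +{efficiency_uplift:.1f}%")
```

The efficiency cores thus see roughly twice the relative clock increase of the performance cores.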

Alongside the four performance Firestorm cores, the M1 also includes four Icestorm cores which are aimed at low idle power and increased power efficiency for battery-powered operation. Both the 4 performance cores and 4 efficiency cores can be active in tandem, meaning that this is an 8-core SoC, although performance throughput across all the cores isn’t identical.

The biggest question during the announcement event was the power consumption of these designs. Apple had presented several charts including performance and power axes; however, we lacked the comparison data needed to come to any proper conclusion.

As we had access to the Mac mini rather than a MacBook, power measurement was rather simple, as we can just hook up a meter to the AC input of the device. It’s to be noted with a huge disclaimer that because we are measuring AC wall power here, the power figures aren’t directly comparable to those of battery-powered devices, as the Mac mini’s power supply incurs an efficiency loss greater than that seen on mobile SoCs, nor to the TDP figures that contemporary vendors such as Intel or AMD publish.

It’s especially important to keep in mind that the figure of what we usually recall as TDP in processors is actually only a subset of the figures presented here, as beyond just the SoC we’re also measuring DRAM and voltage regulation overhead, something which is not included in TDP figures nor your typical package power readout on a laptop.

Apple Mac mini (Apple Silicon M1) AC Device Power

Starting off with the Mac mini in its default state, sitting idle when powered on while connected via HDMI to a 2560p144 monitor, with Wi-Fi 6 and a mouse and keyboard, we’re seeing total device power at 4.2W. Given that we’re measuring AC power into the device, which can be quite inefficient at low loads, this makes quite a lot of sense and represents an excellent figure.

This idle figure also serves as a baseline for following measurements where we calculate “active power”, meaning our usual methodology of taking total power measured and subtracting the idle power.

During average single-threaded workloads on the 3.2GHz Firestorm cores, such as GCC code compilation, we’re seeing device power go up to 10.5W with active power at around 6.3W. The active power figure is very much in line with what we would expect from a higher-clocked Firestorm core, and is extremely promising for Apple and the M1.
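The “active power” methodology described above is a simple subtraction of the idle baseline from the measured wall power. As a minimal sketch (the helper name is our own, and the figures are the ones quoted in the text):

```python
IDLE_W = 4.2  # idle AC wall power of the Mac mini, as measured above


def active_power(total_w: float, idle_w: float = IDLE_W) -> float:
    """Active power = total measured device power minus the idle baseline."""
    return total_w - idle_w


# Single-threaded GCC compilation: 10.5W total at the wall.
st_active = active_power(10.5)
print(f"Single-threaded active power: {st_active:.1f}W")
```

Note that this figure still includes DRAM, voltage regulation and power supply losses above idle, so it is an upper bound on what the CPU core itself draws.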

In workloads which are more DRAM heavy and thus incur a larger power penalty on the LPDDR4X-class 128-bit 16GB of DRAM on the Mac mini, we’re seeing active power go up to 10.5W. Already with these figures the new M1 is mighty impressive, showing less than a third of the power draw of a high-end Intel mobile CPU.

In multi-threaded scenarios, power highly depends on the workload. In memory-heavy workloads where the CPU utilisation isn’t as high, we’re seeing 18W active power, going up to around 22W in average workloads, and peaking around 27W in compute heavy workloads. These figures are generally what you’d like to compare to “TDPs” of other platforms, although again to get an apples-to-apples comparison you’d need to further subtract some of the overhead as measured on the Mac mini here – my best guess would be a 20 to 24W range.

Finally, on the part of the GPU, we’re seeing a lower power consumption figure of 17.3W in GFXBench Aztec High. This would contain a larger amount of DRAM power, so the power consumption of Apple’s GPU is definitely extremely low, and far less than the peak power that the CPUs can draw.

Memory Differences

Besides the additional cores on the part of the CPUs and GPU, one main performance factor of the M1 that differs from the A14 is the fact that it’s running on a 128-bit memory bus rather than the mobile 64-bit bus. Across 8x 16-bit memory channels and at LPDDR4X-4266-class memory, this means the M1 hits a peak of 68.25GB/s memory bandwidth.
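The peak bandwidth figure follows directly from the bus width and the data rate. The sketch below reproduces that calculation; the function name is our own, and applying the same 4266MT/s data rate to a 64-bit bus is an assumption made purely for comparison.

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_mts: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    bus_width_bits: total width of the memory interface in bits.
    data_rate_mts:  transfers per second per pin, in MT/s.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mts / 1000  # MB/s -> GB/s


# M1: 8 channels x 16 bits = 128-bit bus at LPDDR4X-4266.
m1_peak = peak_bandwidth_gbs(8 * 16, 4266)

# Assumption for comparison only: a 64-bit mobile bus at the same data rate.
mobile_peak = peak_bandwidth_gbs(64, 4266)

print(f"128-bit bus: {m1_peak:.2f}GB/s")
print(f" 64-bit bus: {mobile_peak:.2f}GB/s")
```

At an equal data rate, doubling the bus width doubles the theoretical peak, which is where the M1's advantage over a 64-bit mobile configuration comes from.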

In terms of memory latency, we’re seeing a (rather expected) reduction compared to the A14, measuring 96ns at 128MB full random test depth, compared to 102ns on the A14.

Of further note is the 12MB L2 cache of the performance cores, although here it seems that Apple continues to do some partitioning as to how much a single core can use, as we’re still seeing some latency uptick after 8MB.

The M1 also contains a large SLC cache which should be accessible by all IP blocks on the chip. We’re not exactly certain, but the test results do behave a lot like on the A14 and thus we assume this is a similar 16MB chunk of cache on the SoC, as some access patterns extend beyond that of the A14, which makes sense given the larger L2.

One aspect we’ve never really had the opportunity to test is exactly how good Apple’s cores are in terms of memory bandwidth. Inside of the M1, the results are ground-breaking: a single Firestorm core achieves memory reads up to around 58GB/s, with memory writes coming in at 33-36GB/s. Most importantly, memory copies land in at 60 to 62GB/s depending on whether you’re using scalar or vector instructions. The fact that a single Firestorm core can almost saturate the memory controllers is astounding and something we’ve never seen in a design before.
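To put the single-core numbers into perspective, they can be expressed as a share of the 68.25GB/s theoretical peak. This is only a rough sketch using the figures quoted in the text; the names are our own, and how traffic is counted for the copy test is not stated, so the percentages are indicative at best.

```python
PEAK_GBS = 68.25  # theoretical peak of the 128-bit LPDDR4X-4266 interface

# Single-core Firestorm results quoted in the text, as (low, high) in GB/s.
single_core_gbs = {
    "read": (58, 58),
    "write": (33, 36),
    "copy": (60, 62),
}


def share_of_peak(gbs: float, peak_gbs: float = PEAK_GBS) -> float:
    """Measured bandwidth as a percentage of the theoretical peak."""
    return gbs / peak_gbs * 100.0


for op, (low, high) in single_core_gbs.items():
    print(f"{op:>5}: {share_of_peak(low):.0f}% to {share_of_peak(high):.0f}% of peak")
```

Reads and copies from one core land well above 80% of what the memory interface can theoretically deliver, while writes sit at around half.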

Because one core is able to make use of almost the whole memory bandwidth, having multiple cores access memory at the same time doesn’t actually increase the system bandwidth, but instead, due to congestion, lowers the effective aggregate bandwidth achieved. Nevertheless, this 59GB/s peak bandwidth of one core is essentially also the speed at which memory copies happen, no matter the number of active cores in the system, which is again a great feat for Apple.

Beyond the clock speed and L2 increases, this memory boost is also very likely to help the M1 differentiate its performance beyond that of the A14, and offer up tough competition against the x86 incumbents.

682 Comments

  • BushLin - Wednesday, November 18, 2020 - link

    8-core mobile zen 2 chips have been available for nearly a year now. By the time you can buy that unannounced product you speculate about, it'll be competing against 5nm zen 4 and would still be a toss up in performance against 7nm zen 2.
  • Spunjji - Thursday, November 19, 2020 - link

    You're both wrong.

    Zen 4 is due out in at least a year's time, possibly 18 months. I'll eat my hat if Apple haven't released their higher-end chip with larger cores by then.

    That said, there's no reason to assume its CPU performance will be significantly higher than AMD's mobile Zen 3 designs. GPU will be for sure, but you're locked to a platform without access to decent games so that will limit the appeal to a certain audience.

    So it's not "game over" for Zen 3 - especially as they don't directly compete - but BushLin's completely wrong about how an 8-core variant of this would stack up to Zen 2 and 3.
  • BushLin - Thursday, November 19, 2020 - link

    So a 15W 8-core zen 2 beats a 15W 4+4 core M1 in multithreaded, close to real world tests; but a mythical 25-30W 8+4 CPU using the same design which hasn't scaled well from the additional watts it uses over the A14 chip is going to definitely, defiantly and majestically beat all comers, including zen 3? We'll see but random guy on the internet is probably just pulling stuff out of their ass.
  • Spunjji - Monday, November 23, 2020 - link

    @BushLin - Please check what I said again: "there's no reason to assume [M1's] CPU performance will be significantly higher than AMD's mobile Zen 3 designs". So no, I don't think it's going to "definitely, defiantly and majestically beat all comers, including zen 3" and you're kind of an ass for straw-manning me like that. Please don't.

    You keep making false comparisons with TDP too. Zen 2 is 15W at base clocks, but most of the tests seen so far take place largely within its turbo window of ~30W. Zen 3 Cezanne on 7nm will be in the same ballpark. A theoretical (not "mythical") 8+4 design should provide very similar performance in a very similar TDP, with the performance edge likely going to AMD. That indicates that Zen 4 on 5nm should likely be a superior option for both perf/watt and absolute performance, but we just don't know that yet as, in your terms, Zen 4 is still "mythical".

    But sure, equally-random guy on the internet. We'll see when we see.
  • BushLin - Monday, November 23, 2020 - link

    Both the M1 and 4800U are drawing more than 15W depending on workload, both settling around 22-24W after initial boost.
  • mdriftmeyer - Friday, November 20, 2020 - link

    Zen 4 is out Nov 2021, announced Oct 2021. It's already known. Zen 4 is nearly complete in design back in September. What's coming with Zen 4 is the technologies of Xilinx --Neural Engine: Check, Machine Learning Accelerators: check, DSPs for focused A/D Convert Encode/Decode: Check.

    People the single biggest news of SV this year isn't ARM+Nvidia or Apple M1 series. It's Xilinx merging to become part of AMD.

    The IP, 13k engineers and portfolio of best in breed products by Xilinx [run by former AMD] is massive.

    And Apple nor Intel nor Nvidia saw this coming.

    Zen 4 APU will be a 5w or less CPU, with specialized add-ons, a massive Infinity Fabric interconnect, RAM not constrained like Apple, 8, 12, 16 CPU cores in dual chiplets and RDNA 3.0 CU GPU.

    Fall 2021 will be Zen 4 CPU, APU/RDNA 3.0 and RDNA 3.0 discrete GPUs with CDNA 2.0 M series Compute Processors expanding their footprint into HPC.

    You'll see the Zen 4/CDNA 2.0 solutions on El Capitan Fall 2021/Spring 2022. Clearly, to win that $600 million contract AMD showed their plans 12 months ago.

    From March 04, 2020 Press Release

    AMD technology within El Capitan includes:

    Next generation AMD EPYC processors, codenamed “Genoa” featuring the “Zen 4” processor core. These processors will support next generation memory and I/O sub systems for AI and HPC workloads,
    Next generation Radeon Instinct GPUs based on a new compute-optimized architecture for workloads including HPC and AI. These GPUs will use the next- generation high bandwidth memory and are designed for optimum deep learning performance,
    The 3rd Gen AMD Infinity Architecture, which will provide a high-bandwidth, low latency connection between the four Radeon Instinct GPUs and one AMD EPYC CPU included in each node of El Capitan. As well, the 3rd Gen AMD Infinity Architecture includes unified memory across the CPU and GPU, easing programmer access to accelerated computing,
    An enhanced version of the open source ROCm heterogenous programming environment, being developed to tap into the combined performance of AMD CPUs and GPUs, unlocking maximum performance.
    “This unprecedented computing capability, powered by advanced CPU and GPU technology from AMD, will sustain America’s position on the global stage in high performance computing and provide an observable example of the commitment of the country to maintaining an unparalleled nuclear deterrent,” said LLNL Lab Director Bill Goldstein. “Today’s news provides a prime example of how government and industry can work together for the benefit of the entire nation.”

    Note the emphasis on Genoa Zen 4 processor core, not Genoa Zen 4 CPUs.
  • Spunjji - Monday, November 23, 2020 - link

    @mrdriftmeyer - do you have a source for the 2021 claim? The last roadmap I'm aware of had a Zen 3 refresh on desktop in 2021 (likely on AM5) followed by Zen 4 some time in 2022.

    Seeing as the rest of your post appears to consist mostly of wild speculation and unsupportable assertions (e.g. the Zen 4 design is already locked in, it's NOT going to contain Xilinx IP) I'm not going to hold my breath.
  • tempestglen - Tuesday, November 17, 2020 - link

    BTW, M1 is a SoC, so please add GPU and RAM power of Zen3 during comparison.
  • RedGreenBlue - Tuesday, November 17, 2020 - link

    Benchmarks are benchmarks. I love AMD but in the power envelope the M1 is better. Also consider that this chip maxes out at 3.2GHz not 4+. It’s just simply that x86-64 is a hindrance to AMD and Intel. That’s the real reason Apple had to switch and knew it with Steve Jobs in 2011. Intel is supposedly working on an x86 replacement. Haven’t heard anything new about it in years. But if they’re still working on it, it was expected in 2020-2022.
  • RedGreenBlue - Tuesday, November 17, 2020 - link

    This will also vastly improve their slim profit margins on macs too. Intel was charging such ridiculous prices for mediocre chips it was unbelievable. This is the best business model.
