The iPhone 5s Review

by Anand Lal Shimpi on September 17, 2013 9:01 PM EST

After Swift Comes Cyclone

I was fortunate enough to receive a tip last time that pointed me at some LLVM documentation calling out Apple’s Swift core by name. Scrubbing through those same docs, it seems like my leak has been plugged. Fortunately I came across a unique string looking at the iPhone 5s while it booted:

I can’t find any other references to Oscar online, in LLVM documentation or anywhere else of value. I also didn’t see Oscar references on prior iPhones, only on the 5s. I’d heard that this new core wasn’t called Swift, referencing just how different it was. Obviously Apple isn’t going to tell me what it’s called, so I’m going with Oscar unless someone tells me otherwise.

Update: Oscar is a CPU core inside the M7; Cyclone is the name of the Swift replacement.

Cyclone is likely closer to a beefier Swift core (or at least a Swift-inspired one) than to a new design from the ground up. That means we’re likely talking about a 3-wide front end and somewhere between 5 and 7 execution ports. The design is likely also capable of out-of-order execution, given the performance levels we’ve been seeing.

Cyclone is a 64-bit ARMv8 core and not some Apple-designed ISA. With Cyclone, Apple beats not only every other smartphone maker to ARMv8 but also key ARM server partners. I’ll talk about the whole 64-bit aspect of this next, but needless to say, this is a big deal.

The move to ARMv8 comes with some of its own performance enhancements. More registers, a cleaner ISA, improved SIMD extensions/performance as well as cryptographic acceleration are all on the menu for the new core.

Pipeline depth likely remains similar (maybe slightly longer) as frequencies haven’t gone up at all (1.3GHz). The A7 doesn’t feature support for any thermally driven CPU (or GPU) frequency boost.

The most visible change to Apple’s first ARMv8 core is a doubling of the L1 cache size: from 32KB/32KB (instruction/data) to 64KB/64KB. Along with this larger L1 cache comes an increase in access latency (from 2 clocks to 3 clocks from what I can tell), but the increase in hit rate likely makes up for the added latency. Such large L1 caches are quite common with AMD architectures, but unheard of in ultra mobile cores. A larger L1 cache will do a good job keeping the machine fed, implying a larger/more capable core.
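The latency-versus-hit-rate trade-off can be made concrete with the textbook average memory access time (AMAT) model. What follows is only a sketch: the 2-clock and 3-clock hit latencies are the estimates quoted above, while the miss rates and the 20-clock miss penalty are hypothetical values picked for illustration, not measurements of Swift or Cyclone.

```python
def amat(hit_latency, miss_rate, miss_penalty):
    """Average memory access time in clocks (textbook single-level model)."""
    return hit_latency + miss_rate * miss_penalty


def break_even_miss_rate_drop(extra_hit_latency, miss_penalty):
    """Miss-rate reduction needed to pay for a slower L1 hit."""
    return extra_hit_latency / miss_penalty


MISS_PENALTY = 20  # clocks to service an L1 miss from L2; hypothetical

old = amat(2, 0.10, MISS_PENALTY)  # 32KB L1, hypothetical 10% miss rate
new = amat(3, 0.04, MISS_PENALTY)  # 64KB L1, hypothetical 4% miss rate
needed = break_even_miss_rate_drop(3 - 2, MISS_PENALTY)

print(f"old AMAT: {old:.2f} clocks")
print(f"new AMAT: {new:.2f} clocks")
print(f"break-even miss-rate drop: {needed:.1%}")
```

Under these assumed numbers the extra clock of hit latency pays for itself once the larger cache removes at least 5 percentage points of misses. A smaller miss penalty raises that bar, a larger one lowers it.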

The L2 cache remains unchanged in size at 1MB shared between both CPU cores. L2 access latency is improved tremendously with the new architecture. In some cases I measured L2 latency at half of what I saw with Swift.

The A7’s memory controller sees big improvements as well. I measured 20% lower main memory latency on the A7 compared to the A6. Branch prediction and memory prefetchers are both significantly better on the A7.
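Latency figures like these are commonly gathered with a pointer-chasing loop: every load depends on the previous one, and the chain visits memory in a random order so prefetchers cannot hide the latency. The sketch below shows the structure of that technique only. It is not the custom tool used for this review, the function names are hypothetical, and in Python the interpreter overhead swamps the actual cache latency, so a real measurement would need the same loop in a compiled language.

```python
import random
import time


def build_chase(n, seed=0):
    """Return a random single-cycle permutation of range(n) (Sattolo's algorithm)."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i guarantees one cycle covering every slot
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps):
    """Follow the chain; each index depends on the result of the previous load."""
    idx = 0
    for _ in range(steps):
        idx = nxt[idx]
    return idx


def ns_per_step(n, steps=100_000):
    """Time the dependent-load loop over a working set of n slots."""
    nxt = build_chase(n)
    start = time.perf_counter()
    chase(nxt, steps)
    return (time.perf_counter() - start) * 1e9 / steps


# Sweep the working set; in native code the time per step would jump
# as the set outgrows the L1, then the L2, then spills to main memory.
for slots in (1 << 10, 1 << 14, 1 << 17):
    print(f"{slots:>7} slots: {ns_per_step(slots):.1f} ns/step")
```

Using a single cycle matters: an ordinary shuffle can produce short loops that stay resident in cache no matter how large the array is.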

I noticed large increases in peak memory bandwidth on top of all of this. I used a combination of custom tools as well as publicly available benchmarks to confirm all of this. A quick look at Geekbench 3 (prior to the ARMv8 patch) gives a conservative estimate of memory bandwidth improvements:

Geekbench 3.0.0 Memory Bandwidth Comparison (1 thread)
	Stream Copy	Stream Scale	Stream Add	Stream Triad
Apple A7 1.3GHz	5.24 GB/s	5.21 GB/s	5.74 GB/s	5.71 GB/s
Apple A6 1.3GHz	4.93 GB/s	3.77 GB/s	3.63 GB/s	3.62 GB/s
A7 Advantage	6%	38%	58%	57%
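For reference, the "A7 Advantage" row can be recomputed from the two rows above it. This is a minimal sketch that assumes the row is the relative gain truncated to a whole percent. The article does not say how the row was derived, but truncation reproduces every printed value, whereas rounding would give 58% for Stream Triad.

```python
from fractions import Fraction


def advantage_pct(new, old):
    """Relative gain of new over old, truncated to a whole percent."""
    gain = (Fraction(new) / Fraction(old) - 1) * 100
    return int(gain)  # int() truncates toward zero


# Values in GB/s, copied from the table above (kept as strings for exact math).
a7 = {"Stream Copy": "5.24", "Stream Scale": "5.21",
      "Stream Add": "5.74", "Stream Triad": "5.71"}
a6 = {"Stream Copy": "4.93", "Stream Scale": "3.77",
      "Stream Add": "3.63", "Stream Triad": "3.62"}

row = {name: advantage_pct(a7[name], a6[name]) for name in a7}
for name, pct in row.items():
    print(f"{name}: {pct}%")
```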

We see anywhere from a 6% improvement in memory bandwidth to nearly 60% running the same Stream code. I’m not entirely sure how Geekbench implemented Stream and whether or not we’re actually testing other execution paths in addition to (or instead of) memory bandwidth. One custom piece of code I used to measure memory bandwidth showed nearly a 2x increase in peak bandwidth. That may be overstating things a bit, but needless to say this new architecture has a vastly improved cache and memory interface.
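As background for the caveat above, these are the four kernels as defined by the reference STREAM benchmark, together with the bytes each one is conventionally credited with moving. This is a sketch of the reference definitions only. Geekbench's own implementation is not known here and may differ, and the function name `run_stream` is hypothetical. Pure Python is far too slow to measure bandwidth, so the code illustrates the operations and the byte accounting rather than a usable benchmark.

```python
from array import array


def run_stream(n=1000, scalar=3.0):
    """Run the four STREAM kernels once over n doubles and count bytes moved."""
    a = array("d", [1.0]) * n
    b = array("d", [2.0]) * n
    c = array("d", [0.0]) * n
    word = a.itemsize  # 8 bytes for a C double

    for i in range(n):  # Copy:  c = a            (1 read, 1 write, no math)
        c[i] = a[i]
    for i in range(n):  # Scale: b = scalar * c   (1 read, 1 write, 1 multiply)
        b[i] = scalar * c[i]
    for i in range(n):  # Add:   c = a + b        (2 reads, 1 write, 1 add)
        c[i] = a[i] + b[i]
    for i in range(n):  # Triad: a = b + scalar*c (2 reads, 1 write, 2 flops)
        a[i] = b[i] + scalar * c[i]

    bytes_moved = {
        "Copy": 2 * word * n,
        "Scale": 2 * word * n,
        "Add": 3 * word * n,
        "Triad": 3 * word * n,
    }
    return a, b, c, bytes_moved


a, b, c, moved = run_stream()
print(moved)
```

Only Copy is free of floating-point work. Scale, Add, and Triad each mix arithmetic into the loop, so a faster FP pipeline can lift their scores even if the memory system were unchanged.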

Looking at low level Geekbench 3 results (again, prior to the ARMv8 patch), we get a good feel for just how much the CPU cores have improved.

Geekbench 3.0.0 Compute Performance
	Integer (ST)	Integer (MT)	FP (ST)	FP (MT)
Apple A7 1.3GHz	1065	2095	983	1955
Apple A6 1.3GHz	750	1472	588	1165
A7 Advantage	42%	42%	67%	67%

Integer performance is up 44% on average, while floating point performance is up by 67%. Again this is without 64-bit or any other enhancements that go along with ARMv8. Memory bandwidth improves by 35% across all Geekbench tests. I confirmed with Apple that the A7 has a 64-bit wide memory interface, and we're likely talking about LPDDR3 memory this time around so there's probably some frequency uplift there as well.
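To put the measured figures next to what a 64-bit interface could deliver in theory, peak bandwidth is simply bus width in bytes multiplied by the transfer rate. The sketch below assumes a data rate of 1600 MT/s purely for illustration. The actual LPDDR3 speed of the A7 is not stated here, so treat the result as a hypothetical ceiling.

```python
def peak_bandwidth_gbs(bus_width_bits, transfers_per_sec):
    """Theoretical peak bandwidth in GB/s: bytes per transfer x transfers per second."""
    return bus_width_bits / 8 * transfers_per_sec / 1e9


ASSUMED_MTS = 1600   # hypothetical LPDDR3 data rate in MT/s, not a confirmed spec
BUS_WIDTH_BITS = 64  # the interface width confirmed by Apple

peak = peak_bandwidth_gbs(BUS_WIDTH_BITS, ASSUMED_MTS * 1e6)
measured = 5.74      # GB/s, best single-thread A7 result above (Stream Add)

print(f"theoretical peak: {peak:.1f} GB/s")
print(f"single-thread Stream Add reaches {measured / peak:.0%} of that")
```

With the assumed data rate the ceiling works out to 12.8 GB/s. A single thread rarely saturates a memory interface, so landing well below the theoretical figure is expected.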

The result is something Apple refers to as desktop-class CPU performance. I’ll get to evaluating those claims in a moment, but first, let’s talk about the other big part of the A7 story: the move to a 64-bit ISA.

464 Comments

dugbug - Wednesday, September 18, 2013 - link
They got 2x performance in both CPU and GPU. You want to ding it? You think this is something you just drag and drop in a powerpoint and press the "Make Chips" button? jesus...
weiln12 - Wednesday, September 18, 2013 - link
Great article as always. One thing I've noticed and don't understand is what's up with the different SKUs for CDMA? Both could be combined into the one SKU, since A1453 has the same CDMA/GSM/WCDMA as A1533 and more LTE bands than A1533 (17/26). Why have both when one will work? Seems like I'm missing something here...
Krysto - Wednesday, September 18, 2013 - link
Ah, so now we finally see that Imagination's "triangle throughput" and all of those benchmarks spiking way ahead of other chips before, were just BS. ARM was right to call Imagination on it. Those benchmarks never mattered for gaming performance, as we can see with the new iPhone, yet Anand kept showing them at the top of each benchmark to show how "impressive" the Imagination GPU was.

Glad to see you admit how wrong you were, Anand.
Mondozai - Wednesday, September 18, 2013 - link
You just keep hating on Apple/Anandtech throughout the entire thread.
Honestly, you're just Krysto.
Mondozai - Wednesday, September 18, 2013 - link
just sad*
(love that fact that there's no edit button)
Wilco1 - Wednesday, September 18, 2013 - link
Lots of us agree most of Anand's benchmarks are rubbish. Sorry but in 2013 we shouldn't have a major tech site using JS benchmarks and pretending they show CPU performance.
tabascosauz - Wednesday, September 18, 2013 - link
This is why you can doubt Apple's claims in any area OTHER than graphics.

G6430...jesus christ. Literally destroys every other mobile GPU other than Adreno 330 which can actually put up a fight.
vision33r - Wednesday, September 18, 2013 - link
This is proof that Samsung was very much a copycat from the start. Once the silicon was no longer produced and shared with them, they were caught messing with adding more cores that they have no clue how to optimize, and now Apple has jumped into 64-bit and Samsung has to play catch-up.
purerice - Wednesday, September 18, 2013 - link
Samsung is like that with everything from televisions to toasters.
dugbug - Wednesday, September 18, 2013 - link
and now vacuum cleaners according to dyson
