14 Comments

  • shelbystripes - Tuesday, September 27, 2016 - link

    So this enables putting 128 ARM cores on a single piece of silicon? Even with little cores and 14nm process, that's going to be a pretty large die.

    This would be pretty cool for building a big.LITTLE supercomputer though. A small block of 4 big cores to manage the OS and task scheduling, and then 124 little cores in parallel... Add a high-speed interconnect to talk to other nodes and external storage servers, and you've got an entire HPC node as an SoC. Want a half million ARM cores in a single 19" rack?
  • Arnulf - Tuesday, September 27, 2016 - link

    A quad-core A72 cluster is 8 mm^2 (including L2 cache) on TSMC 16FF+.

    128 A72 cores would come out to 256 mm^2, not accounting for the interconnect and the rest of the chip. TSMC manufactures bigger GPU chips on this process, so this is entirely feasible ...
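    That arithmetic can be checked with a quick back-of-the-envelope sketch; the only input is the ~8 mm^2 quad-cluster estimate above, everything else is just multiplication:

```python
# Back-of-the-envelope check of the 128-core die-area claim.
# Assumed input: a quad-A72 cluster (with L2) is ~8 mm^2 on TSMC 16FF+.
quad_cluster_mm2 = 8.0
cores_per_cluster = 4
target_cores = 128

clusters = target_cores // cores_per_cluster   # 32 clusters needed
core_area = clusters * quad_cluster_mm2        # core + L2 area only

print(f"{clusters} clusters -> {core_area:.0f} mm^2 of core+L2 area")
```

    Note this excludes the interconnect, memory controllers, and I/O, so the real die would be larger.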
  • ddriver - Tuesday, September 27, 2016 - link

    Half the die of consumer chips is graphics, remove graphics and suddenly you have plenty of room for cores.
  • jjj - Tuesday, September 27, 2016 - link

    An A73 on 10nm is under 0.65 mm^2, and a quad cluster with 2MB of L2 is some 5 mm^2; add a large L3 and it's still small.

    But the real push in servers is toward 7nm HPC processes and bigger cores.
  • name99 - Tuesday, September 27, 2016 - link

    Yeah, people are totally confused about how large (or rather small) CPU cores are.

    Even the A10 core complex, eyeballing it from the die shot, is perhaps 14 to 16 mm^2 (including 2 fast cores, 3MiB of L2, and two companion cores; L3 adds maybe 30% more).
    Not as small as A72s, but again it would be totally feasible to put 16, maybe even 24, of these (2+2) units, plus L3, on a die if you had the urge to do so. The details would start to depend on how much else you also want to add --- memory controllers, what I/O, etc.

    The high end of die size is up at 650 mm^2 or so, as opposed to 100 to 150 mm^2 at the low-to-mid end (e.g. the range of Apple's mobile and iPad SoCs).
    Obviously you pay serious money for that sort of large size, but it is technically quite feasible and is used.
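    That estimate can be sanity-checked against the ~650 mm^2 high-end ceiling with a quick sketch; the 15 mm^2 unit size is just the midpoint of the eyeballed 14-16 mm^2 range, and the 30% L3 overhead is the figure above:

```python
# Do 24 of Apple's (2+2) core units, plus ~30% extra for L3,
# fit under a ~650 mm^2 high-end die?
unit_mm2 = 15.0        # midpoint of the 14-16 mm^2 eyeball estimate
units = 24
l3_overhead = 0.30     # "L3 adds maybe 30% more"
max_die_mm2 = 650.0

total = units * unit_mm2 * (1 + l3_overhead)   # ~468 mm^2
print(f"{units} units + L3 ~= {total:.0f} mm^2 (limit ~{max_die_mm2:.0f})")
```

    Even the aggressive 24-unit case lands well under the high-end die-size ceiling, leaving room for memory controllers and I/O.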
  • patrickjp93 - Thursday, October 6, 2016 - link

    Cache is the expensive part, and you're going to need an L3 cache to keep such a cluster fed, whether it's a victim cache or a primary.
  • TeXWiller - Tuesday, September 27, 2016 - link

    ARM's press release talks only about a fivefold increase in interconnect bandwidth, not about a similar increase in memory bandwidth. They do support 3D-stacked RAM though (HBM? HMC?), so that might explain the number.
  • T1beriu - Wednesday, September 28, 2016 - link

    Have you even read anything from this announcement?!
  • TeXWiller - Wednesday, September 28, 2016 - link

    Is there a source for those simulated memory controller results somewhere, such as a release event or an interview? The news release talks about throughput related to interconnect bandwidth as a whole, not specifically to the memory controller, as this article suggests.
  • jjj - Tuesday, September 27, 2016 - link

    Don't forget that on the slide the A57 is on the same process, so a real-world A57 on 28nm is not the baseline.
  • Pork@III - Tuesday, September 27, 2016 - link

    I see the first new competitor in a server market that has been too uniform for many years. Time for a change!
  • patrickjp93 - Thursday, October 6, 2016 - link

    Cavium and Qualcomm both just tried and failed against Xeon D and Avoton. Intel is not losing that market without quite a fight.
  • Communism - Wednesday, September 28, 2016 - link

    Certainly looks like this will put a hurt'n on Atom servers.

    Perhaps this will be the straw that breaks the camel's back on 10 Gbit servers, switches, and adapters.
  • Communism - Wednesday, September 28, 2016 - link

    In terms of breaking their absurdly high pricing, of course.
