Gen11 Graphics: Competing for 1080p Gaming

The new message from Intel is that it is driving to deliver deep gaming experiences with its technology, and the nod to the future is specifically what it wants to do with its graphics technology. Until the company is ready with its Xe designs for 2020 and beyond, it wants to start to lead the way with better integrated designs. That starts with Ice Lake, where the most powerful version of Ice Lake will offer over 1TF of compute performance, support higher resolution HEVC, better display pipes, an enhanced rasterizer, and support for Adaptive Sync.

The key words in that last sentence were ‘the most powerful version’. Because Intel hasn’t really spoken about its product stack yet, the company has been leading with its most powerful Iris Plus designs. We assume this means 28W? That means its high-end performance products, in the best designs, with the fastest memory. Compared to the standard Gen9 implementation of 24 execution units at 1150 MHz turbo, the best Ice Lake Gen11 design will deliver 64 execution units up to a 1100 MHz frequency, good for 1.15 TF of FP32 performance, or 2.30 TF of FP16 performance. Intel promise up to 1.8x better frame rates in games with the best Ice Lake compared to an average 8th Gen Core (Kaby Lake) Gen9 implementation. Intel doesn’t compare the results to a hypothetical Cannon Lake Gen10 implementation.

Intel hasn’t stated how many graphics configurations it will offer, but there would appear to be several given what information has leaked out already. The high-end design with 64 execution units will be called Iris Plus, but there will be a ‘UHD’ version for mid-range and low-end parts, however Intel has not stated how many execution units these parts will have. We suspect that standard dividers will be in play, with 24/32/48 EU designs possible as different parts of the GPU are fused off. There may be some potential for increased frequency in these designs, reducing latency, but ultimately reduced performance over the top design.

It should be noted that Intel is promoting the top model as being suitable for 1080p low-to-mid gaming, which would imply that models with fewer execution units may struggle to hit those highs with different EU counts. Until Intel gives us a full and proper product list, it is hard to tell at this point.

This slide, for example, shows where Intel expects its highest Ice Lake implementation to perform compared to the standard 8th Gen solution. As part of Computex, Intel also showed off some different data:

This graph shows relative FPS, rather than actual FPS, so it’s hard to see if certain games are just hitting 30 FPS in the highest mode. The results here are a function of the combination of increased EU count but also memory bandwidth.

Features for All

There are a number of features that all of the Gen11 graphics implementations will get, regardless of its number of execution units.

For its fixed function units, Gen11 supports two HEVC 10-bit encode pipelines, either two 4K60 4:4:4 streams simultaneously or one 8K30 4:2:2 stream using both pipelines at once. On display pipes, Gen11 has access to three 4K pipes split between DP1.4 HBR3 and HDMI 2.0b. There is also support for 2x 5K60 or 1x 4K120 with a 10-bit color depth.

The rasterizer gets an upgrade, and will now do 16 pixels per clock or 32 bilinear filtered texels per clock. Intel also gives some insight into the cache arrangements, with the execution units having their own 3 MiB of L3 cache and 0.5 MiB of shared local memory.

Intel recommends that to get the best out of the graphics, it should be paired with LPDDR4X-3733 memory in order to extract a healthy 50-60 GB/s bandwidth, and we should expect a number of Project Athena approved designs do just that. However, at the lower end of Ice Lake devices, we might see single channel DDR4 designs take over due to costs, which might limit performance. As always for integrated graphics, memory bandwidth is often a major bottleneck in performance. Back when Intel had eDRAM enabled Crystalwell designs, those chips were good for 50 GB/s bidirectional bandwidth, and we are almost at that stage with DRAM bandwidth designs now. It should be noted that there are tradeoffs with memory support: LPDDR4/X supports 4x 32b channels up to 32 GB with super low power consumption modes, but if users want more capacity, they’ll have to look to DDR4-3200 with 2x 64b channels up to 64 GB, but lose some performance and power savings.

Variable Rate Shading

A feature being implemented in Gen11 is Variable Rate Shading. VRS is a game-dependent technology that allows the GPU adjust the shading performance of the scene render based on what areas are important. All games currently do shading on a per-pixel basis, meaning that each pixel has a full calculation and that data is transferred to the final image. With VRS, shading is calculated over several pixels at once – essentially doing pixel shading in a coarser, lower-resolution manner – to save post-processing time by using averaged data.

The idea is that using this method can reduce some of the load on the execution units, ultimately increasing the frame rate. The size of that combination of pixels can be adjusted on a per-frame basis as well, allowing the game to take advantage of processing budget where it exists, or pull back to a point where performance is needed. Ultimately Intel believes that any image quality loss is not noticeable, especially for the performance impact they expect it to provide. Intel states that this technology is useful for areas such as lighting adjustments, partially obscured objects (by fog/clouds), and areas that undergo blur, or foveated rendering – basically any area where clarity isn’t explicitly required to begin with.

The only issue here though is an ecosystem one – it requires the game developer support. Intel is already working with Epic to add it to the Unreal Engine, and Intel has worked with developers to enable support in titles such as Civilization 6. The difference in performance, according to Intel, can be up to a 30% FPS increase in a best-case scenario. NVIDIA already supports VRS through dedicated hardware, whereas AMD’s current solutions are best described as a more limited shader-based approximation.

Sunny Cove Microarchitecture: Going Deeper and Wider DL Boost and New Instructions: Intel’s AI Acceleration Attack
Comments Locked

107 Comments

View All Comments

  • vFunct - Tuesday, July 30, 2019 - link

    Why did they not go with HDMI 2.1 and PCIe 4.0?
  • bug77 - Tuesday, July 30, 2019 - link

    AMD'd newly released 5700(XT) doesn't support HDMI 2.1, it's not surprising Intel doesn't support it either.
    And PCIe 4.0 would be power hog.
  • ToTTenTranz - Wednesday, July 31, 2019 - link

    The 5700 cards don't support VirtuaLink either, despite AMD belonging to the consortium since the beginning like nvidia and the RTX cards having it for about a year.

    First generation Navi cards are just very, very late.
  • tipoo - Tuesday, July 30, 2019 - link

    PCI-E 4 currently needs chipset fans on desktop parts, the power needed isn't suitable for 15-28W mobile yet.
  • DanNeely - Tuesday, July 30, 2019 - link

    Because Intel product releases have been a mess since the 10nm trainwreck began. Icelake was originally supposed to be out a few years ago. I suspect PCIe4 is stuck on whatever upcoming design was supposed to be the 7nm launch part.

    HDMI 2.1 is probably even farther down the pipeline; NVidia and AMD don't have 2.1 support on their discrete GPUs yet. Intel has historically been a lagging supporter of new standards on their IGPs, so that's probably a few years out.
  • nathanddrews - Tuesday, July 30, 2019 - link

    This whole argument that "real world" benchmarks equate to "most used" is rather dumb anyway. We don't need benchmarks to tell us how much faster Chrome opens Reddit, because the answer is always the same: fast enough to not matter. We need benchmarks at the fringes for those reasons brought up in the room: measuring extremes in single/multi threaded scenarios, power usage, memory speeds; finding weaknesses in hardware and finding flaws in software; and taking a large enough sample to be meaningful across the board.

    Intel wants to eat its cake and still have it - to be fair - who doesn't? But let's get real, AMD is kicking some major butt right now and Intel has to spin it any way they can. What's funny is that the BEST arguments that I've heard from reviewers to go AMD actually has nothing to do with performance, but rather the Zen platform as a whole in terms of features, upgradeability, and cost.

    I say this as a total Intel shill, too. The only AMD systems running in my house right now are game consoles. All my PCs/laptops are Intel.
  • twotwotwo - Tuesday, July 30, 2019 - link

    Interesting to read what Intel suggested some of their arguments in the server space would be: lower TCO like the old Microsoft argument against Linux, and having to revalidate all your stuff to use an AMD platform. Some quotes (from a story in their internal newsletter; the full thing is floating around out there, but couldn't immediately find):

    https://www.techspot.com/news/80683-intel-internal...

    I mean, they'll be fine long term, but trying to change the topic from straightforward bang-for-buck, benchmark results, etc. is an approach you only take in a...certain sort of situation.
  • eek2121 - Wednesday, July 31, 2019 - link

    Unfortunately, your average IT infrastructure guy no longer knows how fast a Xeon Platinum 8168 is vs an AMD EPYC 7601. They just ask OEMs like Dell or HP to sell them a solution. I've even seen cases where faster solutions were replaced with slower solutions because they were more expensive and the numbers looked bigger. It turns out that the numbers that looked bigger were not the numbers that they should have been paying attention to.

    One company I worked at almost bought a $100,000 (yeah I know, small change, but it was a small company) pre-built system. We, as software developers, talked them into letting us handle it instead. We knew a lot about hardware and as a result? We spent around $15,000 in hardware costs. Yes there were labor costs involved in setting everything up, but it only took about 2 weeks for 4 guys, 2 of which were juniors. Had we gone with the blade system, there would have been extensive training needed, which would have costed about the same in labor. Our solution was fully redundant, a hell of a lot faster (the blade system used hardware that was slower than our solution, and it was also a proprietary system that we would be locked into, so there was an additional service contract that costed $$$ and would have to be signed). During my entire time there, we had very few issues with the solution we built outside the occasional hard drive (2 drives in 4 years IIRC) dying and having to pop it out, pop in a new one, and let the RAID rebuild. Zero downtime. In addition, our wifi solution allowed roaming all over a giant building without dropping the signal. Speeds were lightning fast and QoS allowed us to keep someone from taking up too much bandwidth on the guest network. The entire setup worked like a dream.

    We also wanted to use a different setup for the phone system, but they opted to work with a vendor instead. They paid a lot of money for that, and constantly had issues. The administration software was buggy, sometimes the entire system would go down, even adding a user would take down the entire system until things were updated. IIRC after I left they finally switched to the system we wanted to use and had no issues after that.
  • wrkingclass_hero - Tuesday, July 30, 2019 - link

    Uh, I would not be putting cobalt anywhere near my mouth
  • PeachNCream - Tuesday, July 30, 2019 - link

    Real men aren't scared of a few toxic chemicals entering their digestive systems! Clearly you and I are not real men, but we now have a role model to emulate over the course of our soon-to-be-shortened-by-cancer lives.

Log in

Don't have an account? Sign up now