Networking

Server contributions aren't the only things happening under the Open Compute Project. Over the last couple of years, networking has become a new focus. Accton, Alpha Networks, Broadcom, Mellanox and Intel have each released a draft specification of a bare-metal switch to the OCP networking group. The premise of standardized bare-metal switches is simple: you can source standard switch models from multiple vendors and run the OS of your choosing on them, along with your own management tools like Puppet. There is no lock-in, and almost no migration path to worry about when you bring in different equipment.

To that end, Facebook created Wedge, a 40G QSFP+ ToR switch, together with the Linux-based FBOSS switch operating system, to spur development in the switching industry and, as always, to offer better value for the price. FBOSS (along with Wedge) was recently open sourced, and in the process accomplished something far bigger: convincing Broadcom to release OpenNSL, an open SDK for its Trident II switching ASIC. Wedge's main purpose is to decrease vendor dependency (e.g. a choice between an Intel or ARM CPU, and a choice of switching silicon) and to allow consistency across part vendors. FBOSS lets the switch be managed with Facebook's standard fleet management tools. And Facebook is no longer the only one who can play with Wedge, as Accton has announced it will bring a Wedge-based switch to market.


Facebook Wedge in all its glory

 


Logical structure of the Wedge software stack

But in Facebook's leaf-spine network design, you need some heavier core switches as well, connecting all the individual ToR switches to build the datacenter fabric. Traditionally those high-capacity switches are sold by the big network gear vendors like Cisco and Juniper, and at no small cost. You might then be able to guess what happens next: a few days ago Facebook launched '6-pack', its modular, high-capacity switch.


Facebook 6-pack, with 2 groups of line/fabric cards

A '6-pack' switch consists of two module types: line cards and fabric cards. A line card is not so different from a Wedge ToR switch: the 16 40GbE QSFP+ ports at the front are supplied with 640Gbps of the ASIC's 1.28Tbps switching capacity. The main difference from Wedge is that the remaining 640Gbps is linked to a new backside Ethernet-based interconnect, all in a smaller form factor. The line card also has a Panther micro server with a BMC for ASIC management. Within each group in the chassis, the line cards sit in two rows of two, and each card operates independently of the others.
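The port arithmetic in this paragraph can be checked with a short sketch. The numbers come from the text above; the function and variable names are purely illustrative, and the oversubscription figure is a derived value, not something Facebook has published.

```python
# Back-of-the-envelope check of the line card bandwidth split described above.
# All names are illustrative; only the numbers come from the article text.

FRONT_PORTS = 16        # QSFP+ ports on the front of a line card
PORT_SPEED_GBPS = 40    # each front port is 40GbE
BACKPLANE_GBPS = 640    # capacity linked to the backside interconnect


def line_card_bandwidth(front_ports: int, port_speed_gbps: int, backplane_gbps: int) -> dict:
    """Return front-panel, backplane and total ASIC bandwidth in Gbps."""
    front = front_ports * port_speed_gbps
    return {
        "front_gbps": front,
        "backplane_gbps": backplane_gbps,
        "total_gbps": front + backplane_gbps,
        # A ratio of 1.0 means all front-panel traffic can be handed to the fabric.
        "oversubscription": front / backplane_gbps,
    }


card = line_card_bandwidth(FRONT_PORTS, PORT_SPEED_GBPS, BACKPLANE_GBPS)
print(card)
```

The front ports add up to 640Gbps, which together with the 640Gbps backside link gives 1280Gbps, matching the ASIC capacity quoted in the text once rounding is allowed for.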


Line card (note the debug header pins left to the QSFP+ ports)

The fabric card is the part that connects all of the line cards together, and is thus the center of the fabric. Though the fabric card appears to be one module, it actually contains two switches (two 1.28Tbps packet crunchers, each paired with a Panther micro server), and like the line cards, they operate separately from each other. The only thing they share is the management networking path, used by the Panthers and their BMCs, along with the management ports for each of the line cards.
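To make the chassis layout concrete, here is a minimal sketch that counts the independent switching elements in a '6-pack'. It assumes one fabric card per group, which is inferred from the "2 groups of line/fabric cards" caption and not stated outright; the class and field names are hypothetical.

```python
from dataclasses import dataclass

PORTS_PER_LINE_CARD = 16   # 40GbE QSFP+ ports on the front of each line card
GROUPS_PER_CHASSIS = 2     # from the caption: "2 groups of line/fabric cards"


@dataclass(frozen=True)
class Group:
    """One group of cards in the chassis (illustrative model, not an official spec)."""
    line_cards: int = 4              # two rows of two line cards
    fabric_cards: int = 1            # assumption: one fabric card per group
    asics_per_line_card: int = 1     # each line card is a single switch
    asics_per_fabric_card: int = 2   # each fabric card contains two switches

    def switching_elements(self) -> int:
        """Number of independently operating switch ASICs in this group."""
        return (self.line_cards * self.asics_per_line_card
                + self.fabric_cards * self.asics_per_fabric_card)


chassis = [Group() for _ in range(GROUPS_PER_CHASSIS)]
total_elements = sum(group.switching_elements() for group in chassis)
front_ports = sum(group.line_cards for group in chassis) * PORTS_PER_LINE_CARD

print(total_elements, front_ports)  # prints "12 128"
```

Under these assumptions, a full chassis holds twelve independent switching elements and exposes 128 front-facing 40GbE ports.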


Fabric card, with management ports and debug headers for the Panther cards

With these systems, Facebook has come a long way towards building its entire datacenter network from open, commodity components and running it on open software. The networking vendors are likely to notice these developments, and not only because of their pretty blue color.

ONIE

An effort to increase modularity even more is ONIE, short for the Open Network Install Environment. ONIE is focused on eliminating operating system lock-in by providing an environment for installing common operating systems like CentOS and Ubuntu on your switching equipment. ONIE is baked into the switch firmware, and after installation the onboard bootloader (GRUB) boots the OS directly. But before you start writing Puppet manifests or Chef recipes to manage your switches, a small but important side note needs to be added: to operate the switching silicon of the Trident ASIC, you need a proprietary firmware blob from Broadcom. And up until very recently, Broadcom would not give you that blob unless you had some kind of agreement with them. This is why, currently, the only OSes you can install on ONIE-enabled switches are commercial ones like BigSwitch and Cumulus, whose vendors have agreements in place with the silicon vendors.

Luckily, Microsoft, Dell, Facebook, Broadcom, Intel and Mellanox have started work on a proposed Switch Abstraction Interface, which would obviate the need for any custom firmware blobs and allow standard cross-vendor compatibility, though it remains to be seen to what degree it can completely replace proprietary firmware.


26 Comments


  • SuperVeloce - Wednesday, April 29, 2015 - link

    From Mass storage: "Compared to hard disks optical media touts greater reliability, with Blu-ray discs having a life expectancy of 50 years and some discs could even be able to live on for a century."

    Yeah sure. Like my expensive gold color cd's from different vendors, baked on different high quality writers, now mostly not working anymore after some 15-20 years. Despite being held in almost perfect environment all these years
  • Uplink10 - Wednesday, April 29, 2015 - link

    Someday they are going to figure out that:
    -SAS HDDs are costlier but if you are using RAID it does not matter, they should use consumer drives and not overpriced enterprise drives
    -I calculated sometimes back if Bluray cold storage is cheaper than HDDs but it is not and more so you cannot change the data once you write it, it is better to go with HDDs
  • toyotabedzrock - Wednesday, April 29, 2015 - link

    You have to wonder what these networking chip vendors are hiding in the firmware that makes them so resistant to open sourcing the code.
  • Casper42 - Monday, May 4, 2015 - link

    Johan, some of the HP info at the end was interesting, but incomplete.
    If you (or anyone reading this) plan to talk to HP, they will also talk about their relatively new CloudLine "CL" type machines as well.
    They come in standard 1RU/2RU designs as well as OpenRack designs coming soon.
    And the SL line is all being morphed over to Project Apollo which uses the XL prefix.
    Apollo 2500 is now live, 4X00 will replace SL4500, 6000 has already replaced S6500, and the 8000 was a net-new add for Gen9 focused on big HPC farms.
    So anything SL is, or soon will be, a dead platform. (The SLs you mention could be an exception since they are not widely commercially available)
  • Netpower - Tuesday, June 2, 2015 - link

    One general problem with this design is how to take care of power line disturbances entering the power shelves via the 277V AC lines. The 48V DC is filtered via the 48V battery but you must add a filter/power line conditioner somewhere to make sure that transients and sags doesn't kill your power shelves. The 380V DC approach by (http://www.emergealliance.org) is much more reliable and still have all the advantages with higher efficiency, lower cable losses etc.
