CloudFounders: Cloud Storage Router

The latency of the DSS on top of slow SATA disks is, of course, pretty bad. CloudFounders solves this with an intelligent flash cache that caches both reads and writes. This so-called “Storage Accelerator” is part of the Cloud Storage Router and runs on top of the DSS backend.
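A minimal sketch of the read-side behavior such a flash cache implies (the class and method names below are our own illustration, not CloudFounders' API): hot 4KB blocks are served from a fast cache, and only misses fall through to the slow DSS backend.

```python
from collections import OrderedDict

BLOCK_SIZE = 4096          # 4KB blocks, as described above
CACHE_BLOCKS = 256         # tiny capacity, for illustration only

class FlashReadCache:
    """Illustrative LRU read cache in front of a slow backend (hypothetical)."""

    def __init__(self, backend_read, capacity=CACHE_BLOCKS):
        self.backend_read = backend_read      # callable: block number -> 4KB bytes
        self.capacity = capacity
        self.cache = OrderedDict()            # block number -> 4KB bytes

    def read(self, block_no):
        if block_no in self.cache:            # cache hit: served from "flash"
            self.cache.move_to_end(block_no)
            return self.cache[block_no]
        data = self.backend_read(block_no)    # cache miss: slow SATA/DSS read
        self.cache[block_no] = data
        if len(self.cache) > self.capacity:   # evict the least recently used block
            self.cache.popitem(last=False)
        return data
```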

First of all, the typical 4KB writes are all stored on the SSD. These 4KB blocks are aggregated into a Storage Container Object (SCO), typically 16MB in size. As a result, the flash cache is used for what it does best (working with 4KB blocks), and the DSS SATA backend only sees sequential 16MB writes. Large SATA magnetic disks perform far better with large sequential writes than with small random ones.
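The write path can be pictured with a short sketch: small 4KB writes land in a buffer on the SSD and are flushed to the backend only once a full SCO has accumulated. The class name and the flush callback below are our own assumptions for illustration.

```python
BLOCK_SIZE = 4096
SCO_SIZE = 16 * 1024 * 1024            # 16MB Storage Container Object, per the article

class ScoWriter:
    """Illustrative write aggregator: 4KB random writes in, 16MB sequential writes out."""

    def __init__(self, flush_sco):
        self.flush_sco = flush_sco     # callable that writes one 16MB object sequentially
        self.buffer = bytearray()      # stands in for the SSD write cache

    def write_block(self, data):
        assert len(data) == BLOCK_SIZE
        self.buffer += data
        if len(self.buffer) >= SCO_SIZE:           # a full SCO has accumulated
            self.flush_sco(bytes(self.buffer))     # one large sequential backend write
            self.buffer.clear()
```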

A server with virtual machines can connect to several Cloud Storage Routers, and the blocks of a virtual disk can be spread over several flash caches. The result is that performance scales with the number of nodes running Cloud Storage Routers. The magic that makes this work is that the metadata is distributed among all Cloud Storage Routers (using a Paxos-based distributed database), so “hot blocks” can be served from the flash caches of several nodes at the same time. The metadata also contains a hash of each 4KB block. Because the hash of every cached 4KB block is compared against this metadata, the cache is “deduped”: each unique 4KB block is written only once.
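The dedup step boils down to content addressing: before a 4KB block is stored, its hash is looked up in the (replicated) metadata, and only unseen content is written. A rough sketch follows; the hash choice and data structures are ours, not CloudFounders' actual metadata format.

```python
import hashlib

class DedupIndex:
    """Illustrative content-addressed index: each unique 4KB block is stored once."""

    def __init__(self):
        self.blocks = {}                 # hash -> block data (the "cache")
        self.vdisk_map = {}              # (vdisk, block number) -> hash (the "metadata")

    def write(self, vdisk, block_no, data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blocks:    # new content: store the block exactly once
            self.blocks[digest] = data
        self.vdisk_map[(vdisk, block_no)] = digest

    def read(self, vdisk, block_no):
        return self.blocks[self.vdisk_map[(vdisk, block_no)]]
```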

As the blocks of a virtual disk are distributed over the DSS, the failure of compute or storage nodes does not need to be disruptive. Compute node failures can be handled by enabling VMware HA; storage node failures are covered by configuring the DSS durability policy. And because the metadata is distributed, the remaining Cloud Storage Routers can still find the blocks on the DSS that belong to a given virtual disk.

Last but not least, the Cloud Storage Routers can also distribute the check blocks over cloud storage such as Amazon S3 or an OpenStack Swift implementation. The data remains secure, as the blocks are encoded and spread over many volumes; only the Cloud Storage Routers know how to reassemble them.
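Conceptually, pushing the encoded check blocks out to object storage is just a series of object PUTs. A hypothetical sketch using boto3 against S3 follows; the bucket name, key scheme, and the idea that the blocks arrive already encoded are placeholders, not the product's actual layout.

```python
import boto3

s3 = boto3.client("s3")                      # assumes AWS credentials are configured
BUCKET = "example-dss-backend"               # hypothetical bucket name

def store_check_blocks(sco_id, check_blocks):
    """Spread the encoded check blocks of one SCO over object storage."""
    for i, block in enumerate(check_blocks): # each block is an opaque bytes object
        s3.put_object(
            Bucket=BUCKET,
            Key=f"sco/{sco_id}/check/{i:04d}",
            Body=block,
        )
```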

Highly scalable, very reliable, and very flexible (e.g. when used with Amazon S3): the CloudFounders Storage Router sounds almost too good to be true. We have set up a Storage Router in our lab and can confirm that it can do some amazing things, like replicating an ESXi VM across the globe and booting it as a Hyper-V VM. We are currently designing benchmarks to compare it with traditional storage systems, so we hope to report back with some solid tests. But it is safe to say that the combination of BitSpread and the Cloud Storage Router is very different from the traditional RAID-enabled SAN and storage gateway.

Comments

  • Jammrock - Monday, August 5, 2013 - link

    Great write up, Johan.

    The Fusion-IO ioDrive Octal was designed for the NSA. These babies are probably why they could spy on the entire Internet without ever running low on storage IO. Unsurprisingly that bit about the Octal being designed for the US government is no longer on their site :)
  • Seemone - Monday, August 5, 2013 - link

    I find the lack of ZFS disturbing.
  • Guspaz - Monday, August 5, 2013 - link

    Yeah, you could probably get pretty far throwing a bunch of drives into a well configured ZFS box (striped raidz2/3? Mirrored stripes? Balance performance versus redundancy and take your pick) and throwing some enterprise SSDs in front of the array as SLOG and/or L2ARC drives.

    In fact, if you don't want to completely DIY, as many enterprises don't, there are companies selling enterprise solutions doing exactly this. Nexenta, for example (who also happen to be one of the lead developers behind modern open-source ZFS), sells enterprise software solutions for this. There are other companies that sell hardware solutions based on this and other software.
  • blak0137 - Monday, August 5, 2013 - link

    Another option for this would be to go directly to Oracle with their ZFS Storage Appliances. This gives companies the very valuable benefit of having hardware and software support from the same entity. They also tend to undercut the entrenched storage vendors on price.
  • davegraham - Tuesday, August 6, 2013 - link

    *cough* it may be undercut on the front end but maintenance is a typical Oracle "grab you by the chestnuts" type thing.
  • Frallan - Wednesday, August 7, 2013 - link

    More like "grab you by the chestnuts - pull until they rip loose and shove 'em up where they don't belong" - type of thing...
  • davegraham - Wednesday, August 7, 2013 - link

    I was being nice. ;)
  • equals42 - Saturday, August 17, 2013 - link

    And perhaps lock you into Larry's platform so he can extract his tribute for Oracle software? I think I've paid for a week of vacation on Ellison's Hawaiian island.

    Everybody gets their money to appease shareholders somehow. Either maintenance, software, hardware or whatever.
  • Brutalizer - Monday, August 5, 2013 - link

    Disks have grown bigger, but not faster. They are also not safer or more resilient to data corruption. Large amounts of data will have data corruption; the more data, the more corruption. NetApp has some studies on this. You need new solutions that are designed from the ground up to combat data corruption. Research papers show that NTFS, ext, etc. and hardware RAID are vulnerable to data corruption, while ZFS does protect against it. You can find all the papers in the Wikipedia article on ZFS, including papers from NetApp.
  • Guspaz - Monday, August 5, 2013 - link

    It's worth pointing out, though, that enterprise use of ZFS should always use ECC RAM and disk controllers that properly report when data has actually been written to the disk. For home use, neither are really required.
