Our Best Decision Ever – Direct Wired
Before we dive into why we decided to use direct-wired architecture – let’s talk about what direct-wired actually is.
So, what is a direct-wired architecture?
In a server with direct-wired architecture, every drive has its own one-to-one connection to the system, so each drive gets its full throughput. The drives connect through a card called the LSI 9305. The LSI 9305 is the standard option, although others are available.
This card has 4 ports, each of which can connect to 4 SATA or SAS hard drives. A custom cable plugs into each port on the LSI card with an SFF-8643 connector and fans out to 4 SATA connectors. With 4 drives per port, one card offers 16 connections, which fits the 15 drives per row that connect to one card. Because no drives share a link, wiring the cables this way provides high bandwidth. To hear our lead engineer explain our direct-wired architecture in more detail, watch this video.
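To make the wiring concrete, here is a minimal Python sketch of how a drive slot could map to a card, port, and fan-out lane. The counts (4 ports per card, 4 drives per port, 15 drives per row) come from the description above; the 0-based slot numbering and the `locate_drive` helper are assumptions made for illustration, not how any real server labels its bays.

```python
from __future__ import annotations

DRIVES_PER_PORT = 4   # one SFF-8643 port fans out to 4 SATA connectors
PORTS_PER_CARD = 4    # ports on the card, per the description above
DRIVES_PER_ROW = 15   # one row of drives is wired to one card


def locate_drive(slot: int) -> tuple[int, int, int]:
    """Map a 0-based drive slot to (card, port, lane), all 0-based.

    The numbering scheme is hypothetical; it only illustrates that
    every drive ends up on its own lane with nothing shared.
    """
    if slot < 0:
        raise ValueError("slot must be non-negative")
    card, position = divmod(slot, DRIVES_PER_ROW)
    port, lane = divmod(position, DRIVES_PER_PORT)
    return card, port, lane


if __name__ == "__main__":
    for slot in (0, 3, 4, 14, 15, 44):
        print(slot, locate_drive(slot))
    # Each card has one lane left over: 4 ports x 4 lanes = 16, minus 15 drives.
    print("spare lanes per card:", PORTS_PER_CARD * DRIVES_PER_PORT - DRIVES_PER_ROW)
```

In this sketch no two slots ever resolve to the same (card, port, lane), which is the whole point of direct wiring.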
So why did we decide on direct-wired architecture?
The original servers used 9 port-multiplying backplanes. Each backplane had five SATA hard drive connectors and one SATA cable connector that linked the backplane to a SATA controller plugged into the motherboard. This allowed up to five hard drives to be plugged into the board, multiplexed, and sent to a shared SATA port through one cable. Expand this out nine times, and you get 45 HDDs densely packed into a storage chassis. Therefore, 45 drives needed only nine cables and nine SATA ports. This was and still is economical as well as practical for a large number of machines.
There were caveats, however:
- SATA adaptors must be chipset compatible with the backplane port multipliers, which requires specific proprietary drivers.
- Data for 5 drives shares 1 SATA cable (3 Gb/s in the original design), which limits performance to about 60 MB/s per drive, well below the roughly 150 MB/s a mechanical drive can deliver.
- More chances of failure, as one failed SATA cable could take out 5 disks. This is not a problem if you are running these in a massive cluster as Backblaze intended, but it is a significant risk if you are running a standalone ZFS array on one.
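The 60 MB/s figure follows from simple arithmetic, sketched below in Python. It assumes SATA's 8b/10b line encoding (10 bits on the wire per byte of data), which turns a 3 Gb/s link into about 300 MB/s of payload. The function names are ours, and the comparison against a dedicated link of the same speed is only an illustration.

```python
def per_drive_mb_s(link_gbit_s: float, drives_sharing: int) -> float:
    """Payload bandwidth available to each drive on a shared SATA link."""
    # 8b/10b encoding: 10 line bits carry 1 data byte.
    payload_mb_s = link_gbit_s * 1000 / 10
    return payload_mb_s / drives_sharing


def effective_mb_s(link_gbit_s: float, drives_sharing: int,
                   drive_limit_mb_s: float = 150.0) -> float:
    """A drive can go no faster than its own mechanics or its link share."""
    return min(drive_limit_mb_s, per_drive_mb_s(link_gbit_s, drives_sharing))


if __name__ == "__main__":
    # Five drives behind one port multiplier on a 3 Gb/s cable.
    print("shared link:   ", effective_mb_s(3.0, 5), "MB/s per drive")
    # One drive per link (direct wired), same link speed for comparison.
    print("dedicated link:", effective_mb_s(3.0, 1), "MB/s per drive")
```

With five drives on the cable, the link is the bottleneck; with one drive per link, the drive itself is.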
While manufacturing servers for Backblaze, we noticed that their needs were different from those of other storage users, and Protocase ended up customizing about 50% of the servers ordered.
The most common changes involved:
- Slide rails – almost everyone wanted slide rails. The de facto standard pitch of 5-drive port multiplier backplanes makes it impossible to fit a row of 15 drives across a 19-inch rackmount enclosure and still have room for slide rails. That worked fine for Backblaze but not for other storage users.
- Speed – backplanes lower throughput. That works okay in massive clusters but isn’t as practical for performance use cases.
- Reliability and compatibility – the backplanes’ failure rates were high enough to be problematic. With 3 SATA cards and 9 backplanes in each server, the chance of a failure was roughly 3x that of a single card and 9x that of a single backplane. In addition, some applications are incompatible with this approach, so we often rewired servers to direct wired.
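The "3x and 9x" rule of thumb can be checked with a short Python sketch. It assumes components fail independently and uses a made-up per-component failure probability of 1%; that number is purely hypothetical and is not a measured rate for any card or backplane.

```python
def p_any_failure(p_single: float, count: int) -> float:
    """Probability that at least one of `count` independent parts fails."""
    return 1.0 - (1.0 - p_single) ** count


if __name__ == "__main__":
    p = 0.01  # hypothetical failure probability of one card or backplane
    for label, count in (("SATA cards", 3), ("backplanes", 9)):
        p_any = p_any_failure(p, count)
        # For small p, the ratio is close to `count`.
        print(f"{count} {label}: {p_any:.4f} ({p_any / p:.2f}x a single part)")
```

For small failure probabilities the result is close to the part count multiplied by the single-part rate, which is where the 3x and 9x figures come from.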
Eventually, the backplanes (CFI-B53PM (SIL 3726)) went end of life, and the need to meet several criteria compounded the challenge of finding a replacement:
- Backblaze required the absolute lowest cost.
- Backblaze required matching SATA card chipsets, backplanes, and drivers.
At this point, we were also having interesting exchanges with engineers at Netflix, who, as many know, build their own storage servers to stream video to their customers. They were successfully making direct-wired connection architecture work on a large scale.
We and Backblaze agreed that this approach had certain advantages and began developing the Pod 4.0 architecture (otherwise known as the Storinator), based on an inventive leap of fixing connectors on ‘wired backplanes’ so drives could be mounted in the same configuration as previous servers.
Eventually, Backblaze went back to a backplane-based design because it made more sense within their business model and technology context.
Although most of those who bought the earlier backplane-based pods did just fine, we saw too many users struggle to make this class of machine work for them. With direct wired, once we got through our early cable and driver problems, we saw the load on our support team drop dramatically, and users were delighted to see performance go way up!
So, direct wired became our thing.