In this blog and the series of blogs to follow, I will focus solely on Ceph clustering. Many people are intimidated by Ceph because they find it complex, but once you understand it, that’s not the case. I want you to leave this blog with a better understanding of what Ceph is and why you should use it. Later in the series, I want to dive into how it works and eventually get into some testing and results from our 45Drives lab. But first, let’s start with the what and the why.

The majority of our customers are interested in storing files – so we will focus on that capability (CephFS) in this blog. 

Ceph Cluster using 5 XL60 Storinators and 3 Ceph Monitors

What is Ceph?

Ceph is free, open source clustering software that ties together multiple storage servers, each containing a large number of hard drives. Essentially, Ceph provides object, block and file storage in a single, horizontally scalable cluster, with no single points of failure.

Note: Ceph is a powerful system which can also provide block (SAN) or object storage, so please talk to us if you need this ability.

A Ceph storage cluster can be easily scaled over time.  It may be configured for data security through redundancy and for high-availability by removing single points of failure.  It also has many enterprise features including snapshots, thin provisioning, tiering and self-healing capabilities.  Best of all, Ceph is extremely stable and developed for high performance, while also being open source and free of licensing fees.

Let’s dive a little deeper into 3 Ceph features:

Scalability: This is the reason most of our customers truly love Ceph – its ability to scale in both capacity and performance. If your data outgrows your Ceph cluster as originally configured, you simply increase capacity by adding more hard drives and/or servers. Ceph even allows you to add as little as a single hard drive to your cluster at any time.

For comparison: Gluster clusters are also scalable, but they need much more thought at the configuration stage to avoid boxing yourself in. They also require scaling in larger increments.

Data Security: This can be easily achieved in Ceph by configuring the cluster with either replication or erasure coding. Replication stores several full copies of your data, while erasure coding works much like parity RAID when implemented at the hard drive level. Erasure coding can also be applied at the server level or at even higher levels of abstraction.
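To make the trade-off between the two options concrete, here is a minimal sketch of the usual capacity arithmetic. The 600 TB raw figure, the 3-copy replication and the 4+2 erasure coding layout are illustrative assumptions, not a recommended configuration, and the numbers ignore file system overhead and free-space headroom.

```python
def usable_replicated(raw_tb: float, replicas: int = 3) -> float:
    """Usable capacity when every object is stored `replicas` times."""
    return raw_tb / replicas


def usable_erasure_coded(raw_tb: float, k: int, m: int) -> float:
    """Usable capacity when each object is split into k data chunks
    plus m coding (parity-like) chunks."""
    return raw_tb * k / (k + m)


raw = 600.0  # hypothetical raw capacity in TB

print(usable_replicated(raw, 3))        # 200.0
print(usable_erasure_coded(raw, 4, 2))  # 400.0
```

In this example both layouts can lose two drives (or two servers, if chunks are spread across servers) without losing data, but erasure coding leaves twice as much usable space. The cost is extra computation when writing and rebuilding data.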

High Availability: Ceph storage servers create replicas on other Ceph nodes to ensure high availability. Ceph is also fault tolerant, using multiple disks over multiple servers to provide a single storage cluster, with no single point of failure – thus ensuring data access is always available.

Why Ceph?

We talked about what Ceph is and what it has to offer, but let’s talk about why you would choose a 45Drives Ceph Cluster for your use case.

1.       Highly Scalable – As mentioned above, Ceph is highly scalable. Your storage cluster can start out as small as you want (even a single server) and grow whenever you want. With very little effort, simply by adding more servers and drives, you can scale over time, in sync with your needs and budget, without hitting a practical maximum storage limit.

2.       Object, Block and File Storage – Ceph can provide all of these with one cluster.

3.       Scalable Performance – Ceph has no “centralized” registry for data to flow through, so you don’t get that bottleneck of data when you add more storage, clients, etc. In fact, Ceph can automatically balance the file system to deliver maximum performance.

4.       Flexibility – You’re not limited to identical hardware nodes when adding more into the cluster. For example, you can add a Storinator AV15 to a Ceph cluster of three Storinator XL60s, or even add all-flash Storinators to what was an all-spinning-disk cluster.  You can also add any other type of server to your cluster, which allows you to make use of legacy storage servers that you may own.

5.       Self-Healing & Self-Managed – If Ceph notices that a node has gone down, it will start replicating your data to a new location in the background so it’s always stored redundantly.
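Points 3 and 5 both rest on the same idea: data locations are computed rather than looked up in a central table. The sketch below illustrates that idea with a simple hash-ranking scheme (rendezvous hashing). It is not Ceph's actual placement algorithm, and the function name, node names and object name are all made up for illustration.

```python
import hashlib


def place(object_name: str, servers: list[str], copies: int = 3) -> list[str]:
    """Rank servers by a hash of (object, server) and keep the top `copies`.

    Any client that knows the server list gets the same answer,
    so no central lookup service is needed.
    """
    def score(server: str) -> int:
        digest = hashlib.sha256(f"{object_name}:{server}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(servers, key=score, reverse=True)[:copies]


servers = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical names

before = place("photo.jpg", servers)

# Simulate losing the first server that held a copy.
survivors = [s for s in servers if s != before[0]]
after = place("photo.jpg", survivors)

print(before)
print(after)
```

When a server drops out, the two surviving copies stay where they are and only one new location is chosen for the missing copy. That is the general shape of what happens when a cluster heals itself in the background.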

Let’s talk about CephFS, and why we love this file system.

  1.  It allows you to create a file storage system with all the advantages of Ceph.
  2.  It is POSIX compliant, and almost any application that addresses files can easily interact with it. CephFS has no problem sharing files out via CIFS (Windows) or NFS (UNIX).
  3.  CephFS allows you to effectively use HDDs and SSDs in the same file system.
  4.  It has flexible metadata caching, which allows it to handle directories with large numbers of inodes (file and directory entries).
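In practice, POSIX compliance means an application needs no Ceph-specific code: ordinary file calls work once the file system is mounted. Here is a small sketch of that, where a temporary directory stands in for a real mount point (something like /mnt/cephfs, which is an assumed path) and the helper name is made up.

```python
import os
import tempfile
from pathlib import Path


def write_and_read(mount_point: str) -> str:
    """Create a directory, write a file, set permissions and read it back
    using only standard file operations. Nothing here is Ceph-specific."""
    target = Path(mount_point) / "reports" / "hello.txt"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("stored on the cluster\n")
    os.chmod(target, 0o640)  # standard POSIX-style permissions
    return target.read_text()


# A temporary directory stands in for a mounted CephFS path (assumption).
with tempfile.TemporaryDirectory() as demo_mount:
    print(write_and_read(demo_mount))
```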

Because Ceph is open source, potential users may have mixed feelings depending on their organization and its needs. The fact that Ceph is free of license fees is definitely a plus, but it can leave some users wondering about stability, future development and where to turn if something goes wrong.  We call this “open source angst”.  

But here’s the reality:

  • Ceph has achieved huge success, and many major organizations depend on it (see below).
  • It has achieved critical mass, which will carry it well into the future.
  • It is mature, recognized as being rock solid and able to provide the stability that enterprise users require.
  • It is well supported by commercial third parties.  For example, we here at 45Drives provide Ceph support, including configuration, integration into your network, and any ongoing service that our customers may require, such as expansion or maintenance.

A few organizations currently using Ceph are:

  • Intel
  • Blizzard Entertainment
  • Google
  • Verizon
  • Bloomberg
  • T-Mobile
  • Yahoo

Ceph is the result of hundreds of contributors and organizations working together in the best practices of open source. Many organizations have invested effort into making Ceph better over the years.

Now that you have a little better understanding of Ceph and CephFS, stay tuned for our next blog, where we will dive into how the 45Drives Ceph cluster works and how you can use it.

Check out our YouTube series titled “A Conversation about Storage Clustering: Gluster VS Ceph,” where we talk about the benefits of both clustering software packages. Those videos are packed with helpful information for choosing the software for your cluster solution.

Your feedback is always welcome, and we’d love to hear your input on this software and clustering solution. Please email us or leave a comment below.

About The Author

Shana Lawrence has been with 45Drives since 2016. She is our content editor, a technical writer, and the voice behind 45Drives social media.