Management improvements, memory scaling, and EOL for FileStore

Refreshed

Compression

As is the wont with Ceph, the individual Ceph components come with a cornucopia of small but subtle changes. The OSDs now support compression for inter-OSD communication, taking into account that many Ceph systems today have CPUs that are far too powerful. However, planners are still reluctant to use the latest network technologies, such as 400Gbps connections, for financial reasons. Ceph clusters that are overflowing their network at rush hour can use this feature to give themselves some breathing space at the expense of their CPUs. However, you should not have overly high expectations. Ultimately, a permanently overloaded network can only be brought to reason with faster hardware.

Improved Overview for Logs

Another change in your OSDs for which to be happy – hidden in the changelog – is that the OSDs will log slow requests far more clearly than was previously the case. A slow request is basically any write operation to an OSD that does not complete successfully within a specified time period. Virtually any type of problem has the potential to generate many slow requests, whether disruptions on the network, problems with individual OSDs, or configuration errors.

The crux is that each client that uploads something to RADOS triggers three write operations in the default configuration. The first goes directly to the target OSD, but from there replicas of the respective objects are also sent to the secondary OSDs, which, in the case of more extensive problems, results in an extremely large number of slow requests in a short period of time that the cluster is unable to process. Until now, RADOS simply forwarded corresponding messages to the ceph -w output, where they no longer made sense after a short while.

The new format now bundles the messages, generating far less text on the terminal. If you need the old format (e.g., because monitoring scripts depend on it), you can reinstate the old format in the configuration.

Drilled Out Automation

Although some might turn their noses up at Ceph-Mgr and its cohort cephadm (Figure 3), others are happy to use the tool. Both positions are justified. On the one hand, you need to ask why Ceph bothers competing with products like Ansible when alternatives like ceph-ansible exist and work well. On the other hand, Ceph-Mgr has evolved into far more than a plain vanilla automation tool.

Figure 3: Ceph has a complete automation suite of its own in cephadm, with new features and more data export formats added in Ceph 17.2.

The tool's management services are not only perfectly adapted to Ceph and its needs but now also provide an interface for other programs to retrieve data from Ceph. For example, anyone who wants Prometheus to tap data from Ceph for visualization on modified dashboards (Figure 4) will now also be able to tap into the Manager (Ceph-Mgr), because it provides a native Prometheus interface with data in the appropriate format.

Figure 4: The Ceph dashboard is now an established component of the solution and comes with some minor changes in version 17.2.

In Ceph 17.2, however, the management component of the solution has only seen minor updates. In addition to querying by way of the Prometheus protocol, for example, a Simple Network Management Protocol (SNMP) interface is now also available for extracting various key figures from Ceph and processing them in other monitoring solutions.

Moreover, the developers have massively expanded the solution's ability to roll out individual Ceph services, such as the Manager component, metadata servers, and instances of the RADOS gateway, in parallel with the same systems. Additionally, the zap subcommand can automatically zap OSDs when removed from the cluster, ensuring that the existing data is reliably deleted. Finally, cephadm now supports a server agent mode, which should significantly increase the performance of the solution.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy ADMIN Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • What's new in Ceph
    Ceph and its core component RADOS have recently undergone a number of technical and organizational changes. We take a closer look at the benefits that the move to containers, the new setup, and other feature improvements offer.
  • Ceph object store innovations
    The Ceph object store remains a project in transition: The developers announced a new GUI, a new storage back end, and CephFS stability in the just released Ceph c10.2.x, Jewel.
  • Getting Ready for the New Ceph Object Store

    The Ceph object store remains a project in transition: The developers announced a new GUI, a new storage back end, and CephFS stability in the just released Ceph v10.2.x, Jewel.

  • Manage cluster state with Ceph dashboard
    The Ceph dashboard offers a visual overview of cluster health and handles baseline maintenance tasks; with some manual work, an alerting function can also be added.
  • The RADOS Object Store and Ceph Filesystem

    Scalable storage is a key component in cloud environments. RADOS and Ceph enter the field, promising to support seamlessly scalable storage.

comments powered by Disqus