Troubleshooting and maintenance in Ceph

First Aid

Monitoring and Ceph

Monitoring is undoubtedly important in an admin's life, and Ceph supports a number of approaches. Ceph itself knows the condition of each individual OSD, so in principle a monitoring application could track the OSDs individually. The problem is that none of the established monitoring solutions has implemented this in code so far. Nevertheless, Ceph users do not have to do without monitoring completely: a rudimentary Nagios plugin [4] can at least parse the output of ceph health and display the status messages in the monitoring tool. Additionally, according to the developers, Inktank [5] is working to improve support in the current monitoring tools, so you can expect to see some new features on that front in the near future.
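To illustrate the idea of parsing ceph health for a monitoring tool, here is a minimal Python sketch. It is not the code of the plugin in [4]; it only assumes that the output of ceph health begins with HEALTH_OK, HEALTH_WARN, or HEALTH_ERR and maps these to the conventional Nagios exit codes. The function names classify_health and check_ceph_health are hypothetical and chosen for this example.

```python
import subprocess

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_health(output):
    """Map the text printed by `ceph health` to an (exit_code, message) pair.

    Assumption: the output starts with HEALTH_OK, HEALTH_WARN, or HEALTH_ERR.
    Anything else is reported as UNKNOWN.
    """
    text = output.strip()
    if text.startswith("HEALTH_OK"):
        return OK, text
    if text.startswith("HEALTH_WARN"):
        return WARNING, text
    if text.startswith("HEALTH_ERR"):
        return CRITICAL, text
    return UNKNOWN, "unexpected output: " + repr(text)


def check_ceph_health(timeout=30):
    """Run `ceph health`, print the status line, and return a Nagios exit code."""
    try:
        result = subprocess.run(
            ["ceph", "health"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The ceph binary is missing or the cluster did not answer in time.
        print("UNKNOWN: could not run ceph health: {}".format(exc))
        return UNKNOWN
    code, message = classify_health(result.stdout)
    print(message)
    return code
```

A wrapper script would call sys.exit(check_ceph_health()) so that the monitoring tool sees the exit code. Keeping the classification separate from the subprocess call makes it possible to test the parsing without a running cluster.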

Admin Sockets

Finally, a little tip for admins who want to know in detail what Ceph is doing: You can obtain accurate performance data via admin sockets. The sockets usually reside in /var/run/ceph, and their names end in .asok. For example, you can retrieve the latest performance data for the OSD with ID 3 on the host Charlie with:

ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok perf dump | python -m json.tool

The command returns its output as JSON (Figure 5); piping it through python -m json.tool makes it readable (Figure 6).

Figure 5: Performance data queried using admin sockets.
Figure 6: Readable performance data.
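If you want to process the performance data in a script rather than read it, the same query can be sketched in Python. The example below assumes that perf dump returns a nested JSON object whose leaves are counter values; the helper names perf_dump and flatten_counters, as well as the sample counter names, are hypothetical and only serve as an illustration.

```python
import json
import subprocess


def perf_dump(socket_path):
    """Query `perf dump` through the given admin socket and return parsed JSON."""
    out = subprocess.check_output(
        ["ceph", "--admin-daemon", socket_path, "perf", "dump"]
    )
    return json.loads(out)


def flatten_counters(data, prefix=""):
    """Flatten nested counter sections into a dict with dotted key names."""
    flat = {}
    for key, value in data.items():
        name = "{}.{}".format(prefix, key) if prefix else key
        if isinstance(value, dict):
            # Descend into sub-sections and composite counters.
            flat.update(flatten_counters(value, name))
        else:
            flat[name] = value
    return flat


# Made-up sample in the assumed shape; real output contains many more counters.
sample = {"osd": {"op": 10, "op_latency": {"avgcount": 2, "sum": 0.5}}}

for name, value in sorted(flatten_counters(sample).items()):
    print(name, value)
```

On a host with a running OSD, flatten_counters(perf_dump("/var/run/ceph/ceph-osd.3.asok")) would yield one dotted key per counter, a form that is easy to feed into a graphing or monitoring tool.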

Infos

  1. "The RADOS Object Store and Ceph Filesystem" by Martin Loschwitz, ADMIN, 2012, No. 9, pg. 28, http://www.admin-magazine.com/HPC/Articles/The-RADOS-Object-Store-and-Ceph-Filesystem/(language)/eng-US
  2. "The RADOS Object Store and Ceph Filesystem: Part 2" by Martin Loschwitz, ADMIN, 2012, No. 11, pg. 42, http://www.admin-magazine.com/HPC/Articles/RADOS-and-Ceph-Part-2/(language)/eng-US
  3. "The RADOS Object Store and Ceph Filesystem: Part 3" by Martin Loschwitz, http://www.admin-magazine.com/HPC/Articles/rados_and_ceph/(language)/eng-US
  4. Nagios plugin: https://github.com/ceph/ceph-nagios-plugin
  5. Inktank: http://www.inktank.com/
