Troubleshooting and maintenance in Ceph
First Aid
Monitoring and Ceph
Monitoring is undoubtedly important in the admin's life, and Ceph supplies a number of approaches. Ceph itself knows the condition of each individual OSD, so it would be possible to monitor the OSDs individually from the monitoring application. The problem is that none of the established monitoring solutions yet includes the code to do so. Nevertheless, Ceph users do not have to get along entirely without monitoring: a rudimentary Nagios plugin [4] can at least parse the output from ceph health and display the status messages in the monitoring tool. According to the developers, Inktank [5] is also working to improve support for the current monitoring tools; on that front, you can expect to see some new features in the near future.
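To illustrate the idea behind such a check, here is a minimal Python sketch that maps the text printed by ceph health to the conventional Nagios exit codes. It is not the code of the plugin in [4]; the function names health_to_nagios and check_ceph_health are made up for this example, and it assumes that the output begins with HEALTH_OK, HEALTH_WARN, or HEALTH_ERR.

```python
import subprocess

# Conventional Nagios plugin exit codes.
NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL, NAGIOS_UNKNOWN = 0, 1, 2, 3


def health_to_nagios(output):
    """Map the text printed by `ceph health` to an (exit_code, message) pair.

    Assumption: the output starts with HEALTH_OK, HEALTH_WARN, or HEALTH_ERR.
    """
    text = output.strip()
    if text.startswith("HEALTH_OK"):
        return NAGIOS_OK, text
    if text.startswith("HEALTH_WARN"):
        return NAGIOS_WARNING, text
    if text.startswith("HEALTH_ERR"):
        return NAGIOS_CRITICAL, text
    return NAGIOS_UNKNOWN, "unexpected output: " + text


def check_ceph_health():
    """Run `ceph health` and translate the result (hypothetical helper).

    Needs a working ceph command-line client on the monitoring host.
    """
    try:
        result = subprocess.run(
            ["ceph", "health"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return NAGIOS_UNKNOWN, "could not run ceph health: %s" % exc
    if result.returncode != 0:
        return NAGIOS_UNKNOWN, result.stderr.strip() or "ceph health failed"
    return health_to_nagios(result.stdout)
```

A wrapper script would print the message returned by check_ceph_health() and exit with the returned code, which is how Nagios picks up the state of a check.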
Admin Sockets
Finally, a little tip for admins who want to know in detail what Ceph is doing: You can obtain accurate performance data via admin sockets. Sockets are usually in /var/run/ceph, and the name ends in .asok. As an example, you can retrieve the latest performance data for an OSD on Charlie, given an OSD ID of 3, with:
ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok perf dump | python -m json.tool
The output is given in JSON format (Figure 5), so piping it to python -m json.tool makes the output readable (Figure 6).
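Because the output is JSON, it can also be processed in a script rather than just pretty-printed. The following Python sketch makes some assumptions: the helper names list_admin_sockets, perf_dump, and flatten_counters are invented for this example, and the only thing assumed about the data is that it consists of nested JSON objects. No particular counter names are taken for granted.

```python
import glob
import json
import os
import subprocess


def list_admin_sockets(run_dir="/var/run/ceph"):
    """Return the sorted paths of all files in run_dir ending in .asok."""
    return sorted(glob.glob(os.path.join(run_dir, "*.asok")))


def perf_dump(socket_path):
    """Run the command shown above for one admin socket and parse the JSON."""
    raw = subprocess.check_output(
        ["ceph", "--admin-daemon", socket_path, "perf", "dump"]
    )
    return json.loads(raw)


def flatten_counters(tree, prefix=""):
    """Flatten nested JSON objects into {"section.counter": value} pairs."""
    flat = {}
    for key, value in tree.items():
        name = prefix + "." + key if prefix else key
        if isinstance(value, dict):
            # Descend into sub-objects and extend the dotted name.
            flat.update(flatten_counters(value, name))
        else:
            flat[name] = value
    return flat
```

On a host with running daemons, flatten_counters(perf_dump("/var/run/ceph/ceph-osd.3.asok")) would return one flat dictionary whose dotted keys are easy to filter or to hand over to a monitoring tool.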
Infos
[1] "The RADOS Object Store and Ceph Filesystem" by Martin Loschwitz, ADMIN, 2012, No. 9, pg. 28, http://www.admin-magazine.com/HPC/Articles/The-RADOS-Object-Store-and-Ceph-Filesystem/(language)/eng-US
[2] "The RADOS Object Store and Ceph Filesystem: Part 2" by Martin Loschwitz, ADMIN, 2012, No. 11, pg. 42, http://www.admin-magazine.com/HPC/Articles/RADOS-and-Ceph-Part-2/(language)/eng-US
[3] "The RADOS Object Store and Ceph Filesystem: Part 3" by Martin Loschwitz, http://www.admin-magazine.com/HPC/Articles/rados_and_ceph/(language)/eng-US
[4] Nagios plugin: https://github.com/ceph/ceph-nagios-plugin
[5] Inktank: http://www.inktank.com/