Background
OS: Ubuntu 16.04
I modified some OSD configuration options. The OSD service has to be restarted before such changes take effect. After the first restart, the new configuration took effect immediately. After I changed some more options and restarted the OSD service again, the configuration no longer took effect. Checking with the ps command showed that the osd process had not started at all.
Analysis
Because the osd process was not running, my first guess was a configuration error that made the osd process exit right after it started. I went to the /var/log/ceph directory and checked the ceph-osd. log. The end of the log contained only the messages from shutting the process down, with nothing about an osd starting. The timestamps of those last entries matched the time the service was stopped. In other words, on the second restart the osd was never started. Since it never started, the problem is not in the osd itself but in the command used to restart the service, systemctl restart.
First, check the status of the OSD service.
$ systemctl status ceph-osd.target
● ceph-osd.target - ceph target allowing to start/stop all ceph-osd@.service instances at once
Loaded: loaded (/lib/systemd/system/ceph-osd.target; enabled; vendor preset: enabled)
Active: inactive (dead) since Sun 2017-03-05 16:52:04 CST; 3s ago
As expected, the service is inactive. Check the service-related logs:
$ journalctl -xe
Mar 05 14:21:43 node3 systemd[1]: ceph-osd@: Start request repeated too quickly.
Mar 05 14:21:43 node3 systemd[1]: Failed to start Ceph object storage daemon.
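On a busy node the output of journalctl -xe can be long, so it may help to filter it for the rate-limit message. The following is a minimal sketch, not part of the original troubleshooting session. The function name rate_limited_units is hypothetical, and the pattern assumes the message format shown in the excerpt above.

```shell
# Hypothetical helper: reads journal lines on stdin and prints the names of
# units that systemd refused to start because of its start rate limit.
rate_limited_units() {
    # Assumed message format, taken from the journal excerpt above:
    #   "... systemd[1]: <unit>: Start request repeated too quickly."
    sed -n 's/^.*systemd\[[0-9]*\]: \(.*\): Start request repeated too quickly\.$/\1/p' | sort -u
}

# Typical use on a live system (needs permission to read the journal):
#   journalctl -b --no-pager | rate_limited_units
```

Each unit name is printed once, even if the message appears many times in the journal.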
The journal confirms that the service failed to start, and the reason given is that start requests were repeated too quickly. This points to the configuration of the osd service. Opening the osd service configuration file /etc/systemd/system//ceph-osd@ shows a limit on how often the service may be started, with the limit interval set to 30 minutes. That explains why the first restart succeeded but the second one failed.
$ vi /etc/systemd/system/ceph-osd.target.wants/ceph-osd@0.service
StartLimitInterval=30min
Solution
Comment out the StartLimitInterval line in the service configuration file, then reload the service configuration.
$ systemctl daemon-reload
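The comment-out step, which has to happen before the daemon-reload, could also be scripted instead of done by hand in vi. The following is a rough sketch only. The function name comment_out_start_limit is hypothetical, and the sketch assumes GNU sed, as shipped with Ubuntu.

```shell
# Hypothetical helper: comment out every active StartLimitInterval= line
# in the unit file given as the first argument.
# Assumes GNU sed: -i.bak keeps a .bak backup of the original file, and
# --follow-symlinks edits the link target instead of replacing the symlink.
comment_out_start_limit() {
    sed -i.bak --follow-symlinks 's/^[[:space:]]*StartLimitInterval=/#&/' "$1"
}

# Usage with the unit file from this post (run as root, then daemon-reload):
#   comment_out_start_limit /etc/systemd/system/ceph-osd.target.wants/ceph-osd@0.service
```

Lines that are already commented out are left alone, so running the helper twice does no harm.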
Restart the OSD service and check its status.
$ systemctl restart
$ systemctl status
● - ceph target allowing to start/stop all ceph-osd@.service instances at once
Loaded: loaded (/lib/systemd/system/; enabled; vendor preset: enabled)
Active: active since Sun 2017-03-05 16:47:53 CST; 5s ago
Mar 05 16:47:53 node2 systemd[1]: Reached target ceph target allowing to start/stop all ceph-osd@.service instances at once.
The service status is now active and the problem is solved.