The first step in deploying a Ceph cluster is to deploy the monitor nodes. In this example we use three monitors, with cephx authentication enabled between them.
INSTALL CEPH STORAGE CLUSTER
1. Install yum-plugin-priorities (all nodes)
[root@deploy yum.repos.d]# su - ceph
[ceph@deploy ~]$ ssh 172.17.0.1
sudo yum install -y yum-plugin-priorities
[ceph@deploy ~]$ ssh 172.17.0.2
sudo yum install -y yum-plugin-priorities
[ceph@deploy ~]$ ssh 172.17.0.3
sudo yum install -y yum-plugin-priorities
[ceph@deploy ~]$ ssh 172.17.0.4
sudo yum install -y yum-plugin-priorities
[ceph@deploy ~]$ ssh 172.17.0.5
sudo yum install -y yum-plugin-priorities
[ceph@deploy ~]$ ssh 172.17.0.6
sudo yum install -y yum-plugin-priorities
[ceph@deploy ~]$ ssh 172.17.0.7
sudo yum install -y yum-plugin-priorities
[ceph@deploy ~]$ ssh 172.17.0.8
sudo yum install -y yum-plugin-priorities
Verify that the plugin is enabled on every node:
[ceph@deploy ~]$ ssh 172.17.0.1 sudo cat /etc/yum/pluginconf.d/priorities.conf
[main]
enabled = 1
[ceph@deploy ~]$ ssh 172.17.0.2 sudo cat /etc/yum/pluginconf.d/priorities.conf
[main]
enabled = 1
.......................
[ceph@deploy ~]$ ssh 172.17.0.8 sudo cat /etc/yum/pluginconf.d/priorities.conf
[main]
enabled = 1
2. Install dependency packages (all nodes)
[ceph@deploy ~]$ ssh 172.17.0.1 sudo yum install -y snappy leveldb gdisk python-argparse gperftools-libs
[ceph@deploy ~]$ ssh 172.17.0.2 sudo yum install -y snappy leveldb gdisk python-argparse gperftools-libs
[ceph@deploy ~]$ ssh 172.17.0.3 sudo yum install -y snappy leveldb gdisk python-argparse gperftools-libs
[ceph@deploy ~]$ ssh 172.17.0.4 sudo yum install -y snappy leveldb gdisk python-argparse gperftools-libs
[ceph@deploy ~]$ ssh 172.17.0.5 sudo yum install -y snappy leveldb gdisk python-argparse gperftools-libs
[ceph@deploy ~]$ ssh 172.17.0.6 sudo yum install -y snappy leveldb gdisk python-argparse gperftools-libs
[ceph@deploy ~]$ ssh 172.17.0.7 sudo yum install -y snappy leveldb gdisk python-argparse gperftools-libs
[ceph@deploy ~]$ ssh 172.17.0.8 sudo yum install -y snappy leveldb gdisk python-argparse gperftools-libs
3. Install the ceph package (all nodes)
[ceph@deploy ~]$ ssh 172.17.0.1 sudo yum install -y ceph
[ceph@deploy ~]$ ssh 172.17.0.2 sudo yum install -y ceph
[ceph@deploy ~]$ ssh 172.17.0.3 sudo yum install -y ceph
[ceph@deploy ~]$ ssh 172.17.0.4 sudo yum install -y ceph
[ceph@deploy ~]$ ssh 172.17.0.5 sudo yum install -y ceph
[ceph@deploy ~]$ ssh 172.17.0.6 sudo yum install -y ceph
[ceph@deploy ~]$ ssh 172.17.0.7 sudo yum install -y ceph
[ceph@deploy ~]$ ssh 172.17.0.8 sudo yum install -y ceph
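Steps 1 through 3 run the same commands on every node, one node at a time. If you prefer, they can be collapsed into a single loop from the deploy node. This is only a sketch, assuming the passwordless SSH and sudo setup for the ceph user already shown in the transcripts above, and this example's eight node IPs:
# Sketch only: run the three install steps on all eight example nodes in one pass.
# Assumes the ceph user on the deploy node can ssh and sudo without a password,
# exactly as in the per-node transcripts above.
for node in 172.17.0.{1..8}; do
    ssh "$node" sudo yum install -y yum-plugin-priorities
    ssh "$node" sudo yum install -y snappy leveldb gdisk python-argparse gperftools-libs
    ssh "$node" sudo yum install -y ceph
done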
4. Deploy the monitor nodes (run on mon1, mon2, and mon3)
Create the configuration directories
# These two directories usually already exist after the ceph RPM has been installed.
sudo mkdir -p /var/lib/ceph/mon
sudo mkdir -p /etc/ceph
4.1 Use uuidgen to generate the fsid, which uniquely identifies the cluster (run on mon1):
Unique Identifier: The fsid is a unique identifier for the cluster, and stands for File System ID from the days when the Ceph Storage Cluster was principally for the Ceph Filesystem.
Ceph now supports native interfaces, block devices, and object storage gateway interfaces too, so fsid is a bit of a misnomer.
[ceph@mon1 ~]$ uuidgen
f649b128-963c-4802-ae17-5a76f36c4c76
The corresponding configuration parameter:
fsid = f649b128-963c-4802-ae17-5a76f36c4c76
4.2 Choose a cluster name. For example, if the cluster name is ceph, the configuration file is named ceph.conf:
Cluster Name: Ceph clusters have a cluster name, which is a simple string without spaces. The default cluster name is ceph, but you may specify a different cluster name. Overriding the default cluster name is especially useful when you are working with multiple clusters and you need to clearly understand which cluster you are working with.
For example, when you run multiple clusters in a federated architecture, the cluster name (e.g., us-west, us-east) identifies the cluster for the current CLI session. Note: To identify the cluster name on the command line interface, specify a Ceph configuration file with the cluster name (e.g., ceph.conf, us-west.conf, us-east.conf, etc.). Also see CLI usage (ceph --cluster {cluster-name}).
4.3 Determine the monitor names (`hostname -s`):
Monitor Name: Each monitor instance within a cluster has a unique name. In common practice, the Ceph Monitor name is the host name (we recommend one Ceph Monitor per host, and no commingling of Ceph OSD Daemons with Ceph Monitors). You may retrieve the short hostname with hostname -s.
In this example the monitor names are mon1, mon2, mon3
with corresponding IP addresses 172.17.0.2, 172.17.0.3, 172.17.0.4
The corresponding configuration parameters:
mon initial members = mon1, mon2, mon3
mon host = 172.17.0.2, 172.17.0.3, 172.17.0.4
4.4 Create the monitor and administrator keyrings (run on mon1)
Monitor Keyring: Monitors communicate with each other via a secret key. You must generate a keyring with a monitor secret and provide it when bootstrapping the initial monitor(s).
Create a keyring for your cluster and generate a monitor secret key.
The commands are as follows:
[ceph@mon1 ~]$ sudo ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
creating /tmp/ceph.mon.keyring
Administrator Keyring: To use the ceph CLI tools, you must have a client.admin user. So you must generate the admin user and keyring, and you must also add the client.admin user to the monitor keyring.
Generate an administrator keyring, generate a client.admin user and add the user to the keyring.
The commands are as follows:
[ceph@mon1 ~]$ sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow'
creating /etc/ceph/ceph.client.admin.keyring
4.5 Merge the keyrings (run on mon1)
Add the client.admin key to the ceph.mon.keyring.
[ceph@mon1 ~]$ sudo ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
importing contents of /etc/ceph/ceph.client.admin.keyring into /tmp/ceph.mon.keyring
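After the import, /tmp/ceph.mon.keyring should contain both the mon. and client.admin entries. As an optional sanity check (not part of the original procedure), the keyring contents can be listed with ceph-authtool:
# Optional check: list the entries now present in the combined monitor keyring.
sudo ceph-authtool -l /tmp/ceph.mon.keyring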
4.6 Generate a temporary monitor map (run on mon1)
Monitor Map: Bootstrapping the initial monitor(s) requires you to generate a monitor map. The monitor map requires the fsid, the cluster name (or uses the default), and at least one host name and its IP address.
Generate a monitor map using the hostname(s), host IP address(es) and the FSID. Save it as /tmp/monmap:
[ceph@mon1 ~]$ sudo monmaptool --create --clobber --add mon1 172.17.0.2 --add mon2 172.17.0.3 --add mon3 172.17.0.4 --fsid f649b128-963c-4802-ae17-5a76f36c4c76 /tmp/monmap
monmaptool: monmap file /tmp/monmap
monmaptool: set fsid to f649b128-963c-4802-ae17-5a76f36c4c76
monmaptool: writing epoch 0 to /tmp/monmap (3 monitors)
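Optionally, the generated map can be printed to confirm it contains the expected fsid and all three monitors (an extra check, not part of the original procedure):
# Optional check: dump the monmap that was just written to /tmp/monmap.
sudo monmaptool --print /tmp/monmap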
4.7 Create the monitor data directories, which store the cluster maps (run on mon1)
Create a default data directory (or directories) on the monitor host(s).
sudo mkdir /var/lib/ceph/mon/{cluster-name}-{hostname}
For example:
Create a separate data directory for each monitor; here they are all created on mon1 and later copied to the other monitor nodes.
[ceph@mon1 ~]$ sudo mkdir -p /var/lib/ceph/mon/ceph-mon1
[ceph@mon1 ~]$ sudo mkdir -p /var/lib/ceph/mon/ceph-mon2
[ceph@mon1 ~]$ sudo mkdir -p /var/lib/ceph/mon/ceph-mon3
4.8 Populate the data directories with the monitor map and the keyring created above (run on mon1)
Populate the monitor daemon(s) with the monitor map and keyring.
ceph-mon --mkfs -i {hostname} --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
For example:
[ceph@mon1 ~]$ sudo ceph-mon --mkfs -i mon1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
ceph-mon: set fsid to f649b128-963c-4802-ae17-5a76f36c4c76
ceph-mon: created monfs at /var/lib/ceph/mon/ceph-mon1 for mon.mon1
[ceph@mon1 ~]$ sudo ceph-mon --mkfs -i mon2 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
ceph-mon: set fsid to f649b128-963c-4802-ae17-5a76f36c4c76
ceph-mon: created monfs at /var/lib/ceph/mon/ceph-mon2 for mon.mon2
[ceph@mon1 ~]$ sudo ceph-mon --mkfs -i mon3 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
ceph-mon: set fsid to f649b128-963c-4802-ae17-5a76f36c4c76
ceph-mon: created monfs at /var/lib/ceph/mon/ceph-mon3 for mon.mon3
On CentOS, a sysvinit marker file must also be created in each data directory; otherwise the daemon cannot be managed through /etc/init.d/ceph (the service mechanism).
Of course, if you do not manage the daemons with /etc/init.d/ceph, you can also run ceph-mon directly.
Touch the done file.
Mark that the monitor is created and ready to be started:
[ceph@mon1 ~]$ sudo touch /var/lib/ceph/mon/ceph-mon1/done
[ceph@mon1 ~]$ sudo touch /var/lib/ceph/mon/ceph-mon2/done
[ceph@mon1 ~]$ sudo touch /var/lib/ceph/mon/ceph-mon3/done
[ceph@mon1 ~]$ sudo touch /var/lib/ceph/mon/ceph-mon1/sysvinit
[ceph@mon1 ~]$ sudo touch /var/lib/ceph/mon/ceph-mon2/sysvinit
[ceph@mon1 ~]$ sudo touch /var/lib/ceph/mon/ceph-mon3/sysvinit
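At this point each monitor data directory has been populated by ceph-mon --mkfs and marked with the done and sysvinit files. A quick optional listing (the exact store contents vary by Ceph version):
# Optional check: the data directory should contain at least a keyring, the monitor
# store created by --mkfs, and the done / sysvinit marker files touched above.
sudo ls /var/lib/ceph/mon/ceph-mon1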
4.9 Create the cluster configuration file ceph.conf (run on mon1)
Consider settings for a Ceph configuration file. Common settings include the following:
[global]
fsid = {cluster-id}
mon initial members = {hostname}[, {hostname}]
mon host = {ip-address}[, {ip-address}]
public network = {network}[, {network}]
cluster network = {network}[, {network}]
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = {n}
filestore xattr use omap = true
osd pool default size = {n} # Write an object n times.
osd pool default min size = {n} # Allow writing n copy in a degraded state.
osd pool default pg num = {n}
osd pool default pgp num = {n}
osd crush chooseleaf type = {n}
In the foregoing example, the [global] section of the configuration might look like this:
[ceph@mon1 ceph]$ sudo vi /etc/ceph/ceph.conf
[global]
fsid = f649b128-963c-4802-ae17-5a76f36c4c76
mon initial members = mon1, mon2, mon3
mon host = 172.17.0.2, 172.17.0.3, 172.17.0.4
public network = 172.17.0.0/16
cluster network = 172.18.0.0/16, 172.19.0.0/16
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 1024
filestore xattr use omap = true
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 333
osd pool default pgp num = 333
osd crush chooseleaf type = 1
ms_bind_ipv6 = false
Note that we use three networks here: one (the public network) for client traffic, and the other two for internal cluster traffic (data replication, rebalancing, and so on).
Network configuration reference:
http://docs.ceph.com/docs/master/rados/configuration/network-config-ref/
Ceph cluster configuration reference:
http://ceph.com/docs/master/rados/configuration/ceph-conf/
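The pg num values above follow the example in the Ceph documentation. A commonly cited rule of thumb (not from this article) is roughly (number of OSDs x 100) / replica count, rounded up to the next power of two; a small sketch with a hypothetical OSD count:
# Rule-of-thumb sketch only: suggested total PGs ~= (OSDs * 100) / replicas,
# rounded up to the next power of two. The OSD count below is hypothetical.
OSDS=8
REPLICAS=2          # matches osd pool default size in the example above
TARGET=$(( OSDS * 100 / REPLICAS ))
PG=1
while [ "$PG" -lt "$TARGET" ]; do PG=$(( PG * 2 )); done
echo "suggested pg num: $PG"    # prints 512 for this hypothetical 8-OSD example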
4.10 Copy the configuration file and the monitor data directories to the other monitor nodes (run on mon1)
[root@mon1 ~]# scp /etc/ceph/ceph.conf 172.17.0.3:/etc/ceph/
[root@mon1 ~]# scp /etc/ceph/ceph.conf 172.17.0.4:/etc/ceph/
[root@mon1 ~]# scp -r /var/lib/ceph/mon/ceph-mon2 172.17.0.3:/var/lib/ceph/mon/
[root@mon1 ~]# scp -r /var/lib/ceph/mon/ceph-mon3 172.17.0.4:/var/lib/ceph/mon/
4.11 Start the monitor daemons (run on mon1, mon2, and mon3)
Start the monitor(s).
[ceph@mon1 ~]$ sudo /etc/init.d/ceph start mon.mon1
[ceph@mon2 ~]$ sudo /etc/init.d/ceph start mon.mon2
[ceph@mon3 ~]$ sudo /etc/init.d/ceph start mon.mon3
The monitors can also be started directly with ceph-mon, for example:
/usr/bin/ceph-mon -i mon1 --pid-file /var/run/ceph/mon.mon1.pid -c /etc/ceph/ceph.conf --cluster ceph
Specifying the data directory explicitly is also supported (see the example after the help output below):
[root@mon1 ~]# ceph-mon --help
usage: ceph-mon -i monid [flags]
--debug_mon n
debug monitor level (e.g. 10)
--mkfs
build fresh monitor fs
--force-sync
force a sync from another mon by wiping local data (BE CAREFUL)
--yes-i-really-mean-it
mandatory safeguard for --force-sync
--compact
compact the monitor store
--osdmap <filename>
only used when --mkfs is provided: load the osdmap from <filename>
--inject-monmap <filename>
write the <filename> monmap to the local monitor store and exit
--extract-monmap <filename>
extract the monmap from the local monitor store and exit
--mon-data <directory>
where the mon store and keyring are located
--conf/-c FILE read configuration from the given configuration file
--id/-i ID set ID portion of my name
--name/-n TYPE.ID set name
--cluster NAME set cluster name (default: ceph)
--version show version and quit
-d run in foreground, log to stderr.
-f run in foreground, log to usual location.
--debug_ms N set message debug level (e.g. 1)
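As the help output shows, --mon-data lets you point the daemon at an explicit store location. A hedged example of the direct-start command from above with --mon-data added (the path shown is simply this example's default location):
# Example only: direct start of mon1 with an explicit data directory via --mon-data.
sudo /usr/bin/ceph-mon -i mon1 --mon-data /var/lib/ceph/mon/ceph-mon1 --pid-file /var/run/ceph/mon.mon1.pid -c /etc/ceph/ceph.conf --cluster ceph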
4.12 Verify that the cluster is up
Verify that Ceph created the default pools.
[ceph@mon1 ~]$ sudo ceph osd lspools
0 rbd,
[ceph@mon2 ~]$ sudo ceph osd lspools
0 rbd,
[ceph@mon3 ~]$ sudo ceph osd lspools
0 rbd,
Check the cluster status. All three monitor nodes should return output like the following; the state is currently HEALTH_ERR because no OSDs have been added yet:
Verify that the monitor is running.
[ceph@mon3 ~]$ sudo ceph -s
cluster f649b128-963c-4802-ae17-5a76f36c4c76
health HEALTH_ERR 64 pgs stuck inactive; 64 pgs stuck unclean; no osds
monmap e1: 3 mons at {mon1=172.17.0.2:6789/0,mon2=172.17.0.3:6789/0,mon3=172.17.0.4:6789/0}, election epoch 6, quorum 0,1,2 mon1,mon2,mon3
osdmap e1: 0 osds: 0 up, 0 in
pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
0 kB used, 0 kB / 0 kB avail
64 creating
You should see output showing that the monitors you started are up and running, together with a health error indicating that placement groups are stuck inactive, as in the output above.
Note: Once you add OSDs and start them, the placement group health errors should disappear.
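In addition to ceph -s, the monitor quorum can be inspected directly; all three monitors should appear in the quorum (an optional check, not part of the original procedure):
# Optional check: show the current monitor quorum and election epoch.
sudo ceph quorum_status --format json-pretty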
This completes the monitor configuration and startup.
[References]
1. http://blog.163.com/digoal@126/blog/static/163877040201410269169450/
2. http://blog.163.com/digoal@126/blog/static/1638770402014102732027457/
3. http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/
4. Network configuration
PUBLIC NETWORK
The public network configuration allows you to specifically define IP addresses and subnets for the public network. You may assign static IP addresses or override public network settings using the public addr setting for a specific daemon.
public network
Description: The IP address and netmask of the public (front-side) network (e.g., 192.168.0.0/24). Set in [global]. You may specify comma-delimited subnets.
Type: {ip-address}/{netmask} [, {ip-address}/{netmask}]
Required: No
Default: N/A
public addr
Description: The IP address for the public (front-side) network. Set for each daemon.
Type: IP Address
Required: No
Default: N/A
CLUSTER NETWORK
The cluster network configuration allows you to declare a cluster network, and specifically define IP addresses and subnets for the cluster network. You may specifically assign static IP addresses or override cluster network settings using the cluster addr setting for specific OSD daemons.
cluster network
Description: The IP address and netmask of the cluster (back-side) network (e.g., 10.0.0.0/24). Set in [global]. You may specify comma-delimited subnets.
Type: {ip-address}/{netmask} [, {ip-address}/{netmask}]
Required: No
Default: N/A
cluster addr
Description: The IP address for the cluster (back-side) network. Set for each daemon.
Type: Address
Required: No
Default: N/A