@zhangsiming65965 2019-01-20T08:06:27.000000Z

GlusterFS Distributed Storage System

Cloud Computing

---Author：张思明 ZhangSiming

---Mail：1151004164@cnu.edu.cn

---QQ：1030728296

If you have a dream, pursue it wholeheartedly;
because only hard work can change your fate.

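1. Introduction to Distributed Storage Systems

1.1 Introduction to Distributed Storage

Computers manage and store data through file systems. In an era of information explosion, the amount of data people can access grows exponentially, and expanding a single machine's file system by simply adding more disks can no longer keep up with demand. This is the gap that distributed storage systems such as GlusterFS are designed to fill.

Benefits of distributed storage:

A distributed file system addresses the storage and management problem by extending a file system fixed at one location to any number of locations and file systems, with many nodes forming a file system network. The nodes can sit in different places and communicate and transfer data over the network. Users do not need to care which node stores the data or which node it is read from; they manage and store data just as they would on a local file system.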

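1.2 Introduction to NFS (Network File System)

NFS lets computers on a network share resources over TCP/IP; a client only needs to mount the NFS export to access it.

Advantages:

Saves disk space
Saves hardware resources
User home directories can be centralized

Disadvantages:

Limited storage space that is hard to expand
Single point of failure
Handles heavy concurrent I/O poorly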

1.3 Introduction to GlusterFS


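GlusterFS aggregates physically distributed storage resources over TCP/IP or InfiniBand RDMA networks and manages the data under a single global namespace.
GlusterFS scales out well: by adding nodes it can support several PB of storage (1 PB = 1024 TB) and serve thousands of clients.
GlusterFS frees users from standalone, expensive, closed storage systems. Ordinary low-cost storage hardware can be used to deploy a centrally managed, horizontally scalable, virtualized storage pool whose capacity grows to the TB/PB range.
Officially published figure: with 64 storage nodes, Gluster can reach a throughput of 32 GB/s.

Advantages:

High performance: PB-level capacity, GB/s-level throughput, clusters of hundreds of nodes
Capacity is simple to expand, with centralized management
No single point of failure

Disadvantages:

Storage efficiency and access performance are poor for small files, especially very large numbers of small files
Network bandwidth can become the bottleneck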

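1.4 GlusterFS Enterprise Use Cases

Unstructured data storage: data in MySQL is structured, whereas images and file fragments are unstructured
Archiving and disaster recovery
Virtual machine storage
Cloud storage
Content cloud
Big data
Structured and semi-structured data

Enterprise scenario:
Officially published figure: with 64 storage nodes, GlusterFS can reach a throughput of 32 GB/s.
Take Dell servers with 8 disks each: 2 disks form a RAID 1 and 6 disks form a RAID 5. With 8 such nodes in one Gluster cluster, throughput is roughly 32/64 * 8 - 0.8 = 3.2 GB/s (real-world numbers come in somewhat below the official figure). GlusterFS is bound by network bandwidth, so each node needs 4 dedicated gigabit NICs and the switch must be a 10-gigabit switch.
Typical business bandwidth in a company environment:
30-50 servers: 30 MB/s upstream/downstream
70-200 servers: 50 MB/s upstream/downstream
Video service companies: 100 MB/s upstream/downstream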
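
The rule-of-thumb throughput estimate for the 8-node cluster can be reproduced with a small helper. This is only a sketch: the function name `estimate_throughput` is hypothetical, and both the linear scaling from the 64-node figure and the fixed 0.8 GB/s deduction are this article's approximation rather than a measured model.

```shell
#!/bin/bash
# Rough throughput estimate: scale the published figure linearly by node
# count, then subtract a fixed real-world loss (an assumption, not a measurement).
estimate_throughput() {
    local nodes="$1"        # number of storage nodes in the cluster
    local ref_gbps=32       # published figure: 32 GB/s ...
    local ref_nodes=64      # ... at 64 storage nodes
    local loss=0.8          # assumed loss versus the published figure, in GB/s
    awk -v n="$nodes" -v r="$ref_gbps" -v rn="$ref_nodes" -v l="$loss" \
        'BEGIN { printf "%.1f\n", r / rn * n - l }'
}

estimate_throughput 8    # prints 3.2
```

Keeping the reference figure and the loss as named variables makes it easy to substitute numbers measured on real hardware.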

2. Deploying and Installing GlusterFS

2.1 Lab Environment

Description	IP	Hostname	Requirements
GlusterFS node1	192.168.17.225	GlusterFS1	Two additional 20 GB disks
GlusterFS node2	192.168.17.131	GlusterFS2	Two additional 20 GB disks
GlusterFS node3	192.168.17.226	GlusterFS3	Two additional 20 GB disks
GlusterFS node4	192.168.17.227	GlusterFS4	Two additional 20 GB disks
GlusterFS node5	192.168.17.228	GlusterFS5	Two additional 20 GB disks

[root@GlusterFS1 /]# cat /etc/redhat-release
CentOS release 6.5 (Final)      #CentOS 6 is strongly recommended for GlusterFS servers
[root@GlusterFS1 /]# uname -r
2.6.32-431.el6.x86_64
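[root@ZhangSiming ~]# sed -i 's#SELINUX=enforcing#SELINUX=disabled#' /etc/selinux/config      #Disable SELinux (effective after reboot). /etc/sysconfig/selinux is a symlink to this file, and sed -i on the symlink would replace the link instead of editing the config
[root@GlusterFS1 /]# service iptables stop
[root@GlusterFS1 /]# chkconfig iptables off
#Stop the firewall and disable it at boot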
[root@ZhangSiming ~]# fdisk -l | egrep -w "/dev/sdb|/dev/sdc"
Disk /dev/sdb: 21.5 GB, 21474836480 bytes, 41943040 sectors
Disk /dev/sdc: 21.5 GB, 21474836480 bytes, 41943040 sectors

2.2 Deploying the GlusterFS Packages to All Nodes with a Script and Ansible

#Distribution script
#!/bin/bash
ssh-keygen -f ~/.ssh/id_rsa -t rsa -P "" &>/dev/null < /yes.txt
tar zcf glu.tar.gz glu
for i in 192.168.17.22{5..7} 192.168.17.131
do
    sshpass -p "666666" ssh-copy-id -i ~/.ssh/id_rsa.pub "-o StrictHostKeyChecking=no" root@$i   &>/dev/null
    scp /glu.tar.gz root@$i:/ 
    scp /etc/hosts root@$i:/etc/hosts
done
echo "transfer successful"
#Run the script
[root@GlusterFS5 /]# sh Glu.sh 
glu.tar.gz                                    100% 6751KB  27.7MB/s   00:00    
hosts                                         100%  309   102.9KB/s   00:00    
glu.tar.gz                                    100% 6751KB   9.7MB/s   00:00    
hosts                                         100%  309    96.5KB/s   00:00    
glu.tar.gz                                    100% 6751KB   8.9MB/s   00:00    
hosts                                         100%  309   100.9KB/s   00:00    
glu.tar.gz                                    100% 6751KB  13.0MB/s   00:00    
hosts                                         100%  309    52.4KB/s   00:00    
glu.tar.gz                                    100% 6751KB  16.0MB/s   00:00    
hosts                                         100%  309   181.4KB/s   00:00    
transfer successful
#GlusterFS installation script, run through Ansible
#!/bin/bash
mount /dev/sr0 /media/cdrom
cd /
tar xf /glu.tar.gz
cd /glu 
yum -y install createrepo &>/dev/null 
createrepo -v .
cat > /etc/yum.repos.d/glu.repo << FOF
[glu]    
name=glu
baseurl=file:///glu
gpgcheck=0   
enabled=1
FOF
rm -rf /etc/yum.repos.d/CentOS-Media.repo
yum -y install glusterfs-server glusterfs-cli glusterfs-geo-replication &>/dev/null
#Ansible command to run the script on all nodes
ansible nginx -m script -a 'Gluinstall.sh'
#Installation result
[root@GlusterFS3 glu]# which glusterfs
/usr/sbin/glusterfs      #Installed successfully
[root@GlusterFS3 glu]# glusterfs -V
glusterfs 3.7.20 built on Jan 30 2017 15:39:27
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.

2.3 Starting the glusterd Service on All Nodes and Building the Trusted Pool

[root@GlusterFS1 ~]# /etc/init.d/glusterd start
Starting glusterd:                                         [  OK  ]
[root@GlusterFS1 ~]# chkconfig glusterd on      #Start the glusterd service and enable it at boot
[root@GlusterFS1 ~]# gluster peer probe GlusterFS2
peer probe: success. 
[root@GlusterFS1 ~]# gluster peer probe GlusterFS3
peer probe: success. 
[root@GlusterFS1 ~]# gluster peer probe GlusterFS4
peer probe: success. 
[root@GlusterFS1 ~]# gluster peer status
Number of Peers: 3      #The four Gluster nodes are now in the trusted pool (probing from one node is enough)
Hostname: GlusterFS2 
Uuid: 3618a2a3-2ef4-48c9-a7a5-32a11db7d58d
State: Peer in Cluster (Connected)
Hostname: GlusterFS3
Uuid: 07f31c0f-3c17-4928-a1fe-b008593fe327
State: Peer in Cluster (Connected)
Hostname: GlusterFS4
Uuid: e3b38653-2ede-43d6-8e63-4260801b359c
State: Peer in Cluster (Connected)
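
As the pool grows, scanning `gluster peer status` by eye gets tedious. The following is a minimal sketch of a filter that prints only the peers whose state is not `(Connected)`. The function name `disconnected_peers` and the sample input are illustrative, and the parsing assumes the output layout shown above.

```shell
#!/bin/bash
# Read `gluster peer status` output on stdin and print the hostname of every
# peer whose State line does not contain "(Connected)".
disconnected_peers() {
    awk '/^Hostname:/ { host = $2 }
         /^State:/ && !/\(Connected\)/ { print host }'
}

# Hypothetical sample input in the same layout as above, with one peer down.
# On a real node this would be: gluster peer status | disconnected_peers
disconnected_peers <<'EOF'
Number of Peers: 2
Hostname: GlusterFS2
Uuid: 3618a2a3-2ef4-48c9-a7a5-32a11db7d58d
State: Peer in Cluster (Disconnected)
Hostname: GlusterFS3
Uuid: 07f31c0f-3c17-4928-a1fe-b008593fe327
State: Peer in Cluster (Connected)
EOF
```

With the sample input, only `GlusterFS2` is printed; an empty result means every peer is connected.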

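2.4 Formatting the Disks and Mounting Them

Differences between xfs and ext4:

The default file system is xfs on CentOS 7 and ext4 on CentOS 6. Overall, xfs has the edge over ext4 for large numbers of files, very large file systems, and space utilization.

#Format sdb and sdc and mount them on all nodes through Ansible
#Script run by Ansible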
#!/bin/bash
mkfs.ext4 /dev/sdb < /yes.txt
mkfs.ext4 /dev/sdc < /yes.txt
mkdir -p /gluster/brick1
mkdir -p /gluster/brick2
mount /dev/sdb /gluster/brick1
mount /dev/sdc /gluster/brick2
#Ansible command
ansible nginx -m script -a '/fs.sh'
#Mount result
[root@GlusterFS3 glu]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   18G  997M   16G   6% /
tmpfs                         491M     0  491M   0% /dev/shm
/dev/sda1                     485M   33M  427M   8% /boot
/dev/sr0                      4.2G  4.2G     0 100% /media/cdrom
/dev/sdb                       20G  172M   19G   1% /gluster/brick1
/dev/sdc                       20G  172M   19G   1% /gluster/brick2
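
The `mount` commands in the script do not persist across a reboot. One common way to make them permanent is an `/etc/fstab` entry per brick. The sketch below only prints candidate lines so they can be reviewed before being appended; the helper name `fstab_line` and the `defaults 0 0` fields are assumptions, not part of the original setup.

```shell
#!/bin/bash
# Print an /etc/fstab line for a brick device formatted as ext4.
# Nothing is written to disk; append the output to /etc/fstab after review.
fstab_line() {
    local device="$1"       # block device, e.g. /dev/sdb
    local mountpoint="$2"   # brick directory, e.g. /gluster/brick1
    printf '%s %s ext4 defaults 0 0\n' "$device" "$mountpoint"
}

fstab_line /dev/sdb /gluster/brick1
fstab_line /dev/sdc /gluster/brick2
```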

At this point the GlusterFS packages and nodes are basically deployed.

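3. Using GlusterFS

3.1 Common GlusterFS Volume Types

Basic volume types:

Distributed volume (Distributed), also called a hash volume:

Whole files are placed across the bricks of the volume by a consistent hashing algorithm. Because different files land on different bricks, aggregate read/write throughput is comparable to RAID 0, but there is no redundancy.

Replicated volume (Replicated):

Similar to RAID 1. The replica count must equal the number of bricks (storage servers) in the volume. Availability is high: the bricks mirror one another, so losing one disk in the storage pool does not affect access to the data.

Striped volume (Striped):

Similar to RAID 0. The stripe count must equal the number of bricks (storage servers) in the volume. Each file is split into data blocks that are stored across the bricks in round-robin fashion; the unit of concurrency is the data block, which suits large files.

The difference between a distributed volume and a striped volume is whether large files are split.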
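
How a distributed volume can place whole files on bricks purely from their names can be sketched with a toy hash. This is an illustration only: GlusterFS's real elastic hashing assigns hash ranges to bricks per directory, whereas the hypothetical `pick_brick` helper below simply takes a checksum modulo the brick count.

```shell
#!/bin/bash
# Toy placement: checksum the file name and take it modulo the brick count.
# This is NOT GlusterFS's algorithm; it only shows that placement is a pure
# function of the name, so any client can locate a file without a metadata server.
pick_brick() {
    local name="$1"     # file name to place
    local bricks="$2"   # number of bricks in the volume
    local sum
    sum=$(printf '%s' "$name" | cksum | cut -d' ' -f1)   # CRC of the name
    echo $(( sum % bricks ))
}

for f in 1 2 3 4 5; do
    echo "file $f -> brick $(pick_brick "$f" 2)"
done
```

Each file is stored whole on exactly one brick, which is the contrast with a striped volume, where a single file's blocks are spread over several bricks.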

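Composite volume types (the ones usually used in production):

Distributed replicated volume (Distributed Replicated):

Files are distributed across replicated sets of bricks, similar to RAID 1+0.

Distributed striped volume (Distributed Striped):

The number of bricks (storage servers) in the volume must be a multiple of the stripe count (at least 2x), combining distribution and striping. The smallest layout spans four storage servers, so at least 4 servers are needed to create one. It is typically used for large-file access.

Striped replicated volume (Striped Replicated):

Data is striped across bricks, and the stripes are then replicated across the cluster.

Distributed striped replicated volume (Distributed Striped Replicated):

Data is distributed and striped, and the bricks are then replicated across the cluster; this is the safest type.

In summary:

Distributed ---> speed
Replicated ---> safety
Striped ---> large files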

3.2 Creating a Distributed Volume

[root@GlusterFS1 ~]# gluster volume create gs1 GlusterFS1:/gluster/brick1 GlusterFS2:/gluster/brick1 force      #Build a distributed volume from sdb (brick1) on GlusterFS1 and GlusterFS2
volume create: gs1: success: please start the volume to access data
[root@GlusterFS1 ~]# gluster volume start gs1      #Start the distributed volume
volume start: gs1: success
[root@GlusterFS1 ~]# gluster volume info      #Show volume information
Volume Name: gs1     #Volume name
Type: Distribute     #Distributed volume
Volume ID: 7f504698-9575-4485-85d5-e919cc70bf83
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:     #Brick information
Brick1: GlusterFS1:/gluster/brick1
Brick2: GlusterFS2:/gluster/brick1
Options Reconfigured:
performance.readdir-ahead: on

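3.3 Two Ways to Mount a Volume

3.3.1 Mounting with the GlusterFS Native Client

Requirement: the GlusterFS client tools must be installed on the server

[root@GlusterFS1 ~]# mkdir /zhangsiming
[root@GlusterFS1 ~]# mount -t glusterfs 127.0.0.1:/gs1 /zhangsiming      #mount -t selects the glusterfs mount type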
[root@GlusterFS1 ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   18G  997M   16G   6% /
tmpfs                         491M     0  491M   0% /dev/shm
/dev/sda1                     485M   33M  427M   8% /boot
/dev/sr0                      4.2G  4.2G     0 100% /media/cdrom
/dev/sdb                       20G  173M   19G   1% /gluster/brick1
/dev/sdc                       20G  172M   19G   1% /gluster/brick2
127.0.0.1:/gs1                 40G  344M   38G   1% /zhangsiming
[root@GlusterFS1 ~]# touch /zhangsiming/{1..5}
[root@GlusterFS1 ~]# ls /zhangsiming/
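1  2  3  4  5  lost+found      #All files are visible at the mount point
#Check the brick on GlusterFS1
[root@GlusterFS1 ~]# ls /gluster/brick1/
1  5  lost+found
#Check the brick on GlusterFS2
[root@GlusterFS2 glu]# ls /gluster/brick1
2  3  4  lost+found
#As shown, a distributed volume spreads its files across all member nodes
#Spreading files over several bricks speeds up reads and writes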

3.3.2 Mounting over NFS

Servers do not always have the Gluster tools installed, so a volume can also be mounted over NFS

[root@GlusterFS1 ~]# gluster volume status
Status of volume: gs1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick GlusterFS1:/gluster/brick1            49152     0          Y       2247 
Brick GlusterFS2:/gluster/brick1            49152     0          Y       2157 
NFS Server on localhost                     N/A       N/A        N       N/A  
NFS Server on GlusterFS4                    N/A       N/A        N       N/A  
NFS Server on GlusterFS2                    N/A       N/A        N       N/A  
NFS Server on GlusterFS3                    N/A       N/A        N       N/A  
#N/A means no reachable NFS server was detected
Task Status of Volume gs1
------------------------------------------------------------------------------
There are no active volume tasks
[root@GlusterFS1 ~]# rpm -qa rpcbind
rpcbind-0.2.0-13.el6_9.1.x86_64
[root@GlusterFS1 ~]# rpm -qa nfs-utils
nfs-utils-1.2.3-75.el6_9.x86_64
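#Check whether the rpcbind and nfs-utils packages are installed
#Start only the rpcbind service on every node. Do not start the nfs service: the kernel NFS server conflicts with Gluster's built-in NFS server, and the volume's NFS export will fail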
[root@GlusterFS3 glu]# /etc/init.d/rpcbind start
Starting rpcbind:                                          [  OK  ]
[root@GlusterFS3 glu]# /etc/init.d/glusterd restart
Stopping glusterd:                                         [  OK  ]
Starting glusterd:                                         [  OK  ]
[root@GlusterFS1 ~]# gluster volume status
Status of volume: gs1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick GlusterFS1:/gluster/brick1            49152     0          Y       3136 
Brick GlusterFS2:/gluster/brick1            49152     0          Y       2157 
NFS Server on localhost                     2049      0          Y       4379 
NFS Server on GlusterFS3                    2049      0          Y       2610 
NFS Server on GlusterFS2                    2049      0          Y       3055 
NFS Server on GlusterFS4                    2049      0          Y       2713 
Task Status of Volume gs1
------------------------------------------------------------------------------
There are no active volume tasks
#The NFS server is now detected on every node

Mounting the distributed volume over NFS

[root@GlusterFS5 ~]# rpm -qa nfs-utils
nfs-utils-1.2.3-39.el6.x86_64
[root@GlusterFS5 ~]# rpm -qa rpcbind
rpcbind-0.2.0-11.el6.x86_64
[root@GlusterFS5 ~]# /etc/init.d/rpcbind start
Starting rpcbind:                                          [  OK  ]
[root@GlusterFS5 ~]# /etc/init.d/nfs start
Starting NFS services:                                     [  OK  ]
Starting NFS mountd:                                       [  OK  ]
Starting NFS daemon:                                       [  OK  ]
Starting RPC idmapd:                                       [  OK  ]      #On the server without Gluster installed, start the rpcbind and nfs services
[root@GlusterFS5 ~]# showmount -e 192.168.17.225      #Check whether this IP exports anything over NFS
Export list for 192.168.17.225:
/gs1 *
[root@GlusterFS5 ~]# mkdir /zhangsiming
[root@GlusterFS5 ~]# mount -t nfs 192.168.17.225:/gs1 /zhangsiming      #Mount over NFS at /zhangsiming
[root@GlusterFS5 ~]# ls /zhangsiming
1  2  3  4  5  lost+found
[root@GlusterFS5 ~]# df -hT
Filesystem                   Type     Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root ext4      18G  972M   16G   6% /
tmpfs                        tmpfs    491M     0  491M   0% /dev/shm
/dev/sda1                    ext4     485M   33M  427M   8% /boot
/dev/sr0                     iso9660  4.2G  4.2G     0 100% /media/cdrom
192.168.17.225:/gs1          nfs       40G  344M   38G   1% /zhangsiming
#Mounted successfully

3.4 Creating a Replicated Volume

[root@GlusterFS1 ~]# gluster volume create gs2 replica 2 GlusterFS3:/gluster/brick1 GlusterFS4:/gluster/brick1 force
volume create: gs2: success: please start the volume to access data
[root@GlusterFS1 ~]# gluster volume info gs2
Volume Name: gs2
Type: Replicate      #Replicated volume
Volume ID: db4b349a-ef68-49cf-8fdc-8946dd2a32cc
Status: Created
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: GlusterFS3:/gluster/brick1
Brick2: GlusterFS4:/gluster/brick1
Options Reconfigured:
performance.readdir-ahead: on
#Mount test
[root@GlusterFS1 glusterfs]# gluster volume start gs2      #Start the gs2 replicated volume
volume start: gs2: success
[root@GlusterFS1 glusterfs]# mount -t glusterfs 127.0.0.1:gs2 /zhangsicong
[root@GlusterFS1 glusterfs]# cd /zhangsicong
[root@GlusterFS1 zhangsicong]# vim file
[root@GlusterFS1 zhangsicong]# cat file
this is a file
#brick1 on GlusterFS3
[root@GlusterFS3 glu]# cat /gluster/brick1/file
this is a file
#brick1 on GlusterFS4
[root@ZhangSiming ~]# cat /gluster/brick1/file
this is a file
#As shown, a replicated volume keeps a full copy of the file on every brick

3.5 Creating a Striped Volume

[root@GlusterFS1 ~]# gluster volume create gs3 stripe 2 GlusterFS3:/gluster/brick2 GlusterFS4:/gluster/brick2 force
volume create: gs3: success: please start the volume to access data
[root@GlusterFS1 ~]# gluster volume info gs3
Volume Name: gs3
Type: Stripe      #Striped volume
Volume ID: 7b98fac3-2c5a-4025-a669-e2cf6df41f8c
Status: Created
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: GlusterFS3:/gluster/brick2
Brick2: GlusterFS4:/gluster/brick2
Options Reconfigured:
performance.readdir-ahead: on
#Mount test
[root@GlusterFS1 zhangsicong]# gluster volume start gs3      #Start the striped volume
volume start: gs3: success
[root@GlusterFS1 zhangsicong]# mkdir /zhangside
[root@GlusterFS1 zhangsicong]# mount -t glusterfs 127.0.0.1:gs3 /zhangside      #Mount the striped volume
[root@GlusterFS1 zhangsicong]# cd /zhangside
[root@GlusterFS1 zhangside]# vim file
[root@GlusterFS1 zhangside]# cat file
this is a page
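#brick2 on GlusterFS3
[root@GlusterFS3 glu]# cat /gluster/brick2/file
this is a page
#brick2 on GlusterFS4
[root@ZhangSiming ~]# cat /gluster/brick2/file
[root@ZhangSiming ~]#
#As shown, a striped volume stores a file in blocks across bricks rather than as full copies. This file is smaller than one stripe block, so all of its data sits on the first brick. Striping suits large files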

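Summary:

Volume type	Characteristics
Distributed volume	Files are spread evenly across the member disks, similar to RAID 0. Writes are fast, but there is no redundancy: if a disk fails, the files on it are lost.
Replicated volume	The same data is written to every member disk, similar to RAID 1. Data is very safe and read performance is high, at the cost of disk capacity.
Striped volume	The data of each file is divided evenly across the member disks, which greatly improves concurrent read access to large files.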
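
The capacity trade-off between the three basic types can be made concrete with a small calculation. This is a sketch under simplifying assumptions: the helper name `usable_capacity_gb` is hypothetical, all bricks are the same size, and file system overhead is ignored.

```shell
#!/bin/bash
# Raw usable capacity in GB for the three basic volume types, ignoring
# file system overhead. For "replicate", the fourth argument is the replica count.
usable_capacity_gb() {
    local type="$1" bricks="$2" brick_gb="$3" replica="${4:-1}"
    case "$type" in
        distribute|stripe) echo $(( bricks * brick_gb )) ;;
        replicate)         echo $(( bricks * brick_gb / replica )) ;;
        *) echo "unknown volume type: $type" >&2; return 1 ;;
    esac
}

usable_capacity_gb distribute 2 20     # 40, in line with the 40G that df reports for gs1
usable_capacity_gb replicate 2 20 2    # 20
usable_capacity_gb stripe 2 20         # 40
```

A replica 2 volume therefore gives up half of the raw space in exchange for surviving the loss of one brick.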

4. Expanding the Bricks of a Volume

4.1 Expanding a Distributed Replicated Volume


[root@GlusterFS1 zhangside]# gluster volume add-brick gs2 replica 2 GlusterFS1:/gluster/brick2 GlusterFS2:/gluster/brick2 force      #Expand the replicated volume; it was created with replica 2, so bricks must be added in multiples of 2
volume add-brick: success
[root@GlusterFS1 zhangside]# gluster volume info gs2
Volume Name: gs2
Type: Distributed-Replicate
Volume ID: db4b349a-ef68-49cf-8fdc-8946dd2a32cc
Status: Started
Number of Bricks: 2 x 2 = 4      #Expanded
Transport-type: tcp
Bricks:
Brick1: GlusterFS3:/gluster/brick1
Brick2: GlusterFS4:/gluster/brick1
Brick3: GlusterFS1:/gluster/brick2
Brick4: GlusterFS2:/gluster/brick2
Options Reconfigured:
performance.readdir-ahead: on
[root@GlusterFS1 zhangside]# gluster volume rebalance gs2 start      #Rebalance the data after expanding; otherwise the expansion does not take effect for existing data
volume rebalance: gs2: success: Rebalance on gs2 has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: f7eea5b5-3517-40a2-8967-f95550491d25
#Create data to test
[root@GlusterFS1 zhangsicong]# ls /gluster/brick2
10  12  14  15  16  17  2  3  4  6  lost+found
[root@GlusterFS2 glu]# ls /gluster/brick2
10  12  14  15  16  17  2  3  4  6  lost+found
[root@GlusterFS3 glu]# ls /gluster/brick1
1  11  13  18  19  20  5  7  8  9  lost+found
[root@ZhangSiming ~]# ls /gluster/brick1
1  11  13  18  19  20  5  7  8  9  lost+found
#As shown, each group of replica bricks forms a replica pair, and files are distributed across the pairs
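
The rule that bricks must be added in multiples of the replica count can be checked before calling `gluster volume add-brick`. A minimal sketch follows; the helper name `check_add_brick` is hypothetical and it only counts arguments, without contacting the cluster.

```shell
#!/bin/bash
# Check that a list of bricks can be added to a volume with the given replica
# count: the number of bricks must be a non-zero multiple of the replica count.
check_add_brick() {
    local replica="$1"
    shift
    local count=$#
    if [ "$count" -eq 0 ] || [ $(( count % replica )) -ne 0 ]; then
        echo "need a multiple of $replica bricks, got $count" >&2
        return 1
    fi
    echo "ok: $count bricks form $(( count / replica )) new replica set(s)"
}

check_add_brick 2 GlusterFS1:/gluster/brick2 GlusterFS2:/gluster/brick2
```

The same check applies when removing bricks from a replicated volume, since whole replica sets have to be removed together.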

5. Shrinking and Deleting Volumes

5.1 Removing Bricks from a Volume

[root@GlusterFS1 zhangsicong]# gluster volume stop gs2      #Stop the volume first
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: gs2: success
[root@GlusterFS1 zhangsicong]# gluster volume remove-brick gs2 replica 2 GlusterFS1:/gluster/brick2 GlusterFS2:/gluster/brick2 force      #This is a replicated volume, so bricks must be removed in multiples of the replica count
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit force: success
[root@GlusterFS1 zhangsicong]# gluster volume info gs2
Volume Name: gs2
Type: Replicate
Volume ID: db4b349a-ef68-49cf-8fdc-8946dd2a32cc
Status: Stopped
Number of Bricks: 1 x 2 = 2      #Bricks removed
Transport-type: tcp
Bricks:
Brick1: GlusterFS3:/gluster/brick1
Brick2: GlusterFS4:/gluster/brick1
Options Reconfigured:
performance.readdir-ahead: on
#Afterwards, simply start the gs2 volume again

5.2 Deleting a Volume

[root@GlusterFS1 zhangsicong]# gluster volume stop gs1      #Stop the volume first
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: gs1: success
[root@GlusterFS1 zhangsicong]# gluster volume delete gs1      #Delete the volume
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: gs1: success
[root@GlusterFS1 zhangsicong]# gluster volume info | grep gs1      #No information about gs1 is left

Note: neither shrinking nor deleting a volume erases the data in it. The data remains on the corresponding disks

[root@GlusterFS1 zhangsicong]# ls /gluster/brick1/
1  5  lost+found
[root@GlusterFS1 zhangsicong]# ls /gluster/brick2/
10  12  14  15  16  17  2  3  4  6  lost+found
[root@GlusterFS1 zhangsicong]# touch aaa
touch: cannot touch 'aaa': Transport endpoint is not connected      #The mount point can no longer be reached

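6. Building Enterprise-Grade Distributed Storage

6.1 Hardware Requirements

Use Dell 2U servers with 4 TB SATA disks; if I/O requirements are high, switch to SSDs. To keep the system stable and performant, configure all GlusterFS nodes as identically as possible. With 8 disks per node (2 in a RAID 1 and 6 in a RAID 5) and 8 nodes in the Gluster cluster, throughput is roughly 32/64 * 8 - 0.8 = 3.2 GB/s (real-world numbers come in somewhat below the official figure). GlusterFS is bound by network bandwidth, so each node needs 4 dedicated gigabit NICs and the switch must be a 10-gigabit switch.

6.2 System Requirements and Partitioning

Use CentOS 6. When partitioning, make swap the same size as RAM (for servers with less than 16 GB of memory) and give the remaining space to GlusterFS on dedicated disk space. Do not install any unneeded tools.

6.3 Network Environment

Each Gluster server needs at least two NICs: one for the management IP and one for GlusterFS traffic, and they should be gigabit NICs. Besides the internal network switch, GlusterFS traffic is best served by 10-gigabit switches and 10-gigabit NICs. Where reliability requirements are high, multiple NICs can be bonded.

6.4 Server Placement

Primary and standby servers should be placed in different racks and connected to different switches, so that a problem with one rack does not interrupt the service.

6.5 Choosing a Volume Type

In enterprises, distributed replicated volumes are the usual choice, because the redundant copies make them relatively safe. Distributed striped volumes are not yet mature and are not considered for business scenarios that do not store large files.

6.6 GlusterFS File System Tuning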

[root@GlusterFS1 ~]# cat /etc/glusterfs/glusterd.vol 
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option ping-timeout 0
    option event-threads 1
#   option base-port 49152      #TCP port used for connections between GlusterFS nodes; if it conflicts, change it here or add firewall rules

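Tuning options supported by GlusterFS:

How to set an option: gluster volume set <volume> <option> <value>

Option	Description	Default
auth.allow	IP access authorization	allow all
cluster.min-free-disk	Free disk space threshold	10%
cluster.stripe-block-size	Stripe block size	128KB
network.frame-timeout	Request timeout	1800s
network.ping-timeout	Client wait time	42s
nfs.disable	Disable the NFS service	off
performance.io-thread-count	Number of I/O threads	16
performance.cache-refresh-timeout	Cache validation interval	1s
performance.cache-size	Read cache size	32MB
performance.quick-read	Speeds up reads of small files	off
performance.read-ahead	Improves read performance through read-ahead, which helps applications that access files frequently and sequentially: by the time the application finishes reading the current block, the next block is already available	off
performance.write-behind	Writes data to a cache first and then to disk, to improve write performance	off
performance.io-cache	Caches data that has already been read	off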

[root@GlusterFS1 ~]# gluster volume info gs3
Volume Name: gs3
Type: Stripe
Volume ID: 7b98fac3-2c5a-4025-a669-e2cf6df41f8c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: GlusterFS3:/gluster/brick2
Brick2: GlusterFS4:/gluster/brick2
Options Reconfigured:
performance.readdir-ahead: on
[root@GlusterFS1 ~]# gluster volume set gs3 performance.read-ahead on
volume set: success
[root@GlusterFS1 ~]# gluster volume set gs3 performance.cache-size 256MB      #Enable read-ahead and set the read cache to 256MB
volume set: success
[root@GlusterFS1 ~]# gluster volume info gs3
Volume Name: gs3
Type: Stripe
Volume ID: 7b98fac3-2c5a-4025-a669-e2cf6df41f8c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: GlusterFS3:/gluster/brick2
Brick2: GlusterFS4:/gluster/brick2
Options Reconfigured:
performance.cache-size: 256MB      
performance.read-ahead: on
performance.readdir-ahead: on
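
When several options need to be applied to a volume, the `gluster volume set` calls above can be generated from a list. The following is a dry-run sketch: the helper name `tune_volume` and its `option=value` argument format are assumptions made for illustration, and the function only prints the commands it would run.

```shell
#!/bin/bash
# Dry run: print one `gluster volume set` command per option=value pair.
# Remove the leading `echo` inside the loop to apply the settings for real.
tune_volume() {
    local vol="$1"
    shift
    local kv
    for kv in "$@"; do
        # ${kv%%=*} is the option name, ${kv#*=} is its value
        echo gluster volume set "$vol" "${kv%%=*}" "${kv#*=}"
    done
}

tune_volume gs3 performance.read-ahead=on performance.cache-size=256MB
```

Printing the commands first makes it possible to review them, or keep them under version control, before touching a production volume.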

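6.7 Monitoring and Routine Maintenance

Monitor with Zabbix's built-in templates. Items to monitor: CPU, memory, host availability, disk space, host uptime, system load...

GlusterFS also supports quotas to limit disk usage

The following commands apply to a replicated volume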

[root@GlusterFS1 gs2]# mkdir /bbb
[root@GlusterFS1 gs2]# mount -t glusterfs 127.0.0.1:gs2 /bbb      #Mount the gs2 volume
[root@GlusterFS1 gs2]# gluster volume quota gs2 enable
quota command failed : Quota is already enabled
[root@GlusterFS1 gs2]# gluster volume quota gs2 limit-usage   /   10GB      #Set the quota; / here is the root of the volume, i.e. the directory seen at the mount point
volume quota : success
[root@GlusterFS1 gs2]# gluster volume quota gs2 list      #List the quotas
                  Path                   Hard-limit  Soft-limit      Used  Available  Soft-limit exceeded? Hard-limit exceeded?
-------------------------------------------------------------------------------------------------------------------------------
/                                         10.0GB     80%(8.0GB)   0Bytes  10.0GB              No                   No
/dir1                                     10.0GB     80%(8.0GB)   0Bytes  10.0GB              No                   No
[root@GlusterFS1 gs2]# gluster volume quota gs2 remove /       #Remove the quota
volume quota : success
[root@GlusterFS1 gs2]# gluster volume quota gs2 disable      #Disable the GlusterFS quota feature
Disabling quota will delete all the quota configuration. Do you want to continue? (y/n) y
volume quota : success

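7. Handling Common Failures in Production

7.1 Disk Failure

Because the hosts use hardware RAID, a failed disk can simply be replaced and the data is resynchronized automatically

7.2 Failure of a GlusterFS Node

7.2.1 First prepare a machine configured identically to the failed one (IP, disks, and so on), then look up the UUID of the failed node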

[root@GlusterFS2 ~]# gluster peer status
Number of Peers: 3
Hostname: GlusterFS3
Uuid: 07f31c0f-3c17-4928-a1fe-b008593fe327
State: Peer in Cluster (Disconnected)      #Note down this UUID
Hostname: GlusterFS1
Uuid: c4ef65ed-9dec-4e85-8577-ffb3eb35f471
State: Peer in Cluster (Connected)
Hostname: GlusterFS4
Uuid: e3b38653-2ede-43d6-8e63-4260801b359c
State: Peer in Cluster (Connected)
#Set up the new host to match GlusterFS3
[root@GlusterFS5 glu]# hostname -I
192.168.17.226       #Same IP
[root@GlusterFS5 glu]# df -hT | tail -2
df: `/zhangsiming': Stale file handle
/dev/sdb                     ext4      20G  172M   19G   1% /gluster/brick1
/dev/sdc                     ext4      20G  172M   19G   1% /gluster/brick2
[root@GlusterFS5 glu]# which glusterfs
/usr/sbin/glusterfs      #GlusterFS tools are installed

7.2.2 Assign the UUID to the new host and run the repair command

[root@GlusterFS5 glu]# vim /var/lib/glusterd/glusterd.info 
[root@GlusterFS5 glu]# cat /var/lib/glusterd/glusterd.info
UUID=07f31c0f-3c17-4928-a1fe-b008593fe327
operating-version=30712
#The replacement host is detected automatically and reconnects
[root@GlusterFS2 ~]# gluster peer status
Number of Peers: 3
Hostname: GlusterFS3
Uuid: 07f31c0f-3c17-4928-a1fe-b008593fe327
State: Peer Rejected (Connected)
Hostname: GlusterFS1
Uuid: c4ef65ed-9dec-4e85-8577-ffb3eb35f471
State: Peer in Cluster (Connected)
Hostname: GlusterFS4
Uuid: e3b38653-2ede-43d6-8e63-4260801b359c
State: Peer in Cluster (Connected)
#If a brick in a volume was lost, also run the heal command on the new machine
gluster volume heal gs2 full
Launching heal operation to perform full self heal on volume gs2 has been successful 
Use heal info commands to check status
#Check the heal status
[root@glusterfs04 ~]# gluster volume heal gs2 info
Brick glusterfs03:/gluster/brick1
Status: Connected
Number of entries: 0
Brick glusterfs04:/gluster/brick1
Status: Connected
Number of entries: 0
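
To tell at a glance whether healing has finished, the per-brick counts in the heal info output can be summed. This is a minimal sketch, assuming the output layout shown above; the helper name `pending_heal_entries` is hypothetical.

```shell
#!/bin/bash
# Sum the "Number of entries" lines from `gluster volume heal <vol> info`
# output read on stdin. 0 means no brick reports pending heals.
pending_heal_entries() {
    awk -F': ' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}

# Sample input copied from the heal info output above.
# On a real node this would be: gluster volume heal gs2 info | pending_heal_entries
pending_heal_entries <<'EOF'
Brick glusterfs03:/gluster/brick1
Status: Connected
Number of entries: 0
Brick glusterfs04:/gluster/brick1
Status: Connected
Number of entries: 0
EOF
```

For the sample input the result is 0; a non-zero total means at least one brick still has entries waiting to be healed.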