MHA的主要作用:
在mysql的主从复制中,当master崩溃了,利用mha实现backup顶替崩溃的master自动切换为master继续工作,从而实现高可用。
下面介绍本次实验的环境:
MHA分为manager管理节点和node节点,一般来讲最少是三台服务器,两台node节点,一台manager节点,但本次环境限制,只能使用两台,所以把manager也装在一台node节点上。
两台服务器,两个网口:
IP:
10.2.16.253 10.0.0.1 node1
10.2.16.254 10.0.0.2 node2
准备工作:
1、首先根据环境修改两台服务器的/etc/hosts文件
在hosts文件最后加入两行:
10.0.0.2node2
10.0.0.1node1
2、关闭selinux\iptables\ip6tables
3、配置网卡:
eth0 为10.2.16.0网段 (用于外网通信)
eth1 为10.0.0.0 网段 (用于内网监测)
4、本例中的系统版本为:centos 6.4 x64 ,MHA版本为 0.54
安装MHA-manager:
mha依赖Perl模块,安装依赖非常的麻烦,首先需要确保拥有 centos的yum源和epel的yum源。(本例中使用的epel源为6.8版本)
注意:经验证有些包epel没有centos源有,有些相反,所以需要互补,通过两个源再结合cpan来安装依赖。
yum -y install perl* ncftp cpan (centos源安装)
perl-Log-Dispatch 是下载的rpm包安装,其他的缺少包找依赖下载吧。
shell> wget http://downloads.naulinux.ru/pub/NauLinux/6x/i386/sites/School/RPMS/perl-Log-Dispatch-2.27-1.el6.noarch.rpm
shell> wget http://dl.Fedoraproject.org/pub/epel/6/i386/perl-Parallel-ForkManager-0.7.9-1.el6.noarch.rpm
shell> wget http://dl.fedoraproject.org/pub/epel/6/i386/perl-Mail-Sender-0.8.16-3.el6.noarch.rpm
shell> wget http://dl.fedoraproject.org/pub/epel/6/i386/perl-Mail-Sendmail-0.79-12.el6.noarch.rpm
shell> wget http://mirror.centos.org/centos/6/os/x86_64/Packages/perl-Time-HiRes-1.9721-136.el6.x86_64.rpm
装依赖时候,我也废了很大劲才装上,所以,缺啥找啥装啥吧.....
大概是这些包:perl-Config-Tiny perl-Params-Validate perl-Log-Dispatch perl-Parallel-ForkManager perl-DBD-MySQL perl-MIME-Lite* perl-Mail-Send*
依赖安装完毕后,下面讲解本次实验所需的大致步骤:
1、首先需要配置两台主机的SSH密钥免密码验证登录
2、建立 mysql-master\mysql-slave之间的主从复制
3、配置 mha-manager文件
4、用mha-chech-ssh和mha-check-repl验证登录和mysql复制是否成功
5、启动mha-manager(确认以上都无问题后)
一、配置SSH免密钥登录:
[root@node1 ~]# ssh-keygen -t rsa 连按三个回车(确认密钥文件位置、设置密钥密码、确认密钥密码)
[root@node1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.0.0.2 拷贝密钥密码到对方的/root/.ssh/authorized_keys文件
[root@node2 ~]# ssh-keygen -t rsa
[root@node2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.0.0.1
注意:master的authorized_keys文件也需要加入本机的id.pub内容,不然后期check-ssh会不通过(当时这个问题困扰了两天才找到原因)
[root@node1 ~]#cat .ssh/id_rsa.pub >> .ssh/authorized_keys
二、安装mysql配置主从复制:
为了方便,就用yum装了哈
[root@node1 ~]# yum -y install mysql*
[root@node2 ~]# yum -y install mysql*
node1(master)的配置文件
[root@node1 ~]# cat /etc/my.cnf
[mysqld]
server-id=1 #全局唯一
log-bin=mysql-bin #生成二进制复制文件的前缀
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
node2(slave)的配置文件
[root@node2 ~]# cat /etc/my.cnf
[mysqld]
server-id=2
log-bin=mysql-bin
#binlog_format=mixed
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
配置mysql主节点:
添加用户以及复制权限:
mysql>grant replication slave on *.* to 'repl'@'10.0.0.%' identified by '123456';
mysql>grant all on *.* to 'root'@'10.0.0.2' identified by '123456';
mysql>grant all on *.* to 'root'@'node1' identified by '123456';
mysql>flush privileges;
配置完之后查看:
mysql> select user,host,password from mysql.user;
+------+-----------+-------------------------------------------+
| user | host | password |
+------+-----------+-------------------------------------------+
| root | localhost | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
| root | node1 | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
| root | 127.0.0.1 | |
| | localhost | |
| | node1 | |
| repl | 10.0.0.% | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
| root | 10.0.0.2 | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
+------+-----------+-------------------------------------------+
7 rows in set (0.02 sec)
mysql> show master status;
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 | 106 | | |
+------------------+----------+--------------+------------------+
1 row in set (0.01 sec)
配置mysql的slave节点:
mysql> change master to
master_host='10.0.0.1',
master_port=3306,
master_user='repl',
master_password='123456',
master_log_file='mysql-bin.000001',
master_log_pos=106;
slave机子上也要授权,因为这个是备用master;
mysql>grant replication slave on *.* to 'repl'@'10.0.0.%' identified by '123456';
mysql>grant all on *.* to 'root'@'10.0.0.1' identified by '123456';
mysql>grant all on *.* to 'root'@'node2' identified by '123456';
mysql>flush privileges;
mysql> select user,host,password from mysql.user;
+------+-----------+-------------------------------------------+
| user | host | password |
+------+-----------+-------------------------------------------+
| root | localhost | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
| root | node2 | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
| root | 127.0.0.1 | |
| | localhost | |
| | node2 | |
| repl | 10.0.0.% | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
| root | 10.0.0.1 | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
+------+-----------+-------------------------------------------+
7 rows in set (0.00 sec)
启动slave
mysql> slave start;
Query OK, 0 rows affected (0.02 sec)
查看slave 的Slave_IO_Running 和Slave_SQL_Running,都为yes 时表示配置成功
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.0.0.1
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 361
Relay_Log_File: mysqld-relay-bin.000001
Relay_Log_Pos: 326
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1007
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 181
Relay_Log_Space: 807
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1007
Last_SQL_Error:
三、安装配置MHA-manager:
[root@node1 mha4mysql-manager-0.54]# perl Makefile.PL #perl-cpan非常麻烦,应该会遇到各种错误,按报错找问题再解决吧...
[root@node1 ~] make && make install
[root@node1 ~]# mkdir /etc/masterha 配置manager的主文件目录
[root@node1 ~]# mkdir -p /master/app1 配置manager的log文件目录
[root@node1 mha4mysql-manager-0.54]# cp samples/conf/* /etc/masterha/ 拷贝配置文件模版
编辑配置文件app1.conf,另一个masterha_default.cnf 无需配置
[root@node1 ~]# cat /etc/masterha/app1.cnf
[server default]
manager_workdir=/masterha/app1
manager_log=/masterha/app1/manager.log
user=root #配置manager-mysql的root管理用户和密码
password=123456
ssh_user=root#之前配置的用来无密码登录的用户
repl_user=repl#用来数据同步复制的用户
repl_password=123456
ping_interval=1#ping验证检查mysql状态,每隔一秒检查一次
shutdown_script=""
#master_ip_failover_script="/data/master_ip_failover"
master_ip_online_change_script=""
report_script=""
[server1] #配置node节点
hostname=10.0.0.1
master_binlog_dir="/var/lib/mysql/" #mysql主从复制的二进制文件位置
candidate_master=1#master机宕掉后,优先启用这台作为新master
#no_master=1 #设置使主机不能成为新master
[server2]
hostname=10.0.0.2
master_binlog_dir="/var/lib/mysql/"
candidate_master=1
保存退出,配置完成。
在两台机器上安装node的rpm包
[root@node1 ~]# rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm
[root@node2 ~]# rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm
进行ssh验证
[root@node1 ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Fri Jun 27 15:28:50 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Jun 27 15:28:50 2014 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Fri Jun 27 15:28:50 2014 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Fri Jun 27 15:28:50 2014 - [info] Starting SSH connection tests..
Fri Jun 27 15:28:52 2014 - [debug]
Fri Jun 27 15:28:51 2014 - [debug] Connecting via SSH from root@10.0.0.1(10.0.0.1:22) to root@10.0.0.2(10.0.0.2:22)..
Fri Jun 27 15:28:52 2014 - [debug] ok.
Fri Jun 27 15:28:52 2014 - [debug]
Fri Jun 27 15:28:51 2014 - [debug] Connecting via SSH from root@10.0.0.2(10.0.0.2:22) to root@10.0.0.1(10.0.0.1:22)..
Fri Jun 27 15:28:52 2014 - [debug] ok.
Fri Jun 27 15:28:52 2014 - [info] All SSH connection tests passed successfully.
进行mysql主从复制验证
[root@node1 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf
重要信息显示当前master和slave:
10.2.16.253 (current master)
+--10.2.16.254
最后输出以下,则表示配置成功!
MySQL Replication Health is OK.
启动MHA-manager管理进程:
nohup masterha_manager --conf=/etc/masterha/app1.cnf >/tmp/mha_manager.log 2>&1
查看管理节点的进程和当前的Master
masterha_check_status --conf=/etc/masterha/app1.cnf
*****************此时MHA的配置已经全部完成!***********************
模拟当mha-master挂掉之后,master切换为备用的254之后,如何切回来:
首先需要删除rm -f /masterha/app1/app1.failover.complete #此为每次启动MHA管理进程时生成的临时文件
*******************让旧的master先变为slave,同步宕机时丢失的数据***********
一、在活这的master中切换
在旧的master 253上执行:
mysql> reset master;
Query OK, 0 rows affected (0.04 sec)
#查看现在备用的master上的pos和log_file值之后,再写:
mysql> change master to master_host='10.0.0.1', master_port=3306, master_user='repl', master_password='123456', master_log_file='mysql-bin.000002', master_log_pos=106;
Query OK, 0 rows affected (0.05 sec)
mysql> start slave; #暂时先把旧master变为从
shell> masterha_master_switch --master_state=alive --conf=/etc/masterha/app1.cnf
二、在有宕机的master中切换:
[root@racdb scripts]# masterha_master_switch --master_state=dead --conf=/etc/app1.cnf --dead_master_host=10.0.0.2 --new_master_host=10.0.0.1
如果,MHA manager检测到没有dead的server,将报错,并结束failover:
Thu May 17 17:53:33 2012 - [info] Dead Servers:Thu May 17 17:53:33 2012 - [error][/usr/lib/perl5/site_perl/5.8.8/MHA/MasterFailover.pm, ln181] None of server is dead. Stop failover.
###########################master成功切换回###########################
如何结合KeepAlived实现宕机时IP漂移达到mysql高可用,请查看本博另一篇文章”keepalived结合MHA实现mysql高可用”