Source: http://hi.baidu.com/leechl
Today the slave of one of our databases reported Slave_IO_Running: No. I logged in to the machine and ran:
>STOP SLAVE;
>SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
>START SLAVE;
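I checked the slave status again: still Slave_IO_Running: No. (SQL_SLAVE_SKIP_COUNTER only makes the SQL thread skip events, so it cannot help when the I/O thread is the one that has stopped.)
Looking at the master's error log, I found this odd entry: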
Got timeout reading communication packets
The slave's error log was even stranger:
090430 15:49:38 [Note] Slave I/O thread: connected to master 'user@192.16.0.123:3306',replication started in log 'xxx-bin.000815' at position 3776386
090430 15:49:38 [ERROR] Error reading packet from server: Client requested master to start replication from impossible position ( server_errno=1236)
090430 15:49:38 [ERROR] Got fatal error 1236: 'Client requested master to start replication from impossible position' from master when reading data from binary log
090430 15:49:38 [Note] Slave I/O thread exiting, read up to log 'xxx-bin.000815', position 3776386
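The file xxx-bin.000815 looked like the culprit. I checked its size, and sure enough it is smaller than 3776386 bytes, so that position does not exist in the file and the slave's read request was bound to fail. Why the slave ended up at that position, I don't know.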
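The size check above can be written down as a small Python sketch. It assumes you have copied the Log_name and File_size columns from SHOW BINARY LOGS on the master into a dict by hand. The function name and the sizes below are made up for illustration, since the post does not give the real file size.

```python
def position_is_impossible(binlog_sizes, log_file, log_pos):
    """Return True if log_pos lies beyond the end of log_file on the master.

    binlog_sizes maps binary log name -> size in bytes, for example copied
    from the output of SHOW BINARY LOGS on the master.
    """
    if log_file not in binlog_sizes:
        raise KeyError(f"master has no binary log named {log_file!r}")
    # A position equal to the file size is fine: the slave is simply at the
    # end of the file. Only a position past the end cannot be served.
    return log_pos > binlog_sizes[log_file]


# Hypothetical sizes: the real ones are not given in the post.
sizes = {"xxx-bin.000815": 3776000, "xxx-bin.000816": 1024}
print(position_is_impossible(sizes, "xxx-bin.000815", 3776386))  # True
```

If this returns True for the file and position shown in the slave's error log, the master cannot serve that request and will answer with error 1236.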
The fix is to have the slave start reading from the next binlog:
>STOP SLAVE;
>CHANGE MASTER TO MASTER_LOG_FILE='xxx-bin.000816',MASTER_LOG_POS=0;
>START SLAVE;
>SHOW SLAVE STATUS\G
Now it shows Slave_IO_Running: Yes, problem solved. MySQL replication always seems to run into all sorts of strange problems...
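For anyone who wants to script the same workaround, here is a rough Python sketch that derives the next binlog name and builds the CHANGE MASTER statement. Both helper names are hypothetical. The default position of 4 is my assumption: the first event in a binlog file starts right after the 4-byte magic header, whereas the command above passed 0. The sketch only builds the statement text and does not talk to a server.

```python
def next_binlog(log_file):
    """'xxx-bin.000815' -> 'xxx-bin.000816', keeping the zero padding."""
    base, _, seq = log_file.rpartition(".")
    if not base or not seq.isdigit():
        raise ValueError(f"unexpected binary log name: {log_file!r}")
    return f"{base}.{int(seq) + 1:0{len(seq)}d}"


def change_master_sql(log_file, log_pos=4):
    """Build the statement that points the slave at log_file / log_pos."""
    return (
        f"CHANGE MASTER TO MASTER_LOG_FILE='{log_file}', "
        f"MASTER_LOG_POS={log_pos};"
    )


print(change_master_sql(next_binlog("xxx-bin.000815")))
# CHANGE MASTER TO MASTER_LOG_FILE='xxx-bin.000816', MASTER_LOG_POS=4;
```

Run the generated statement between STOP SLAVE and START SLAVE, and confirm first that the next file actually exists on the master.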

