一台MySQL服务器突然Crash了,检查进程 ps -ef | grep -i mysql 发现mysqld进程已经没有了, 检查错误日志时发现MySQL确实Crash了。具体如下所示:
注意日志中的时间: 09:49:52 UTC是UTC时间(协调世界时间) , 加上8小时就是东八区的时间17:45:52,日志前段是UTC时间,后面又是系统时间。这个需要注意!
09:49:52 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=48
max_threads=151
thread_count=20
connection_count=20
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68245 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x24c53a0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f8b28d22e90 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x946155]
/usr/sbin/mysqld(handle_fatal_signal+0x3d8)[0x6a58c8]
/lib64/libpthread.so.0(+0xf100)[0x7f8b56fe0100]
/usr/sbin/mysqld(mi_extra+0x2b)[0xb414cb]
/usr/sbin/mysqld(_ZN13QEP_tmp_table8end_sendEv+0x49)[0x706bc9]
/usr/sbin/mysqld(_Z13sub_select_opP4JOINP13st_join_tableb+0x75)[0x705f25]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x3a9)[0x7039d9]
/usr/sbin/mysqld[0x74ac19]
/usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xbc)[0x74b6dc]
/usr/sbin/mysqld(_Z13handle_selectP3THDP13select_resultm+0x175)[0x74b8e5]
/usr/sbin/mysqld[0x725079]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x34ae)[0x72a1ae]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x318)[0x72de48]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x11b6)[0x72f7f6]
/usr/sbin/mysqld(_Z10do_commandP3THD+0xd7)[0x7310a7]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x116)[0x6f8856]
/usr/sbin/mysqld(handle_one_connection+0x45)[0x6f8935]
/usr/sbin/mysqld(pfs_spawn_thread+0x126)[0xb153e6]
/lib64/libpthread.so.0(+0x7dc5)[0x7f8b56fd8dc5]
/lib64/libc.so.6(clone+0x6d)[0x7f8b55c9fced]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f8ac4ddf810): is an invalid pointer
Connection ID (thread ID): 77704
Status: NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
170919 17:49:52 mysqld_safe Number of processes running now: 0
170919 17:49:52 mysqld_safe mysqld restarted
2017-09-19 17:49:53 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2017-09-19 17:49:53 61774 [Note] Plugin 'FEDERATED' is disabled.
2017-09-19 17:49:53 61774 [Note] InnoDB: Using atomics to ref count buffer pool pages
2017-09-19 17:49:53 61774 [Note] InnoDB: The InnoDB memory heap is disabled
2017-09-19 17:49:53 61774 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-09-19 17:49:53 61774 [Note] InnoDB: Memory barrier is not used
2017-09-19 17:49:53 61774 [Note] InnoDB: Compressed tables use zlib 1.2.3
2017-09-19 17:49:53 61774 [Note] InnoDB: Using Linux native AIO
2017-09-19 17:49:53 61774 [Note] InnoDB: Using CPU crc32 instructions
2017-09-19 17:49:53 61774 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2017-09-19 17:49:53 61774 [Note] InnoDB: Completed initialization of buffer pool
2017-09-19 17:49:53 61774 [Note] InnoDB: Highest supported file format is Barracuda.
2017-09-19 17:49:53 61774 [Note] InnoDB: Log scan progressed past the checkpoint lsn 50601391009
2017-09-19 17:49:53 61774 [Note] InnoDB: Database was not shutdown normally!
2017-09-19 17:49:53 61774 [Note] InnoDB: Starting crash recovery.
2017-09-19 17:49:53 61774 [Note] InnoDB: Reading tablespace information from the .ibd files...
2017-09-19 17:49:53 61774 [Note] InnoDB: Restoring possible half-written data pages
2017-09-19 17:49:53 61774 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Doing recovery: scanned up to log sequence number 50605593148
2017-09-19 17:49:54 61774 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch comple2017-09-19 17:49:55 61774 [Note] Crash recovery finished.
2017-09-19 17:49:57 61774 [Note] RSA private key file not found: /var/lib/mysql//private_key.pem. Some authentication plugins will not work.
2017-09-19 17:49:57 61774 [Note] RSA public key file not found: /var/lib/mysql//public_key.pem. Some authentication plugins will not work.
2017-09-19 17:49:57 61774 [Note] Server hostname (bind-address): '*'; port: 3306
2017-09-19 17:49:57 61774 [Note] IPv6 is available.
2017-09-19 17:49:57 61774 [Note] - '::' resolves to '::';
2017-09-19 17:49:57 61774 [Note] Server socket created on IP: '::'.
2017-09-19 17:49:57 61774 [ERROR] /usr/sbin/mysqld: Error writing file '/var/lib/mysql/xxxx.pid' (Errcode: 28 - No space left on device)
2017-09-19 17:49:57 61774 [ERROR] Can't start server: can't create PID file: No space left on device
170919 17:49:57 mysqld_safe Number of processes running now: 0
170919 17:49:57 mysqld_safe mysqld restarted
2017-09-19 17:49:57 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2017-09-19 17:49:57 61816 [Note] Plugin 'FEDERATED' is disabled.
2017-09-19 17:49:57 61816 [Note] InnoDB: Using atomics to ref count buffer pool pages
2017-09-19 17:49:57 61816 [Note] InnoDB: The InnoDB memory heap is disabled
2017-09-19 17:49:57 61816 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-09-19 17:49:57 61816 [Note] InnoDB: Memory barrier is not used
2017-09-19 17:49:57 61816 [Note] InnoDB: Compressed tables use zlib 1.2.3
2017-09-19 17:49:57 61816 [Note] InnoDB: Using Linux native AIO
2017-09-19 17:49:57 61816 [Note] InnoDB: Using CPU crc32 instructions
2017-09-19 17:49:57 61816 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2017-09-19 17:49:57 61816 [Note] InnoDB: Completed initialization of buffer pool
2017-09-19 17:49:58 61816 [Note] InnoDB: Highest supported file format is Barracuda.
2017-09-19 17:49:58 61816 [Note] InnoDB: Log scan progressed past the checkpoint lsn 50601391009
2017-09-19 17:49:58 61816 [Note] InnoDB: Database was not shutdown normally!
2017-09-19 17:49:58 61816 [Note] InnoDB: Starting crash recovery.
2017-09-19 17:49:58 61816 [Note] InnoDB: Reading tablespace information from the .ibd files...
2017-09-19 17:49:58 61816 [Note] InnoDB: Restoring possible half-written data pages
2017-09-19 17:49:58 61816 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Doing recovery: scanned up to log sequence number 50605596446
2017-09-19 17:49:58 61816 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 36549295, file name mysql-bin.000031
2017-09-19 17:49:59 61816 [Note] InnoDB: 128 rollback segment(s) are active.
2017-09-19 17:49:59 61816 [Note] InnoDB: Waiting for purge to start
2017-09-19 17:49:59 61816 [Note] InnoDB: 5.6.20 started; log sequence number 50605596446
^G/usr/sbin/mysqld: Error writing file '/var/lib/mysql/xxxx-slow.log' (Errcode: 28 - No space left on device)
2017-09-19 17:49:59 61816 [ERROR] Could not open /var/lib/mysql/xxxx-slow.log for logging (error 28). Turning logging off for the whole duration of the MySQL server process. To turn it on again: fix the cause, shutdown the MySQL server and restart it.
2017-09-19 17:49:59 61816 [Note] Recovering after a crash using mysql-bin
2017-09-19 17:49:59 61816 [Note] Starting crash recovery...
2017-09-19 17:49:59 61816 [Note] Crash recovery finished.
09:49:59 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by....
日志“mysqld got signal 11”标志着MySQL实例Crash的信号,可以看到MySQL实例Crash后,又有几次尝试Crash recovery,在详细信息里面,我们可以看到(Errcode: 28 - No space left on device),那么很有可能是磁盘空间不足导致MySQLCrash了。后面确认检查后,确实发现分区突然被耗尽了(本身空间不充裕、有人在其中一个分区上面备份了大量文件导致空间被耗尽)
2017-09-19 17:49:57 61774 [ERROR] /usr/sbin/mysqld: Error writing file '/var/lib/mysql/xxxxx.pid' (Errcode: 28 - No space left on device)
2017-09-19 17:49:57 61774 [ERROR] Can't start server: can't create PID file: No space left on device
后面搜索了一下引起MySQL Crash的可能因素,有篇文章10+ Ways to Crash or Overload MySQL介绍了导致MySQL Crash的一些因素, 非常全面的介绍了这方面的知识,内容虽然浅显,但是是一个不错的入门介绍,下文收录了可能引起MySQL Crash的知识点:
Temporary Tables You can build query (with derived tables) which uses as many temporary tables as you like and you can size them so they would be still be created in memory.
Memory Tables If you can create memory tables you can create any number of them and even though there is max_heap_table_size to restrict size of each table total size is unrestricted. Note you can create tables as TEMPORARY so they would not be easily visible on file system.
MyISAM Sort Buffer – This one is typically set large as it is assumed only couple of tables would be repaired at the same time. What if user uses all 100 of his allowed connections to ALTER 100 different tables ? This can be offset by keeping myisam_sort_buffer_size low, but then performance would suffer.
Prepared Statements Number – Happily now there is a limit on total number of prepared statements (max_stmt_count) which can be created per server, so it is better than it was before when application which forgot to close prepared statements could easily make server to take up all memory. However there is no per user limit so one user can consume all prepared breaking other applications which need prepared statements. Moreover not all prepared statements consume same amount of memory and by preparing complex prepared statements you can eat a lot of memory. The workaround for this issue is to avoid use of prepared statements and keep max_prepared_stmt_count very low.
Prepared Statements and Blob Data If you’re want to get memory consumed by single prepared statement you can create statement with thousands of placeholders and send data for each of them using mysql_stmt_send_long_data call – Server buffers such data until you have executed prepared statement.
Innodb Table Cache Leak – Innodb never shrinks its internal table cache (data dictionary) so by creating and accessing large amount of Innodb tables you can allocate large amount of memory on server. The size is typically 4-8K per table though complex tables can require larger sizes, so this is mainly the problem for smaller servers.
Table Cache Merge Tables – Table Cache is allocated in entries and it is normally assumed each entry will use no more than couple of file descriptors. This is not the case with Merge tables for example – creating and accessing few merge tables with 1000 of subtables will likely cause your MySQL server to run out of file descriptors. The same is true for Partitioned tables in MySQL 5.1
Disk Space For MyISAM tables hosting providers used to use disk quotas for MyISAM tables. You can also use similar technique with innodb_file_per_table. However you can’t control growth of Innodb system tablespace which is used to hold undo data and which you can grow over roof by opening transaction and doing a lot of updates, or simply keeping transaction open and allowing other users to do updates – Innodb only can purge data after oldest transactions needing snapshot commits. You can kind of workaround this issue by killing transactions which are too old though proper solution would be implementing some form of limit on undo segment size. Another possibility is to use queries which use large temporary tables or sort files which can take up all space and even if they can be placed on separate partition filling it up will cause other users being unable to run their queries.
Stored Procedures – How much memory can stored procedure allocate ? say can you create 1000 variables in stored procedure and set 1M result set to each of them ? I have not experimented with other stored procedure language constructions but I do not think tight memory allocation policy is enforced.
Stored Procedures Cursors – Cursors inside of stored procedures are implemented as temporary tables, so by opening large amount of cursors which are handled as in-memory temporary tables you can consume unbound amount of memory.
Stored Procedures Recursion – It does not have to be recursion per say – just different stored procedures calling each other. Calls require memory allocation and especially stack memory allocations. There are some protections to ensure you would not run out of stack but they might not cover all cases.
Server Side Variables – Each server side can hold value up to max_allowed_packet in size (1M by default) but there does not seems to be any limit for server side variables one can create.
Parse Tree The query is internally presented using parse tree inside MySQL which of course depends on query size which is controlled by max_allowed_packet. However some MySQL optimizations such as equity propagation and range expansion can cause Parse Tree to blow up in size. For most trivial case it was fixed it was fixed though I’m not sure if all possibilities were validated.
Session variables There is no restriction on how large you can set per connection variables for unprivileged user which allows to run queries with uncontrolled resource usage.
Host Block You can have given client host blocked from accessing the server by simulating number of failed connections. This can be avoided by having high max_connect_errors variables but this will obviously disable password brute force protection.
Mutex Saturation Both Innodb and MyISAM have hotspots and having few connections which use appropriate operations heavily you can reduce system performance beyond being functional.
General Overload As MySQL does not have much of resource utilization control you can simply run heavy queries to get MySQL Server barely functional. The limits which exist are not really helpful as they do not define query complexity and resource consumption allowed for user. Heavy Disk IO queries can be one of the worse because they would both overload IO system and wipe off both MySQL and OS caches which can cause other users queries to perform order of magnitude slower than in normal case.
参考资料:
https://www.percona.com/blog/2007/11/13/10-ways-to-crash-or-overload-mysql/