Sunday, January 3, 2016

How to identify if it is a memory bottleneck or a hardware issue in a DB server

This applies when the server has hung because of an out-of-memory error.

When all memory is exhausted, the kernel's out-of-memory killer (oom-killer) begins killing processes to try to keep the server alive.

If the server still ends up rebooting after the oom-killer has tried to recover memory, that points to a hardware issue rather than a plain memory bottleneck.



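Below are the kernel log entries recorded on the server around the reboot. The oom-killer messages come first; the 11:55 entries are the kernel's start-up messages from the boot that followed.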



Apr  4 09:10:09 testoraclerac-d kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0

Apr  4 09:10:09 testoraclerac-d kernel: 22954 pages of slabcache

Apr  4 09:10:09 testoraclerac-d kernel: 722 pages of kernel stacks

Apr  4 09:10:09 testoraclerac-d kernel: 13 lowmem pagetables, 17746 highmem pagetables

Apr  4 09:10:09 testoraclerac-d kernel: Free swap:            0kB

Apr  4 10:20:09 testoraclerac-d kernel: Out of Memory: Killed process 15944 (oracle).

Apr  4 10:30:19 testoraclerac-d kernel: Out of Memory: Killed process 22600 (oracle).

Apr  4 10:30:19 testoraclerac-d kernel: Out of Memory: Killed process 22600 (oracle).

Apr  4 10:37:56 testoraclerac-d kernel: Out of Memory: Killed process 29002 (oracle).

Apr  4 10:37:56 testoraclerac-d kernel: Out of Memory: Killed process 29002 (oracle).

Apr  4 11:55:06 testoraclerac-d kernel: Memory: 4025116k/4653052k available (1710k kernel code, 98984k reserved, 1296k data, 228k init, 3211208k highmem)

Apr  4 11:55:14 testoraclerac-d kernel: Total HugeTLB memory allocated, 0

Apr  4 11:55:14 testoraclerac-d kernel: Freeing initrd memory: 1455k freed

Apr  4 11:55:16 testoraclerac-d kernel: Freeing unused kernel memory: 228k freed
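
Reading through a long messages file by eye is tedious, so here is a minimal Python sketch of the same check: it collects the processes the oom-killer killed and reports whether kernel start-up messages show up afterwards. The function name `summarize`, the `SAMPLE` list and both regular expressions are my own illustration, written against the wording of the excerpt above. Other kernel versions phrase these messages differently, so treat the patterns as assumptions to adjust.

```python
import re

# Assumption: both patterns follow the wording in the excerpt above.
OOM_KILL = re.compile(r"Out of Memory: Killed process (\d+) \((\S+?)\)")
# This "Memory: ...k/...k available" line is printed while the kernel boots.
BOOT_MARKER = re.compile(r"kernel: Memory: \d+k/\d+k available")


def summarize(lines):
    """Return (killed, rebooted).

    killed   -- unique (pid, name) pairs killed by the oom-killer, in log order
    rebooted -- True if a boot marker appears after at least one kill
    """
    killed = []
    rebooted = False
    for line in lines:
        match = OOM_KILL.search(line)
        if match:
            entry = (int(match.group(1)), match.group(2))
            if entry not in killed:  # the same kill can be logged twice
                killed.append(entry)
        elif killed and BOOT_MARKER.search(line):
            rebooted = True
    return killed, rebooted


# A few lines copied from the excerpt above.
SAMPLE = [
    "Apr  4 09:10:09 testoraclerac-d kernel: Free swap:            0kB",
    "Apr  4 10:20:09 testoraclerac-d kernel: Out of Memory: Killed process 15944 (oracle).",
    "Apr  4 10:30:19 testoraclerac-d kernel: Out of Memory: Killed process 22600 (oracle).",
    "Apr  4 10:30:19 testoraclerac-d kernel: Out of Memory: Killed process 22600 (oracle).",
    "Apr  4 10:37:56 testoraclerac-d kernel: Out of Memory: Killed process 29002 (oracle).",
    "Apr  4 11:55:06 testoraclerac-d kernel: Memory: 4025116k/4653052k available (1710k kernel code, 98984k reserved, 1296k data, 228k init, 3211208k highmem)",
]

killed, rebooted = summarize(SAMPLE)
print("Killed by oom-killer:", killed)
print("Reboot seen after OOM kills:", rebooted)
```

To run it against a real log, pass the open file instead of `SAMPLE`, for example `summarize(open(path))`, where `path` is wherever your system keeps its kernel messages. The script only tells you that OOM kills were followed by a reboot; deciding whether the hardware is at fault still needs the checks described above.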
