[mdlug] Server stability

Michael ORourke mrorourke at earthlink.net
Wed Jun 10 14:40:48 EDT 2009


Lugnuts,

I've got a server that is having stability problems.
The server was built with CentOS 5.2 i386 and has been patched recently.  It's running Oracle XE, Tomcat, httpd, and some Java.  Whenever I put any load on the system, it spontaneously reboots.  Any thoughts or suggestions?
Would we be better off reloading this box with an x86_64 version of CentOS?
Here is some info on the server:

--kernel
[root@patchserv proc]# uname -a
Linux patchserv.xxxxxxxxxxxxxxxxxx.xxx 2.6.18-128.1.6.el5PAE #1 SMP Wed Apr 1 10:02:22 EDT 2009 i686 i686 i386 GNU/Linux

--memory info (8GB RAM)
[root@patchserv log]# cat /proc/meminfo
MemTotal:      8178228 kB
MemFree:        300404 kB
Buffers:         43884 kB
Cached:        6252688 kB
SwapCached:      44140 kB
Active:        3276088 kB
Inactive:      4392996 kB
HighTotal:     7470840 kB
HighFree:        17988 kB
LowTotal:       707388 kB
LowFree:        282416 kB
SwapTotal:     2031608 kB
SwapFree:      1987468 kB
Dirty:             136 kB
Writeback:           0 kB
AnonPages:     1267384 kB
Mapped:         684764 kB
Slab:           144172 kB
PageTables:      50672 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   6120720 kB
Committed_AS:  3973640 kB
VmallocTotal:   116728 kB
VmallocUsed:      3900 kB
VmallocChunk:   112712 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
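
On a 32-bit PAE kernel the low-memory zone (LowTotal/LowFree) is much smaller than total RAM, so it can be worth watching separately from MemFree. The following is a minimal sketch, assuming the output above has been saved as text; the function names are made up for illustration and are not part of any existing tool.

```python
def parse_meminfo(text):
    """Return a dict mapping each /proc/meminfo field name to its integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip blank or malformed lines
        # The first token after the colon is the number; a "kB" unit may follow.
        fields[name.strip()] = int(rest.split()[0])
    return fields


def lowmem_free_percent(fields):
    """Percentage of the low-memory zone that is currently free."""
    return 100.0 * fields["LowFree"] / fields["LowTotal"]


# A few lines copied from the output above.
SAMPLE = """\
MemTotal:      8178228 kB
HighTotal:     7470840 kB
HighFree:        17988 kB
LowTotal:       707388 kB
LowFree:        282416 kB
HugePages_Total:     0
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("LowFree is %.1f%% of LowTotal" % lowmem_free_percent(info))
```

Running the same function against periodic snapshots taken under load would show whether low memory shrinks before a reboot; the sketch only reports the numbers and does not diagnose anything by itself.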

--processor info (8 logical CPUs; only the last /proc/cpuinfo entry is shown)
processor       : 7
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 2.80GHz
stepping        : 8
cpu MHz         : 2793.483
cache size      : 2048 KB
physical id     : 1
siblings        : 4
core id         : 1
cpu cores       : 2
apicid          : 7
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est cid cx16 xtpr lahf_lm
bogomips        : 5586.43
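
Whether an x86_64 install is even possible depends on the CPU: the `lm` (long mode) entry in the flags line means the processor can run a 64-bit kernel. The following is a minimal sketch, assuming /proc/cpuinfo text as input; the helper names are hypothetical. It says nothing about whether a 64-bit reinstall would cure the reboots.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            return set(value.split())
    return set()


def supports_64bit(cpuinfo_text):
    """True if the CPU advertises long mode (the 'lm' flag)."""
    # Set membership avoids false matches on flags such as 'lahf_lm'.
    return "lm" in cpu_flags(cpuinfo_text)


# Abbreviated from the flags line above.
SAMPLE = """\
processor       : 7
model name      : Intel(R) Xeon(TM) CPU 2.80GHz
flags           : fpu vme de pse tsc msr pae mce cx8 sse sse2 ss ht tm pbe nx lm constant_tsc pni cx16 xtpr lahf_lm
"""

if __name__ == "__main__":
    print("64-bit capable:", supports_64bit(SAMPLE))
```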

--recent reboots (not user-initiated)
[root@patchserv log]# last | grep reboot
reboot   system boot  2.6.18-128.1.6.e Wed Jun 10 07:42          (06:50)
reboot   system boot  2.6.18-128.1.6.e Wed Jun 10 07:33          (00:04)
reboot   system boot  2.6.18-128.1.6.e Tue Jun  9 01:12         (1+06:24)
reboot   system boot  2.6.18-128.1.6.e Tue Jun  9 01:03          (00:04)
reboot   system boot  2.6.18-128.1.6.e Tue Jun  9 00:26          (00:41)
reboot   system boot  2.6.18-128.1.6.e Tue Jun  9 00:18          (00:03)
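
The trailing field of each line, taken here as how long the system stayed up after that boot, makes the short-lived boots easy to pick out. The following is a minimal sketch, assuming the `last` output has been saved as text; the function names and the 10-minute threshold are arbitrary choices for illustration.

```python
import re

# Matches the trailing duration field, e.g. "(06:50)" or "(1+06:24)".
DURATION = re.compile(r"\((?:(\d+)\+)?(\d+):(\d+)\)\s*$")


def uptime_minutes(last_line):
    """Return the duration at the end of a `last` line in minutes, or None."""
    m = DURATION.search(last_line)
    if m is None:
        return None
    days, hours, minutes = m.groups()
    return (int(days or 0) * 24 + int(hours)) * 60 + int(minutes)


def short_boots(last_output, threshold_minutes=10):
    """Return the lines whose duration is below the threshold."""
    result = []
    for line in last_output.splitlines():
        minutes = uptime_minutes(line)
        if minutes is not None and minutes < threshold_minutes:
            result.append(line)
    return result


# Copied from the output above.
SAMPLE = """\
reboot   system boot  2.6.18-128.1.6.e Wed Jun 10 07:42          (06:50)
reboot   system boot  2.6.18-128.1.6.e Wed Jun 10 07:33          (00:04)
reboot   system boot  2.6.18-128.1.6.e Tue Jun  9 01:12         (1+06:24)
reboot   system boot  2.6.18-128.1.6.e Tue Jun  9 01:03          (00:04)
reboot   system boot  2.6.18-128.1.6.e Tue Jun  9 00:26          (00:41)
reboot   system boot  2.6.18-128.1.6.e Tue Jun  9 00:18          (00:03)
"""

if __name__ == "__main__":
    for line in short_boots(SAMPLE):
        print(line)
```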

--/var/log/messages around the time of the last reboot...
Jun 10 07:33:26 patchserv kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Jun 10 07:33:26 patchserv kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Jun 10 07:37:30 patchserv kdump: saved a vmcore to /var/crash/2009-06-10-07:33
Jun 10 07:37:31 patchserv shutdown[3048]: shutting down for system reboot
Jun 10 07:37:31 patchserv init: Switching to runlevel: 6
Jun 10 07:37:32 patchserv rpc.statd[2983]: Caught signal 15, un-registering and exiting.
Jun 10 07:37:32 patchserv portmap[3087]: connect from 127.0.0.1 to unset(status): request from unprivileged port
Jun 10 07:37:33 patchserv auditd[2888]: The audit daemon is exiting.
Jun 10 07:37:33 patchserv kernel: audit(1244633853.205:5): audit_pid=0 old=2888 by auid=4294967295
Jun 10 07:37:33 patchserv kernel: Kernel logging (proc) stopped.
Jun 10 07:37:33 patchserv kernel: Kernel log daemon terminating.
Jun 10 07:37:34 patchserv exiting on signal 15
Jun 10 07:42:29 patchserv syslogd 1.4.1: restart.
Jun 10 07:42:29 patchserv kernel: klogd 1.4.1, log source = /proc/kmsg started.
Jun 10 07:42:29 patchserv kernel: Linux version 2.6.18-128.1.6.el5PAE (mockbuild@builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Wed Apr 1 10:02:22 EDT 2009
Jun 10 07:42:29 patchserv kernel: BIOS-provided physical RAM map:
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 0000000000100000 - 00000000bffc0000 (usable)
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 00000000bffc0000 - 00000000bffcfc00 (ACPI data)
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 00000000bffcfc00 - 00000000bffff000 (reserved)
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 00000000e0000000 - 00000000fec90000 (reserved)
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 00000000fed00000 - 00000000fed00400 (reserved)
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 0000000100000000 - 00000001ffffe000 (usable)
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 00000001ffffe000 - 0000000200000000 (reserved)
Jun 10 07:42:29 patchserv kernel:  BIOS-e820: 0000000200000000 - 0000000240000000 (usable)
Jun 10 07:42:29 patchserv kernel: 8320MB HIGHMEM available.
Jun 10 07:42:29 patchserv kernel: 896MB LOWMEM available.
Jun 10 07:42:29 patchserv kernel: found SMP MP-table at 000fe710
Jun 10 07:42:29 patchserv kernel: NX (Execute Disable) protection: active
Jun 10 07:42:29 patchserv kernel: DMI 2.3 present.
Jun 10 07:42:29 patchserv kernel: Using APIC driver default
Jun 10 07:42:29 patchserv kernel: ACPI: PM-Timer IO Port: 0x808
Jun 10 07:42:29 patchserv kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Jun 10 07:42:29 patchserv kernel: Processor #0 15:4 APIC version 20
Jun 10 07:42:29 patchserv kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
Jun 10 07:42:29 patchserv kernel: Processor #6 15:4 APIC version 20
Jun 10 07:42:29 patchserv kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
Jun 10 07:42:29 patchserv kernel: Processor #2 15:4 APIC version 20
Jun 10 07:42:29 patchserv kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] enabled)
Jun 10 07:42:29 patchserv kernel: Processor #4 15:4 APIC version 20
Jun 10 07:42:29 patchserv kernel: ACPI: LAPIC (acpi_id[0x05] lapic_id[0x01] enabled)
Jun 10 07:42:29 patchserv kernel: Processor #1 15:4 APIC version 20
Jun 10 07:42:29 patchserv kernel: ACPI: LAPIC (acpi_id[0x06] lapic_id[0x07] enabled)
Jun 10 07:42:29 patchserv kernel: Processor #7 15:4 APIC version 20
Jun 10 07:42:29 patchserv kernel: ACPI: LAPIC (acpi_id[0x07] lapic_id[0x03] enabled)
Jun 10 07:42:29 patchserv kernel: Processor #3 15:4 APIC version 20
Jun 10 07:42:29 patchserv kernel: ACPI: LAPIC (acpi_id[0x08] lapic_id[0x05] enabled)
Jun 10 07:42:29 patchserv kernel: Processor #5 15:4 APIC version 20
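
The kdump line at 07:37:30 in the log above says a vmcore was saved, which suggests the kernel crashed and the capture kernel wrote a dump before the reboot. The following is a minimal sketch for pulling such lines out of a messages file, assuming the syslog layout shown above; the function name is hypothetical.

```python
import re

# Captures timestamp, host, and dump path from a kdump "saved a vmcore" line.
VMCORE = re.compile(
    r"^(\w{3}\s+\d+\s+[\d:]+)\s+(\S+)\s+kdump: saved a vmcore to (\S+)"
)


def find_vmcores(log_text):
    """Return a (timestamp, host, path) tuple for every saved-vmcore line."""
    hits = []
    for line in log_text.splitlines():
        m = VMCORE.match(line)
        if m:
            hits.append(m.groups())
    return hits


# Copied from the log above.
SAMPLE = """\
Jun 10 07:33:26 patchserv kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Jun 10 07:37:30 patchserv kdump: saved a vmcore to /var/crash/2009-06-10-07:33
Jun 10 07:37:31 patchserv shutdown[3048]: shutting down for system reboot
"""

if __name__ == "__main__":
    for stamp, host, path in find_vmcores(SAMPLE):
        print(stamp, host, path)
```

Each path found this way points at a crash dump directory that can be examined for the cause of the crash.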

Thanks,
Mike
