[20130220] CentOS 6.2 and hugepages.txt
1. Yesterday I switched a newly launched system over to hugepages. A brief record:
# cat hugepages_settings.sh
#!/bin/bash -x
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com
# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments. Before proceeding with the execution please make sure
that:
* Oracle Database instance(s) are up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not setup
(See Doc ID 749851.1)
* The shared memory segments can be listed by command:
# ipcs -m
Press Enter to proceed..."
read
# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
# Initialize the counter
NUM_PG=0
# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | awk '{print $5}' | grep "[0-9][0-9]*"`
do
MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
if [ $MIN_PG -gt 0 ]; then
NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
fi
done
RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`
# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
echo "***********"
echo "** ERROR **"
echo "***********"
echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:
# ipcs -m
of a size that can match an Oracle Database SGA. Please make sure that:
* Oracle Database instance is up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not configured"
# exit 1
fi
# Finish with results
case $KERN in
'2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
'2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
*) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac
--Running the script above produced the following:
Recommended setting: vm.nr_hugepages = 5860
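The per-segment rounding used in the loop above can be looked at in isolation. This is only a minimal sketch: pages_for_segment is a hypothetical helper name that does not appear in the Oracle script, and it uses shell integer arithmetic where the script calls bc.

```shell
#!/bin/bash
# pages_for_segment is a hypothetical helper, not part of the Oracle script.
# It applies the same rule as the loop: integer-divide the segment size by the
# huge page size, and add one extra page when the quotient is non-zero.
pages_for_segment() {
    local seg_bytes=$1   # segment size in bytes (column 5 of `ipcs -m`)
    local hpg_kb=$2      # Hugepagesize from /proc/meminfo, in kB
    local min_pg=$(( seg_bytes / (hpg_kb * 1024) ))
    if [ "$min_pg" -gt 0 ]; then
        echo $(( min_pg + 1 ))
    else
        echo 0           # segments smaller than one huge page are skipped
    fi
}

# Assumed example: one 4 GiB segment with 2048 kB huge pages.
pages_for_segment 4294967296 2048    # prints 2049
```

The extra page per segment means the recommendation is always slightly above the exact requirement.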
2. Add the following to /etc/sysctl.conf: [I rounded the value up a little]
vm.nr_hugepages = 6000
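After the setting is applied (for example with sysctl -p as root), the kernel may reserve fewer pages than requested if memory is fragmented, so it is worth comparing HugePages_Total with the target. A minimal sketch, assuming a hypothetical helper named hugepages_allocated that takes the wanted count and an optional meminfo-style file:

```shell
#!/bin/bash
# hugepages_allocated is a hypothetical helper. It reads HugePages_Total from a
# meminfo-style file (default /proc/meminfo) and succeeds only if the kernel
# actually reserved at least the requested number of pages.
hugepages_allocated() {
    local wanted=$1
    local meminfo=${2:-/proc/meminfo}
    local total
    total=$(awk '/^HugePages_Total:/ {print $2}' "$meminfo")
    [ -n "$total" ] && [ "$total" -ge "$wanted" ]
}

# Usage once vm.nr_hugepages = 6000 has been applied:
if hugepages_allocated 6000; then
    echo "hugepages reserved"
else
    echo "fewer than 6000 hugepages reserved"
fi
```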
3. Add the following to /etc/security/limits.conf. The value is 6000*2048=12288000 (in KB):
* soft memlock 12288000
* hard memlock 12288000
The 2048 comes from here:
# cat /proc/meminfo | grep -i hugepagesize
Hugepagesize: 2048 kB
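The memlock arithmetic can be checked with a one-line calculation. A minimal sketch; memlock_kb is a hypothetical helper name used only for illustration:

```shell
#!/bin/bash
# memlock_kb is a hypothetical helper: number of huge pages times the page size
# in kB gives the memlock value in KB, the unit used by limits.conf.
memlock_kb() {
    local pages=$1 hpg_kb=$2
    echo $(( pages * hpg_kb ))
}

memlock_kb 6000 2048    # prints 12288000
```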
4. On starting the database, the following message appeared:
------------[ cut here ]------------
WARNING: at fs/hugetlbfs/inode.c:951 hugetlb_file_setup+0x227/0x250() (Tainted: G ---------------- T)
Hardware name: PowerEdge R710
Using mlock ulimits for SHM_HUGETLB deprecated
--I edited the following part and wrapped the lines.
Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4
nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables
bridge autofs4 sunrpc 8021q garp stp llc cachefiles fscache(T) ipv6 vhost_net macvtap macvlan tun
kvm uinput power_meter sg ses enclosure dcdbas microcode serio_raw iTCO_wdt iTCO_vendor_support
i7core_edac edac_core bnx2 ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic
ata_piix megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Pid: 2924, comm: oracle Tainted: G ---------------- T 2.6.32-220.17.1.el6.x86_64 #1
Call Trace:
[] ? warn_slowpath_common+0x87/0xc0
[] ? warn_slowpath_fmt+0x46/0x50
[] ? user_shm_lock+0x54/0xc0
[] ? hugetlb_file_setup+0x227/0x250
[] ? sprintf+0x40/0x50
[] ? newseg+0x152/0x290
[] ? ipcget+0x1f5/0x200
[] ? sys_shmget+0x59/0x60
[] ? newseg+0x0/0x290
[] ? shm_security+0x0/0x10
[] ? shm_more_checks+0x0/0x20
[] ? system_call_fastpath+0x16/0x1b
---[ end trace 79a13eb94b968dda ]---
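The database started normally, though. A process with pid=2924 no longer exists. Judging from this line:
Pid: 2924, comm: oracle Tainted: G ---------------- T 2.6.32-220.17.1.el6.x86_64 #1
it looks like it should have been an oracle process.
Checking the alert.log file: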
pga_aggregate_target = 6719275008
PMON started with pid=2, OS id=2932
PSP0 started with pid=3, OS id=2934
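--There is no OS id=2924 here either, and oddly pid=1 does not appear. Could the message have come from something other than the oracle startup? I will have to leave that for next time.
References: https://oss.oracle.com/el6/docs/RELEASE-NOTES-GA-en.html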
http://www.eygle.com/archives/2011/12/hugepageshugetl.html
hugepages warning messages (9861498)
An application using hugepages may see a warning message like "Using mlock ulimits for SHM_HUGETLB deprecated." To
avoid this warning, the application should be configured to use CAP_IPC_LOCK or the process (e.g. Oracle) should be
added to the hugetlb_shm_group.
--BTW: I did not get an ORA-27125 error.
# su - oracle
$ id
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),502(dba)
--The dba group is 502, so following the advice above I ran:
# echo 502 >| /proc/sys/vm/hugetlb_shm_group
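--To make this persist across reboots, add the following to /etc/sysctl.conf:
vm.hugetlb_shm_group = 502 --not sure whether this change affects other programs.
--Since this is a production system currently in use, there is a lot I cannot verify; I will observe at the next startup.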
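Before the next restart, one thing that can be checked without touching the database is whether the gid written to hugetlb_shm_group is really one of the oracle user's groups. A minimal sketch; gid_in_list is a hypothetical helper name, and the sample values are the ones from the id output above:

```shell
#!/bin/bash
# gid_in_list is a hypothetical helper: it succeeds if the given gid appears in
# a space-separated list of gids, such as the output of `id -G <user>`.
gid_in_list() {
    local gid=$1 list=$2 g
    for g in $list; do
        [ "$g" = "$gid" ] && return 0
    done
    return 1
}

# Values from the `id` output above: oracle belongs to groups 501 and 502.
if gid_in_list 502 "501 502"; then
    echo "oracle is in hugetlb_shm_group"
fi
```

On the live system the two arguments could instead come from cat /proc/sys/vm/hugetlb_shm_group and id -G oracle.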