HugePages Calculation and Activation but cannot disable Transparent HugePages on RedHat Linux 6.2 - RedHat bug - 422283

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • HugePages Calculation and Activation but cannot disable Transparent HugePages on RedHat Linux 6.2 - RedHat bug - 422283

    Hello, beta testers,

    I have a very interesting case to share.

    To improve database server performance on one node of an Oracle RAC, I introduced a change on ASM: the memory allocation went from 380M to 1.5GB.

    HugePages configuration is the heart of this topic.

    Additional details:
    OS Linux - Red Hat Enterprise Linux Server release 6.2 (Santiago)
    Oracle GI 11.2.0.3
    Oracle DB 11.2.0.3
    First database with Total 14GB SGA allocation
    Second database with Total 3GB SGA allocation

    ---- script to calculate Huge Pages

    Code:
    #!/bin/bash
    #
    # hugepages_settings.sh
    #
    # Linux bash script to compute values for the
    # recommended HugePages/HugeTLB configuration
    #
    # Note: This script does the calculation for all shared memory
    # segments available when the script is run, whether or not they
    # are Oracle RDBMS shared memory segments.
    # Check for the kernel version
    KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
    # Find out the HugePage size
    HPG_SZ=`grep Hugepagesize /proc/meminfo | awk {'print $2'}`
    # Start from 1 page to be on the safe side and guarantee 1 free HugePage
    NUM_PG=1
    # Cumulative number of pages required to handle the running shared memory segments
    for SEG_BYTES in `ipcs -m | awk {'print $5'} | grep "[0-9][0-9]*"`
    do
       MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
       if [ $MIN_PG -gt 0 ]; then
          NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
       fi
    done
    # Finish with results
    case $KERN in
       '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
              echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
       '2.6' | '3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
        *) echo "Unrecognized kernel version $KERN. Exiting." ;;
    esac
    # End
    
    --- check the recommendation

    Code:
    ./hugepages_settings.sh
    Recommended setting: vm.nr_hugepages = 8704
    To verify the value manually: take the total SGA (here the two SGAs, 14GB + 3GB, total 17GB), convert it to KB, and divide by the 2048 KB huge page size - e.g. 17 * 1024 * 1024 / 2048 = 8704.

    It is good to take future SGA changes into account, because increasing the value later requires a server/node restart.

    So, I decided to provision 80GB - this is -> vm.nr_hugepages = 40960.
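    Both figures can be sanity-checked with shell arithmetic (one huge page is 2048 KB, i.e. 2 MB):

```shell
# 17 GB of SGA, in KB, divided by the 2048 KB huge page size:
echo $(( 17 * 1024 * 1024 / 2048 ))   # 8704 pages

# and the reverse: 40960 pages of 2 MB each, expressed in GB:
echo $(( 40960 * 2 / 1024 ))          # 80 GB
```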

    Simply add this to /etc/sysctl.conf:

    Code:
    [root@nodea ~]# grep vm.nr_hugepages /etc/sysctl.conf
    vm.nr_hugepages = 40960
    [root@nodea ~]#
    I restarted, and HugePages were activated:

    Code:
    [root@nodea ~]# grep Huge /proc/meminfo
    AnonHugePages:   1007616 kB
    HugePages_Total:   40960   <-- this one
    HugePages_Free:    33910
    HugePages_Rsvd:      247
    HugePages_Surp:        0
    Hugepagesize:       2048 kB
    [root@nodea ~]#
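    A quick way to see how many huge pages are actually in use is Total minus Free. A small awk sketch, demonstrated here on a temp-file copy of the numbers above; on a live box you would point awk at /proc/meminfo directly:

```shell
# sample of the relevant /proc/meminfo lines
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
HugePages_Total:   40960
HugePages_Free:    33910
EOF
# in-use huge pages = Total - Free
awk '/HugePages_Total/ {t=$2} /HugePages_Free/ {f=$2} END {print t - f}' "$tmp"   # 7050
rm -f "$tmp"
```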
    But at the same time we have something strange: AnonHugePages - this is Transparent HugePages (THP).
    As per Oracle's recommendation, you have to deactivate THP in order to avoid performance issues - https://blogs.oracle.com/linux/entry...ansparent_huge

    -- Explanation from RedHat - https://access.redhat.com/solutions/46111
    -- Explanation from Oracle -
    ALERT: Disable Transparent HugePages on SLES11, RHEL6, OL6 and UEK2 Kernels (Doc ID 1557478.1) ****** important one *******
    HugePages on Linux: What It Is... and What It Is Not... (Doc ID 361323.1)
    HugePages on Oracle Linux 64-bit (Doc ID 361468.1)
    Hugepages are Not used by Database Buffer Cache (Doc ID 829850.1)

    Monitoring Linux 6 Transparent Huge Pages - not about Oracle but about Hadoop; very useful monitoring methods:
    http://structureddata.org/2012/06/18...oop-workloads/

    -- a very good explanation of HugePages on Tim Hall's website - https://oracle-base.com/articles/lin...le-on-linux-64

    Let's start with the procedure to disable THP.

    --
    You can do it via grub.conf or with a script...

    -- the better option is via GRUB, by adding a kernel boot parameter

    --- my setup

    Code:
    [root@nodea ~]# cat /etc/grub.conf
    # grub.conf generated by anaconda
    #
    # Note that you do not have to rerun grub after making changes to this file
    # NOTICE:  You do not have a /boot partition.  This means that
    #          all kernel and initrd paths are relative to /, eg.
    #          root (hd0,5)
    #          kernel /boot/vmlinuz-version ro root=/dev/sda6
    #          initrd /boot/initrd-[generic-]version.img
    #boot=/dev/sda
    default=0
    timeout=5
    splashimage=(hd0,5)/boot/grub/splash.xpm.gz
    hiddenmenu
    title Red Hat Enterprise Linux (2.6.32-220.el6.x86_64)
            root (hd0,5)
            kernel /boot/vmlinuz-2.6.32-220.el6.x86_64 ro root=UUID=1ac29f32-9a3e-426e-98cf-b76e1e66c512 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD quiet SYSFONT=latarcyrheb-sun16 rhgb crashkernel=auto  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM
            initrd /boot/initramfs-2.6.32-220.el6.x86_64.img
    [root@nodea ~]#
    --- change to

    Code:
    nodea
    [root@sdp13a ~]# cat /etc/grub.conf
    # grub.conf generated by anaconda
    #
    # Note that you do not have to rerun grub after making changes to this file
    # NOTICE:  You do not have a /boot partition.  This means that
    #          all kernel and initrd paths are relative to /, eg.
    #          root (hd0,5)
    #          kernel /boot/vmlinuz-version ro root=/dev/sda6
    #          initrd /boot/initrd-[generic-]version.img
    #boot=/dev/sda
    default=0
    timeout=5
    splashimage=(hd0,5)/boot/grub/splash.xpm.gz
    hiddenmenu
    title Red Hat Enterprise Linux (2.6.32-220.el6.x86_64)
            root (hd0,5)
            kernel /boot/vmlinuz-2.6.32-220.el6.x86_64 ro root=UUID=1ac29f32-9a3e-426e-98cf-b76e1e66c512 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD quiet SYSFONT=latarcyrheb-sun16 rhgb crashkernel=auto  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM transparent_hugepage=never
            initrd /boot/initramfs-2.6.32-220.el6.x86_64.img
    [root@nodea ~]#
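    The edit can also be scripted. A hedged sed sketch, run here against a throwaway copy because a wrong grub.conf leaves the box unbootable; on the real /etc/grub.conf you would take a backup first:

```shell
# build a throwaway grub.conf-like file for the demonstration
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
title Red Hat Enterprise Linux (2.6.32-220.el6.x86_64)
        root (hd0,5)
        kernel /boot/vmlinuz-2.6.32-220.el6.x86_64 ro root=UUID=... quiet
        initrd /boot/initramfs-2.6.32-220.el6.x86_64.img
EOF
# append the boot parameter to every "kernel" line, in place
sed -i '/^[[:space:]]*kernel /s/$/ transparent_hugepage=never/' "$tmp"
grep -c 'transparent_hugepage=never' "$tmp"   # 1
rm -f "$tmp"
```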
    -- after the restart, I checked which processes were allocating THP

    Code:
    [root@nodea ~]#  grep -e AnonHugePages  /proc/*/smaps | awk  '{ if($2>4) print $0} ' |  awk -F "/"  '{print $0; system("ps -fp " $3)} '
    /proc/12253/smaps:AnonHugePages:      4096 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    root     12253     1  0 17:43 ?        00:00:00 /usr/sbin/nsrexecd
    /proc/12253/smaps:AnonHugePages:      2048 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    root     12253     1  0 17:43 ?        00:00:00 /usr/sbin/nsrexecd
    /proc/12304/smaps:AnonHugePages:      2048 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    root     12304     1  7 17:43 ?        00:00:00 /data/grid/11.2.0.3/grid/bin/ohasd.bin reboot
    /proc/12304/smaps:AnonHugePages:      2048 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    root     12304     1  7 17:43 ?        00:00:00 /data/grid/11.2.0.3/grid/bin/ohasd.bin reboot
    /proc/12677/smaps:AnonHugePages:      2048 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    grid     12677     1  2 17:44 ?        00:00:00 /data/grid/11.2.0.3/grid/bin/oraagent.bin
    /proc/12691/smaps:AnonHugePages:      2048 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    grid     12691     1  1 17:44 ?        00:00:00 /data/grid/11.2.0.3/grid/bin/mdnsd.bin
    /proc/12710/smaps:AnonHugePages:      2048 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    grid     12710     1  2 17:44 ?        00:00:00 /data/grid/11.2.0.3/grid/bin/gpnpd.bin
    /proc/12714/smaps:AnonHugePages:     24576 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    oracle   12714 12440 88 17:44 ?        00:00:01 /data/app/oracle/agent12c/core/12.1.0.3.0/jdk/bin/java -Xmx128M -XX:MaxPermSize=96M -server -Djava.security.egd=file:///d
    /proc/12714/smaps:AnonHugePages:     18432 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    oracle   12714 12440 90 17:44 ?        00:00:01 /data/app/oracle/agent12c/core/12.1.0.3.0/jdk/bin/java -Xmx128M -XX:MaxPermSize=96M -server -Djava.security.egd=file:///d
    /proc/12714/smaps:AnonHugePages:      2048 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    oracle   12714 12440 94 17:44 ?        00:00:01 /data/app/oracle/agent12c/core/12.1.0.3.0/jdk/bin/java -Xmx128M -XX:MaxPermSize=96M -server -Djava.security.egd=file:///d
    /proc/12798/smaps:AnonHugePages:      2048 kB
    UID        PID  PPID  C STIME TTY          TIME CMD
    root     12798     1  0 17:44 ?        00:00:00 /data/grid/11.2.0.3/grid/bin/orarootagent.bin
    [root@nodea ~]#
    --- check HugePages and THP

    Code:
    [root@nodea ~]# grep HugePages /proc/meminfo
    AnonHugePages:    126976 kB
    HugePages_Total:   40960
    HugePages_Free:    40960
    HugePages_Rsvd:        0
    HugePages_Surp:        0
    [root@nodea ~]#
    As you can see, we are still using THP.....

    I tried the second option, as per the procedure from Tim Hall's blog... and Oracle Metalink.

    Add the following lines to the "/etc/rc.local" file and reboot the server.

    Code:
    if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
       echo never > /sys/kernel/mm/transparent_hugepage/enabled
    fi
    if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
       echo never > /sys/kernel/mm/transparent_hugepage/defrag
    fi
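    To confirm whether such a setting took effect, read the active mode - the bracketed word - from the "enabled" file. A small parsing sketch with a hypothetical helper name, demonstrated on a temp file with sample sysfs contents (on RHEL 6 the real path is /sys/kernel/mm/redhat_transparent_hugepage/enabled):

```shell
# thp_mode: print the active THP mode, i.e. the value shown in [brackets]
thp_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

tmp=$(mktemp)
echo 'always madvise [never]' > "$tmp"   # sample sysfs contents
thp_mode "$tmp"                          # never
rm -f "$tmp"
```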
    I restarted, and the situation was the same.....

    Why?

    --- after some checking: the run level is 3

    Code:
    [root@nodea ~]# who -r
             run-level 3  2015-07-06 15:16
    [root@nodea ~]#
    
    -- check the ohasd run sequence

    Code:
    [root@nodea ~]# ls -la /etc/rc3.d/ | grep ohas
    lrwxrwxrwx   1 root root   17 Feb  5 12:10 K15ohasd -> /etc/init.d/ohasd
    lrwxrwxrwx   1 root root   17 Feb  5 12:10 S96ohasd -> /etc/init.d/ohasd
    [root@nodea ~]#

    -- check the rc.local run sequence

    Code:
    [root@nodea ~]# ls -la /etc/rc3.d/ | grep local
    lrwxrwxrwx.  1 root root   11 Feb  6  2014 S99local -> ../rc.local
    [root@nodea ~]#
    So this is normal: ohasd (S96) starts prior to rc.local (S99), which means the Clusterware processes have already allocated their anonymous memory by the time rc.local disables THP. (Note also that on RHEL 6 the sysfs path is /sys/kernel/mm/redhat_transparent_hugepage, so the rc.local snippet above likely tests a path that does not even exist on this kernel.) After some googling, no resolution was visible.
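    Init runs a runlevel's S-scripts in lexical order, which is why S96ohasd runs before S99local; a one-liner illustrates the ordering:

```shell
# init executes the S-scripts of a runlevel sorted by name; S96 sorts before S99
printf '%s\n' S99local S96ohasd | sort
```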

    I did the following:

    - created a script, /etc/thp.sh

    Code:
    [root@nodea ~]# cat /etc/thp.sh
    #!/bin/bash

    if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
       echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
    fi

    if test -f /sys/kernel/mm/redhat_transparent_hugepage/defrag; then
       echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
    fi
    [root@nodea ~]#
    - sourced that script from /etc/init.d/ohasd

    Code:
    [root@nodea ~]# cat /etc/init.d/ohasd
    #!/bin/sh
    #
    # Copyright (c) 2001, 2011, Oracle and/or its affiliates. All rights reserved.
    #
    # ohasd.sbs  - Control script for the Oracle HA Services daemon
    # This script is invoked by the rc system.
    #
    # Note:
    #   For security reason,  all cli tools shipped with Clusterware should be
    # executed as HAS_USER in init.ohasd and ohasd rc script for SIHA. (See bug
    # 9216334 for more details)
    #
    
    . /etc/thp.sh
    
    ######### Shell functions #########
    -- after reboot

    Code:
    [root@nodea ~]# grep Huge /proc/meminfo
    AnonHugePages:         0 kB
    HugePages_Total:   40960
    HugePages_Free:    40960
    HugePages_Rsvd:        0
    HugePages_Surp:        0
    Hugepagesize:       2048 kB
    [root@nodea ~]#
    --- check for any process with THP

    Code:
    [root@nodea ~]#  grep -e AnonHugePages  /proc/*/smaps | awk  '{ if($2>4) print $0} ' |  awk -F "/"  '{print $0; system("ps -fp " $3)} '
    [root@nodea ~]#
    -- checking for ASM or DB

    Code:
    [root@nodea ~]# ps -ef | grep pmon
    grid 14774 1 0 20:26 ? 00:00:00 asm_pmon_+ASM2
    root 15203 12258 0 20:27 pts/1 00:00:00 grep pmon
    [root@nodea ~]#
    -- check Oracle RAC cluster services

    Code:
    [root@nodea ~]# crsctl stat res -t -init
    --------------------------------------------------------------------------------
    NAME           TARGET  STATE        SERVER                   STATE_DETAILS
    --------------------------------------------------------------------------------
    Cluster Resources
    --------------------------------------------------------------------------------
    ora.asm
          1        ONLINE  ONLINE       nodea                   Started
    ora.cluster_interconnect.haip
          1        ONLINE  ONLINE       nodea
    ora.crf
          1        ONLINE  ONLINE       nodea
    ora.crsd
          1        ONLINE  ONLINE       nodea
    ora.cssd
          1        ONLINE  ONLINE       nodea
    ora.cssdmonitor
          1        ONLINE  ONLINE       nodea
    ora.ctssd
          1        ONLINE  ONLINE       nodea                   OBSERVER
    ora.diskmon
          1        OFFLINE OFFLINE
    ora.evmd
          1        ONLINE  ONLINE       nodea
    ora.gipcd
          1        ONLINE  ONLINE       nodea
    ora.gpnpd
          1        ONLINE  ONLINE       nodea
    ora.mdnsd
          1        ONLINE  ONLINE       nodea
    [root@nodea ~]#
    --- checking for java processes

    Code:
    [root@sdp13a ~]# ps -ef | grep java |  wc -l
    4
    [root@sdp13a ~]#
    It looks like this workaround works...

    As per Red Hat, there is a bug for this.. -> https://access.redhat.com/solutions/422283

    Have fun