ganglia metric extended by mod

ganglia 对监控数据(metric)的采样是模块化方式进行的, 例如 :

自带的监控项其实是放在一些C写的动态链接库来加载并通过gmond.conf配置来调度的.

[root@db-172-16-3-221 ganglia]# cd /opt/ganglia-core-3.6.0/
[root@db-172-16-3-221 ganglia-core-3.6.0]# ll
total 24
drwxr-xr-x 2 root root 4096 Sep  9 11:03 bin
drwxr-xr-x 3 root root 4096 Sep 15 09:31 etc
drwxr-xr-x 2 root root 4096 Sep  9 11:03 include
drwxr-xr-x 3 root root 4096 Sep  9 11:03 lib64
drwxr-xr-x 2 root root 4096 Sep  9 11:03 sbin
drwxr-xr-x 3 root root 4096 Sep  9 11:03 share
[root@db-172-16-3-221 ganglia-core-3.6.0]# cd lib64/
[root@db-172-16-3-221 lib64]# ll
total 660
drwxr-xr-x 2 root root   4096 Sep  9 11:03 ganglia
lrwxrwxrwx 1 root root     25 Sep  9 11:03 libganglia-3.6.0.so.0 -> libganglia-3.6.0.so.0.0.0
-rwxr-xr-x 1 root root 257554 Sep  9 11:03 libganglia-3.6.0.so.0.0.0
-rw-r--r-- 1 root root 408276 Sep  9 11:03 libganglia.a
-rwxr-xr-x 1 root root   1270 Sep  9 11:03 libganglia.la
lrwxrwxrwx 1 root root     25 Sep  9 11:03 libganglia.so -> libganglia-3.6.0.so.0.0.0
[root@db-172-16-3-221 lib64]# cd ganglia/
[root@db-172-16-3-221 ganglia]# ll
total 700
-rwxr-xr-x 1 root root 86896 Sep  9 11:03 modcpu.so
-rwxr-xr-x 1 root root 84118 Sep  9 11:03 moddisk.so
-rwxr-xr-x 1 root root 17896 Sep  9 11:03 modgstatus.so
-rwxr-xr-x 1 root root 84078 Sep  9 11:03 modload.so
-rwxr-xr-x 1 root root 85832 Sep  9 11:03 modmem.so
-rwxr-xr-x 1 root root 31655 Sep  9 11:03 modmulticpu.so
-rwxr-xr-x 1 root root 84480 Sep  9 11:03 modnet.so
-rwxr-xr-x 1 root root 83862 Sep  9 11:03 modproc.so
-rwxr-xr-x 1 root root 53954 Sep  9 11:03 modpython.so
-rwxr-xr-x 1 root root 85136 Sep  9 11:03 modsys.so

在gmond中加载这些模块, 才可以使用这些模块来采样监控数据(metric).

[root@db-172-16-3-221 ganglia]# cd /opt/ganglia-core-3.6.0/etc/
[root@db-172-16-3-221 etc]# ll
total 24
drwxr-xr-x 2 root root 4096 Sep  9 11:26 conf.d
-rw-r--r-- 1 root root 7973 Sep 10 17:37 gmetad.conf
-rw-r--r-- 1 root root 9157 Sep 15 09:31 gmond.conf
[root@db-172-16-3-221 etc]# less gmond.conf
/* Each metrics module that is referenced by gmond must be specified and
   loaded. If the module has been statically linked with gmond, it does
   not require a load path. However all dynamically loadable modules must
   include a load path. */
modules {
  module {
    name = "core_metrics"
  }
  module {
    name = "cpu_module"
    path = "modcpu.so"
  }
  module {
    name = "disk_module"
    path = "moddisk.so"
  }
  module {
    name = "load_module"
    path = "modload.so"
  }
  module {
    name = "mem_module"
    path = "modmem.so"
  }
  module {
    name = "net_module"
    path = "modnet.so"
  }
  module {
    name = "proc_module"
    path = "modproc.so"
  }
  module {
    name = "sys_module"
    path = "modsys.so"
  }
}

以上模块都是C写的, 为了提高开发效率, ganglia metric 扩展还支持php, python, perl等语言的接口.

在编译ganglia时, 需要指定 :

[root@db-172-16-3-221 etc]# cd /opt/soft_bak/ganglia-3.6.0
[root@db-172-16-3-221 ganglia-3.6.0]# ./configure --help
  --enable-perl           include mod_perl and support for metric modules written in perl
  --enable-php            include mod_php and support for metric modules written in php

  --with-python=PATH      Specify prefix for python or full path to interpreter
  --with-perl=PATH        Specify prefix for perl or full path to interpreter
  --with-php=PATH         Specify prefix for php or full path to php-config

默认是支持python扩展的, 如下, 我们可以看到modpython.so动态链接库.

[root@db-172-16-3-221 ~]# cd /opt/ganglia-core-3.6.0/
[root@db-172-16-3-221 ganglia-core-3.6.0]# cd lib64/
[root@db-172-16-3-221 lib64]# cd ganglia/
[root@db-172-16-3-221 ganglia]# ll
total 700
...
-rwxr-xr-x 1 root root 53954 Sep  9 11:03 modpython.so
...

注意modpython.so, 显然这个模块也是C写的, 这个模块并不是直接用来采样监控数据(metric), 而是用来解释python脚本的, 并且被解释和执行的python脚本要按照一定的规则来定义.

例子可以参考github里面ganglia的gmond_python_modules项目.

https://github.com/ganglia/gmond_python_modules

下面简单的讲一下如何使用python来扩展metric采样.

一. 配置gmond以支持python metric 模块.

首先要写一个配置文件, 配置modpython.so以及存放.py脚本的路径, 存放.py脚本对应的.pyconf配置文件的路径.

[root@db-172-16-3-221 etc]# cd /opt/ganglia-core-3.6.0/etc/conf.d/
[root@db-172-16-3-221 etc]# vi modpython.conf
/*
  params - path to the directory where mod_python
           should look for python metric modules

  the "pyconf" files in the include directory below
  will be scanned for configurations for those modules
*/
modules {
  module {
    name = "python_module"
    path = "modpython.so"
    params = "/opt/ganglia-core-3.6.0/lib64/ganglia/python_modules"   # 存放.py脚本的目录
  }
}

include ("/opt/ganglia-core-3.6.0/etc/conf.d/*.pyconf")     # 存放.pyconf配置文件的目录.

每一个python metric 模块, 都需要2个文件, 一个.py脚本文件, 一个.pyconf配置文件.

例如 :

postgres.py脚本以及对应的postgres.pyconf配置文件.

将modpython.conf这个配置文件加入gmond.conf

[root@db-172-16-3-221 etc]# vi /opt/ganglia-core-3.6.0/etc/gmond.conf
include ("/opt/ganglia-core-3.6.0/etc/conf.d/*.conf")

二. 编写python metric 模块

1. 编写.py脚本

.py脚本需要按照ganglia的协定编写, 必须包含3类函数 :

metric_init(params), metric_handler(name), and metric_cleanup().

metric_init(params) 函数, 有且仅有一个参数params, 这个函数在模块初始化时调用, 主要目的是构造metric的定义字典. (当然, 你也可以在初始化过程执行你想要的动作)

params参数是字典类型, 字典的内容取自.py脚本对应的.pyconf配置文件中module section定义的param/value值. 例如 :

gmond_python_modules / postgresql / conf.d / postgres.pyconf

modules {
   module {
     name = "postgres"
     language = "python"
     param host {
       value = "hostname_goes_here"  # 这里需要填写真实的主机名或IP, 例如127.0.0.1
     }
     param port {
       value = "port_goes_here"    # 这里需要填写真实的postgresql监听端口, 如5432
     }
     param dbname {
       value = "database_goes_here"   # 这里需要填写真实的数据库名, 如postgres
     }
     param username {
       value = "username_goes_here"    # 这里需要填写真实的用户名
     }
     param password {
       value = "password_goes_here"   # 这里需要填写真实的密码
     }
   }
}

以上.pyconf配置文件中定义的param有host, port, dbname, username, password.

那么在metric_init(params)函数中可以这么来用:

# Metric descriptors are initialized here
def metric_init(params):
    HOST = str(params.get('host'))
    PORT = str(params.get('port'))
    DB = str(params.get('dbname'))
    USER = str(params.get('username'))
    PASSWORD = str(params.get('password'))

    global pgdsn
    pgdsn = "dbname=" + DB + " host=" + HOST + " user=" + USER + " port=" + PORT + " password=" + PASSWORD

鉴于以上情形, postgresql python metric module for ganglia 需要访问数据库, 所以最好在数据库本地服务器安装gmond , 并且新建一个超级用户, 配置该超级用户只能从127.0.0.1访问, 其他不允许访问.

例如 :

=> create role digoal superuser login encrypted password 'digoal';
vi $PGDATA/pg_hba.conf
host all digoal 127.0.0.1/32 trust     # 允许本地无密码访问
host all digoal 0.0.0.0/0 reject    #  不允许从任何来源访问

metric 定义字典是什么意思呢? 来看看例子 :

  descriptors = [
        {'name':'Pypg_idle_sessions','units':'Sessions','slope':'both','description':'PG Idle Sessions'},
        {'name':'Pypg_active_sessions','units':'Sessions','slope':'both','description':'PG Active Sessions'},
        {'name':'Pypg_idle_in_transaction_sessions','units':'Sessions','slope':'both','description':'PG Idle In Transaction Sessions'},
        {'name':'Pypg_waiting_sessions','units':'Sessions','slope':'both','description':'PG Waiting Sessions Blocked'},
        {'name':'Pypg_longest_xact','units':'Seconds','slope':'both','description':'PG Longest Transaction in Seconds'},
        {'name':'Pypg_longest_query','units':'Seconds','slope':'both','description':'PG Longest Query in Seconds'},
        {'name':'Pypg_locks_accessexclusive','units':'Locks','slope':'both','description':'PG AccessExclusive Locks read write blocking'},
        {'name':'Pypg_locks_otherexclusive','units':'Locks','slope':'both','description':'PG Exclusive Locks write blocking'},
        {'name':'Pypg_locks_shared','units':'Locks','slope':'both','description':'PG Shared Locks NON blocking'},
        {'name':'Pypg_bgwriter_checkpoints_timed','units':'checkpoints','slope':'positive','description':'PG scheduled checkpoints'},
        {'name':'Pypg_bgwriter_checkpoints_req','units':'checkpoints','slope':'positive','description':'PG unscheduled checkpoints'},
        {'name':'Pypg_bgwriter_checkpoint_write_time','units':'ms','slope':'positive','description':'PG time to write checkpoints to disk'},
        {'name':'Pypg_bgwriter_checkpoint_sync_time','units':'checkpoints','slope':'positive','description':'PG time to sync checkpoints to disk'},
        {'name':'Pypg_bgwriter_buffers_checkpoint','units':'buffers','slope':'positive','description':'PG number of buffers written during checkpoint'},
        {'name':'Pypg_bgwriter_buffers_clean','units':'buffers','slope':'positive','description':'PG number of buffers written by the background writer'},
        {'name':'Pypg_bgwriter_buffers_backend','units':'buffers','slope':'positive','description':'PG number of buffers written directly by a backend'},
        {'name':'Pypg_bgwriter_buffers_alloc','units':'buffers','slope':'positive','description':'PG number of buffers allocated'},
        {'name':'Pypg_transactions','units':'xacts','slope':'positive','description':'PG Transactions'},
        {'name':'Pypg_inserts','units':'tuples','slope':'positive','description':'PG Inserts'},
        {'name':'Pypg_updates','units':'tuples','slope':'positive','description':'PG Updates'},
        {'name':'Pypg_deletes','units':'tuples','slope':'positive','description':'PG Deletes'},
        {'name':'Pypg_reads','units':'tuples','slope':'positive','description':'PG Reads'},
        {'name':'Pypg_blks_diskread','units':'blocks','slope':'positive','description':'PG Blocks Read from Disk'},
        {'name':'Pypg_blks_memread','units':'blocks','slope':'positive','description':'PG Blocks Read from Memory'},
        {'name':'Pypg_tup_seqscan','units':'tuples','slope':'positive','description':'PG Tuples sequentially scanned'},
        {'name':'Pypg_tup_idxfetch','units':'tuples','slope':'positive','description':'PG Tuples fetched from indexes'},
        {'name':'Pypg_hours_since_last_vacuum','units':'hours','slope':'both','description':'PG hours since last vacuum'},
        {'name':'Pypg_hours_since_last_analyze','units':'hours','slope':'both','description':'PG hours since last analyze'}]

    for d in descriptors:
        # Add default values to dictionary, 本例所有metric对应的call_back函数都是同一个, 值类型都是uint
        d.update({'call_back': metric_handler, 'time_max': 90, 'value_type': 'uint', 'format': '%d', 'groups': 'Postgres'})

    return descriptors

metric_init(params)函数返回的metric定义字典结构, 包含了以下基本信息(基本信息必须包含)

(还可以包含其他内容, 如SPOOF要用到的SPOOF_HOST, SPOOF_NAME等, 还有GROUP等(group用于gweb对图形分组)) :

'name': '<name>',                            #Name used in configuration
'call_back': <handler_function>,             #Call back function queried by gmond
'time_max': int(<time_max>),                 #Maximum metric value gathering interval
'value_type': '<data_type>',                 #Data type (string, uint, float, double)
'units': '<label>',                          #Units label
'slope': '<slope_type>',                     #Slope ('zero' constant values, 'both' numeric values)
'format': '<format>',                        #String formatting ('%s', '%u','%f')
'description': '<description>'}              #Free form metric description

以上即每个metric都必须包含的基本信息,

在对metric进行采样时, 会调用call_back指定的函数, call_back函数的参数是metric字典里的 name字段的值.

接下来要说的是metric_handler(name)函数, 在对metric进行采样时, 会调用call_back指定的函数, call_back函数的参数是metric字典里的 name字段的值.

例如 :

metric_handler(Pypg_hours_since_last_analyze)

我们上面的例子中, 每个metric用的都是同一个函数metric_handler :

    for d in descriptors:
        # Add default values to dictionary
        d.update({'call_back': metric_handler, 'time_max': 90, 'value_type': 'uint', 'format': '%d', 'groups': 'Postgres'})

这个函数的定义如下 :

# Metric handler uses dictionary pg_metrics keys to return values from queries based on metric name
def metric_handler(name):
    pg_metrics = pg_metrics_queries()
    return int(pg_metrics[name])

每个metric被采样时都会触发这个函数, 也就是说都会调用pg_metrics_queries(), 所以重复工作比较多.

这个module的作者显然是偷懒了, 应该为每个metric写一个call back函数比较好(或者这个pg_metrics_queries函数加个参数, 判断一下当前是那个metric触发的), 当然, 如果这些查询的效率很高的话, 重复就重复吧.(图省事)

import psycopg2
import psycopg2.extras
import syslog
import functools
import time

# Cache for postgres query values, this prevents opening db connections for each metric_handler callback
class Cache(object):
    def __init__(self, expiry):
        self.expiry = expiry
        self.curr_time = 0
        self.last_time = 0
        self.last_value = None

    def __call__(self, func):
        @functools.wraps(func)
        def deco(*args, **kwds):
            self.curr_time = time.time()
            if self.curr_time - self.last_time > self.expiry:
                self.last_value = func(*args, **kwds)
                self.last_time = self.curr_time
            return self.last_value
        return deco

# Queries update the pg_metrics dict with their values based on cache interval
@Cache(60)
def pg_metrics_queries():
    pg_metrics = {}
    db_conn = psycopg2.connect(pgdsn)
    db_curs = db_conn.cursor()

    # single session state query avoids multiple scans of pg_stat_activity
    # state is a different column name in postgres 9.2, previous versions will have to update this query accordingly
    db_curs.execute(
        'select state, waiting, \
        extract(epoch from current_timestamp - xact_start)::int, \
        extract(epoch from current_timestamp - query_start)::int from pg_stat_activity;')
    results = db_curs.fetchall()
    active = 0
    idle = 0
    idleintxn = 0
    waiting = 0
    active_results = []
    for state, waiting, xact_start_sec, query_start_sec in results:
        if state == 'active':
            active = int(active + 1)
            # build a list of query start times where query is active
            active_results.append(query_start_sec)
        if state == 'idle':
            idle = int(idle + 1)
        if state == 'idle in transaction':
            idleintxn = int(idleintxn + 1)
        if waiting == True:
            waitingtrue = int(waitingtrue + 1)

    # determine longest transaction in seconds
    sorted_by_xact = sorted(results, key=lambda tup: tup[2], reverse=True)
    longest_xact_in_sec = (sorted_by_xact[0])[2]

    # determine longest active query in seconds
    sorted_by_query = sorted(active_results, reverse=True)
    longest_query_in_sec = sorted_by_query[0]

    pg_metrics.update(
        {'Pypg_idle_sessions':idle,
        'Pypg_active_sessions':active,
        'Pypg_waiting_sessions':waiting,
        'Pypg_idle_in_transaction_sessions':idleintxn,
        'Pypg_longest_xact':longest_xact_in_sec,
        'Pypg_longest_query':longest_query_in_sec})

    # locks query
    db_curs.execute('select mode, locktype from pg_locks;')
    results = db_curs.fetchall()
    accessexclusive = 0
    otherexclusive = 0
    shared = 0
    for mode, locktype in results:
        if (mode == 'AccessExclusiveLock' and locktype != 'virtualxid'):
            accessexclusive = int(accessexclusive + 1)
        if (mode != 'AccessExclusiveLock' and locktype != 'virtualxid'):
            if 'Exclusive' in mode:
                otherexclusive = int(otherexclusive + 1)
        if ('Share' in mode and locktype != 'virtualxid'):
            shared = int(shared + 1)
    pg_metrics.update(
        {'Pypg_locks_accessexclusive':accessexclusive,
        'Pypg_locks_otherexclusive':otherexclusive,
        'Pypg_locks_shared':shared})

    # background writer query returns one row that needs to be parsed
    db_curs.execute(
        'select checkpoints_timed, checkpoints_req, checkpoint_write_time, \
        checkpoint_sync_time, buffers_checkpoint, buffers_clean, \
        buffers_backend, buffers_alloc from pg_stat_bgwriter;')
    results = db_curs.fetchall()
    bgwriter_values = results[0]
    checkpoints_timed = int(bgwriter_values[0])
    checkpoints_req = int(bgwriter_values[1])
    checkpoint_write_time = int(bgwriter_values[2])
    checkpoint_sync_time = int(bgwriter_values[3])
    buffers_checkpoint = int(bgwriter_values[4])
    buffers_clean = int(bgwriter_values[5])
    buffers_backend = int(bgwriter_values[6])
    buffers_alloc = int(bgwriter_values[7])
    pg_metrics.update(
        {'Pypg_bgwriter_checkpoints_timed':checkpoints_timed,
        'Pypg_bgwriter_checkpoints_req':checkpoints_req,
        'Pypg_bgwriter_checkpoint_write_time':checkpoint_write_time,
        'Pypg_bgwriter_checkpoint_sync_time':checkpoint_sync_time,
        'Pypg_bgwriter_buffers_checkpoint':buffers_checkpoint,
        'Pypg_bgwriter_buffers_clean':buffers_clean,
        'Pypg_bgwriter_buffers_backend':buffers_backend,
        'Pypg_bgwriter_buffers_alloc':buffers_alloc})

    # database statistics returns one row that needs to be parsed
    db_curs.execute(
        'select (sum(xact_commit) + sum(xact_rollback)), sum(tup_inserted), \
        sum(tup_updated), sum(tup_deleted), (sum(tup_returned) + sum(tup_fetched)), \
        sum(blks_read), sum(blks_hit) from pg_stat_database;')
    results = db_curs.fetchall()
    pg_stat_db_values = results[0]
    transactions = int(pg_stat_db_values[0])
    inserts = int(pg_stat_db_values[1])
    updates = int(pg_stat_db_values[2])
    deletes = int(pg_stat_db_values[3])
    reads = int(pg_stat_db_values[4])
    blksdisk = int(pg_stat_db_values[5])
    blksmem = int(pg_stat_db_values[6])
    pg_metrics.update(
        {'Pypg_transactions':transactions,
        'Pypg_inserts':inserts,
        'Pypg_updates':updates,
        'Pypg_deletes':deletes,
        'Pypg_reads':reads,
        'Pypg_blks_diskread':blksdisk,
        'Pypg_blks_memread':blksmem})

    # table statistics returns one row that needs to be parsed
    db_curs.execute(
        'select sum(seq_tup_read), sum(idx_tup_fetch), \
        extract(epoch from now() - min(last_vacuum))::int/60/60, \
        extract(epoch from now() - min(last_analyze))::int/60/60 \
        from pg_stat_all_tables;')
    results = db_curs.fetchall()
    pg_stat_table_values = results[0]
    seqscan = int(pg_stat_table_values[0])
    idxfetch = int(pg_stat_table_values[1])
    hours_since_vacuum = int(pg_stat_table_values[2])
    hours_since_analyze = int(pg_stat_table_values[3])
    pg_metrics.update(
        {'Pypg_tup_seqscan':seqscan,
        'Pypg_tup_idxfetch':idxfetch,
        'Pypg_hours_since_last_vacuum':hours_since_vacuum,
        'Pypg_hours_since_last_analyze':hours_since_analyze})

    db_curs.close()
    return pg_metrics

以上pg_metrics字典其实已经包含了所有metric name对应的value , 所以在metric对应的call back函数直接返回其值int(pg_metrics[name])即可.

def metric_handler(name):
    pg_metrics = pg_metrics_queries()
    return int(pg_metrics[name])

最后一个必须的函数是metric_cleanup(), 在gmond关闭时调用它. 一般用于关闭打开的文件, 断开网络等.

This function will be called once when gmond is shutting down. The cleanupfunction
should include any code that is required to close files, disconnection from the network,
or  any  other  type  of  clean  up  functionality.  Like  the  metric_init function,  the
cleanupfunction must be called metric_cleanupand must not take any parameters. In
addition, the cleanupfunction must not return a value.

例如 :

# ganglia requires metric cleanup
def metric_cleanup():
    '''Clean up the metric module.'''
    pass

2. 编写.py脚本对应的.pyconf配置文件. 注意这个配置文件的位置需对应在modpython.conf中配置的位置.

文件内容和gmond.conf其实差别不大(除了modules section).

例如postgres.pyconf

modules {
   module {
     name = "postgres"
     language = "python"
     param host {
       value = "hostname_goes_here"
     }
     param port {
       value = "port_goes_here"
     }
     param dbname {
       value = "database_goes_here"
     }
     param username {
       value = "username_goes_here"
     }
     param password {
       value = "password_goes_here"
     }
   }
}

collection_group {
   collect_every = 120
   time_threshold = 120
   metric {
     name = "Pypg_idle_sessions"
     title = "Postgres Idle Sessions"
     value_threshold = 0
   }
   metric {
     name = "Pypg_idle_in_transaction_sessions"
     title = "Postgres Idle In Transaction Sessions"
     value_threshold = 0
   }
   metric {
     name = "Pypg_active_sessions"
     title = "Postgres Active Sessions"
     value_threshold = 0
   }
   metric {
     name = "Pypg_hours_since_last_vacuum"
     title = "Postgres Hours Since Last Vacuum"
     value_threshold = 0
   }
   metric {
     name = "Pypg_hours_since_last_analyze"
     title = "Postgres Hours Since Last Analyze"
     value_threshold = 0
   }
   metric {
     name = "Pypg_waiting_sessions"
     title = "Postgres Waiting Sessions blocked"
     value_threshold = 0
   }
   metric {
     name = "Pypg_locks_accessexclusive"
     title = "Postgres AccessExclusive Locks read write blocking"
     value_threshold = 0
   }
   metric {
     name = "Pypg_locks_otherexclusive"
     title = "Postgres Exclusive Locks write blocking"
     value_threshold = 0
   }
   metric {
     name = "Pypg_locks_shared"
     title = "Postgres Shared Locks NON blocking"
     value_threshold = 0
   }
   metric {
     name = "Pypg_longest_xact"
     title = "Postgres Longest Transaction in Minutes"
     value_threshold = 0
   }
   metric {
     name = "Pypg_longest_query"
     title = "Postgres Longest Active Query in Minutes"
     value_threshold = 0
   }
   metric {
     name = "Pypg_transactions"
     title = "Postgres Transactions"
     value_threshold = 0
   }
   metric {
     name = "Pypg_inserts"
     title = "Postgres Inserts"
     value_threshold = 0
   }
   metric {
     name = "Pypg_updates"
     title = "Postgres Updates"
     value_threshold = 0
   }
   metric {
     name = "Pypg_deletes"
     title = "Postgres Deletes"
     value_threshold = 0
   }
   metric {
     name = "Pypg_reads"
     title = "Postgres Reads"
     value_threshold = 0
   }
   metric {
     name = "Pypg_blks_diskread"
     title = "Postgres Blocks Read From Disk"
     value_threshold = 0
   }
   metric {
     name = "Pypg_blks_memread"
     title = "Postgres Blocks Read From Memory Buffer Cache"
     value_threshold = 0
   }
   metric {
     name = "Pypg_tup_seqscan"
     title = "Postgres Tuples Read From Table Sequence Scans"
     value_threshold = 0
   }
   metric {
     name = "Pypg_tup_idxfetch"
     title = "Postgres Tuples Fetched From Indexes"
     value_threshold = 0
   }
   metric {
     name = "Pypg_bgwriter_checkpoints_timed"
     title = "Postgres Scheduled Checkpoints"
     value_threshold = 0
   }
   metric {
     name = "Pypg_bgwriter_checkpoints_req"
     title = "Postgres Unscheduled Checkpoints"
     value_threshold = 0
   }
   metric {
     name = "Pypg_bgwriter_checkpoint_write_time"
     title = "Postgres Time to Write Checkpoints to Disk"
     value_threshold = 0
   }
   metric {
     name = "Pypg_bgwriter_checkpoint_sync_time"
     title = "Postgres Time to Sync Checkpoints to Disk"
     value_threshold = 0
   }
   metric {
     name = "Pypg_bgwriter_buffers_checkpoint"
     title = "Postgres Buffers Written During Checkpoint"
     value_threshold = 0
   }
   metric {
     name = "Pypg_bgwriter_buffers_clean"
     title = "Postgres Buffers Written By Background Writer"
     value_threshold = 0
   }
   metric {
     name = "Pypg_bgwriter_buffers_backend"
     title = "Postgres Buffers Written Directly By Backend"
     value_threshold = 0
   }
   metric {
     name = "Pypg_bgwriter_buffers_alloc"
     title = "Postgres Number Of Buffers Allocated"
     value_threshold = 0
   }
}

三. 调试 python metric 模块. 这部分内容也写在.py脚本里面. 用以输出metric的值

例如 :

# this code is for debugging and unit testing
if __name__ == '__main__':
    descriptors = metric_init({"host":"hostname_goes_here","port":"port_goes_here","dbname":"database_name_goes_here","username":"username_goes_here","password":"password_goes_here"})
    while True:
        for d in descriptors:
            v = d['call_back'](d['name'])
            print 'value for %s is %u' % (d['name'],  v)
        time.sleep(5)

四. 配置和部署python metric 模块

前面已经讲过了, 需要配置gmond.conf以及.pyconf配置文件.

[参考]
1. http://blog.163.com/digoal@126/blog/static/16387704020148345219914/

2. http://blog.163.com/digoal@126/blog/static/163877040201481835916946/

3. http://blog.163.com/digoal@126/blog/static/163877040201481843417188/

4. https://github.com/ganglia/gmond_python_modules/blob/master/postgresql/conf.d/postgres.pyconf

5. https://github.com/ganglia/gmond_python_modules/blob/master/postgresql/python_modules/postgres.py

6. https://github.com/ganglia/gmond_python_modules

时间： 2024-10-30 07:26:30