PostgreSQL streaming replication: asynchronous xlog send

Author

digoal

Date

2016-11-07

Tags

PostgreSQL, synchronous streaming replication, asynchronous send


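Background

Most readers are probably familiar with PostgreSQL streaming replication. Currently, however, PG treats the primary as the sole authority, and everything is governed by the primary's state. That includes sending WAL: WAL is sent to the standby only after the primary has flushed it.

This does introduce a small delay. For now the delay is negligible, but as hardware evolves this model may stop being a good fit.

So could the primary be allowed to send a WAL record to the standby as soon as it has been passed to write(), or even as soon as it has been inserted into the WAL buffers, so that the WAL send becomes asynchronous with respect to the flush?

Yes, it can. Let's take a look.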

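Source code

The walsender decides how far it may send by calling GetFlushRecPtr(). Changing those calls to use the write position, or the insert position, instead gives an asynchronous send.

Related article: "PostgreSQL xlog positions" (《PostgreSQL xlog的位置》)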
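To make the three candidate positions concrete, here is a minimal standalone sketch of the choice the walsender would have to make. It is not PostgreSQL code: `WalPositions`, `WalSendMode`, `wal_send_limit()` and `wal_extra_sendable()` are hypothetical names invented for illustration, and only the `XLogRecPtr` type mirrors the 64-bit LSN type PostgreSQL uses.

```c
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* 64-bit LSN, as in PostgreSQL */

/* Hypothetical snapshot of the WAL positions on the primary. */
typedef struct WalPositions
{
    XLogRecPtr  insert;         /* end of WAL placed in the WAL buffers */
    XLogRecPtr  write;          /* end of WAL handed to the OS via write() */
    XLogRecPtr  flush;          /* end of WAL known to be fsync'd */
} WalPositions;

/* Hypothetical policy switch: how far may the walsender send? */
typedef enum WalSendMode
{
    SEND_FLUSHED,               /* current behaviour: wait for the flush */
    SEND_WRITTEN,               /* send once write() has been called */
    SEND_INSERTED               /* send once the record is in WAL buffers */
} WalSendMode;

/* Return the upper bound of WAL that may be sent under the given mode. */
static XLogRecPtr
wal_send_limit(const WalPositions *pos, WalSendMode mode)
{
    switch (mode)
    {
        case SEND_WRITTEN:
            return pos->write;
        case SEND_INSERTED:
            return pos->insert;
        case SEND_FLUSHED:
        default:
            return pos->flush;
    }
}

/* Bytes that become sendable earlier than with the flush-only rule. */
static uint64_t
wal_extra_sendable(const WalPositions *pos, WalSendMode mode)
{
    return wal_send_limit(pos, mode) - pos->flush;
}
```

With insert = 3000, write = 2000 and flush = 1000, the write mode exposes 1000 more bytes to the standby than the flush mode does. Note that the source comment in XLogSendPhysical() quoted below says XLogRead() cannot go past what has been written out, so of the two options the write position looks like the smaller change; sending from the insert position would presumably also require reading from the WAL buffers.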

src/backend/replication/walsender.c

/*
 * Wait till WAL < loc is flushed to disk so it can be safely read.
 */
static XLogRecPtr
WalSndWaitForWal(XLogRecPtr loc)
{
        int                     wakeEvents;
        static XLogRecPtr RecentFlushPtr = InvalidXLogRecPtr;

        /*
         * Fast path to avoid acquiring the spinlock in the we already know we
         * have enough WAL available. This is particularly interesting if we're
         * far behind.
         */
        if (RecentFlushPtr != InvalidXLogRecPtr &&
                loc <= RecentFlushPtr)
                return RecentFlushPtr;

        /* Get a more recent flush pointer. */
        if (!RecoveryInProgress())
                RecentFlushPtr = GetFlushRecPtr();  /* get the flushed position */
        else
                RecentFlushPtr = GetXLogReplayRecPtr(NULL);

        for (;;)
        {
                long            sleeptime;
                TimestampTz now;

                /*
                 * Emergency bailout if postmaster has died.  This is to avoid the
                 * necessity for manual cleanup of all postmaster children.
                 */
                if (!PostmasterIsAlive())
                        exit(1);

                /* Clear any already-pending wakeups */
                ResetLatch(MyLatch);

                CHECK_FOR_INTERRUPTS();

                /* Process any requests or signals received recently */
                if (got_SIGHUP)
                {
                        got_SIGHUP = false;
                        ProcessConfigFile(PGC_SIGHUP);
                        SyncRepInitConfig();
                }

                /* Check for input from the client */
                ProcessRepliesIfAny();

                /* Update our idea of the currently flushed position. */
                if (!RecoveryInProgress())
                        RecentFlushPtr = GetFlushRecPtr();  /* get the flushed position */
                else
                        RecentFlushPtr = GetXLogReplayRecPtr(NULL);

                /*
                 * If postmaster asked us to stop, don't wait here anymore. This will
                 * cause the xlogreader to return without reading a full record, which
                 * is the fastest way to reach the mainloop which then can quit.
                 *
                 * It's important to do this check after the recomputation of
                 * RecentFlushPtr, so we can send all remaining data before shutting
                 * down.
                 */
                if (walsender_ready_to_stop)
                        break;

                /*
                 * We only send regular messages to the client for full decoded
                 * transactions, but a synchronous replication and walsender shutdown
                 * possibly are waiting for a later location. So we send pings
                 * containing the flush location every now and then.
                 */
                if (MyWalSnd->flush < sentPtr &&
                        MyWalSnd->write < sentPtr &&
                        !waiting_for_ping_response)
                {
                        WalSndKeepalive(false);
                        waiting_for_ping_response = true;
                }

                /* check whether we're done */
                if (loc <= RecentFlushPtr)
                        break;

                /* Waiting for new WAL. Since we need to wait, we're now caught up. */
                WalSndCaughtUp = true;

                /*
                 * Try to flush pending output to the client. Also wait for the socket
                 * becoming writable, if there's still pending output after an attempt
                 * to flush. Otherwise we might just sit on output data while waiting
                 * for new WAL being generated.
                 */
                if (pq_flush_if_writable() != 0)
                        WalSndShutdown();

                now = GetCurrentTimestamp();

                /* die if timeout was reached */
                WalSndCheckTimeOut(now);

                /* Send keepalive if the time has come */
                WalSndKeepaliveIfNecessary(now);
                sleeptime = WalSndComputeSleeptime(now);

                wakeEvents = WL_LATCH_SET | WL_POSTMASTER_DEATH |
                        WL_SOCKET_READABLE | WL_TIMEOUT;

                if (pq_is_send_pending())
                        wakeEvents |= WL_SOCKET_WRITEABLE;

                /* Sleep until something happens or we time out */
                WaitLatchOrSocket(MyLatch, wakeEvents,
                                                  MyProcPort->sock, sleeptime);
        }

        /* reactivate latch so WalSndLoop knows to continue */
        SetLatch(MyLatch);
        return RecentFlushPtr;
}

static void
XLogSendPhysical(void)
{
......
        /* Figure out how far we can safely send the WAL. */
        if (sendTimeLineIsHistoric)
        {
......
        }
        else if (am_cascading_walsender)
        {
......
        }
        else
        {
                /*
                 * Streaming the current timeline on a master.
                 *
                 * Attempt to send all data that's already been written out and
                 * fsync'd to disk.  We cannot go further than what's been written out
                 * given the current implementation of XLogRead().  And in any case
                 * it's unsafe to send WAL that is not securely down to disk on the
                 * master: if the master subsequently crashes and restarts, slaves
                 * must not have applied any WAL that gets lost on the master.
                 */
                SendRqstPtr = GetFlushRecPtr();
        }
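
The comment above also states the price of an asynchronous send: if the primary crashes, the standby may already hold WAL that the primary lost. The following standalone sketch models only that check. The function names are hypothetical, and it assumes the worst case where nothing beyond the flushed position survives the crash.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* 64-bit LSN, as in PostgreSQL */

/*
 * Assumption for this sketch: after a crash, the primary keeps only the
 * WAL it had fsync'd. Anything merely written or inserted is lost.
 */
static XLogRecPtr
primary_wal_end_after_crash(XLogRecPtr flush_at_crash)
{
    return flush_at_crash;
}

/* True if the standby holds WAL that no longer exists on the primary. */
static bool
standby_is_ahead(XLogRecPtr standby_received, XLogRecPtr flush_at_crash)
{
    return standby_received > primary_wal_end_after_crash(flush_at_crash);
}

/* Number of bytes the standby received that the primary lost. */
static uint64_t
standby_excess_bytes(XLogRecPtr standby_received, XLogRecPtr flush_at_crash)
{
    if (!standby_is_ahead(standby_received, flush_at_crash))
        return 0;
    return standby_received - primary_wal_end_after_crash(flush_at_crash);
}
```

With the flush-only rule the standby can never be ahead, because it is never sent anything past the flush position. Once the send limit moves to the write or insert position, a check of this kind becomes necessary after a primary crash, and a standby that is ahead can no longer simply continue streaming from the restarted primary.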

src/backend/access/transam/xlog.c

/*
 * Return the current Redo pointer from shared memory.
 *
 * As a side-effect, the local RedoRecPtr copy is updated.
 */
XLogRecPtr
GetRedoRecPtr(void)
{
    /* use volatile pointer to prevent code rearrangement */
    volatile XLogCtlData *xlogctl = XLogCtl;
    XLogRecPtr  ptr;

    /*
     * The possibly not up-to-date copy in XlogCtl is enough. Even if we
     * grabbed a WAL insertion lock to read the master copy, someone might
     * update it just after we've released the lock.
     */
    SpinLockAcquire(&xlogctl->info_lck);
    ptr = xlogctl->RedoRecPtr;
    SpinLockRelease(&xlogctl->info_lck);

    if (RedoRecPtr < ptr)
        RedoRecPtr = ptr;

    return RedoRecPtr;
}

/*
 * GetInsertRecPtr -- Returns the current insert position.
 *
 * NOTE: The value *actually* returned is the position of the last full
 * xlog page. It lags behind the real insert position by at most 1 page.
 * For that, we don't need to scan through WAL insertion locks, and an
 * approximation is enough for the current usage of this function.
 */
XLogRecPtr
GetInsertRecPtr(void)
{
    /* use volatile pointer to prevent code rearrangement */
    volatile XLogCtlData *xlogctl = XLogCtl;
    XLogRecPtr  recptr;

    SpinLockAcquire(&xlogctl->info_lck);
    recptr = xlogctl->LogwrtRqst.Write;
    SpinLockRelease(&xlogctl->info_lck);

    return recptr;
}

/*
 * GetFlushRecPtr -- Returns the current flush position, ie, the last WAL
 * position known to be fsync'd to disk.
 */
XLogRecPtr
GetFlushRecPtr(void)
{
    /* use volatile pointer to prevent code rearrangement */
    volatile XLogCtlData *xlogctl = XLogCtl;
    XLogRecPtr  recptr;

    SpinLockAcquire(&xlogctl->info_lck);
    recptr = xlogctl->LogwrtResult.Flush;
    SpinLockRelease(&xlogctl->info_lck);

    return recptr;
}
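
The getters above all follow the same pattern: take info_lck, copy one field, release the lock. Note that the quoted GetInsertRecPtr() returns LogwrtRqst.Write, which according to its own comment is only an approximation of the insert position. A getter for the write position would follow the same pattern on the write field of LogwrtResult. The toy model below imitates that pattern outside PostgreSQL. `ToyXLogCtl` and the `toy_*` functions are hypothetical, and a C11 `atomic_flag` stands in for the real spinlock.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* 64-bit LSN, as in PostgreSQL */

/* Toy stand-in for the shared WAL control data. */
typedef struct ToyXLogCtl
{
    atomic_flag info_lck;       /* plays the role of info_lck */
    XLogRecPtr  write_result;   /* analogous to LogwrtResult.Write */
    XLogRecPtr  flush_result;   /* analogous to LogwrtResult.Flush */
} ToyXLogCtl;

static void
toy_lock(ToyXLogCtl *ctl)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&ctl->info_lck,
                                             memory_order_acquire))
        ;
}

static void
toy_unlock(ToyXLogCtl *ctl)
{
    atomic_flag_clear_explicit(&ctl->info_lck, memory_order_release);
}

/* Writer side: publish new write and flush positions together. */
static void
toy_advance(ToyXLogCtl *ctl, XLogRecPtr write, XLogRecPtr flush)
{
    toy_lock(ctl);
    ctl->write_result = write;
    ctl->flush_result = flush;
    toy_unlock(ctl);
}

/* Reader side: same shape as GetFlushRecPtr(), but for the write position. */
static XLogRecPtr
toy_get_write_rec_ptr(ToyXLogCtl *ctl)
{
    XLogRecPtr  recptr;

    toy_lock(ctl);
    recptr = ctl->write_result;
    toy_unlock(ctl);
    return recptr;
}

/* Reader side: the flush position, for comparison. */
static XLogRecPtr
toy_get_flush_rec_ptr(ToyXLogCtl *ctl)
{
    XLogRecPtr  recptr;

    toy_lock(ctl);
    recptr = ctl->flush_result;
    toy_unlock(ctl);
    return recptr;
}
```

A caller would initialize the structure with `ToyXLogCtl ctl = { ATOMIC_FLAG_INIT, 0, 0 };` and then read whichever position its send policy needs. Since WAL is written before it is flushed, the write position is never behind the flush position, which is where the latency gain of an asynchronous send would come from.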