Speeding up Migration on ApsaraDB for Redis

Abstract: Redis supports the MIGRATE command to transfer a key from a source instance to a destination instance. During migration, the source generates a serialized version of the key's value with the DUMP command, and the target node then executes the RESTORE command to load the data into memory. In this article, we migrate an 800 MB key of the ZSET type and compare migration performance on native Redis with that on an optimized build.

Redis supports the MIGRATE command to transfer a key from a source instance to a destination instance. During migration, the source generates a serialized version of the key's value with the DUMP command, and the target node then executes the RESTORE command to load the data into memory. In this article, we migrate an 800 MB key of the ZSET type and compare migration performance on native Redis with that on an optimized build. The test environment consists of two Redis databases on a local development machine, so the impact of the network can be ignored. Under these conditions, executing RESTORE takes 163 seconds on native Redis but only 27 seconds on the optimized version. This analysis was performed using Alibaba Cloud ApsaraDB for Redis.

1. Native Redis RESTORE performance bottleneck

Profiling the RESTORE process shows the following CPU usage:

Reading the source code, we can see that MIGRATE traverses the ZSET's hash table, serializes each member and its score, and sends the packaged payload to the target node.

The target node then deserializes the data and rebuilds the ZSET structure, running zslInsert and dictAdd for every member. This process is time-consuming, and the rebuilding cost grows with the data size.

2. Method of optimization

From this analysis, we can see that the bottleneck is rebuilding the data model. To optimize the process, we can serialize the source node's data model as a whole, structure included, and send it to the target node. The target node parses the data, pre-allocates the in-memory structures, and then inserts the parsed members directly.

Because ZSET is a fairly complicated data structure in Redis, we will briefly introduce the concepts used in ZSET.

2.1 ZSET data structure

ZSET consists of two data structures: a hash table, which maps each member to its score, and a skip list, in which all members are kept in sorted order, as shown in the figure below:
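As a rough illustration of this dual-index layout (a toy Python model, not Redis's actual C implementation), a dict gives O(1) score lookups while a sorted list stands in for the skip list's ordered view:

```python
import bisect

class MiniZSet:
    """Toy model of a Redis ZSET: a dict for O(1) score lookup,
    plus a sorted list standing in for the skip list."""

    def __init__(self):
        self.dict = {}      # member -> score (the hash table)
        self.sorted = []    # [(score, member)] kept ordered (the "skip list")

    def zadd(self, score, member):
        if member in self.dict:
            # Updating a score means repositioning the member in the order.
            self.sorted.remove((self.dict[member], member))
        self.dict[member] = score
        bisect.insort(self.sorted, (score, member))

    def zscore(self, member):
        return self.dict.get(member)

    def zrange(self, start, stop):
        # Inclusive stop index, like Redis ZRANGE.
        return [m for _, m in self.sorted[start:stop + 1]]
```

Note that both structures reference every member: this is exactly why a naive serialization would describe the same data twice, as discussed below.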

2.2 Serialize the ZSET structure model

In Redis, the ZSET's dict and its skip list (zsl) share the same memory for member strings and scores. Describing the same copy of data twice in the serialization, once for each index, would therefore be costly.

2.2.1 Serialize the dict model

Looking at CPU resource consumption, we can see that the hash table part spends most of its time calculating bucket indexes, rehashing, and comparing keys. (Rehashing occurs when the pre-allocated hash table is too small and all entries must be moved to a larger table; key comparison happens while traversing a bucket's list to determine whether a key already exists.)

Based on this, the final hash table size is specified during serialization, removing the need to rehash while building the dict when RESTORE executes.

To restore the dict, we still need to deserialize each member and score, recalculate the member's bucket index, and insert it into the bucket at that index. But because the members come from a zsl traversal and therefore contain no duplicate keys, members that hash to the same bucket can be appended to the list directly, eliminating key comparison.
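The pre-sizing idea can be sketched as follows (a minimal Python model with illustrative function names, not Redis's actual serialization code): the source emits the final table size, the next power of two at or above the member count, along with each member's precomputed bucket index, so the restorer allocates the table once and appends without rehashing or key comparison:

```python
def next_power_of_two(n):
    size = 1
    while size < n:
        size <<= 1
    return size

def serialize_dict(members):
    """Source side: emit the final table size plus (bucket, member, score)
    triples, so the restorer never needs to rehash or compare keys."""
    size = next_power_of_two(len(members))
    mask = size - 1
    entries = [(hash(m) & mask, m, s) for m, s in members.items()]
    return size, entries

def restore_dict(size, entries):
    """Target side: allocate the full-size table once, then append each
    entry to its precomputed bucket -- no rehash, no key comparison."""
    table = [[] for _ in range(size)]
    for bucket, member, score in entries:
        table[bucket].append((member, score))
    return table
```

The restore loop is a straight append per entry, which is what makes the optimized RESTORE close to linear in the number of members.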

2.2.2 Serialize zsl model

Zsl has a multi-layer structure as shown in the figure below.

The difficulty of the description lies in the fact that the number of zskiplistNodes on each level is not known in advance. We also need to describe each node's context on every level while remaining compatible with the existing format.

Based on the above considerations, we decided to traverse from the highest level of zsl, and the serialized format is:
level | header span | level_len | [ span ( | member | score ... ) ]

Item            Description
level           The level of the skip list (its highest layer)
header span     The span value of the header node on this level
level_len       Total number of nodes on this level
span            The span value of the node on this level
member | score  Because redundant nodes may exist above level 0, the span values can be accumulated to determine whether the node at this position has already been described. If it has, member | score is not serialized; otherwise it is included. Deserialization follows the same principle.
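The format above can be modeled in Python (a simplified sketch, not Redis's actual wire format; the node-height representation and span bookkeeping are illustrative). Serialization walks the levels from highest to lowest and emits member | score only for nodes not yet described; deserialization accumulates spans to recover each node's rank:

```python
def serialize_zsl(nodes):
    """nodes: list of (member, score, height), sorted by score.
    Emits, per level from highest to lowest:
      (level, header_span, level_len, [(span, member_or_None, score_or_None)])
    Spans here are gaps in level-0 rank; member | score appears only the
    first time a rank is seen, mirroring the redundant-node rule above."""
    max_level = max(h for _, _, h in nodes)
    out, seen = [], set()
    for level in range(max_level - 1, -1, -1):
        ranks = [r for r, (_, _, h) in enumerate(nodes) if h > level]
        header_span = ranks[0]
        items, prev = [], ranks[0]
        for r in ranks:
            span = r - prev
            if r in seen:
                items.append((span, None, None))   # redundant node: span only
            else:
                seen.add(r)
                m, s, _ = nodes[r]
                items.append((span, m, s))
            prev = r
        out.append((level, header_span, len(ranks), items))
    return out

def deserialize_zsl(payload):
    """Rebuild the rank -> (member, score) mapping by accumulating spans."""
    members = {}
    for level, header_span, level_len, items in payload:
        rank = header_span
        for span, m, s in items:
            rank += span
            if m is not None:
                members[rank] = (m, s)
    return [members[r] for r in sorted(members)]
```

Because every node appears on level 0, each member is guaranteed to be emitted exactly once across all levels, which is where the bandwidth saving over a naive per-level dump comes from.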

Conclusion

By now, the description of the ZSET data model is complete, and RESTORE runs much faster. This optimization, however, trades bandwidth for speed: the extra bytes come from the fields that describe the skip-list nodes. After optimization, the serialized data is about 20 MB larger than the 800 MB before optimization, an overhead of roughly 2.5%.

ApsaraDB for Redis is a stable, reliable, and scalable database service with superb performance. It is structured on the Apsara Distributed File System and full SSD high-performance storage, and supports master-slave and cluster-based high-availability architectures. ApsaraDB for Redis offers a full range of database solutions including disaster switchover, failover, online expansion, and performance optimization. Try ApsaraDB for Redis today!

