Notes: Ceph: A Scalable, High-Performance Distributed File System

A classic paper on Ceph. Ceph is a very popular storage system today. Unlike HDFS, which mainly targets big-data applications, Ceph aims to be a general-purpose storage solution that supports object storage, block storage, and a file system equally well. Many OpenStack private clouds use Ceph as their storage backend. Ceph was built based on this paper.

Abstract
The abstract states Ceph's mission very clearly:
We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability.
And the key methods and techniques:
Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs).
We leverage device intelligence by distributing data replication, failure detection and recovery to semi-autonomous OSDs running a specialized local object file system.
A dynamic distributed metadata cluster provides extremely efficient metadata management and seamlessly adapts to a wide range of general purpose and scientific computing file system workloads.
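The core of the CRUSH idea quoted above is that any client can compute an object's placement from a hash function instead of looking it up in a central allocation table. The sketch below is not the actual CRUSH algorithm (which also handles weights and hierarchical failure domains); it uses plain rendezvous hashing just to illustrate the table-free placement idea, and the object/OSD names are made up for the example.

```python
import hashlib

def place_object(obj_name: str, osds: list[str], replicas: int = 3) -> list[str]:
    """Rank all OSDs by a per-(object, OSD) hash and take the top
    `replicas` as the placement set. Every client computes the same
    ranking independently, so no allocation table is consulted."""
    def weight(osd: str) -> int:
        h = hashlib.sha256(f"{obj_name}:{osd}".encode()).hexdigest()
        return int(h, 16)
    ranked = sorted(osds, key=weight, reverse=True)
    return ranked[:replicas]

osds = [f"osd.{i}" for i in range(8)]
print(place_object("inode-42/chunk-0", osds))
```

A nice property of this scheme (shared by CRUSH) is stability: adding or removing one OSD only changes the ranking where that OSD appears, so most objects keep their placement.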
Then the performance:
Performance measurements under a variety of workloads show that Ceph has excellent I/O performance and scalable metadata management, supporting more than 250,000 metadata operations per second.

Introduction:
It first discusses the problems with NFS and traditional object-based storage systems.
Then it introduces Ceph:
We present Ceph, a distributed file system that provides excellent performance and reliability while promising unparalleled scalability.
This sentence is key: Our architecture is based on the assumption that systems at the petabyte scale are inherently dynamic: large systems are inevitably built incrementally, node failures are the norm rather than the exception, and the quality and character of workloads are constantly shifting over time.
Ceph's architecture is as follows:

System Overview:
Ceph consists of three parts:
the client, each instance of which exposes a near-POSIX file system interface to a host or process;
a cluster of OSDs, which collectively stores all data and metadata;
a metadata server cluster, which manages the namespace (file names and directories) while coordinating security, consistency and coherence (see Figure 1).
As shown in the figure below:

Key approaches:
Decoupled Data and Metadata
Dynamic Distributed Metadata Management
Reliable Autonomic Distributed Object Storage
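"Decoupled Data and Metadata" can be made concrete with a hypothetical read path: the metadata cluster only resolves names to inodes and striping info, while the client computes the data location itself and talks to the OSDs directly, so file data never flows through the metadata servers. The table, path, and hash-based `locate` below are all invented for illustration; real Ceph uses CRUSH and full cluster maps instead.

```python
import hashlib

# Hypothetical, tiny in-memory stand-ins for the MDS cluster and OSD list.
MDS_TABLE = {"/home/a/report.txt": {"ino": 42, "chunks": 2}}
OSDS = [f"osd.{i}" for i in range(8)]

def mds_lookup(path: str) -> dict:
    # Metadata cluster: namespace only -- no per-block allocation lists.
    return MDS_TABLE[path]

def locate(ino: int, chunk: int) -> str:
    # Client-side placement: hash (inode, chunk) onto an OSD, CRUSH-style.
    h = hashlib.sha256(f"{ino}:{chunk}".encode()).hexdigest()
    return OSDS[int(h, 16) % len(OSDS)]

meta = mds_lookup("/home/a/report.txt")
targets = [locate(meta["ino"], c) for c in range(meta["chunks"])]
print(targets)  # the client now reads these chunks from the OSDs directly
```

The point of the split is that the MDS answer is tiny and cacheable, while the bulk data traffic goes straight between clients and OSDs.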

The remaining chapters describe the implementation of each part in detail. There are no particularly deep formulas or theory, so most readers can follow along easily; it's quite an interesting read.
Link to the paper:
http://www.ece.eng.wayne.edu/~sjiang/ECE7650-winter-15/topic5B-S.pdf
If the download fails, try searching for it on Baidu Scholar.

Date: 2024-10-30 07:13:52
