Distributed Message System

http://dongxicheng.org/search-engine/log-systems/

Covers Facebook's Scribe, Apache's Chukwa, LinkedIn's Kafka, and Cloudera's Flume.

 

Kafka

http://www.cnblogs.com/fxjwind/archive/2013/03/22/2975573.html

http://www.cnblogs.com/fxjwind/archive/2013/03/19/2969655.html 

 

Flume

Flume User Guide, http://archive.cloudera.com/cdh/3/flume/UserGuide/index.html

1.1. Architecture

Flume’s architecture is simple, robust, and flexible.

The graph above shows a typical deployment of Flume that collects log data from a set of application servers. The deployment consists of a number of logical nodes, arranged into three tiers. The first tier is the agent tier. Agent nodes are typically installed on the machines that generate the logs and are your data’s initial point of contact with Flume. They forward data to the next tier of collector nodes, which aggregate the separate data flows and forward them to the final storage tier.

 

Logical nodes are a very flexible abstraction. Every logical node has just two components - a source and a sink.

Both source and sink can additionally be configured with decorators which perform some simple processing on data as it passes through.

The source tells a logical node where to collect data.

The sink tells it where to send the data.

The only difference between any two logical nodes is how their sources and sinks are configured.

The source, sink, and optional decorators are a powerful set of primitives.
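In Flume's dataflow specification language, a logical node is written as `name : source | sink ;`, and a decorator wraps a sink as `{ decorator => sink }`. A minimal sketch (the node name, file path, collector host, and port are assumptions for illustration):

```
agentA : tail("/var/log/app.log") | { batch(100) => agentSink("collector01", 35853) } ;
```

Here tail is the source, agentSink is the sink, and batch(100) is a decorator that groups events into batches of 100 as they pass through.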

 

Logical and Physical Nodes

It’s important to make the distinction between logical nodes and physical nodes. A physical node corresponds to a single Java process running on one machine in a single JVM instance. Usually there is just one physical node per machine.

Physical nodes act as containers for logical nodes, which are wired together to form data flows. Each physical node can play host to many logical nodes, and takes care of arbitrating the assignment of machine resources between them.

So, although the agents and the collectors in the preceding example are logically separate processes, they could be running on the same physical node.

The Master assigns a configuration to each logical node at run-time - all components of a node’s configuration are instantiated dynamically at run-time, and therefore configurations can be changed many times throughout the lifetime of a Flume service without having to restart any Java processes or log into the machines themselves. In fact, logical nodes themselves can be created and deleted dynamically.

 

1.2. Reliability

Flume can guarantee that all data received by an agent node will eventually make it to the collector at the end of its flow as long as the agent node keeps running. That is, data can be reliably delivered to its eventual destination.

In this respect Flume seems to do better than Kafka, and it is configurable: there are several levels, and users can pick whichever they need:

However, reliable delivery can be very resource intensive and is often a stronger guarantee than some data sources require. Therefore, Flume allows the user to specify, on a per-flow basis, the level of reliability required. There are three supported reliability levels:

The end-to-end reliability level is the strongest of the three.

The first thing the agent does in this setting is write the event to disk in a 'write-ahead log' (WAL) so that, if the agent crashes and restarts, knowledge of the event is not lost.

After the event has successfully made its way to the end of its flow, an acknowledgment is sent back to the originating agent so that it knows it no longer needs to store the event on disk.

This reliability level can withstand any number of failures downstream of the initial agent.

 

The store-on-failure reliability level requires only an acknowledgement from the node one hop downstream.

If the sending node detects a failure, it will store data on its local disk until the downstream node is repaired, or an alternate downstream destination can be selected.

Data can be lost if a compound or silent failure occurs.

 

The best-effort reliability level sends data to the next hop with no attempts to confirm or retry delivery. If nodes fail, any data that they were in the process of transmitting or receiving can be lost. This is the weakest reliability level, but also the most lightweight.
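In the dataflow language, the three levels correspond to three variants of the agent sink. A sketch, assuming a collector named collector01 listening on the default port 35853:

```
agent1 : tail("/var/log/app.log") | agentE2ESink("collector01", 35853) ;
agent2 : tail("/var/log/app.log") | agentDFOSink("collector01", 35853) ;
agent3 : tail("/var/log/app.log") | agentBESink("collector01", 35853) ;
```

agentE2ESink writes to the WAL and waits for end-to-end acknowledgements; agentDFOSink ("DFO" for disk failover) stores events to local disk when the downstream hop fails; agentBESink is best effort.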

 

1.3. Scalability

Scalability is the ability to increase system performance linearly - or better - by adding more resources to the system. Flume’s goal is horizontal scalability — the ability to incrementally add more machines to the system to increase throughput.

 

1.4. Manageability

Manageability is the ability to control data flows, monitor nodes, modify settings, and control outputs of a large system.

The Flume Master is the point where global state such as the data flows can be managed, by a web interface or the scriptable Flume command shell.

Via the Flume Master, users can monitor flows and respond on the fly to events such as load imbalances, partial failures, or newly provisioned hardware.

You can dynamically reconfigure nodes via the Flume Master, using small scripts written in a flexible dataflow specification language and submitted through the Master's interface.

 

1.5. Extensibility

Extensibility is the ability to add new functionality to a system. For example, you can extend Flume by adding connectors to existing storage layers or data platforms.

Some general sources include files from the file system, syslog and syslog-ng emulation, or the standard output of a process. More specific sources such as IRC channels and Twitter streams can also be added.

Similarly, there are many output destinations for events. Although HDFS is the primary output destination, events can be sent to local files, or to monitoring and alerting applications such as Ganglia or communication channels such as IRC.
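For example, a syslog-style flow from a network source into HDFS might be sketched like this (the UDP port, namenode host, and paths are assumptions):

```
logger : syslogUdp(5140) | collectorSink("hdfs://namenode/flume/syslog", "syslog-") ;
```

collectorSink takes an output directory followed by a file prefix for the files it writes there.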

 

3. Pseudo-distributed Mode

Flume is intended to be run as a distributed system with processes spread out across many machines. It can also be run as several processes on a single machine, which is called “pseudo-distributed” mode.

3.1. Starting Pseudo-distributed Flume Daemons

There are two kinds of processes in the system: the Flume master and the Flume node.

The Flume Master is the central management point and controls the data flows of the nodes. It is the single logical entity that holds global state data and controls the Flume node data flows and monitors Flume nodes.

Flume nodes serve as the data path for streams of events. They can be the sources, conduits, and consumers of event data. The nodes periodically contact the Master to transmit a heartbeat and to get their data flow configuration.

3.1.1. The Master

The Master can be manually started by executing the following command:

$ flume master

After the Master is started, you can access it by pointing a web browser to http://localhost:35871/. This web page displays the status of all Flume nodes that have contacted the Master, and shows each node’s currently assigned configuration. When you start this up without Flume nodes running, the status and configuration tables will be empty.

Nice: it provides web-UI-based monitoring...

3.1.2. The Flume Node

To start a Flume node, invoke the following command in another terminal.

$ flume node_nowatch

To check whether a Flume node is up, point your browser to the Flume Node status page at http://localhost:35862/.

3.2. Configuring a Node via the Master

Requiring nodes to contact the Master to get their configuration enables you to dynamically change the configuration of nodes without having to log in to the remote machine to restart the daemon. You can quickly change the node’s previous data flow configuration to a new one.

The following describes how to "wire" nodes using the Master’s web interface.

On the Master’s web page, click on the config link. You are presented with two forms. These are web interfaces for setting the node’s data flows. When Flume nodes contact the Master, they will notice that the data flow version has changed, instantiate, and activate the configuration.

This is really convenient: open the web UI and you can configure each node's name, source, and sink at will. On its next heartbeat, the node automatically picks up its new configuration.

If you enter:

Node name:  host

Source:  text("/etc/services")

Sink:  console("avrojson")

The contents of the file are displayed to the console, with each record rendered as JSON.
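The same configuration can also be submitted from the scriptable Flume command shell instead of the web form. A sketch using the shell's exec config command (the exact shell syntax may vary between Flume versions):

```
$ flume shell -c localhost
exec config host 'text("/etc/services")' 'console("avrojson")'
```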

 

3.5. Tiering Flume Nodes: Agents and Collectors

A simple network connection is abstractly just another sink. It would be great if sending events over the network was easy, efficient, and reliable. In reality, collecting data from a distributed set of machines and relying on networking connectivity greatly increases the likelihood and kinds of failures that can occur. The bottom line is that providing reliability guarantees introduces complexity and many tradeoffs.

Why tier Flume nodes at all? Why not have them read and write directly to the storage layer; why split the nodes into agents and collectors?

First, the machines agents run on are often unstable and can fail in many ways; second, pre-processing and aggregating the data before it reaches storage should be more efficient.

4. Fully-distributed Mode

Steps to Deploy Flume On a Cluster

  • Install Flume on each machine.
  • Select one or more nodes to be the Master.
  • Modify a static configuration file to use site specific properties.
  • Start the Flume Master node on at least one machine.
  • Start a Flume node on each machine.

4.2. Multiple Collectors

4.2.1. Partitioning Agents across Multiple Collectors

The preceding graph and dataflow spec show a typical topology for Flume nodes. For reliable delivery, in the event that the collector stops operating or disconnects from the agents, the agents would need to store their events to their respective disks locally. The agents would then periodically attempt to recontact a collector. Because the collector is down, any analysis or processing downstream is blocked.

When a collector fails, agents can buffer data on local disk and resume sending once the collector recovers.

That seems a bit naive, though: there are other collectors, so when one fails the agents should simply use another. This can be arranged by hand, by manually specifying failover chains.

Automatic adjustment would of course be better, but it does not currently work when using multiple masters.
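Manually specified failover chains are expressed directly in the sink: the agent tries each listed collector in order. A sketch with two assumed collectors:

```
agentA : tail("/var/log/app.log") | agentE2EChain("collectorA:35853", "collectorB:35853") ;
```

If collectorA becomes unreachable, the agent fails over to collectorB. The auto*Chain variants let the Master compute the chain automatically, but that is the automatic mode that does not currently work with multiple masters.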

 

4.4. Multiple Masters

The Master has two main jobs to perform. The first is to keep track of all the nodes in a Flume deployment and to keep them informed of any changes to their configuration. The second is to track acknowledgements from the end of a Flume flow that is operating in reliable mode so that the source at the top of that flow knows when to stop transmitting an event.

This is clearly a single point of failure... if the Master goes down, everything goes down.

4.4.3. Running in Distributed Mode

Running the Flume Master in distributed mode provides better fault tolerance than in standalone mode, and scalability for hundreds of nodes.

Configuring machines to run as part of a distributed Flume Master is nearly as simple as standalone mode. As before, flume.master.servers needs to be set, this time to a list of machines:

<property>
  <name>flume.master.servers</name>
  <value>masterA,masterB,masterC</value>
</property>
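In addition, each machine in a distributed master needs to know which entry in that list it is, via flume.master.serverid (a 0-based index into flume.master.servers); a sketch for masterA:

```
<property>
  <name>flume.master.serverid</name>
  <value>0</value>
</property>
```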

How many machines do I need? The distributed Flume Master will continue to work correctly as long as more than half the physical machines running it are still working and haven’t crashed. Therefore if you want to survive one fault, you need three machines (with three machines, one failure leaves two alive, and 2 > 3/2).

Why must more than half survive? Even with only one master left, couldn't it keep running as if in standalone mode? I don't understand this yet...
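The usual answer to that question is split brain: if a network partition splits the masters and any subset were allowed to keep running, two sides could both accept configuration writes and diverge. Requiring a strict majority means at most one partition can ever proceed, which is why an ensemble of n masters tolerates only floor((n-1)/2) failures. A tiny sketch of the arithmetic:

```python
# Sanity check of the majority rule: an ensemble of n masters keeps
# working only while the surviving masters form a strict majority.
# A strict majority guarantees at most one network partition can
# proceed, which rules out split brain.

def min_ensemble_size(faults: int) -> int:
    """Smallest ensemble that survives the given number of crashed masters."""
    return 2 * faults + 1

def survives(n: int, crashed: int) -> bool:
    """True if the remaining masters still form a strict majority of n."""
    return (n - crashed) > n // 2

print(min_ensemble_size(1))  # 3: three machines are needed to survive one fault
print(survives(3, 1))        # True: 2 of 3 is a majority
print(survives(3, 2))        # False: 1 of 3 is not
```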

Each Master process will initially try and contact all other nodes in the ensemble. Until more than half (in this case, two) nodes are alive and contactable, the configuration store will be unable to start, and the Flume Master will not be able to read or write configuration data.

It refuses to work without a majority...

 

4.4.4. Configuration Stores

The Flume Master stores all its data in a configuration store. Flume has a pluggable configuration store architecture, and supports two implementations.

  • The Memory-Backed Config Store (MBCS) stores configurations temporarily in memory. If the master node fails and reboots, all the configuration data will be lost. The MBCS is incompatible with distributed masters. However, it is very easy to administer, computationally lightweight, and good for testing and experimentation.
  • The ZooKeeper-Backed Config Store (ZBCS) stores configurations persistently and takes care of synchronizing them between multiple masters.

Flume and Apache ZooKeeper . Flume relies on the Apache ZooKeeper coordination platform to provide reliable, consistent, and persistent storage for node configuration data. A ZooKeeper ensemble is made up of two or more nodes which communicate regularly with each other to make sure each is up to date. Flume embeds a ZooKeeper server inside the Master process, so starting and maintaining the service is taken care of. However, if you have an existing ZooKeeper service running, Flume supports using that external cluster as well.

Still relies on ZooKeeper; that thing really is too useful...

This article is excerpted from cnblogs (博客园); originally published 2012-03-17.
