Live long and process (#LLAP)

sershe, sseth, hagleitn, 2014-08-27.

Overview

Hive has become significantly faster thanks to various features and improvements 
that were built by the community over the past two years. Keeping the 
momentum, here are some examples of what we think will take us to the next 
level: asynchronous spindle-aware IO, pre-fetching and caching of column chunks, 
and multi-threaded JIT-friendly operator pipelines.


In order to achieve this we are proposing a hybrid execution model which consists 
of a long-lived daemon replacing direct interactions with the HDFS DataNode and a 
tightly integrated DAG-based framework. Functionality such as caching, 
pre-fetching, some query processing and access control will move into the 
daemon. Small/short queries can be largely processed by this daemon directly, 
while any heavy lifting will be performed in standard YARN containers.

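The routing decision in this hybrid model can be sketched as follows. This is a minimal illustration only: the class names, the `estimatedCost` input, and the threshold are assumptions for the sketch, not actual Hive/LLAP APIs.

```java
// Sketch of the hybrid routing decision: small fragments go to the
// long-lived daemon, heavy ones to standard YARN containers.
// All names here (route, Target, the cost threshold) are illustrative
// assumptions, not part of the actual design.
public class HybridRouter {
    enum Target { LLAP_DAEMON, YARN_CONTAINER }

    static final long SMALL_QUERY_COST = 1_000_000L; // assumed threshold

    static Target route(long estimatedCost, boolean daemonAvailable) {
        // Daemons are optional: fall back to containers when absent.
        if (!daemonAvailable) return Target.YARN_CONTAINER;
        // Small/short work runs directly in the daemon; heavy lifting
        // (e.g., a large shuffle stage) goes to its own container.
        return estimatedCost <= SMALL_QUERY_COST
                ? Target.LLAP_DAEMON : Target.YARN_CONTAINER;
    }

    public static void main(String[] args) {
        System.out.println(route(500, true));          // LLAP_DAEMON
        System.out.println(route(500, false));         // YARN_CONTAINER
        System.out.println(route(10_000_000L, true));  // YARN_CONTAINER
    }
}
```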

Similar to the DataNode, LLAP daemons can be used by other applications as well, 
especially if a relational view on the data is preferred over file-centric processing. 
We’re thus planning to open the daemon up through optional APIs (e.g.: 
InputFormat) that can be leveraged by other data processing frameworks as a 
building block. 

Last, but not least, fine-grained column-level access control -- a key requirement 
for mainstream adoption of Hive -- fits nicely into this model. 

 

Example of execution with #LLAP. The Tez AM orchestrates overall
execution. The initial stage of the query is pushed into #LLAP; large
shuffles are performed in their own containers. Multiple queries
and applications can access #LLAP concurrently.

Persistent daemon

To facilitate caching, JIT optimization and to eliminate most of the startup costs, 
we will run a daemon on the worker nodes on the cluster. The daemon will handle 
I/O, caching, and query fragment execution. 

● These nodes will be stateless. Any request to an #LLAP node will contain 
the data location and metadata. It will process local and remote locations; 
locality will be the caller’s responsibility (YARN).

● Recovery/resiliency. Failure and recovery is simplified because any data 
node can still be used to process any fragment of the input data. The Tez 
AM can thus simply rerun failed fragments on the cluster. 

 

● Communication between nodes. #LLAP nodes will be able to share data 
(e.g., fetching partitions, broadcasting fragments). This will be realized 
with the same mechanisms used today in Tez.
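The stateless-request and recovery points above can be sketched together: every request carries its own data location and metadata, so a failed fragment can simply be rerun on another node. The class and field names below (`FragmentRequest`, `runWithRetry`) are illustrative assumptions, not actual Tez/LLAP APIs.

```java
import java.util.List;
import java.util.function.BiFunction;

// Sketch of the stateless-daemon contract: requests are self-contained,
// so the AM can rerun a failed fragment on any other node. Names are
// illustrative assumptions.
public class StatelessRetry {
    static class FragmentRequest {
        final String path;          // input location (HDFS file)
        final long offset, length;  // split boundaries
        final List<String> columns; // projection metadata
        FragmentRequest(String path, long offset, long length,
                        List<String> columns) {
            this.path = path; this.offset = offset;
            this.length = length; this.columns = columns;
        }
    }

    // Try nodes in the caller-chosen (locality-aware) order; any node
    // can serve any fragment, so recovery is "run it somewhere else".
    static String runWithRetry(FragmentRequest req, List<String> nodes,
            BiFunction<String, FragmentRequest, Boolean> runOnNode) {
        for (String node : nodes) {
            if (runOnNode.apply(node, req)) return node;
        }
        throw new IllegalStateException("fragment failed on all nodes");
    }

    public static void main(String[] args) {
        FragmentRequest req = new FragmentRequest(
            "hdfs://nn/warehouse/t/part-0", 0, 64 << 20,
            List.of("id", "amount"));
        // Simulate the preferred (local) node n1 being down: the AM
        // simply reruns the fragment on n2.
        String node = runWithRetry(req, List.of("n1", "n2", "n3"),
                                   (n, r) -> !n.equals("n1"));
        System.out.println(node); // n2
    }
}
```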

 

Working within existing execution model

 

#LLAP will work within existing, process-based Hive execution to preserve the 
scalability and versatility of Hive. It will not replace the existing execution model 
but enhance it. 

 

● The daemons are optional. Hive will continue to work without them and
will also be able to bypass them even if they are deployed and operational. 
Feature parity with regard to language features will be maintained. 

 

● External orchestration and execution engines. #LLAP is not an 
execution engine (like MR or Tez). Overall execution will be scheduled and 
monitored by an existing Hive execution engine such as Tez, transparently 
over both #LLAP nodes and regular containers. Obviously, the level of 
#LLAP support will depend on each individual execution engine (starting 
with Tez). MapReduce support is not planned, but other engines may be 
added later. Other frameworks like Pig will also have the choice of using 
#LLAP daemons. 

 

● Partial execution. The result of the work performed by an #LLAP daemon 
can either form part of the result of Hive query, or be passed on to external 
Hive tasks, depending on the query. 

● Resource management. YARN will remain responsible for the 
management and allocation of resources. The YARN container delegation 
model will be used for users to transfer allocated resources to #LLAP. To 
avoid the limitations of JVM memory settings, we will keep cached data, as 
well as large buffers for processing (e.g., group by, joins), off-heap. This 
way, the daemon can use a small amount of memory, and additional 
resources (i.e., CPU and memory) will be assigned based on workload. 
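The off-heap point above can be made concrete: direct `ByteBuffer`s live outside the Java heap, so cached data does not count against the JVM's `-Xmx` setting and the daemon itself can run with a small heap. A minimal sketch, with an assumed slot-style wrapper:

```java
import java.nio.ByteBuffer;

// Sketch of keeping cached data off-heap to sidestep JVM heap-size
// limits. The OffHeapCacheSlot wrapper is an illustrative assumption.
public class OffHeapCacheSlot {
    private final ByteBuffer buf;

    OffHeapCacheSlot(int capacityBytes) {
        // allocateDirect places the buffer outside the Java heap.
        buf = ByteBuffer.allocateDirect(capacityBytes);
    }

    void put(byte[] columnChunk) {
        buf.clear();
        buf.put(columnChunk);
        buf.flip(); // make the written bytes readable
    }

    byte[] get() {
        // Read through a duplicate so the slot's position is untouched.
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        OffHeapCacheSlot slot = new OffHeapCacheSlot(1 << 20);
        slot.put(new byte[] {1, 2, 3});
        System.out.println(slot.get().length); // 3
    }
}
```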

 

Query fragment execution

 

For partial execution as described above, #LLAP nodes will execute “query 
fragments” such as filters, projections, data transformations, partial aggregates, 
sorting, bucketing, hash joins/semi-joins, etc. Only Hive code and blessed UDFs 
will be accepted in #LLAP. No code will be localized and executed on the fly. This 
is done for stability and security reasons. 

● Parallel execution. The node will allow parallel execution for multiple 
query fragments from different queries and sessions. 

● Interface. Users can access #LLAP nodes directly via client API. They will 
be able to specify relational transformations and read data via 
record-oriented streams. 
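The parallel-execution bullet above can be sketched as one daemon running fragments from several queries and sessions at once on a shared thread pool. Pool size, names, and the string result are illustrative assumptions.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of a daemon executing fragments from different queries and
// sessions concurrently. Illustrative only.
public class FragmentPool {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    Future<String> submit(String queryId, String fragmentId) {
        // Each fragment is an independent task; fragments from different
        // sessions interleave freely on the shared pool.
        return pool.submit(() -> queryId + "/" + fragmentId + ":done");
    }

    void shutdown() { pool.shutdown(); }

    public static void main(String[] args) throws Exception {
        FragmentPool daemon = new FragmentPool();
        List<Future<String>> running = List.of(
            daemon.submit("q1", "map-0"),
            daemon.submit("q2", "map-3"));
        for (Future<String> f : running) System.out.println(f.get());
        daemon.shutdown();
    }
}
```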

 

I/O

 

The daemon will off-load I/O and transformation from compressed format to 
separate threads. The data will be passed on to execution as it becomes ready, so 
the previous batches can be processed while the next ones are being prepared. 
The data will be passed to execution in a simple RLE-encoded columnar format 
that is ready for vectorized processing; this will also be the caching format, and
intends to minimize copying between I/O, cache, and execution. 
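The decoupled I/O model described above can be sketched as a producer/consumer pair: an I/O thread decodes batches and hands them over through a small bounded queue, so batch N is executed while batch N+1 is being prepared. The batch shape (a toy `int[]` standing in for an RLE-encoded column batch) is an illustrative assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of off-loading I/O and decompression to a separate thread,
// with data passed to execution as it becomes ready. Illustrative only.
public class AsyncIoPipeline {
    static final int[] DONE = new int[0]; // end-of-stream marker

    static int runPipeline(int batches) throws InterruptedException {
        BlockingQueue<int[]> ready = new ArrayBlockingQueue<>(2);

        Thread io = new Thread(() -> {
            try {
                for (int b = 0; b < batches; b++) {
                    // Pretend to read + decompress one column chunk into
                    // a vectorizable batch.
                    ready.put(new int[] {b, b + 1});
                }
                ready.put(DONE);
            } catch (InterruptedException ignored) { }
        });
        io.start();

        int sum = 0;
        int[] batch;
        while ((batch = ready.take()) != DONE) {
            for (int v : batch) sum += v; // "execution" consumes batches
        }
        io.join();
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runPipeline(3)); // (0+1) + (1+2) + (2+3) = 9
    }
}
```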

● Multiple file formats. I/O and caching depend on some knowledge of the 
underlying file format (especially if it is to be done efficiently). Therefore, 
similar to Vectorization work, different file formats will be supported 
through plugins specific to each format (starting with ORC). Additionally, a 
generic, less-efficient plugin may be added that supports any Hive input 
format. The plugins have to maintain metadata and transform the raw data 
to column chunks.


 

● Predicates and bloom filters. SARGs and bloom filters will be pushed 
down to storage layer, if they are supported.

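The effect of pushing SARGs down to the storage layer can be sketched with min/max stripe statistics: before a stripe is read, its statistics decide whether it can be skipped entirely (a bloom filter would further prune stripes whose range matches but which cannot contain the value). `StripeStats` and the method names are assumptions for this sketch.

```java
// Sketch of predicate pushdown via per-stripe min/max statistics.
// Illustrative only; not the actual ORC SARG API.
public class SargPushdown {
    static class StripeStats {
        final long min, max;
        StripeStats(long min, long max) { this.min = min; this.max = max; }
    }

    // True if the stripe might contain rows with col == value; a false
    // result lets the reader skip the stripe without touching storage.
    static boolean mightContain(StripeStats s, long value) {
        return value >= s.min && value <= s.max;
    }

    public static void main(String[] args) {
        StripeStats s1 = new StripeStats(0, 99);
        StripeStats s2 = new StripeStats(100, 199);
        long needle = 150;
        // Only the second stripe has to be read from storage.
        System.out.println(mightContain(s1, needle)); // false
        System.out.println(mightContain(s2, needle)); // true
    }
}
```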

 

Caching

 

The daemon will cache metadata for input files, as well as the data. The metadata 
and index information can be cached even for data that is not currently cached. 
Metadata will be stored in process in Java objects; cached data will be stored in the 
format described in the I/O section, and kept off-heap (see Resource 
management). 

● Eviction policy. The eviction policy will be tuned for analytical workloads 
with frequent (partial) table-scans. Initially, a simple policy like LRFU will 
be used. The policy will be pluggable. 

● Caching granularity. Column-chunks will be the unit of data in the cache. 
This achieves a compromise between low-overhead processing and storage 
efficiency. The granularity of the chunks depends on particular file format 
and execution engine (Vectorized Row Batch size, ORC stripe, etc.). 
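Both bullets above can be sketched together: a cache keyed by column chunk (file, column, chunk index) whose eviction uses an LRFU-style score that decays with time and grows with hits. With the decay exponent `lambda` near 0 the policy behaves like LFU; near 1 it behaves like LRU. This is a minimal sketch of one possible pluggable policy, not the actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of an LRFU-style eviction policy over column-chunk keys.
// Illustrative assumptions throughout.
public class ChunkCacheSketch {
    record ChunkKey(String file, String column, int chunk) { }

    static class Entry { double crf; long last; }

    final double lambda;  // 0 => LFU-like, 1 => LRU-like
    final Map<ChunkKey, Entry> entries = new HashMap<>();
    long clock = 0;

    ChunkCacheSketch(double lambda) { this.lambda = lambda; }

    void touch(ChunkKey key) {
        clock++;
        Entry e = entries.computeIfAbsent(key, k -> new Entry());
        // Decay the stored score to "now", then count this reference.
        e.crf = e.crf * Math.pow(0.5, lambda * (clock - e.last)) + 1.0;
        e.last = clock;
    }

    ChunkKey victim() {
        ChunkKey worst = null;
        double worstScore = Double.MAX_VALUE;
        for (Map.Entry<ChunkKey, Entry> me : entries.entrySet()) {
            Entry e = me.getValue();
            double now = e.crf * Math.pow(0.5, lambda * (clock - e.last));
            if (now < worstScore) { worstScore = now; worst = me.getKey(); }
        }
        return worst;
    }

    public static void main(String[] args) {
        ChunkCacheSketch cache = new ChunkCacheSketch(0.5);
        ChunkKey cold = new ChunkKey("part-0", "amount", 0);
        ChunkKey hot  = new ChunkKey("part-0", "id", 0);
        cache.touch(cold);
        cache.touch(hot); cache.touch(hot); cache.touch(hot);
        // The chunk touched once, longest ago, is evicted first.
        System.out.println(cache.victim().equals(cold)); // true
    }
}
```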

 

Workload Management

 

YARN will be used to obtain resources for different workloads. Once resources 
(CPU, memory, etc) have been obtained from YARN for a specific workload, the 
execution engine can choose to delegate these resources to #LLAP, or to launch 
Hive executors in separate processes. Resource enforcement via YARN has the 
advantage of ensuring that nodes do not get overloaded, either by #LLAP or by 
other containers. The daemons themselves will be under YARN’s control. 

 

Acid

 

#LLAP will be aware of transactions. The merging of delta files to produce a 
certain state of the tables will be performed before the data is placed in cache. 
Multiple versions are possible and the request will specify which version is to be
used. This has the benefit of doing the merge async and only once for cached data, 
thus avoiding the hit on the operator pipeline.
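The delta-merge step can be sketched as follows: a base snapshot plus a sequence of delta files, where later writes win per key and deletes remove the row. Doing this once, asynchronously, before the rows enter the cache keeps the merge cost out of the per-query operator pipeline. The row and delta shapes below are assumptions, not the actual ACID file layout.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of merging delta files into a base before caching.
// Illustrative only; a null value models a delete record.
public class DeltaMerge {
    static Map<Integer, String> merge(Map<Integer, String> base,
                                      List<Map<Integer, String>> deltas) {
        Map<Integer, String> merged = new LinkedHashMap<>(base);
        for (Map<Integer, String> delta : deltas) {
            for (Map.Entry<Integer, String> e : delta.entrySet()) {
                if (e.getValue() == null) merged.remove(e.getKey()); // delete
                else merged.put(e.getKey(), e.getValue());           // upsert
            }
        }
        return merged; // this snapshot is what gets cached
    }

    public static void main(String[] args) {
        Map<Integer, String> base = new HashMap<>(Map.of(1, "a", 2, "b"));
        Map<Integer, String> d1 = new HashMap<>();
        d1.put(2, "b2");  // update row 2
        d1.put(3, "c");   // insert row 3
        d1.put(1, null);  // delete row 1
        Map<Integer, String> merged = merge(base, List.of(d1));
        System.out.println(merged.size()); // 2
    }
}
```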

Security

 

#LLAP servers are a natural place to enforce access control at a more fine-grained 
level than “per file”. Since the daemons know which columns and records are 
processed, policies on these objects can be enforced. This is not intended to 
replace the current mechanisms, but rather to enhance and open them up to other 
applications as well. 
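Because the daemon knows which columns a fragment reads, enforcement can be a per-request check against a column policy before any data is returned. The policy shape below is an illustrative assumption, not an actual Hive authorization API.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of column-level access control inside the daemon: the request's
// projected columns are checked against a per-user policy up front.
public class ColumnAcl {
    final Map<String, Set<String>> allowed; // user -> readable columns

    ColumnAcl(Map<String, Set<String>> allowed) { this.allowed = allowed; }

    // A request is authorized only if every column it reads is allowed.
    boolean authorize(String user, List<String> requestedColumns) {
        Set<String> ok = allowed.getOrDefault(user, Set.of());
        return ok.containsAll(requestedColumns);
    }

    public static void main(String[] args) {
        ColumnAcl acl = new ColumnAcl(Map.of(
            "alice", Set.of("id", "amount", "ssn"),
            "bob", Set.of("id", "amount")));
        System.out.println(acl.authorize("bob", List.of("id", "amount"))); // true
        System.out.println(acl.authorize("bob", List.of("ssn")));          // false
    }
}
```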
