Open Source Cloud Computing Technology Series (6): Hypertable (Hadoop HDFS)

We use VirtualBox to set up an Ubuntu Server 9.04 virtual machine as the base environment. This article builds Hypertable from source, verifies it in standalone mode, and then runs it on top of Hadoop HDFS.

Install the compiler tool chain and the libraries needed to build Hypertable:

hadoop@hadoop:~$ sudo apt-get install g++ cmake libboost-dev liblog4cpp5-dev git-core cronolog libgoogle-perftools-dev libevent-dev zlib1g-dev libexpat1-dev libdb4.6++-dev libncurses-dev libreadline5-dev

Then install a second batch of packages, mostly needed to build Thrift and its language bindings:

hadoop@hadoop:~/build/hypertable$ sudo apt-get install ant autoconf automake libtool bison flex pkg-config php5 php5-cli ruby-dev libhttp-access2-ruby libbit-vector-perl

On Ubuntu, /bin/sh points to dash by default, while some of the build and helper scripts assume bash, so repoint the link:

hadoop@hadoop:~/build/hypertable$ sudo ln -f -s /bin/bash /bin/sh
[sudo] password for hadoop:

Hypertable uses the Hyperic SIGAR library to collect system statistics. Unpack it and copy its headers and the Linux shared library into place:

hadoop@hadoop:~$ tar xvzf hyperic-sigar-1.6.3.tar.gz

hadoop@hadoop:~$ sudo cp hyperic-sigar-1.6.3/sigar-bin/include/*.h /usr/local/include/
[sudo] password for hadoop:
hadoop@hadoop:~$ sudo cp hyperic-sigar-1.6.3/sigar-bin/lib/libsigar-x86-linux.so /usr/local/lib/

hadoop@hadoop:~$ sudo ldconfig

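To confirm the dynamic linker now sees the SIGAR library, a quick optional check (it should list libsigar-x86-linux.so under /usr/local/lib):

ldconfig -p | grep sigar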

Hypertable's ThriftBroker requires Thrift, so download the copy hosted at hypertable.org and build it:

hadoop@hadoop:~$ wget http://hypertable.org/pub/thrift.tgz
--2009-08-17 21:12:14--  http://hypertable.org/pub/thrift.tgz
Resolving hypertable.org... 72.51.43.91
Connecting to hypertable.org|72.51.43.91|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1144224 (1.1M) [application/x-gzip]
Saving to: `thrift.tgz'

100%[======================================>] 1,144,224   20.9K/s   in 44s

2009-08-17 21:13:00 (25.3 KB/s) - `thrift.tgz' saved [1144224/1144224]

hadoop@hadoop:~$ tar xvzf thrift.tgz

hadoop@hadoop:~$ cd thrift/
hadoop@hadoop:~/thrift$ ls
aclocal       config.guess  contrib       lib          NEWS              ylwrap
aclocal.m4    config.h      CONTRIBUTORS  LICENSE      NOTICE
bootstrap.sh  config.hin    depcomp       ltmain.sh    print_version.sh
CHANGES       config.sub    DISCLAIMER    Makefile.am  README
cleanup.sh    configure     doc           Makefile.in  test
compiler      configure.ac  install-sh    missing      tutorial
hadoop@hadoop:~/thrift$ ./bootstrap.sh
configure.ac: warning: missing AC_PROG_AWK wanted by: test/py/Makefile:80
configure.ac: warning: missing AC_PROG_RANLIB wanted by: test/py/Makefile:151
configure.ac:44: installing `./config.guess'
configure.ac:44: installing `./config.sub'
configure.ac:26: installing `./install-sh'
configure.ac:26: installing `./missing'
compiler/cpp/Makefile.am: installing `./depcomp'
configure.ac: installing `./ylwrap'

hadoop@hadoop:~/thrift$ ./configure

hadoop@hadoop:~/thrift$ make -j 3

hadoop@hadoop:~/thrift$ sudo make install

hadoop@hadoop:~/thrift$ sudo ldconfig
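Optionally confirm that the Thrift compiler is now installed and on the PATH; it should print its version string:

thrift -version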

A JDK is also needed, both for Hadoop and for the Java pieces of the test suite. It sits next to the Hypertable source tarball:

hadoop@hadoop:~$ ls
hypertable-0.9.2.6-alpha-src.tar.gz  jdk-6u15-linux-i586.bin
hadoop@hadoop:~$ chmod +x jdk-6u15-linux-i586.bin
hadoop@hadoop:~$ pwd
/home/hadoop

hadoop@hadoop:~$ ./jdk-6u15-linux-i586.bin

hadoop@hadoop:~$ ls
hypertable-0.9.2.6-alpha-src.tar.gz  jdk1.6.0_15  jdk-6u15-linux-i586.bin
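As a quick sanity check, the unpacked JDK can be run directly before Hadoop and Hypertable are pointed at it:

~/jdk1.6.0_15/bin/java -version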
hadoop@hadoop:~$ tar xvzf hypertable-0.9.2.6-alpha-src.tar.gz

Create an install directory (optional)

mkdir ~/hypertable

Create a build directory

mkdir -p ~/build/hypertable

hadoop@hadoop:~$ cd ~/build/hypertable
hadoop@hadoop:~/build/hypertable$
hadoop@hadoop:~/build/hypertable$ cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=~/hypertable ~/hypertable-0.9.2.5-alpha

hadoop@hadoop:~/build/hypertable$ make -j 3

This step takes a fair amount of time and keeps the machine fully loaded, so go get a coffee and come back for the result.

hadoop@hadoop:~/build/hypertable$ make install
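After make install, the release should appear under the prefix given to CMAKE_INSTALL_PREFIX; a quick optional look (the version directory corresponds to the 0.9.2.5 sources built above):

ls ~/hypertable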

Next, verify that the installation works. The "ERROR: DfsBroker not running, database not cleaned" message below simply means there was no previous instance to shut down, and can be ignored on a first run:

hadoop@hadoop:~/build/hypertable$  make alltests
Scanning dependencies of target runtestservers

Starting test servers
Shutdown hyperspace complete
ERROR: DfsBroker not running, database not cleaned
Shutdown thrift broker complete
Shutdown hypertable master complete
Shutdown range server complete
DFS broker: available file descriptors: 1024
Successfully started DFSBroker (local)
Successfully started Hyperspace
Successfully started Hypertable.Master
Successfully started Hypertable.RangeServer
Successfully started ThriftBroker
Built target runtestservers
Scanning dependencies of target alltests
Running tests...
Start processing tests
Test project /home/hadoop/build/hypertable
  1/ 60 Testing Common-Exception                 Passed
  2/ 60 Testing Common-Logging                   Passed
  3/ 60 Testing Common-Serialization             Passed
  4/ 60 Testing Common-ScopeGuard                Passed
  5/ 60 Testing Common-InetAddr                  Passed
  6/ 60 Testing Common-PageArena                 Passed
  7/ 60 Testing Common-Properties                Passed
  8/ 60 Testing Common-BloomFilter               Passed
  9/ 60 Testing Common-Hash                      Passed
10/ 60 Testing HyperComm                        Passed
11/ 60 Testing HyperComm-datagram               Passed
12/ 60 Testing HyperComm-timeout                Passed
13/ 60 Testing HyperComm-timer                  Passed
14/ 60 Testing HyperComm-reverse-request        Passed
15/ 60 Testing BerkeleyDbFilesystem             Passed
16/ 60 Testing FileBlockCache                   Passed
17/ 60 Testing TableIdCache                     Passed
18/ 60 Testing CellStoreScanner                 Passed
19/ 60 Testing CellStoreScanner-delete          Passed
20/ 60 Testing Schema                           Passed
21/ 60 Testing LocationCache                    Passed
22/ 60 Testing LoadDataSource                   Passed
23/ 60 Testing LoadDataEscape                   Passed
24/ 60 Testing BlockCompressor-BMZ              Passed
25/ 60 Testing BlockCompressor-LZO              Passed
26/ 60 Testing BlockCompressor-NONE             Passed
27/ 60 Testing BlockCompressor-QUICKLZ          Passed
28/ 60 Testing BlockCompressor-ZLIB             Passed
29/ 60 Testing CommitLog                        Passed
30/ 60 Testing MetaLog-RangeServer              Passed
31/ 60 Testing Client-large-block               Passed
32/ 60 Testing Client-periodic-flush            Passed
33/ 60 Testing HyperDfsBroker                   Passed
34/ 60 Testing Hyperspace                       Passed
35/ 60 Testing Hypertable-shell                 Passed
36/ 60 Testing Hypertable-shell-ldi-stdin       Passed
37/ 60 Testing RangeServer                      Passed
38/ 60 Testing ThriftClient-cpp                 Passed
39/ 60 Testing ThriftClient-perl                Passed
40/ 60 Testing ThriftClient-java                Passed
41/ 60 Testing Client-random-write-read         Passed
42/ 60 Testing RangeServer-commit-log-gc        Passed
43/ 60 Testing RangeServer-load-exception       Passed
44/ 60 Testing RangeServer-metadata-split       Passed
45/ 60 Testing RangeServer-maintenance-thread   Passed
46/ 60 Testing RangeServer-row-overflow         Passed
47/ 60 Testing RangeServer-rowkey-ag-imbalanc   Passed
48/ 60 Testing RangeServer-split-recovery-1     Passed
49/ 60 Testing RangeServer-split-recovery-2     Passed
50/ 60 Testing RangeServer-split-recovery-3     Passed
51/ 60 Testing RangeServer-split-recovery-4     Passed
52/ 60 Testing RangeServer-split-recovery-5     Passed
53/ 60 Testing RangeServer-split-recovery-6     Passed
54/ 60 Testing RangeServer-split-recovery-7     Passed
55/ 60 Testing RangeServer-split-recovery-8     Passed
56/ 60 Testing RangeServer-split-merge-loop10   Passed
57/ 60 Testing RangeServer-bloomfilter-rows     Passed
58/ 60 Testing RangeServer-bloomfilter-rows-c   Passed
59/ 60 Testing RangeServer-ScanLimit            Passed
60/ 60 Testing Client-no-log-sync               Passed

100% tests passed, 0 tests failed out of 60
Built target alltests
hadoop@hadoop:~/build/hypertable$

At this point, the standalone Hypertable installation is working.
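The test run above leaves a set of local servers running. Before switching to the HDFS-backed setup they can be shut down; Hypertable ships a stop script alongside start-all-servers.sh (the script name below is an assumption, check bin/ in your release):

~/hypertable/0.9.2.5/bin/stop-servers.sh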

Next, let's run Hypertable on top of Hadoop HDFS.

First set up a single-node (pseudo-distributed) Hadoop 0.20.0. In conf/hadoop-env.sh, point JAVA_HOME at the JDK unpacked earlier; the heap is kept small to suit the VM:

hadoop@hadoop:~/hadoop-0.20.0/conf$ more hadoop-env.sh
# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.  Required.
export JAVA_HOME=~/jdk1.6.0_15

# Extra Java CLASSPATH elements.  Optional.
# export HADOOP_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=100

hadoop@hadoop:~/hadoop-0.20.0/conf$ vi core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/hadoop-0.20.0/tmp/dir/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>

</configuration>
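Hadoop will create hadoop.tmp.dir when the NameNode is formatted, but creating the base directory up front avoids permission surprises on the parent path (optional):

mkdir -p /home/hadoop/hadoop-0.20.0/tmp/dir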

hadoop@hadoop:~/hadoop-0.20.0/conf$ vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

</configuration>

hadoop@hadoop:~/hadoop-0.20.0/conf$ vi hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>

</configuration>

hadoop@hadoop:~$ more .bash_profile
export JAVA_HOME=~/jdk1.6.0_15
export HADOOP_HOME=~/hadoop-0.20.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
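For the new variables to take effect in the current shell, reload the profile (or log out and back in):

source ~/.bash_profile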

hadoop@hadoop:~$ hadoop namenode -format
09/08/18 21:10:25 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = hadoop/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr  9 05:18:40 UTC 2009
************************************************************/
09/08/18 21:10:26 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop,adm,dialout,cdrom,plugdev,lpadmin,sambashare,admin
09/08/18 21:10:26 INFO namenode.FSNamesystem: supergroup=supergroup
09/08/18 21:10:26 INFO namenode.FSNamesystem: isPermissionEnabled=true
09/08/18 21:10:26 INFO common.Storage: Image file of size 96 saved in 0 seconds.
09/08/18 21:10:26 INFO common.Storage: Storage directory /home/hadoop/hadoop-0.20.0/tmp/dir/hadoop-hadoop/dfs/name has been successfully formatted.
09/08/18 21:10:26 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop/127.0.1.1
************************************************************/
hadoop@hadoop:~$

hadoop@hadoop:~$ start-all.sh
starting namenode, logging to /home/hadoop/hadoop-0.20.0/bin/../logs/hadoop-hadoop-namenode-hadoop.out
localhost: starting datanode, logging to /home/hadoop/hadoop-0.20.0/bin/../logs/hadoop-hadoop-datanode-hadoop.out
localhost: starting secondarynamenode, logging to /home/hadoop/hadoop-0.20.0/bin/../logs/hadoop-hadoop-secondarynamenode-hadoop.out
starting jobtracker, logging to /home/hadoop/hadoop-0.20.0/bin/../logs/hadoop-hadoop-jobtracker-hadoop.out
localhost: starting tasktracker, logging to /home/hadoop/hadoop-0.20.0/bin/../logs/hadoop-hadoop-tasktracker-hadoop.out

hadoop@hadoop:~$ jps
12959 JobTracker
12760 DataNode
12657 NameNode
13069 TaskTracker
13149 Jps
12876 SecondaryNameNode
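jps shows all five Hadoop daemons running. As an extra sanity check, ask HDFS for a cluster report and confirm that one live datanode is registered:

hadoop dfsadmin -report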

OK, the Hadoop 0.20.0 configuration is complete.

Next, we integrate Hypertable with Hadoop HDFS.

Add HYPERTABLE_HOME to the shell profile so the Hypertable scripts are on the PATH:

hadoop@hadoop:~/hypertable/0.9.2.5/bin$ more ~/.bash_profile
export JAVA_HOME=~/jdk1.6.0_15
export HADOOP_HOME=~/hadoop-0.20.0
export HYPERTABLE_HOME=~/hypertable/0.9.2.5/
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HYPERTABLE_HOME/bin
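Reload the profile and confirm that the startup script used below resolves:

source ~/.bash_profile
which start-all-servers.sh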

hadoop@hadoop:~/hypertable/0.9.2.5/conf$ ls
hypertable.cfg  METADATA.xml
hadoop@hadoop:~/hypertable/0.9.2.5/conf$ more hypertable.cfg
#
# hypertable.cfg
#

# Global properties
Hypertable.Request.Timeout=180000

# HDFS Broker
HdfsBroker.Port=38030
HdfsBroker.fs.default.name=hdfs://localhost:9000
HdfsBroker.Workers=20

# Local Broker
DfsBroker.Local.Port=38030
DfsBroker.Local.Root=fs/local

# DFS Broker - for clients
DfsBroker.Host=localhost
DfsBroker.Port=38030

# Hyperspace
Hyperspace.Master.Host=localhost
Hyperspace.Master.Port=38040
Hyperspace.Master.Dir=hyperspace
Hyperspace.Master.Workers=20

# Hypertable.Master
Hypertable.Master.Host=localhost
Hypertable.Master.Port=38050
Hypertable.Master.Workers=20

# Hypertable.RangeServer
Hypertable.RangeServer.Port=38060

Hyperspace.KeepAlive.Interval=30000
Hyperspace.Lease.Interval=1000000
Hyperspace.GracePeriod=200000

# ThriftBroker
ThriftBroker.Port=38080
hadoop@hadoop:~/hypertable/0.9.2.5/conf$

Start Hypertable on top of Hadoop by passing the hadoop argument to start-all-servers.sh, which selects the HDFS broker instead of the local one. Note that HdfsBroker.fs.default.name in hypertable.cfg (hdfs://localhost:9000) matches fs.default.name in core-site.xml above.

hadoop@hadoop:~/hypertable/0.9.2.5/bin$ start-all-servers.sh hadoop
DFS broker: available file descriptors: 1024
Successfully started DFSBroker (hadoop)
Successfully started Hyperspace
Successfully started Hypertable.Master
Successfully started Hypertable.RangeServer
Successfully started ThriftBroker

hadoop@hadoop:~/hypertable/0.9.2.5/log$ hadoop fs -ls /
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2009-08-18 21:25 /home
drwxr-xr-x   - hadoop supergroup          0 2009-08-18 21:28 /hypertable
hadoop@hadoop:~/hypertable/0.9.2.5/log$ hadoop fs -ls /hypertable
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2009-08-18 21:28 /hypertable/server
drwxr-xr-x   - hadoop supergroup          0 2009-08-18 21:28 /hypertable/tables

At this point, Hypertable is running successfully on Hadoop HDFS. The next article will explore HQL.
