Configuring a pseudo-distributed hadoop-1.1.0 environment on Windows (continued)
The previous article covered workarounds for a few common problems.
However, after reinstalling my OS and setting up cygwin and hadoop-1 again following that article ( http://winseclone.iteye.com/blog/1734737 ), I found that MapReduce jobs in the pseudo-distributed setup kept failing. (I don't remember hitting this before; consider it a present to myself for installing Win8!)
I suspected all sorts of things: a broken configuration, so I re-pointed hadoop.tmp.dir, swapped hadoop-1.1.0 for hadoop-1.0.0, and so on.
The error log looks like this:
$ hhadoop fs -rmr /test/output ; hhadoop jar hadoop-examples-1.0.0.jar wordcount /test/input /test/output
Deleted hdfs://WINSE:9000/test/output
13/03/23 22:46:07 INFO input.FileInputFormat: Total input paths to process : 1
13/03/23 22:46:08 INFO mapred.JobClient: Running job: job_201303232144_0002
13/03/23 22:46:09 INFO mapred.JobClient:  map 0% reduce 0%
13/03/23 22:46:16 INFO mapred.JobClient: Task Id : attempt_201303232144_0002_m_000002_0, Status : FAILED
java.lang.Throwable: Child Error
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:272)
Caused by: java.io.IOException: Task process exit with nonzero status of -1.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:259)
13/03/23 22:46:16 WARN mapred.JobClient: Error reading task outputhttp://WINSE:50060/tasklog?plaintext=true&attemptid=attempt_201303232144_0002_m_000002_0&filter=stdout
13/03/23 22:46:16 WARN mapred.JobClient: Error reading task outputhttp://WINSE:50060/tasklog?plaintext=true&attemptid=attempt_201303232144_0002_m_000002_0&filter=stderr
13/03/23 22:46:22 INFO mapred.JobClient: Task Id : attempt_201303232144_0002_m_000002_1, Status : FAILED
After repeated edits to the source and plenty of sysout prints, I finally tracked down where the program goes wrong. The trail runs through the following methods:
org.apache.hadoop.mapred.DefaultTaskController.java #launchTask
org.apache.hadoop.mapred.JvmManager.java #runChild
org.apache.hadoop.mapred.TaskRunner.java #launchJvmAndWait
org.apache.hadoop.fs.FileUtil.java #checkReturnValue
org.apache.hadoop.fs.RawLocalFileSystem.java #setPermission #mkdirs
It turns out that in org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(Path), the check that attempt_201303232144_0002_m_000001_0 is a directory fails while the path is being created!
Yet looking at it from inside cygwin:
Winseliu@WINSE ~/hadoop/logs/userlogs/job_201303232144_0002$ ll
total 9
lrwxrwxrwx 1 Winseliu None  89 Mar 23 22:46 attempt_201303232144_0002_m_000001_0 -> /cluster/mapred/local/userlogs/job_201303232144_0002/attempt_201303232144_0002_m_000001_0
lrwxrwxrwx 1 Winseliu None  89 Mar 23 22:46 attempt_201303232144_0002_m_000001_1 -> /cluster/mapred/local/userlogs/job_201303232144_0002/attempt_201303232144_0002_m_000001_1
lrwxrwxrwx 1 Winseliu None  89 Mar 23 22:46 attempt_201303232144_0002_m_000001_2 -> /cluster/mapred/local/userlogs/job_201303232144_0002/attempt_201303232144_0002_m_000001_2
lrwxrwxrwx 1 Winseliu None  89 Mar 23 22:46 attempt_201303232144_0002_m_000001_3 -> /cluster/mapred/local/userlogs/job_201303232144_0002/attempt_201303232144_0002_m_000001_3
lrwxrwxrwx 1 Winseliu None  89 Mar 23 22:46 attempt_201303232144_0002_m_000002_0 -> /cluster/mapred/local/userlogs/job_201303232144_0002/attempt_201303232144_0002_m_000002_0
lrwxrwxrwx 1 Winseliu None  89 Mar 23 22:46 attempt_201303232144_0002_m_000002_1 -> /cluster/mapred/local/userlogs/job_201303232144_0002/attempt_201303232144_0002_m_000002_1
lrwxrwxrwx 1 Winseliu None  89 Mar 23 22:46 attempt_201303232144_0002_m_000002_2 -> /cluster/mapred/local/userlogs/job_201303232144_0002/attempt_201303232144_0002_m_000002_2
lrwxrwxrwx 1 Winseliu None  89 Mar 23 22:46 attempt_201303232144_0002_m_000002_3 -> /cluster/mapred/local/userlogs/job_201303232144_0002/attempt_201303232144_0002_m_000002_3
-rwxr-xr-x 1 Winseliu None 404 Mar 23 22:46 job-acls.xml
To Linux (cygwin), these are just symlinks pointing at another directory and are themselves treated as directories. But the JDK on Windows has no idea what they are!
public boolean mkdirs(Path f) throws IOException {
    Path parent = f.getParent();
    File p2f = pathToFile(f);
    return (parent == null || mkdirs(parent)) &&
        (p2f.mkdir() || p2f.isDirectory());
}
So p2f.isDirectory() returns false, an IOException is thrown, and the map child process eventually exits with status -1!
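This is easy to reproduce outside Hadoop. A minimal standalone check I put together (not part of the Hadoop source; pass it the Windows path of one of the attempt_* symlinks) shows the same behaviour:

import java.io.File;

// Standalone check of how the Windows JVM sees a Cygwin symlink.
// Usage: java SymlinkCheck <Windows path of an attempt_* symlink>
public class SymlinkCheck {
    public static void main(String[] args) {
        File f = new File(args[0]);
        System.out.println("exists      = " + f.exists());
        System.out.println("isFile      = " + f.isFile());
        // For a Cygwin symlink this prints false, which is exactly why mkdirs() gives up.
        System.out.println("isDirectory = " + f.isDirectory());
    }
}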
The log output location is set by org.apache.hadoop.mapred.TaskRunner.prepareLogFiles(TaskAttemptID, boolean); the shell command that is eventually executed redirects the child's stdout and stderr into those log files. The parent directory of userlogs is configured through the hadoop.log.dir system property (see the sketch after the call chain below)!
mapred.DefaultTaskController.launchTask()
|--mapred.TaskLog.buildCommandLine()
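The essence, as a simplified sketch of my own (not the exact Hadoop 1.x code; the real logic lives in org.apache.hadoop.mapred.TaskLog), is that the userlogs root is resolved from that system property:

import java.io.File;

// Simplified sketch: how the task log root is derived from hadoop.log.dir.
// The "logs" fallback is only for this sketch; Hadoop's bin scripts normally
// set the property from HADOOP_LOG_DIR.
public class UserLogRoot {
    public static void main(String[] args) {
        String logDir = System.getProperty("hadoop.log.dir", "logs");
        File userLogDir = new File(logDir, "userlogs");
        System.out.println("task attempt logs go under: " + userLogDir.getAbsolutePath());
    }
}

So moving HADOOP_LOG_DIR moves this root, which is exactly what the temporary workaround below does.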
Temporary workaround:
Point hadoop.log.dir at the directory where the real mapred logs live ( mapred.local.dir : ${hadoop.tmp.dir}/mapred/local )!
export HADOOP_LOG_DIR=/cluster/mapred/local
And map the Windows /cluster directory (C:\cluster) to /cluster in cygwin (Linux):
Winseliu@WINSE ~/hadoop$ ll /cygdrive/c | grep cluster
drwxr-xr-x+ 1 Winseliu None  0 Mar 24 00:08 cluster
Winseliu@WINSE ~/hadoop$ ll / | grep cluster
lrwxrwxrwx 1 Winseliu None 19 Mar 23 09:39 cluster -> /cygdrive/c/cluster
But the wordcount example still failed! Looking at the tasktracker log, I found a NumberFormatException thrown while converting a String to an Integer!
Fix: modify org.apache.hadoop.mapred.JvmManager.JvmManagerForType.JvmRunner.kill() and add a check for pidStr being an empty string (snippet below, with an expanded sketch after it)!
String pidStr = jvmIdToPid.get(jvmId);
if (pidStr != null && !pidStr.isEmpty()) {
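Put differently, an empty pid string would otherwise reach the String-to-Integer conversion and blow up. A slightly expanded sketch of the guard (simplified by me, not the exact Hadoop source; the actual kill logic is omitted):

// Inside JvmManager.JvmManagerForType.JvmRunner.kill(), simplified:
String pidStr = jvmIdToPid.get(jvmId);
if (pidStr != null && !pidStr.isEmpty()) {
    int pid = Integer.parseInt(pidStr); // Integer.parseInt("") is what threw the NumberFormatException
    // ... signal/kill the child JVM using this pid ...
}
// else: no pid recorded yet, so there is nothing to kill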
After that I finally saw the job finish, and the result showed up in /test/output/part-r-00000.
A few other conveniences, plus the configuration files:
alias startCluster="~/hadoop/bin/start-all.sh"
alias stopCluster="~/hadoop/bin/stop-all.sh; ~/hadoop/bin/stop-all.sh"
alias hhadoop="~/hadoop/bin/hadoop"
Winseliu@WINSE ~$ ll | grep hadoop
lrwxrwxrwx  1 Winseliu None 12 Mar 23 10:44 hadoop -> hadoop-1.0.0
drwx------+ 1 Winseliu None  0 Mar 24 00:06 hadoop-1.0.0
<!-- core-site.xml -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://WINSE:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/cluster</value>
  </property>
</configuration>

<!-- hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.permissions.supergroup</name>
    <value>None</value>
  </property>
  <property>
    <name>dfs.safemode.extension</name>
    <value>1000</value>
  </property>
</configuration>

<!-- mapred-site.xml -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>WINSE:9001</value>
  </property>
</configuration>
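If you want to sanity-check this configuration from code rather than from the shell, a minimal HDFS client sketch of my own (assuming hadoop-core-1.0.0.jar and the conf directory are on the classpath; /test is the path used in the example above) could look like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal sanity check that the pseudo-distributed HDFS is reachable.
public class CheckHdfs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://WINSE:9000"); // same value as core-site.xml
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus s : fs.listStatus(new Path("/test"))) {
            System.out.println(s.getPath());
        }
        fs.close();
    }
}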
To see which processes have started, you can check the Windows Task Manager:
(Screenshot attachment: Windows Task Manager process list.)