Entry-level WordCount test on a Spark cluster fails, asking for help

Problem Description

This is the entire source:

/** Created by jyq on 10/14/15. */
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount").setMaster("spark://master:7077")
    val sc = new SparkContext(conf)
    sc.addFile("file:///home/jyq/Desktop/1.txt")
    val textRDD = sc.textFile(SparkFiles.get("file:///home/jyq/Desktop/1.txt"))
    val result = textRDD.flatMap(line => line.split("\\s+")).map(word => (word, 1)).reduceByKey(_ + _)
    result.saveAsTextFile("/home/jyq/Desktop/2.txt")
    println("hello world")
  }
}

The log output when compiling and running in IDEA:

Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Expected scheme-specific part at index 5: file:
at org.apache.hadoop.fs.Path.initialize(Path.java:206)
at org.apache.hadoop.fs.Path.&lt;init&gt;(Path.java:172)
at org.apache.hadoop.fs.Path.&lt;init&gt;(Path.java:94)
at org.apache.hadoop.fs.Globber.glob(Globber.java:211)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1644)
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:257)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:65)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:290)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:290)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:289)
at WordCount$.main(WordCount.scala:16)
at WordCount.main(WordCount.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.net.URISyntaxException: Expected scheme-specific part at index 5: file:
at java.net.URI$Parser.fail(URI.java:2848)
at java.net.URI$Parser.failExpecting(URI.java:2854)
at java.net.URI$Parser.parse(URI.java:3057)
at java.net.URI.&lt;init&gt;(URI.java:746)
at org.apache.hadoop.fs.Path.initialize(Path.java:203)
... 41 more
15/10/15 20:08:36 INFO SparkContext: Invoking stop() from shutdown hook
15/10/15 20:08:36 INFO SparkUI: Stopped Spark web UI at http://192.168.179.111:4040
15/10/15 20:08:36 INFO DAGScheduler: Stopping DAGScheduler
15/10/15 20:08:36 INFO SparkDeploySchedulerBackend: Shutting down all executors
15/10/15 20:08:36 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
15/10/15 20:08:36 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
15/10/15 20:08:36 INFO MemoryStore: MemoryStore cleared
15/10/15 20:08:36 INFO BlockManager: BlockManager stopped
15/10/15 20:08:36 INFO BlockManagerMaster: BlockManagerMaster stopped
15/10/15 20:08:36 INFO SparkContext: Successfully stopped SparkContext
15/10/15 20:08:36 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
15/10/15 20:08:36 INFO ShutdownHookManager: Shutdown hook called
15/10/15 20:08:36 INFO ShutdownHookManager: Deleting directory /tmp/spark-d7ca48d5-4e31-4a07-9264-8d7f5e8e1032
15/10/15 20:08:36 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.

Process finished with exit code 1

Solution

http://www.zhihu.com/question/36534667

Solution 2:
I ran into the same error message when using SparkR. Removing the file:/// prefix from 'file:///home/jyq/Desktop/1.txt' and keeping only the plain file path let the file be read.
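
For the Scala code in the question, the most likely root cause is the argument to SparkFiles.get: it expects the bare name of a file previously registered with sc.addFile (here just "1.txt") and prepends the local directory Spark copied the file to, so passing the full file:/// URI produces a mangled path whose stray "file:" fragment is exactly what Hadoop's Path parser rejects in the log above. Below is a minimal, untested sketch of the corrected program; the paths and master URL are copied from the question.

import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount").setMaster("spark://master:7077")
    val sc = new SparkContext(conf)

    // Ship 1.txt with the job, then resolve the local copy by its
    // bare file name: SparkFiles.get takes a name, not a URI.
    sc.addFile("file:///home/jyq/Desktop/1.txt")
    val textRDD = sc.textFile(SparkFiles.get("1.txt"))

    val result = textRDD
      .flatMap(line => line.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // saveAsTextFile writes a directory of part files; the target must
    // not already exist, and on a cluster it is created on the workers.
    result.saveAsTextFile("file:///home/jyq/Desktop/2.txt")
    sc.stop()
  }
}

Alternatively, if 1.txt is present at the same path on every node (or the job runs in local mode), the addFile/SparkFiles pair can be dropped and the path passed straight to sc.textFile, which is in the same spirit as the SparkR answer above.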

