Today I took a Scala program built in the new data center against Spark 1.6, JDK 1.7, and Hadoop 2.7 over to the old data center (JDK 1.6/JDK 1.7, Hadoop 2.2) for a Spark on YARN test.
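For reference, the submission was along the lines of the sketch below; the application jar path is hypothetical, and the main class and yarn-client mode are reconstructed from the stack traces further down:

    # hypothetical reconstruction of the submit command;
    # Regex is the main class and yarn-client the mode seen in the logs below
    spark-submit --class Regex --master yarn-client /path/to/regex-app.jar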
Submitting with either spark-shell or spark-submit fails with the following error:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: 0
queue: default
start time: 1481532476948
final status: FAILED
tracking URL: xxx:8088/cluster/app/application_1480324693568_0053
Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:124)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:64)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
at Regex$.main(Regex.scala:14)
at Regex.main(Regex.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
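With the application ID from the tracking URL, the container logs can be pulled in one step, assuming YARN log aggregation is enabled on the cluster:

    # fetch the aggregated container logs for the failed application
    yarn logs -applicationId application_1480324693568_0053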
The container logs show the following error:
container_1480324693568_0052_01_000001/stderr:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/deploy/yarn/ExecutorLauncher
Caused by: java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.ExecutorLauncher
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
At first I thought the Spark-on-Hadoop assembly, spark_home/lib/spark-assembly-1.6.1-hadoop2.3.0.jar, was simply not being loaded, so I tried passing it explicitly with --jars or --driver-class-path, as sketched below.
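A minimal sketch of what was tried (application jar path hypothetical); neither option helped:

    # explicitly ship the assembly jar to the cluster (did not fix the problem)
    spark-submit --class Regex --master yarn-client \
      --jars /home/xxx/spark-1.6.1/lib/spark-assembly-1.6.1-hadoop2.3.0.jar \
      /path/to/regex-app.jar

Then I noticed this line in the client launch log: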
16/12/12 16:47:54 INFO yarn.Client: Uploading resource file:/home/xxx/spark-1.6.1/lib/spark-assembly-1.6.1-hadoop2.3.0.jar -> hdfs://ns1/user/xxx/.sparkStaging/application_1480324693568_0053/spark-assembly-1.6.1-hadoop2.3.0.jar
which shows the assembly jar had already been uploaded to the YARN staging directory. I next suspected that the class was missing from the jar, but browsing the jar in Eclipse confirmed the class is in there.
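The same check works without an IDE; a quick command-line version, using the jar path from the log above:

    # confirm ExecutorLauncher is packaged in the assembly
    jar tf /home/xxx/spark-1.6.1/lib/spark-assembly-1.6.1-hadoop2.3.0.jar | grep ExecutorLauncher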
Coincidentally, the job did succeed on two nodes. It turned out that the NodeManagers on those two nodes were running on JDK 1.7, while the NodeManagers on all the failing nodes were running on JDK 1.6. That pointed to the JDK version as the cause, so I tried spark-1.4.0-bin-hadoop2.3 in the same environment instead.
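One way to check which JVM a NodeManager is running on (the grep pattern and java path are illustrative):

    # find the NodeManager process and note the java binary it was launched with
    ps -ef | grep -i nodemanager | grep -v grep
    # then ask that binary for its version (path hypothetical, taken from the ps output)
    /usr/java/jdk1.6.0_45/bin/java -version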
It finally succeeded.
So it was a version problem after all: Spark 1.5 dropped Java 6 support, so the Spark 1.6 assembly requires Java 7, while Spark 1.4 still runs on Java 6. On top of that, large assembly jars built with JDK 7 reportedly use the zip64 format, which Java 6's classloader cannot read, which would explain why a JDK 1.6 NodeManager throws ClassNotFoundException for ExecutorLauncher even though the class is in the jar.
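A quick way to test the zip64 hypothesis, assuming both JDKs are installed (paths hypothetical), is to list the assembly with each JDK's jar tool:

    # if the jar uses zip64, the Java 6 jar tool should fail to read it
    /usr/java/jdk1.6.0_45/bin/jar tf spark-assembly-1.6.1-hadoop2.3.0.jar > /dev/null && echo "readable by Java 6"
    /usr/java/jdk1.7.0_80/bin/jar tf spark-assembly-1.6.1-hadoop2.3.0.jar > /dev/null && echo "readable by Java 7"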