Spark on YARN: reading HDFS fails with connection refused (AnnotatedConnectException: Connection refused: localhost/127.0.0.1:53056)

This post describes a connection-refused error when reading HDFS while running Spark on YARN. The author traces the likely cause to the cluster nodes being unable to resolve the Driver's address, and proposes adding a mapping for the Driver to the servers' hosts configuration as a fix. Because of environment constraints (cloud servers on different LANs, with no public IP), the author instead submits the job from the command line rather than submitting remotely. Readers with a better solution are welcome to share it in the comments.
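The mechanics of the error can be reproduced outside Spark: when an executor tries to call back to the Driver at an address that wrongly resolves to localhost, it connects to a port on its own machine where nothing is listening. A minimal sketch with plain Python sockets (the port is picked at random here, standing in for Spark's actual 53056):

```python
import socket
import errno

# Grab a port that is currently free, then release it so nothing is listening there.
probe = socket.socket()
probe.bind(("127.0.0.1", 0))
port = probe.getsockname()[1]
probe.close()

# Connecting to a local port with no listener fails the same way the executor does
# when the Driver's address resolves to localhost:
# AnnotatedConnectException: Connection refused: localhost/127.0.0.1:<port>
s = socket.socket()
result = s.connect_ex(("127.0.0.1", port))
s.close()

print(errno.errorcode[result])  # ECONNREFUSED
```

This is why the fix is a name-resolution one: the executors must resolve the Driver's hostname to the Driver's real IP, not to their own loopback address.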


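The proposed fix is to let every cluster node resolve the Driver's hostname. A sketch of what that looks like (the IP and hostname below are placeholders for the actual Driver machine, not values from the post):

```text
# /etc/hosts on every YARN node
192.168.1.100   driver-host

# Alternatively, in spark-defaults.conf or via --conf, pin the address
# the Driver advertises to the cluster:
# spark.driver.host   driver-host
```

In the author's setup this was not possible, because the Driver machine and the cluster sat on different LANs with no public IP to map.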

Code:

package dsy.read_hdfs

import org.apache.spark.SparkConf
import org.apache.spark.sql.{DataFrame, SparkSession}

object read_hdfs {
  def main(args: Array[String]): Unit = {
    System.setProperty("HADOOP_USER_NAME", "root")
    System.setProperty("user.name", "root")
    // Sanity check: make sure the YARN cluster manager is on the classpath
    this.getClass.getClassLoader.loadClass("org.apache.spark.scheduler.cluster.YarnClusterManager")

    val spark: SparkSession = {
      val conf: SparkConf = new SparkConf()
        // submit in yarn-client mode
        .setMaster("yarn")
        // application name
        .set("spark.app.name", this.getClass.getSimpleName.stripSuffix("$"))
        // ResourceManager hostname
        .set("yarn.resourcemanager.hostname", "dsy")
        // number of executors (note: the key is spark.executor.instances, not ...instance)
        .set("spark.executor.instances", "2")
        // executor memory
        .set("spark.executor.memory", "1g")
      SparkSession.builder().config(conf).getOrCreate()
    }

    // read a text file from HDFS (example path; adjust to your cluster)
    val df: DataFrame = spark.read.text("hdfs://dsy:8020/test.txt")
    df.show()

    spark.stop()
  }
}
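With remote submission ruled out, the job was packaged and submitted from the cluster's own command line instead. A hedged sketch of such a submit command (the jar name, main class, and path are assumptions, not taken from the post):

```text
spark-submit \
  --master yarn \
  --deploy-mode client \
  --name read_hdfs \
  --num-executors 2 \
  --class dsy.read_hdfs.read_hdfs \
  /path/to/read_hdfs.jar
```

Submitting from a cluster node sidesteps the resolution problem entirely: the Driver then runs on a machine every executor can already reach by hostname.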
Related question: PySpark on YARN, in a conda virtual environment, fails to load the MySQL JDBC driver.

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("DriverCheck") \
    .config("spark.jars", "file:///export/server/spark/jars/mysql-connector-java-8.0.28.jar") \
    .getOrCreate()

# Explicitly load the driver class
jvm = spark._jvm
try:
    driver_class = jvm.org.apache.spark.util.Utils.classForName("com.mysql.cj.jdbc.Driver")
    print("✅ Driver loaded:", driver_class)
except Exception as e:
    print("❌ Driver load failed:", str(e))

Output:

Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
❌ Driver load failed: An error occurred while calling z:org.apache.spark.util.Utils.classForName. Trace:
py4j.Py4JException: Method classForName([class java.lang.String]) does not exist
	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:321)
	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
	at py4j.Gateway.invoke(Gateway.java:276)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:184)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:108)
	at java.base/java.lang.Thread.run(Thread.java:842)

spark-defaults.conf entries tried:

# Force-load the driver on both Driver and Executor
spark.driver.extraClassPath /export/server/anaconda3/envs/pyspark/lib/python3.12/site-packages/pyspark/jars/mysql-connector-java-8.0.28.jar
spark.executor.extraClassPath /export/server/anaconda3/envs/pyspark/lib/python3.12/site-packages/pyspark/jars/mysql-connector-java-8.0.28.jar
# Local driver jar path (must be identical on all nodes)
spark.jars file:///export/server/spark/jars/mysql-connector-java-8.0.28.jar
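The same classpath settings can also be passed per job at submit time rather than in spark-defaults.conf; this is a sketch using the jar path from the question above (the script name is a placeholder):

```text
spark-submit \
  --master yarn \
  --jars file:///export/server/spark/jars/mysql-connector-java-8.0.28.jar \
  --driver-class-path /export/server/spark/jars/mysql-connector-java-8.0.28.jar \
  your_job.py
```

Note the py4j error itself is a separate issue: it means the JVM-side `Utils.classForName` could not be invoked with a single String argument, so that reflection check does not prove anything about whether the jar is on the classpath.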