Handling the OOM error: GC overhead limit exceeded

This post describes the "GC overhead limit exceeded" error hit while processing a large dataset with Spark, and records how it was resolved by adjusting spark.driver.memory. It also discusses how spark.sql.shuffle.partitions affects the number of output files.


*Keywords: OOM: GC overhead limit exceeded, spark.driver.memory, spark.sql.shuffle.partitions*
Error symptom:
The job failed with the following error under various resource settings:
18/07/26 17:02:03 INFO spark.ContextCleaner: Cleaned accumulator 18
Exception in thread "broadcast-hash-join-1" 18/07/26 17:10:59 WARN nio.NioEventLoop: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: GC overhead limit exceeded

18/07/26 15:11:04 INFO spark.ContextCleaner: Cleaned accumulator 18
Exception in thread "broadcast-hash-join-1" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.spark.sql.catalyst.expressions.UnsafeRow.copy(UnsafeRow.java:537)
    at org.apache.spark.sql.execution.joins.UnsafeHashedRelation$.apply(HashedRelation.scala:403)
    at org.apache.spark.sql.execution.joins.HashedRelation$.apply(HashedRelation.scala:128)
    at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1$$anonfun$apply$1.apply(BroadcastHashOuterJoin.scala:92)
    at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1$$anonfun$apply$1.apply(BroadcastHashOuterJoin.scala:82)
    at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:90)
    at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1.apply(BroadcastHashOuterJoin.scala:82)
    at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1.apply(BroadcastHashOuterJoin.scala:82)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

A search online turned up two common suggestions, both for yarn/standalone deployments: increase executor.memory, or increase driver.memory (most posts recommend raising it from 512m or 1g to 2g). On the cluster where the problem occurred, however, the job already specified driver.memory as 3G, which is the maximum the cluster allowed, and the error persisted; repeatedly adjusting the SQL and other settings did not solve it either, so we asked for the cluster's driver.memory limit to be raised.
The cluster parameter (spark.driver.memory, set in SPARK_HOME/conf/spark-defaults.conf) was then raised to 5g; after that, specifying driver.memory as 5g when launching the job with spark-submit let it run through successfully.
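
For reference, here is a minimal sketch of the two places the value ended up being set. The --driver-memory flag and the spark-defaults.conf key are standard Spark mechanisms; the application class, jar name, and executor sizing below are hypothetical and only for illustration:

```bash
# 1) Cluster-wide default, set in $SPARK_HOME/conf/spark-defaults.conf:
#      spark.driver.memory    5g

# 2) Explicit request at submit time (class, jar, and executor sizing are made up):
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 5g \
  --executor-memory 4g \
  --num-executors 10 \
  --executor-cores 2 \
  --class com.example.MyEtlJob \
  my-etl-job.jar
```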

Separately, while running the job we noticed that it wrote out quite a large number of files. At submit time, spark.sql.shuffle.partitions and spark.default.parallelism were set to 1 to 3 times the product of
spark.executor.cores and spark.executor.instances. After repeated tests, the number of output files turned out to be directly related to spark.sql.shuffle.partitions: a single write to Hive generally produces about that many files (plus one or two), so lowering this parameter reduces the file count. I have not yet figured out exactly why; explanations from passing experts are welcome. A sketch of the setting is shown below.
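
Here is a minimal Scala sketch of how the parameter was exercised (Spark 1.6-era HiveContext API; the database, table, and column names are made up). My tentative reading, not confirmed in the original post, is that each reduce task of the final shuffle writes its own output file, which is why the file count tracks spark.sql.shuffle.partitions:

```scala
// Sketch: the join below shuffles into 64 partitions, and the write to Hive
// produces roughly one file per (non-empty) shuffle partition, i.e. about 64 files.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object ShufflePartitionsDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("shuffle-partitions-demo"))
    val hc = new HiveContext(sc)

    // Number of reduce tasks used by Spark SQL joins and aggregations
    hc.setConf("spark.sql.shuffle.partitions", "64")

    val joined = hc.sql(
      """SELECT a.id, a.name, b.amount
        |FROM   src_db.table_a a
        |LEFT OUTER JOIN src_db.table_b b ON a.id = b.id""".stripMargin)

    // Writes roughly 64 files, one per reduce task; calling joined.coalesce(n)
    // before the write is another way to force fewer, larger files.
    joined.write.mode("overwrite").saveAsTable("result_table")
  }
}
```

With 64 shuffle partitions the insert above lands as roughly 64 files; dropping the setting, or coalescing before the write, shrinks that number accordingly.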
