HQL error: cannot be cast to org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerDirectAccess

This post covers a ClassCastException hit when a Hive job is run on an Azkaban schedule: a MapJoin fails because its hashtable is empty. Setting hive.mapjoin.optimized.hashtable=false works around the problem; the full error stack is included below.


Background:

A newly written Azkaban flow runs fine when triggered manually, but the scheduled run fails with the error below:

11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - Job failed with java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper cannot be cast to org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerDirectAccess
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - 2022-05-11 06:13:14,838 ERROR [cc03d3bf-accd-4779-b30b-613ec97f72f8 main] status.SparkJobMonitor (SessionState.java:printError(1250)) - Job failed with java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper cannot be cast to org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerDirectAccess
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - java.util.concurrent.ExecutionException: Exception thrown by job
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO -     at org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:282)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO -     at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:287)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO -     at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO -     at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:343)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO -     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO -     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO -     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO -     at java.lang.Thread.run(Thread.java:748)

Running the HQL by hand always returns results.

Root cause:

The exception is triggered when the MapJoin hashtable, i.e. the small-table side of the join, is empty. That would also explain why the manual reruns succeed: by the time the job is rerun by hand, the data feeding the small side has likely landed, so the hashtable is no longer empty.

Official explanation:

With hive.mapjoin.optimized.hashtable=true (the default; the setting only takes effect on Tez and Spark), the optimized MapJoin path casts the loaded small-table container to MapJoinTableContainerDirectAccess. An empty small table ends up loaded as a plain HashMapWrapper, which does not implement that interface, hence the ClassCastException.
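For illustration, here is a minimal sketch of the failing shape; the table names, the ${bizdate} placeholder, and the partition layout are invented for the example, not taken from the actual job:

```sql
-- Hypothetical query shape (all names invented for illustration).
-- With hive.auto.convert.join=true (the default), Hive converts this join
-- into a MapJoin and builds a hashtable from the small side d. If the
-- dt='${bizdate}' partition holds no rows yet when the 06:13 schedule
-- fires, that hashtable is empty and the optimized-hashtable path fails
-- with the ClassCastException shown above.
SELECT u.user_id,
       d.dim_value
FROM   dwt_user_daily_active u
JOIN   (SELECT id, dim_value
        FROM   dim_snapshot
        WHERE  dt = '${bizdate}') d   -- zero rows => empty hashtable
ON     u.dim_id = d.id;
```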

Solution:

Add the following statement before the HQL executes:

set hive.mapjoin.optimized.hashtable=false;
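Because the failure only shows up under the schedule, the setting must live in the job's own script (or be passed as a Hive configuration) rather than in an interactive session. A minimal sketch, assuming the flow executes a script such as dwt_to_ads_bi_user_daily_active.hql (the file name is hypothetical):

```sql
-- Put the workaround at the top of the scheduled script so every run
-- picks it up. Setting it to false avoids the memory-optimized hashtable
-- container and falls back to the plain HashMapWrapper path, which
-- tolerates an empty build side.
set hive.mapjoin.optimized.hashtable=false;

-- ...the job's original INSERT/SELECT statements follow unchanged...
```

The same property can also be passed on the command line, e.g. `hive --hiveconf hive.mapjoin.optimized.hashtable=false -f dwt_to_ads_bi_user_daily_active.hql`. A heavier-handed alternative is `set hive.auto.convert.join=false;`, which stops Hive from converting joins to MapJoins at all, at the cost of slower shuffle joins.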
