Common Hive Problems

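1. Out of memory

Virtual memory limit exceeded:

Current usage: 1.1gb of 2.0gb physical memory used; 4.6gb of 4.2gb virtual memory used. Killing container. (i.e., the virtual memory limit was exceeded)

Method 1: raise yarn.nodemanager.vmem-pmem-ratio to 5 or higher. [Recommended]

Method 2: set yarn.nodemanager.vmem-check-enabled = false to disable the virtual memory check. [Not recommended]

Method 3: allocate more physical memory, which raises the virtual memory limit proportionally. This works but is not the best option.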
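
The 4.2gb limit in the log message is the container's physical memory multiplied by yarn.nodemanager.vmem-pmem-ratio, whose YARN default is 2.1 (2.0 GB × 2.1 = 4.2 GB). A minimal sketch of that arithmetic follows; vmem_limit_gb is a hypothetical helper written for illustration, not part of YARN.

```shell
# Sketch: virtual memory limit = container physical memory x vmem-pmem-ratio.
# vmem_limit_gb is a hypothetical helper, not a YARN command.
vmem_limit_gb() {
  # $1 = container physical memory in GB, $2 = yarn.nodemanager.vmem-pmem-ratio
  awk -v pmem="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", pmem * ratio }'
}

vmem_limit_gb 2.0 2.1   # limit with the default ratio, as in the log message
vmem_limit_gb 2.0 5     # limit after raising the ratio to 5
```

With a ratio of 5, the same 2 GB container is allowed 10 GB of virtual memory, so the 4.6 GB usage from the log would no longer trigger the kill.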

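Physical memory limit exceeded:

Current usage: 2.1gb of 2.0gb physical memory used; 3.6gb of 4.2gb virtual memory used. Killing container. (i.e., the physical memory limit was exceeded)

Method 1: set mapreduce.map.memory.mb to 3 GB or more (mapreduce.reduce.memory.mb for reduce tasks), then test how much memory the map/reduce task actually needs, raising the value until the error no longer occurs.

Method 2: raise yarn.scheduler.minimum-allocation-mb to 3 GB or more, following the same approach. [Not recommended]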
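
If the cluster sets the task JVM heap explicitly through mapreduce.map.java.opts, it usually has to be raised together with the container size, otherwise the task cannot use the extra memory. The sketch below prints a matching pair of session settings. The 80% heap-to-container ratio is a common rule of thumb and an assumption here, not something the original states, and print_map_memory_settings is a hypothetical helper.

```shell
# Sketch: print Hive session settings for a given map container size (in MB).
# Assumption: JVM heap = 80% of the container, leaving room for non-heap memory.
print_map_memory_settings() {
  container_mb="$1"
  heap_mb=$(( container_mb * 80 / 100 ))   # integer arithmetic, rounds down
  echo "set mapreduce.map.memory.mb=${container_mb};"
  echo "set mapreduce.map.java.opts=-Xmx${heap_mb}m;"
}

print_map_memory_settings 3072
```

The printed statements can be pasted into a Hive session before the failing query; the same pattern applies to mapreduce.reduce.memory.mb and mapreduce.reduce.java.opts.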

Error when starting an older version of Hive:

ls: cannot access /app/local/spark-2.0.2-bin-hadoop2.6/lib/spark-assembly-*.jar: No such file or directory

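Edit the hive launcher script:

vim /app/local/hive/bin/hive
Find the block below and point it at the jars directory instead of the missing assembly jar:
# add Spark assembly jar to the classpath
if [[ -n "${SPARK_HOME}" ]]
then
  # Spark 2.x no longer ships lib/spark-assembly-*.jar, so the original line fails:
  # sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
  sparkAssemblyPath=`ls ${SPARK_HOME}/jars/*.jar`
  CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
fi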

2. Join queries

2018-11-25 14:43:04,199 main ERROR Unable to invoke factory method in class class org.apache.hadoop.hive.ql.log element HushableMutableRandomAccess. java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:132)
    at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.
    at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration
    at org.apache.logging.log4j.core.appender.routing.RoutingAppender.createAppender(RoutingAppender.java:2
    at org.apache.logging.log4j.core.appender.routing.RoutingAppender.getControl(RoutingAppender.java:255)
    at org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:225)
    at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:156)
    at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:129)
    at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.ja
    at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
    at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:448)
    at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:433)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:417)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:403)
    at org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabili
    at org.apache.logging.log4j.core.Logger.logMessage(Logger.java:146)
    at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2091)
    at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:1993)
    at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1852)
    at org.apache.logging.slf4j.Log4jLogger.info(Log4jLogger.java:179)
    at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.<init>(MapJoinMemoryExhaustion
    at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:129)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:358)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:546)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:498)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:368)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:514)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:418)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:393)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:774)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:313)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:227)
Caused by: java.lang.IllegalStateException: ManagerFactory [org.apache.logging.log4j.core.appender.RandomAccessctory@6dac64ea] unable to create manager for [/var/log/hive/operation_logs/5396b439-4945-483d-b8eb-b5c478e6fbb5ae-9f97-23f95080e4be] with data [org.apache.logging.log4j.core.appender.RandomAccessFileManager$FactoryData@5de
    at org.apache.logging.log4j.core.appender.AbstractManager.getManager(AbstractManager.java:114)
    at org.apache.logging.log4j.core.appender.OutputStreamManager.getManager(OutputStreamManager.java:114)
    at org.apache.logging.log4j.core.appender.RandomAccessFileManager.getFileManager(RandomAccessFileManage
    at org.apache.hadoop.hive.ql.log.HushableRandomAccessFileAppender.createAppender(HushableRandomAccessFi
    ... 40 more
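  • Cause: MapReduce treated the small table as the large one and the large table as the small one, so it tried to load the large table into memory, which made the job fail.
  • Fix:
    set hive.execution.engine=mr;
    set hive.mapjoin.smalltable.filesize=55000000;
    -- disable automatic map join so that no table is loaded into memory
    set hive.auto.convert.join = false;

    In most cases, disabling automatic map join is enough on its own: set hive.auto.convert.join = false;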

3. Hive on Spark problems

Job aborted due to stage failure: Aborting TaskSet 2.0 because task 8 (partition 8) cannot run anywhere due to node and executor blacklist. Blacklisting behavior can be configured via spark.blacklist.*.

Temporary workaround (fall back to the MapReduce engine):

set hive.execution.engine = mr;

4. Hive transactional tables

Statement executed:

delete from hm2.history_helper_back where starttime = '2019-06-12';

Error message:

FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.

Fix:

set hive.support.concurrency = true;
set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
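
These settings only take effect for the current session, and they only help if the target table is itself transactional. A sketch of a script that applies them together with the DELETE, assuming hm2.history_helper_back was created as an ORC table with the table property 'transactional'='true' (an assumption, since the original does not show the table definition); acid_delete.hql is a hypothetical file name.

```shell
# Sketch: put the session settings and the DELETE into one script file
# so they run in the same Hive session.
cat > acid_delete.hql <<'EOF'
set hive.support.concurrency = true;
set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
delete from hm2.history_helper_back where starttime = '2019-06-12';
EOF

# Run it with the Hive CLI (requires a working Hive client, so it is
# left commented out here):
# hive -f acid_delete.hql
```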

5. Repairing a large number of partitions

hive> MSCK REPAIR TABLE employee;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

Set:

set hive.msck.path.validation=ignore;

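6. HiveServer2 does not recognize a UDF

On the HiveServer2 instance where the UDF is unavailable, run the reload function command to sync the newly added UDF metadata from the MetaStore into HiveServer2's memory.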

7. Dynamic partitions

Caused by: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to 100 partitions per node, number of dynamic partitions on this node: 101
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:941)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:712)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:147)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:487)
    ... 9 more

Raise the maximum number of dynamic partitions, both in total and per node:

set hive.exec.max.dynamic.partitions=3000;
set hive.exec.max.dynamic.partitions.pernode=1000;

8. Missing blocks

Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-808991319-10.1.0.62-1541662386662:blk_1110742285_40900ile=/user/hive/warehouse/hm4.db/hm4_format_log_his_tmp/dt=2019-09-16/hour=11/product=mini/event=click/part-r-00013_copy_1

The block is corrupted and cannot be recovered, so the affected file has to be deleted.
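
One way to locate and remove the damaged files is hdfs fsck. The sketch below only prints the commands so they can be reviewed before anyone runs them; print_fsck_plan is a hypothetical helper, and the path is the table directory taken from the error message.

```shell
# Sketch: print (not run) the hdfs fsck commands for a given HDFS path.
print_fsck_plan() {
  hdfs_path="$1"
  # Step 1: list the files under the path that have missing or corrupt blocks.
  echo "hdfs fsck ${hdfs_path} -list-corruptfileblocks"
  # Step 2: delete the corrupted files; their data is lost for good.
  echo "hdfs fsck ${hdfs_path} -delete"
}

print_fsck_plan /user/hive/warehouse/hm4.db/hm4_format_log_his_tmp
```

Running the first printed command before the second shows exactly which files would be removed.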

Author: 柯廣的網絡日志

WeChat official account: Java大數據與數據倉庫