Error 1:
[ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:0.0.22:install-node-and-npm (install node and npm) on project tez-ui: Could not download npm: Could not download http://registry.npmjs.org/npm/-/npm-1.3.8.tgz: Unknown host registry.npmjs.org: Name or service not known -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :tez-ui
Fix: https://issues.apache.org/jira/browse/TEZ-2229
# cd tez-ui/
Then edit tez-ui/pom.xml and add the '--allow-root' argument:
<configuration>
  <workingDirectory>${webappDir}</workingDirectory>
  <executable>${node.executable}</executable>
  <arguments>
    <argument>node_modules/bower/bin/bower</argument>
    <argument>install</argument>
    <argument>--remove-unnecessary-resolutions=false</argument>
    <argument>--allow-root</argument>
  </arguments>
</configuration>
# yum -y install gcc make gcc-c++ openssl
# wget http://nodejs.org/dist/v0.10.26/node-v0.10.26.tar.gz
# tar -zxvf node-v0.10.26.tar.gz
# cd node-v0.10.26
# ./configure
# make && make install
# ls apache-tez-0.7.0-src/tez-dist/target/
archive-tmp  maven-archiver  tez-0.7.0  tez-0.7.0-minimal.tar.gz  tez-0.7.0.tar.gz  tez-dist-0.7.0-tests.jar
# hadoop version
Hadoop 2.4.0
Subversion Unknown -r Unknown
Compiled by root on 2014-07-27T04:58Z
Compiled with protoc 2.5.0
From source with checksum 375b2832a6641759c6eaf6e3e998147
This command was run using /usr/local/hadoop-2.4.0/share/hadoop/common/hadoop-common-2.4.0.jar
# hadoop jar tez-examples-0.7.0.jar orderedwordcount /apps/tez-0.7.0/conf/tez-site.xml /output
INFO client.TezClientUtils: Using tez.lib.uris value from configuration: hdfs://mycluster/apps/tez-0.7.0,hdfs://mycluster/apps/tez-0.7.0/lib/
INFO client.DAGClientImpl: DAG initialized: CurrentState=Running
Error:
15/07/18 02:17:38 INFO examples.OrderedWordCount: DAG diagnostics: [Application application_1437156593890_0002 failed 2 times due to AM Container for appattempt_1437156593890_0002_000002 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Container exited with a non-zero exit code 1
.Failing this attempt.. Failing the application.]
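The DAG diagnostics above only say that the AM container exited with code 1; the underlying cause is often easier to find in the container's own logs. A minimal sketch for fetching them, assuming YARN log aggregation is enabled on the cluster (otherwise the logs remain in the NodeManager's local log directory). The helper name fetch_am_logs and the YARN_BIN override are hypothetical and exist only for illustration:

```shell
# Fetch the aggregated container logs for a failed application.
# Assumption: yarn.log-aggregation-enable is true on the cluster.
# YARN_BIN is a hypothetical override so the command can be dry-run with echo.
fetch_am_logs() {
  app_id="$1"
  "${YARN_BIN:-yarn}" logs -applicationId "$app_id"
}

# Example (application id taken from the diagnostics above):
#   fetch_am_logs application_1437156593890_0002 | less
```

Setting YARN_BIN=echo prints the arguments that would be passed instead of contacting the cluster, which is handy for checking the application id first.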
7. Hive on Tez
Hive version note: hive-0.14+
Testing with hive-0.13.1 and hadoop 2.4 failed.
Change Hive's execution engine to Tez:
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
hive (default)> set hive.execution.engine;
hive.execution.engine=mr
hive (default)> set hive.execution.engine=tez;
hive (default)> select count(*) from order_test;
Total jobs = 1
Launching Job 1 out of 1
Exception in thread "Thread-6" java.lang.NoSuchMethodError: org.apache.tez.mapreduce.hadoop.MRHelpers.updateEnvironmentForMRAM(Lorg/apache/hadoop/conf/Configuration;Ljava/util/Map;)V
Logging initialized using configuration in file:/home/hsu/apache-hive-1.2.1-bin/conf/hive-log4j.properties
hive (default)> show tables;
OK
tab_name
test
Time taken: 2.33 seconds, Fetched: 1 row(s)
hive (default)> select * from test;
OK
test.id
1
2
3
Time taken: 0.648 seconds, Fetched: 3 row(s)
hive (default)> set hive.execution.engine;
hive.execution.engine=tez
hive (default)> select count(*) from test;
Query ID = hsu_20150813161742_a64d46a7-7349-41b3-be27-9a91bb490a8b
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1439448136822_0002)
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 51.17 s
--------------------------------------------------------------------------------
OK
_c0
3
Time taken: 54.354 seconds, Fetched: 1 row(s)
# Selecting a YARN queue
set tez.queue.name=eda;
hive (default)> set tez.queue.name=test;
hive (default)> select count(*) from test;
Query ID = hsu_20150813162458_b0ce5d5f-09d2-40ae-8b6a-90f1f4cebe91
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1439448136822_0003)
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 32.71 s
--------------------------------------------------------------------------------
OK
_c0
3
Time taken: 84.191 seconds, Fetched: 1 row(s)
hive (default)> set hive.execution.engine;
hive.execution.engine=tez
hive (default)> set hive.execution.engine=mr;
hive (default)> select count(*) from test;
Query ID = hsu_20150813162723_31559eda-c8ab-4ca6-a445-b75afc908d28
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1439448136822_0004, Tracking URL = http://server1:23188/proxy/application_1439448136822_0004/
Kill Command = /home/hsu/hadoop/bin/hadoop job -kill job_1439448136822_0004
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2015-08-13 16:28:33,107 Stage-1 map = 0%, reduce = 0%
2015-08-13 16:29:20,688 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.99 sec
2015-08-13 16:29:49,215 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.27 sec
MapReduce Total cumulative CPU time: 2 seconds 270 msec
Ended Job = job_1439448136822_0004
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1  Cumulative CPU: 2.27 sec  HDFS Read: 210 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 270 msec
OK
_c0
3
Time taken: 148.332 seconds, Fetched: 1 row(s)
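The transcripts above switch engines interactively with set hive.execution.engine=...; the same comparison can be scripted. A minimal sketch, assuming the hive CLI is on the PATH. It uses the CLI's --hiveconf and -e options; the function name run_with_engine and the HIVE_BIN override are hypothetical and only there for illustration:

```shell
# Run one query under a given execution engine without entering the CLI.
# --hiveconf sets a configuration property for this invocation only;
# -e executes the quoted SQL and exits.
# HIVE_BIN is a hypothetical override (e.g. HIVE_BIN=echo for a dry run).
run_with_engine() {
  engine="$1"
  query="$2"
  "${HIVE_BIN:-hive}" --hiveconf hive.execution.engine="$engine" -e "$query"
}

# Example: time the same query under both engines.
#   time run_with_engine tez 'select count(*) from test;'
#   time run_with_engine mr  'select count(*) from test;'
```

Because each invocation starts a fresh session, the timings include application startup, so they are not directly comparable with the second and later queries of a long-lived CLI session.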
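With Hive on Tez, configuring Hive to use Tez really means launching one application on YARN. That application keeps running and accepts successive SQL statements; it ends only when you exit the Hive CLI.
Job execution can be inspected through the UI that Tez provides, but the Tez UI has to be configured separately.
A running Tez application on YARN looks much like Spark on YARN, and the underlying principles of Tez and Spark are also quite similar.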
[hsu@server1 ~]$ hive
Logging initialized using configuration in file:/home/hsu/apache-hive-1.2.1-bin/conf/hive-log4j.properties
hive (default)> show tables;
OK
tab_name
test
Time taken: 1.106 seconds, Fetched: 1 row(s)
hive (default)> select count(1) from test;
Query ID = hsu_20150813170521_77b18935-4f38-4f84-8da7-6774b9aff194
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1439456630574_0001, Tracking URL = http://itr-mastertest01:23188/proxy/application_1439456630574_0001/
Kill Command = /home/hsu/hadoop/bin/hadoop job -kill job_1439456630574_0001
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2015-08-13 17:06:36,372 Stage-1 map = 0%, reduce = 0%
2015-08-13 17:07:26,923 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1439456630574_0001
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
_c0
3
Time taken: 126.726 seconds, Fetched: 1 row(s)
<property>
  <description>Enable Tez to use the Timeline Server for History Logging</description>
  <name>tez.history.logging.service.class</name>
  <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
<property>
  <description>URL for where the Tez UI is hosted</description>
  <!-- Tomcat on port 9999 -->
  <name>tez.tez-ui.history-url.base</name>
  <value>http://server2:9999/tez-ui/</value>
</property>
<property>
  <description>Publish configuration information to Timeline server.</description>
  <name>tez.runtime.convert.user-payload.to.history-text</name>
  <value>true</value>
</property>
<property>
  <description>Indicate to clients whether Timeline service is enabled or not. If enabled, the TimelineClient library used by end-users will post entities and events to the Timeline server.</description>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>
<property>
  <description>The hostname of the Timeline service web application.</description>
  <name>yarn.timeline-service.hostname</name>
  <value>server2</value>
</property>
<property>
  <description>Enables cross-origin support (CORS) for web services where cross-origin web response headers are needed. For example, javascript making a web services request to the timeline server.</description>
  <name>yarn.timeline-service.http-cross-origin.enabled</name>
  <value>true</value>
</property>
<property>
  <description>Publish YARN information to Timeline Server</description>
  <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
  <value>true</value>
</property>
<property>
  <description>The http address of the Timeline service web application.</description>
  <name>yarn.timeline-service.webapp.address</name>
  <value>${yarn.timeline-service.hostname}:8188</value>
</property>
<property>
  <description>The https address of the Timeline service web application.</description>
  <name>yarn.timeline-service.webapp.https.address</name>
  <value>${yarn.timeline-service.hostname}:2191</value>
</property>
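Before pointing the Tez UI at the Timeline Server, it can help to confirm that the server answers and that Tez is publishing history to it. A minimal sketch, assuming curl is installed, that the Timeline Server exposes its v1 REST API under /ws/v1/timeline/, and that Tez DAGs are stored under the entity type TEZ_DAG_ID. The function names ats_dag_url and check_ats are hypothetical:

```shell
# Build the Timeline Server REST URL that lists recent Tez DAG entities.
# Assumptions: ATS v1 REST path /ws/v1/timeline/<entityType>, and that Tez
# publishes DAGs under the entity type TEZ_DAG_ID.
ats_dag_url() {
  host="$1"
  port="${2:-8188}"   # matches yarn.timeline-service.webapp.address above
  limit="${3:-5}"
  echo "http://${host}:${port}/ws/v1/timeline/TEZ_DAG_ID?limit=${limit}"
}

# Query the Timeline Server; -f makes curl return non-zero on HTTP errors.
check_ats() {
  curl -sf "$(ats_dag_url "$@")"
}

# Example (hostname from the configuration above):
#   check_ats server2
```

If the call fails or returns no entities after a Tez query has run, the history logging class in tez-site.xml and the Timeline Server settings in yarn-site.xml are the first things to recheck.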
[hsu@server2 ~]$ tomcat/bin/startup.sh
Using CATALINA_BASE:   /home/hsu/tomcat
Using CATALINA_HOME:   /home/hsu/tomcat
Using CATALINA_TMPDIR: /home/hsu/tomcat/temp
Using JRE_HOME:        /usr/local/jdk1.7.0_45
Using CLASSPATH:       /home/hsu/tomcat/bin/bootstrap.jar:/home/hsu/tomcat/bin/tomcat-juli.jar
Tomcat started.
# Change the port to 9999
[hsu@server2 ~]$ tomcat/bin/shutdown.sh
Using CATALINA_BASE:   /home/hsu/tomcat
Using CATALINA_HOME:   /home/hsu/tomcat
Using CATALINA_TMPDIR: /home/hsu/tomcat/temp
Using JRE_HOME:        /usr/local/jdk1.7.0_45
Using CLASSPATH:       /home/hsu/tomcat/bin/bootstrap.jar:/home/hsu/tomcat/bin/tomcat-juli.jar
[hsu@server2 ~]$ tomcat/bin/startup.sh
[hsu@server2 ~]$ lsof -i :9999
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    7824  hsu   43u  IPv6  75557      0t0  TCP *:distinct (LISTEN)
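1. Integrating Hive 2.0.1 with Tez 0.7.1 left some classes unloaded; the errors reported that the classes could not be found.
2. After downgrading Hive to 1.2.1 there were no problems at all, and Tez jobs ran normally.
3. According to the community, Hive 2.0.1 depends on Tez 0.8.3. I plan to rebuild by hand and test whether that combination works; this is still to be verified.
4. This shows that interface compatibility between Tez versions is very poor, which makes upgrades costly and the setup hard to maintain.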
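Given how version-sensitive the pairing is, it may be worth checking which Tez version a Hive release was built against before installing. A minimal sketch, assuming the Hive source tree's root pom.xml declares the version on one line as a tez.version property; the function names tez_version_in_pom and check_tez_match are hypothetical:

```shell
# Print the value of the <tez.version> property from a Maven pom.xml.
# Assumption: the property sits on a single line, e.g.
#   <tez.version>0.8.3</tez.version>
tez_version_in_pom() {
  sed -n 's:.*<tez\.version>\([^<]*\)</tez\.version>.*:\1:p' "$1" | head -n 1
}

# Compare the version Hive was built against with the installed Tez version.
# Returns non-zero on a mismatch so it can gate a deployment script.
check_tez_match() {
  expected="$(tez_version_in_pom "$1")"
  installed="$2"
  if [ "$expected" = "$installed" ]; then
    echo "OK: Hive expects Tez $expected"
  else
    echo "MISMATCH: Hive expects Tez ${expected:-unknown}, installed $installed"
    return 1
  fi
}

# Example (paths are placeholders):
#   check_tez_match /path/to/hive-src/pom.xml 0.7.1
```

A matching version does not guarantee that the two work together, but a mismatch is a cheap early warning for the class-not-found errors described in point 1.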