Apache Phoenix (4.3.1 and 4.4.0-HBase-0.98) on Spark 1.3.1 ClassNotFoundException
I'm trying to connect to Phoenix via Spark and I keep getting the following exception when opening a connection via the JDBC driver (cut for brevity, full stacktrace below):
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
The class in question is provided by the jar called phoenix-core-4.3.1.jar (despite it being in the HBase package namespace, I guess they need it to integrate with HBase).
There are numerous questions on SO about ClassNotFoundExceptions on Spark and I've tried the fat-jar approach (both with Maven's assembly and shade plugins; I've inspected the jars, they do contain ClientRpcControllerFactory), and I've tried a lean jar while specifying the jars on the command line. For the latter, the command I used is as follows:
/opt/mapr/spark/spark-1.3.1/bin/spark-submit --jars lib/spark-streaming-kafka_2.10-1.3.1.jar,lib/kafka_2.10-0.8.1.1.jar,lib/zkclient-0.3.jar,lib/metrics-core-3.1.0.jar,lib/metrics-core-2.2.0.jar,lib/phoenix-core-4.3.1.jar --class nl.work.kafkastreamconsumer.phoenix.KafkaPhoenixConnector KafkaStreamConsumer.jar node1:5181 0 topic jdbc:phoenix:node1:5181 true
I've also done a classpath dump from within the code and the first classloader in the hierarchy already knows the Phoenix jar:
2015-06-04 10:52:34,323 [Executor task launch worker-1] INFO nl.work.kafkastreamconsumer.phoenix.LinePersister - [file:/home/work/projects/customer/KafkaStreamConsumer.jar, file:/home/work/projects/customer/lib/spark-streaming-kafka_2.10-1.3.1.jar, file:/home/work/projects/customer/lib/kafka_2.10-0.8.1.1.jar, file:/home/work/projects/customer/lib/zkclient-0.3.jar, file:/home/work/projects/customer/lib/metrics-core-3.1.0.jar, file:/home/work/projects/customer/lib/metrics-core-2.2.0.jar, file:/home/work/projects/customer/lib/phoenix-core-4.3.1.jar]
So the question is: What am I missing here? Why can't Spark load the correct class? There should be only one version of the class flying around (namely the one from phoenix-core), so I doubt it's a versioning conflict.
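One thing worth noting: the classpath dump only proves the jar URL is registered with a loader, not that the class is resolvable from the point where HBase's ReflectionUtils asks for it. A minimal sketch that probes the loader chain directly (ClassLoaderProbe is a hypothetical diagnostic helper, not part of the project):

```java
// Hypothetical diagnostic helper: checks whether a class can actually be
// resolved from a given classloader, which is a stronger test than merely
// dumping the classpath URLs.
public class ClassLoaderProbe {

    /**
     * Returns the string form of a loader that can resolve className
     * (parent delegation means the starting loader usually succeeds if
     * any loader in its chain can), or null if none can.
     */
    public static String findLoader(String className, ClassLoader start) {
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            try {
                // 'false' = probe only, do not run static initializers
                Class.forName(className, false, cl);
                return cl.toString();
            } catch (ClassNotFoundException e) {
                // not visible from this loader; try its parent
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ClassLoader here = ClassLoaderProbe.class.getClassLoader();
        // A JDK class is always resolvable via the delegation chain.
        System.out.println(findLoader("java.lang.String", here));
        // If this prints null inside an executor task, the Phoenix jar is
        // not visible to the loader that runs the task, regardless of what
        // the URL dump says.
        System.out.println(findLoader(
                "org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory", here));
    }
}
```

Calling findLoader from inside the map function, right before opening the JDBC connection, shows whether the task's context classloader can see the class at all.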
[Executor task launch worker-3] ERROR nl.work.kafkastreamconsumer.phoenix.LinePersister - Error while processing line
java.lang.RuntimeException: java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
at nl.work.kafkastreamconsumer.phoenix.PhoenixConnection.<init>(PhoenixConnection.java:41)
at nl.work.kafkastreamconsumer.phoenix.LinePersister$1.call(LinePersister.java:40)
at nl.work.kafkastreamconsumer.phoenix.LinePersister$1.call(LinePersister.java:32)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:999)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:813)
at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:813)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1498)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1498)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:362)
at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:133)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:282)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.access$300(ConnectionQueryServicesImpl.java:166)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$11.call(ConnectionQueryServicesImpl.java:1831)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$11.call(ConnectionQueryServicesImpl.java:1810)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1810)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:162)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:126)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:133)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:233)
at nl.work.kafkastreamconsumer.phoenix.PhoenixConnection.<init>(PhoenixConnection.java:39)
... 25 more
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:457)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:350)
at org.apache.phoenix.query.HConnectionFactory$HConnectionFactoryImpl.createConnection(HConnectionFactory.java:47)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:280)
... 36 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedConstructorAccessor8.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:455)
... 39 more
Caused by: java.lang.UnsupportedOperationException: Unable to find org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory
at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:36)
at org.apache.hadoop.hbase.ipc.RpcControllerFactory.instantiate(RpcControllerFactory.java:56)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:769)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:689)
... 43 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:32)
... 46 more
/edit
Unfortunately the issue remains with 4.4.0-HBase-0.98. Below are the classes in question. Since the saveToPhoenix() method is not yet available for the Java API and since this is just a POC, my idea was to simply use the JDBC driver for each mini-batch.
public class PhoenixConnection implements AutoCloseable, Serializable {
private static final long serialVersionUID = -4491057264383873689L;
private static final String PHOENIX_DRIVER = "org.apache.phoenix.jdbc.PhoenixDriver";
static {
try {
Class.forName(PHOENIX_DRIVER);
} catch (ClassNotFoundException e) {
throw new RuntimeException(e);
}
}
private Connection connection;
public PhoenixConnection(final String jdbcUri) {
try {
connection = DriverManager.getConnection(jdbcUri);
} catch (SQLException e) {
throw new RuntimeException(e);
}
}
public List<Map<String, Object>> executeQuery(final String sql) throws SQLException {
ArrayList<Map<String, Object>> resultList = new ArrayList<>();
try (PreparedStatement statement = connection.prepareStatement(sql); ResultSet resultSet = statement.executeQuery() ) {
ResultSetMetaData metaData = resultSet.getMetaData();
while (resultSet.next()) {
Map<String, Object> row = new HashMap<>(metaData.getColumnCount());
for (int column = 1; column <= metaData.getColumnCount(); ++column) { // JDBC column indexes are 1-based
final String columnLabel = metaData.getColumnLabel(column);
row.put(columnLabel, resultSet.getObject(columnLabel));
}
resultList.add(row); // without this, the row was built but never returned
}
}
resultList.trimToSize();
return resultList;
}
@Override
public void close() {
try {
connection.close();
} catch (SQLException e) {
throw new RuntimeException(e);
}
}
}
public class LinePersister implements Function<JavaRDD<String>, Void> {
private static final long serialVersionUID = -2529724617108874989L;
private static final Logger LOGGER = Logger.getLogger(LinePersister.class);
private static final String TABLE_NAME = "mail_events";
private final String jdbcUrl;
public LinePersister(String jdbcUrl) {
this.jdbcUrl = jdbcUrl;
}
@Override
public Void call(JavaRDD<String> dataSet) throws Exception {
LOGGER.info(String.format(
"Starting conversion on rdd with %d elements", dataSet.count()));
List<Void> collectResult = dataSet.map(new Function<String, Void>() {
private static final long serialVersionUID = -6651313541439109868L;
@Override
public Void call(String line) throws Exception {
LOGGER.info("Writing line " + line);
Event event = EventParser.parseLine(line);
try (PhoenixConnection connection = new PhoenixConnection(
jdbcUrl)) {
connection.executeQuery(event
.createUpsertStatement(TABLE_NAME));
} catch (Exception e) {
LOGGER.error("Error while processing line", e);
dumpClasspath(this.getClass().getClassLoader());
}
return null;
}
}).collect();
LOGGER.info(String.format("Got %d results: ", collectResult.size()));
return null;
}
public static void dumpClasspath(ClassLoader loader)
{
LOGGER.info("Classloader " + loader + ":");
if (loader instanceof URLClassLoader)
{
URLClassLoader ucl = (URLClassLoader)loader;
LOGGER.info(Arrays.toString(ucl.getURLs()));
}
else
LOGGER.error("cannot display components as not a URLClassLoader");
if (loader.getParent() != null)
dumpClasspath(loader.getParent());
}
}
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>nl.work</groupId>
<artifactId>KafkaStreamConsumer</artifactId>
<version>1.0</version>
<packaging>jar</packaging>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.7</maven.compiler.source>
<maven.compiler.target>1.7</maven.compiler.target>
<spark.version>1.3.1</spark.version>
<hibernate.version>4.3.10.Final</hibernate.version>
<phoenix.version>4.4.0-HBase-0.98</phoenix.version>
<hbase.version>0.98.9-hadoop2</hbase.version>
<spark-hbase.version>0.0.2-clabs-spark-1.3.1</spark-hbase.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>${spark.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>${spark.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
      <version>${spark.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.phoenix</groupId>
      <artifactId>phoenix-core</artifactId>
      <version>${phoenix.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.phoenix</groupId>
      <artifactId>phoenix-spark</artifactId>
      <version>${phoenix.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-client</artifactId>
      <version>${hbase.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>com.cloudera</groupId>
      <artifactId>spark-hbase</artifactId>
      <version>${spark-hbase.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.10</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.3</version>
        <configuration>
          <source>${maven.compiler.source}</source>
          <target>${maven.compiler.target}</target>
        </configuration>
      </plugin>
      <!-- <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId>
      <version>2.3</version> <executions> <execution> <phase>package</phase> <goals>
      <goal>shade</goal> </goals> <configuration> <filters> <filter> <artifact>*:*</artifact>
      <excludes> <exclude>META-INF/*.SF</exclude> <exclude>META-INF/*.DSA</exclude>
      <exclude>META-INF/*.RSA</exclude> </excludes> </filter> </filters> </configuration>
      </execution> </executions> </plugin> -->
    </plugins>
  </build>
  <repositories>
    <repository>
      <id>unknown-jars-temp-repo</id>
      <name>A temporary repository created by NetBeans for libraries and jars it could not identify. Please replace the dependencies in this repository with correct ones and delete this repository.</name>
      <url>file:${project.basedir}/lib</url>
    </repository>
  </repositories>
</project>
/edit2 I've also tried the saveAsNewAPIHadoopFile approach (https://gist.github.com/mravi/444afe7f49821819c987#file-phoenixsparkjob-java), but it fails with the same error, just a different stack trace:
java.lang.RuntimeException: java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
at org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:58)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:995)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:979)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:386)
at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:288)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.access$300(ConnectionQueryServicesImpl.java:171)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1881)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1860)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1860)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:162)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:131)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:133)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:187)
at org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:92)
at org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:80)
at org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:68)
at org.apache.phoenix.mapreduce.PhoenixRecordWriter.<init>(PhoenixRecordWriter.java:49)
at org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:55)
... 8 more
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:457)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:350)
at org.apache.phoenix.query.HConnectionFactory$HConnectionFactoryImpl.createConnection(HConnectionFactory.java:47)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:286)
... 23 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:455)
... 26 more
Caused by: java.lang.UnsupportedOperationException: Unable to find org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory
at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:36)
at org.apache.hadoop.hbase.ipc.RpcControllerFactory.instantiate(RpcControllerFactory.java:56)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:769)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:689)
... 31 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:32)
... 34 more
The nice people on the Phoenix mailing list gave me the answer:
"Rather than bundle the Phoenix client JAR with your app, are you able to include it in a static location either in the SPARK_CLASSPATH, or set the conf values below (I use SPARK_CLASSPATH myself, though it's deprecated): spark.driver.extraClassPath spark.executor.extraClassPath"
https://www.mail-archive.com/user@spark.apache.org/msg29978.html
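Following that advice, the submit command can be adjusted to put the Phoenix client JAR on a static classpath rather than shipping it via --jars. This is only a sketch: the JAR path below is an assumption, so substitute the location of the Phoenix client JAR in your own installation, and pass your application's arguments as before.

```shell
# Put the Phoenix client JAR on both the driver and executor classpaths.
# NOTE: this path is an assumption -- point it at the phoenix-*-client.jar
# shipped with your Phoenix installation.
PHOENIX_JAR=/opt/phoenix/phoenix-4.3.1-client.jar

/opt/mapr/spark/spark-1.3.1/bin/spark-submit \
  --conf spark.driver.extraClassPath="$PHOENIX_JAR" \
  --conf spark.executor.extraClassPath="$PHOENIX_JAR" \
  --class nl.work.kafkastreamconsumer.phoenix.KafkaPhoenixConnector \
  KafkaStreamConsumer.jar <your application arguments>
```

Because extraClassPath entries are prepended to the JVM classpath on each node, the ClientRpcControllerFactory class is visible to HBase's reflection-based loading, which the --jars mechanism (served through Spark's per-application class loader) did not guarantee.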