Pig Connector

Moderator: NorbertKrupa

Post by martijn » Mon Nov 12, 2012 2:59 pm

Hello,

I am trying to load data from Vertica into HDFS using Pig:

Code:

grunt> register /usr/local/hadoop-vertica.jar
grunt> register /usr/local/pig-vertica.jar
grunt> A = LOAD 'sql://{select * from table LIMIT 100}' USING com.vertica.pig.VerticaLoader('192.168.55.48,192.168.55.13,192.168.55.173', 'baseline', '5433',  'dbadmin', 'db');
grunt> STORE A INTO '/user/test.txt';
I get the following error:

Code:

ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A
	at org.apache.pig.PigServer.openIterator(PigServer.java:862)
	at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:682)
	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
	at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
	at org.apache.pig.Main.run(Main.java:490)
	at org.apache.pig.Main.main(Main.java:111)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias A
	at org.apache.pig.PigServer.storeEx(PigServer.java:961)
	at org.apache.pig.PigServer.store(PigServer.java:924)
	at org.apache.pig.PigServer.openIterator(PigServer.java:837)
	... 12 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2117: Unexpected error when launching map reduce job.
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:322)
	at org.apache.pig.PigServer.launchPlan(PigServer.java:1275)
	at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1260)
	at org.apache.pig.PigServer.storeEx(PigServer.java:957)
	... 14 more
Caused by: java.lang.RuntimeException: Could not resolve error that occured when launching map reduce job: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
	at com.vertica.hadoop.VerticaUtil.getSplits(VerticaUtil.java:102)
	at com.vertica.hadoop.VerticaInputFormat.getSplits(VerticaInputFormat.java:140)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273)
	at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1014)
	at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1031)
	at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:172)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:943)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
	at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:318)
	at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:238)
	at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:269)
	at java.lang.Thread.run(Thread.java:662)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)

	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$JobControlThreadExceptionHandler.uncaughtException(MapReduceLauncher.java:631)
	at java.lang.Thread.dispatchUncaughtException(Thread.java:1874)

Correct me if I am wrong, but it looks like the Vertica connector expects JobContext to be a class, while in my Hadoop version it is an interface.
I am using Cloudera's CDH 4.1.1 Hadoop distribution.

Does anyone know what I need to do to make this work, or what I did wrong?

Greetings,
Martijn

Post by id10t » Mon Nov 12, 2012 3:13 pm

Hi!


Have you tried it locally (pig -x local)? Same error?

I can only think of:
* you have no permissions for the Pig temporary folder, or it is defined as the same folder as the Hadoop/MapReduce one.
* the Vertica splits failed (if Pig requires splits by partitions, it should fail)

Post by martijn » Mon Nov 12, 2012 4:50 pm

Yes, I get the same error in local mode.

After some more research and a look at the Hadoop and Pig connector source code, it looks like the connector is not compatible with Hadoop 2.0.0 (which I'm running).
In Hadoop 1.x, JobContext was a class, but in 2.x it was changed to an interface. The connector JAR was compiled against the class, which is why the JVM throws IncompatibleClassChangeError at runtime.
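
A quick way to confirm which flavour of JobContext a given installation provides is to ask the JVM through reflection. The snippet below is only a minimal sketch: the class name `JobContextCheck` and the helper `describe` are made up for illustration, and it assumes the Hadoop JARs are on the classpath when it is run.

```java
public class JobContextCheck {
    // Returns "interface", "class", or "not found" for a fully qualified type name.
    static String describe(String className) {
        try {
            // initialize=false: load the type without running its static initializers
            Class<?> c = Class.forName(className, false,
                    JobContextCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            // The type is not on the classpath at all
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Default to the type named in the stack trace; another name can be passed as an argument
        String name = args.length > 0 ? args[0]
                : "org.apache.hadoop.mapreduce.JobContext";
        System.out.println(name + " -> " + describe(name));
    }
}
```

Compiled and run with something like `java -cp "$(hadoop classpath):." JobContextCheck`, it reports how the installed Hadoop declares JobContext. If it reports an interface, a connector JAR built against Hadoop 1.x has to be rebuilt against the 2.x libraries.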

Post by id10t » Mon Nov 12, 2012 6:07 pm

Hi!

Yep! You are right, I just found this:
Question:

The Vertica 6 Hadoop Connector supports the combinations of Apache Hadoop and Apache Pig listed below.


Solution:

Use the Vertica 6 Hadoop connector with only these version pairs:
• Hadoop 0.20.2 and Pig 0.7.0
• Hadoop 0.20.205.0 and Pig 0.9.1
• Hadoop 1.0.0 and Pig 0.9.2
Date of info: 6/25/2012

Post by id10t » Mon Nov 12, 2012 6:18 pm

BTW: I see you know Java, so take a look at the Hadoop connector source on GitHub. The current source in "trunk" is so terrible that I think it's a Halloween joke, and the deprecated code is much better. For a big vendor, I think it's a shame to put such code in trunk. I suggest you rewrite it; it's not so hard (the current code is mostly based on Cloudera's DBInputFile.java).

[Me? :-) I'm waiting for a new connector. So far I'm writing the connections and splits myself, because the `LIMIT`/`OFFSET` method will just kill the DB for even a couple of million rows.]
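
One common alternative to `LIMIT`/`OFFSET` splits is to split on key ranges, so that each split reads only its own slice instead of making the database skip all the rows before its offset. The following is a rough sketch of that idea, not the connector's actual code: the class `RangeSplits`, the method `splitQueries`, and the assumption of a numeric key column with a known minimum and maximum are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSplits {
    // Builds one query per split, each covering a contiguous range of the key column.
    static List<String> splitQueries(String table, String keyColumn,
                                     long minKey, long maxKey, int numSplits) {
        if (numSplits < 1 || maxKey < minKey) {
            throw new IllegalArgumentException("need numSplits >= 1 and minKey <= maxKey");
        }
        List<String> queries = new ArrayList<>();
        long total = maxKey - minKey + 1;
        // Ceiling division so the ranges cover every key
        long chunk = (total + numSplits - 1) / numSplits;
        for (long lo = minKey; lo <= maxKey; lo += chunk) {
            long hi = Math.min(lo + chunk - 1, maxKey);
            queries.add("SELECT * FROM " + table + " WHERE " + keyColumn
                    + " BETWEEN " + lo + " AND " + hi);
        }
        return queries;
    }

    public static void main(String[] args) {
        // Example: keys 1..10 divided into 3 splits
        for (String q : splitQueries("t", "id", 1, 10, 3)) {
            System.out.println(q);
        }
    }
}
```

This only balances well when the keys are spread fairly evenly; for skewed keys the range boundaries would have to come from the data itself.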

Post by martijn » Tue Nov 13, 2012 12:47 pm

I made a working version of the Hadoop Connector (both Hadoop and Pig) for Hadoop 2.x. Basically, I just rebuilt the JAR provided by Vertica against the Hadoop 2.x dependencies, and now it seems to work. It has not been extensively tested yet...

Download it here: http://dl.dropbox.com/u/122838/hadoop-vertica.jar

Post by vijayrkadel » Mon Oct 27, 2014 6:05 am

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. tried to access method org.apache.hadoop.mapred.TaskReport.downgradeArray([Lorg/apache/hadoop/mapreduce/TaskReport;)[Lorg/apache/hadoop/mapred/TaskReport; from class org.apache.hadoop.mapred.DowngradeHelper

I am using Hadoop 2.2.5 and Pig 0.13, and I used the same "hadoop-vertica.jar" given above for Hadoop 2.x. In local mode it works.

Please help me out
