Adding JARs to Hive without using ADD JAR

By Matthew Rathbone on October 20, 2011

Say you’ve built a library you want to use in Hive, or even in Hadoop. If this library is a UDF for use in Hive queries, you can load it like this:

ADD JAR s3n://matthewsbucket/superudf.jar;

CREATE TEMPORARY FUNCTION super AS 'com.matthewrathbone.SuperFunction';

If you’re creating a bunch of these, you don’t want to have to run ADD JAR every single time you want the function; you want the jar to be in the library path already.

To do that, put the jar in either hive/lib or hadoop/lib on all the nodes. If you’re using Elastic MapReduce, you can do this in a bootstrap script:

sudo apt-get install -y wget

wget -O /home/hadoop/lib/super.jar http://somewhere.com/superudf.jar
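If the download fails partway through, you can end up with an empty or missing jar in the lib directory and a confusing error later. Here is a minimal sketch of a more defensive download step, assuming wget is installed. The `install_jar` helper is a hypothetical name, not part of Hive or Elastic MapReduce:

```shell
# Hypothetical helper: download a jar into a lib directory and
# refuse to leave an empty or partial file behind.
install_jar() {
  local url="$1" lib_dir="$2"
  local dest="$lib_dir/$(basename "$url")"

  mkdir -p "$lib_dir" || return 1

  # -O (capital) names the output file; lowercase -o would only name wget's log file.
  wget -q -O "$dest" "$url" || { rm -f "$dest"; return 1; }

  # A zero-byte file means the download did not really succeed.
  [ -s "$dest" ] || { rm -f "$dest"; return 1; }

  echo "$dest"
}
```

A bootstrap script could then call `install_jar http://somewhere.com/superudf.jar /home/hadoop/lib` and exit non-zero if it fails.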

Now you can skip the ADD JAR step when creating the function (which is much faster, by the way):

CREATE TEMPORARY FUNCTION super AS 'com.matthewrathbone.SuperFunction';
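With a bunch of functions, you could also generate these statements rather than typing them out. This is a rough sketch only: `make_function_sql` is a hypothetical helper, and the `name=class` argument format is my own convention, not anything Hive defines:

```shell
# Hypothetical helper: print one CREATE TEMPORARY FUNCTION statement
# for each "name=fully.qualified.Class" argument.
make_function_sql() {
  local pair
  for pair in "$@"; do
    # ${pair%%=*} is the text before the first '=', ${pair#*=} the text after it.
    printf "CREATE TEMPORARY FUNCTION %s AS '%s';\n" "${pair%%=*}" "${pair#*=}"
  done
}
```

For example, `make_function_sql super=com.matthewrathbone.SuperFunction > functions.hql` writes the statement above to a file (the file name is arbitrary), and the Hive CLI's `-i` option should then run it as an initialization file at the start of a session.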
