In this tutorial, I will show you, step by step, how to install Apache Hive on your Ubuntu machine.
By the end of this tutorial your Hive environment will be ready to use, and I will also show you a few basic commands. Apache Hive is one of the most popular Big Data technologies.

What is Apache Hive

Apache Hive is a data warehouse framework for querying and managing large data sets that reside in distributed storage (HDFS), using an SQL-like language called HiveQL.

Prerequisites to install Apache Hive 1.2.x
Java 7 or later must be installed. If it is not, follow the JDK-Installation guide.
Hadoop 2.x must be installed. If it is not, follow the Hadoop Single Node Cluster Set-up guide.
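
Before going further, it can help to confirm that both prerequisites are actually on your PATH. The snippet below is only a small sketch: have_cmd and check_prereqs are helper names I made up for illustration, not part of Java, Hadoop, or Hive.

```shell
# Returns success if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one line per missing command; return non-zero if any is absent.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if ! have_cmd "$cmd"; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use for this tutorial:
#   check_prereqs java hadoop && java -version && hadoop version
```

If both commands are found, java -version and hadoop version print the installed versions so you can compare them against the requirements above.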

Download & install Hive 1.2.x
Download the latest stable version of Apache Hive from the Apache Software Foundation. Alternatively, you can download it with the wget command.
Syntax -
subodh@subodh-Inspiron-3520:~/software$ wget http://a.mbbsindia.com/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz
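
Mirror URLs like the one above may stop working over time. As a fallback, older releases are normally kept on the Apache archive server. The sketch below builds that URL for a given version. hive_archive_url is a hypothetical helper name, and the directory layout is my assumption about how the archive is organised, so check it in a browser if the download fails.

```shell
# Build the download URL for a Hive binary release on the Apache archive.
# Assumption: releases live under archive.apache.org/dist/hive/hive-<version>/.
hive_archive_url() {
    version="$1"
    echo "https://archive.apache.org/dist/hive/hive-${version}/apache-hive-${version}-bin.tar.gz"
}

# Usage (requires network access):
#   wget "$(hive_archive_url 1.2.1)"
```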

The download will take some time, depending on your internet connection speed. Once it has finished, extract the archive using the command below.
Syntax -
subodh@subodh-Inspiron-3520:~/software$ tar -xzf apache-hive-1.2.1-bin.tar.gz
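
An interrupted download leaves a truncated archive, and tar then fails halfway through extraction. One way to guard against that is to list the archive first and extract only if the listing succeeds. This is a sketch only; extract_hive is a helper name I chose for illustration.

```shell
# Extract a tarball into a target directory, but only if the archive is readable.
extract_hive() {
    tarball="$1"
    dest="$2"
    # tar -tzf lists the contents; it fails on a missing or corrupt archive
    if ! tar -tzf "$tarball" >/dev/null 2>&1; then
        echo "archive $tarball is missing or corrupt" >&2
        return 1
    fi
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}

# Usage:
#   extract_hive apache-hive-1.2.1-bin.tar.gz ~/software
```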
Set HIVE_HOME environment variable -
Edit the ~/.bashrc file and add the HIVE_HOME variable.
Syntax -
subodh@subodh-Inspiron-3520:~/software$ gedit ~/.bashrc 
The above command opens an editor. Paste in the lines below, changing the path to your own Hive installation directory.
# hive installation
export HIVE_HOME=/home/subodh/software/apache-hive-1.2.1-bin
export PATH=$PATH:$HIVE_HOME/bin
Reload the file by running the source ~/.bashrc command, and your HIVE_HOME environment variable is set.
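
If you prefer not to use an editor, the same lines can be appended from the terminal. The sketch below adds them only once, so running it again does not duplicate the entries. append_hive_env is a hypothetical helper name, and the installation path in the usage comment is the one from this tutorial, so change it to your own.

```shell
# Append the Hive environment lines to a shell rc file, unless already present.
append_hive_env() {
    rc_file="$1"
    hive_home="$2"
    # Skip if an export for HIVE_HOME is already in the file
    if grep -q '^export HIVE_HOME=' "$rc_file" 2>/dev/null; then
        return 0
    fi
    {
        echo '# hive installation'
        echo "export HIVE_HOME=$hive_home"
        # Single quotes keep $PATH and $HIVE_HOME unexpanded in the file
        echo 'export PATH=$PATH:$HIVE_HOME/bin'
    } >> "$rc_file"
}

# Usage:
#   append_hive_env ~/.bashrc /home/subodh/software/apache-hive-1.2.1-bin
#   source ~/.bashrc
```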

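Now set HADOOP_HOME inside hive-env.sh. Copy the template to hive-env.sh, open it, and point HADOOP_HOME at your Hadoop installation directory under the following comment: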
subodh@subodh-Inspiron-3520:~/software/apache-hive-1.2.1-bin/conf$ cp hive-env.sh.template hive-env.sh
subodh@subodh-Inspiron-3520:~/software/apache-hive-1.2.1-bin/conf$ gedit hive-env.sh

# Set HADOOP_HOME to point to a specific hadoop install directory
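
The line that goes under this comment depends on where Hadoop lives on your machine. As a sketch, it could look like the following. The path is a placeholder and is not taken from my setup, so replace it with your real Hadoop directory.

```shell
# Placeholder path, an assumption for illustration only; use your own Hadoop directory.
export HADOOP_HOME=/path/to/your/hadoop
```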
Now give the group write permission on the /tmp directory in HDFS
subodh@subodh-Inspiron-3520:~$ hadoop fs -chmod g+w /tmp
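
On a fresh HDFS the /tmp directory may not exist yet, and Hive also needs a warehouse directory for table data. The sketch below creates both and makes them group-writable. It assumes the default warehouse location /user/hive/warehouse, so adjust it if you have changed hive.metastore.warehouse.dir. prepare_hdfs_dirs is a helper name I made up.

```shell
# Create the HDFS directories Hive uses and make them group-writable.
# Assumption: the warehouse directory is the default /user/hive/warehouse.
prepare_hdfs_dirs() {
    for dir in /tmp /user/hive/warehouse; do
        # -p creates parent directories and does not fail if the directory exists
        hadoop fs -mkdir -p "$dir" || return 1
        hadoop fs -chmod g+w "$dir" || return 1
    done
}

# Usage (HDFS must be running):
#   prepare_hdfs_dirs
```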
Done! Now you can verify the Hive installation by running the hive command in your terminal.
subodh@subodh-Inspiron-3520:~/software$ hive

Logging initialized using configuration in jar:file:/home/subodh/software/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
If you see the hive> prompt, your Hive installation was successful and you are ready to use Hive. Hive provides an SQL-like interface, which means you can run SQL-style commands in it. Let's run some basic commands.
hive> show databases;
Time taken: 0.86 seconds, Fetched: 1 row(s)
hive> create database test_db;
Time taken: 0.27 seconds
hive> show databases;
Time taken: 0.032 seconds, Fetched: 2 row(s)
hive> use test_db;
Time taken: 0.029 seconds
hive> show tables;
Time taken: 0.052 seconds
hive> create table t(id int);
Time taken: 0.282 seconds
hive> show tables;
Time taken: 0.03 seconds, Fetched: 1 row(s)
To exit from the hive terminal, type exit;
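
The same commands can also be run without the interactive prompt by putting them in a script file and passing it to hive -f. The sketch below writes such a file. write_smoke_test and the file name are hypothetical, and I added IF NOT EXISTS so the script can be run again after the session above without failing on the existing database and table.

```shell
# Write a small HiveQL script that repeats the interactive session above.
write_smoke_test() {
    script="$1"
    cat > "$script" <<'EOF'
SHOW DATABASES;
CREATE DATABASE IF NOT EXISTS test_db;
USE test_db;
CREATE TABLE IF NOT EXISTS t (id INT);
SHOW TABLES;
EOF
}

# Usage:
#   write_smoke_test /tmp/hive_smoke.hql && hive -f /tmp/hive_smoke.hql
```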

DONE :) Happy data analytics

  1. I am getting following error.

    MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3364)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3336)
    at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3590)
    ... 16 more

  2. Run the command "$HIVE_HOME/bin/schematool -initSchema -dbType derby" from your terminal.