Apache Hive 1.2 installation on Ubuntu

What is Apache Hive

Apache Hive is a data warehouse framework for querying and managing large data sets that reside in distributed storage (HDFS), using an SQL-like language called HiveQL.

Prerequisites to install Apache Hive 1.2.x
Java 7 or later must be installed. If it is not installed, then follow the JDK-Installation guide.
Hadoop 2.x must be installed. If it is not installed, then follow the Hadoop Single Node Cluster Set-up guide.
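
Before going further, it can help to confirm that both prerequisites are actually on your PATH. The snippet below is a minimal sketch; check_tool is a hypothetical helper name, not part of Java or Hadoop.

```shell
# Report whether a command is on the PATH.
# Prints its location, or a warning and a non-zero status if it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 NOT found on PATH"
        return 1
    fi
}

# Check both prerequisites; keep going even if one is missing.
for tool in java hadoop; do
    check_tool "$tool" || true
done

# Once both are found, these print the installed versions:
# java -version
# hadoop version
```

If either tool is reported as missing, finish the matching installation guide first.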

Download & install Hive 1.2.x
Download the latest stable version of Apache Hive from the Apache Software Foundation website. Alternatively, you can download it with the wget command.
Syntax -

subodh@subodh-Inspiron-3520:~/software$ wget http://a.mbbsindia.com/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz

The download will take some time, depending on your internet connection speed. Once it has finished, extract the archive using the command below.
Syntax -

subodh@subodh-Inspiron-3520:~/software$ tar -xzf apache-hive-1.2.1-bin.tar.gz
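
Mirror URLs change over time, so the mirror used above may no longer serve this release. The sketch below assumes that the 1.2.1 tarball is still kept on archive.apache.org (please verify the URL before relying on it) and wraps the extraction in a hypothetical helper, extract_archive, that also prints the folder it created.

```shell
# Assumed archive location for old releases; verify it before relying on it.
HIVE_VERSION="1.2.1"
HIVE_TARBALL="apache-hive-${HIVE_VERSION}-bin.tar.gz"
HIVE_URL="https://archive.apache.org/dist/hive/hive-${HIVE_VERSION}/${HIVE_TARBALL}"

# Extract a .tar.gz into a target directory and print the
# top-level folder name found in the archive.
extract_archive() {
    archive="$1"
    target_dir="$2"
    mkdir -p "$target_dir" || return 1
    tar -xzf "$archive" -C "$target_dir" || return 1
    # First path component of the first entry, e.g. apache-hive-1.2.1-bin
    tar -tzf "$archive" | head -n 1 | cut -d/ -f1
}

# Usage (uncomment to run; needs network access):
# wget "$HIVE_URL"
# extract_archive "$HIVE_TARBALL" "$HOME/software"
```

The printed folder name is the one to use for HIVE_HOME in the next step.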

Set HIVE_HOME environment variable -
Edit the ~/.bashrc file and add the HIVE_HOME variable.
Syntax -

subodh@subodh-Inspiron-3520:~/software$ gedit ~/.bashrc

The above command opens an editor. Paste the lines below into the file, replacing the path with your own Hive installation directory.

# hive installation
export HIVE_HOME=/home/subodh/software/apache-hive-1.2.1-bin
export PATH=$PATH:$HIVE_HOME/bin

Reload the file by executing the source ~/.bashrc command. Your HIVE_HOME environment variable is now set.
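
If you prefer not to edit the file by hand, the same two lines can be appended from the terminal. This is only a sketch: append_once and add_hive_env are hypothetical helper names, and the install path in the usage lines is the one from this tutorial. The helper skips a line that is already present, so running it twice does not duplicate entries.

```shell
# Append a line to a file only if that exact line is not already there.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Write the two Hive lines for a given install directory.
# The second argument (rc file) defaults to ~/.bashrc.
add_hive_env() {
    hive_dir="$1"
    rc_file="${2:-$HOME/.bashrc}"
    append_once "export HIVE_HOME=$hive_dir" "$rc_file"
    # Single quotes keep $PATH and $HIVE_HOME unexpanded in the file.
    append_once 'export PATH=$PATH:$HIVE_HOME/bin' "$rc_file"
}

# Usage (uncomment and adjust the path to your own installation):
# add_hive_env /home/subodh/software/apache-hive-1.2.1-bin
# source ~/.bashrc
# echo "$HIVE_HOME"
```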

Now set HADOOP_HOME inside hive-env.sh. Copy the template file first, then edit the copy.

subodh@subodh-Inspiron-3520:~/software/apache-hive-1.2.1-bin/conf$ cp hive-env.sh.template hive-env.sh
subodh@subodh-Inspiron-3520:~/software/apache-hive-1.2.1-bin/conf$ gedit hive-env.sh

# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/home/subodh/software/hadoop-2.7.1
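
A wrong path here is easy to miss, so a quick sanity check can save time. The sketch below assumes that a valid Hadoop directory contains an executable bin/hadoop launcher; is_hadoop_home and read_hadoop_home are hypothetical helper names.

```shell
# Succeed if the given directory looks like a Hadoop install,
# i.e. it contains an executable bin/hadoop launcher.
is_hadoop_home() {
    [ -n "$1" ] && [ -x "$1/bin/hadoop" ]
}

# Print the HADOOP_HOME value defined by a hive-env.sh file.
# The file is sourced in a subshell, so the current shell is not changed.
# If the file does not set HADOOP_HOME, any value inherited from the
# environment is printed instead.
read_hadoop_home() {
    ( . "$1" >/dev/null 2>&1; printf '%s\n' "$HADOOP_HOME" )
}

# Usage (uncomment once HIVE_HOME is set):
# env_file="$HIVE_HOME/conf/hive-env.sh"
# is_hadoop_home "$(read_hadoop_home "$env_file")" && echo "HADOOP_HOME looks valid"
```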

Now give the group write permission on the /tmp directory in HDFS (HDFS must be running for this command to work)

subodh@subodh-Inspiron-3520:~$ hadoop fs -chmod g+w /tmp
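
The Hive Getting Started guide also creates /tmp and the default warehouse directory /user/hive/warehouse before changing their permissions. The sketch below does both for each directory; prepare_hive_dirs and the HADOOP_CMD variable are hypothetical names introduced here so that the commands can be previewed with a dry run.

```shell
# Command used to talk to HDFS. Set HADOOP_CMD=echo for a dry run
# that only prints the arguments instead of executing them.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Create the scratch and warehouse directories in HDFS (if missing)
# and make both group-writable.
prepare_hive_dirs() {
    for dir in /tmp /user/hive/warehouse; do
        $HADOOP_CMD fs -mkdir -p "$dir" || return 1
        $HADOOP_CMD fs -chmod g+w "$dir" || return 1
    done
}

# Usage (uncomment once HDFS is running):
# prepare_hive_dirs
# hadoop fs -ls -d /tmp /user/hive/warehouse
```

The -p flag makes mkdir succeed even when the directory already exists, so the function is safe to run more than once.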

Done! Now you can verify the Hive installation by executing the hive command in your terminal

subodh@subodh-Inspiron-3520:~/software$ hive

Logging initialized using configuration in jar:file:/home/subodh/software/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
hive>

If you see the hive> prompt, your Hive installation is successful and you are ready to use Hive. Hive provides an SQL-like interface, which means you can run SQL-style commands in it. Let's execute some basic commands.

hive> show databases;
OK
default
Time taken: 0.86 seconds, Fetched: 1 row(s)
hive> create database test_db;
OK
Time taken: 0.27 seconds
hive> show databases;
OK
default
test_db
Time taken: 0.032 seconds, Fetched: 2 row(s)
hive> use test_db;
OK
Time taken: 0.029 seconds
hive> show tables;
OK
Time taken: 0.052 seconds
hive> create table t(id int);
OK
Time taken: 0.282 seconds
hive> show tables;
OK
t
Time taken: 0.03 seconds, Fetched: 1 row(s)
hive>
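
The same statements can also be run without opening the interactive prompt, using the -f (script file) and -e (inline statement) options of the hive command. The sketch below writes a small script file first; write_demo_script and the /tmp/demo.hql path are hypothetical, and IF NOT EXISTS is added so the script can be re-run safely.

```shell
# Write the statements from the session above into a script file.
write_demo_script() {
    cat > "$1" <<'EOF'
CREATE DATABASE IF NOT EXISTS test_db;
USE test_db;
CREATE TABLE IF NOT EXISTS t (id INT);
SHOW TABLES;
EOF
}

# Usage (uncomment once the hive command works):
# write_demo_script /tmp/demo.hql
# hive -f /tmp/demo.hql          # run every statement in the file
# hive -e 'SHOW DATABASES;'      # run a single inline statement
```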

To exit the Hive shell, type exit;

DONE :) Happy data analytics

2 comments:

Unknown, September 3, 2016 at 3:07 AM
I am getting the following error.

MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3364)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3336)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3590)
... 16 more

javamakeuse, September 5, 2016 at 9:14 AM
Execute this "$HIVE_HOME/bin/schematool -initSchema -dbType derby" command from your terminal.