STEP 1 — Download Hive
Go to home directory:
cd ~
Download Hive (this example uses 3.1.3, a stable release for Hadoop 3.x):
wget https://archive.apache.org/dist/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz
Extract:
tar -xvzf apache-hive-3.1.3-bin.tar.gz
sudo mv apache-hive-3.1.3-bin /usr/local/hive
sudo chown -R $USER:$USER /usr/local/hive
STEP 2 — Set Hive Environment Variables
Open:
nano ~/.bashrc
Add at bottom:
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
Apply:
source ~/.bashrc
Check:
hive --version
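If you prefer to script this step, the two export lines can be appended with a small guard so re-running the setup doesn't duplicate them. A minimal sketch, assuming the default ~/.bashrc location:

```shell
# Append the Hive variables to ~/.bashrc only if they are not already there.
BASHRC="$HOME/.bashrc"
touch "$BASHRC"
if ! grep -q 'HIVE_HOME=/usr/local/hive' "$BASHRC"; then
  {
    echo 'export HIVE_HOME=/usr/local/hive'
    echo 'export PATH=$PATH:$HIVE_HOME/bin'
  } >> "$BASHRC"
fi
grep -c 'HIVE_HOME' "$BASHRC"   # prints 2 on a file without prior Hive entries
```

Run it twice and the count stays at 2, which is the point of the grep guard.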
STEP 3 — Configure Hive
Go to Hive config directory:
cd /usr/local/hive/conf
Copy template:
cp hive-default.xml.template hive-site.xml
(Alternatively, start from an empty hive-site.xml containing just a <configuration></configuration> element: the stock 3.1.x template has been reported to contain an invalid XML character entity that can break Hive startup.)
STEP 4 — Configure Hive Metastore (Derby – Simple Mode)
Since this is a lab setup, we use the embedded Derby database (no MySQL needed). Note that embedded Derby allows only one active Hive session at a time.
Edit:
nano hive-site.xml
Inside <configuration> add:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/home/ubuntu/metastore_db;create=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.EmbeddedDriver</value>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
</property>
⚠ Replace ubuntu in the path above if your username is different.
Save & Exit.
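Hand-edited XML is easy to break, so it is worth checking that the file still parses before Hive ever reads it. A sketch that writes just the three properties above to a scratch copy and validates it (assumes python3 is available; point SITE at /usr/local/hive/conf/hive-site.xml to check your real file instead):

```shell
# Write the three metastore properties to a scratch hive-site.xml and
# confirm the result is well-formed XML.
SITE=./hive-site.xml   # change to /usr/local/hive/conf/hive-site.xml for the real file
cat > "$SITE" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/home/ubuntu/metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
EOF
python3 -c "import xml.etree.ElementTree as ET, sys; ET.parse(sys.argv[1])" "$SITE" \
  && echo "hive-site.xml is well-formed"
```

A parse failure here points at the exact line and column, which is much faster than debugging a cryptic Hive stack trace later.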
STEP 5 — Create Hive Warehouse Directory in HDFS
Start Hadoop if not running:
start-dfs.sh
start-yarn.sh
Now create warehouse folder:
hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -chmod -R 777 /user/hive
STEP 6 — Initialize Hive Metastore
Go to Hive home:
cd /usr/local/hive
Initialize schema:
schematool -dbType derby -initSchema
You should see output ending with:
Initialization script completed
schemaTool completed
(If this step fails with a Guava NoSuchMethodError on Hadoop 3, a common fix is to replace the guava-*.jar in /usr/local/hive/lib with the newer guava jar from $HADOOP_HOME/share/hadoop/common/lib.)
STEP 7 — Start Hive
Simply run:
hive
You should see:
hive>
If yes → Hive installed successfully 🎉
STEP 8 — Create Database
Inside Hive:
CREATE DATABASE mydb;
Check:
SHOW DATABASES;
You should see:
default
mydb
STEP 9 — Use Database
USE mydb;
STEP 10 — Create Table
CREATE TABLE student (
id INT,
name STRING,
marks INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
Check tables:
SHOW TABLES;
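The table declares comma-delimited rows of (INT, STRING, INT). Hive will not reject rows that don't match (bad fields simply load as NULL), so a quick local sanity check on a data line can save confusion later. A sketch with awk:

```shell
# Check that a sample row splits into the three fields the table declares:
# a numeric id, a name, and numeric marks.
echo "1,John,85" | awk -F, '
  NF == 3 && $1 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { print "row matches schema"; ok = 1 }
  END { exit ok ? 0 : 1 }'
```

A non-zero exit status here means the row would load with NULLs in Hive.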
STEP 11 — Insert Data
Create a sample file on the local Linux filesystem:
echo "1,John,85" > student.txt
echo "2,Alice,90" >> student.txt
echo "3,Bob,78" >> student.txt
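The three echo commands can also be written as a single heredoc, with a quick check that the file looks right before uploading (a sketch; the file name matches the step above):

```shell
# Create the sample file in one shot and confirm it has 3 comma-delimited rows.
cat > student.txt <<'EOF'
1,John,85
2,Alice,90
3,Bob,78
EOF
wc -l < student.txt                                              # 3
awk -F, 'NF != 3 { bad = 1 } END { exit bad }' student.txt && echo "format ok"
```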
Upload to HDFS:
hdfs dfs -mkdir -p /input
hdfs dfs -put student.txt /input/
Back in Hive (note: LOAD DATA INPATH moves the file into the warehouse, so it will no longer be in /input afterwards):
LOAD DATA INPATH '/input/student.txt' INTO TABLE student;
STEP 12 — Query Table
SELECT * FROM student;
Output:
1 John 85
2 Alice 90
3 Bob 78
STEP 13 — Simple Query Example
SELECT name, marks FROM student WHERE marks > 80;
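For comparison, the same filter can be sketched locally with awk against the student.txt file created earlier (purely illustrative; Hive runs this at scale over HDFS):

```shell
# Local equivalent of: SELECT name, marks FROM student WHERE marks > 80;
printf '1,John,85\n2,Alice,90\n3,Bob,78\n' > student.txt
awk -F, '$3 > 80 { print $2, $3 }' student.txt
# prints:
# John 85
# Alice 90
```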