Sunday, 1 March 2026

MapReduce Weather Data on Ubuntu Hadoop Using Python

Weather Data Mining using Hadoop Streaming

Step 1: Create Input File

Open a terminal and create the input file:


nano weather_data.txt

Add sample data:


2023-10-01,25,60,0
2023-10-02,30,70,5
2023-10-03,15,80,10
2023-10-04,10,90,15
2023-10-05,35,50,0

Format:


Date,Temperature,Humidity,Precipitation
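To make the record layout concrete, here is a minimal sketch that parses one line of the input file into typed fields. The names `WeatherRecord` and `parse_record` are hypothetical and are not used by the scripts in this post; the source data does not state units, so none are assumed.

```python
from typing import NamedTuple


class WeatherRecord(NamedTuple):
    """One line of weather_data.txt (hypothetical helper type)."""
    date: str
    temperature: float
    humidity: float
    precipitation: float


def parse_record(line: str) -> WeatherRecord:
    """Parse one 'Date,Temperature,Humidity,Precipitation' line."""
    # Exactly four comma-separated fields are expected; anything else
    # raises ValueError during unpacking or float conversion.
    date, temp, humidity, precipitation = line.strip().split(",")
    return WeatherRecord(date, float(temp), float(humidity), float(precipitation))


if __name__ == "__main__":
    print(parse_record("2023-10-01,25,60,0"))
```

A line with a missing field or a non-numeric value raises `ValueError`, which is the case the mapper in Step 2 has to skip.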

Step 2: Create Mapper Script


nano mapper.py

mapper.py


#!/usr/bin/env python3
import sys

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue

    try:
        date, temp, humidity, precipitation = line.split(",")

        temp = float(temp)
        humidity = float(humidity)
        precipitation = float(precipitation)

        # Weather condition logic
        if precipitation > 0:
            message = "Rainy day"
        elif temp >= 35:
            message = "Very Hot day"
        elif temp >= 30:
            message = "Hot day"
        elif temp <= 10:
            message = "Very Cold day"
        elif temp <= 15:
            message = "Cold day"
        elif humidity > 85:
            message = "Humid day"
        else:
            message = "Pleasant day"

        print(f"{date}\t{message}")

    except ValueError:
        # Skip malformed lines (wrong field count or non-numeric values)
        continue

Make it executable:


chmod +x mapper.py
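The order of the checks in mapper.py matters: the first matching condition wins, so rain takes precedence over temperature, and temperature over humidity. The sketch below pulls the same rules into a standalone function so the precedence can be tried out directly. The function name `classify` is an assumption for illustration; mapper.py keeps the logic inline.

```python
def classify(temp: float, humidity: float, precipitation: float) -> str:
    """Return the weather label using the same rule order as mapper.py."""
    if precipitation > 0:
        return "Rainy day"
    if temp >= 35:
        return "Very Hot day"
    if temp >= 30:
        return "Hot day"
    if temp <= 10:
        return "Very Cold day"
    if temp <= 15:
        return "Cold day"
    if humidity > 85:
        return "Humid day"
    return "Pleasant day"


if __name__ == "__main__":
    # The 2023-10-04 record is cold AND wet: the rain check comes first.
    print(classify(10, 90, 15))  # Rainy day
    # Same temperature and humidity without rain falls through to the cold check.
    print(classify(10, 90, 0))   # Very Cold day
    # Mild temperature, no rain, high humidity.
    print(classify(20, 90, 0))   # Humid day
```

Keeping the rules in one function like this also makes them easy to unit test before the job is submitted to Hadoop.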

Step 3: Create Reducer Script


nano reducer.py

reducer.py


#!/usr/bin/env python3
import sys

for line in sys.stdin:
    line = line.strip()
    if line:
        print(line)

Make it executable:


chmod +x reducer.py

👉 Note: The reducer is a simple pass-through because all classification is already done in the mapper.

Step 4: Test Locally (Without Hadoop)


cat weather_data.txt | ./mapper.py | sort | ./reducer.py

Expected Output


2023-10-01    Pleasant day
2023-10-02    Rainy day
2023-10-03    Rainy day
2023-10-04    Rainy day
2023-10-05    Very Hot day
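In the local pipeline, `sort` stands in for the shuffle step that Hadoop performs between the map and reduce phases: mapper output is ordered by key before the reducer sees it. The following is a rough in-process sketch of that idea, not how Hadoop implements it. The function name `shuffle` and the sample lines are assumptions for illustration.

```python
from itertools import groupby


def shuffle(mapper_output):
    """Sort tab-separated mapper lines by key and group the values per key."""
    pairs = [line.split("\t", 1) for line in mapper_output]
    pairs.sort(key=lambda kv: kv[0])
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, [value for _, value in group]


if __name__ == "__main__":
    # Mapper output deliberately out of order, as it might arrive
    # when several map tasks finish at different times.
    lines = [
        "2023-10-05\tVery Hot day",
        "2023-10-01\tPleasant day",
        "2023-10-03\tRainy day",
    ]
    for key, values in shuffle(lines):
        # Each date appears once here, so there is a single value per key.
        print(f"{key}\t{values[0]}")
```

Because the keys are ISO-style dates (YYYY-MM-DD), sorting them as plain text also puts them in chronological order, which is why the expected output above is listed by date.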

Step 5: Run in Hadoop Streaming

Create HDFS directory


hdfs dfs -mkdir -p /weather

Upload file


hdfs dfs -put weather_data.txt /weather

Run Hadoop job (the output directory /weather_output must not already exist, or the job will fail)


hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming*.jar \
-input /weather/weather_data.txt \
-output /weather_output \
-mapper mapper.py \
-reducer reducer.py \
-file mapper.py \
-file reducer.py

Step 6: View Output


hdfs dfs -cat /weather_output/part-00000
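Each line of the job output is a key and a value separated by a tab, so it is easy to load back into Python for a quick summary. The sketch below assumes the output has already been saved to a string (for example by redirecting the command above to a file and reading it); the function name `parse_output` and the sample text are hypothetical.

```python
from collections import Counter


def parse_output(text: str) -> dict:
    """Map each date to its weather label from tab-separated output lines."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first tab only: key on the left, value on the right.
        date, _, condition = line.partition("\t")
        result[date] = condition
    return result


if __name__ == "__main__":
    sample = (
        "2023-10-01\tPleasant day\n"
        "2023-10-02\tRainy day\n"
        "2023-10-03\tRainy day\n"
    )
    labels = parse_output(sample)
    print(labels["2023-10-02"])
    # Tally how many days fall under each label.
    print(Counter(labels.values()).most_common())
```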
