First, create the people_data.txt file (one record per line, in the form name,age,city):
Alice,30,New York
Bob,25,California
Charlie,35,Texas
David,30,California
Eva,20,New York
Frank,40,California
Grace,30,Texas
Hannah,45,New York
Ivy,25,California
Jack,20,Texas
--------------------------
Copy the data file to the Hadoop (HDFS) directory:
> hadoop fs -put people_data.txt people_data.txt
---------------------------------------
Write the Pig scripts:
pig1.pig
-- Load the comma-separated file with a typed schema, then print every record.
people = LOAD 'people_data.txt' USING PigStorage(',') AS (name:chararray, age:int, city:chararray);
DUMP people;
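----------------------------------------------------
pig3.pig
-- Group the records by city; each output tuple is (city, {bag of matching records}).
people = LOAD 'people_data.txt' USING PigStorage(',') AS (name:chararray, age:int, city:chararray);
Group_data = GROUP people BY city;
DUMP Group_data;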
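A grouped relation is usually followed by an aggregate over each bag. The lines below are a minimal sketch of that step, not part of the original scripts: the alias city_counts and the field name num_people are made up for illustration. With the ten sample records above, it should report 4 people for California and 3 each for New York and Texas (the order of the output rows is not guaranteed).

```pig
-- Sketch: count the people in each city (alias and field names are illustrative).
people = LOAD 'people_data.txt' USING PigStorage(',') AS (name:chararray, age:int, city:chararray);
Group_data = GROUP people BY city;
-- 'group' holds the grouping key; 'people' is the bag of records for that key.
city_counts = FOREACH Group_data GENERATE group AS city, COUNT(people) AS num_people;
DUMP city_counts;
```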
----------------------------------------------------
pig4.pig
-- Sort the records by age, oldest first.
people = LOAD 'people_data.txt' USING PigStorage(',') AS (name:chararray, age:int, city:chararray);
sort_age = ORDER people BY age DESC;
DUMP sort_age;
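A sorted relation is often combined with LIMIT to keep only the top few rows. The following is a minimal sketch of that pattern, not part of the original scripts: the alias oldest_three is made up for illustration. With the sample data it should return Hannah (45), Frank (40) and Charlie (35).

```pig
-- Sketch: keep only the three oldest people (alias name is illustrative).
people = LOAD 'people_data.txt' USING PigStorage(',') AS (name:chararray, age:int, city:chararray);
sort_age = ORDER people BY age DESC;
-- LIMIT applied directly after ORDER keeps the first rows of the sorted relation.
oldest_three = LIMIT sort_age 3;
DUMP oldest_three;
```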
--------------------------------------------------------
pig5.pig
people_data.txt:
city_data.txt:
Now, let's perform the join on the city field.
Pig Script:
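A join on the city field can be sketched as follows. This is only a sketch under stated assumptions, not necessarily the original pig5.pig: it assumes city_data.txt is a comma-separated file in HDFS whose first column is the city name, and the second column name, region, is hypothetical. The aliases cities and joined are also made up for illustration.

```pig
-- Sketch of an inner join on city; the schema of city_data.txt is an assumption.
people = LOAD 'people_data.txt' USING PigStorage(',') AS (name:chararray, age:int, city:chararray);
cities = LOAD 'city_data.txt' USING PigStorage(',') AS (city:chararray, region:chararray);
-- Each output tuple holds the fields of people followed by the fields of cities.
joined = JOIN people BY city, cities BY city;
DUMP joined;
```

After the join, both relations contribute a field called city, so later statements have to refer to them as people::city and cities::city.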