
Friday, 3 February 2017

File Formats part-1

Input File Formats in HDFS


In general, the input file format plays a key role in Hadoop MapReduce programming,

because the output is generated in three stages:


primary data --> HDFS --> Mapper --> Map output --> Reducer input/output
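The flow above can be sketched in a few lines of plain Python. This is only a toy, in-memory illustration and not the Hadoop API: the names `mapper`, `reducer` and `run_job`, the word-count task and the sample lines are all assumptions made up for this example.

```python
from collections import defaultdict

def mapper(line):
    # Map stage: emit a (word, 1) pair for every word in one input line.
    for word in line.split():
        yield word, 1

def reducer(word, counts):
    # Reduce stage: combine all values that share the same key.
    return word, sum(counts)

def run_job(lines):
    # Group the map output by key, which is what happens between
    # the map output and the reducer input.
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))

# Hypothetical sample input standing in for a file stored in HDFS.
lines = ["hello hadoop", "hello hdfs"]
result = run_job(lines)
print(result)
```

In a real job the framework does the grouping and runs mappers and reducers on different nodes; here everything runs in one process just to show the order of the stages.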



HDFS data can be represented in text, sequence file, and binary formats.


By default, the text input format (TextInputFormat) is used.

Once you have submitted the input data to HDFS and a job reads it,

the data is presented to the mapper as KEY, VALUE pairs.


The key is the byte offset, i.e. the position in the file at which the line starts.


The value is the content of that individual line, as a string.
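To make the key and value concrete, here is a minimal sketch in plain Python that imitates how a text input format turns file contents into (byte offset, line) pairs. It is not Hadoop code: the function name `text_input_records` and the sample data are assumptions for illustration only.

```python
def text_input_records(data: bytes):
    """Yield (byte_offset, line) pairs, similar to what a mapper receives."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # Key: byte position where this line starts.
        # Value: the line's text without its line terminator.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        # The next line starts right after this one, terminator included.
        offset += len(raw_line)

# Hypothetical file contents; each line ends with a newline byte.
sample = b"hello hadoop\nhello hdfs\nbye\n"
records = list(text_input_records(sample))
for key, value in records:
    print(key, value)
```

For this sample the keys are 0, 13 and 24, because the first line takes 13 bytes (12 characters plus the newline) and the second takes 11.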

