Friday, 3 February 2017

File Formats, Part 1

Input File Formats in HDFS


In general, the input file format plays a key role in Hadoop MapReduce programming, because the output is generated in three stages:


Primary data ---> HDFS ---> Mapper ---> Map output ---> Reducer input/output
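The flow above can be sketched in plain Python, without Hadoop itself. This is only a rough illustration under stated assumptions: the word-count task and the names `mapper`, `reducer` and `run_job` are hypothetical and are not part of any Hadoop API.

```python
from collections import defaultdict

def mapper(offset, line):
    # Stage 1: the mapper receives one (key, value) record and emits pairs.
    for word in line.split():
        yield word, 1

def reducer(word, counts):
    # Stage 3: the reducer receives one key with all of its grouped values.
    return word, sum(counts)

def run_job(lines):
    # Map: feed each line to the mapper, keyed by its starting byte offset.
    map_output = []
    offset = 0
    for line in lines:
        map_output.extend(mapper(offset, line))
        offset += len(line.encode("utf-8")) + 1  # +1 for the newline

    # Stage 2: the map output is grouped by key before it reaches the reducer.
    grouped = defaultdict(list)
    for key, value in map_output:
        grouped[key].append(value)

    # Reduce: one call per key, in sorted key order.
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))

result = run_job(["hello hadoop", "hello hdfs"])
print(result)
```

In a real cluster the grouping step is done by the framework between the map and reduce phases; here it is a dictionary so the three stages are easy to see.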



Data in HDFS can be represented in text, sequence file, or binary format.


By default, Hadoop uses the text input format (TextInputFormat).

When a job reads input data that you have stored in HDFS, this format presents the data as key-value pairs:


The key is the byte offset, i.e. the position in the file at which the line starts.

The value is the text of that line in your file.
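To make the key-value idea concrete, here is a small Python sketch that mimics how a text input format turns a file into (byte offset, line) records. It does not use Hadoop; the function name `text_input_records` and the sample data are assumptions made for illustration.

```python
import io

def text_input_records(data: bytes):
    """Yield (byte_offset, line) pairs for each line in the given bytes."""
    offset = 0
    for raw_line in io.BytesIO(data):
        # Key: the byte position at which this line starts.
        # Value: the line text, without its trailing line terminator.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)

sample = b"hello hadoop\nhello hdfs\nbye\n"
records = list(text_input_records(sample))
for key, value in records:
    print(key, value)
# prints:
# 0 hello hadoop
# 13 hello hdfs
# 24 bye
```

Note that the keys are byte positions, not line numbers: the second line starts at 13 because the first line takes 12 bytes plus one newline byte.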
