
Monday 20 March 2017

flume examples



Experiment 1:  source: netcat     sink: logger

Practical session using Apache Flume: bind a netcat source to IP 0.0.0.0 on your local port 12345, define a source, channel, and sink, and watch the data arrive in Flume.

Open a terminal, change into the Flume install directory (flume → apache-flume-1.6.0-bin), and start the agent:
[training@localhost apache-flume-1.6.0-bin]$ flume-ng agent -n agent -c conf -f conf/flume1.conf -Dflume.root.logger=INFO,console


flume1.conf  file:
agent.sources = s1

agent.channels = c1
agent.sinks = k1
agent.sources.s1.type = netcat
agent.sources.s1.channels = c1
agent.sources.s1.bind=0.0.0.0
agent.sources.s1.port=12345
agent.channels.c1.type=memory
agent.sinks.k1.type=logger
agent.sinks.k1.channel=c1


Agent console output (first line truncated):
7)] Source starting
2017-03-16 16:32:04,036 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.NetcatSource.start(NetcatSource.java:161)] Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/0.0.0.0:12345]
2017-03-16 16:46:02,513 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 71 0D                                           q. }
2017-03-16 16:46:11,527 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 68 6F 77 20 72 20 75 0D                         how r u. }
2017-03-16 16:49:37,649 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 68 69 20 68 69 0D                               hi hi. }
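Each logger-sink line shows the raw event body as hex followed by a printable preview (non-printable bytes become '.'). The trailing 0D is the carriage return telnet appends to every line. As a quick sanity check, decoding the hex from the second event reproduces the typed text:

```python
# Hex body copied from the logger sink output above
body_hex = "68 6F 77 20 72 20 75 0D"

# Decode the space-separated hex into bytes, then into ASCII text
body = bytes.fromhex(body_hex.replace(" ", ""))
text = body.decode("ascii").rstrip("\r")  # strip telnet's trailing CR

print(text)  # -> how r u
```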





Open a new terminal window and connect to the Flume netcat source:

[training@localhost apache-flume-1.6.0-bin]$ telnet localhost 12345
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
q
OK
how r u
OK
Hi hi
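telnet is not the only way to push events into the netcat source; any TCP client that sends a line and reads the "OK" acknowledgement works. A minimal Python sketch (the helper name send_event is ours, and the demo talks to a tiny stand-in server so it runs without a live Flume agent):

```python
import socket
import threading

def send_event(host, port, line):
    """Send one line to a Flume netcat source and return its reply ('OK')."""
    with socket.create_connection((host, port)) as s:
        s.sendall(line.encode("ascii") + b"\r\n")
        return s.recv(16).decode("ascii").strip()

# --- stand-in for the netcat source, so this sketch runs without Flume ---
def _fake_netcat_source(server):
    conn, _ = server.accept()
    with conn:
        conn.recv(1024)        # read the event line
        conn.sendall(b"OK\n")  # the netcat source acks each event with OK

server = socket.socket()
server.bind(("127.0.0.1", 0))   # pick any free port
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=_fake_netcat_source, args=(server,), daemon=True).start()

reply = send_event("127.0.0.1", port, "how r u")
print(reply)  # prints: OK
```

Against a real agent you would call send_event("localhost", 12345, ...) with the agent from Experiment 1 running.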


Experiment 2:   source: seq     sink: hdfs
Configuration file (seq_gen.conf):
agent.sources=seqsource
agent.channels=mem
agent.sinks=hdfssink
agent.sources.seqsource.type=seq
agent.sinks.hdfssink.type=hdfs
agent.sinks.hdfssink.hdfs.path=hdfs://localhost:8020/user/training/seqgendata/
agent.sinks.hdfssink.hdfs.filePrefix=log
agent.sinks.hdfssink.hdfs.rollCount=10000
agent.sinks.hdfssink.hdfs.fileType=DataStream
agent.channels.mem.type=memory
agent.channels.mem.capacity=1000
agent.channels.mem.transactionCapacity=100
agent.sources.seqsource.channels=mem
agent.sinks.hdfssink.channel=mem

$./bin/flume-ng agent --conf $FLUME_CONF --conf-file $FLUME_CONF/seq_gen.conf --name agent
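The seq source is purely a test source: each event body is the next value of a counter starting at 0, so the files under /user/training/seqgendata/ end up holding consecutive integers, one per line. Its output can be sketched as:

```python
def seq_source(n):
    # Emits what Flume's seq source would: the counter value as each event body
    return [str(i) for i in range(n)]

print(seq_source(5))  # -> ['0', '1', '2', '3', '4']
```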








Experiment 3:      source: netcat    sink: hdfs
sample1.conf:
agent.sources=seqsource
agent.channels=mem
agent.sinks=hdfssink
agent.sources.seqsource.type=netcat
agent.sources.seqsource.bind=localhost
agent.sources.seqsource.port=22222
agent.sinks.hdfssink.type=hdfs
agent.sinks.hdfssink.hdfs.path=hdfs://localhost:8020/user/training/sampledata/
agent.sinks.hdfssink.hdfs.filePrefix=netcat
agent.sinks.hdfssink.hdfs.rollInterval=120
agent.sinks.hdfssink.hdfs.fileType=DataStream
agent.channels.mem.type=memory
agent.channels.mem.capacity=1000
agent.channels.mem.transactionCapacity=100
agent.sources.seqsource.channels=mem
agent.sinks.hdfssink.channel=mem
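With rollInterval=120 the HDFS sink closes the current file and opens a new one every 120 seconds. Note the sink also rolls by size and event count by default; to roll on time only, a common pattern (assuming you want the other triggers off) is:

```
# roll every 2 minutes only; 0 disables size- and count-based rolling
agent.sinks.hdfssink.hdfs.rollInterval=120
agent.sinks.hdfssink.hdfs.rollSize=0
agent.sinks.hdfssink.hdfs.rollCount=0
```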


[training@localhost apache-flume-1.6.0-bin]$ flume-ng agent -n agent -c conf -f conf/sample1.conf




Experiment 4:   source: exec     sink: hdfs
agent-hdfs.sources = logger-source
agent-hdfs.sinks = hdfs-sink
agent-hdfs.channels = memoryChannel
agent-hdfs.sources.logger-source.type=exec
agent-hdfs.sources.logger-source.command=tail -f /home/training/employee
agent-hdfs.sources.logger-source.batchSize=2
agent-hdfs.sources.logger-source.channels=memoryChannel
agent-hdfs.sinks.hdfs-sink.type=hdfs
agent-hdfs.sinks.hdfs-sink.hdfs.path=/user/training/empsinkdata
agent-hdfs.sinks.hdfs-sink.hdfs.batchSize=10
agent-hdfs.sinks.hdfs-sink.channel=memoryChannel
agent-hdfs.channels.memoryChannel.type=memory
agent-hdfs.channels.memoryChannel.capacity=1000
#agent-hdfs.channels.memoryChannel.capacity=50
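The exec source runs tail -f /home/training/employee and forwards each new line as an event; with batchSize=2 it hands lines to the channel in transactions of two (a partial batch is still flushed after the source's batch timeout). The grouping can be sketched as (batch_lines is our illustrative helper, not Flume code):

```python
def batch_lines(lines, batch_size=2):
    # Group incoming lines into channel transactions of batch_size events,
    # the way the exec source does with batchSize=2
    return [lines[i:i + batch_size] for i in range(0, len(lines), batch_size)]

new_lines = ["101,anil", "102,sunita", "103,ravi"]
print(batch_lines(new_lines))  # -> [['101,anil', '102,sunita'], ['103,ravi']]
```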

[training@localhost apache-flume-1.6.0-bin]$ flume-ng agent -n agent-hdfs -c conf -f conf/hdsink.conf
Info: Including Hadoop libraries found via (/usr/lib/hadoop/bin/hadoop) for HDFS access
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.4.3.jar from classpath
