Flume Examples
Food for thought: behind every bit of hard work there is a reward many times over.
1. Official Example: Monitoring Port Data
This setup is mostly used for testing.
1.1 Requirements
Requirement: Flume listens on local port 8888; messages are then sent to that port with the telnet tool, and Flume prints the received data to the console in real time.
1.2 Analysis
- Use the telnet tool to send data to local port 8888
- Flume listens on local port 8888 and reads the data through its source
- Flume writes the received data to the console through its sink

1.3 Implementation Steps
- Install the required tools
yum install -y telnet nc
- Check whether port 8888 is already in use
netstat -tunlp | grep 8888
Description
netstat is a very useful tool for monitoring TCP/IP networking: it can display the routing table, active network connections, and the status of every network interface.
Basic syntax: netstat [options]
Options:
-t or --tcp: show TCP connections
-u or --udp: show UDP connections
-n or --numeric: show numeric IP addresses instead of resolving names through DNS
-l or --listening: show listening server sockets
-p or --programs: show the PID and name of the program using each socket
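On newer systems netstat is often not installed by default; ss (from iproute2) reports the same information with the same flag mnemonics. A minimal check for port 8888, sketched here as an alternative to the netstat command above:

ss -tunlp | grep 8888    # -t TCP, -u UDP, -n numeric, -l listening, -p owning process

# in a script, grep's exit status tells you whether anything is listening on the port
if ss -tunl | grep -q ':8888 '; then
    echo "port 8888 is already in use"
else
    echo "port 8888 is free"
fi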
Quick commands
mkdir -p /usr/local/flume/flume_job
cd /usr/local/flume/flume_job/
cat >flume-telnet-logger.conf<<EOF
# Name the components on this agent
cmz.sources = r1
cmz.sinks = k1
cmz.channels = c1

# Describe/configure the source
cmz.sources.r1.type = netcat
cmz.sources.r1.bind = localhost
cmz.sources.r1.port = 8888

# Describe the sink
cmz.sinks.k1.type = logger

# Use a channel which buffers events in memory
cmz.channels.c1.type = memory
cmz.channels.c1.capacity = 1000
cmz.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
cmz.sources.r1.channels = c1
cmz.sinks.k1.channel = c1
EOF

# start the agent
/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume_job/conf/ \
--name cmz \
--conf-file /usr/local/flume/flume_job/flume-telnet-logger.conf \
-Dflume.root.logger=INFO,console

# send test data
cat >/tmp/cmz_test.txt<<EOF
hello caimengzhi.
you look great today.
you did a good job.
EOF
nc localhost 8888 </tmp/cmz_test.txt
Note: the configuration comes from the official user guide: http://flume.apache.org/FlumeUserGuide.html

Notes
1. File name: flume-telnet-logger.conf — "flume" is the identifier, "telnet" is the input (source) and "logger" is the output (sink). Name your configuration files so that the input and output are obvious at a glance.
2. The agent name passed with --name (cmz here) must match the agent name used inside the configuration file, otherwise the agent will not start.
3. Parameter explanation:
--conf conf/ : the directory holding Flume's own configuration files
--name cmz : names this agent cmz
--conf-file /usr/local/flume/flume_job/flume-telnet-logger.conf : the configuration file Flume reads for this run, located in the flume_job directory
-Dflume.root.logger=INFO,console : -D overrides the flume.root.logger property at runtime and sets the console log level to INFO. Available log levels include debug, info, warn and error.
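Running the agent in the foreground as in the walkthrough below ties up a terminal. A common alternative, sketched here as an assumption (the --conf directory is taken to be Flume's own conf/ folder, which differs from the path used in the walkthrough), is to push it into the background with nohup and follow the log file:

cd /usr/local/flume/flume_job/
nohup /usr/local/flume/bin/flume-ng agent \
  --conf /usr/local/flume/conf/ \
  --name cmz \
  --conf-file /usr/local/flume/flume_job/flume-telnet-logger.conf \
  -Dflume.root.logger=INFO,console >flume-telnet-logger.log 2>&1 &

tail -f flume-telnet-logger.log   # Ctrl-C stops tail only; the agent keeps running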

Detailed walkthrough
[root@master ~]# mkdir -p /usr/local/flume/flume_job
[root@master ~]# cd /usr/local/flume/flume_job/
[root@master flume_job]# cat >flume-telnet-logger.conf<<EOF
> # Name the components on this agent
> cmz.sources = r1
> cmz.sinks = k1
> cmz.channels = c1
>
> # Describe/configure the source
> cmz.sources.r1.type = netcat
> cmz.sources.r1.bind = localhost
> cmz.sources.r1.port = 8888
>
> # Describe the sink
> cmz.sinks.k1.type = logger
>
> # Use a channel which buffers events in memory
> cmz.channels.c1.type = memory
> cmz.channels.c1.capacity = 1000
> cmz.channels.c1.transactionCapacity = 100
>
> # Bind the source and sink to the channel
> cmz.sources.r1.channels = c1
> cmz.sinks.k1.channel = c1
> EOF

# start the agent
[root@master flume_job]# /usr/local/flume/bin/flume-ng agent \
> --conf /usr/local/flume_job/conf/ \
> --name cmz \
> --conf-file /usr/local/flume/flume_job/flume-telnet-logger.conf \
> -Dflume.root.logger=INFO,console
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including Hive libraries found via (/usr/local/hive) for Hive access
.....
19/08/20 19:02:38 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
19/08/20 19:02:38 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:/usr/local/flume/flume_job/flume-telnet-logger.conf
19/08/20 19:02:38 INFO conf.FlumeConfiguration: Processing:k1
19/08/20 19:02:38 INFO conf.FlumeConfiguration: Added sinks: k1 Agent: cmz
19/08/20 19:02:38 INFO conf.FlumeConfiguration: Processing:k1
19/08/20 19:02:38 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [cmz]
19/08/20 19:02:38 INFO node.AbstractConfigurationProvider: Creating channels
19/08/20 19:02:38 INFO channel.DefaultChannelFactory: Creating instance of channel c1 type memory
19/08/20 19:02:38 INFO node.AbstractConfigurationProvider: Created channel c1
19/08/20 19:02:38 INFO source.DefaultSourceFactory: Creating instance of source r1, type netcat
19/08/20 19:02:38 INFO sink.DefaultSinkFactory: Creating instance of sink: k1, type: logger
19/08/20 19:02:38 INFO node.AbstractConfigurationProvider: Channel c1 connected to [r1, k1]
19/08/20 19:02:38 INFO node.Application: Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:r1,state:IDLE} }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@1eea61d0 counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
19/08/20 19:02:38 INFO node.Application: Starting Channel c1
19/08/20 19:02:38 INFO node.Application: Waiting for channel: c1 to start. Sleeping for 500 ms
19/08/20 19:02:38 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
19/08/20 19:02:38 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: c1 started
19/08/20 19:02:38 INFO node.Application: Starting Sink k1
19/08/20 19:02:38 INFO node.Application: Starting Source r1
19/08/20 19:02:38 INFO source.NetcatSource: Source starting
19/08/20 19:02:38 INFO source.NetcatSource: Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:8888]
... (output truncated)

# in another terminal, send test data
[root@master ~]# cat >/tmp/cmz_test.txt<<EOF
> hello caimengzhi.
> you look great today.
> you did a good job.
> EOF
[root@master ~]#
[root@master ~]# nc localhost 8888 </tmp/cmz_test.txt
OK
OK
OK

# back in the agent's console
19/08/20 19:08:30 INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 63 61 69 6D 65 6E 67 7A 68 69 hello caimengzhi }
19/08/20 19:08:30 INFO sink.LoggerSink: Event: { headers:{} body: 79 6F 75 20 6C 6F 6F 6B 20 67 72 65 61 74 20 74 you look great t }
19/08/20 19:08:30 INFO sink.LoggerSink: Event: { headers:{} body: 79 6F 75 20 64 69 64 20 61 20 67 6F 6F 64 20 6A you did a good j }

[root@master ~]# telnet localhost 8888
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
summer
OK
19/08/20 19:34:28 INFO sink.LoggerSink: Event: { headers:{} body: 73 75 6D 6D 65 72 0D summer. }
2. Example: Streaming a Local File to HDFS in Real Time
2.1 Requirements
Monitor the Nginx access log in real time and upload it to HDFS.
2.2 Analysis

2.3 Steps
For Flume to write data to HDFS, the following Hadoop-related jars must be on its classpath:
commons-configuration-1.6.jar
hadoop-auth-2.6.5.jar
hadoop-common-2.6.5.jar
hadoop-hdfs-2.6.5.jar
commons-io-2.4.jar
htrace-core-3.0.4.jar
Copy them into the /usr/local/flume/lib/ directory:
[root@master share]# cp /usr/local/hadoop/share/hadoop/tools/lib/commons-configuration-1.6.jar /usr/local/flume/lib/
[root@master share]# cp /usr/local/hadoop/share/hadoop/tools/lib/hadoop-auth-2.6.5.jar /usr/local/flume/lib/
[root@master share]# cp /usr/local/hadoop/share/hadoop/common/hadoop-common-2.6.5.jar /usr/local/flume/lib/
[root@master share]# cp /usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.6.5.jar /usr/local/flume/lib/
[root@master share]# cp /usr/local/hadoop/share/hadoop/tools/lib/commons-io-2.4.jar /usr/local/flume/lib/
[root@master share]# cp /usr/local/hadoop/share/hadoop/tools/lib/htrace-core-3.0.4.jar /usr/local/flume/lib/
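The exec source below tails /var/log/nginx/access.log, so Nginx must be installed and writing logs before the agent is started. A minimal setup sketch, assuming a CentOS 7 host where nginx comes from the EPEL repository (adjust for your distribution):

yum install -y epel-release      # nginx is shipped in EPEL on CentOS 7
yum install -y nginx
systemctl start nginx
curl -s http://127.0.0.1/ >/dev/null    # one request appends one line to the access log
tail -n 1 /var/log/nginx/access.log     # confirm the log file is being written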
- Create the configuration file
[root@master flume_job]# touch flume-file-hdfs.conf
Quick commands
cd /usr/local/flume/flume_job/
cat >flume-file-hdfs.conf<<EOF
# Name the components on this agent
a2.sources = r2
a2.sinks = k2
a2.channels = c2

# Describe/configure the source
a2.sources.r2.type = exec
a2.sources.r2.command = tail -F /var/log/nginx/access.log
a2.sources.r2.shell = /bin/bash -c

# Describe the sink
a2.sinks.k2.type = hdfs
a2.sinks.k2.hdfs.path = hdfs://master:9000/flume/%Y%m%d/%H
# prefix for uploaded files
a2.sinks.k2.hdfs.filePrefix = logs-
# roll the target folder based on time
a2.sinks.k2.hdfs.round = true
# number of time units per new folder
a2.sinks.k2.hdfs.roundValue = 1
# the time unit used for rounding
a2.sinks.k2.hdfs.roundUnit = hour
# use the local timestamp
a2.sinks.k2.hdfs.useLocalTimeStamp = true
# number of events to accumulate before flushing to HDFS
a2.sinks.k2.hdfs.batchSize = 1000
# file type; compression is supported
a2.sinks.k2.hdfs.fileType = DataStream
# how often (seconds) to roll a new file
a2.sinks.k2.hdfs.rollInterval = 600
# roll size of each file in bytes
a2.sinks.k2.hdfs.rollSize = 134217700
# rolling is independent of the number of events
a2.sinks.k2.hdfs.rollCount = 0
# minimum number of block replicas
a2.sinks.k2.hdfs.minBlockReplicas = 1

# Use a channel which buffers events in memory
a2.channels.c2.type = memory
a2.channels.c2.capacity = 1000
a2.channels.c2.transactionCapacity = 100

# Bind the source and sink to the channel
a2.sources.r2.channels = c2
a2.sinks.k2.channel = c2
EOF

/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume_job/conf/ \
--name a2 \
--conf-file /usr/local/flume/flume_job/flume-file-hdfs.conf

hadoop fs -ls /flume/*
The target directory on HDFS (/flume here) does not need to be created in advance; Flume creates it automatically.
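To exercise the whole pipeline you can generate a little traffic against Nginx and then look for the files Flume creates under /flume. The curl loop below is an assumption for illustration (the walkthrough only issues a single request):

for i in $(seq 1 20); do
    curl -s http://127.0.0.1/ >/dev/null     # each request appends one line to access.log
done

# Flume creates /flume/<yyyymmdd>/<hour> by itself; files still being written carry a .tmp suffix
hadoop fs -ls -R /flume
hadoop fs -text /flume/$(date +%Y%m%d)/$(date +%H)/* | head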

Detailed walkthrough
[root@master ~]# cd /usr/local/flume/flume_job/
[root@master flume_job]# cat >flume-file-hdfs.conf<<EOF
> # Name the components on this agent
> a2.sources = r2
> a2.sinks = k2
> a2.channels = c2
>
> # Describe/configure the source
> a2.sources.r2.type = exec
> a2.sources.r2.command = tail -F /var/log/nginx/access.log
> a2.sources.r2.shell = /bin/bash -c
>
> # Describe the sink
> a2.sinks.k2.type = hdfs
> a2.sinks.k2.hdfs.path = hdfs://master:9000/flume/%Y%m%d/%H
> # prefix for uploaded files
> a2.sinks.k2.hdfs.filePrefix = logs-
> # roll the target folder based on time
> a2.sinks.k2.hdfs.round = true
> # number of time units per new folder
> a2.sinks.k2.hdfs.roundValue = 1
> # the time unit used for rounding
> a2.sinks.k2.hdfs.roundUnit = hour
> # use the local timestamp
> a2.sinks.k2.hdfs.useLocalTimeStamp = true
> # number of events to accumulate before flushing to HDFS
> a2.sinks.k2.hdfs.batchSize = 1000
> # file type; compression is supported
> a2.sinks.k2.hdfs.fileType = DataStream
> # how often (seconds) to roll a new file
> a2.sinks.k2.hdfs.rollInterval = 600
> # roll size of each file in bytes
> a2.sinks.k2.hdfs.rollSize = 134217700
> # rolling is independent of the number of events
> a2.sinks.k2.hdfs.rollCount = 0
> # minimum number of block replicas
> a2.sinks.k2.hdfs.minBlockReplicas = 1
>
> # Use a channel which buffers events in memory
> a2.channels.c2.type = memory
> a2.channels.c2.capacity = 1000
> a2.channels.c2.transactionCapacity = 100
>
> # Bind the source and sink to the channel
> a2.sources.r2.channels = c2
> a2.sinks.k2.channel = c2
> EOF
[root@master flume_job]# ls
flume-file-hdfs.conf  flume-telnet-logger.conf
[root@master ~]# /usr/local/flume/bin/flume-ng agent \
> --conf /usr/local/flume_job/conf/ \
> --name a2 \
> --conf-file /usr/local/flume/flume_job/flume-file-hdfs.conf
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including Hive libraries found via (/usr/local/hive) for Hive access

# check the result
[root@master nginx]# hadoop fs -ls /flume
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-08-20 20:25 /flume/20190820
[root@master nginx]# hadoop fs -ls /flume/*
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-08-20 20:27 /flume/20190820/20
[root@master nginx]# hadoop fs -ls /flume/20190820/20/*
-rw-r--r--   3 root supergroup     118471 2019-08-20 20:27 /flume/20190820/20/logs-.1566303934487
-rw-r--r--   3 root supergroup       2884 2019-08-20 20:27 /flume/20190820/20/logs-.1566304046120.tmp
[root@master nginx]# hadoop fs -text /flume/20190820/20/logs-.1566303934487 |head
127.0.0.1 - - [20/Aug/2019:20:25:31 +0800] "GET / HTTP/1.1" 200 3700 "-" "curl/7.29.0" "-"
106.11.159.48 - - [20/Aug/2019:07:46:30 +0800] "GET /tools/xshell/xshell.sendfile/ HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 YisouSpider/5.0 Safari/537.36"
106.11.159.48 - - [20/Aug/2019:07:46:30 +0800] "GET /tools/xshell/xshell.sendfile/ HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 YisouSpider/5.0 Safari/537.36"
42.156.254.112
- - [20/Aug/2019:07:46:55 +0800] "GET /tools/xshell/xshell.sendfile/ HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Ch rome/69.0.3497.81 YisouSpider/5.0 Safari/537.36"42.156.254.112 - - [20/Aug/2019:07:46:55 +0800] "GET /tools/xshell/xshell.sendfile/ HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Ch rome/69.0.3497.81 YisouSpider/5.0 Safari/537.36"112.96.71.67 - - [20/Aug/2019:07:57:56 +0800] "GET /hadoop/CDH/cdh.hdfs_shell/ HTTP/1.1" 401 188 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, l ike Gecko) Version/12.1.1 Mobile/15E148 Safari/604.1"112.96.71.67 - - [20/Aug/2019:07:57:56 +0800] "GET /hadoop/CDH/cdh.hdfs_shell/ HTTP/1.1" 401 188 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, l ike Gecko) Version/12.1.1 Mobile/15E148 Safari/604.1"112.96.71.67 - - [20/Aug/2019:07:57:57 +0800] "GET /hadoop/CDH/cdh.hdfs_shell/ HTTP/1.1" 401 188 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, l ike Gecko) Version/12.1.1 Mobile/15E148 Safari/604.1"112.96.71.67 - - [20/Aug/2019:07:57:57 +0800] "GET /hadoop/CDH/cdh.hdfs_shell/ HTTP/1.1" 401 188 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, l ike Gecko) Version/12.1.1 Mobile/15E148 Safari/604.1"112.96.71.67 - - [20/Aug/2019:07:57:57 +0800] "GET /apple-touch-icon-120x120-precomposed.png HTTP/1.1" 401 188 "-" "MobileSafari/604.1 CFNetwork/978.0.7 Darwin/18.6.0" text: Unable to write to output stream.
3. Example: Streaming Files from a Directory to HDFS in Real Time
3.1 Requirements
Use Flume to watch an entire directory and upload new files to HDFS.

3.2 Implementation Steps
Quick commands
mkdir -p /opt/upload
echo '1' >/opt/upload/cmz1.txt
echo '2' >/opt/upload/cmz2.log
echo '3' >/opt/upload/cmz3.tmp
cd /usr/local/flume/flume_job/
cat >flume-dir-hdfs.conf<<EOF
a3.sources = r3
a3.sinks = k3
a3.channels = c3

# Describe/configure the source
# the source type is a spooling directory
a3.sources.r3.type = spooldir
# the directory to monitor
a3.sources.r3.spoolDir = /opt/upload
# suffix added once a file has been fully uploaded
a3.sources.r3.fileSuffix = .COMPLETED
a3.sources.r3.fileHeader = true
# ignore (do not upload) any file ending in .tmp
a3.sources.r3.ignorePattern = ([^ ]*\.tmp)

# Describe the sink
a3.sinks.k3.type = hdfs
a3.sinks.k3.hdfs.path = hdfs://master:9000/flume/upload/%Y%m%d/%H
# prefix for uploaded files
a3.sinks.k3.hdfs.filePrefix = upload-
# roll the target folder based on time
a3.sinks.k3.hdfs.round = true
# number of time units per new folder
a3.sinks.k3.hdfs.roundValue = 1
# the time unit used for rounding
a3.sinks.k3.hdfs.roundUnit = hour
# use the local timestamp
a3.sinks.k3.hdfs.useLocalTimeStamp = true
# number of events to accumulate before flushing to HDFS
a3.sinks.k3.hdfs.batchSize = 100
# file type; compression is supported
a3.sinks.k3.hdfs.fileType = DataStream
# how often (seconds) to roll a new file
a3.sinks.k3.hdfs.rollInterval = 600
# roll size of each file, roughly 128 MB
a3.sinks.k3.hdfs.rollSize = 134217700
# rolling is independent of the number of events
a3.sinks.k3.hdfs.rollCount = 0
# minimum number of block replicas
a3.sinks.k3.hdfs.minBlockReplicas = 1

# Use a channel which buffers events in memory
a3.channels.c3.type = memory
a3.channels.c3.capacity = 1000
a3.channels.c3.transactionCapacity = 100

# Bind the source and sink to the channel
a3.sources.r3.channels = c3
a3.sinks.k3.channel = c3
EOF

/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume_job/conf/ \
--name a3 \
--conf-file /usr/local/flume/flume_job/flume-dir-hdfs.conf

hdfs dfs -text /flume/upload/*
ll /opt/upload

Notes
When using the Spooling Directory Source:
1) Do not create and then keep modifying files inside the monitored directory.
2) Files that have been uploaded are renamed with the .COMPLETED suffix (the suffix is configurable).
3) The monitored directory is scanned for changes roughly every 500 milliseconds.
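A small probe script makes this behaviour easy to watch: drop a file into /opt/upload and poll until the source renames it with the .COMPLETED suffix. This is a sketch that assumes the a3 agent above is already running:

f=/opt/upload/probe_$(date +%s).log
echo "hello from $(hostname)" > "$f"      # a new file appears in the monitored directory

# the directory is scanned roughly every 500 ms; finished files are renamed
until [ -e "${f}.COMPLETED" ]; do
    sleep 1
done
echo "uploaded: ${f}.COMPLETED"
hadoop fs -ls /flume/upload/$(date +%Y%m%d)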
Detailed walkthrough
[root@master ~]# cd /opt/
[root@master opt]# ls
[root@master opt]# mkdir -p /opt/upload
[root@master opt]# cd /usr/local/flume/flume_job/
[root@master flume_job]# cat >flume-dir-hdfs.conf<<EOF
> a3.sources = r3
> a3.sinks = k3
> a3.channels = c3
>
> # Describe/configure the source
> # the source type is a spooling directory
> a3.sources.r3.type = spooldir
> # the directory to monitor
> a3.sources.r3.spoolDir = /opt/upload
> # suffix added once a file has been fully uploaded
> a3.sources.r3.fileSuffix = .COMPLETED
> a3.sources.r3.fileHeader = true
> # ignore (do not upload) any file ending in .tmp
> a3.sources.r3.ignorePattern = ([^ ]*\.tmp)
>
> # Describe the sink
> a3.sinks.k3.type = hdfs
> a3.sinks.k3.hdfs.path = hdfs://master:9000/flume/upload/%Y%m%d/%H
> # prefix for uploaded files
> a3.sinks.k3.hdfs.filePrefix = upload-
> # roll the target folder based on time
> a3.sinks.k3.hdfs.round = true
> # number of time units per new folder
> a3.sinks.k3.hdfs.roundValue = 1
> # the time unit used for rounding
> a3.sinks.k3.hdfs.roundUnit = hour
> # use the local timestamp
> a3.sinks.k3.hdfs.useLocalTimeStamp = true
> # number of events to accumulate before flushing to HDFS
> a3.sinks.k3.hdfs.batchSize = 100
> # file type; compression is supported
> a3.sinks.k3.hdfs.fileType = DataStream
> # how often (seconds) to roll a new file
> a3.sinks.k3.hdfs.rollInterval = 600
> # roll size of each file, roughly 128 MB
> a3.sinks.k3.hdfs.rollSize = 134217700
> # rolling is independent of the number of events
> a3.sinks.k3.hdfs.rollCount = 0
> # minimum number of block replicas
> a3.sinks.k3.hdfs.minBlockReplicas = 1
>
> # Use a channel which buffers events in memory
> a3.channels.c3.type = memory
> a3.channels.c3.capacity = 1000
> a3.channels.c3.transactionCapacity = 100
>
> # Bind the source and sink to the channel
> a3.sources.r3.channels = c3
> a3.sinks.k3.channel = c3
> EOF
[root@master flume_job]# /usr/local/flume/bin/flume-ng agent \
> --conf /usr/local/flume_job/conf/ \
> --name a3 \
> --conf-file /usr/local/flume/flume_job/flume-dir-hdfs.conf
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including Hive libraries found via (/usr/local/hive) for Hive access
... (output truncated)

# test
[root@master ~]# cd /opt/upload/
[root@master upload]# echo '1'>cmz1.txt
[root@master upload]# echo '2'>cmz2.log
[root@master upload]# echo '3'>cmz3.tmp
[root@master upload]# ls /opt/upload
cmz1.txt  cmz2.log  cmz3.tmp

# check again after about a minute
[root@master upload]# ls /opt/upload
cmz1.txt.COMPLETED  cmz2.log.COMPLETED  cmz3.tmp

# check HDFS again
[root@master upload]# hdfs dfs -ls /flume/
Found 2 items
drwxr-xr-x   - root supergroup          0 2019-08-20 20:25 /flume/20190820
drwxr-xr-x   - root supergroup          0 2019-08-20 23:25 /flume/upload
[root@master upload]# hdfs dfs -ls /flume/upload
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-08-20 23:25 /flume/upload/20190820
[root@master upload]# hdfs dfs -ls /flume/upload/20190820
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-08-20 23:25 /flume/upload/20190820/23
[root@master upload]# hdfs dfs -ls /flume/upload/20190820/23
Found 1 items
-rw-r--r--   3 root supergroup          4 2019-08-20 23:25 /flume/upload/20190820/23/upload-.1566314712480.tmp
[root@master upload]# hdfs dfs -text /flume/upload/20190820/23/*
1
2
4. Example: Single Source, Multiple Sinks (Replicating Selector)
4.1 Requirements
Flume-1 monitors file changes and passes every change to Flume-2, which stores it in HDFS. At the same time Flume-1 passes the change to Flume-3, which writes it to the local file system.

4.2 Analysis
Single source with multiple channels and sinks: Flume-1 monitors file changes and replicates every event to Flume-2 (which stores it in HDFS) and to Flume-3 (which writes it to the local file system).

4.3 Implementation
1. Configure one agent with a source that reads the log file plus two channels and two sinks, feeding flume-flume-hdfs and flume-flume-dir respectively.
2. Configure an agent whose source receives the upstream Flume output and whose sink writes to HDFS.
3. Configure an agent whose source receives the upstream Flume output and whose sink writes to a local directory.
Since the downstream agents must be up before the upstream one, a port check is sketched after this list.
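Because a1's avro sinks connect out to a2 and a3, the two downstream agents must already be listening on ports 4141 and 4142 before a1 is started. A small helper, sketched here as an assumption and relying on bash's built-in /dev/tcp redirection, waits for both ports:

for port in 4141 4142; do
    # the redirection only succeeds once something accepts connections on the port
    until (echo >/dev/tcp/master/"$port") 2>/dev/null; do
        echo "waiting for the avro source on port $port ..."
        sleep 2
    done
done
echo "both avro sources are up; it is now safe to start agent a1"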
Quick commands
mkdir -p /tmp/datas/flume
cd /usr/local/flume/flume_job/
cat>flume-file-flume.conf<<EOF
# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2
# replicate the data flow to every channel
a1.sources.r1.selector.type = replicating

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/nginx/access.log
a1.sources.r1.shell = /bin/bash -c

# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = master
a1.sinks.k1.port = 4141

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = master
a1.sinks.k2.port = 4142

# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
EOF

cat>flume-flume-hdfs.conf<<EOF
# Name the components on this agent
a2.sources = r1
a2.sinks = k1
a2.channels = c1

# Describe/configure the source
a2.sources.r1.type = avro
a2.sources.r1.bind = master
a2.sources.r1.port = 4141

# Describe the sink
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://master:9000/flume2/%Y%m%d/%H
# prefix for uploaded files
a2.sinks.k1.hdfs.filePrefix = flume2-
# roll the target folder based on time
a2.sinks.k1.hdfs.round = true
# number of time units per new folder
a2.sinks.k1.hdfs.roundValue = 1
# the time unit used for rounding
a2.sinks.k1.hdfs.roundUnit = hour
# use the local timestamp
a2.sinks.k1.hdfs.useLocalTimeStamp = true
# number of events to accumulate before flushing to HDFS
a2.sinks.k1.hdfs.batchSize = 100
# file type; compression is supported
a2.sinks.k1.hdfs.fileType = DataStream
# how often (seconds) to roll a new file
a2.sinks.k1.hdfs.rollInterval = 600
# roll size of each file, roughly 128 MB
a2.sinks.k1.hdfs.rollSize = 134217700
# rolling is independent of the number of events
a2.sinks.k1.hdfs.rollCount = 0
# minimum number of block replicas
a2.sinks.k1.hdfs.minBlockReplicas = 1

# Describe the channel
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
EOF

cat>flume-flume-dir.conf<<EOF
# Name the components on this agent
a3.sources = r1
a3.sinks = k1
a3.channels = c2

# Describe/configure the source
a3.sources.r1.type = avro
a3.sources.r1.bind = master
a3.sources.r1.port = 4142

# Describe the sink
a3.sinks.k1.type = file_roll
# the local output directory must already exist; the sink will not create it
a3.sinks.k1.sink.directory = /tmp/datas/flume

# Describe the channel
a3.channels.c2.type = memory
a3.channels.c2.capacity = 1000
a3.channels.c2.transactionCapacity = 100

# Bind the source and sink to the channel
a3.sources.r1.channels = c2
a3.sinks.k1.channel = c2
EOF

# Start the agents. Order matters: open three terminals and start a3, then a2, then a1.
/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume_job/conf/ \
--name a3 \
--conf-file /usr/local/flume/flume_job/flume-flume-dir.conf

/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume_job/conf/ \
--name a2 \
--conf-file /usr/local/flume/flume_job/flume-flume-hdfs.conf

/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume_job/conf/ \
--name a1 \
--conf-file /usr/local/flume/flume_job/flume-file-flume.conf

# Check
hadoop fs -ls /flume2
What is Avro?
Note: Avro is a language-neutral data serialization and RPC framework created by Doug Cutting, the founder of Hadoop.
Note: RPC (Remote Procedure Call) is a protocol for requesting a service from a program on a remote machine over the network, without needing to know the underlying network technology.
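Besides tailing the Nginx log through a1, you can push a test event straight into an avro source with Flume's bundled avro-client, which is a quick way to confirm that a2 is reachable on port 4141. A sketch (the test file name is made up for the example):

echo "avro test event $(date)" >/tmp/avro_test.txt

# -H/-p point at the avro source, -F names the file whose lines become events
/usr/local/flume/bin/flume-ng avro-client \
  --conf /usr/local/flume/conf/ \
  -H master -p 4141 \
  -F /tmp/avro_test.txt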
Detailed walkthrough
[root@master flume_job]# cd /usr/local/flume/flume_job/
[root@master flume_job]# cat>flume-file-flume.conf<<EOF
> # Name the components on this agent
> a1.sources = r1
> a1.sinks = k1 k2
> a1.channels = c1 c2
> # replicate the data flow to every channel
> a1.sources.r1.selector.type = replicating
>
> # Describe/configure the source
> a1.sources.r1.type = exec
> a1.sources.r1.command = tail -F /var/log/nginx/access.log
> a1.sources.r1.shell = /bin/bash -c
>
> # Describe the sink
> a1.sinks.k1.type = avro
> a1.sinks.k1.hostname = master
> a1.sinks.k1.port = 4141
>
> a1.sinks.k2.type = avro
> a1.sinks.k2.hostname = master
> a1.sinks.k2.port = 4142
>
> # Describe the channel
> a1.channels.c1.type = memory
> a1.channels.c1.capacity = 1000
> a1.channels.c1.transactionCapacity = 100
>
> a1.channels.c2.type = memory
> a1.channels.c2.capacity = 1000
> a1.channels.c2.transactionCapacity = 100
>
> # Bind the source and sink to the channel
> a1.sources.r1.channels = c1 c2
> a1.sinks.k1.channel = c1
> a1.sinks.k2.channel = c2
> EOF
[root@master flume_job]#
[root@master flume_job]# cat>flume-flume-hdfs.conf<<EOF
> # Name the components on this agent
> a2.sources = r1
> a2.sinks = k1
> a2.channels = c1
>
> # Describe/configure the source
> a2.sources.r1.type = avro
> a2.sources.r1.bind = master
> a2.sources.r1.port = 4141
>
> # Describe the sink
> a2.sinks.k1.type = hdfs
> a2.sinks.k1.hdfs.path = hdfs://master:9000/flume2/%Y%m%d/%H
> # prefix for uploaded files
> a2.sinks.k1.hdfs.filePrefix = flume2-
> # roll the target folder based on time
> a2.sinks.k1.hdfs.round = true
> # number of time units per new folder
> a2.sinks.k1.hdfs.roundValue = 1
> # the time unit used for rounding
> a2.sinks.k1.hdfs.roundUnit = hour
> # use the local timestamp
> a2.sinks.k1.hdfs.useLocalTimeStamp = true
> # number of events to accumulate before flushing to HDFS
> a2.sinks.k1.hdfs.batchSize = 100
> # file type; compression is supported
> a2.sinks.k1.hdfs.fileType = DataStream
> # how often (seconds) to roll a new file
> a2.sinks.k1.hdfs.rollInterval = 600
> # roll size of each file, roughly 128 MB
> a2.sinks.k1.hdfs.rollSize = 134217700
> # rolling is independent of the number of events
> a2.sinks.k1.hdfs.rollCount = 0
> # minimum number of block replicas
> a2.sinks.k1.hdfs.minBlockReplicas = 1
>
> # Describe the channel
> a2.channels.c1.type = memory
> a2.channels.c1.capacity = 1000
> a2.channels.c1.transactionCapacity = 100
>
> # Bind the source and sink to the channel
> a2.sources.r1.channels = c1
> a2.sinks.k1.channel = c1
> EOF
[root@master flume_job]# cat>flume-flume-dir.conf<<EOF
> # Name the components on this agent
> a3.sources = r1
> a3.sinks = k1
> a3.channels = c2
>
> # Describe/configure the source
> a3.sources.r1.type = avro
> a3.sources.r1.bind = master
> a3.sources.r1.port = 4142
>
> # Describe the sink
> a3.sinks.k1.type = file_roll
> # the local output directory must already exist; the sink will not create it
> a3.sinks.k1.sink.directory = /tmp/datas/flume
>
> # Describe the channel
> a3.channels.c2.type = memory
> a3.channels.c2.capacity = 1000
> a3.channels.c2.transactionCapacity = 100
>
> # Bind the source and sink to the channel
> a3.sources.r1.channels = c2
> a3.sinks.k1.channel = c2
> EOF
[root@master flume_job]# ll
total 24
-rw-r--r-- 1 root root 1496 Aug 20 23:16 flume-dir-hdfs.conf
-rw-r--r-- 1 root root  856 Aug 21 15:00 flume-file-flume.conf
-rw-r--r-- 1 root root 1336 Aug 20 20:21 flume-file-hdfs.conf
-rw-r--r-- 1 root root  627 Aug 21 15:00 flume-flume-dir.conf
-rw-r--r-- 1 root root 1288 Aug 21 15:00 flume-flume-hdfs.conf
-rw-r--r-- 1 root root  507 Aug 20 19:09 flume-telnet-logger.conf

# Start the agents (a3 first, then a2, then a1)
[root@master flume_job]# /usr/local/flume/bin/flume-ng agent \
> --conf /usr/local/flume_job/conf/ \
> --name a3 \
> --conf-file /usr/local/flume/flume_job/flume-flume-dir.conf
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath Info: Including Hive libraries found via (/usr/local/hive) for Hive access [root@master ~]# /usr/local/flume/bin/flume-ng agent \ > --conf /usr/local/flume_job/conf/ \ > --name a2 \ > --conf-file /usr/local/flume/flume_job/flume-flume-hdfs.conf Warning: JAVA_HOME is not set! Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath Info: Including Hive libraries found via (/usr/local/hive) for Hive access [root@master ~]# /usr/local/flume/bin/flume-ng agent \ > --conf /usr/local/flume_job/conf/ \ > --name a1 \ > --conf-file /usr/local/flume/flume_job/flume-file-flume.conf Warning: JAVA_HOME is not set! 
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath Info: Including Hive libraries found via (/usr/local/hive) for Hive access # 检查hdfs [root@master flume]# hadoop fs -ls /flume2 Found 1 items drwxr-xr-x - root supergroup 0 2019-08-21 15:44 /flume2/20190821 [root@master flume]# hadoop fs -ls /flume2/* Found 1 items drwxr-xr-x - root supergroup 0 2019-08-21 15:44 /flume2/20190821/15 [root@master flume]# hadoop fs -ls /flume2/20190821/15 Found 1 items -rw-r--r-- 3 root supergroup 3504 2019-08-21 15:44 /flume2/20190821/15/flume2-.1566373444836.tmp [root@master flume]# hadoop fs -text /flume2/20190821/15/flume2-.1566373444836.tmp 157.255.17.17 - - [21/Aug/2019:15:26:51 +0800] "GET /pictures/index/p7.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.302 9.110 Safari/537.36 SE 2.X MetaSr 1.0"157.255.17.17 - - [21/Aug/2019:15:26:51 +0800] "GET /pictures/index/p7.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.302 9.110 Safari/537.36 SE 2.X MetaSr 1.0"157.255.17.17 - - [21/Aug/2019:15:31:50 +0800] "GET /pictures/index/p8.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.302 9.110 Safari/537.36 SE 2.X MetaSr 1.0"157.255.17.17 - - [21/Aug/2019:15:31:50 +0800] "GET /pictures/index/p8.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.302 9.110 Safari/537.36 SE 2.X MetaSr 1.0"180.163.220.68 - - [21/Aug/2019:15:35:10 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) Ap pleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"180.163.220.68 - - [21/Aug/2019:15:35:10 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) Ap pleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"180.163.220.68 - - [21/Aug/2019:15:35:12 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Buil d/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"180.163.220.68 - - [21/Aug/2019:15:35:12 
+0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Buil d/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"42.236.10.106 - - [21/Aug/2019:15:35:29 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build /HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"42.236.10.106 - - [21/Aug/2019:15:35:29 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build /HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN" # 产看本地 [root@master flume]# cd /tmp/datas/flume/ [root@master flume]# ls 1566373411495-1 1566373411495-2 1566373411495-3 [root@master flume]# ll total 4 -rw-r--r-- 1 root root 0 Aug 21 15:43 1566373411495-1 -rw-r--r-- 1 root root 3504 Aug 21 15:44 1566373411495-2 -rw-r--r-- 1 root root 0 Aug 21 15:44 1566373411495-3 [root@master flume]# cat 1566373411495-2 157.255.17.17 - - [21/Aug/2019:15:26:51 +0800] "GET /pictures/index/p7.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.302 9.110 Safari/537.36 SE 2.X MetaSr 1.0"157.255.17.17 - - [21/Aug/2019:15:26:51 +0800] "GET /pictures/index/p7.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.302 9.110 Safari/537.36 SE 2.X MetaSr 1.0"157.255.17.17 - - [21/Aug/2019:15:31:50 +0800] "GET /pictures/index/p8.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.302 9.110 Safari/537.36 SE 2.X MetaSr 1.0"157.255.17.17 - - [21/Aug/2019:15:31:50 +0800] "GET /pictures/index/p8.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.302 9.110 Safari/537.36 SE 2.X MetaSr 1.0"180.163.220.68 - - [21/Aug/2019:15:35:10 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) Ap pleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"180.163.220.68 - - [21/Aug/2019:15:35:10 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) Ap pleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"180.163.220.68 - - [21/Aug/2019:15:35:12 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; 
zh-CN; EML-AL00 Buil d/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"180.163.220.68 - - [21/Aug/2019:15:35:12 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Buil d/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"42.236.10.106 - - [21/Aug/2019:15:35:29 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build /HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"42.236.10.106 - - [21/Aug/2019:15:35:29 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build /HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"