Flume Cases

Motivational note: behind every extra effort lies a reward returned many times over.

1. Monitoring port data (official example)

  This is mostly used for testing.

1.1 Requirements

  Requirement: Flume listens on local port 8888; we then use the telnet tool to send messages to local port 8888, and Flume displays the received data on the console in real time.

1.2 Analysis

  • Send data to local port 8888 with the telnet tool
  • Flume listens on local port 8888 and reads the data through its source
  • Flume writes the received data to the console through its sink

(figure: flume)

1.3 Implementation steps

  • Install the required tools
yum install -y telnet nc
  • Check whether port 8888 is already in use
netstat -tunlp | grep 8888
Description
netstat is a very useful tool for monitoring TCP/IP networking: it can display the routing table, active network connections, and status information for every network interface.
Basic syntax: netstat [options]
Options:
    -t or --tcp: show the state of TCP connections
    -u or --udp: show the state of UDP connections
    -n or --numeric: show numeric IP addresses instead of resolving names
    -l or --listening: show only listening server sockets
    -p or --programs: show the PID and name of the program using each socket
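
An equivalent check with ss, plus a direct reachability test (a small sketch; nc comes from the package installed above and ss from iproute, which CentOS ships by default):

ss -tunlp | grep 8888
nc -z localhost 8888 && echo 'port 8888 is in use' || echo 'port 8888 is free'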

Quick commands

mkdir -p /usr/local/flume/flume_job
cd /usr/local/flume/flume_job/

cat >flume-telnet-logger.conf<<EOF
# Name the components on this agent
cmz.sources = r1
cmz.sinks = k1
cmz.channels = c1

# Describe/configure the source
cmz.sources.r1.type = netcat
cmz.sources.r1.bind = localhost
cmz.sources.r1.port = 8888

# Describe the sink
cmz.sinks.k1.type = logger

# Use a channel which buffers events in memory
cmz.channels.c1.type = memory
cmz.channels.c1.capacity = 1000
cmz.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
cmz.sources.r1.channels = c1
cmz.sinks.k1.channel = c1
EOF

Start the agent
/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume/conf/ \
--name cmz \
--conf-file /usr/local/flume/flume_job/flume-telnet-logger.conf \
-Dflume.root.logger=INFO,console

cat >/tmp/cmz_test.txt<<EOF
hello caimengzhi.
you look great today.
you did a good job.
EOF

nc localhost 8888 </tmp/cmz_test.txt

Note: the configuration is based on the official user guide: http://flume.apache.org/FlumeUserGuide.html

(figure: flume)

Notes
1. File name
flume-telnet-logger.conf
  |    |      |_________ output (sink)
  |    |________________ input (source)
  |_____________________ identifier
Name the configuration file so that its input and output can be read directly from the name.

2. The agent name passed to --name (cmz here) must match the property prefix used in the configuration file, otherwise the agent will not start.
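
A quick sanity check before starting (a simple sketch): list the property prefixes that actually appear in the file and make sure the value you pass to --name is among them:

grep -oE '^[A-Za-z0-9_]+\.' flume-telnet-logger.conf | sort -u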

3. Parameters
--conf conf/  : the directory that holds Flume's own configuration files
--name cmz  : names this agent cmz
--conf-file /usr/local/flume/flume_job/flume-telnet-logger.conf : the configuration file Flume reads for this run, i.e. flume-telnet-logger.conf in the flume_job folder.
-Dflume.root.logger=INFO,console
 -D overrides the flume.root.logger property at runtime, setting the console log level to INFO.
Log levels include: debug, info, warn, error.
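
If you prefer not to pass -D every time, the same default can be set in Flume's conf/log4j.properties (a sketch of the relevant lines; the exact file shipped with your Flume distribution may differ slightly):

flume.root.logger=INFO,console
log4j.rootLogger=${flume.root.logger}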

(figure: flume)

Detailed walkthrough
[root@master ~]# mkdir -p /usr/local/flume/flume_job
[root@master ~]# cd /usr/local/flume/flume_job/
[root@master flume_job]# cat >flume-telnet-logger.conf<<EOF
> # Name the components on this agent
> cmz.sources = r1
> cmz.sinks = k1
> cmz.channels = c1
> 
> # Describe/configure the source
> cmz.sources.r1.type = netcat
> cmz.sources.r1.bind = localhost
> cmz.sources.r1.port = 8888
> 
> # Describe the sink
> cmz.sinks.k1.type = logger
> 
> # Use a channel which buffers events in memory
> cmz.channels.c1.type = memory
> cmz.channels.c1.capacity = 1000
> cmz.channels.c1.transactionCapacity = 100
> 
> # Bind the source and sink to the channel
> cmz.sources.r1.channels = c1
> cmz.sinks.k1.channel = c1
> EOF


Start the agent
[root@master flume_job]# /usr/local/flume/bin/flume-ng agent \
> --conf /usr/local/flume_job/conf/ \
> --name cmz \
> --conf-file /usr/local/flume/flume_job/flume-telnet-logger.conf \
> -Dflume.root.logger=INFO,console
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including Hive libraries found via (/usr/local/hive) for Hive access

.....

19/08/20 19:02:38 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
19/08/20 19:02:38 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:/usr/local/flume/flume_job/flume-telnet-logger.conf
19/08/20 19:02:38 INFO conf.FlumeConfiguration: Processing:k1
19/08/20 19:02:38 INFO conf.FlumeConfiguration: Added sinks: k1 Agent: cmz
19/08/20 19:02:38 INFO conf.FlumeConfiguration: Processing:k1
19/08/20 19:02:38 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [cmz]
19/08/20 19:02:38 INFO node.AbstractConfigurationProvider: Creating channels
19/08/20 19:02:38 INFO channel.DefaultChannelFactory: Creating instance of channel c1 type memory
19/08/20 19:02:38 INFO node.AbstractConfigurationProvider: Created channel c1
19/08/20 19:02:38 INFO source.DefaultSourceFactory: Creating instance of source r1, type netcat
19/08/20 19:02:38 INFO sink.DefaultSinkFactory: Creating instance of sink: k1, type: logger
19/08/20 19:02:38 INFO node.AbstractConfigurationProvider: Channel c1 connected to [r1, k1]
19/08/20 19:02:38 INFO node.Application: Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:r1,state:IDLE} }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@1eea61d0 counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
19/08/20 19:02:38 INFO node.Application: Starting Channel c1
19/08/20 19:02:38 INFO node.Application: Waiting for channel: c1 to start. Sleeping for 500 ms
19/08/20 19:02:38 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
19/08/20 19:02:38 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: c1 started
19/08/20 19:02:38 INFO node.Application: Starting Sink k1
19/08/20 19:02:38 INFO node.Application: Starting Source r1
19/08/20 19:02:38 INFO source.NetcatSource: Source starting
19/08/20 19:02:38 INFO source.NetcatSource: Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:8888]
... (remaining output omitted)

[root@master ~]# cat >/tmp/cmz_test.txt<<EOF
> hello caimengzhi.
> you look great today.
> you did a good job.
> EOF
[root@master ~]# 
[root@master ~]# nc localhost 8888 </tmp/cmz_test.txt
OK
OK
OK

19/08/20 19:08:30 INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 63 61 69 6D 65 6E 67 7A 68 69 hello caimengzhi }
19/08/20 19:08:30 INFO sink.LoggerSink: Event: { headers:{} body: 79 6F 75 20 6C 6F 6F 6B 20 67 72 65 61 74 20 74 you look great t }
19/08/20 19:08:30 INFO sink.LoggerSink: Event: { headers:{} body: 79 6F 75 20 64 69 64 20 61 20 67 6F 6F 64 20 6A you did a good j }


[root@master ~]# telnet localhost 8888 
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
summer
OK

19/08/20 19:34:28 INFO sink.LoggerSink: Event: { headers:{} body: 73 75 6D 6D 65 72 0D      summer. }

2. Reading a local file into HDFS in real time

2.1 Requirements

Monitor the Nginx log in real time and upload it to HDFS.

2.2 Analysis

(figure: flume)

2.3 Steps

  To write data to HDFS, Flume must have the relevant Hadoop jars on its classpath:

commons-configuration-1.6.jar
hadoop-auth-2.6.5.jar
hadoop-common-2.6.5.jar
hadoop-hdfs-2.6.5.jar
commons-io-2.4.jar
htrace-core-3.0.4.jar
Copy them into /usr/local/flume/lib/:
[root@master share]# cp /usr/local/hadoop/share/hadoop/tools/lib/commons-configuration-1.6.jar /usr/local/flume/lib/
[root@master share]# cp /usr/local/hadoop/share/hadoop/tools/lib/hadoop-auth-2.6.5.jar /usr/local/flume/lib/
[root@master share]# cp /usr/local/hadoop/share/hadoop/common/hadoop-common-2.6.5.jar /usr/local/flume/lib/
[root@master share]# cp /usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.6.5.jar /usr/local/flume/lib/
[root@master share]# cp /usr/local/hadoop/share/hadoop/tools/lib/commons-io-2.4.jar /usr/local/flume/lib/
[root@master share]# cp /usr/local/hadoop/share/hadoop/tools/lib/htrace-core-3.0.4.jar /usr/local/flume/lib/
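
If you would rather not hunt down each jar by hand, a small loop over the list works as well (a sketch; it assumes the jars live somewhere under /usr/local/hadoop/share/hadoop, as above):

for jar in commons-configuration-1.6.jar hadoop-auth-2.6.5.jar hadoop-common-2.6.5.jar \
           hadoop-hdfs-2.6.5.jar commons-io-2.4.jar htrace-core-3.0.4.jar; do
    find /usr/local/hadoop/share/hadoop -name "$jar" -exec cp {} /usr/local/flume/lib/ \;
done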

  • Create the configuration file

[root@master flume_job]# touch flume-file-hdfs.conf
  Create the flume-file-hdfs.conf file. To read a file on the Linux filesystem you have to follow the rules of ordinary Linux commands, and since the Nginx log lives on the local Linux filesystem, the source type chosen is exec (i.e. execute): Flume executes a Linux command to read the file.
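
A side note on the command itself: tail -F (capital F) follows the file by name, so it keeps working after Nginx rotates its log, whereas tail -f keeps reading the old file descriptor and goes silent after rotation. You can see the difference outside of Flume with a quick experiment (a sketch):

tail -F /var/log/nginx/access.log   # keeps following after logrotate / nginx -s reopen
tail -f /var/log/nginx/access.log   # stops showing new lines once the file is rotated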

Quick commands

cd /usr/local/flume/flume_job/
cat >flume-file-hdfs.conf<<EOF
# Name the components on this agent
a2.sources = r2
a2.sinks = k2
a2.channels = c2

# Describe/configure the source
a2.sources.r2.type = exec
a2.sources.r2.command = tail -F /var/log/nginx/access.log
a2.sources.r2.shell = /bin/bash -c

# Describe the sink
a2.sinks.k2.type = hdfs
a2.sinks.k2.hdfs.path = hdfs://master:9000/flume/%Y%m%d/%H
# prefix of the files uploaded to HDFS
a2.sinks.k2.hdfs.filePrefix = logs-
# whether to roll folders based on time
a2.sinks.k2.hdfs.round = true
# how many time units before a new folder is created
a2.sinks.k2.hdfs.roundValue = 1
# the time unit used for rounding
a2.sinks.k2.hdfs.roundUnit = hour
# whether to use the local timestamp
a2.sinks.k2.hdfs.useLocalTimeStamp = true
# how many events to accumulate before flushing to HDFS
a2.sinks.k2.hdfs.batchSize = 1000
# file type; compression is also supported
a2.sinks.k2.hdfs.fileType = DataStream
# how long (seconds) before a new file is rolled
a2.sinks.k2.hdfs.rollInterval = 600
# roll size of each file
a2.sinks.k2.hdfs.rollSize = 134217700
# rolling is independent of the number of events
a2.sinks.k2.hdfs.rollCount = 0
# minimum block replicas
a2.sinks.k2.hdfs.minBlockReplicas = 1

# Use a channel which buffers events in memory
a2.channels.c2.type = memory
a2.channels.c2.capacity = 1000
a2.channels.c2.transactionCapacity = 100

# Bind the source and sink to the channel
a2.sources.r2.channels = c2
a2.sinks.k2.channel = c2
EOF

/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume/conf/ \
--name a2 \
--conf-file /usr/local/flume/flume_job/flume-file-hdfs.conf

hadoop fs -ls /flume/*

The target directory in HDFS does not need to be created in advance; Flume creates it automatically.
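
One detail worth spelling out: the sink path uses time escapes (%Y%m%d/%H), and the exec source does not attach a timestamp header to events, so hdfs.useLocalTimeStamp = true is what makes those escapes resolvable. An alternative (a sketch, not used above) is to add a timestamp interceptor on the source and leave useLocalTimeStamp off:

a2.sources.r2.interceptors = i1
a2.sources.r2.interceptors.i1.type = timestamp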

(figure: flume)

Detailed walkthrough
[root@master ~]# cd /usr/local/flume/flume_job/
[root@master flume_job]# cat >flume-file-hdfs.conf<<EOF
> # Name the components on this agent
> a2.sources = r2
> a2.sinks = k2
> a2.channels = c2
> 
> # Describe/configure the source
> a2.sources.r2.type = exec
> a2.sources.r2.command = tail -F /var/log/nginx/access.log
> a2.sources.r2.shell = /bin/bash -c
> 
> # Describe the sink
> a2.sinks.k2.type = hdfs
> a2.sinks.k2.hdfs.path = hdfs://master:9000/flume/%Y%m%d/%H
> # prefix of the files uploaded to HDFS
> a2.sinks.k2.hdfs.filePrefix = logs-
> # whether to roll folders based on time
> a2.sinks.k2.hdfs.round = true
> # how many time units before a new folder is created
> a2.sinks.k2.hdfs.roundValue = 1
> # the time unit used for rounding
> a2.sinks.k2.hdfs.roundUnit = hour
> # whether to use the local timestamp
> a2.sinks.k2.hdfs.useLocalTimeStamp = true
> # how many events to accumulate before flushing to HDFS
> a2.sinks.k2.hdfs.batchSize = 1000
> # file type; compression is also supported
> a2.sinks.k2.hdfs.fileType = DataStream
> # how long (seconds) before a new file is rolled
> a2.sinks.k2.hdfs.rollInterval = 600
> # roll size of each file
> a2.sinks.k2.hdfs.rollSize = 134217700
> # rolling is independent of the number of events
> a2.sinks.k2.hdfs.rollCount = 0
> # minimum block replicas
> a2.sinks.k2.hdfs.minBlockReplicas = 1
> 
> # Use a channel which buffers events in memory
> a2.channels.c2.type = memory
> a2.channels.c2.capacity = 1000
> a2.channels.c2.transactionCapacity = 100
> 
> # Bind the source and sink to the channel
> a2.sources.r2.channels = c2
> a2.sinks.k2.channel = c2
> EOF
[root@master flume_job]# ls
flume-file-hdfs.conf  flume-telnet-logger.conf

[root@master ~]# /usr/local/flume/bin/flume-ng agent \
> --conf /usr/local/flume_job/conf/ \
> --name a2 \
> --conf-file /usr/local/flume/flume_job/flume-file-hdfs.conf 
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including Hive libraries found via (/usr/local/hive) for Hive access

Check the result
[root@master nginx]# hadoop fs -ls /flume
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-08-20 20:25 /flume/20190820
[root@master nginx]# hadoop fs -ls /flume/*
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-08-20 20:27 /flume/20190820/20
[root@master nginx]# hadoop fs -ls /flume/20190820/20/*
-rw-r--r--   3 root supergroup     118471 2019-08-20 20:27 /flume/20190820/20/logs-.1566303934487
-rw-r--r--   3 root supergroup       2884 2019-08-20 20:27 /flume/20190820/20/logs-.1566304046120.tmp
[root@master nginx]# hadoop fs -text /flume/20190820/20/logs-.1566303934487 |head
127.0.0.1 - - [20/Aug/2019:20:25:31 +0800] "GET / HTTP/1.1" 200 3700 "-" "curl/7.29.0" "-"
106.11.159.48 - - [20/Aug/2019:07:46:30 +0800] "GET /tools/xshell/xshell.sendfile/ HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 YisouSpider/5.0 Safari/537.36"
106.11.159.48 - - [20/Aug/2019:07:46:30 +0800] "GET /tools/xshell/xshell.sendfile/ HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 YisouSpider/5.0 Safari/537.36"
42.156.254.112 - - [20/Aug/2019:07:46:55 +0800] "GET /tools/xshell/xshell.sendfile/ HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 YisouSpider/5.0 Safari/537.36"
42.156.254.112 - - [20/Aug/2019:07:46:55 +0800] "GET /tools/xshell/xshell.sendfile/ HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 YisouSpider/5.0 Safari/537.36"
112.96.71.67 - - [20/Aug/2019:07:57:56 +0800] "GET /hadoop/CDH/cdh.hdfs_shell/ HTTP/1.1" 401 188 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Mobile/15E148 Safari/604.1"
112.96.71.67 - - [20/Aug/2019:07:57:56 +0800] "GET /hadoop/CDH/cdh.hdfs_shell/ HTTP/1.1" 401 188 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Mobile/15E148 Safari/604.1"
112.96.71.67 - - [20/Aug/2019:07:57:57 +0800] "GET /hadoop/CDH/cdh.hdfs_shell/ HTTP/1.1" 401 188 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Mobile/15E148 Safari/604.1"
112.96.71.67 - - [20/Aug/2019:07:57:57 +0800] "GET /hadoop/CDH/cdh.hdfs_shell/ HTTP/1.1" 401 188 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 12_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Mobile/15E148 Safari/604.1"
112.96.71.67 - - [20/Aug/2019:07:57:57 +0800] "GET /apple-touch-icon-120x120-precomposed.png HTTP/1.1" 401 188 "-" "MobileSafari/604.1 CFNetwork/978.0.7 Darwin/18.6.0"
text: Unable to write to output stream.

3. Reading files from a directory into HDFS in real time

3.1 Requirements

  Use Flume to monitor an entire directory for new files.

(figure: flume)

3.2 Implementation steps

Quick commands

mkdir -p /opt/upload
cd /opt/upload
echo '1'>cmz1.txt
echo '2'>cmz2.log
echo '3'>cmz3.tmp

cd /usr/local/flume/flume_job/
cat >flume-dir-hdfs.conf<<EOF
a3.sources = r3
a3.sinks = k3
a3.channels = c3

# Describe/configure the source
# the source type is a spooling directory
a3.sources.r3.type = spooldir
# the directory to monitor
a3.sources.r3.spoolDir = /opt/upload
# suffix appended once a file has been ingested
a3.sources.r3.fileSuffix = .COMPLETED
a3.sources.r3.fileHeader = true
# ignore (do not upload) files ending in .tmp
a3.sources.r3.ignorePattern = ([^ ]*\.tmp)

# Describe the sink
a3.sinks.k3.type = hdfs
a3.sinks.k3.hdfs.path = hdfs://master:9000/flume/upload/%Y%m%d/%H
# prefix of the files uploaded to HDFS
a3.sinks.k3.hdfs.filePrefix = upload-
# whether to roll folders based on time
a3.sinks.k3.hdfs.round = true
# how many time units before a new folder is created
a3.sinks.k3.hdfs.roundValue = 1
# the time unit used for rounding
a3.sinks.k3.hdfs.roundUnit = hour
# whether to use the local timestamp
a3.sinks.k3.hdfs.useLocalTimeStamp = true
# how many events to accumulate before flushing to HDFS
a3.sinks.k3.hdfs.batchSize = 100
# file type; compression is also supported
a3.sinks.k3.hdfs.fileType = DataStream
# how long (seconds) before a new file is rolled
a3.sinks.k3.hdfs.rollInterval = 600
# roll size of each file, roughly 128 MB
a3.sinks.k3.hdfs.rollSize = 134217700
# rolling is independent of the number of events
a3.sinks.k3.hdfs.rollCount = 0
# minimum block replicas
a3.sinks.k3.hdfs.minBlockReplicas = 1

# Use a channel which buffers events in memory
a3.channels.c3.type = memory
a3.channels.c3.capacity = 1000
a3.channels.c3.transactionCapacity = 100

# Bind the source and sink to the channel
a3.sources.r3.channels = c3
a3.sinks.k3.channel = c3
EOF

/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume/conf/ \
--name a3 \
--conf-file /usr/local/flume/flume_job/flume-dir-hdfs.conf 

hdfs dfs -text /flume/upload/*
ll /opt/upload

(figure: flume)

Notes

Notes on using the Spooling Directory Source:
1) Do not create files in the monitored directory and then keep modifying them
2) Files that have been fully ingested are renamed with the .COMPLETED suffix (the suffix is configurable)
3) The monitored directory is scanned for changes every 500 milliseconds
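
Because of point 1), a common pattern is to write the file somewhere outside the monitored directory first and then mv it in; a rename within the same filesystem is atomic, so Flume only ever sees complete files (a small sketch):

echo 'finished content' > /opt/cmz4.txt.part   # write outside the watched directory
mv /opt/cmz4.txt.part /opt/upload/cmz4.txt     # atomic rename into the spooling directory
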
Detailed walkthrough
[root@master ~]# cd /opt/
[root@master opt]# ls
[root@master opt]# mkdir -p /opt/upload
[root@master opt]# cd /usr/local/flume/flume_job/
[root@master flume_job]# cat >flume-dir-hdfs.conf<<EOF
> a3.sources = r3
> a3.sinks = k3
> a3.channels = c3
> 
> # Describe/configure the source
> # the source type is a spooling directory
> a3.sources.r3.type = spooldir
> # the directory to monitor
> a3.sources.r3.spoolDir = /opt/upload
> # suffix appended once a file has been ingested
> a3.sources.r3.fileSuffix = .COMPLETED
> a3.sources.r3.fileHeader = true
> # ignore (do not upload) files ending in .tmp
> a3.sources.r3.ignorePattern = ([^ ]*\.tmp)
> 
> # Describe the sink
> a3.sinks.k3.type = hdfs
> a3.sinks.k3.hdfs.path = hdfs://master:9000/flume/upload/%Y%m%d/%H
> # prefix of the files uploaded to HDFS
> a3.sinks.k3.hdfs.filePrefix = upload-
> # whether to roll folders based on time
> a3.sinks.k3.hdfs.round = true
> # how many time units before a new folder is created
> a3.sinks.k3.hdfs.roundValue = 1
> # the time unit used for rounding
> a3.sinks.k3.hdfs.roundUnit = hour
> # whether to use the local timestamp
> a3.sinks.k3.hdfs.useLocalTimeStamp = true
> # how many events to accumulate before flushing to HDFS
> a3.sinks.k3.hdfs.batchSize = 100
> # file type; compression is also supported
> a3.sinks.k3.hdfs.fileType = DataStream
> # how long (seconds) before a new file is rolled
> a3.sinks.k3.hdfs.rollInterval = 600
> # roll size of each file, roughly 128 MB
> a3.sinks.k3.hdfs.rollSize = 134217700
> # rolling is independent of the number of events
> a3.sinks.k3.hdfs.rollCount = 0
> # minimum block replicas
> a3.sinks.k3.hdfs.minBlockReplicas = 1
> 
> # Use a channel which buffers events in memory
> a3.channels.c3.type = memory
> a3.channels.c3.capacity = 1000
> a3.channels.c3.transactionCapacity = 100
> 
> # Bind the source and sink to the channel
> a3.sources.r3.channels = c3
> a3.sinks.k3.channel = c3
> EOF

[root@master flume_job]# /usr/local/flume/bin/flume-ng agent \
> --conf /usr/local/flume_job/conf/ \
> --name a3 \
> --conf-file /usr/local/flume/flume_job/flume-dir-hdfs.conf 
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including Hive libraries found via (/usr/local/hive) for Hive access

... (remaining output omitted)

Test
[root@master ~]# cd /opt/upload/
[root@master upload]# echo '1'>cmz1.txt
[root@master upload]# echo '2'>cmz2.log
[root@master upload]# echo '3'>cmz3.tmp
[root@master upload]# ls /opt/upload
cmz1.txt  cmz2.log  cmz3.tmp
Check again after about a minute
[root@master upload]# ls /opt/upload
cmz1.txt.COMPLETED  cmz2.log.COMPLETED  cmz3.tmp
Check HDFS again
[root@master upload]# hdfs dfs -ls /flume/
Found 2 items
drwxr-xr-x   - root supergroup          0 2019-08-20 20:25 /flume/20190820
drwxr-xr-x   - root supergroup          0 2019-08-20 23:25 /flume/upload
[root@master upload]# hdfs dfs -ls /flume/upload
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-08-20 23:25 /flume/upload/20190820
[root@master upload]# hdfs dfs -ls /flume/upload/20190820
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-08-20 23:25 /flume/upload/20190820/23
[root@master upload]# hdfs dfs -ls /flume/upload/20190820/23
Found 1 items
-rw-r--r--   3 root supergroup          4 2019-08-20 23:25 /flume/upload/20190820/23/upload-.1566314712480.tmp
[root@master upload]# hdfs dfs -text /flume/upload/20190820/23/*
1
2

4. Single data source, multiple outputs (selector)

4.1 Requirements

  Flume-1 monitors a file for changes and passes the new content to Flume-2, which stores it in HDFS. At the same time Flume-1 passes the content to Flume-3, which writes it to the local file system.

(figure: flume)

4.2 Analysis

  Single source with multiple channels and sinks: Flume-1 monitors a file for changes and passes the new content to Flume-2, which stores it in HDFS, and simultaneously to Flume-3, which writes it to the local file system.

(figure: flume)

4.3 Implementation

1. Configure one source that receives the log file, plus two channels and two sinks that feed flume-flume-hdfs and flume-flume-dir respectively.
2. Configure a source that receives the upstream Flume's output and a sink that writes to HDFS.
3. Configure a source that receives the upstream Flume's output and a sink that writes to a local directory.

Quick commands

mkdir -p /tmp/datas/flume
cd /usr/local/flume/flume_job/
cat>flume-file-flume.conf<<EOF
# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2
# replicate the data flow to all channels
a1.sources.r1.selector.type = replicating

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/nginx/access.log
a1.sources.r1.shell = /bin/bash -c

# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = master 
a1.sinks.k1.port = 4141

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = master
a1.sinks.k2.port = 4142

# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
EOF

cat>flume-flume-hdfs.conf<<EOF
# Name the components on this agent
a2.sources = r1
a2.sinks = k1
a2.channels = c1

# Describe/configure the source
a2.sources.r1.type = avro
a2.sources.r1.bind = master
a2.sources.r1.port = 4141

# Describe the sink
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://master:9000/flume2/%Y%m%d/%H
# prefix of the files uploaded to HDFS
a2.sinks.k1.hdfs.filePrefix = flume2-
# whether to roll folders based on time
a2.sinks.k1.hdfs.round = true
# how many time units before a new folder is created
a2.sinks.k1.hdfs.roundValue = 1
# the time unit used for rounding
a2.sinks.k1.hdfs.roundUnit = hour
# whether to use the local timestamp
a2.sinks.k1.hdfs.useLocalTimeStamp = true
# how many events to accumulate before flushing to HDFS
a2.sinks.k1.hdfs.batchSize = 100
# file type; compression is also supported
a2.sinks.k1.hdfs.fileType = DataStream
# how long (seconds) before a new file is rolled
a2.sinks.k1.hdfs.rollInterval = 600
# roll size of each file, roughly 128 MB
a2.sinks.k1.hdfs.rollSize = 134217700
# rolling is independent of the number of events
a2.sinks.k1.hdfs.rollCount = 0
# minimum block replicas
a2.sinks.k1.hdfs.minBlockReplicas = 1

# Describe the channel
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
EOF

cat>flume-flume-dir.conf<<EOF
# Name the components on this agent
a3.sources = r1
a3.sinks = k1
a3.channels = c2

# Describe/configure the source
a3.sources.r1.type = avro
a3.sources.r1.bind = master
a3.sources.r1.port = 4142

# Describe the sink
a3.sinks.k1.type = file_roll
# the local output directory must already exist; Flume will not create it if it is missing
a3.sinks.k1.sink.directory = /tmp/datas/flume

# Describe the channel
a3.channels.c2.type = memory
a3.channels.c2.capacity = 1000
a3.channels.c2.transactionCapacity = 100

# Bind the source and sink to the channel
a3.sources.r1.channels = c2
a3.sinks.k1.channel = c2
EOF

# Start the agents; the order matters. Open three terminals and start one agent in each.
/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume/conf/ \
--name a3 \
--conf-file /usr/local/flume/flume_job/flume-flume-dir.conf

/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume/conf/ \
--name a2 \
--conf-file /usr/local/flume/flume_job/flume-flume-hdfs.conf

/usr/local/flume/bin/flume-ng agent \
--conf /usr/local/flume/conf/ \
--name a1 \
--conf-file /usr/local/flume/flume_job/flume-file-flume.conf

# Check
hadoop fs -ls /flume2
About Avro
Note: Avro is a language-independent data serialization and RPC framework created by Doug Cutting, the founder of Hadoop.
Note: RPC (Remote Procedure Call) is a protocol for requesting a service from a program on a remote computer over the network, without needing to understand the underlying network technology.
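
Flume also ships a small Avro client, which is handy for checking that an avro source is reachable before starting the upstream agent (a sketch; it sends each line of the given file to a2's source as one event):

/usr/local/flume/bin/flume-ng avro-client \
--conf /usr/local/flume/conf/ \
--host master --port 4141 \
--filename /tmp/cmz_test.txt
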
Detailed walkthrough
[root@master flume_job]# cd /usr/local/flume/flume_job/
[root@master flume_job]# cat>flume-file-flume.conf<<EOF
> # Name the components on this agent
> a1.sources = r1
> a1.sinks = k1 k2
> a1.channels = c1 c2
> # replicate the data flow to all channels
> a1.sources.r1.selector.type = replicating
> 
> # Describe/configure the source
> a1.sources.r1.type = exec
> a1.sources.r1.command = tail -F /var/log/nginx/access.log
> a1.sources.r1.shell = /bin/bash -c
> 
> # Describe the sink
> a1.sinks.k1.type = avro
> a1.sinks.k1.hostname = master 
> a1.sinks.k1.port = 4141
> 
> a1.sinks.k2.type = avro
> a1.sinks.k2.hostname = master
> a1.sinks.k2.port = 4142
> 
> # Describe the channel
> a1.channels.c1.type = memory
> a1.channels.c1.capacity = 1000
> a1.channels.c1.transactionCapacity = 100
> 
> a1.channels.c2.type = memory
> a1.channels.c2.capacity = 1000
> a1.channels.c2.transactionCapacity = 100
> 
> # Bind the source and sink to the channel
> a1.sources.r1.channels = c1 c2
> a1.sinks.k1.channel = c1
> a1.sinks.k2.channel = c2
> EOF
[root@master flume_job]# 
[root@master flume_job]# cat>flume-flume-hdfs.conf<<EOF
> # Name the components on this agent
> a2.sources = r1
> a2.sinks = k1
> a2.channels = c1
> 
> # Describe/configure the source
> a2.sources.r1.type = avro
> a2.sources.r1.bind = master
> a2.sources.r1.port = 4141
> 
> # Describe the sink
> a2.sinks.k1.type = hdfs
> a2.sinks.k1.hdfs.path = hdfs://master:9000/flume2/%Y%m%d/%H
> # prefix of the files uploaded to HDFS
> a2.sinks.k1.hdfs.filePrefix = flume2-
> # whether to roll folders based on time
> a2.sinks.k1.hdfs.round = true
> # how many time units before a new folder is created
> a2.sinks.k1.hdfs.roundValue = 1
> # the time unit used for rounding
> a2.sinks.k1.hdfs.roundUnit = hour
> # whether to use the local timestamp
> a2.sinks.k1.hdfs.useLocalTimeStamp = true
> # how many events to accumulate before flushing to HDFS
> a2.sinks.k1.hdfs.batchSize = 100
> # file type; compression is also supported
> a2.sinks.k1.hdfs.fileType = DataStream
> # how long (seconds) before a new file is rolled
> a2.sinks.k1.hdfs.rollInterval = 600
> # roll size of each file, roughly 128 MB
> a2.sinks.k1.hdfs.rollSize = 134217700
> # rolling is independent of the number of events
> a2.sinks.k1.hdfs.rollCount = 0
> # minimum block replicas
> a2.sinks.k1.hdfs.minBlockReplicas = 1
> 
> # Describe the channel
> a2.channels.c1.type = memory
> a2.channels.c1.capacity = 1000
> a2.channels.c1.transactionCapacity = 100
> 
> # Bind the source and sink to the channel
> a2.sources.r1.channels = c1
> a2.sinks.k1.channel = c1
> EOF

[root@master flume_job]# cat>flume-flume-dir.conf<<EOF
> # Name the components on this agent
> a3.sources = r1
> a3.sinks = k1
> a3.channels = c2
> 
> # Describe/configure the source
> a3.sources.r1.type = avro
> a3.sources.r1.bind = master
> a3.sources.r1.port = 4142
> 
> # Describe the sink
> a3.sinks.k1.type = file_roll
> # the local output directory must already exist; Flume will not create it if it is missing
> a3.sinks.k1.sink.directory = /tmp/datas/flume
> 
> # Describe the channel
> a3.channels.c2.type = memory
> a3.channels.c2.capacity = 1000
> a3.channels.c2.transactionCapacity = 100
> 
> # Bind the source and sink to the channel
> a3.sources.r1.channels = c2
> a3.sinks.k1.channel = c2
> EOF
[root@master flume_job]# ll
total 24
-rw-r--r-- 1 root root 1496 Aug 20 23:16 flume-dir-hdfs.conf
-rw-r--r-- 1 root root  856 Aug 21 15:00 flume-file-flume.conf
-rw-r--r-- 1 root root 1336 Aug 20 20:21 flume-file-hdfs.conf
-rw-r--r-- 1 root root  627 Aug 21 15:00 flume-flume-dir.conf
-rw-r--r-- 1 root root 1288 Aug 21 15:00 flume-flume-hdfs.conf
-rw-r--r-- 1 root root  507 Aug 20 19:09 flume-telnet-logger.conf

# Start
[root@master flume_job]# /usr/local/flume/bin/flume-ng agent \
> --conf /usr/local/flume_job/conf/ \
> --name a3 \
> --conf-file /usr/local/flume/flume_job/flume-flume-dir.conf
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including Hive libraries found via (/usr/local/hive) for Hive access

[root@master ~]# /usr/local/flume/bin/flume-ng agent \
> --conf /usr/local/flume_job/conf/ \
> --name a2 \
> --conf-file /usr/local/flume/flume_job/flume-flume-hdfs.conf
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including Hive libraries found via (/usr/local/hive) for Hive access

[root@master ~]# /usr/local/flume/bin/flume-ng agent \
> --conf /usr/local/flume_job/conf/ \
> --name a1 \
> --conf-file /usr/local/flume/flume_job/flume-file-flume.conf
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.6.5/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local//hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local//hbase/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Including Hive libraries found via (/usr/local/hive) for Hive access


# Check HDFS
[root@master flume]# hadoop fs -ls /flume2
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-08-21 15:44 /flume2/20190821
[root@master flume]# hadoop fs -ls /flume2/*
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-08-21 15:44 /flume2/20190821/15
[root@master flume]# hadoop fs -ls /flume2/20190821/15
Found 1 items
-rw-r--r--   3 root supergroup       3504 2019-08-21 15:44 /flume2/20190821/15/flume2-.1566373444836.tmp
[root@master flume]# hadoop fs -text /flume2/20190821/15/flume2-.1566373444836.tmp
157.255.17.17 - - [21/Aug/2019:15:26:51 +0800] "GET /pictures/index/p7.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0"
157.255.17.17 - - [21/Aug/2019:15:26:51 +0800] "GET /pictures/index/p7.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0"
157.255.17.17 - - [21/Aug/2019:15:31:50 +0800] "GET /pictures/index/p8.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0"
157.255.17.17 - - [21/Aug/2019:15:31:50 +0800] "GET /pictures/index/p8.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0"
180.163.220.68 - - [21/Aug/2019:15:35:10 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
180.163.220.68 - - [21/Aug/2019:15:35:10 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
180.163.220.68 - - [21/Aug/2019:15:35:12 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
180.163.220.68 - - [21/Aug/2019:15:35:12 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
42.236.10.106 - - [21/Aug/2019:15:35:29 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
42.236.10.106 - - [21/Aug/2019:15:35:29 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"

# Check the local output directory
[root@master flume]# cd /tmp/datas/flume/
[root@master flume]# ls
1566373411495-1  1566373411495-2  1566373411495-3
[root@master flume]# ll
total 4
-rw-r--r-- 1 root root    0 Aug 21 15:43 1566373411495-1
-rw-r--r-- 1 root root 3504 Aug 21 15:44 1566373411495-2
-rw-r--r-- 1 root root    0 Aug 21 15:44 1566373411495-3
[root@master flume]# cat  1566373411495-2 
157.255.17.17 - - [21/Aug/2019:15:26:51 +0800] "GET /pictures/index/p7.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0"
157.255.17.17 - - [21/Aug/2019:15:26:51 +0800] "GET /pictures/index/p7.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0"
157.255.17.17 - - [21/Aug/2019:15:31:50 +0800] "GET /pictures/index/p8.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0"
157.255.17.17 - - [21/Aug/2019:15:31:50 +0800] "GET /pictures/index/p8.png HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0"
180.163.220.68 - - [21/Aug/2019:15:35:10 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
180.163.220.68 - - [21/Aug/2019:15:35:10 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
180.163.220.68 - - [21/Aug/2019:15:35:12 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
180.163.220.68 - - [21/Aug/2019:15:35:12 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
42.236.10.106 - - [21/Aug/2019:15:35:29 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
42.236.10.106 - - [21/Aug/2019:15:35:29 +0800] "GET /hadoop/hadoop/hadoop.hive_install HTTP/1.1" 401 590 "http://www.baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"