
Flume Failover and Load Balancing: Principles and Examples

Source: Internet  Author: Anonymous  Date: 2015-10-23 10:15

Failover Sink Processor

The Failover Sink Processor maintains a prioritized list of sinks and provides failover: as long as one sink is available, events are delivered to it. The relevant configuration is as follows (required properties must be set):

Property Name                  Default  Description
processor.priority.<sinkName>  -        Priority value. <sinkName> must be one of the sinks defined in sinks. A sink with a higher priority value is activated earlier; the larger the value, the higher the priority.

Note: when configuring multiple sinks, do not give them the same priority value. If two sinks share a priority, only one of them takes effect; moreover, on failover the processor will not fail over between same-priority sinks, so even while a same-priority sink is still alive, Flume reports "All sinks failed to process".

Example:

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
a1.sinkgroups.g1.processor.maxpenalty = 10000
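The priority rule above can be illustrated with a small Python sketch. This is a toy model, not Flume's actual implementation: the class and method names are invented, and it uses a fixed cool-down in place of Flume's growing back-off capped at maxpenalty.

```python
class FailoverProcessor:
    """Toy model of a failover sink processor: always try the
    highest-priority sink that is not cooling down after a failure."""

    def __init__(self, priorities, maxpenalty_s=10.0):
        # priorities: {sink_name: priority}; larger value = tried first
        self.priorities = priorities
        self.maxpenalty_s = maxpenalty_s
        self.penalized_until = {name: 0.0 for name in priorities}

    def pick(self, now):
        # Candidates are sinks whose penalty window has expired
        live = [s for s in self.priorities if self.penalized_until[s] <= now]
        if not live:
            raise RuntimeError("All sinks failed to process")
        # Note: with equal priorities, max() picks one arbitrarily --
        # mirroring the caution above about same-priority sinks.
        return max(live, key=lambda s: self.priorities[s])

    def mark_failed(self, sink, now):
        # Simplified: a fixed cool-down instead of Flume's growing back-off
        self.penalized_until[sink] = now + self.maxpenalty_s


proc = FailoverProcessor({"k1": 5, "k2": 10})
print(proc.pick(0.0))        # k2 wins: priority 10 > 5
proc.mark_failed("k2", 0.0)  # k2 fails, cools down for 10 s
print(proc.pick(1.0))        # falls back to k1
print(proc.pick(11.0))       # k2's penalty expired, k2 is preferred again
```

This matches the configuration above: k2 (priority 10) is activated before k1 (priority 5), and k1 only sees traffic while k2 is penalized.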

Load balancing Sink Processor

The Load balancing Sink Processor provides the ability to balance load over multiple sinks. It maintains a list of active sinks across which the load is distributed, and supports a round_robin or random selection mechanism; the default is round_robin. A custom selection mechanism can also be supplied by subclassing AbstractSinkSelector.
When invoked, the selector picks the next sink according to the configured mechanism and calls it. If the chosen sink fails to deliver the event, the processor picks the next available sink via the selection mechanism, and so on.

Property Name       Default      Description
processor.type      default      Must be set to load_balance.
processor.selector  round_robin  Selection mechanism: round_robin, random, or a custom class that inherits from AbstractSinkSelector.

Example:

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = random
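The select-then-retry behavior described above can be sketched in Python. Again this is illustrative only (invented names, backoff not modeled), not Flume code:

```python
import random


class LoadBalancingProcessor:
    """Toy model of a load-balancing sink processor: pick the next sink
    by round_robin or random; if it fails, try the remaining sinks."""

    def __init__(self, sinks, selector="round_robin", seed=None):
        self.sinks = list(sinks)
        self.selector = selector
        self._idx = 0                    # round-robin cursor
        self._rng = random.Random(seed)

    def _order(self):
        # Order in which sinks will be attempted for one event
        if self.selector == "round_robin":
            order = self.sinks[self._idx:] + self.sinks[:self._idx]
            self._idx = (self._idx + 1) % len(self.sinks)
        else:  # "random"
            order = self.sinks[:]
            self._rng.shuffle(order)
        return order

    def process(self, event, send):
        # send(sink, event) -> True on success. On failure, fall through
        # to the next sink chosen by the selection mechanism.
        for sink in self._order():
            if send(sink, event):
                return sink
        raise RuntimeError("All sinks failed to process")


lb = LoadBalancingProcessor(["k1", "k2"])
print([lb.process(i, lambda s, e: True) for i in range(4)])
# round_robin alternates: ['k1', 'k2', 'k1', 'k2']

lb2 = LoadBalancingProcessor(["k1", "k2"])
print(lb2.process("ev", lambda s, e: s != "k1"))
# k1 fails, so the event falls through to k2
```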


Failover and Load Balancing Example

Test environment:

10.0.1.76(Client)

10.0.1.68 (Failover and Load balancing)

10.0.1.70

10.0.1.77

10.0.1.85

10.0.1.86

10.0.1.87

10.0.1.76 acts as the client: an exec source tails the nginx log, and the data is sent to 10.0.1.68, the node configured with Failover and Load balancing. 10.0.1.68 then forwards the data to the 10.0.1.70, 77, 85, 86 and 87 nodes, which finally write it to local disk.

Configuration on 10.0.1.76:

a1.channels = c1
a1.sources = r1
a1.sinks = k1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sources.r1.type = exec
a1.sources.r1.command = tail -n 0 -F /home/nginx/logs/access.log
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = 10.0.1.68
a1.sinks.k1.port = 41415

This tails the log produced by nginx and sends it via avro to 10.0.1.68.


Configuration on 10.0.1.68 (Configuration A):

a1.channels = c1
a1.sources = r1
a1.sinks = k70 k77 k85 k86 k87
a1.sinkgroups = g1 g2 g3
a1.sinkgroups.g1.sinks = k70 k85
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g2.sinks = k70 k86
a1.sinkgroups.g2.processor.type = failover
a1.sinkgroups.g2.processor.priority.k70 = 20
a1.sinkgroups.g2.processor.priority.k86 = 10
a1.sinkgroups.g2.processor.maxpenalty = 10000
a1.sinkgroups.g3.sinks = k85 k87 k77
a1.sinkgroups.g3.processor.type = failover
a1.sinkgroups.g3.processor.priority.k85 = 20
a1.sinkgroups.g3.processor.priority.k87 = 10
a1.sinkgroups.g3.processor.priority.k77 = 5
a1.sinkgroups.g3.processor.maxpenalty = 10000
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41415
a1.sinks.k87.channel = c1
a1.sinks.k87.type = avro
a1.sinks.k87.hostname = 10.0.1.87
a1.sinks.k87.port = 41414
a1.sinks.k86.channel = c1
a1.sinks.k86.type = avro
a1.sinks.k86.hostname = 10.0.1.86
a1.sinks.k86.port = 41414
a1.sinks.k85.channel = c1
a1.sinks.k85.type = avro
a1.sinks.k85.hostname = 10.0.1.85
a1.sinks.k85.port = 41414
a1.sinks.k77.channel = c1
a1.sinks.k77.type = avro
a1.sinks.k77.hostname = 10.0.1.77
a1.sinks.k77.port = 41414
a1.sinks.k70.channel = c1
a1.sinks.k70.type = avro
a1.sinks.k70.hostname = 10.0.1.70
a1.sinks.k70.port = 41414

10.0.1.70 and 10.0.1.85 are load-balanced (g1), using round_robin. 10.0.1.70 and 10.0.1.86 form a failover pair (g2), and 10.0.1.85, 10.0.1.87 and 10.0.1.77 form another failover group (g3).
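Note that k70 and k85 each appear in two sink groups. As an idealized Python sketch (invented names, not Flume code), one can model the group processors taking turns draining the shared channel, with each group applying its own rule. In this model a failover group only ever uses its highest-priority live sink, which is consistent with k77 receiving nothing in the tests below; the measured counts for k86 and k87, however, show that the real behavior deviates from this model, which is exactly the open question at the end of the article.

```python
from collections import Counter
from queue import Empty, Queue
import itertools

# A channel holding 12 events
channel = Queue()
for i in range(12):
    channel.put(f"event-{i}")

def g1(ev, rr=itertools.cycle(["k70", "k85"])):  # load_balance, round_robin
    return next(rr)

def g2(ev):  # failover: k70 (priority 20) wins while it is up
    return "k70"

def g3(ev):  # failover: k85 (priority 20) wins while it is up
    return "k85"

counts = Counter()
groups = itertools.cycle([g1, g2, g3])  # interleaved polling (simplified)
while True:
    try:
        ev = channel.get_nowait()
    except Empty:
        break
    counts[next(groups)(ev)] += 1

print(counts)
# In this idealized model only k70 and k85 receive events (6 each);
# k77, k86 and k87 see nothing until the group leaders go down.
```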


Configuration on 10.0.1.70, 77, 85, 86 and 87:

a1.channels = c1
a1.sources = r1
a1.sinks = k1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41414
a1.sinks.k1.channel = c1
a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory = /data/load/
a1.sinks.k1.sink.rollInterval = 0

Events received over avro are written to local files.


For each test we send 20,000 requests to nginx and then check how much data each of the five servers 10.0.1.70, 77, 85, 86 and 87 received. We ran several groups of tests:

Note: a * in the table means the Flume process on that server was shut down.

Test 1:

Send 20,000 requests to nginx and count the lines of data each node received:

Server      10.0.1.70  10.0.1.77  10.0.1.85  10.0.1.86  10.0.1.87  Total
Data lines  3400       0          3459       6778       6363       20000

In fact, whether we send 20,000 requests or 1,000,000, 10.0.1.77 never receives any data.

Test 2:

Server      10.0.1.70  10.0.1.77  10.0.1.85  10.0.1.86  10.0.1.87(*)  Total
Data lines  6619       6300       6840       13878      6363          40000

(The counts are cumulative, including the data already written in Test 1; 10.0.1.87's Flume process was shut down for this round.)


Question 1: The failover nodes 86 and 87 can receive data, so why does 77 receive none?
