Fluent Bit: Extracting Key Fields from nginx Logs

Published: 2025-09-03

#Author: 程宏斌

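Preface

At present, every nginx log line that Fluent Bit collects ends up inside a single key, log. Individual fields cannot be searched, which makes the logs harder to read. The goal is for the nginx logs collected by fluent-bit to be searchable by the key fields they contain. From a record like the one below, fields such as remote_addr, request, status, and uri need to be extracted.
Example value of the log field: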

{"@timestamp":"2024-06-27T03:23:04+00:00","request_time":"0.008","upstream_response_time":"0.008","upstream_connect_time":"0.000","upstream_header_time":"0.007","remote_addr":"10.172.62.14","request":"GET /sso/v1/users/info HTTP/1.1","method":"GET","uri":"/sso/v1/users/info","request_uri":"/sso/v1/users/info","scheme":"https","protocol":"HTTP/1.1","host":"sso.tg.unicom.local:443","status":"200","args":"","body_bytes_sent":"289","http_referer":"","http_user_agent":"TiangongCCS/0.1.7","http_x_forwarded_for":"","http_cookie":"accessToken=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJpc1Jvb3QiOiIwIiwiYWNjb3VudE5hbWUiOiJ3dWh0ODEiLCJyZWZyZXNoVGltZSI6MTgwLCJpc3MiOiJjdXNvZnR3YXJlIiwibW9iaWxlIjoiMTU1NjI5ODQzMjEiLCJ1c2VyTmFtZSI6Inl1ZWh5MTEiLCJhY2Nlc3NUb2tlbiI6ImJhNDdiYTc2MjFlOTRmZGE4ZjU5M2I4MDJlMjNhNzZkIiwidXNlcklEIjoiMTI1MjUzMTY0NjIwIiwidHRVc2VySWQiOiI1MGZjZGIwZmIwNDU0NTRhOTc3NjZiYWQ0NTRhYjkwNCIsImFjY291bnRJRCI6IjczMDg5NTQwNjM5MiIsImlzRW5hYmxlQ29uc29sZSI6IjEiLCJpZCI6IjUwZmNkYjBmYjA0NTQ1NGE5Nzc2NmJhZDQ1NGFiOTA0IiwiZXhwIjoxNzE5NDgwMTU1LCJlbWFpbCI6Inl1ZWh5MTFAY2hpbmF1bmljb20uY24iLCJpc0VuYWJsZVByb2dyYW0iOiIxIn0.P27DWWxl_9AuNGMBPypB754bMdxVwVPrxLXr_DYA0HRU5N3SyR6Ft7Im1jj3CMQ9AB0uWmL3DvG8ybn1tQkFvg;","userid":"","loginname":"","usertype":"","OU":"","deptname":"","dept_id":"","dept_name":""}

Configuration files

Add a new nginx parser file

[PARSER]
    Name nginx_reg
    Format regex
    Regex ^.*"remote_addr":"(?<remote_addr>[^"]+)".*"method":"(?<method>[^"]+)".*"uri":"(?<uri>[^"]+)".*"status":"(?<status>[^"]+)".*$
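
To see what this parser captures, the same pattern can be tried outside Fluent Bit. The sketch below is only an illustration in Python, not part of the Fluent Bit setup: Python's re module spells named groups as (?P<name>...) where the parser above uses (?<name>...), and the sample line is a shortened, made-up record in the same shape as the log shown earlier. The helper name extract_fields is hypothetical.

```python
import re

# Same pattern as the nginx_reg parser, rewritten with Python's (?P<name>...) syntax.
NGINX_REG = re.compile(
    r'^.*"remote_addr":"(?P<remote_addr>[^"]+)"'
    r'.*"method":"(?P<method>[^"]+)"'
    r'.*"uri":"(?P<uri>[^"]+)"'
    r'.*"status":"(?P<status>[^"]+)".*$'
)

# Shortened sample record (assumption: same key order as the real log line).
sample = (
    '{"request_time":"0.008","remote_addr":"192.168.1.1",'
    '"request":"GET /info HTTP/1.1","method":"GET","uri":"/info",'
    '"request_uri":"/info","host":"localhost:443","status":"200","args":""}'
)

def extract_fields(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = NGINX_REG.match(line)
    return match.groupdict() if match else None

fields = extract_fields(sample)
print(fields)
```

Because the groups are matched in a fixed order and each value must be non-empty, a line whose keys appear in a different order, or whose status is empty, would not match at all.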

Add a filter to the main fluent-bit configuration file

[FILTER]
    Name parser
    Match *
    Key_Name log
    Parser nginx_reg
    Reserve_Data true
    Preserve_Key true
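
With Reserve_Data true and Preserve_Key true, the record keeps its other original keys and the original log key, and the captured fields are added next to them. The Python sketch below is a rough mental model of that result for this specific combination only, not Fluent Bit's actual code; merge_parsed and the sample record are hypothetical, and other combinations of the two options are deliberately not modelled.

```python
def merge_parsed(record, parsed):
    """Model the output record for Reserve_Data true + Preserve_Key true."""
    merged = dict(record)   # every original key is kept, including "log"
    merged.update(parsed)   # captured fields become top-level keys
    return merged

# Hypothetical record as the tail input might emit it (shortened).
record = {
    "log": '{"remote_addr":"192.168.1.1","method":"GET","uri":"/info","status":"200"}',
    "pod_log_path": "/var/log/pods/logtest/output_json.log",
}
# Fields the nginx_reg parser would capture from record["log"].
parsed = {"remote_addr": "192.168.1.1", "method": "GET", "uri": "/info", "status": "200"}

merged = merge_parsed(record, parsed)
print(sorted(merged))
```

The practical consequence is that each record carries the full log string as well as the extracted fields, so the output is larger than the input.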

Data generation script

#!/bin/bash

# nginxlog: build ./output.json by repeating one sample nginx log line.
nginxlog(){
  local json_str='{"request_time":"0.008","upstream_response_time":"0.008","upstream_connect_time":"0.000","upstream_header_time":"0.007","remote_addr":"192.168.1.1","request":"GET /sso/v1/users/info HTTP/1.1","method":"GET","uri":"/info","request_uri":"/info","scheme":"https","protocol":"HTTP/1.1","host":"localhost:443","status":"200","args":"","body_bytes_sent":"289","http_referer":"","http_user_agent":"TiangongCCS/0.1.7","http_x_forwarded_for":"","http_cookie":"accessToken=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJpg;","userid":"","loginname":"","usertype":"","OU":"","deptname":"","dept_id":"","dept_name":""}' 
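  target_size=31457280          # 30 MiB
  filename="./output.json"
  : > "$filename"               # create the file, or truncate it if it exists
  while true; do
    echo "$json_str" >> "$filename"
    current_size=$(stat -c%s "$filename")
    if [ "$current_size" -ge "$target_size" ]; then
        break
    fi
  done
}

# main: append the generated file to the tailed log once per second.
main(){
  while true; do
     cat ./output.json >> /var/log/pods/logtest/output_json.log
     sleep 1
  done
}

# Run the function named by the first argument: nginxlog or main.
"$@"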
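
The nginxlog function calls stat after every appended line, which is simple but slow for a file of this size. As an aside, the same file can be produced by computing the number of lines up front. The Python sketch below is an alternative under assumed names (build_sample_file, the demo line, and a small demo target size); it is not the script used for the measurements in this article.

```python
import os
import tempfile

def build_sample_file(path, line, target_size):
    """Write `line` repeatedly until the file holds at least target_size bytes."""
    data = (line + "\n").encode("utf-8")
    # Smallest line count whose total size reaches target_size (at least one line).
    repeats = max(1, -(-target_size // len(data)))
    with open(path, "wb") as fh:
        for _ in range(repeats):
            fh.write(data)
    return repeats

# Small demo values (assumptions); the script above targets 31457280 bytes.
demo_line = '{"remote_addr":"192.168.1.1","method":"GET","uri":"/info","status":"200"}'
demo_path = os.path.join(tempfile.mkdtemp(), "output.json")
written = build_sample_file(demo_path, demo_line, target_size=4096)
print(written, os.path.getsize(demo_path))
```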

Verification and testing

Verifying extraction correctness

[Screenshot of the extraction result; the image was not preserved]

Performance test with nginx log field extraction

1. Prepare the test configuration file

[SERVICE]
    Flush 1
    Parsers_File parsers.conf
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_PORT    3302
[INPUT]
    Name         tail
    Tag          regex-fluent
    DB           ./db/regex-fluent.db
    Read_from_Head true
    Path  /var/log/pods/logtest/*.log
    Path_Key  pod_log_path
[FILTER]
    Name modify
    Match *
    Add paas_log_belong         user
    Add paas_log_type           middleware
    Add paas_collection_type    userfile
    Add paas_account_id         123456789
    Add paas_region_id          lftst
    Add paas_product_id         ccc
    Add paas_instance_name      test10
    Add paas_host_ip            127.0.0.1
    Add paas_manager_ip         127.0.0.1
    Add pod_namespace           default
    Add pod_name                test-0
    Add pod_container_name      test
[FILTER]
    Name parser
    Match *
    Key_Name log
    Parser nginx_reg
    Reserve_Data true
    Preserve_Key true
[OUTPUT]
    Name file
    Match *
    Path /vdata/logtest

2. Record the startup commands

touch /var/log/pods/logtest/output_json.log
./bin/fluent-bit-3.0.2 -c etc/10.1fluent-3.0.2.conf &> ./logs/nginx_log &
timeout 2400 bash create_nginx_log.sh main &
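
The article does not say how the input and output rates were measured. Since the configuration enables the built-in HTTP server on port 3302, one possible approach is to poll its /api/v1/metrics endpoint and take the difference of the byte counters. The Python sketch below assumes that endpoint's JSON layout (input plugins reporting bytes, output plugins reporting proc_bytes) and the plugin instance names tail.0 and file.0; the two snapshots are invented purely to show the arithmetic.

```python
import json
import urllib.request

def fetch_metrics(url="http://127.0.0.1:3302/api/v1/metrics"):
    """Fetch one metrics snapshot from Fluent Bit's HTTP server."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)

def rates_mib_per_s(prev, curr, seconds, input_name="tail.0", output_name="file.0"):
    """Return (input, output) throughput in MiB/s between two snapshots."""
    mib = 1024 * 1024
    in_bytes = curr["input"][input_name]["bytes"] - prev["input"][input_name]["bytes"]
    out_bytes = (curr["output"][output_name]["proc_bytes"]
                 - prev["output"][output_name]["proc_bytes"])
    return in_bytes / mib / seconds, out_bytes / mib / seconds

# Invented snapshots taken 10 seconds apart (not measured data).
prev = {"input": {"tail.0": {"bytes": 0}}, "output": {"file.0": {"proc_bytes": 0}}}
curr = {"input": {"tail.0": {"bytes": 52428800}},
        "output": {"file.0": {"proc_bytes": 78643200}}}
rates = rates_mib_per_s(prev, curr, seconds=10)
print(rates)
```

Against a running instance, two calls to fetch_metrics() separated by a known interval would supply prev and curr.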

Performance test without log field extraction

1. Prepare the test configuration

[SERVICE]
    Flush 1
    Parsers_File parsers.conf
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_PORT    3302
[INPUT]
    Name         tail
    Tag          regex-fluent
    DB           ./db/regex-fluent.db
    Read_from_Head true
    Path  /var/log/pods/logtest/*.log
    Path_Key  pod_log_path
[FILTER]
    Name modify
    Match *
    Add paas_log_belong         user
    Add paas_log_type           middleware
    Add paas_collection_type    userfile
    Add paas_account_id         123456789
    Add paas_region_id          lftst
    Add paas_product_id         ccc
    Add paas_instance_name      test10
    Add paas_host_ip            127.0.0.1
    Add paas_manager_ip         127.0.0.1
    Add pod_namespace           default
    Add pod_name                test-0
    Add pod_container_name      test
[OUTPUT]
    Name file
    Match *
    Path /vdata/logtest

2. Startup command

./bin/fluent-bit-3.0.2 -c etc/10.2fluent-3.0.2.conf &> ./logs/nginx_log

Result comparison

Summary of Fluent Bit results in the nginx log extraction scenario

Flb version   Plugin order       input (M/s)   output (M/s)   Notes
3.0.2         modify -> parser   14            20
3.0.2         modify             33            46
3.0.2         parser -> modify   13.2          19.5

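Conclusion

How the percentage drop is calculated:
((33 - 14) / 33) * 100% ≈ 57.6%

With the parser added, input throughput falls to 14 M per second, compared with 33 M per second when only modify is configured, a drop of roughly 57.6%.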
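
The same formula can be applied to every row of the results table. The short Python sketch below does this for both plugin orders and for both the input and output columns; the numbers are copied from the table, and drop_percent is a hypothetical helper name.

```python
def drop_percent(baseline, measured):
    """Relative drop of `measured` against `baseline`, in percent."""
    return (baseline - measured) / baseline * 100

# (input, output) rates in M per second, copied from the results table.
baseline = (33, 46)  # modify only
runs = {
    "modify -> parser": (14, 20),
    "parser -> modify": (13.2, 19.5),
}

for order, (inp, out) in runs.items():
    print(f"{order}: input -{drop_percent(baseline[0], inp):.1f}%, "
          f"output -{drop_percent(baseline[1], out):.1f}%")
```

The two plugin orders give similar drops, so in these runs the cost comes mainly from the parser itself and not from where it sits relative to modify.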

