Fluentd: Why it is important to configure the output buffer

These days it is hard to imagine a Kubernetes project without the ELK stack, which stores the logs of both the applications and the system components of the cluster. In our practice we use the EFK stack, with Fluentd in place of Logstash.

Fluentd is a modern, general-purpose log collector that is steadily gaining popularity. It has joined the Cloud Native Computing Foundation, which is why its development is focused on use together with Kubernetes.

Using Fluentd instead of Logstash does not change the overall nature of the stack, but Fluentd has its own specific nuances that stem from its versatility.

For example, when we started using EFK in a busy project with a high logging rate, we found that some messages were shown several times in Kibana. In this article we explain why this happens and how to fix it.

The problem of duplicated documents

In our projects, Fluentd is deployed as a DaemonSet (one instance is started automatically on each node of the Kubernetes cluster) and tails the containers' stdout logs in /var/log/containers. After collection and processing, the logs are sent as JSON documents to ElasticSearch, which runs either as a cluster or standalone, depending on the scale of the project and its performance and fault-tolerance requirements. Kibana serves as the graphical interface.

When using Fluentd with a buffered output plugin, we ran into a situation where some documents in ElasticSearch had exactly the same content and differed only in their identifier. An Nginx log makes it easy to confirm that these are repeats of the same message. In the log file, this message exists as a single copy:

127.0.0.1 192.168.0.1 - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777 "-" "Opera/12.0" -

However, ElasticSearch holds several documents that contain this message:

{
  "_index": "test-custom-prod-example-2020.01.02",
  "_type": "_doc",
  "_id": "HgGl_nIBR8C-2_33RlQV",
  "_version": 1,
  "_score": 0,
  "_source": {
    "service": "test-custom-prod-example",
    "container_name": "nginx",
    "namespace": "test-prod",
    "@timestamp": "2020-01-14T05:29:47.599052886+00:00",
    "log": "127.0.0.1 192.168.0.1 - [28/Feb/2013:12:00:00 +0900] \"GET / HTTP/1.1\" 200 777 \"-\" \"Opera/12.0\" -",
    "tag": "custom-log"
  }
}

{
  "_index": "test-custom-prod-example-2020.01.02",
  "_type": "_doc",
  "_id": "IgGm_nIBR8C-2_33e2ST",
  "_version": 1,
  "_score": 0,
  "_source": {
    "service": "test-custom-prod-example",
    "container_name": "nginx",
    "namespace": "test-prod",
    "@timestamp": "2020-01-14T05:29:47.599052886+00:00",
    "log": "127.0.0.1 192.168.0.1 - [28/Feb/2013:12:00:00 +0900] \"GET / HTTP/1.1\" 200 777 \"-\" \"Opera/12.0\" -",
    "tag": "custom-log"
  }
}

Moreover, a message can be repeated more than twice.

While this problem is occurring, the Fluentd logs contain a large number of warnings like this one:

2020-01-16 01:46:46 +0000 [warn]: [test-prod] failed to flush the buffer. retry_time=4 next_retry_seconds=2020-01-16 01:46:53 +0000 chunk="59c37fc3fb320608692c352802b973ce" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>"elasticsearch", :port=>9200, :scheme=>"http", :user=>"elastic", :password=>"obfuscated"}): read timeout reached"

These warnings appear when ElasticSearch cannot return a response to a request within the time set by the request_timeout parameter, so the buffer chunk that was sent cannot be cleared. Fluentd then tries to send the chunk to ElasticSearch again, and after some number of attempts the operation succeeds:

2020-01-16 01:47:05 +0000 [warn]: [test-prod] retry succeeded. chunk_id="59c37fc3fb320608692c352802b973ce" 
2020-01-16 01:47:05 +0000 [warn]: [test-prod] retry succeeded. chunk_id="59c37fad241ab300518b936e27200747" 
2020-01-16 01:47:05 +0000 [warn]: [test-dev] retry succeeded. chunk_id="59c37fc11f7ab707ca5de72a88321cc2" 
2020-01-16 01:47:05 +0000 [warn]: [test-dev] retry succeeded. chunk_id="59c37fb5adb70c06e649d8c108318c9b" 
2020-01-16 01:47:15 +0000 [warn]: [kube-system] retry succeeded. chunk_id="59c37f63a9046e6dff7e9987729be66f"

However, ElasticSearch treats each of the resent buffer chunks as unique and assigns new _id values to its documents during indexing. This is how the copies of messages appear.

In Kibana, the duplicates look like this:

[Screenshot: duplicated messages in Kibana]

Solving the problem

There are several ways to solve this problem. One is the mechanism built into the fluent-plugin-elasticsearch plugin that generates a unique hash for each document. With this mechanism, ElasticSearch recognizes repeats when they are sent and does not create duplicate documents. However, this approach treats the symptom and does not eliminate the underlying error caused by the insufficient timeout, so we decided not to use it.
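To illustrate why a content-derived identifier suppresses duplicates, here is a minimal Python sketch. It is not the plugin's implementation: the dict standing in for an index and the helpers content_id and index_chunk are hypothetical names introduced only for this example.

```python
import hashlib
import json
import uuid


def content_id(doc: dict) -> str:
    """Derive a stable ID from the document content (hypothetical helper)."""
    payload = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def index_chunk(index: dict, chunk: list, use_hash_id: bool) -> None:
    """Write a chunk into a dict that stands in for an ElasticSearch index."""
    for doc in chunk:
        # An auto-generated ID is new on every attempt; a hashed ID repeats.
        doc_id = content_id(doc) if use_hash_id else uuid.uuid4().hex
        index[doc_id] = doc


chunk = [{"log": "GET / HTTP/1.1 200 777", "tag": "custom-log"}]

auto_ids, hashed_ids = {}, {}
for _attempt in range(3):  # first send plus two retries after timeouts
    index_chunk(auto_ids, chunk, use_hash_id=False)
    index_chunk(hashed_ids, chunk, use_hash_id=True)

print(len(auto_ids), len(hashed_ids))  # prints "3 1"
```

With auto-generated identifiers every retry adds another copy, while the hashed identifier makes the resend overwrite the same document. The timeouts that trigger the retries are still there in both cases.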

We use a buffering plugin on the Fluentd output to avoid losing logs during short-term network problems or spikes in logging intensity. If for some reason ElasticSearch cannot write a document to the index immediately, the document is queued and stored on disk. So, in our case, to eliminate the root cause of the error described above, we need to set correct values for the buffering parameters: the Fluentd output buffer must be large enough and, at the same time, must be flushed within the allotted time.

Note that the values of the parameters discussed below are specific to each case of using buffering in output plugins, because they depend on many factors: how intensively the services write to the log, disk performance, network load, and network bandwidth. To arrive at buffer settings that suit a particular case without being excessive, and to avoid a long blind search, you can use the debugging information that Fluentd writes to its own log while running and find the right values fairly quickly.

At the time the problem was recorded, the configuration looked like this:

<buffer>
  @type file
  path /var/log/fluentd-buffers/kubernetes.test.buffer
  flush_mode interval
  retry_type exponential_backoff
  flush_thread_count 2
  flush_interval 5s
  retry_forever
  retry_max_interval 30
  chunk_limit_size 8M
  queue_limit_length 8
  overflow_action block
</buffer>

While solving the problem, we tuned the values of the following parameters by hand:

  • chunk_limit_size: the size of the chunks into which the messages in the buffer are divided.
  • flush_interval: the interval after which the buffer is flushed.
  • queue_limit_length: the maximum number of chunks in the queue.
  • request_timeout: the time for which the connection between Fluentd and ElasticSearch is established.

The total buffer size is the product of queue_limit_length and chunk_limit_size, which can be read as "the maximum number of chunks in the queue, each of the given size." If the buffer is too small, the following warning appears in the logs:

2020-01-21 10:22:57 +0000 [warn]: [test-prod] failed to write data into buffer by buffer overflow action=:block

It means that the buffer is not being flushed within the allotted time and that data arriving at the full buffer is blocked, which leads to the loss of part of the logs.

There are two ways to enlarge the buffer: increase the size of each chunk in the queue, or increase the number of chunks the queue can hold.

If you set chunk_limit_size to more than 32 megabytes, ElasticSearch will not accept the chunk because the incoming packet is too large. So if you need a larger buffer, it is better to increase the maximum queue length, queue_limit_length.
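The arithmetic above can be sketched in Python. The helper names (parse_size, buffer_capacity, queue_length_for) are hypothetical, the 256M target is an arbitrary illustration, and the sketch assumes that size suffixes such as 8M are binary multiples.

```python
# Fluentd-style size suffixes; binary multiples are an assumption of this sketch.
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}

# Upper bound for a single chunk, taken from the 32 MB limit mentioned above.
MAX_CHUNK_BYTES = 32 * 1024**2


def parse_size(value: str) -> int:
    """Convert a size such as '8M' into bytes."""
    text = value.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def buffer_capacity(chunk_limit_size: str, queue_limit_length: int) -> int:
    """Total buffer size in bytes: chunk size times queue length."""
    return parse_size(chunk_limit_size) * queue_limit_length


def queue_length_for(target_bytes: int, chunk_limit_size: str) -> int:
    """Queue length needed to reach a target capacity with a fixed chunk size."""
    chunk = parse_size(chunk_limit_size)
    if chunk > MAX_CHUNK_BYTES:
        raise ValueError("chunk_limit_size exceeds the 32 MB limit")
    return -(-target_bytes // chunk)  # ceiling division


print(buffer_capacity("8M", 8) // 1024**2)         # 64 (MiB) for the config above
print(queue_length_for(parse_size("256M"), "8M"))  # 32 chunks for a 256 MiB buffer
```

Keeping the chunk size fixed and growing only the queue length follows the recommendation in the text: the buffer gets larger, while each request to ElasticSearch stays the same size.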

Once the buffer has stopped overflowing and only the timeout message remains, you can start increasing request_timeout. However, if you set it to more than 20 seconds, warnings like the following start to appear in the Fluentd logs:

2020-01-21 09:55:33 +0000 [warn]: [test-dev] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=20.85753920301795 slow_flush_log_threshold=20.0 plugin_id="postgresql-dev" 

This message does not affect the operation of the system in any way. It means that flushing the buffer took longer than the slow_flush_log_threshold parameter allows. It is debugging information, and we use it to choose the value of request_timeout.

The general selection algorithm is as follows:

  1. Set request_timeout to a value that is certain to be larger than necessary (hundreds of seconds). During tuning, the main criterion for a correct value of this parameter is that the timeout warnings disappear.
  2. Wait for messages about exceeding the slow_flush_log_threshold threshold. The elapsed_time field of the warning shows how long the buffer flush actually took.
  3. Set request_timeout to a value greater than the maximum elapsed_time seen during the observation period. We calculate request_timeout as elapsed_time + 50%.
  4. To remove the warnings about long buffer flushes from the log, you can raise slow_flush_log_threshold. We calculate this value as elapsed_time + 25%.
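Steps 2 to 4 can be sketched as a small Python helper that reads elapsed_time from the slow-flush warnings and applies the two margins. The function name suggest_timeouts is hypothetical, and rounding up to whole seconds is a choice made for this sketch, not something the procedure prescribes.

```python
import math
import re

# Matches the elapsed_time field of a slow-flush warning.
ELAPSED_RE = re.compile(r"elapsed_time=(\d+(?:\.\d+)?)")


def suggest_timeouts(log_lines):
    """Return suggested values based on the largest observed elapsed_time."""
    elapsed = []
    for line in log_lines:
        match = ELAPSED_RE.search(line)
        if match:
            elapsed.append(float(match.group(1)))
    if not elapsed:
        return None  # no slow-flush warnings seen yet; keep observing
    peak = max(elapsed)
    return {
        "request_timeout": math.ceil(peak * 1.5),             # elapsed_time + 50%
        "slow_flush_log_threshold": math.ceil(peak * 1.25),   # elapsed_time + 25%
    }


sample = [
    '2020-01-21 09:55:33 +0000 [warn]: [test-dev] buffer flush took longer time '
    'than slow_flush_log_threshold: elapsed_time=20.85753920301795 '
    'slow_flush_log_threshold=20.0 plugin_id="postgresql-dev"',
]
print(suggest_timeouts(sample))
# {'request_timeout': 32, 'slow_flush_log_threshold': 27}
```

In practice the input would be the warnings collected over the whole observation period, so that the peak reflects the busiest moment rather than a single flush.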

The final values of these parameters, as noted earlier, are specific to each case. By following the algorithm above, we reliably eliminate the error that causes repeated messages.

The table below shows the number of errors per day that lead to message duplication, before and after the parameter values described above were tuned.

Message (before/after)        node-1    node-2    node-3    node-4
failed to flush the buffer    1749/2    694/2     47/0      1121/2
retry succeeded               410/2     205/1     24/0      241/2

Keep in mind that the resulting settings become outdated as the project grows and the volume of logs increases with it. The first sign of an insufficient timeout is the return of messages about long buffer flushes in the Fluentd log, that is, flushes that exceed slow_flush_log_threshold. At that point there is still a small margin before request_timeout itself is exceeded, so it is important to react to these messages promptly and repeat the tuning process described above.

Conclusion

Fine-tuning the Fluentd output buffer is one of the main stages of setting up the EFK stack. It determines how stable the stack is and whether documents are placed in the indices correctly. By following the tuning algorithm described here, you can be confident that all logs are written to the ElasticSearch index in the correct order, without repeats or losses.

Source: www.habr.com