Adana bayanai na dogon lokaci a cikin Elasticsearch

Adana bayanai na dogon lokaci a cikin Elasticsearch

Sunana Igor Sidorenko, Ni jagora ne na fasaha a cikin Ζ™ungiyar admins waΙ—anda ke kula da duk kayan aikin Domclick.

Ina so in raba gwaninta na kafa ma'ajiyar bayanai a cikin Elasticsearch. Za mu dubi abin da saituna a kan nodes ke da alhakin rarraba shards, yadda ILM ke aiki da aiki.

Wadanda ke aiki tare da katako, wata hanya ko wata, suna fuskantar matsalar ajiyar dogon lokaci don bincike na gaba. A cikin Elasticsearch, wannan gaskiya ne musamman, saboda komai ya kasance abin takaici tare da aikin curator. Shafin 6.6 ya gabatar da ayyukan ILM. Ya Ζ™unshi matakai 4:

  • Hot - Ana sabunta fihirisar da kuma tambaya.
  • Dumi - Ba a sabunta fihirisar ba, amma har yanzu ana tambaya.
  • Cold - Ba a sabunta fihirisar ba kuma ba kasafai ake tambaya ba. Dole ne har yanzu bayanin ya kasance abin nema, amma tambayoyin na iya zama a hankali.
  • Share - Ba a buΖ™atar fihirisar kuma ana iya share shi cikin aminci.

An ba

  • Elasticsearch Data Hot: 24 processor, 128 GB memory, 1,8 TB SSD RAID 10 (8 nodes).
  • Elasticsearch Data Warm: 24 processor, 64 GB memory, 8 TB NetApp SSD Policy (4 nodes).
  • Elasticsearch Data Cold: 8 sarrafawa, 32 GB Ζ™waΖ™walwar ajiya, 128 TB HDD RAID 10 (4 nodes).

Manufar

Wadannan saituna na mutum ne, duk ya dogara ne akan wurin da ke kan nodes, adadin fihirisa, rajistan ayyukan, da dai sauransu. Muna da 2-3 TB na bayanai kowace rana.

  • Kwanaki 5 - Lokaci mai zafi (8 main / 1 replica).
  • Kwanaki 20 - Lokacin dumi (raguwa-index 4 main / 1 kwafi).
  • Kwanaki 90 - Lokacin sanyi (daskare-index 4 main / 1 kwafi).
  • Kwanaki 120 - Share lokaci.

Saita Elasticsearch

Don rarraba shards a fadin nodes, kuna buΖ™atar siga guda Ι—aya kawai:

  • hot- nodes:
    ~]# cat /etc/elasticsearch/elasticsearch.yml | grep attr
    # Add custom attributes to the node:
    node.attr.box_type: hot
  • dumi- nodes:
    ~]# cat /etc/elasticsearch/elasticsearch.yml | grep attr
    # Add custom attributes to the node:
    node.attr.box_type: warm
  • Cold- nodes:
    ~]# cat /etc/elasticsearch/elasticsearch.yml | grep attr
    # Add custom attributes to the node:
    node.attr.box_type: cold

Saita Logstash

Ta yaya duka yake aiki kuma ta yaya muka aiwatar da wannan fasalin? Bari mu fara da shigar da rajistan ayyukan Elasticsearch. Akwai hanyoyi guda biyu:

  1. Logstash yana debo rajistan ayyukan daga Kafka. Zai iya Ι—auka mai tsabta ko juyawa a gefen ku.
  2. Wani abu da kanta ke rubutawa zuwa Elasticsearch, misali, sabar APM.

Yi la'akari da misali na sarrafa fihirisar ta hanyar Logstash. Yana haifar da fihirisa kuma ya shafi shi tsarin index kuma masu dacewa ILM.

k8s-ingress.conf

input {
    kafka {
        bootstrap_servers => "node01, node02, node03"
        topics => ["ingress-k8s"]
        decorate_events => false
        codec => "json"
    }
}

filter {
    ruby {
        path => "/etc/logstash/conf.d/k8s-normalize.rb"
    }
    if [log] =~ "[warn]" or [log] =~ "[error]" or [log] =~ "[notice]" or [log] =~ "[alert]" {
        grok {
            match => { "log" => "%{DATA:[nginx][error][time]} [%{DATA:[nginx][error][level]}] %{NUMBER:[nginx][error][pid]}#%{NUMBER:[nginx][error][tid]}: *%{NUMBER:[nginx][error][connection_id]} %{DATA:[nginx][error][message]}, client: %{IPORHOST:[nginx][error][remote_ip]}, server: %{DATA:[nginx][error][server]}, request: "%{WORD:[nginx][error][method]} %{DATA:[nginx][error][url]} HTTP/%{NUMBER:[nginx][error][http_version]}", (?:upstream: "%{DATA:[nginx][error][upstream][proto]}://%{DATA:[nginx][error][upstream][host]}:%{DATA:[nginx][error][upstream][port]}/%{DATA:[nginx][error][upstream][url]}", )?host: "%{DATA:[nginx][error][host]}"(?:, referrer: "%{DATA:[nginx][error][referrer]}")?" }
            remove_field => "log"
        }
    }
    else {
        grok {
            match => { "log" => "%{IPORHOST:[nginx][access][host]} - [%{IPORHOST:[nginx][access][remote_ip]}] - %{DATA:[nginx][access][remote_user]} [%{HTTPDATE:[nginx][access][time]}] "%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][bytes_sent]} "%{DATA:[nginx][access][referrer]}" "%{DATA:[nginx][access][agent]}" %{NUMBER:[nginx][access][request_lenght]} %{NUMBER:[nginx][access][request_time]} [%{DATA:[nginx][access][upstream][name]}] (?:-|%{IPORHOST:[nginx][access][upstream][addr]}:%{NUMBER:[nginx][access][upstream][port]}) (?:-|%{NUMBER:[nginx][access][upstream][response_lenght]}) %{DATA:[nginx][access][upstream][response_time]} %{DATA:[nginx][access][upstream][status]} %{DATA:[nginx][access][request_id]}" }
            remove_field => "log"
        }
    }
}
output {
    elasticsearch {
        id => "k8s-ingress"
        hosts => ["node01", "node02", "node03", "node04", "node05", "node06", "node07", "node08"]
        manage_template => true # Π²ΠΊΠ»ΡŽΡ‡Π°Π΅ΠΌ ΡƒΠΏΡ€Π°Π²Π»Π΅Π½ΠΈΠ΅ шаблонами
        template_name => "k8s-ingress" # имя примСняСмого шаблона
        ilm_enabled => true # Π²ΠΊΠ»ΡŽΡ‡Π°Π΅ΠΌ ΡƒΠΏΡ€Π°Π²Π»Π΅Π½ΠΈΠ΅ ILM
        ilm_rollover_alias => "k8s-ingress" # alias для записи Π² индСксы, Π΄ΠΎΠ»ΠΆΠ΅Π½ Π±Ρ‹Ρ‚ΡŒ ΡƒΠ½ΠΈΠΊΠ°Π»ΡŒΠ½Ρ‹ΠΌ
        ilm_pattern => "{now/d}-000001" # шаблон для создания индСксов, ΠΌΠΎΠΆΠ΅Ρ‚ Π±Ρ‹Ρ‚ΡŒ ΠΊΠ°ΠΊ "{now/d}-000001" Ρ‚Π°ΠΊ ΠΈ "000001"
        ilm_policy => "k8s-ingress" # ΠΏΠΎΠ»ΠΈΡ‚ΠΈΠΊΠ° прикрСпляСмая ΠΊ индСксу
        index => "k8s-ingress-%{+YYYY.MM.dd}" # Π½Π°Π·Π²Π°Π½ΠΈΠ΅ создаваСмого индСкса, ΠΌΠΎΠΆΠ΅Ρ‚ ΡΠΎΠ΄Π΅Ρ€ΠΆΠ°Ρ‚ΡŒ %{+YYYY.MM.dd}, зависит ΠΎΡ‚ ilm_pattern
    }
}

Saitin Kibana

Akwai tsarin tushe wanda ya shafi duk sabbin fihirisa. Yana saita rarraba fihirisar zafi, adadin shards, kwafi, da sauransu. An Ζ™addara nauyin samfurin ta zaΙ“i order. Samfura masu nauyi mafi girma suna Ζ™etare sigogin samfuri na yanzu ko Ζ™ara sababbi.

Adana bayanai na dogon lokaci a cikin Elasticsearch
Adana bayanai na dogon lokaci a cikin Elasticsearch

SAMU _template/default

{
  "default" : {
    "order" : -1, # вСс шаблона
    "version" : 1,
    "index_patterns" : [
      "*" # примСняСм ΠΊΠΎ всСм индСксам
    ],
    "settings" : {
      "index" : {
        "codec" : "best_compression", # ΡƒΡ€ΠΎΠ²Π΅Π½ΡŒ сТатия
        "routing" : {
          "allocation" : {
            "require" : {
              "box_type" : "hot" # распрСдСляСм Ρ‚ΠΎΠ»ΡŒΠΊΠΎ ΠΏΠΎ горячим Π½ΠΎΠ΄Π°ΠΌ
            },
            "total_shards_per_node" : "8" # максимальноС количСство ΡˆΠ°Ρ€Π΄ΠΎΠ² Π½Π° Π½ΠΎΠ΄Ρƒ ΠΎΡ‚ ΠΎΠ΄Π½ΠΎΠ³ΠΎ индСкса
          }
        },
        "refresh_interval" : "5s", # ΠΈΠ½Ρ‚Π΅Ρ€Π²Π°Π» обновлСния индСкса
        "number_of_shards" : "8", # количСство ΡˆΠ°Ρ€Π΄ΠΎΠ²
        "auto_expand_replicas" : "0-1", # количСство Ρ€Π΅ΠΏΠ»ΠΈΠΊ Π½Π° Π½ΠΎΠ΄Ρƒ ΠΎΡ‚ ΠΎΠ΄Π½ΠΎΠ³ΠΎ индСкса
        "number_of_replicas" : "1" # количСство Ρ€Π΅ΠΏΠ»ΠΈΠΊ
      }
    },
    "mappings" : {
      "_meta" : { },
      "_source" : { },
      "properties" : { }
    },
    "aliases" : { }
  }
}

Sa'an nan kuma yi amfani da taswirar zuwa fihirisa k8s-ingress-* ta amfani da samfuri tare da nauyi mafi girma.

Adana bayanai na dogon lokaci a cikin Elasticsearch
Adana bayanai na dogon lokaci a cikin Elasticsearch

GET _template/k8s-ingress

{
  "k8s-ingress" : {
    "order" : 100,
    "index_patterns" : [
      "k8s-ingress-*"
    ],
    "settings" : {
      "index" : {
        "lifecycle" : {
          "name" : "k8s-ingress",
          "rollover_alias" : "k8s-ingress"
        },
        "codec" : "best_compression",
        "routing" : {
          "allocation" : {
            "require" : {
              "box_type" : "hot"
            }
          }
        },
        "number_of_shards" : "8",
        "number_of_replicas" : "1"
      }
    },
    "mappings" : {
      "numeric_detection" : false,
      "_meta" : { },
      "_source" : { },
      "dynamic_templates" : [
        {
          "all_fields" : {
            "mapping" : {
              "index" : false,
              "type" : "text"
            },
            "match" : "*"
          }
        }
      ],
      "date_detection" : false,
      "properties" : {
        "kubernetes" : {
          "type" : "object",
          "properties" : {
            "container_name" : {
              "type" : "keyword"
            },
            "container_hash" : {
              "index" : false,
              "type" : "keyword"
            },
            "host" : {
              "type" : "keyword"
            },
            "annotations" : {
              "type" : "object",
              "properties" : {
                "value" : {
                  "index" : false,
                  "type" : "text"
                },
                "key" : {
                  "index" : false,
                  "type" : "keyword"
                }
              }
            },
            "docker_id" : {
              "index" : false,
              "type" : "keyword"
            },
            "pod_id" : {
              "type" : "keyword"
            },
            "labels" : {
              "type" : "object",
              "properties" : {
                "value" : {
                  "type" : "keyword"
                },
                "key" : {
                  "type" : "keyword"
                }
              }
            },
            "namespace_name" : {
              "type" : "keyword"
            },
            "pod_name" : {
              "type" : "keyword"
            }
          }
        },
        "@timestamp" : {
          "type" : "date"
        },
        "nginx" : {
          "type" : "object",
          "properties" : {
            "access" : {
              "type" : "object",
              "properties" : {
                "agent" : {
                  "type" : "text"
                },
                "response_code" : {
                  "type" : "integer"
                },
                "upstream" : {
                  "type" : "object",
                  "properties" : {
                    "port" : {
                      "type" : "keyword"
                    },
                    "name" : {
                      "type" : "keyword"
                    },
                    "response_lenght" : {
                      "type" : "integer"
                    },
                    "response_time" : {
                      "index" : false,
                      "type" : "text"
                    },
                    "addr" : {
                      "type" : "keyword"
                    },
                    "status" : {
                      "index" : false,
                      "type" : "text"
                    }
                  }
                },
                "method" : {
                  "type" : "keyword"
                },
                "http_version" : {
                  "type" : "keyword"
                },
                "bytes_sent" : {
                  "type" : "integer"
                },
                "request_lenght" : {
                  "type" : "integer"
                },
                "url" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword"
                    }
                  }
                },
                "remote_user" : {
                  "type" : "text"
                },
                "referrer" : {
                  "type" : "text"
                },
                "remote_ip" : {
                  "type" : "ip"
                },
                "request_time" : {
                  "format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis||dd/MMM/YYYY:H:m:s Z",
                  "type" : "date"
                },
                "host" : {
                  "type" : "keyword"
                },
                "time" : {
                  "format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis||dd/MMM/YYYY:H:m:s Z",
                  "type" : "date"
                }
              }
            },
            "error" : {
              "type" : "object",
              "properties" : {
                "server" : {
                  "type" : "keyword"
                },
                "upstream" : {
                  "type" : "object",
                  "properties" : {
                    "port" : {
                      "type" : "keyword"
                    },
                    "proto" : {
                      "type" : "keyword"
                    },
                    "host" : {
                      "type" : "keyword"
                    },
                    "url" : {
                      "type" : "text",
                      "fields" : {
                        "keyword" : {
                          "type" : "keyword"
                        }
                      }
                    }
                  }
                },
                "method" : {
                  "type" : "keyword"
                },
                "level" : {
                  "type" : "keyword"
                },
                "http_version" : {
                  "type" : "keyword"
                },
                "pid" : {
                  "index" : false,
                  "type" : "integer"
                },
                "message" : {
                  "type" : "text"
                },
                "tid" : {
                  "index" : false,
                  "type" : "keyword"
                },
                "url" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword"
                    }
                  }
                },
                "referrer" : {
                  "type" : "text"
                },
                "remote_ip" : {
                  "type" : "ip"
                },
                "connection_id" : {
                  "index" : false,
                  "type" : "keyword"
                },
                "host" : {
                  "type" : "keyword"
                },
                "time" : {
                  "format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis||dd/MMM/YYYY:H:m:s Z",
                  "type" : "date"
                }
              }
            }
          }
        },
        "log" : {
          "type" : "text"
        },
        "@version" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "ignore_above" : 256,
              "type" : "keyword"
            }
          }
        },
        "eventtime" : {
          "type" : "float"
        }
      }
    },
    "aliases" : { }
  }
}

Bayan yin amfani da duk samfuran, muna amfani da manufar ILM kuma mu fara sa ido kan rayuwar fihirisar.

Adana bayanai na dogon lokaci a cikin Elasticsearch

Adana bayanai na dogon lokaci a cikin Elasticsearch

Adana bayanai na dogon lokaci a cikin Elasticsearch

SAMU _ilm/policy/k8s-ingress

{
  "k8s-ingress" : {
    "version" : 14,
    "modified_date" : "2020-06-11T10:27:01.448Z",
    "policy" : {
      "phases" : {
        "warm" : { # тСплая Ρ„Π°Π·Π°
          "min_age" : "5d", # срок ΠΆΠΈΠ·Π½ΠΈ индСкса послС Ρ€ΠΎΡ‚Π°Ρ†ΠΈΠΈ Π΄ΠΎ наступлСния Ρ‚Π΅ΠΏΠ»ΠΎΠΉ Ρ„Π°Π·Ρ‹
          "actions" : {
            "allocate" : {
              "include" : { },
              "exclude" : { },
              "require" : {
                "box_type" : "warm" # ΠΊΡƒΠ΄Π° ΠΏΠ΅Ρ€Π΅ΠΌΠ΅Ρ‰Π°Π΅ΠΌ индСкс
              }
            },
            "shrink" : {
              "number_of_shards" : 4 # ΠΎΠ±Ρ€Π΅Π·Π°Π½ΠΈΠ΅ индСксов, Ρ‚.ΠΊ. Ρƒ нас 4 Π½ΠΎΠ΄Ρ‹
            }
          }
        },
        "cold" : { # холодная Ρ„Π°Π·Π°
          "min_age" : "25d", # срок ΠΆΠΈΠ·Π½ΠΈ индСкса послС Ρ€ΠΎΡ‚Π°Ρ†ΠΈΠΈ Π΄ΠΎ наступлСния Ρ…ΠΎΠ»ΠΎΠ΄Π½ΠΎΠΉ Ρ„Π°Π·Ρ‹
          "actions" : {
            "allocate" : {
              "include" : { },
              "exclude" : { },
              "require" : {
                "box_type" : "cold" # ΠΊΡƒΠ΄Π° ΠΏΠ΅Ρ€Π΅ΠΌΠ΅Ρ‰Π°Π΅ΠΌ индСкс
              }
            },
            "freeze" : { } # Π·Π°ΠΌΠΎΡ€Π°ΠΆΠΈΠ²Π°Π΅ΠΌ для ΠΎΠΏΡ‚ΠΈΠΌΠΈΠ·Π°Ρ†ΠΈΠΈ
          }
        },
        "hot" : { # горячая Ρ„Π°Π·Π°
          "min_age" : "0ms",
          "actions" : {
            "rollover" : {
              "max_size" : "50gb", # ΠΌΠ°ΠΊΡΠΈΠΌΠ°Π»ΡŒΠ½Ρ‹ΠΉ Ρ€Π°Π·ΠΌΠ΅Ρ€ индСкса Π΄ΠΎ Ρ€ΠΎΡ‚Π°Ρ†ΠΈΠΈ (Π±ΡƒΠ΄Π΅Ρ‚ Ρ…2, Ρ‚.ΠΊ. Π΅ΡΡ‚ΡŒ 1 Ρ€Π΅ΠΏΠ»ΠΈΠΊΠ°)
              "max_age" : "1d" # ΠΌΠ°ΠΊΡΠΈΠΌΠ°Π»ΡŒΠ½Ρ‹ΠΉ срок ΠΆΠΈΠ·Π½ΠΈ индСкса Π΄ΠΎ Ρ€ΠΎΡ‚Π°Ρ†ΠΈΠΈ
            },
            "set_priority" : {
              "priority" : 100
            }
          }
        },
        "delete" : { # Ρ„Π°Π·Π° удалСния
          "min_age" : "120d", # ΠΌΠ°ΠΊΡΠΈΠΌΠ°Π»ΡŒΠ½Ρ‹ΠΉ срок ΠΆΠΈΠ·Π½ΠΈ послС Ρ€ΠΎΡ‚Π°Ρ†ΠΈΠΈ ΠΏΠ΅Ρ€Π΅Π΄ ΡƒΠ΄Π°Π»Π΅Π½ΠΈΠ΅ΠΌ
          "actions" : {
            "delete" : { }
          }
        }
      }
    }
  }
}

Matsalolin

An sami matsaloli a matakin saiti da gyara matsala.

Lokaci mai zafi

Don daidaitaccen juyawa na fihirisa, kasancewar a Ζ™arshen yana da mahimmanci index_name-date-000026 lambobi masu tsari 000001. Akwai layi a cikin lambar da ke bincika fihirisa ta amfani da magana ta yau da kullun don kasancewar lambobi a Ζ™arshe. In ba haka ba, za a sami kuskure, ba za a yi amfani da manufofi a cikin index ba, kuma zai kasance a cikin lokaci mai zafi.

Lokaci mai dumi

Ji Ζ™yama (cutoff) - rage yawan shards, saboda muna da nodes 4 a cikin yanayin dumi da sanyi. Takaddun ya Ζ™unshi layi masu zuwa:

  • Dole ne a karanta fihirisar kawai.
  • Kwafin kowane shard a cikin fihirisar dole ne ya zauna akan kulli Ι—aya.
  • Matsayin lafiyar tari dole ne ya zama kore.

Don datse fihirisa, Elasticsearch yana matsar da duk shards na farko zuwa kumburi Ι—aya, yana kwafin fihirisar da aka yanke tare da ma'auni masu mahimmanci, sannan yana share tsohuwar. Siga total_shards_per_node dole ne ya zama daidai ko girma fiye da adadin manyan shards don dacewa da kulli Ι—aya. In ba haka ba, za a sami sanarwa kuma shards ba za su matsa zuwa daidaitattun nodes ba.

Adana bayanai na dogon lokaci a cikin Elasticsearch
Adana bayanai na dogon lokaci a cikin Elasticsearch

SAMU / raguwa-k8s-ingress-2020.06.06-000025/_saituna

{
  "shrink-k8s-ingress-2020.06.06-000025" : {
    "settings" : {
      "index" : {
        "refresh_interval" : "5s",
        "auto_expand_replicas" : "0-1",
        "blocks" : {
          "write" : "true"
        },
        "provided_name" : "shrink-k8s-ingress-2020.06.06-000025",
        "creation_date" : "1592225525569",
        "priority" : "100",
        "number_of_replicas" : "1",
        "uuid" : "psF4MiFGQRmi8EstYUQS4w",
        "version" : {
          "created" : "7060299",
          "upgraded" : "7060299"
        },
        "lifecycle" : {
          "name" : "k8s-ingress",
          "rollover_alias" : "k8s-ingress",
          "indexing_complete" : "true"
        },
        "codec" : "best_compression",
        "routing" : {
          "allocation" : {
            "initial_recovery" : {
              "_id" : "_Le0Ww96RZ-o76bEPAWWag"
            },
            "require" : {
              "_id" : null,
              "box_type" : "cold"
            },
            "total_shards_per_node" : "8"
          }
        },
        "number_of_shards" : "4",
        "routing_partition_size" : "1",
        "resize" : {
          "source" : {
            "name" : "k8s-ingress-2020.06.06-000025",
            "uuid" : "gNhYixO6Skqi54lBjg5bpQ"
          }
        }
      }
    }
  }
}

Lokacin sanyi

daskare (daskare) - Muna daskare fihirisar don inganta tambayoyin kan bayanan tarihi.

Binciken da aka yi akan daskararrun fihirisa suna amfani da Ζ™arami, sadaukarwa, search_throttled threadpool don sarrafa adadin bincike na lokaci-lokaci wanda ya buge daskararru akan kowane kumburi. Wannan yana iyakance adadin Ζ™arin Ζ™waΖ™walwar ajiya da ake buΖ™ata don tsarin bayanan wucin gadi wanda ya dace da daskararrun shards, wanda saboda haka yana kare nodes daga yawan amfani da Ζ™waΖ™walwar ajiya.
Fihirisar daskararre ana karantawa-kawai: ba za ku iya fidda su ba.
Ana sa ran gudanar da bincike kan daskararrun fihirisa a hankali. Ba a yi nufin daskararrun fihirisa don babban nauyin bincike ba. Mai yiyuwa ne binciken fihirisar daskararre na iya Ι—aukar daΖ™iΖ™a ko mintuna kafin a kammala, koda kuwa binciken iri Ι—aya da aka kammala a cikin millise seconds lokacin da fihirisar ba a daskare ba.

Sakamakon

Mun koyi yadda ake shirya nodes don yin aiki tare da ILM, saita samfuri don rarraba shards a tsakanin nodes masu zafi, da kuma saita ILM don ma'auni tare da duk matakan rayuwa.

hanyoyi masu amfani

source: www.habr.com