Elasticsearch Basics

Elasticsearch is a search engine with a JSON REST API, built on Lucene and written in Java. A description of all the advantages of this engine is available on the official website. From here on we will refer to Elasticsearch as ES.

Engines like this are used for complex searches over a document database. Examples include search that takes the morphology of the language into account, or search by geo coordinates.

In this article I will cover the basics of ES using the example of indexing blog posts. I will show how to filter, sort, and search documents.

To stay independent of the operating system, I will make all requests to ES with curl. There is also a plugin for Google Chrome called Sense.

The text contains links to the documentation and other sources. At the end there are links for quick access to the documentation. Definitions of unfamiliar terms can be found in the glossary.

Installing ES

To do this, we first need Java. The developers recommend installing a Java version newer than Java 8 update 20 or Java 7 update 55.

The ES distribution is available on the developer's site. After unpacking the archive, you need to run bin/elasticsearch. Packages for apt and yum are also available, and there is an official image for Docker. More about installation can be found in the documentation.

After installing and starting ES, let's check that it works:

# for convenience, store the address in a variable
#export ES_URL=$(docker-machine ip dev):9200
export ES_URL=localhost:9200

curl -X GET $ES_URL

We will get something like this:

{
  "name" : "Heimdall",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.2.1",
    "build_hash" : "d045fc29d1932bce18b2e65ab8b297fbf6cd41a1",
    "build_timestamp" : "2016-03-09T09:38:54Z",
    "build_snapshot" : false,
    "lucene_version" : "5.4.1"
  },
  "tagline" : "You Know, for Search"
}

Indexing

Let's add a post to ES:

# Add a document with id 1 of type post to the blog index.
# ?pretty tells ES to make the output human-readable.

curl -XPUT "$ES_URL/blog/post/1?pretty" -d'
{
  "title": "Веселые котята",
  "content": "<p>Смешная история про котят<p>",
  "tags": [
    "котята",
    "смешная история"
  ],
  "published_at": "2014-09-12T20:44:42+00:00"
}'

Server response:

{
  "_index" : "blog",
  "_type" : "post",
  "_id" : "1",
  "_version" : 1,
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "created" : true
}

ES automatically created the index blog and the type post. We can draw a rough analogy: an index is a database, and a type is a table in that database. Each type has its own schema, called a mapping, just like a relational table. The mapping is generated automatically when a document is indexed:

# get the mapping of all types in the blog index
curl -XGET "$ES_URL/blog/_mapping?pretty"

In the server response below, I have added the values of the indexed document's fields as comments:

{
  "blog" : {
    "mappings" : {
      "post" : {
        "properties" : {
          /* "content": "<p>Смешная история про котят<p>", */
          "content" : {
            "type" : "string"
          },
          /* "published_at": "2014-09-12T20:44:42+00:00" */
          "published_at" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
          },
          /* "tags": ["котята", "смешная история"] */
          "tags" : {
            "type" : "string"
          },
          /* "title": "Веселые котята" */
          "title" : {
            "type" : "string"
          }
        }
      }
    }
  }
}

It is worth noting that ES does not distinguish between a single value and an array of values. For example, the title field simply contains a title, while the tags field contains an array of strings, yet both are represented identically in the mapping.
We will talk more about mapping later.

Queries

Retrieving a document by its id:

# retrieve the document with id 1 of type post from the blog index
curl -XGET "$ES_URL/blog/post/1?pretty"
{
  "_index" : "blog",
  "_type" : "post",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "title" : "Веселые котята",
    "content" : "<p>Смешная история про котят<p>",
    "tags" : [ "котята", "смешная история" ],
    "published_at" : "2014-09-12T20:44:42+00:00"
  }
}

New keys have appeared in the response: _version and _source. In general, all keys that start with _ are considered system keys.

The _version key shows the document version. It is needed for the optimistic locking mechanism to work. For example, suppose we want to change a document that has version 1. We send the modified document and state that this is an edit of the document with version 1. If someone else also edited the document with version 1 and submitted their changes before us, ES will not accept our changes, because it now stores the document with version 2.
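A minimal sketch of such a conditional update, assuming the blog index from above. The helper name update_if_version is hypothetical; it relies on the version query parameter of the index request, so the write should be rejected with a version conflict when the stored _version differs from the one we pass.

```shell
# Hypothetical helper (not part of ES): overwrite post <id> only if the
# stored document is still at <expected_version>.
# Usage: update_if_version <id> <expected_version> <json_body>
update_if_version() {
  # The version query parameter asks ES to refuse the write when the
  # stored _version is not equal to $2.
  curl -s -XPUT "${ES_URL:-localhost:9200}/blog/post/$1?version=$2&pretty" -d "$3"
}
```

For example, update_if_version 1 1 '{"title": "Веселые котята"}' should succeed once and move the document to version 2; repeating the same call with version 1 should then be refused.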

The _source key contains the document we indexed. ES does not use this value for search operations, because indexes are used for searching. To save space, ES stores the source document compressed. If we only need the id and not the whole source document, we can disable source storage.
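As a rough illustration, source storage is switched off per type in the mapping when the index is created. The helper name and the index name blog_ids below are made up for this sketch; the _source setting with "enabled": false is the mapping option assumed here.

```shell
# Hypothetical helper: create index <name> whose "post" type does not
# store the original document (_source).
# Usage: create_index_without_source <name>
create_index_without_source() {
  curl -s -XPUT "${ES_URL:-localhost:9200}/$1?pretty" -d '
{
  "mappings": {
    "post": {
      "_source": { "enabled": false }
    }
  }
}'
}
```

Calling create_index_without_source blog_ids would create such an index. Searches against it can still return ids, but the documents come back without their original body.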

If we don't need the additional information, we can get just the contents of _source:

curl -XGET "$ES_URL/blog/post/1/_source?pretty"
{
  "title" : "Веселые котята",
  "content" : "<p>Смешная история про котят<p>",
  "tags" : [ "котята", "смешная история" ],
  "published_at" : "2014-09-12T20:44:42+00:00"
}

You can also select only certain fields:

# retrieve only the title field
curl -XGET "$ES_URL/blog/post/1?_source=title&pretty"
{
  "_index" : "blog",
  "_type" : "post",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "title" : "Веселые котята"
  }
}

Let's index a few more posts and run more complex queries.

curl -XPUT "$ES_URL/blog/post/2" -d'
{
  "title": "Веселые щенки",
  "content": "<p>Смешная история про щенков<p>",
  "tags": [
    "щенки",
    "смешная история"
  ],
  "published_at": "2014-08-12T20:44:42+00:00"
}'
curl -XPUT "$ES_URL/blog/post/3" -d'
{
  "title": "Как у меня появился котенок",
  "content": "<p>Душераздирающая история про бедного котенка с улицы<p>",
  "tags": [
    "котята"
  ],
  "published_at": "2014-07-21T20:44:42+00:00"
}'

Sorting

# find the latest post by publication date and retrieve the title and published_at fields
curl -XGET "$ES_URL/blog/post/_search?pretty" -d'
{
  "size": 1,
  "_source": ["title", "published_at"],
  "sort": [{"published_at": "desc"}]
}'
{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : null,
    "hits" : [ {
      "_index" : "blog",
      "_type" : "post",
      "_id" : "1",
      "_score" : null,
      "_source" : {
        "title" : "Веселые котята",
        "published_at" : "2014-09-12T20:44:42+00:00"
      },
      "sort" : [ 1410554682000 ]
    } ]
  }
}

We selected the latest post. size limits the number of documents returned. total shows the total number of documents that match the query. sort in the output contains an array of integers that the sorting is performed on; that is, the date was converted to an integer. More about sorting can be found in the documentation.

Filters and queries

Since version 2, ES no longer distinguishes between filters and queries; instead, the concept of contexts was introduced.
Query context differs from filter context in that a query generates a _score and is not cached. I will explain what _score is later.
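To make the two contexts concrete, here is a sketch that combines them in a single bool query: the match clause under must runs in query context and contributes to _score, while the range clause under filter only includes or excludes documents. The helper name search_since is hypothetical, and its arguments are pasted into the JSON as-is, so they must not contain double quotes or backslashes.

```shell
# Hypothetical helper: full-text match on content (query context, scored),
# restricted to posts published on or after <date> (filter context, unscored).
# Usage: search_since <word> <date>
search_since() {
  curl -s -XGET "${ES_URL:-localhost:9200}/blog/post/_search?pretty" -d '
{
  "query": {
    "bool": {
      "must":   { "match": { "content": "'"$1"'" } },
      "filter": { "range": { "published_at": { "gte": "'"$2"'" } } }
    }
  }
}'
}
```

For example, search_since история 2014-09-01 asks for posts that mention the word and were published in September 2014 or later.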

Filtering by date

We use the range query in filter context:

# get the posts published on September 1 or later
curl -XGET "$ES_URL/blog/post/_search?pretty" -d'
{
  "filter": {
    "range": {
      "published_at": { "gte": "2014-09-01" }
    }
  }
}'

Filtering by tags

We use the term query to find the ids of documents that contain a given word:

# find all documents whose tags field contains the element 'котята'
curl -XGET "$ES_URL/blog/post/_search?pretty" -d'
{
  "_source": [
    "title",
    "tags"
  ],
  "filter": {
    "term": {
      "tags": "котята"
    }
  }
}'
{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "blog",
      "_type" : "post",
      "_id" : "1",
      "_score" : 1.0,
      "_source" : {
        "title" : "Веселые котята",
        "tags" : [ "котята", "смешная история" ]
      }
    }, {
      "_index" : "blog",
      "_type" : "post",
      "_id" : "3",
      "_score" : 1.0,
      "_source" : {
        "title" : "Как у меня появился котенок",
        "tags" : [ "котята" ]
      }
    } ]
  }
}

Full-text search

Our three documents contain the following in the content field:

  • <p>Смешная история про котят<p>
  • <p>Смешная история про щенков<p>
  • <p>Душераздирающая история про бедного котенка с улицы<p>

We use the match query to find the ids of documents that contain a given word:

# "_source": false means that the _source of the found documents should not be retrieved
curl -XGET "$ES_URL/blog/post/_search?pretty" -d'
{
  "_source": false,
  "query": {
    "match": {
      "content": "история"
    }
  }
}'
{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.11506981,
    "hits" : [ {
      "_index" : "blog",
      "_type" : "post",
      "_id" : "2",
      "_score" : 0.11506981
    }, {
      "_index" : "blog",
      "_type" : "post",
      "_id" : "1",
      "_score" : 0.11506981
    }, {
      "_index" : "blog",
      "_type" : "post",
      "_id" : "3",
      "_score" : 0.095891505
    } ]
  }
}

However, if we search for "истории" in the content field, we will find nothing, because the index contains only the original words, not their stems. To get good search results, you need to configure an analyzer.
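One way to check this yourself is to run the same match query with different word forms. The helper below is only a sketch: the name match_content is made up, and the word is pasted into the JSON as-is, so it must not contain double quotes or backslashes.

```shell
# Hypothetical helper: return only the ids of posts in <index> whose
# content field matches <word>.
# Usage: match_content <index> <word>
match_content() {
  curl -s -XGET "${ES_URL:-localhost:9200}/$1/post/_search?pretty" -d '
{
  "_source": false,
  "query": { "match": { "content": "'"$2"'" } }
}'
}
```

Comparing match_content blog история with match_content blog истории shows the difference: the first form is the exact word stored in the index, while the second is a different word form that the blog index does not know about.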

The _score field shows relevance. If the query is executed in filter context, the _score value will always be 1, which means a full match with the filter.

Analyzers

Analyzers are needed to convert source text into a set of tokens.
An analyzer consists of one Tokenizer and several optional TokenFilters. The Tokenizer may be preceded by several CharFilters. A Tokenizer splits the source string into tokens, for example on spaces and punctuation. A TokenFilter can modify tokens, remove them, or add new ones: for example, keep only the stem of a word, remove prepositions, or add synonyms. A CharFilter changes the source string as a whole, for example by stripping HTML tags.
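The order of these stages can be sketched with ordinary shell tools. This is only a toy model of the idea and not how ES implements analysis: the function names and the stop word list are invented for the example, there is no stemming, and it handles ASCII text only.

```shell
# Toy analysis chain in plain shell. NOT Elasticsearch code: each function
# only mimics the role of one analyzer component. ASCII text only.

# CharFilter: rewrite the raw string (here: strip HTML tags).
strip_html() { sed 's/<[^>]*>//g'; }

# Tokenizer: split on anything that is not a letter or digit,
# emitting one token per line and dropping empty lines.
tokenize() { tr -cs '[:alnum:]' '\n' | sed '/^$/d'; }

# TokenFilter: lowercase every token.
lowercase() { tr '[:upper:]' '[:lower:]'; }

# TokenFilter: drop a small, made-up list of stop words.
drop_stopwords() { grep -vxE 'a|an|the|about|of' || true; }

# Analyzer: CharFilter -> Tokenizer -> TokenFilters, in that order.
analyze() {
  printf '%s\n' "$1" | strip_html | tokenize | lowercase | drop_stopwords
}

analyze '<p>A Funny story about the Kittens</p>'
```

The last line prints the tokens funny, story, and kittens, one per line: the tags are gone, the words are lowercased, and the stop words are dropped.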

ES has several standard analyzers, for example the russian analyzer.

Let's use the API and see how the standard and russian analyzers transform the string "Веселые истории про котят":

# use the standard analyzer
# non-ASCII characters must be percent-encoded
curl -XGET "$ES_URL/_analyze?pretty&analyzer=standard&text=%D0%92%D0%B5%D1%81%D0%B5%D0%BB%D1%8B%D0%B5%20%D0%B8%D1%81%D1%82%D0%BE%D1%80%D0%B8%D0%B8%20%D0%BF%D1%80%D0%BE%20%D0%BA%D0%BE%D1%82%D1%8F%D1%82"
{
  "tokens" : [ {
    "token" : "веселые",
    "start_offset" : 0,
    "end_offset" : 7,
    "type" : "<ALPHANUM>",
    "position" : 0
  }, {
    "token" : "истории",
    "start_offset" : 8,
    "end_offset" : 15,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "про",
    "start_offset" : 16,
    "end_offset" : 19,
    "type" : "<ALPHANUM>",
    "position" : 2
  }, {
    "token" : "котят",
    "start_offset" : 20,
    "end_offset" : 25,
    "type" : "<ALPHANUM>",
    "position" : 3
  } ]
}
# use the russian analyzer
curl -XGET "$ES_URL/_analyze?pretty&analyzer=russian&text=%D0%92%D0%B5%D1%81%D0%B5%D0%BB%D1%8B%D0%B5%20%D0%B8%D1%81%D1%82%D0%BE%D1%80%D0%B8%D0%B8%20%D0%BF%D1%80%D0%BE%20%D0%BA%D0%BE%D1%82%D1%8F%D1%82"
{
  "tokens" : [ {
    "token" : "весел",
    "start_offset" : 0,
    "end_offset" : 7,
    "type" : "<ALPHANUM>",
    "position" : 0
  }, {
    "token" : "истор",
    "start_offset" : 8,
    "end_offset" : 15,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "кот",
    "start_offset" : 20,
    "end_offset" : 25,
    "type" : "<ALPHANUM>",
    "position" : 3
  } ]
}

The standard analyzer split the string on spaces and converted everything to lowercase. The russian analyzer removed the insignificant words, converted the rest to lowercase, and kept only the stems of the words.

Let's see which Tokenizer, TokenFilters, and CharFilters the russian analyzer uses:

{
  "filter": {
    "russian_stop": {
      "type":       "stop",
      "stopwords":  "_russian_"
    },
    "russian_keywords": {
      "type":       "keyword_marker",
      "keywords":   []
    },
    "russian_stemmer": {
      "type":       "stemmer",
      "language":   "russian"
    }
  },
  "analyzer": {
    "russian": {
      "tokenizer":  "standard",
      /* TokenFilters */
      "filter": [
        "lowercase",
        "russian_stop",
        "russian_keywords",
        "russian_stemmer"
      ]
      /* no CharFilters */
    }
  }
}

Let's describe our own analyzer based on russian that also strips HTML tags. We will call it default, because an analyzer with this name is used by default.

{
  "filter": {
    "ru_stop": {
      "type":       "stop",
      "stopwords":  "_russian_"
    },
    "ru_stemmer": {
      "type":       "stemmer",
      "language":   "russian"
    }
  },
  "analyzer": {
    "default": {
      /* add stripping of HTML tags */
      "char_filter": ["html_strip"],
      "tokenizer":  "standard",
      "filter": [
        "lowercase",
        "ru_stop",
        "ru_stemmer"
      ]
    }
  }
}

First, all HTML tags are removed from the source string. Then the standard tokenizer splits it into tokens, the resulting tokens are converted to lowercase, insignificant words are removed, and the remaining tokens are reduced to their stems.

Creating an index

Above we described the default analyzer. It will be applied to all string fields. Our post contains an array of tags, so the tags would also be processed by the analyzer. Since we look up posts by exact match on a tag, we need to disable analysis for the tags field.

Let's create the index blog2 with the analyzer and a mapping in which analysis of the tags field is disabled:

curl -XPOST "$ES_URL/blog2" -d'
{
  "settings": {
    "analysis": {
      "filter": {
        "ru_stop": {
          "type": "stop",
          "stopwords": "_russian_"
        },
        "ru_stemmer": {
          "type": "stemmer",
          "language": "russian"
        }
      },
      "analyzer": {
        "default": {
          "char_filter": [
            "html_strip"
          ],
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "ru_stop",
            "ru_stemmer"
          ]
        }
      }
    }
  },
  "mappings": {
    "post": {
      "properties": {
        "content": {
          "type": "string"
        },
        "published_at": {
          "type": "date"
        },
        "tags": {
          "type": "string",
          "index": "not_analyzed"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}'
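Once the index exists, its default analyzer can be checked through the _analyze endpoint of that index. The helper below is a sketch with a made-up name, analyze_with. It uses curl's -G and --data-urlencode options so that the text does not have to be percent-encoded by hand.

```shell
# Hypothetical helper: show the tokens that analyzer <analyzer> of index
# <index> produces for <text>. -G sends the data as URL query parameters
# and --data-urlencode percent-encodes each value.
# Usage: analyze_with <index> <analyzer> <text>
analyze_with() {
  curl -s -G "${ES_URL:-localhost:9200}/$1/_analyze" \
    --data-urlencode "pretty=true" \
    --data-urlencode "analyzer=$2" \
    --data-urlencode "text=$3"
}
```

For example, analyze_with blog2 default '<p>Смешная история про котят<p>' should list lowercase stems only, without the HTML tags and without the stop word про.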

Let's add the same 3 posts to this index (blog2). I will omit this step, since it is the same as adding documents to the blog index.
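For readers who want to follow along, one possible shortcut is the bulk API, which takes pairs of lines: an action line followed by the document itself. This is a sketch rather than the way the posts were originally added, and the helper name bulk_index_posts is made up.

```shell
# Hypothetical helper: index the three sample posts into index <name>
# with a single bulk request. Each document is one action line plus one
# source line, and the body must end with a newline.
# Usage: bulk_index_posts <name>
bulk_index_posts() {
  curl -s -XPOST "${ES_URL:-localhost:9200}/$1/post/_bulk?pretty" --data-binary @- <<'EOF'
{"index":{"_id":"1"}}
{"title":"Веселые котята","content":"<p>Смешная история про котят<p>","tags":["котята","смешная история"],"published_at":"2014-09-12T20:44:42+00:00"}
{"index":{"_id":"2"}}
{"title":"Веселые щенки","content":"<p>Смешная история про щенков<p>","tags":["щенки","смешная история"],"published_at":"2014-08-12T20:44:42+00:00"}
{"index":{"_id":"3"}}
{"title":"Как у меня появился котенок","content":"<p>Душераздирающая история про бедного котенка с улицы<p>","tags":["котята"],"published_at":"2014-07-21T20:44:42+00:00"}
EOF
}
```

Calling bulk_index_posts blog2 sends all three posts at once. --data-binary is used instead of -d because -d would strip the newlines that the bulk format depends on.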

Full-text search with expression support

Let's look at another type of query:

# find documents that contain the word 'истории'
# query -> simple_query_string -> query contains the search query
# the title field has priority 3
# the tags field has priority 2
# the content field has priority 1
# the priority is used when ranking the results
curl -XPOST "$ES_URL/blog2/post/_search?pretty" -d'
{
  "query": {
    "simple_query_string": {
      "query": "истории",
      "fields": [
        "title^3",
        "tags^2",
        "content"
      ]
    }
  }
}'

Since we are using an analyzer with Russian stemming, this query will return all the documents, even though they only contain the word 'история'.

The query may contain special characters, for example:

"\"fried eggs\" +(eggplant | potato) -frittata"

Query syntax:

+ signifies AND operation
| signifies OR operation
- negates a single token
" wraps a number of tokens to signify a phrase for searching
* at the end of a term signifies a prefix query
( and ) signify precedence
~N after a word signifies edit distance (fuzziness)
~N after a phrase signifies slop amount

# find documents without the word 'щенки'
curl -XPOST "$ES_URL/blog2/post/_search?pretty" -d'
{
  "query": {
    "simple_query_string": {
      "query": "-щенки",
      "fields": [
        "title^3",
        "tags^2",
        "content"
      ]
    }
  }
}'

# we get 2 posts about kittens

Links

PS

If you are interested in tutorial articles like this one, have ideas for new articles, or have proposals for collaboration, I would be glad to hear from you by private message or by email at m.kuzmin+habr@darkleaf.ru.

Source: www.hab.com