Bool Query

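must - Clauses that must match; they contribute to _score (equivalent to AND).
filter - Clauses that must match, but without computing _score (they contribute nothing to scoring and are used only for filtering).
should - Optional clauses: each one that matches raises _score, and one that does not match has no effect. If the bool query has no must or filter clause, at least one should clause must match (equivalent to OR).
must_not - Clauses that must not match; matching documents are excluded (equivalent to NOT).
boost - Weight applied to the query's score.
minimum_should_match - The minimum number of should clauses that must match.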
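
To make the matching rules above concrete, here is a minimal Python sketch that mimics only the match/no-match decision of a bool query (not its scoring). The function name `bool_matches` and the use of plain Python predicates as clauses are assumptions made for illustration; they are not part of Elasticsearch.

```python
from typing import Callable, Optional, Sequence

Doc = dict
Clause = Callable[[Doc], bool]


def bool_matches(
    doc: Doc,
    must: Sequence[Clause] = (),
    filter_: Sequence[Clause] = (),
    should: Sequence[Clause] = (),
    must_not: Sequence[Clause] = (),
    minimum_should_match: Optional[int] = None,
) -> bool:
    """Hypothetical helper: would a bool query with these clauses match doc?"""
    # must and filter both have to match; they differ only in scoring.
    if not all(clause(doc) for clause in list(must) + list(filter_)):
        return False
    # Any matching must_not clause excludes the document.
    if any(clause(doc) for clause in must_not):
        return False
    if minimum_should_match is None:
        # Default: 1 when there are should clauses but no must/filter, else 0.
        minimum_should_match = 1 if (should and not must and not filter_) else 0
    matched = sum(1 for clause in should if clause(doc))
    return matched >= minimum_should_match
```

With a must clause present, an unmatched should clause does not reject the document; with should alone, at least one of them has to match.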

Example 1:

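user must be kimchy, and the results are filtered to documents whose tag is "tech" (the filter does not affect _score). Documents whose age is in the range 10 to 20 are excluded. A document whose tag contains wow or elasticsearch gets a higher _score, and one that contains both scores higher still. Because minimum_should_match is 1 here, at least one of those two should clauses must match.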

{
    "bool" : {
        "must" : {
            "term" : { "user" : "kimchy" }
        },
        "filter": {
            "term" : { "tag" : "tech" }
        },
        "must_not" : {
            "range" : {
                "age" : { "gte" : 10, "lte" : 20 }
            }
        },
        "should" : [
            {
                "term" : { "tag" : "wow" }
            },
            {
                "term" : { "tag" : "elasticsearch" }
            }
        ],
        "minimum_should_match" : 1,
        "boost" : 1.0
    }
}

Example 2:

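Wrapping a bool query inside filter likewise skips scoring.

Find documents whose title matches how to make millions and that are not tagged spam. Documents tagged starred, or dated 2014 or later, rank higher than the others, and documents that satisfy both rank higher still. The filter keeps only documents whose price is at most 29.99 and whose category is not ebooks; these two conditions do not affect ranking.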

{
    "bool": {
        "must":     { "match": { "title": "how to make millions" }},
        "must_not": { "match": { "tag":   "spam" }},
        "should": [
            { "match": { "tag": "starred" }},
            { "range": { "date": { "gte": "2014-01-01" }}}
        ],
        "filter": {
          "bool": {
              "must": [
                  { "range": { "price": { "lte": 29.99 }}}
              ],
              "must_not": [
                  { "term": { "category": "ebooks" }}
              ]
          }
        }
    }
}

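Example 3: constant_score

constant_score applies the same fixed score to every matching document. It is a more concise replacement for a bool query that contains only a filter.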

{
    "constant_score":   {
        "filter": {
            "term": { "category": "ebooks" }
        }
    }
}
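
As a small illustration of the two forms being compared, the sketch below builds both query bodies as Python dictionaries. The helper names `constant_score` and `bool_filter_only` are hypothetical, chosen only for this example. One difference worth knowing: a bool query with only filter clauses gives every hit a score of 0, while constant_score gives every hit its boost (1.0 unless set).

```python
import json


def constant_score(filter_clause: dict, boost: float = 1.0) -> dict:
    """Wrap a filter clause so every matching document gets the same score."""
    return {"constant_score": {"filter": filter_clause, "boost": boost}}


def bool_filter_only(filter_clause: dict) -> dict:
    """The longer bool form: a single filter clause and nothing else."""
    return {"bool": {"filter": filter_clause}}


category_is_ebooks = {"term": {"category": "ebooks"}}

# Print both bodies so they can be compared side by side.
print(json.dumps(constant_score(category_is_ebooks), indent=4))
print(json.dumps(bool_filter_only(category_is_ebooks), indent=4))
```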

Example 4: boost (weight)

The default boost is 1.

{
    "query": {
        "bool": {
            "must": {
                "match": {
                    "content": {
                        "query":    "full text search",
                        "operator": "and"
                    }
                }
            },
            "should": [
                { "match": {
                    "content": {
                        "query": "Elasticsearch",
                        "boost": 3
                    }
                }},
                { "match": {
                    "content": {
                        "query": "Lucene",
                        "boost": 2
                    }
                }}
            ]
        }
    }
}
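
The effect of boost on ranking can be sketched with simple arithmetic. This is a deliberately simplified model, not how Elasticsearch computes relevance: the raw clause scores below are made-up numbers, and `combined_score` is a hypothetical helper that only shows the idea that a bool query adds up the scores of its matching clauses, each scaled by its boost.

```python
def combined_score(must_score, should_hits):
    """Add the must score to each matching should clause's score times its boost.

    should_hits is a list of (raw_score, boost) pairs for the should clauses
    that matched; clauses that did not match contribute nothing.
    """
    return must_score + sum(raw * boost for raw, boost in should_hits)


# Made-up raw scores: 1.0 for the must clause, 0.5 for each should clause.
mentions_both = combined_score(1.0, [(0.5, 3), (0.5, 2)])
only_elasticsearch = combined_score(1.0, [(0.5, 3)])
only_lucene = combined_score(1.0, [(0.5, 2)])
neither = combined_score(1.0, [])

print(mentions_both, only_elasticsearch, only_lucene, neither)
```

Under these assumed numbers, a document mentioning Elasticsearch outranks one mentioning only Lucene, which in turn outranks one mentioning neither.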

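Example 5: equivalent to match

OR

The following two queries are equivalent (assuming the analyzer for title turns the text into the terms brown and fox):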

{
    "match": { "title": "brown fox"}
}

{
  "bool": {
    "should": [
      { "term": { "title": "brown" }},
      { "term": { "title": "fox"   }}
    ]
  }
}

AND

The following two queries are equivalent:

{
    "match": {
        "title": {
            "query":    "brown fox",
            "operator": "and"
        }
    }
}

{
  "bool": {
    "must": [
      { "term": { "title": "brown" }},
      { "term": { "title": "fox"   }}
    ]
  }
}
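
The rewriting shown in the OR and AND cases above can be sketched as a small function. `match_to_bool` is a hypothetical helper, and its tokenization (lowercase, split on whitespace) is only a stand-in for whatever analyzer the field really uses.

```python
def match_to_bool(field: str, text: str, operator: str = "or") -> dict:
    """Rewrite a simple match query as a bool query of term clauses."""
    # Stand-in for analysis: lowercase and split on whitespace (an assumption;
    # the real analyzer is configured per field).
    tokens = text.lower().split()
    terms = [{"term": {field: token}} for token in tokens]
    # operator "and" requires every term (must); "or" makes them optional (should).
    occur = "must" if operator == "and" else "should"
    return {"bool": {occur: terms}}


print(match_to_bool("title", "brown fox"))
print(match_to_bool("title", "brown fox", operator="and"))
```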

minimum_should_match

The following two queries are equivalent (75% of 3 terms is 2.25, which is rounded down to 2):

{
    "match": {
        "title": {
            "query":                "quick brown fox",
            "minimum_should_match": "75%"
        }
    }
}

{
  "bool": {
    "should": [
      { "term": { "title": "brown" }},
      { "term": { "title": "fox"   }},
      { "term": { "title": "quick" }}
    ],
    "minimum_should_match": 2
  }
}
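
The step from "75%" to 2 can be written out as follows. This sketch covers only the two simplest forms of minimum_should_match, a positive integer and a positive percentage; the function name `required_should_matches` is an assumption made for this example, and the other forms (negative values, combinations) are deliberately left out.

```python
from typing import Union


def required_should_matches(spec: Union[int, str], clause_count: int) -> int:
    """Resolve a positive integer or positive percentage to a clause count."""
    if isinstance(spec, str) and spec.endswith("%"):
        percent = int(spec[:-1])
        # Percentages are rounded down: 75% of 3 clauses is 2.25, so 2.
        return clause_count * percent // 100
    return int(spec)


print(required_should_matches("75%", 3))
print(required_should_matches(2, 3))
```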

Exact Values Search

Numeric query

SELECT document
FROM   products
WHERE  price = 20

An exact-value lookup usually does not need a score, so wrap it in constant_score:

{
    "query" : {
        "constant_score" : {
            "filter" : {
                "term" : {
                    "price" : 20
                }
            }
        }
    }
}

Text query

SELECT product
FROM   products
WHERE  productID = "XHDK-A-1293-#fJ3"

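There is a problem here: the analyzer splits XHDK-A-1293-#fJ3 into the tokens xhdk, a, 1293 and fj3 (the standard analyzer also lowercases them and drops the punctuation), so a term query for the exact original value finds nothing: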

GET /my_store/products/_search
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "term" : {
                    "productID" : "XHDK-A-1293-#fJ3"
                }
            }
        }
    }
}
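
To see why the term query above comes back empty, the sketch below imitates the analysis step. `rough_standard_analyze` is a hypothetical, very rough approximation of the standard analyzer (it just keeps lowercased runs of letters and digits); the real analyzer follows Unicode word-segmentation rules.

```python
import re


def rough_standard_analyze(text: str) -> list:
    """Very rough approximation: keep runs of letters/digits, lowercased."""
    return re.findall(r"[a-z0-9]+", text.lower())


product_id = "XHDK-A-1293-#fJ3"

# What ends up in the index for an analyzed productID field.
indexed_terms = set(rough_standard_analyze(product_id))

# A term query does not analyze its input; it looks up the value exactly as given.
term_query_finds_it = product_id in indexed_terms

print(sorted(indexed_terms))
print(term_query_finds_it)
```

The full ID is not among the indexed terms, which is why the field has to be indexed without analysis for exact lookups.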

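productID must be remapped so that it is not analyzed. Remember to delete the existing index before applying the new mapping. (The string type with not_analyzed shown here is the older syntax; Elasticsearch 5.0 and later use "type": "keyword" instead.)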

{
    "mappings" : {
        "products" : {
            "properties" : {
                "productID" : {
                    "type" : "string",
                    "index" : "not_analyzed"
                }
            }
        }
    }
}

Combining Filters

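Note that when should is combined with other clauses such as filter, the should clauses become optional. To get a AND (b OR c), either add minimum_should_match, or wrap the should clauses in a nested bool placed inside the same filter array. Otherwise the query will not behave as intended.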

{
  "filter": [
      { "term": { "date_on": "2022-05-04" } },
      { "term": { "name.keyword": "測試" } }
    ],
  "should": [
    { "term": { "company": "公司" } },
    { "bool": {
        "filter": [
          { "range": { "age": { "lt": 0 } } },
          { "range": { "age": { "lt": 0 } } }
        ]
      }
    }
  ],
  "minimum_should_match": 1
}

{
  "filter": [
    { "term": { "date_on": "2022-05-04" } },
    { "term": { "name.keyword": "測試" } },
    {
      "bool": {
        "should": [
          { "term": { "company": "公司" } },
          { "bool": {
              "filter": [
                  { "range": { "age": { "gte": 50 } } },
                  { "range": { "age": { "lte": 20 } } }
              ]
            }
          }
        ]
      }
    }
  ]
}
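
The second form above, where the OR group is a nested bool inside the filter array, can be generated with a small helper. `all_of_and_any_of` is a hypothetical name; the sketch returns a complete bool query and relies on the rule that a bool containing only should clauses requires at least one of them to match.

```python
def all_of_and_any_of(required: list, alternatives: list) -> dict:
    """Build a bool query: every `required` clause AND at least one of `alternatives`."""
    # A bool with only should clauses must match at least one of them,
    # so nesting it inside filter turns the group into a mandatory OR.
    any_of = {"bool": {"should": list(alternatives)}}
    return {"bool": {"filter": list(required) + [any_of]}}


query = all_of_and_any_of(
    required=[{"term": {"date_on": "2022-05-04"}}],
    alternatives=[
        {"term": {"company": "公司"}},
        {"range": {"age": {"gte": 50}}},
    ],
)
print(query)
```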

Example 1:

SELECT product
FROM   products
WHERE  (price = 20 OR productID = "XHDK-A-1293-#fJ3")
  AND  (price != 30)

{
   "query" : {
      "bool" : {
         "filter" : {
            "bool" : {
              "should" : [
                 { "term" : {"price" : 20}},
                 { "term" : {"productID" : "XHDK-A-1293-#fJ3"}}
              ],
              "must_not" : {
                 "term" : {"price" : 30}
              }
           }
         }
      }
   }
}

Example 2:

SELECT document
FROM   products
WHERE  productID      = "KDKE-B-9947-#kL5"
  OR (     productID = "JODL-X-1937-#pV7"
       AND price     = 30 )

{
   "query" : {
      "bool" : {
         "filter" : {
            "bool" : {
              "should" : [
                { "term" : {"productID" : "KDKE-B-9947-#kL5"}},
                { "bool" : {
                  "must" : [
                    { "term" : {"productID" : "JODL-X-1937-#pV7"}},
                    { "term" : {"price" : 30}}
                  ]
                }}
              ]
           }
         }
      }
   }
}

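Example 3:

Match messages that satisfy either of these conditions:

In the inbox and not yet read
Not in the inbox, but marked as important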

{
  "query": {
      "constant_score": {
          "filter": {
              "bool": {
                 "should": [
                    { "bool": {
                          "must": [
                             { "term": { "folder": "inbox" }},
                             { "term": { "read": false }}
                          ]
                    }},
                    { "bool": {
                          "must_not": {
                             "term": { "folder": "inbox" }
                          },
                          "must": {
                             "term": { "important": true }
                          }
                    }}
                 ]
              }
          }
      }
  }
}

Disjunction Max Query (best field)

Given two documents, each with a title and a body field:

PUT /my_index/my_type/1
{
    "title": "Quick brown rabbits",
    "body":  "Brown rabbits are commonly seen."
}

PUT /my_index/my_type/2
{
    "title": "Keeping pets healthy",
    "body":  "My quick brown fox eats rabbits on a regular basis."
}

bool

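With a plain bool query, document 1 gets the higher score, mainly because both of its fields contain Brown. What we actually want is the more accurate document 2, because its body alone contains both Brown and fox.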

{
    "query": {
        "bool": {
            "should": [
                { "match": { "title": "Brown fox" }},
                { "match": { "body":  "Brown fox" }}
            ]
        }
    }
}

dis_max

dis_max returns every document that matches any of the queries, but uses only the score of the best-matching query as the document's score.

{
    "query": {
        "dis_max": {
            "queries": [
                { "match": { "title": "Brown fox" }},
                { "match": { "body":  "Brown fox" }}
            ]
        }
    }
}

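If dis_max produces the same best-match score for two documents, add tie_breaker to tune the result: the scores of the other matching clauses are multiplied by tie_breaker (a ratio between 0 and 1) and added to the best score.

tie_breaker is a floating-point value between 0 and 1. A value of 0 gives the plain dis_max behaviour of using only the best-matching clause, and 1 treats all matching clauses as equally important. The best value has to be found by testing against your own data and queries, but a reasonable value is close to zero (between 0.1 and 0.4), so that the best-match nature of dis_max is not undermined.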

{
    "query": {
        "dis_max": {
            "queries": [
                { "match": { "title": "Quick pets" }},
                { "match": { "body":  "Quick pets" }}
            ],
            "tie_breaker": 0.3
        }
    }
}
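
The role of tie_breaker can be written as a one-line formula: the best clause score plus tie_breaker times the sum of the remaining clause scores. The sketch below applies it to made-up clause scores; `dis_max_score` is a hypothetical helper and the numbers are illustrative, not real Elasticsearch output.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Best clause score plus tie_breaker times the other matching clauses' scores.

    clause_scores holds the scores of the clauses that matched (at least one).
    """
    best = max(clause_scores)
    others = sum(clause_scores) - best
    return best + tie_breaker * others


# Made-up scores for the title and body clauses of one document.
scores = [0.5, 0.25]

print(dis_max_score(scores))                   # only the best clause counts
print(dis_max_score(scores, tie_breaker=0.3))  # other clauses count a little
print(dis_max_score(scores, tie_breaker=1.0))  # every clause counts fully
```

With tie_breaker at 1.0 the result is simply the sum of all clause scores, which is why values near zero are suggested if the best-field behaviour should dominate.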

post_filter

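post_filter is applied to the results after the query has run, as a final filter on the hits, and it does not affect aggregations.

A typical use case: an aggregation lists the available categories. When the user clicks one category, you do not want the category list itself to shrink; you only want to filter the results.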

{
    "size" : 0,
    "query": {
        "match": {
            "make": "ford"
        }
    },
    "post_filter": {
        "term" : {
            "color" : "green"
        }
    },
    "aggs" : {
        "all_colors": {
            "terms" : { "field" : "color" }
        }
    }
}
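
The order of operations can be imitated in memory. The sketch below is an assumption-laden model, not Elasticsearch code: `search` is a hypothetical function, the car documents are made up, and the query and post_filter are plain Python predicates. It shows the aggregation being computed on everything the query matched, while post_filter narrows only the returned hits.

```python
from collections import Counter

# Made-up documents for illustration.
cars = [
    {"make": "ford", "color": "green"},
    {"make": "ford", "color": "red"},
    {"make": "ford", "color": "green"},
    {"make": "honda", "color": "green"},
]


def search(docs, query, post_filter):
    """Return (hits, color_counts) the way query + aggs + post_filter interact."""
    matched = [doc for doc in docs if query(doc)]
    # Aggregations see every document matched by the query...
    all_colors = Counter(doc["color"] for doc in matched)
    # ...while post_filter narrows only the hits that are returned.
    hits = [doc for doc in matched if post_filter(doc)]
    return hits, all_colors


hits, all_colors = search(
    cars,
    query=lambda doc: doc["make"] == "ford",
    post_filter=lambda doc: doc["color"] == "green",
)
print(hits)
print(all_colors)
```

The color counts still include red even though only green cars are returned as hits.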
