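1. Query vs. search
Query: has explicit condition boundaries. For example, age 15 to 25, color = red, price < 3000. Here 15, 25, red, and 3000 are the condition boundaries, so the result range is clearly delimited.
Search: full-text search, with no explicit condition boundaries. What gets recalled depends on relevance, and relevance is not computed from hard boundary conditions: synonyms, homophones, aliases, typos, easily confused words, internet memes, and so on can all feed into the relevance judgment.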
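2. Context object
Searching with the query keyword leans toward relevance search, so a score has to be computed. Search is the most critical and important part of Elasticsearch.
3. Relevance score: _score
The relevance score is used to rank search results: the higher the score, the more closely the result is considered to match what the search expects. Before 5.0 the score was computed with the TF/IDF algorithm by default; from 5.0 onward the default is BM25. In this core-knowledge part there is no need to understand how the score is computed; knowing the concept is enough.
If no sort field is specified, results are sorted by score in descending order, so by default the higher the score, the earlier the result appears.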
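4. Source data: _source
4.1 Source filters:
- Including: which fields to return in the result
- Excluding: which fields to leave out of the result. A field that is not returned can still be searched on, because being absent from the source data does not mean being absent from the index
4.2 Defining the filter in the mapping
Wildcards are supported, but this approach is not recommended because the mapping is immutable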
PUT product
{
"mappings": {
"_source": {
"includes": [
"name",
"price"
],
"excludes": [
"desc",
"tags"
]
}
}
}
4.3 Filtering in the query
Return no source data, only the metadata fields
{
"_source": false,
"query": {
...
}
}
Return only the fields that start with obj.
{
"_source": "obj.*",
"query": {
...
}
}
Return the fields that start with obj1. or obj2.
{
"_source": [ "obj1.*", "obj2.*" ],
"query": {
...
}
}
Return the fields that start with obj1. or obj2. and do not end with .desc
{
"_source": {
"includes": [
"obj1.*",
"obj2.*"
],
"excludes": [
"*.desc"
]
},
"query": {
...
}
}
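The include/exclude behaviour above can be imitated outside Elasticsearch to build intuition. The following is only a rough sketch in Python: it assumes documents are already flattened to dotted field paths, and `filter_source` is a hypothetical helper, not part of any Elasticsearch client. It shows that a field is kept when it matches an include pattern (or when no includes are given) and matches no exclude pattern.

```python
from fnmatch import fnmatchcase


def filter_source(flat_doc, includes=None, excludes=None):
    """Keep the dotted field paths selected by include/exclude wildcard patterns.

    Hypothetical helper for illustration only; Elasticsearch applies its own
    implementation to the nested _source document.
    """
    kept = {}
    for path, value in flat_doc.items():
        # No include list means "everything is included".
        included = not includes or any(fnmatchcase(path, p) for p in includes)
        excluded = any(fnmatchcase(path, p) for p in (excludes or []))
        if included and not excluded:
            kept[path] = value
    return kept


doc = {
    "obj1.name": "a",
    "obj1.desc": "b",
    "obj2.price": 10,
    "obj3.name": "c",
}
result = filter_source(doc, includes=["obj1.*", "obj2.*"], excludes=["*.desc"])
print(result)  # {'obj1.name': 'a', 'obj2.price': 10}
```

Excludes win over includes here: obj1.desc matches an include pattern but is still dropped.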
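4.4 Disabling _source
Pro: saves storage overhead
Cons: the following are no longer available:
- The update, update_by_query, and reindex APIs.
- Highlighting.
- Reindexing from one index to another, whether to change the mapping or analyzers or to upgrade to a new version.
- Debugging queries or aggregations by looking at the original document used at index time.
- Possible future automatic repair of index corruption.
Summary: if the only goal is to save disk space, compressing the index is a better choice than disabling _source.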
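5. Pagination and sorting
By default, a search returns the top 10 matching hits.
5.1 Pagination
Paginated searches use the following two parameters:
- from: the offset of the first hit to return; must be non-negative, default: 0
- size: the number of hits to return, default: 10
Basic syntax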
GET <index>/_search
{
"from": 0,
"size": 20
}
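Notes
- from + size must be less than or equal to 10000. The reason is the deep-pagination problem, whose mechanics are covered in the advanced part.
- max_result_window: the index.max_result_window setting raises the limit on from + size, but changing the threshold blindly without understanding the mechanics can have serious consequences.
- track_total_hits: lets hits.total return the exact count, at the cost of performance.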
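To make the from/size arithmetic and the 10000 limit concrete, here is a small sketch in Python. `page_params` is a hypothetical helper (not an Elasticsearch API); it converts a 1-based page number into from and size, and rejects pages that would exceed the window, assuming the default limit of 10000.

```python
MAX_RESULT_WINDOW = 10_000  # default limit on from + size


def page_params(page, page_size, max_window=MAX_RESULT_WINDOW):
    """Translate a 1-based page number into the from/size pair of a search body."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    if start + page_size > max_window:
        # Deep pagination: the request would go past the result window.
        raise ValueError(f"from + size = {start + page_size} exceeds {max_window}")
    return {"from": start, "size": page_size}


print(page_params(1, 20))    # {'from': 0, 'size': 20}
print(page_params(500, 20))  # {'from': 9980, 'size': 20}
```

With a page size of 20, page 500 is the last page that fits; page 501 would need from + size = 10020 and is rejected by this sketch.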
5.2 Sorting
Basic syntax
GET <index>/_search
{
"sort": [
{
"<sort_field>": {
"order": "desc" // or asc
}
}
]
}
- sort_field: can be a _source field or a metadata field
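6. URL Query
6.1 Pros and cons
- Pro: does not depend on any client
- Con: unfriendly for complex queries
- Con: no autocompletion or hints
6.2 Basic usage
Query all documents
GET /goods/_search
Query with a parameter
GET /goods/_search?q=name:xiaomi
Pagination and sorting
GET /goods/_search?from=0&size=2&sort=price:asc
Exact match (exact value)
GET /goods/_search?q=date:2040-07-27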
If no field name is specified, the request is an _all search, i.e. the value is searched across the fields of the index
GET /goods/_search?q=2040-07-27
DELETE goods
# ****_all****
PUT goods
{
"mappings": {
"properties": {
"desc": {
"type": "text",
"index": false
}
}
}
}
# ************
POST /goods/_update/5
{
"doc": {
"desc": "erji zhong de kendeji 2040-07-27"
}
}
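When URL searches like the ones above are issued from a program rather than typed into a console, the query string has to be escaped. The sketch below, in Python with only the standard library, builds the path and query string for such a request. `search_url` is a hypothetical helper, and the index name goods and the parameters are taken from the examples above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(index, params=None):
    """Build the path and query string of a URL search request."""
    query = urlencode(params or {})
    return f"/{index}/_search" + (f"?{query}" if query else "")


url = search_url("goods", {"q": "name:xiaomi", "from": 0, "size": 2, "sort": "price:asc"})
print(url)

# Decoding the query string gives back the original parameters.
print(parse_qs(urlsplit(url).query))
```

The colon in name:xiaomi is percent-encoded in the generated URL; the server decodes it back before parsing the q parameter.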
7. ********: Full Text Query ★
7.1 ****************
- ********:**********************
- ********:**************,******************
- ********:** FullText Query
- ********:****************************
- ********:TF-IDF、Okapi BM25
7.2 ************
7.2.1 ****
**********************,********************************************************。**************************,**********、********、********、******、************、******************。
********************************(Inverted Index),****************************************。******************,**************************************************,************************************。
************,**************************************。**********,****************(Tokenization),**********************(****),************************。**********************,******************************。******************************************************。
************************************,**********************,************************,**********************************。************************************************************,****TF-IDF(****-**********)****、BM25(Okapi Best Matching 25)******。
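The inverted index and tokenization named in this section can be illustrated with a toy example. This is a minimal sketch in Python under strong assumptions: the tokenizer simply lowercases and splits on non-alphanumeric characters, the index stores only document IDs (no positions, no scoring), and the function names are made up for illustration. Real Elasticsearch analyzers and Lucene indexes are far richer.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Toy analyzer: lowercase, then split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents containing at least one query term (OR semantics)."""
    hits = set()
    for term in tokenize(query):
        hits |= index.get(term, set())
    return sorted(hits)


docs = {
    1: "XiaoMi 110 Pro Ultra",
    2: "hongmi k100",
    3: "iphone 120 Pro Max Ultra Phone",
}
index = build_inverted_index(docs)
print(search(index, "pro phone"))  # [1, 3]
print(search(index, "XIAOMI"))     # [1]
```

Because both the documents and the query go through the same tokenizer, the lookup is case-insensitive and works per term rather than per whole string.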
7.2.2 ******
******************,****:
- **************************;
- **********************;
- ******************,************、********、**********;
- ****************************,******************。
****,****************************,****:
- **********************,****“********”**************“****”**“****”,**************“******”**“**”;
- ********************************;
- ************************************************;
- ******************************,********、****、******。
7.2.3 ************
7.3 Match
7.3.1 ****
GET <index>/_search
{
"query": {
"match": {
"<field_name>": "<field_value>"
}
}
}
7.3.2 ****
**********************。**********************,********************************,**********。
7.4 Match All
match_all:******************
GET <index>/_search
{
"query": {
"match_all": {}
}
}
7.5 Match Phrase
7.5.1 ************
Match_phrase ******************,****************************。** Match ********,Match_phrase ************************,******************。
7.5.2 ********
- ******:match_phrase ******************************。****,************** “elastic org cn”,********************:。
- **************************:****************** match_phrase ****************,********************。************** elastic org cn,****************************"elastic"、"org"**"cn"**********,**************。
- Slop****:************************** match_phrase **************************,** slop ****** 0,********** slop **************,****************** slop **********
7.5.3 ********
GET <index>/_search
{
"query": {
"match_phrase": {
"<field_name>": "<field_value>"
}
}
}
7.5.4 Slop ****
** Elasticsearch ** Match_phrase ******,slop **************************************************。******************************************,********************************。
slop **************************,******** 0,******************************************************。** slop ******** 0 **,**************************************************。****,**** slop ********** 1,**********************************************。
**** slop **********************:
- slop ************ Match_phrase ****,******** Match_phrase_prefix ****
- slop ************,******************,**********************,******************。
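As an illustration of where the slop parameter goes, the sketch below builds a match_phrase request body in Python using the expanded form of the query, with query and slop keys. The helper name `match_phrase_body` and the field name name are assumptions made for this example; the phrase is the one used earlier in this section.

```python
import json


def match_phrase_body(field, phrase, slop=0):
    """Build a search body for a match_phrase query; slop defaults to 0."""
    return {
        "query": {
            "match_phrase": {
                field: {
                    "query": phrase,
                    "slop": slop,
                }
            }
        }
    }


body = match_phrase_body("name", "elastic org cn", slop=1)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET <index>/_search.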
8. ********: Term-Level Query
8.1 ****
****************************,******************************,********************,********,**********************************,************,******************************。**************************。**:********、**********。
Term **************:Id、****、**********************。
8.2 Term Query
8.2.1 ********
** match query ****,term query ****************,******************************,********、**********。
8.2.2 ********
********,term query ** keyword ************,******************。
| | term query | match query | match_phrase |
|---|---|---|---|
| ******** | ******(**:keyword) | ****(**:text) | ****(**:text) |
| ******** | ******** | ******** | ******** |
8.2.3 ****
GET <index>/_search
{
"query": {
"term": {
"<keyword_field_name>": "<field_value>"
}
}
}
8.2.4 term ** keyword
| | term | keyword |
|---|---|---|
| ******** | ******** | ******** |
| ******** | ** | ** |
| ******** | ****** | ****** |
8.2.5 ********
** match query ****************************************,** match ************
******。
**:****** match **********************,****************
************,name ******** text,********************:
GET goods_en/_search
{
"query": {
"match": {
"name.keyword": "xiaomi 9"
}
}
}
****************** keyword ****,****************xiaomi 9
****************,************xiaomi 9
******** match query,****************,************:
************** xiaomi ** 9 ********,************** xiaomi 9
******,************************,****************************,************************************************。
8.3 Terms Query
8.3.1 ********
Terms ** Term ************ Terms ******************。
****:
GET /_search
{
"query": {
"terms": {
"<field_name>": [ "<value1>", "<value2>" ]
}
}
}
8.3.2 ****
**** goods ******************
、******
**********
GET goods/_search
{
"query": {
"terms": {
"tags.keyword": [
"****",
"******"
]
}
}
}
******************** SQL **********
SELECT * FROM goods WHERE tags IN ('****', '******')
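The IN-like behaviour of a terms query on a keyword field can be mimicked in a few lines. This is only a sketch in Python: `terms_match` is a hypothetical helper, and the tag values are borrowed from the goods_en sample documents used later in this document. A document matches when at least one of its values is exactly equal to one of the listed values.

```python
def terms_match(doc, field, values):
    """True if the field (a single value or a list) holds at least one listed value."""
    actual = doc.get(field, [])
    if not isinstance(actual, list):
        actual = [actual]
    # Exact comparison, as on a keyword field: no lowercasing, no tokenization.
    return any(value in values for value in actual)


docs = [
    {"id": 3, "tags": ["IOS", "sihua", "liuchang"]},
    {"id": 4, "tags": ["fashao", "nfc", "miui"]},
    {"id": 5, "tags": ["xingjiabi", "xuhang", "shishang", "miui"]},
]
hits = [d["id"] for d in docs if terms_match(d, "tags", {"nfc", "xuhang"})]
print(hits)  # [4, 5]
```

Because the comparison is exact, searching for ios in lowercase would not match the tag IOS in this sketch.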
8.3.3 Ids ****
****************************** _id ****,******** Ids ****,**:
GET /_search
{
"query": {
"ids" : {
"values" : ["1", "4", "100"]
}
}
}
**
GET goods/_search
{
"query": {
"ids": {
"values": [1,4,7]
}
}
}
8.4 ********: Range Query
GET /_search
{
"query": {
"range": {
"age": {
"gte": 10,
"lte": 20,
"boost": 2.0
}
}
}
}
********
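The bounds of the range query above can be read as plain comparisons. The following Python sketch is an illustration only: `in_range` is a hypothetical helper that understands the gt, gte, lt, and lte keys and skips anything else, such as boost, which affects scoring rather than matching.

```python
import operator

# Comparison to apply for each bound name of a range query.
_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """True if value satisfies every gt/gte/lt/lte bound; other keys are ignored."""
    return all(
        _OPS[name](value, limit)
        for name, limit in bounds.items()
        if name in _OPS
    )


ages = [5, 10, 15, 20, 25]
matched = [a for a in ages if in_range(a, {"gte": 10, "lte": 20, "boost": 2.0})]
print(matched)  # [10, 15, 20]
```

Both ends are inclusive with gte and lte; switching to gt and lt would drop 10 and 20.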
9. ********: Boolean Query
******************
ES **
PUT /goods_en/_doc/1
{
"name" : "HuaRen 2060 Super Phone ShaNiu",
"desc" : "From the future of technology,zhenren moshi zongxiang kuaile",
"brand": "HuaRen",
"price" : 99999,
"lv":"ceiling",
"type":"phone",
"createtime":"2050-10-01T08:00:00Z",
"tags": [ "future", "chuanyue","zhenrenmoshi","shoujimoshi" ]
}
PUT /goods_en/_doc/2
{
"name" : "HuaWei Mate 9000 Phone",
"desc" : "zhichi weixing tongxun,xinhao qiang, gaoduan daqi shangdangci",
"brand": "HuaWei",
"price" : 29999,
"lv":"qijianji",
"type":"phone",
"createtime":"2050-05-21T08:00:00Z",
"tags": [ "xinhao", "shangwu","timian","xuhang"]
}
PUT /goods_en/_doc/3
{
"name" : "iphone 120 Pro Max Ultra Phone",
"desc" : "sihua liuchuang bukadun,nianqingren de diyitai shouji hebi shi iphone",
"brand": "Apple",
"price" : 12999,
"lv":"gaoduan",
"type":"phone",
"createtime":"2050-06-20",
"tags": [ "IOS", "sihua", "liuchang" ]
}
PUT /goods_en/_doc/4
{
"name" : "XiaoMi 110 Pro Ultra",
"desc" : "erji zhong de huangmenji",
"brand": "Xiaomi",
"price" : 6999,
"lv":"gaoduan",
"type":"erji",
"createtime":"2050-06-23",
"tags": [ "fashao", "nfc","miui" ]
}
PUT /goods_en/_doc/5
{
"name" : "hongmi k100",
"desc" : "xingjiabi qijian, dijia gaopei",
"brand": "Xiaomi",
"price" : 1999,
"type":"phone",
"lv":"qianyuanji",
"createtime":"2050-07-20",
"tags": [ "xingjiabi","xuhang", "shishang","miui" ]
}
SQL**:
/*
Navicat Premium Data Transfer
Source Server : local
Source Server Type : MySQL
Source Server Version : 50731
Source Host : localhost:3306
Source Schema : my_db
Target Server Type : MySQL
Target Server Version : 50731
File Encoding : 65001
Date: 16/04/2023 11:35:54
*/
SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;
-- ----------------------------
-- Table structure for goods_en
-- ----------------------------
DROP TABLE IF EXISTS `goods_en`;
CREATE TABLE `goods_en` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`desc` varchar(300) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`brand` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`price` decimal(10,0) DEFAULT NULL,
`lv` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`type` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`createtime` datetime DEFAULT NULL,
`tags` varchar(200) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
-- ----------------------------
-- Records of goods_en
-- ----------------------------
BEGIN;
INSERT INTO `goods_en` VALUES (1, 'HuaRen 2060', 'From the future of technology,zhenren moshi zongxiang kuaile', 'HuaRen', 99999, 'ceiling', 'phone', '2050-10-01 16:00:00', '\"future\",\"chuanyue\",\"zhenrenmoshi\",\"shoujimoshi\"');
INSERT INTO `goods_en` VALUES (2, 'HuaWei Mate 9000', 'zhichi weixing tongxun,xinhao qiang, gaoduan daqi shangdangci', 'HuaWei', 29999, 'qijianji', 'phone', '2050-05-21 16:00:00', '\"xinhao\",\"shangwu\",\"timian\",\"xuhang\"');
INSERT INTO `goods_en` VALUES (3, 'iphone 120 Pro Max Ultra', 'sihua liuchuang bukadun,nianqingren de diyitai shouji hebi shi iphone', 'Apple', 12999, 'gaoduan', 'phone', '2050-06-20 00:00:00', '\"IOS\",\"sihua\",\"liuchang\"');
INSERT INTO `goods_en` VALUES (4, 'XiaoMi 110 Pro Ultra', 'erji zhong de huangmenji', 'Xiaomi', 6999, 'gaoduan', 'erji', '2050-06-23 00:00:00', '\"fashao\",\"nfc\",\"miui\"');
INSERT INTO `goods_en` VALUES (5, 'hongmi k100', 'xingjiabi qijian, dijia gaopei', 'Xiaomi', 1999, 'qianyuanji', 'phone', '2050-07-20 00:00:00', '\"xingjiabi\",\"xuhang\",\"shishang\",\"miui\"');
COMMIT;
SET FOREIGN_KEY_CHECKS = 1;
9.1 ****
9.1.1 ****
bool:********************,bool ************ more_matches_is_better ******,******** must ** should ******************************
bool query ********************
9.1.2 ********
GET _search
{
"query": {
"bool": {
"filter": [ // **************
{******},
{******}
],
"must": [ // **************
{******},
{******}
],
"must_not": [ // ******************
{******},
{******}
],
"should": [ // ********************************
{******},
{******}
]
}
}
}
9.2 ********
9.2.1 Must ****
- **************
- ********************
# Must => WHERE ****** and ******
# ****1: brand:xiaomi
# And
# ****2: price:> 5000
GET goods_en/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"brand": "Xiaomi"
}
},
{
"range": {
"price": {
"gt": 5000
}
}
}
]
}
}
}
9.2.2 Filter ****
****** cache☆ ****(****)**********************。******** must **********, **** filter ******************,****************。
**:
- ****************
- ********
# ****1: type:phone
# And
# ****2: price:< 20000
GET goods_en/_search
{
"_source": false,
"query": {
"bool": {
"filter": [
{
"term": {
"type.keyword": "phone"
}
},
{
"range": {
"price": {
"lt": 20000
}
}
}
]
}
}
}
9.2.3 Should ****
******** or ****(****)********************。
should => Where ****** or ****** // ** ****************
# ****1: name: Huawei Xiaomi
# OR
# ****2: brand: xiaomi huawei
GET goods_en/_search
{
"_source": false,
"query": {
"bool": {
"should": [
{
"match": {
"name": "xiaomi huawei"
}
},
{
"terms": {
"brand.keyword": [
"Xiaomi",
"HuaWei"
]
}
}
]
}
}
}
9.2.4 Must_not ****
******************************,********************
Must not !(****** and ******)
#****1: ************ 10000 **:!(price <= 10000)
#****2: ************: brand != Apple
GET goods_en/_search
{
"query": {
"bool": {
"must_not": [
{
"range": {
"price": {
"lte": 10000
}
}
},
{
"term": {
"brand.keyword": {
"value": "Apple"
}
}
}
]
}
}
}
9.3 ********
9.3.1 ********
********************,************************ AND,**************。**,********** must [case1, case2] ** must_not [case3, case4] **,********:************ case1 ** case2,**************** case3、case4
9.3.2 ****
** filter ** must **********,filter ************************,******************,** must ******************。****** filter **********,************************,******************。
******,**** filter ******** filter。
# filter ** must ****
# ******:******** 5000
# ******:**********
GET goods_en/_search
{
"_source": false,
"query": {
"bool": {
"filter": [
{
"range": {
"price": {
"gte": "5000"
}
}
}
],
"must": [
{
"match": {
"name": "phone"
}
}
]
}
}
}
9.4 minimum_should_match ****
9.4.1 ********
minimum_should_match ************ should **************************************,**** bool **************** should ****,****** must ** filter ****,********** 1。
# ****1: name****** "phone"
# ******** minimum_should_match ******,**************** OR
# ****2: type **** "phone"
GET goods_en/_search
{
"_source": false,
"query": {
"bool": {
"should": [
{
"term": {
"type.keyword": "phone"
}
},
{
"match": {
"name": "phone"
}
}
],
"minimum_should_match": 2 // ****** 2 ****************** 2 ******
}
}
}
9.4.2 ********
******** bool ********************** must **** filter ****,** minimum_should_match ************** 0。
# ( must **** filter )** should ****
# ****1: ********20000
# ****2: name******"phone"**** type ****"phone"
GET goods_en/_search
{
"_source": false,
"query": {
"bool": {
"filter": [
{
"range": {
"price": {
"lte": "20000"
}
}
}
],
"should": [
{
"term": {
"type.keyword": "phone"
}
},
{
"match": {
"name": "phone"
}
}
],
"minimum_should_match": 1
}
}
}
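The default-value rule for minimum_should_match described in 9.4.1 and 9.4.2 fits in one small function. This is a sketch in Python with hypothetical helper names; it handles only integer values of minimum_should_match, although Elasticsearch also accepts other forms such as percentages.

```python
def default_minimum_should_match(bool_query):
    """Default: 1 if there are should clauses and no must/filter clauses, else 0."""
    has_should = bool(bool_query.get("should"))
    has_must_or_filter = bool(bool_query.get("must")) or bool(bool_query.get("filter"))
    return 1 if has_should and not has_must_or_filter else 0


def effective_minimum_should_match(bool_query):
    """An explicit value wins; otherwise fall back to the default rule."""
    return bool_query.get("minimum_should_match", default_minimum_should_match(bool_query))


only_should = {"should": [{"term": {"type.keyword": "phone"}}]}
with_filter = {
    "filter": [{"range": {"price": {"lte": 20000}}}],
    "should": [{"term": {"type.keyword": "phone"}}],
}
print(effective_minimum_should_match(only_should))  # 1
print(effective_minimum_should_match(with_filter))  # 0
```

This is why the 9.4.2 example sets minimum_should_match to 1 explicitly: with a filter clause present, the should clauses would otherwise be optional.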
9.5 **********
# ****1:name ******** iphone
# ****2:
# name******"phone" ** ******** **** 20000
# **** type ****"phone" ** (Brand = HuaWei ** Apple)
GET goods_en/_search
{
"_source": false,
"query": {
"bool": {
"must_not": [
{
"match": {
"name": "iphone"
}
}
],
"should": [
{
"bool": {
"must": [
{
"match": {
"name": "phone"
}
},
{
"range": {
"price": {
"gte": 20000
}
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"type.keyword": "phone"
}
},
{
"terms": {
"brand.keyword": [
"HuaWei",
"Apple"
]
}
}
]
}
}
]
}
}
}
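To check by hand which sample documents the nested query above should return, the boolean logic can be replayed in plain Python. This is a simplified sketch, not how Elasticsearch evaluates queries: scoring is ignored, the analyzer is a lowercase whitespace split, and every helper (`match`, `term`, `terms`, `range_`, `bool_`) is a made-up stand-in for the clause of the same name. The data is a reduced copy of the five goods_en documents.

```python
def tokens(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field, text):
    # Analyzed match: the field shares at least one token with the query text.
    return lambda doc: any(t in tokens(doc[field]) for t in tokens(text))


def term(field, value):
    # Exact, unanalyzed comparison (as on a keyword field).
    return lambda doc: doc[field] == value


def terms(field, values):
    return lambda doc: doc[field] in values


def range_(field, gte=None, lte=None):
    def check(doc):
        value = doc[field]
        return (gte is None or value >= gte) and (lte is None or value <= lte)
    return check


def bool_(must=(), filter_=(), should=(), must_not=(), minimum_should_match=None):
    def check(doc):
        if not all(clause(doc) for clause in (*must, *filter_)):
            return False
        if any(clause(doc) for clause in must_not):
            return False
        needed = minimum_should_match
        if needed is None:
            # Default: 1 when only should clauses narrow the result, else 0.
            needed = 1 if should and not (must or filter_) else 0
        return sum(1 for clause in should if clause(doc)) >= needed
    return check


docs = [
    {"id": 1, "name": "HuaRen 2060 Super Phone ShaNiu", "brand": "HuaRen", "price": 99999, "type": "phone"},
    {"id": 2, "name": "HuaWei Mate 9000 Phone", "brand": "HuaWei", "price": 29999, "type": "phone"},
    {"id": 3, "name": "iphone 120 Pro Max Ultra Phone", "brand": "Apple", "price": 12999, "type": "phone"},
    {"id": 4, "name": "XiaoMi 110 Pro Ultra", "brand": "Xiaomi", "price": 6999, "type": "erji"},
    {"id": 5, "name": "hongmi k100", "brand": "Xiaomi", "price": 1999, "type": "phone"},
]

query = bool_(
    must_not=[match("name", "iphone")],
    should=[
        bool_(must=[match("name", "phone"), range_("price", gte=20000)]),
        bool_(must=[term("type", "phone"), terms("brand", ["HuaWei", "Apple"])]),
    ],
)
hits = [d["id"] for d in docs if query(d)]
print(hits)  # [1, 2]
```

Document 3 satisfies the second should branch but is removed by must_not; documents 4 and 5 satisfy neither should branch, and since there is no must or filter clause at the top level, at least one should branch is required.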