Practice Exam Notes

Shard Allocation

Case 1: Allocate all shards of index A to node1, and all shards of index B to node2 and node3

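Steps:

Give the three nodes different attribute values: assume node1 is hot, and node2 and node3 are both warm
In the configuration file of node1, add node.attr.hot_warm: hot; in the configuration files of node2 and node3, add node.attr.hot_warm: warm
Restart the nodes in the cluster one after another
Apply the matching allocation filter to each index: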

PUT index_a/_settings
{
    "index.routing.allocation.require.hot_warm": "hot"
}

PUT index_b/_settings
{
    "index.routing.allocation.require.hot_warm": "warm"
}

Check where the shards are located and how the reallocation is progressing:

GET _cat/shards/index_a?v

GET _cat/shards/index_b?v

Important!!! To remove a setting, set the corresponding key to null, for example: index.routing.allocation.include.hot_warm: null
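
A minimal sketch of such a reset, assuming the require filter from Case 1 is the one to remove (index name and attribute are reused from above):

```
PUT index_a/_settings
{
    "index.routing.allocation.require.hot_warm": null
}
```

Running GET _cat/shards/index_a?v again afterwards is one way to see whether the shards are free to move.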

The values of exclude, include, and require are comma-separated strings, not arrays!
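
For illustration, a sketch of a filter that carries several values in one string. It assumes the nodes are really named node2 and node3, and it uses include, which accepts a node when any one of the listed values matches:

```
PUT index_b/_settings
{
    "index.routing.allocation.include._name": "node2,node3"
}
```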

Case 2: Prevent the index hamlet from being allocated to node3

PUT hamlet/_settings
{
    "index.routing.allocation.exclude._name":"node3"
}

Note: _name is a built-in node attribute. Besides _name, the built-in attributes also include _ip and _host, among others.
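
As a sketch, the same exclusion could be expressed with _ip instead of _name. The address below is a placeholder, not a value taken from this exercise:

```
PUT hamlet/_settings
{
    "index.routing.allocation.exclude._ip": "192.168.1.13"
}
```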

Case 3: Set the node attribute AZ as required

Requirement: set the attribute to earth on node1 and node2, and to mars on node3. Shard allocation must use forced awareness.

# Configure on node1 and node2
node.attr.AZ: earth

# Configure on node3
node.attr.AZ: mars

Add the configuration above, then restart the cluster. After that, apply the following settings:

PUT _cluster/settings
{
    "persistent":{
        "cluster.routing.allocation.awareness.attributes": "AZ",
        "cluster.routing.allocation.awareness.force.AZ.values": "earth,mars"
    }
}
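
To check that the node attributes and the awareness settings took effect, requests along these lines can be run afterwards. This is only a sketch; the column list passed to h is one convenient selection:

```
# List the custom attributes per node; AZ should appear with earth or mars
GET _cat/nodeattrs?v&h=node,attr,value

# Show the cluster settings that are currently applied
GET _cluster/settings
```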

Single-Node RBAC: Full Configuration Walkthrough

Enable X-Pack security in the node's elasticsearch.yml by adding the following settings:

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true

Restart Elasticsearch, then run the password initialization script:

./elasticsearch-setup-passwords interactive

This script can only be run once. If it has already been run, passwords can only be changed through the API or in the Kibana UI.
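
For reference, a sketch of changing a password through the security API instead of the script. The user is the built-in elastic user, the new password is a placeholder, and the request has to be sent by a user who is allowed to manage security:

```
POST _security/user/elastic/_password
{
    "password": "new-password-here"
}
```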

Update the following settings in the Kibana configuration (kibana.yml), then restart Kibana:

elasticsearch.username: "elastic"
elasticsearch.password: "123456"

Then configure users and roles as required on the Kibana management pages.
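
The same can be done through the security API. A minimal sketch, in which the role name index_a_reader, the user name reader_user, and the password are all hypothetical:

```
# Role that may only read index_a
PUT _security/role/index_a_reader
{
    "indices": [
        {
            "names": ["index_a"],
            "privileges": ["read"]
        }
    ]
}

# User that is assigned the role above
PUT _security/user/reader_user
{
    "password": "changeme-123",
    "roles": ["index_a_reader"]
}
```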

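date_histogram + pipeline aggregation

First compute the average for each month, then find the month with the highest average. Note that the parameter is buckets_path, with a plural s; I got this wrong several times while practicing. The date_histogram has to run on a date field; the field name time below is an assumption, so substitute the date field of your own index.

GET earthquake/_search
{
    "size":0,
    "aggs":{
        "month_aggs":{
            "date_histogram":{
                "field": "time",
                "calendar_interval": "1M"
            },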
            "aggs":{
                "avg_aggs":{
                    "avg":{
                        "field": "magnitude"
                    }
                }
            }
        },
        "max_avg_magnitude":{
            "max_bucket":{
                "buckets_path":"month_aggs.avg_aggs"
            }
        }
    }
}
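
The Elasticsearch reference writes buckets_path with > between the parent aggregation and the metric inside it. A compact sketch of the same query in that notation; the date field name time is an assumption and should be replaced with the index's actual date field:

```
GET earthquake/_search
{
    "size": 0,
    "aggs": {
        "month_aggs": {
            "date_histogram": { "field": "time", "calendar_interval": "1M" },
            "aggs": {
                "avg_aggs": { "avg": { "field": "magnitude" } }
            }
        },
        "max_avg_magnitude": {
            "max_bucket": { "buckets_path": "month_aggs>avg_aggs" }
        }
    }
}
```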

Ingest data processing

Case 1: Process an array. The sample data looks roughly like this:

POST test_005/_bulk
{"index":{"_id":1}}
{"tags":["ping pang", "basket ball", " foot bool "]}
{"index":{"_id":2}}
{"tags":[" ping pang ", "gof bal"]}

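Requirement: strip the leading and trailing whitespace from each string, and add a new field array_length whose value is the length of the tags array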

PUT _ingest/pipeline/my_pipeline
{
    "description":"for practice",
    "processors":[
      {
          "foreach":{
              "field": "tags",
              "processor":{
                  "trim":{
                      "field":"_ingest._value"
                  }
              }
          }
      },
      {
          "script":{
              "source":"""
              ArrayList list = ctx.tags;
              ctx.array_length = list.size();
              """
          }
      }
    ]
}

POST test_005/_update_by_query?pipeline=my_pipeline
# Verify the result
GET test_005/_search
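
Before touching real documents, the stored pipeline can be tried out with the simulate API. A sketch, reusing one of the sample documents from above:

```
POST _ingest/pipeline/my_pipeline/_simulate
{
    "docs": [
        { "_source": { "tags": [" ping pang ", "gof bal"] } }
    ]
}
```

The response should contain the trimmed tags together with the new array_length field, which makes mistakes visible without modifying the index.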

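Case 2: Process data based on the result of a condition

Name the pipeline earthquakes_pipeline
Convert the value of the magnitude_type field to uppercase
If the document contains batch_number, increment it by 1; otherwise set batch_number to 1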

POST earthquakes/_doc/1
{
  "batch_number": 999,
  "magnitude_type":"abc"
}
POST earthquakes/_doc/2
{
  "magnitude_type":"cde"
}

PUT _ingest/pipeline/earthquakes_pipeline
{
    "description":"earth_quakes test",
    "processors":[
    {
        "uppercase":{
            "field":"magnitude_type"
        }
    },
    {
        "script":{
            "source":"""
            if(ctx.batch_number == null){
                ctx.batch_number = 1;
            }else {
                ctx.batch_number +=1;
            }
            """
        }
    }
    ]
}


POST earthquakes/_update_by_query?pipeline=earthquakes_pipeline

GET earthquakes/_search
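
An alternative sketch of the same logic that uses the if condition available on processors instead of branching inside one script. It is run through the simulate API with an inline pipeline, so nothing is stored or modified:

```
POST _ingest/pipeline/_simulate
{
    "pipeline": {
        "processors": [
            { "uppercase": { "field": "magnitude_type" } },
            {
                "script": {
                    "if": "ctx.batch_number != null",
                    "source": "ctx.batch_number += 1"
                }
            },
            {
                "set": {
                    "if": "ctx.batch_number == null",
                    "field": "batch_number",
                    "value": 1
                }
            }
        ]
    },
    "docs": [
        { "_source": { "batch_number": 999, "magnitude_type": "abc" } },
        { "_source": { "magnitude_type": "cde" } }
    ]
}
```

The increment runs before the set processor on purpose: in the opposite order, a document without batch_number would first be set to 1 and then incremented as well.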

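Case 3: Compute the length of a string and split the string into an array, then reindex into a new index

Although reindex can take a script directly, implementing this with an ingest pipeline is the safer choice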

PUT _ingest/pipeline/reindex_pipeline
{
    "description":"for reindex test",
    "processors":[
    {
        "script":{
            "source": """
            ctx.content_length = ctx.title.length();
            """
        }
    },
    {
        "set":{
            "field":"split_title",
            "value":"{{title}}"
        }
    },
    {
        "split":{
            "field":"split_title",
            "separator":" "
        }
    }
    ]
}

DELETE index_a

POST index_a/_doc/1
{
    "title":"foo bar"
}

POST index_a/_update_by_query?pipeline=reindex_pipeline

GET index_a/_search
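
The requests above apply the pipeline in place with _update_by_query. For the reindex step itself, a sketch could look like this, where the destination index name index_a_new is hypothetical:

```
POST _reindex
{
    "source": { "index": "index_a" },
    "dest": { "index": "index_a_new", "pipeline": "reindex_pipeline" }
}

GET index_a_new/_search
```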

Case 4: Split a string on . and store the resulting parts in three separate fields

POST _ingest/pipeline/_simulate
{
  "pipeline" : {
   "description": "split pipeline",
  "processors": [
    {
      "split": {
        "field": "line_number",
        "separator": "\\."
      }
    },
    {
      "set": {
        "field": "number_act",
        "value": "{{line_number.0}}"
      }
    },
     {
      "set": {
        "field": "number_scene",
        "value": "{{line_number.1}}"
      }
    },
     {
      "set": {
        "field": "number_line",
        "value": "{{line_number.2}}"
      }
    },
     {
        "script": {
          "source": """
            ctx.a1 = ctx.line_number.0;
            ctx.a2 = ctx.line_number.1;
            ctx.a3 = ctx.line_number.2;
          """
        }
      }
  ]
  },
  "docs" : [
    { "_source": { "line_number": "1.2.3"} }
  ]
}

Alias operations

Quick alias operations:

PUT {index_name}/_alias/{alias_name}

GET {index_name}/_alias/{alias_name}

DELETE {index_name}/_alias/{alias_name}

Using explicit actions, with a filter on the data:

POST _aliases
{
    "actions":[
        {
            "add":{
                "index":"{index_name}",
                "alias":"{alias_name}",
                "filter":{
                    "range":{
                        "{field}":{
                            "gt":"{value}"
                        }
                    }
                }
            }
        }
    ]
}
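
Because actions is a list, several changes can be submitted in one request and are applied atomically. A sketch of moving an alias from one index to another, with {old_index} and {new_index} as placeholders in the same style as above:

```
POST _aliases
{
    "actions": [
        { "remove": { "index": "{old_index}", "alias": "{alias_name}" } },
        { "add": { "index": "{new_index}", "alias": "{alias_name}" } }
    ]
}
```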

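Single-Node Backup and Restore: Full Walkthrough

As required, switch to a user that is allowed to start Elasticsearch, and create a folder under the specified directory on the server, for example: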

su elastic

cd /opt/elasticsearch-7.2.1

mkdir repo

Add the following setting to the node's elasticsearch.yml and restart; the path is the folder that was just created:

path.repo: /opt/elasticsearch-7.2.1/repo

Run the following test:

PUT index_a/_doc/1
{
  "name":"zhangsan"
}
PUT index_b/_doc/1
{
  "name":"zhangsan"
}
# Create the repository; this creates the corresponding folder under /opt/elasticsearch-7.2.1/repo
PUT _snapshot/snapshot_repo
{
  "type": "fs",
  "settings": {
    "location": "snapshot_repo"
  }
}
# Create the snapshot
PUT _snapshot/snapshot_repo/index_snapshot?wait_for_completion=true
{
  "indices": "index_a,index_b",
  "ignore_unavailable": true,
  "include_global_state": true
}
# Delete the indices
DELETE index_a,index_b
# Restore
POST _snapshot/snapshot_repo/index_snapshot/_restore
{
  "indices": "index_a",
  "index_settings":{
    "index.number_of_replicas": 0
  }
}
# Succeeds: index_a was restored
GET index_a/_search
# Fails: index_b was not restored
GET index_b/_search
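
Two follow-up requests that can be useful at this point, given as a sketch: listing the snapshots in the repository, and restoring index_b under a different name. The rename pattern and the resulting name restored_index_b are illustrative choices:

```
# List all snapshots stored in the repository
GET _snapshot/snapshot_repo/_all

# Restore index_b from the snapshot as restored_index_b
POST _snapshot/snapshot_repo/index_snapshot/_restore
{
  "indices": "index_b",
  "rename_pattern": "index_(.+)",
  "rename_replacement": "restored_index_$1"
}
```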

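Cross-Cluster Search: Full Configuration Walkthrough

Clusters with security enabled cannot run cross_cluster_search against each other unless the same certificate was used when their certificates were issued. I ran this test with X-Pack security turned off.

Create index_a in cluster A and index one document, then update cluster A's _cluster/settings. (Note that the seeds addresses must use the transport port, 9300 by default; I missed this on my first attempt.)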

DELETE index_a
POST index_a/_doc
{
    "name":"zhangsan"
}

PUT _cluster/settings
{
    "persistent":{
        "cluster":{
            "remote":{
                "cluster_B":{
                    "seeds":["172.25.17.61:9300"]
                }
            }
        }
    }
}

Create index_b in cluster B and index one document, then update cluster B's _cluster/settings

DELETE index_b
POST index_b/_doc
{
    "name":"lisi"
}

PUT _cluster/settings
{
    "persistent":{
        "cluster":{
            "remote":{
                "cluster_A":{
                    "seeds":["172.25.17.58:9300","172.25.17.87:9300","172.25.17.142:9300"]
                }
            }
        }
    }
}

Test:

# Query cluster B's data from cluster A
GET cluster_B:index_b/_search

# Query cluster A's data from cluster B
GET cluster_A:index_a/_search
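
If a remote query fails, checking the connection state first can help. A sketch, run on cluster A, that also shows a single search spanning a local and a remote index:

```
# Show the configured remote clusters and whether they are connected
GET _remote/info

# Search the local index_a and the remote index_b in one request
GET index_a,cluster_B:index_b/_search
```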
