Removing old indices in Elasticsearch

Many of my logs are indexed in logstash-year-week format. If I want to delete indices older than a few weeks, how can I achieve that in Elasticsearch? Is there an easy, seamless way to do it?

Take a look at Curator, a tool developed specifically for this kind of use case.

A sample command, from the documentation:

curator --host 10.0.0.2 delete indices --older-than 30 --time-unit days \
   --timestring '%Y.%m.%d'

Curator is the right tool for this. You can find it here: https://github.com/elastic/curator

A command like the following should work:

curator --host <IP> delete indices --older-than 30 --prefix "twitter-" --time-unit days  --timestring '%Y-%m-%d'

You can put this in cron so that old indices are deleted periodically.
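One detail worth knowing when moving such a command into a crontab: cron treats an unescaped % in the command field as a newline, so the --timestring pattern has to be written with \%. A minimal sketch, where cron_escape is a hypothetical helper and the 02:30 schedule is an arbitrary choice for illustration:

```shell
#!/bin/bash

# Hypothetical helper: escape every % so the command survives inside a
# crontab entry, where a bare % would be read as a newline.
cron_escape() {
  printf '%s\n' "$1" | sed 's/%/\\%/g'
}

cmd="curator --host 10.0.0.2 delete indices --older-than 30 --time-unit days --timestring '%Y.%m.%d'"

# Print a crontab line that runs the command every day at 02:30.
echo "30 2 * * * $(cron_escape "$cmd")"
```

The printed line can be pasted into crontab -e. Putting the command in a script file and calling that file from cron avoids the escaping altogether.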

You can find some examples and documentation here: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/examples.html

I use a bash script; just change 30 to the number of days you want to keep.

#!/bin/bash

# Zero padded days using %d instead of %e
DAYSAGO=`date --date="30 days ago" +%Y%m%d`
ALLLINES=`/usr/bin/curl -s -XGET http://127.0.0.1:9200/_cat/indices?v | egrep logstash`

echo
echo "THIS IS WHAT SHOULD BE DELETED FOR ELK:"
echo

echo "$ALLLINES" | while read LINE
do
  # Column 3 of _cat/indices is the index name; the part after the '-' is the date
  FORMATEDLINE=`echo $LINE | awk '{ print $3 }' | awk -F'-' '{ print $2 }' | sed 's/\.//g'`
  if [ "$FORMATEDLINE" -lt "$DAYSAGO" ]
  then
    TODELETE=`echo $LINE | awk '{ print $3 }'`
    echo "http://127.0.0.1:9200/$TODELETE"
  fi
done

echo
echo -n "If this makes sense, Y to continue, N to exit [Y/N]:"
read INPUT
if [ "$INPUT" == "Y" ] || [ "$INPUT" == "y" ] || [ "$INPUT" == "yes" ] || [ "$INPUT" == "YES" ]
then
  echo "$ALLLINES" | while read LINE
  do
    FORMATEDLINE=`echo $LINE | awk '{ print $3 }' | awk -F'-' '{ print $2 }' | sed 's/\.//g'`
    if [ "$FORMATEDLINE" -lt "$DAYSAGO" ]
    then
      TODELETE=`echo $LINE | awk '{ print $3 }'`
      /usr/bin/curl -XDELETE http://127.0.0.1:9200/$TODELETE
      sleep 1
    fi
  done
else 
  echo SCRIPT CLOSED BY USER, BYE ...
  echo
  exit
fi
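The date handling in the script above can be pulled into small functions, which makes it possible to try the logic without touching the cluster. This is only a sketch under the same assumption that index names end in YYYY.MM.DD; index_date_key and is_older_than are hypothetical names, not part of any tool:

```shell
#!/bin/bash

# Print the YYYYMMDD key embedded at the end of an index name such as
# logstash-2015.01.31; print nothing if the name does not end in a date.
index_date_key() {
  local re='([0-9]{4})\.([0-9]{2})\.([0-9]{2})$'
  if [[ $1 =~ $re ]]; then
    echo "${BASH_REMATCH[1]}${BASH_REMATCH[2]}${BASH_REMATCH[3]}"
  fi
}

# Succeed only when the index name carries a date strictly before the
# cutoff, which is given as YYYYMMDD.
is_older_than() {
  local key
  key=$(index_date_key "$1")
  [ -n "$key" ] && [ "$key" -lt "$2" ]
}

# Example with fixed input, so nothing is sent to Elasticsearch.
cutoff=20150201
for name in logstash-2015.01.31 logstash-2015.02.01 .kibana; do
  if is_older_than "$name" "$cutoff"; then
    echo "would delete: $name"
  fi
done
```

Names without a trailing date, such as .kibana, are skipped rather than compared, which avoids the "integer expression expected" error that an empty value would trigger in the numeric test.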

yanb (yet another bash)

#!/bin/bash
searchIndex=logstash-monitor
elastic_url=localhost
elastic_port=9200

date2stamp () {
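    # Convert a date string to seconds since the epoch (UTC)
    date --utc --date "$1" +%s
}

dateDiff (){
    # Optional first argument selects the unit of the result; default is days
    case $1 in
        -s)   sec=1;      shift;;
        -m)   sec=60;     shift;;
        -h)   sec=3600;   shift;;
        -d)   sec=86400;  shift;;
        *)    sec=86400;;
    esac
    dte1=$(date2stamp "$1")
    dte2=$(date2stamp "$2")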
    diffSec=$((dte2-dte1))
    if ((diffSec < 0)); then abs=-1; else abs=1; fi
    echo $((diffSec/sec*abs))
}

for index in $(curl -s "${elastic_url}:${elastic_port}/_cat/indices?v" | grep -E " ${searchIndex}-20[0-9][0-9]\.[0-1][0-9]\.[0-3][0-9]" | awk '{ print $3 }'); do
  date=$(echo ${index: -10} | sed 's/\./-/g')
  cond=$(date +%Y-%m-%d)
  diff=$(dateDiff -d $date $cond)
  echo -n "${index} (${diff})"
  if [ $diff -gt 1 ]; then
    echo " / DELETE"
    # curl -XDELETE "${elastic_url}:${elastic_port}/${index}?pretty"
  else
    echo ""
  fi
done    
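Both scripts above assume daily indices, while the question mentions weekly ones. Here is a sketch of the same comparison for weekly names, assuming they look like logstash-2015-07 with an ISO year and week number (the exact format is not given in the question). week_key, cutoff_week_key, and weekly_index_is_old are hypothetical helpers, and cutoff_week_key relies on GNU date:

```shell
#!/bin/bash

# Print the YYYYWW key from a name ending in -YYYY-WW, or nothing if the
# name does not match that (assumed) weekly format.
week_key() {
  local re='-([0-9]{4})-([0-9]{2})$'
  if [[ $1 =~ $re ]]; then
    echo "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
  fi
}

# Print the ISO year and week (YYYYWW) of the date N weeks ago.
cutoff_week_key() {
  date --date="$1 weeks ago" +%G%V
}

# Succeed only when the index name carries a week strictly before the cutoff.
weekly_index_is_old() {
  local key
  key=$(week_key "$1")
  [ -n "$key" ] && [ "$key" -lt "$2" ]
}

# Example with a fixed cutoff; in practice use cutoff=$(cutoff_week_key 4).
cutoff=201510
for name in logstash-2015-07 logstash-2015-10 logstash-2015-12; do
  if weekly_index_is_old "$name" "$cutoff"; then
    echo "would delete: $name"
  fi
done
```

Comparing the concatenated year and week as one integer keeps the ordering correct across year boundaries, provided the index names use ISO week numbering.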

If you are using Elasticsearch version 5.x, you need to install Curator version 4.x. You can check version compatibility and the installation steps in the documentation.

Once it is installed, run the command:

curator --config path/config_file.yml [--dry-run] path/action_file.yml

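Curator provides a --dry-run flag that only reports what Curator would have done. The output goes to the log file defined in config_file.yml. If no logging key is defined in config_file.yml, Curator writes to the console. To actually delete the indices, run the command above without the --dry-run flag.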

The configuration file config_file.yml is:

---
client:
  hosts:
   - 127.0.0.1
  port: 9200
logging:
  loglevel: INFO
  logfile: "/root/curator/logs/actions.log"
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']

The action file action_file.yml is:

---
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 7 days (based on index name), for logstash-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 7
      exclude:

If you want the indices to be deleted automatically on a weekly, monthly, or other schedule, just write a bash script like this:

#!/bin/bash
# Script to run the Curator delete action from cron

# This deletes the indices selected by action_file.yml (older than 7 days)
curator --config /path/config_file.yml /path/action_file.yml

Put the shell script in one of these folders: /etc/cron.daily, /etc/cron.hourly, /etc/cron.monthly, or /etc/cron.weekly, and your job is done.

NOTE: Make sure to use correct indentation in the configuration and action files. Otherwise it will not work.

curator_cli delete_indices --filter_list '{"filtertype":"none"}' 

This deletes all indices. To filter instead, use:

 --filter_list '[{"filtertype":"age","source":"creation_date","direction":"older","unit":"days","unit_count":13},{"filtertype":"pattern","kind":"prefix","value":"logstash"}]'

You can use curl:

 curl -X DELETE http://localhost:9200/filebeat-$(date +"%Y.%m.%d" -d "last Month")

Add this command to a script such as xxx.sh, then create a crontab entry with crontab -e:

00 00 * * * /etc/elasticsearch/xxx.sh

This cron job runs every day at midnight (12 AM) and removes the old log index.
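Note that the curl one-liner names exactly one index per run, so if a nightly run is skipped, that day's index stays behind. Here is a sketch that prints DELETE commands for a window of days instead, assuming GNU date. index_name_days_ago is a hypothetical helper, the 30-to-36-day window is an arbitrary choice, and the commands are only echoed so the output can be reviewed first:

```shell
#!/bin/bash

# Print "<prefix>YYYY.MM.DD" for the day N days before a reference date
# (the reference defaults to today, in UTC).
index_name_days_ago() {
  local prefix=$1 days=$2 ref=${3:-$(date -u +%F)}
  echo "${prefix}$(date -u -d "$ref $days days ago" +%Y.%m.%d)"
}

# Cover a week-long window so a missed run is caught by the next one.
for days in $(seq 30 36); do
  echo curl -X DELETE "http://localhost:9200/$(index_name_days_ago filebeat- "$days")"
done
```

Removing the echo makes the loop issue the requests. A DELETE for an index that no longer exists returns a 404 response and leaves the other indices untouched.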

In my case, removing the old indices was mandatory because I had upgraded from 5.X to version 7.5. So I simply cleared all indices stored on the node:

rm -rf /var/lib/elasticsearch/nodes/0/indices/*

As of Elasticsearch 6.6, Index Lifecycle Management is included in the basic (free) version of Elasticsearch, and it accomplishes what Curator used to do in a more graceful way.

The steps below are reproduced without permission from Martin Ehrnhöfer's excellent and concise blog post.

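Assumptions (heads up to the copy-paster):

  • Your Elasticsearch server is accessible at http://elasticsearch:9200
  • You want your indices to be purged after thirty days (30d)
  • Your policy name will be created as cleanup_policy
  • Your filebeat index names begin with filebeat-
  • Your logstash index names begin with logstash-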

1. Create a policy that deletes indices after one month

curl -X PUT "http://elasticsearch:9200/_ilm/policy/cleanup_policy?pretty" \
     -H 'Content-Type: application/json' \
     -d '{
      "policy": {                       
        "phases": {
          "hot": {                      
            "actions": {}
          },
          "delete": {
            "min_age": "30d",           
            "actions": { "delete": {} }
          }
        }
      }
    }'

2. Apply this policy to all existing filebeat and logstash indices

curl -X PUT "http://elasticsearch:9200/logstash-*/_settings?pretty" \
     -H 'Content-Type: application/json' \
     -d '{ "lifecycle.name": "cleanup_policy" }'
curl -X PUT "http://elasticsearch:9200/filebeat-*/_settings?pretty" \
     -H 'Content-Type: application/json' \
     -d '{ "lifecycle.name": "cleanup_policy" }'

3. Create a template to apply this policy to new filebeat and logstash indices

curl -X PUT "http://elasticsearch:9200/_template/logging_policy_template?pretty" \
     -H 'Content-Type: application/json' \
     -d '{
      "index_patterns": ["filebeat-*", "logstash-*"],                 
      "settings": { "index.lifecycle.name": "cleanup_policy" }
    }'
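To check that the policy is attached, the ILM explain API (GET <index>/_ilm/explain) reports the policy and current phase for each index. A small sketch using the same assumed host; ilm_explain_url and ilm_explain are hypothetical wrapper names:

```shell
#!/bin/bash

ES_URL="http://elasticsearch:9200"   # same assumption as in the steps above

# Build the explain URL for an index pattern such as "logstash-*".
ilm_explain_url() {
  echo "${ES_URL}/$1/_ilm/explain?pretty"
}

# Query the cluster; each managed index should report the policy name
# cleanup_policy in its entry.
ilm_explain() {
  curl -s -X GET "$(ilm_explain_url "$1")"
}

# Usage (requires a reachable cluster):
#   ilm_explain "logstash-*"
#   ilm_explain "filebeat-*"
```

Keeping the index pattern quoted when calling the function stops the shell from expanding the * against local file names.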

Curator didn't help me

Curator currently gives me an error when I run it with the following command:

curator --config config_file.yml action_file.yml

Error:

Error: Elasticsearch version 7.9.1 incompatible with this version of Curator (5.2.0)

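I could not find a Curator version compatible with Elasticsearch 7.9.1, and I cannot simply upgrade or downgrade Elasticsearch. So I used @Alejandro's answer instead and did it with the script below, which I modified slightly.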

Script solution

#!/bin/bash

# Zero padded days using %d instead of %e
DAYSAGO=`date --date="30 days ago" +%Y%m%d`
ALLLINES=`/usr/bin/curl -s -XGET http://127.0.0.1:9200/_cat/indices?v`
# If your Elasticsearch requires credentials, add -u <username>:<password> to the curl calls. You can also pipe through an additional grep to select specific indices.

echo
echo "THIS IS WHAT SHOULD BE DELETED FOR ELK:"
echo

echo "$ALLLINES" | while read LINE
do
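  # Column 3 of _cat/indices is the index name; pull the YYYY.MM.DD part out of it
  FORMATEDLINE=`echo $LINE | awk '{ print $3 }' | grep -Eo "[0-9]{4}.[0-9]{2}.[0-9]{2}" | sed 's/\.//g'`
  # Skip the header row and indices without a date in their name
  if [ -n "$FORMATEDLINE" ] && [ "$FORMATEDLINE" -lt "$DAYSAGO" ]
  then
    TODELETE=`echo $LINE | awk '{ print $3 }'`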
    echo "http://127.0.0.1:9200/$TODELETE"
  fi
done

echo
echo -n "Y to continue N to exit [Y/N]:"
read INPUT
if [ "$INPUT" == "Y" ] || [ "$INPUT" == "y" ] || [ "$INPUT" == "yes" ] || [ "$INPUT" == "YES" ]
then
  echo "$ALLLINES" | while read LINE
  do
    FORMATEDLINE=`echo $LINE | awk '{ print $3 }' | grep -Eo "[0-9]{4}.[0-9]{2}.[0-9]{2}" | sed 's/\.//g'`
    if [ -n "$FORMATEDLINE" ] && [ "$FORMATEDLINE" -lt "$DAYSAGO" ]
    then
      TODELETE=`echo -n $LINE | awk '{ print $3 }'`
      /usr/bin/curl -XDELETE http://127.0.0.1:9200/$TODELETE
      sleep 1
    fi
  done
else
  echo SCRIPT CLOSED BY USER, BYE ...
  echo
  exit
fi