How to move older ClickHouse partitions to an S3 disk

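I am just starting to use ClickHouse for our in-house analytics system, but there seems to be no automated way to configure data retention policies. The only thing I have found is ALTER ... MOVE PARTITION (https://clickhouse.tech/docs/en/sql-reference/statements/alter/partition/#alter_move-partition), but it looks like that process has to be run manually or implemented in our application layer.

My objective is to move data older than 3 months directly to an S3 cluster for archival and cost reasons, while still being able to query it.

Is there a native way to do this directly in ClickHouse using storage policies?

Thanks in advance.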

This answer is based on @Denny Crane's comment pointing to https://altinity.com/blog/clickhouse-and-s3-compatible-object-storage. I added comments where the explanations were thin, and I am reproducing the steps here in case the link dies.

    1. Add your S3 disk to a new configuration file (say /etc/clickhouse-server/config.d/storage.xml):
<yandex>
  <storage_configuration>
    <disks>
      <!-- This tag is the name of your S3-emulated disk, used for the rest of this tutorial -->
      <your_s3>
        <type>s3</type>
        <!-- Set this to the endpoint of your S3-compatible provider; ClickHouse expects the URL to include the bucket and a root path -->
        <endpoint>https://nyc3.digitaloceanspaces.com</endpoint>
        <!-- Set this to your access key ID provided by your provider -->
        <access_key_id>*****</access_key_id>
        <!-- Set this to your access key Secret provided by your provider -->
        <secret_access_key>*****</secret_access_key>
      </your_s3>
    </disks>
  <!-- Don't leave this file yet! We still have things to do there -->
  ...
  </storage_configuration>
</yandex>
    2. Add a storage policy for your data store:
<!-- Put this after the three dots in the snippet above -->
<policies>
  <shared>
    <volumes>
      <default>
        <!-- default is the disk that is present in the default configuration -->
        <disk>default</disk>
      </default>
      <your_s3>
        <disk>your_s3</disk>
      </your_s3>
    </volumes>
  </shared>
</policies>
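
To check that the server picked up both pieces of configuration, you can query the system tables. This is only a minimal sketch, assuming the disk and policy names used above (your_s3 and shared) and that the server has already loaded the new file, which may require a restart depending on your version:

```sql
-- List the disks the server knows about; your_s3 should appear next to default.
SELECT name, path
FROM system.disks;

-- List the volumes of the 'shared' policy and the disks behind each one.
SELECT policy_name, volume_name, disks
FROM system.storage_policies
WHERE policy_name = 'shared';
```

If your_s3 is missing from the first result, the disk definition was not loaded, and the policy cannot reference it.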

Once that is done, you can create your table with the following statement:

CREATE TABLE visits (...)
ENGINE = MergeTree
-- MergeTree also needs an ORDER BY clause; it is elided here along with the columns
TTL toStartOfYear(time) + INTERVAL 3 YEAR TO VOLUME 'your_s3'
SETTINGS storage_policy = 'shared';

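Here shared is the name of your policy, and your_s3 is the name of the volume in that policy (in this example it has the same name as the disk).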
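
The question asks for a three-month cutoff rather than three years. Here is a sketch of how that could look on an existing table, assuming `time` is a Date or DateTime column and the table already uses the shared policy. The partition ID '202101' is a hypothetical example, not something taken from the original setup:

```sql
-- Move a part to the S3 volume once all of its rows are older than 3 months.
ALTER TABLE visits
    MODIFY TTL time + INTERVAL 3 MONTH TO VOLUME 'your_s3';

-- See which disk each active part currently sits on.
SELECT partition, name, disk_name
FROM system.parts
WHERE table = 'visits' AND active;

-- Manual alternative mentioned in the question: move one partition explicitly.
ALTER TABLE visits MOVE PARTITION ID '202101' TO VOLUME 'your_s3';
```

TTL moves run in the background, so parts may not change disk immediately after the ALTER. The system.parts query is a simple way to watch them migrate, and the data stays queryable through the same table either way.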