DB folder utilising a lot of space, creating a space issue

Question

I have a Grafana Windows server where we have integrated HyperV snapshot-related information as well as the CPU and memory usage of the HV, etc. I can see the following folder on our Grafana Windows server:

C:\InfluxDB\data\telegraf\autogen

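Under this autogen folder, I can see multiple subfolders containing .tsm files. A new one is created every 7 days, and each folder is around 4 to 5 GB. From 2 February 2017 to 14 March 2018 there are many files in this autogen folder, using about 225 GB of space.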

Answer 1

What you are seeing: autogen is the default Retention Policy (RP), auto-created by InfluxDB, and it has an infinite data retention duration. All datapoints in Influx are logically stored in shards. Physically, shard data is compressed and stored in .tsm files. Shards are unified into shard groups. Each shard group covers a specific time range, defined by the so-called shard duration, and stores the datapoints belonging to that time interval. By default, for a retention duration > 6 months the shard group duration is set to 7 days.
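As a rough sanity check of the numbers in the question, the sketch below estimates how many 7-day shard groups cover the reported date range and what that implies for disk usage. It assumes one shard group per 7 days and the 4 to 5 GB per folder quoted by the asker, and it ignores how InfluxDB aligns shard group boundaries; the function name is made up for illustration.

```python
from datetime import date
from math import ceil

SHARD_GROUP_DAYS = 7  # default shard group duration quoted in the answer


def estimate_shard_groups(first_day, last_day, shard_days=SHARD_GROUP_DAYS):
    """Approximate number of shard groups needed to cover the date range.

    Assumption: groups are counted from first_day, ignoring boundary alignment.
    """
    span_days = (last_day - first_day).days
    return ceil(span_days / shard_days)


# Date range and per-folder sizes are taken from the question.
groups = estimate_shard_groups(date(2017, 2, 2), date(2018, 3, 14))
low_gb, high_gb = groups * 4, groups * 5
print(f"{groups} shard groups, roughly {low_gb} to {high_gb} GB")
```

About 58 weekly shard groups at roughly 4 GB each is in the same ballpark as the 225 GB reported, which fits the picture of an infinite retention policy that simply keeps accumulating one shard group per week.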

For details, see the documentation on the storage engine.

Regarding your questions:

"Is there anyway we can shrink the size of autogen file?"
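Probably not. The only thing you can do is rely on InfluxDB's internal compression. Here they say that it may improve if you increase the shard duration.
*Although, because InfluxDB drops whole shards rather than individual datapoints, increasing the shard duration means your data is kept until the whole shard falls outside the current retention period, and only then is it dropped. If you have an infinite retention duration, though, that does not matter. This leads to the second question.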
"Is it possible to delete the old file under autogen folder?"
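If you can afford to lose old data, or cannot afford that much storage space, InfluxDB lets you specify a data retention policy (RP), as already mentioned above. Basically, all your measurements are associated with a specific RP, and data is deleted once the retention duration is over. So if you specify an RP of 1 year, InfluxDB will automatically delete all datapoints older than now() - 1 year. RPs are the standard (and pretty obvious) way of dealing with storage issues. A logical continuation of the RP idea is to group and aggregate your data over longer discrete time intervals (downsampling). In Influx this can be achieved with Continuous Queries (CQ). You can read more about data retention and downsampling here.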
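To get a feel for what a 1-year RP would reclaim in the asker's case, here is a minimal sketch that lists which 7-day shard groups lie entirely before the now() - retention cut-off. It only models the rule described above (whole shard groups are dropped, not single points); the function name is hypothetical, and the groups are assumed to start on the first date mentioned in the question rather than on InfluxDB's real boundaries.

```python
from datetime import date, timedelta


def expired_shard_groups(first_day, now, retention_days, shard_days=7):
    """Return (start, end) ranges of shard groups that end at or before the cut-off.

    Assumption: a group becomes droppable only once its whole range is expired.
    """
    cutoff = now - timedelta(days=retention_days)
    expired = []
    start = first_day
    while start + timedelta(days=shard_days) <= cutoff:
        end = start + timedelta(days=shard_days)
        expired.append((start, end))
        start = end
    return expired


# Dates from the question; 365 days stands in for a 1-year retention duration.
expired = expired_shard_groups(date(2017, 2, 2), date(2018, 3, 14), retention_days=365)
print(len(expired), "shard groups fall entirely outside a 1-year retention window")
```

Under these assumptions only the first handful of weekly groups would be dropped, since most of the data is less than a year old. A shorter retention duration (pass a smaller `retention_days`) would be needed to free a substantial share of the 225 GB.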

To sum up, storage limits are unavoidable, and configuring retention policies properly is the way to go.
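As a sketch of what such a configuration could look like, the snippet below only builds InfluxQL statements as strings so they can be reviewed before being pasted into the influx shell. The database name `telegraf` and the RP name `autogen` are inferred from the folder path in the question; the names `downsampled` and `cq_hourly_mean`, the 52w/1w/1h durations, and the helper functions are assumptions chosen for illustration. Note that shortening the duration of autogen makes InfluxDB drop data older than that duration.

```python
def alter_retention_policy(db, rp, duration, shard_duration):
    """Build an ALTER RETENTION POLICY statement (InfluxQL, InfluxDB 1.x)."""
    return (f'ALTER RETENTION POLICY "{rp}" ON "{db}" '
            f'DURATION {duration} SHARD DURATION {shard_duration}')


def create_retention_policy(db, rp, duration):
    """Build a CREATE RETENTION POLICY statement for a single-node setup."""
    return f'CREATE RETENTION POLICY "{rp}" ON "{db}" DURATION {duration} REPLICATION 1'


def create_downsampling_cq(db, cq_name, target_rp, interval):
    """Build a continuous query that averages every measurement per interval.

    Assumption: mean(*) is acceptable, i.e. only numeric fields need to be kept.
    """
    return (f'CREATE CONTINUOUS QUERY "{cq_name}" ON "{db}" BEGIN '
            f'SELECT mean(*) INTO "{db}"."{target_rp}".:MEASUREMENT '
            f'FROM /.*/ GROUP BY time({interval}), * END')


statements = [
    # Hypothetical long-term RP that keeps the downsampled data forever.
    create_retention_policy("telegraf", "downsampled", "INF"),
    # Hypothetical hourly averages written into that RP.
    create_downsampling_cq("telegraf", "cq_hourly_mean", "downsampled", "1h"),
    # Keep raw data for about one year instead of forever.
    alter_retention_policy("telegraf", "autogen", "52w", "1w"),
]
for statement in statements:
    print(statement)
```

The order matters in this sketch: the downsampling target and the continuous query are set up first, so that aggregated data is already being written before the raw data starts to expire.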


influxdb

grafana