Can we configure MarkLogic database backup on an S3 bucket?

I need to configure MarkLogic full/incremental backups to an S3 bucket. Is that possible? Can anyone share documents or steps for configuring this?

Thanks!

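Yes, you can back up to S3.

You need to configure S3 credentials so that MarkLogic can use S3 and read/write objects in your S3 bucket.

MarkLogic cannot use S3 for the journal archive path, because S3 does not support file append operations. So if you want to enable journal archiving, you need to specify a custom path for it when you create the backup.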

Backing Up a Database

The directory you specified can be an operating system mounted directory path, it can be an HDFS path, or it can be an S3 path. For details on using HDFS and S3 storage in MarkLogic, see Disk Storage Considerations in the Query Performance and Tuning Guide.

S3 Storage

S3 requires authentication with the following S3 credentials:

  • AWS Access Key
  • AWS Secret Key

The S3 credentials for a MarkLogic cluster are stored in the security database for the cluster. You can only have one set of S3 credentials per cluster. Depending on how you set up security access in S3, you can access any paths that those credentials are allowed to access. Because of the flexibility of how you can set up access in S3, you can set up any S3 account to allow access to any other account, so if you want to allow the credentials you have set up in MarkLogic to access S3 paths owned by other S3 users, those users need to grant access to those paths to the AWS Access Key set up in your MarkLogic cluster.

To set up the AWS credentials for a cluster, enter the keys in the Admin Interface under Security > Credentials. You can also set up the keys programmatically using the following Security API functions:

  • sec:credentials-get-aws
  • sec:credentials-set-aws

The credentials are stored in the Security database. Therefore, you cannot use S3 as the forest storage for a security database.
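As a minimal sketch of the programmatic route: this assumes the two-argument form of sec:credentials-set-aws (access key first, then secret key; check the signature for your MarkLogic version) and that the query is evaluated against the Security database. The key values are placeholders.

```xquery
xquery version "1.0-ml";

(: Evaluate this against the Security database. :)
import module namespace sec = "http://marklogic.com/xdmp/security"
  at "/MarkLogic/security.xqy";

(: Placeholder values, not real keys. Assumes the two-argument form. :)
sec:credentials-set-aws("YOUR-AWS-ACCESS-KEY", "YOUR-AWS-SECRET-KEY")
```

Because a cluster holds only one set of S3 credentials, setting new keys this way replaces whatever was stored before.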

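If you want to enable journal archiving, you need to write the journals to a different location. S3 is not supported for journal archiving.

The default location for the journals is inside the backup directory, but when creating the backup programmatically you can specify a different $journal-archive-path.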
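A rough sketch of a full backup to S3 with journal archiving sent elsewhere, using xdmp:database-backup. The database name "Documents", the bucket "my-backup-bucket", and the local directory "/backups/journals" are hypothetical; the argument order (forest IDs, backup path, journal-archiving flag, journal archive path) is as I understand the function, so verify it against the documentation for your version.

```xquery
xquery version "1.0-ml";

(: Hypothetical names: database "Documents", bucket "my-backup-bucket",
   and a non-S3 journal directory "/backups/journals". :)
xdmp:database-backup(
  (: back up every forest attached to the database :)
  xdmp:database-forests(xdmp:database("Documents")),
  (: backup directory on S3 :)
  "s3://my-backup-bucket/marklogic/Documents",
  (: enable journal archiving :)
  fn:true(),
  (: journal archive path must not be on S3 :)
  "/backups/journals"
)
```

The call returns a job ID for the backup. If you do not need journal archiving, the first two arguments alone should be enough. For incremental backups there is a separate function, xdmp:database-incremental-backup, which takes the same forest IDs and backup path.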

S3 and MarkLogic

Storage on S3 has an 'eventual consistency' property, meaning that write operations might not be available immediately for reading, but they will be available at some point. Because of this, S3 data directories in MarkLogic have a restriction that MarkLogic does not create Journals on S3. Therefore, MarkLogic recommends that you use S3 only for backups and for read-only forests, otherwise you risk the possibility of data loss. If your forests are read-only, then there is no need to have journals.