How can I set "Create a single schema for each S3 path" in CloudFormation?
I want to create a crawler resource from CFN (CloudFormation).
Here is my code:
Type: AWS::Glue::Crawler
Properties:
  Name: !Ref GlueCrawlerName
  Role: !GetAtt crawlerRole.Arn
  Description: AWS Glue crawler to crawl DLG data
  DatabaseName: !Ref GlueDatabaseName
  Targets:
    S3Targets:
      - Path:
          !Join
            - ''
            - - 's3://'
              - !Ref s3bucket
              - '/'
              - !Ref GlueTableName
  SchemaChangePolicy:
    UpdateBehavior: UPDATE_IN_DATABASE
    DeleteBehavior: DEPRECATE_IN_DATABASE
  Schedule:
    ScheduleExpression: cron(0 1 * * ? 2019)
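Everything works, except that 'Create a single schema for each S3 path' is false. Which property sets it to true?
Do you need one table per subfolder, or just one table at the root level of the S3 path?
For a single root-level table, add the following to the crawler's Properties in your CFN template: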
Configuration: "{\"Version\":1.0,\"Grouping\":{\"TableGroupingPolicy\":\"CombineCompatibleSchemas\"}}"
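For placement, `Configuration` sits directly under `Properties`, next to `Targets` and `SchemaChangePolicy`. A minimal sketch that reuses the names from the question and abbreviates the properties that do not change (the short-form `!Join` is only a more compact spelling of the same path):

```yaml
Type: AWS::Glue::Crawler
Properties:
  Name: !Ref GlueCrawlerName
  Role: !GetAtt crawlerRole.Arn
  DatabaseName: !Ref GlueDatabaseName
  Targets:
    S3Targets:
      - Path: !Join ['', ['s3://', !Ref s3bucket, '/', !Ref GlueTableName]]
  # Configuration is a JSON document passed as a string, hence the escaped quotes.
  Configuration: "{\"Version\":1.0,\"Grouping\":{\"TableGroupingPolicy\":\"CombineCompatibleSchemas\"}}"
  # SchemaChangePolicy and Schedule stay as in the question.
```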
This may help. According to the AWS documentation:
Set the Configuration field with a string representation of the
following JSON object in the crawler API:
{
  "Version": 1.0,
  "Grouping": {
    "TableGroupingPolicy": "CombineCompatibleSchemas"
  }
}
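If the backslash escaping is hard to read, the same JSON string can be written as a single-quoted YAML scalar, where double quotes need no escaping. This is a sketch of an equivalent spelling in a YAML template, not wording taken from the AWS documentation:

```yaml
Properties:
  # Single-quoted YAML scalar: the inner double quotes are literal characters.
  Configuration: '{"Version":1.0,"Grouping":{"TableGroupingPolicy":"CombineCompatibleSchemas"}}'
```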