Export text output into CSV format ready for insert into databases using PowerShell

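I want to pipe the aws cli output that appears on my screen, as text output in a PowerShell session, into a text file in CSV format.

I have researched the Export-Csv cmdlet from the article below: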

https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/export-csv?view=powershell-7.1

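I could not see how to use it to achieve my goal. From my testing, it seems to work only with specific Windows programs and not with general text output.

An article on this site shows how to achieve my goal with Unix commands by replacing whitespace with commas.

The Unix answer is to use sed at the end of the command, like so: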

aws rds describe-db-instance-automated-backups --query 'DBInstanceAutomatedBackups[*].{ARN:DBInstanceArn,EarliestTime:RestoreWindow.EarliestTime,LatestTime:RestoreWindow.LatestTime}' --output text | sed -E 's/\s+/,/g'

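Export-Csv does not appear to be able to do this.

Does anyone know how I could replicate what sed does here using PowerShell?

Here is a sample of the output that I want in CSV format: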

arn:aws:rds:ap-southwest-2:9711387875370:db:catflow--prod     2019-03-03T09:54:29.402Z        2019-03-05T01:25:53Z
arn:aws:rds:ap-southwest-2:9711387875370:db:xyz-prod-rds-golf    2019-03-01T09:04:31.477Z        2019-03-05T01:28:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-asm-prod-rds-stardb   2019-02-01T09:07:30.648Z        2019-03-05T01:27
:20Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-asm-prod-rds-domaindb    2019-02-02T09:04:30.771Z        2019-03-05T01:28
:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-ctz-prod-rds-datavault   2019-02-26T14:14:30.254Z        2019-03-05T01:29
:13Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-gcp-prod-rds-datavault   2019-02-01T14:05:40.456Z        2019-03-05T01:31
:05Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-conformed-datavault-prod    2019-02-02T14:06:26.050Z        2019-03-
05T01:27:02Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-dqm-datavault-prod  2019-02-01T14:12:05.286Z        2019-03-05T01:26
:53Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-prod-dgc-cde-lineage 2019-03-02T09:54:29.053Z        2019-03-05T01:29
:11Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-rec-prod     2019-02-02T22:09:00.673Z        2019-03-05T01:29:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-serve-prod       2019-03-02T09:54:20.729Z        2019-03-05T01:30:21Z

Let's assume the returned data looks like this mock-up (it is formatted oddly in the question):

$awsReturn = @"
arn:aws:rds:ap-southwest-2:9711387875370:db:catflow--prod     2019-03-03T09:54:29.402Z        2019-03-05T01:25:53Z
arn:aws:rds:ap-southwest-2:9711387875370:db:xyz-prod-rds-golf    2019-03-01T09:04:31.477Z        2019-03-05T01:28:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-asm-prod-rds-stardb   2019-02-01T09:07:30.648Z        2019-03-05T01:27:20Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-asm-prod-rds-domaindb    2019-02-02T09:04:30.771Z        2019-03-05T01:28:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-ctz-prod-rds-datavault   2019-02-26T14:14:30.254Z        2019-03-05T01:29:13Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-gcp-prod-rds-datavault   2019-02-01T14:05:40.456Z        2019-03-05T01:31:05Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-conformed-datavault-prod    2019-02-02T14:06:26.050Z        2019-03-05T01:27:02Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-dqm-datavault-prod  2019-02-01T14:12:05.286Z        2019-03-05T01:26:53Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-prod-dgc-cde-lineage 2019-03-02T09:54:29.053Z        2019-03-05T01:29:11Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-rec-prod     2019-02-02T22:09:00.673Z        2019-03-05T01:29:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-serve-prod       2019-03-02T09:54:20.729Z        2019-03-05T01:30:21Z
"@

Then you can do this:

# Since I don't know if that is one single string or a string array:
if ($awsReturn -isnot [array]) { $awsReturn = $awsReturn -split '\r?\n' }

# write it to csv file
$awsReturn -replace '\s+', ',' | Set-Content -Path 'WhereEver.csv' -PassThru  # PassThru also displays on screen

This gives you a file that is usable as CSV (although it has no headers or quoted fields).
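If the missing header line is the only problem, you can prepend one before writing the file. This is a minimal sketch, not part of the original approach: the column names and the output path are hypothetical, and the two sample lines stand in for the real aws cli output.

```powershell
# Two sample lines standing in for the aws cli text output (values from the mock-up)
$lines = @(
    'arn:aws:rds:ap-southwest-2:9711387875370:db:catflow--prod     2019-03-03T09:54:29.402Z        2019-03-05T01:25:53Z'
    'arn:aws:rds:ap-southwest-2:9711387875370:db:xyz-prod-rds-golf    2019-03-01T09:04:31.477Z        2019-03-05T01:28:40Z'
)

# Hypothetical column names; use whatever matches your database table
$header = 'Prod,DateStart,DateEnd'

# Trim first so leading or trailing whitespace does not turn into a stray comma
$csvLines = @($header) + ($lines | ForEach-Object { $_.Trim() -replace '\s+', ',' })

# Hypothetical output path in the temp folder
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'WithHeader.csv'
$csvLines | Set-Content -Path $path
```

The fields are still unquoted, which is fine here because none of the values contain a comma.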


If you want to use Export-Csv to get a CSV file with headers and quoted fields, you need to split the lines and output objects.

Like this:

# Since I don't know if that is one single string or a string array:
if ($awsReturn -isnot [array]) { $awsReturn = $awsReturn -split '\r?\n' }

# split each line into fields and write the objects to a csv file (with headers and quoted values)
$awsReturn | ForEach-Object {
    $data = $_ -split '\s+'  # in this case we know we have 3 fields
    [PsCustomObject]@{
        Prod      = $data[0]
        DateStart = $data[1]
        DateEnd   = $data[2]
    }
} | Export-Csv -Path 'WhereEver.csv' -NoTypeInformation

The WhereEver.csv file will look like this:

"Prod","DateStart","DateEnd"
"arn:aws:rds:ap-southwest-2:9711387875370:db:catflow--prod","2019-03-03T09:54:29.402Z","2019-03-05T01:25:53Z"
"arn:aws:rds:ap-southwest-2:9711387875370:db:xyz-prod-rds-golf","2019-03-01T09:04:31.477Z","2019-03-05T01:28:40Z"
"arn:aws:rds:ap-southwest-2:9711387875370:db:-asm-prod-rds-stardb","2019-02-01T09:07:30.648Z","2019-03-05T01:27:20Z"
"arn:aws:rds:ap-southwest-2:9711387875370:db:-asm-prod-rds-domaindb","2019-02-02T09:04:30.771Z","2019-03-05T01:28:40Z"
"arn:aws:rds:ap-southwest-2:9711387875370:db:-ctz-prod-rds-datavault","2019-02-26T14:14:30.254Z","2019-03-05T01:29:13Z"
"arn:aws:rds:ap-southwest-2:9711387875370:db:-gcp-prod-rds-datavault","2019-02-01T14:05:40.456Z","2019-03-05T01:31:05Z"
"arn:aws:rds:ap-southwest-2:9711387875370:db:prod-conformed-datavault-prod","2019-02-02T14:06:26.050Z","2019-03-05T01:27:02Z"
"arn:aws:rds:ap-southwest-2:9711387875370:db:prod-dqm-datavault-prod","2019-02-01T14:12:05.286Z","2019-03-05T01:26:53Z"
"arn:aws:rds:ap-southwest-2:9711387875370:db:prod-prod-dgc-cde-lineage","2019-03-02T09:54:29.053Z","2019-03-05T01:29:11Z"
"arn:aws:rds:ap-southwest-2:9711387875370:db:prod-rec-prod","2019-02-02T22:09:00.673Z","2019-03-05T01:29:40Z"
"arn:aws:rds:ap-southwest-2:9711387875370:db:-serve-prod","2019-03-02T09:54:20.729Z","2019-03-05T01:30:21Z"

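You may be dealing with a tab-delimited text file without headers. Tab delimiters can look like multiple spaces when displayed on screen.

If that is the case, you can actually read this file with Import-Csv, but you must use the -Header parameter to supply your own field names and the -Delimiter parameter to specify a tab as the delimiter. The tab character has to be specified using the backtick escape mechanism ("`t").

For details, see the accepted answer to this question.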
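As a rough sketch of the file-based variant, assuming the aws cli output was first saved as a tab-delimited file (the path and the sample rows here are made up for illustration):

```powershell
# Build a small tab-delimited sample file (hypothetical path in the temp folder)
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'aws-backups.tsv'
@(
    "arn:aws:rds:ap-southwest-2:9711387875370:db:catflow--prod`t2019-03-03T09:54:29.402Z`t2019-03-05T01:25:53Z"
    "arn:aws:rds:ap-southwest-2:9711387875370:db:xyz-prod-rds-golf`t2019-03-01T09:04:31.477Z`t2019-03-05T01:28:40Z"
) | Set-Content -Path $path

# -Header supplies the field names; "`t" is the backtick-escaped tab character
$rows = @(Import-Csv -Path $path -Header 'Prod', 'DateStart', 'DateEnd' -Delimiter "`t")

# Re-export as a regular comma-separated file with headers and quoted fields
$csvPath = $path -replace '\.tsv$', '.csv'
$rows | Export-Csv -Path $csvPath -NoTypeInformation
```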

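There is another option if you have control over your data feed. The aws cli has an option to format its output as JSON (--output json). That format will be much easier to import into PowerShell in a form you can use.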
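Here is a minimal sketch of the JSON route. The sample JSON is an assumption about what the question's --query would return with --output json (an array of objects with ARN, EarliestTime and LatestTime keys); in real use you would capture the aws cli output instead of the hard-coded string.

```powershell
# Assumed shape of the JSON output; in real use something like:
#   $json = aws rds describe-db-instance-automated-backups --query '...' --output json | Out-String
$json = @'
[
    {
        "ARN": "arn:aws:rds:ap-southwest-2:9711387875370:db:catflow--prod",
        "EarliestTime": "2019-03-03T09:54:29.402Z",
        "LatestTime": "2019-03-05T01:25:53Z"
    },
    {
        "ARN": "arn:aws:rds:ap-southwest-2:9711387875370:db:xyz-prod-rds-golf",
        "EarliestTime": "2019-03-01T09:04:31.477Z",
        "LatestTime": "2019-03-05T01:28:40Z"
    }
]
'@

# Assign first, then pipe, so the array is enumerated row by row in every PowerShell version
$rows = $json | ConvertFrom-Json

# ConvertTo-Csv returns the CSV lines; use Export-Csv -Path ... to write a file instead
$csv = $rows | Select-Object ARN, EarliestTime, LatestTime | ConvertTo-Csv -NoTypeInformation
```

One thing to watch: depending on the PowerShell version, ConvertFrom-Json may turn the timestamp strings into [datetime] values, so check how the dates look in the resulting CSV before loading it into a database.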

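Edit:

The following script uses the mock-up provided by Theo, except that the runs of spaces have been replaced by tabs. It uses ConvertFrom-Csv instead of Import-Csv, but it's the same idea: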

$awsReturn = @"
arn:aws:rds:ap-southwest-2:9711387875370:db:catflow--prod`t2019-03-03T09:54:29.402Z`t2019-03-05T01:25:53Z
arn:aws:rds:ap-southwest-2:9711387875370:db:xyz-prod-rds-golf`t2019-03-01T09:04:31.477Z`t2019-03-05T01:28:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-asm-prod-rds-stardb`t2019-02-01T09:07:30.648Z`t2019-03-05T01:27:20Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-asm-prod-rds-domaindb`t2019-02-02T09:04:30.771Z`t2019-03-05T01:28:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-ctz-prod-rds-datavault`t2019-02-26T14:14:30.254Z`t2019-03-05T01:29:13Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-gcp-prod-rds-datavault`t2019-02-01T14:05:40.456Z`t2019-03-05T01:31:05Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-conformed-datavault-prod`t2019-02-02T14:06:26.050Z`t2019-03-05T01:27:02Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-dqm-datavault-prod`t2019-02-01T14:12:05.286Z`t2019-03-05T01:26:53Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-prod-dgc-cde-lineage`t2019-03-02T09:54:29.053Z`t2019-03-05T01:29:11Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-rec-prod`t2019-02-02T22:09:00.673Z`t2019-03-05T01:29:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-serve-prod`t2019-03-02T09:54:20.729Z`t2019-03-05T01:30:21Z
"@


$myarray = $awsreturn | ConvertFrom-Csv -header "Prod","DateStart","DateEnd" -delimiter "`t"

$myarray | Format-Table

$myarray | gm

When I ran it in my environment, it produced the following:

Prod                                                                      DateStart                DateEnd             
----                                                                      ---------                -------             
arn:aws:rds:ap-southwest-2:9711387875370:db:catflow--prod                 2019-03-03T09:54:29.402Z 2019-03-05T01:25:53Z
arn:aws:rds:ap-southwest-2:9711387875370:db:xyz-prod-rds-golf             2019-03-01T09:04:31.477Z 2019-03-05T01:28:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-asm-prod-rds-stardb          2019-02-01T09:07:30.648Z 2019-03-05T01:27:20Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-asm-prod-rds-domaindb        2019-02-02T09:04:30.771Z 2019-03-05T01:28:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-ctz-prod-rds-datavault       2019-02-26T14:14:30.254Z 2019-03-05T01:29:13Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-gcp-prod-rds-datavault       2019-02-01T14:05:40.456Z 2019-03-05T01:31:05Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-conformed-datavault-prod 2019-02-02T14:06:26.050Z 2019-03-05T01:27:02Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-dqm-datavault-prod       2019-02-01T14:12:05.286Z 2019-03-05T01:26:53Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-prod-dgc-cde-lineage     2019-03-02T09:54:29.053Z 2019-03-05T01:29:11Z
arn:aws:rds:ap-southwest-2:9711387875370:db:prod-rec-prod                 2019-02-02T22:09:00.673Z 2019-03-05T01:29:40Z
arn:aws:rds:ap-southwest-2:9711387875370:db:-serve-prod                   2019-03-02T09:54:20.729Z 2019-03-05T01:30:21Z




   TypeName: System.Management.Automation.PSCustomObject

Name        MemberType   Definition                                                           
----        ----------   ----------                                                           
Equals      Method       bool Equals(System.Object obj)                                       
GetHashCode Method       int GetHashCode()                                                    
GetType     Method       type GetType()                                                       
ToString    Method       string ToString()                                                    
DateEnd     NoteProperty string DateEnd=2019-03-05T01:25:53Z                                  
DateStart   NoteProperty string DateStart=2019-03-03T09:54:29.402Z                            
Prod        NoteProperty string Prod=arn:aws:rds:ap-southwest-2:9711387875370:db:catflow--prod
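
As the Get-Member output shows, every property is a plain string. If the database insert needs real date values, one possible way to convert them is sketched below. It assumes UTC DateTime values are what you want; the sample row only stands in for the objects produced above.

```powershell
# One row shaped like the objects ConvertFrom-Csv produced (all properties are strings)
$row = [PSCustomObject]@{
    Prod      = 'arn:aws:rds:ap-southwest-2:9711387875370:db:catflow--prod'
    DateStart = '2019-03-03T09:54:29.402Z'
    DateEnd   = '2019-03-05T01:25:53Z'
}

# Parse culture-independently; DateTimeOffset honours the trailing Z (UTC) marker
$culture = [System.Globalization.CultureInfo]::InvariantCulture

# Calculated properties replace the string dates with UTC DateTime values
$typed = $row | Select-Object Prod,
    @{ Name = 'DateStart'; Expression = { [DateTimeOffset]::Parse($_.DateStart, $culture).UtcDateTime } },
    @{ Name = 'DateEnd';   Expression = { [DateTimeOffset]::Parse($_.DateEnd, $culture).UtcDateTime } }
```

Going through DateTimeOffset rather than DateTime avoids the silent conversion to local time that parsing a "Z" timestamp as DateTime would otherwise do.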