Parse Structured Text File into CSV
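I'm new to PowerShell and have searched and searched, but I can't find a solution for what I'm trying to do. I want to monitor a set of folders and, if any text file changes, insert that file's data into a CSV file. The text files always have the structure shown below, but the values (after the colon) can vary...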
Type: 1 Red/1 Blue
SecondaryType:
Keywords:
Area: 150
Length: 28
Width: 22
System: 55.5cm
DateTime: 5/5/2017 10:06:38 PM
UserName: bgates
Platform: Major Platform 2017
CustomIdentifier: 1.11.0645.1330
Version: 14.116.65557.111
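After each colon there is a tab, and then there may or may not be a value.
I've pieced together some code. It exports a CSV, but the parsing is incorrect, I think because of the tabs and the frequently missing data. Here is my code: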
$watcher = New-Object System.IO.FileSystemWatcher
$watcher.Path = "C:\Users\username\Desktop_InProgress"
$watcher.Filter = "*.txt"
$watcher.IncludeSubdirectories = $true
$watcher.EnableRaisingEvents = $true
$action = { $path = $Event.SourceEventArgs.FullPath
$changeType = $Event.SourceEventArgs.ChangeType
$logline = "$(Get-Date), $changeType, $path"
(Get-Content $path) -join "`r`n" -Split "(?m)^(?=\S)" |
Where{$_} |
ForEach{
Clear-Variable PrimaryType,SecondaryType,Keywords,Area,Length,Width,System,DateTime,Username,Platform,CustomIdentifier,Version
Switch -regex ($_ -split "`r`n"){
"PrimaryType:" {$PrimaryType = ($_ -split ':',2)[-1].trim();Continue}
"SecondaryType:" {$SecondaryType = ($_ -split ':',2)[-1].trim();Continue}
"Keywords:" {$Keywords = ($_ -split ':',2)[-1].trim();Continue}
"Area:" {$Area = ($_ -split ':',2)[-1].trim();Continue}
"Length:" {$Length = ($_ -split ':',2)[-1].trim();Continue}
"Width:" {$Width = ($_ -split ':',2)[-1].trim();Continue}
"System:" {$System = ($_ -split ':',2)[-1].trim();Continue}
"DateTime:" {$DateTime = ($_ -split ':',2)[-1].trim();Continue}
"Username:" {$Username = ($_ -split ':',2)[-1].trim();Continue}
"Platform:" {$Platform = ($_ -split ':',2)[-1].trim();Continue}
"CustomIdentifier:" {$CustomIdentifier = ($_ -split ':',2)[-1].trim();Continue}
"Version:" {$Version = ($_ -split ':',2)[-1].trim();Continue}
}
[PSCustomObject]@{
'PrimaryType' = $PrimaryType
'SecondaryType' = $SecondaryType
'Keywords' = $Keywords
'Area' = $Area
'Length' = $Length
'Width' = $Width
'System' = $System
'DateTime' = $DateTime
'Username' = $Username
'Platform' = $Platform
'CustomIdentifier' = $CustomIdentifier
'Version' = $Version }
$Files | ForEach{ [PSCustomObject]@{'PrimaryType' = $PrimaryType; 'SecondaryType' = $SecondaryType; 'Keywords' = $Keywords; 'Area' = $Area; 'Length' = $Length; 'Width' = $Width; 'System' = $System; 'DateTime' = $DateTime; 'Username' = $Username; 'Platform' = $Platform; 'CustomIdentifier' = $CustomIdentifier; 'Version' = $Version}}
} | Export-Csv -path "C:\Users\username\Desktop\Smart Scrape\test.csv" -NoTypeInformation
###Add-content "C:\Users\username\Desktop\Smart Scrape\log.txt" -value $logline
}
Register-ObjectEvent $watcher "Created" -Action $action
Register-ObjectEvent $watcher "Changed" -Action $action
Register-ObjectEvent $watcher "Deleted" -Action $action
Register-ObjectEvent $watcher "Renamed" -Action $action
while ($true) {sleep 5}
A simple parser for one of these files could be:
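$Entries = [ordered]@{}
# Skip blank lines and lines without a colon, which would leave $Value as $null
Get-Content $Path | Where-Object { $_ -match ':' } | ForEach-Object {
    # Split on the first colon only, so the time in DateTime stays intact
    $Key, $Value = $_ -split ':', 2
    $Entries[$Key.Trim()] = $Value.Trim()
}
[PSCustomObject]$Entries | Export-Csv -Append -Path "C:\Users\jrooker\Desktop\SmartPlan Scrape\test.csv" -NoTypeInformation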
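It simply splits each line on the colon, into at most two parts so that the DateTime value, which contains colons of its own, is not broken up. It then takes the left and right parts, trims them (which also removes the tab), stores them in an ordered hashtable (dictionary), and casts that to a PSCustomObject for output.
It doesn't know what the field names are, and it doesn't look like it needs to.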
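To make the explanation above concrete, here is a minimal sketch of the same idea wrapped in a function and run against a few sample lines instead of a file. The function name `ConvertFrom-KeyValueText` and the `$sample` data are assumptions made for illustration; they are not part of the original question or answer.

```powershell
# Hypothetical helper: turns the lines of one "Key:<tab>Value" file into a single object.
function ConvertFrom-KeyValueText {
    param([string[]]$Line)

    $entries = [ordered]@{}
    foreach ($item in $Line) {
        # Skip blank lines and anything without a colon.
        if ($item -notmatch ':') { continue }

        # Split on the first colon only, so "10:06:38 PM" stays in one piece.
        $key, $value = $item -split ':', 2

        # Trim() removes the tab after the colon; a missing value becomes an empty string.
        $entries[$key.Trim()] = $value.Trim()
    }
    [PSCustomObject]$entries
}

# Sample lines modelled on the question's file layout (tab after each colon).
$sample = @(
    "Type:`t1 Red/1 Blue"
    "SecondaryType:`t"
    ""
    "DateTime:`t5/5/2017 10:06:38 PM"
    "UserName:`tbgates"
)

$record = ConvertFrom-KeyValueText -Line $sample
$record | ConvertTo-Csv -NoTypeInformation
```

Because the property names come from the file itself, the object has one property per key found, in file order, and empty fields such as SecondaryType are kept as empty columns rather than dropped.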
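Try this. Note that it writes one Name/Value row per line of the input file rather than one row per file, and that a value which itself contains colons, such as the DateTime field, is split further so only its first part is kept: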
get-content $Path | ConvertFrom-Csv -Delimiter ":" -Header Name, Value | export-csv "C:\Users\jrooker\Desktop\SmartPlan Scrape\test.csv" -notype -append
Short version with aliases:
gc $Path | ConvertFrom-Csv -Del ":" -h Name, Value | epcsv "C:\Users\jrooker\Desktop\SmartPlan Scrape\test.csv" -not -a
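Coming back to the original goal of monitoring folders, a parser like the ones above still has to be called whenever a file changes. The following is only a sketch of one way to wire that up, assuming events are queued and read with Wait-Event instead of being handled in an -Action script block. The function names `Add-RecordToCsv` and `Watch-RecordFolder` and the `Record*` source identifiers are made up for this example.

```powershell
# Hypothetical helper: parse one "Key:<tab>Value" file and append it to the CSV as one row.
function Add-RecordToCsv {
    param([string]$Path, [string]$CsvPath)

    $entries = [ordered]@{}
    foreach ($line in Get-Content -LiteralPath $Path -ErrorAction Stop) {
        if ($line -notmatch ':') { continue }      # skip blank or malformed lines
        $key, $value = $line -split ':', 2         # split on the first colon only
        $entries[$key.Trim()] = $value.Trim()
    }
    if ($entries.Count -gt 0) {
        [PSCustomObject]$entries |
            Export-Csv -LiteralPath $CsvPath -Append -NoTypeInformation
    }
}

# Hypothetical helper: watch a folder tree and append a row for each created or changed .txt file.
function Watch-RecordFolder {
    param([string]$Folder, [string]$CsvPath)

    $watcher = New-Object System.IO.FileSystemWatcher
    $watcher.Path = $Folder
    $watcher.Filter = '*.txt'
    $watcher.IncludeSubdirectories = $true

    # "Deleted" is left out on purpose: there is nothing to read once the file is gone.
    foreach ($name in 'Created', 'Changed') {
        Register-ObjectEvent -InputObject $watcher -EventName $name -SourceIdentifier "Record$name"
    }
    $watcher.EnableRaisingEvents = $true

    while ($true) {
        # Without -Action, events are queued and handled here, in the caller's own scope.
        foreach ($queued in @(Wait-Event)) {
            if ($queued.SourceIdentifier -like 'Record*') {
                try {
                    Add-RecordToCsv -Path $queued.SourceEventArgs.FullPath -CsvPath $CsvPath
                }
                catch {
                    # The file may still be locked by the program that is writing it.
                    Write-Warning "Could not process $($queued.SourceEventArgs.FullPath): $_"
                }
            }
            Remove-Event -EventIdentifier $queued.EventIdentifier
        }
    }
}
```

With the paths from the question, it would be started as `Watch-RecordFolder -Folder 'C:\Users\username\Desktop_InProgress' -CsvPath 'C:\Users\username\Desktop\Smart Scrape\test.csv'` and runs until interrupted. Two caveats apply: a single save can raise more than one Changed event, so the same file may be appended more than once, and Export-Csv -Append expects every file to yield the same set of field names as the existing CSV.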