Parse a Structured Text File into CSV

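I'm new to PowerShell and have searched and searched, but I can't find a solution for what I'm trying to do. I want to monitor a set of folders, and if any text file in them changes, I want to insert that file's data into a CSV file. The text files always have the structure shown below, but the values (after the colon) can vary...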

Type:   1 Red/1 Blue
SecondaryType:  
Keywords:   
Area:   150
Length: 28
Width:  22
System: 55.5cm
DateTime:   5/5/2017 10:06:38 PM
UserName:   bgates
Platform:   Major Platform 2017
CustomIdentifier:   1.11.0645.1330
Version:    14.116.65557.111

After the : there is a tab character, and then there may or may not be a value.

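I've cobbled together some code. It does export a CSV, but the parsing is wrong, I think because of the tabs and the frequently missing values. Here is my code: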

$watcher = New-Object System.IO.FileSystemWatcher
$watcher.Path = "C:\Users\username\Desktop_InProgress"
$watcher.Filter = "*.txt"
$watcher.IncludeSubdirectories = $true
$watcher.EnableRaisingEvents = $true  


$action = { $path = $Event.SourceEventArgs.FullPath
            $changeType = $Event.SourceEventArgs.ChangeType
            $logline = "$(Get-Date), $changeType, $path"
            (Get-Content $path) -join "`r`n" -Split "(?m)^(?=\S)" |
                Where{$_} | 
                ForEach{
                    Clear-Variable PrimaryType,SecondaryType,Keywords,Area,Length,Width,System,DateTime,Username,Platform,CustomIdentifier,Version
                    Switch -regex ($_ -split "`r`n"){
                        "PrimaryType:" {$PrimaryType = ($_ -split ':',2)[-1].trim();Continue}
                        "SecondaryType:" {$SecondaryType = ($_ -split ':',2)[-1].trim();Continue}
                        "Keywords:" {$Keywords = ($_ -split ':',2)[-1].trim();Continue}
                        "Area:" {$Area = ($_ -split ':',2)[-1].trim();Continue}
                        "Length:" {$Length = ($_ -split ':',2)[-1].trim();Continue}
                        "Width:" {$Width = ($_ -split ':',2)[-1].trim();Continue}
                        "System:" {$System = ($_ -split ':',2)[-1].trim();Continue}
                        "DateTime:" {$DateTime = ($_ -split ':',2)[-1].trim();Continue}
                        "Username:" {$Username = ($_ -split ':',2)[-1].trim();Continue}
                        "Platform:" {$Platform = ($_ -split ':',2)[-1].trim();Continue}
                        "CustomIdentifier:" {$CustomIdentifier = ($_ -split ':',2)[-1].trim();Continue}
                        "Version:" {$Version = ($_ -split ':',2)[-1].trim();Continue}
                    }
                    [PSCustomObject]@{
                        'PrimaryType' = $PrimaryType
                        'SecondaryType' = $SecondaryType
                        'Keywords' = $Keywords
                        'Area' = $Area
                        'Length' = $Length
                        'Width' = $Width
                        'System' = $System
                        'DateTime' = $DateTime
                        'Username' = $Username
                        'Platform' = $Platform
                        'CustomIdentifier' = $CustomIdentifier
                        'Version' = $Version }

                    $Files | ForEach{ [PSCustomObject]@{'PrimaryType' = $PrimaryType; 'SecondaryType' = $SecondaryType; 'Keywords' = $Keywords; 'Area' = $Area; 'Length' = $Length; 'Width' = $Width; 'System' = $System; 'DateTime' = $DateTime; 'Username' = $Username; 'Platform' = $Platform; 'CustomIdentifier' = $CustomIdentifier; 'Version' = $Version}}
                } | Export-Csv -path "C:\Users\username\Desktop\Smart Scrape\test.csv" -NoTypeInformation
            ###Add-content "C:\Users\username\Desktop\Smart Scrape\log.txt" -value $logline
          }    
Register-ObjectEvent $watcher "Created" -Action $action
Register-ObjectEvent $watcher "Changed" -Action $action
Register-ObjectEvent $watcher "Deleted" -Action $action
Register-ObjectEvent $watcher "Renamed" -Action $action
while ($true) {sleep 5}

A simple parser for one of these files could be:

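# Collect the key/value pairs in the order they appear in the file
$Entries = [ordered]@{}

# Lines without a colon (e.g. blank lines) are skipped so .Trim() is never called on $null
Get-Content $Path | Where-Object { $_ -match ':' } | ForEach-Object {

    # Split on the first colon only, so colons inside the value survive
    $Key, $Value = $_ -split ':', 2
    $Entries[$Key.Trim()] = $Value.Trim()

}

# One object, and therefore one CSV row, per file
[PSCustomObject]$Entries | Export-Csv -Append -Path "C:\Users\jrooker\Desktop\SmartPlan Scrape\test.csv" -NoTypeInformation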

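It simply splits each line on the colon, limiting the split to 2 parts so that the colons inside the DateTime value are left alone. It then takes the left and right parts, stores them in a hashtable (dictionary), and casts that to a PSCustomObject for output.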
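To see why the limit of 2 matters, here is a small sketch run against two lines from the sample file in the question. The variable names are only for illustration.

```powershell
# Two lines copied from the sample file; "`t" is the tab that follows the colon.
$lines = "DateTime:`t5/5/2017 10:06:38 PM", "Keywords:`t"

$parsed = foreach ($line in $lines) {
    # Limit the split to 2 pieces so only the first colon acts as a separator.
    # Without the limit, the DateTime line would come back as 4 pieces.
    $key, $value = $line -split ':', 2

    # Trim() removes the tab; a line with no value ends up as an empty string.
    [PSCustomObject]@{ Key = $key.Trim(); Value = $value.Trim() }
}

$parsed | Format-Table -AutoSize
```

The DateTime value keeps its full time portion, and the empty Keywords value simply becomes an empty CSV field.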

It doesn't know what the field names are, and it doesn't look like it needs to.
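If this parser is meant to replace the Switch block in the question's watcher action, one way to wire it in might look like the sketch below. The function name `Add-TextFileToCsv` and its parameters are hypothetical, and the sketch assumes every file carries the same set of keys so that appended rows line up with the existing CSV header.

```powershell
# Hypothetical helper: parse one "Key: Value" text file and append it to a CSV as a single row.
function Add-TextFileToCsv {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $CsvPath
    )

    $entries = [ordered]@{}
    foreach ($line in Get-Content -LiteralPath $Path) {
        if ($line -notmatch ':') { continue }   # skip blank or malformed lines
        $key, $value = $line -split ':', 2      # split on the first colon only
        $entries[$key.Trim()] = $value.Trim()
    }

    # Only write a row if at least one key/value pair was found
    if ($entries.Count -gt 0) {
        [PSCustomObject]$entries | Export-Csv -Path $CsvPath -Append -NoTypeInformation
    }
}
```

Inside the action block it could then be called as Add-TextFileToCsv -Path $Event.SourceEventArgs.FullPath -CsvPath "C:\Users\username\Desktop\Smart Scrape\test.csv". The function has to be defined somewhere the action can see it, for example in the global scope or inside the action script block itself. A Deleted event leaves nothing to read, so the action would probably want to skip those.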

Try this:

get-content $Path | ConvertFrom-Csv -Delimiter ":" -Header Name, Value | export-csv "C:\Users\jrooker\Desktop\SmartPlan Scrape\test.csv"  -notype -append

Short version with aliases:

gc $Path | ConvertFrom-Csv -Delim ":" -h Name, Value | epcsv "C:\Users\jrooker\Desktop\SmartPlan Scrape\test.csv" -not -a
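One thing worth checking before relying on this approach is the shape of its output. The sketch below feeds two of the sample lines through the same ConvertFrom-Csv call; the variable names are only for illustration. Each input line becomes its own Name/Value row rather than a column, and because only two headers are supplied, any fields beyond the second are discarded.

```powershell
# Two lines from the sample file; "`t" is the tab that follows the colon.
$lines = "Area:`t150", "DateTime:`t5/5/2017 10:06:38 PM"

# Same call as above: every ":" is a delimiter, and only two columns are named.
$rows = $lines | ConvertFrom-Csv -Delimiter ':' -Header Name, Value

# Show each pair; Trim() removes the tab left in front of the value.
$rows | ForEach-Object { '{0} = [{1}]' -f $_.Name, $_.Value.Trim() }
```

For the DateTime line the value stops at the hour, since the minutes and seconds sit after further colons. If the full timestamp or a one-row-per-file layout is needed, splitting on the first colon only, as in the earlier answer, may be the better fit.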