Parse multiple text files line by line, but only the lines before a specific match
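I am working with a large number of log files, and I want to parse all the lines that come before a specific match in each file.
The files look like this: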
04/14/2022 02:07:19 SLK 1234 12345 86177500 ERROR - UPDSERVS: SERVICE :SOFTWARESERVICE - IS NOT INSTALLED
04/14/2022 02:07:58 SLK 1234 12345 86216625 ERROR - Bin File Creation and Dump Raw Output Disabled
04/14/2022 02:08:01 SLK 1234 4321 86219734 BADERROR(3328:1416) Default PROCESS TIMING THRESHOLD 10
04/14/2022 02:08:08 SLK 1234 4321 86226078 ETHERNET(4264:5628) USER 1 Cluster 1 ID 00EQ3G038651
04/14/2022 02:08:08 SLK 1234 4321 86226078 ETHERNET(4264:5628) USER 2 CLuster 2 ID 00EQ3G026434
00000000-0000-0000-0000-000000000664 Error 2022-04-13T02:09:07+02:00 LegacyErrorLog SYSERR(4320:4404) ERRORLOG.CPP(606) sscmd(5812:1) KEYINPUT.CPP(209) Command: MANUAL ROLLOVER
00000000-0000-0000-0000-000000000649 Error 2022-04-13T02:09:07+02:00 LegacyErrorLog SYSERR(4320:4404) ERRORLOG.CPP(606) Closed the current active Error Log due to: Manual rollover.
I want all n lines before the pattern "Command: MANUAL ROLLOVER" (from all files), not including that line itself, and then parse the data into an Excel spreadsheet like this:
Client ID || Date || Time || Error code || Detail
SLK-1234 || 04/14/2022 || 02:07:19 || 12345 || 86177500 ERROR - UPDSERVS: SERVICE :SOFTWARESERVICE - IS NOT INSTALLED
SLK-1234 || 04/14/2022 || 02:07:19 || 12345 || 86216625 ERROR - Bin File Creation and Dump Raw Output Disabled
SLK-1234 || 04/14/2022 || 02:07:19 || 4321 || 86219734 BADERROR(3328:1416) Default PROCESS TIMING THRESHOLD 10
I "think" I got the first part right with this code:
$move = "X:\New\test\Output"
$root = "X:\New\test"
$files = Get-ChildItem -Path $root -Filter *.*
$Results = foreach( $File in $Files ){
$location = $root+"\"+$file
$s = Select-String -Path "$location" -Pattern "Command: ROLLOVER ERRORLOG" -Context ?,0 |
Foreach-Object { $_.Line,$_.Context.PreContext[0].Trim()}
But I rarely export to Excel, so I don't know how to do the rest :/
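I would use switch to process the files line by line, together with a (rather long) regular expression that parses the data into objects.
Once you have gone through the log files and collected the data as objects, you can use Export-Csv to write it all to a structured CSV file that you can simply open in Excel.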
$move  = "X:\New\test\Output"
$root  = "X:\New\test"
$files = Get-ChildItem -Path $root -Filter '*.log' -File
foreach ($file in $files) {
    $data = switch -Regex -File $file.FullName {
        # exit the switch if we reach a line with 'Command: MANUAL ROLLOVER'
        'Command: MANUAL ROLLOVER' { break }
        # parse the string into named matches
        '^(?<date>\d{2}/\d{2}/\d{4})\s+(?<time>\d{2}:\d{2}:\d{2})\s+(?<idleft>[^\s]+)\s+(?<idright>\d+)\s+(?<error>\d+)\s+(?<details>(.+))$' {
            # output an object with the wanted properties
            [PsCustomObject]@{
                'Client ID'  = '{0}-{1}' -f $matches['idleft'], $matches['idright']
                'Date'       = $matches['date']
                'Time'       = $matches['time']
                'Error code' = $matches['error']
                'Detail'     = $matches['details']
            }
        }
    }
    # name the output after the log file, but with a .csv extension so Excel opens it
    $target = Join-Path -Path $move -ChildPath ('{0}.csv' -f $file.BaseName)
    # export the gathered data to a csv you can double-click to open in Excel
    $data | Export-Csv -Path $target -UseCulture -NoTypeInformation
}
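To illustrate why the break matters, here is a minimal sketch of how switch -File stops reading at the first marker line. The temp file name and its four lines are made up for this demonstration and are not taken from your logs.

```powershell
# Hypothetical demo file in the temp folder; name and contents are made up
$demo = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'switch-break-demo.log'
Set-Content -Path $demo -Value @(
    'line 1'
    'line 2'
    'xyz Command: MANUAL ROLLOVER'
    'line 4'
)

$before = switch -Regex -File $demo {
    # the first pattern that matches the marker leaves the switch, so no further lines are read
    'Command: MANUAL ROLLOVER' { break }
    # every line seen before the marker is passed through unchanged
    default { $_ }
}

# shows the lines that came before the marker ('line 1' and 'line 2')
$before
```

The break only leaves the switch, not an enclosing foreach, so in the script above the loop still moves on to the next log file.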
With your example log, the exported CSV opens in Excel with the columns Client ID, Date, Time, Error code and Detail.
From your comment, I understand that you want to parse the data from all files and save it in one large csv file in the same folder.
To do that:
$root  = "X:\New\test"
$files = Get-ChildItem -Path $root -Filter '*.log' -File
$Results = foreach ($file in $files) {
    switch -Regex -File $file.FullName {
        # exit the switch if we reach a line with 'Command: MANUAL ROLLOVER'
        'Command: MANUAL ROLLOVER' { break }
        # parse the string into named matches
        '^(?<date>\d{2}/\d{2}/\d{4})\s+(?<time>\d{2}:\d{2}:\d{2})\s+(?<idleft>[^\s]+)\s+(?<idright>\d+)\s+(?<error>\d+)\s+(?<details>(.+))$' {
            # output an object with the wanted properties
            [PsCustomObject]@{
                'Client ID'  = '{0}-{1}' -f $matches['idleft'], $matches['idright']
                'Date'       = $matches['date']
                'Time'       = $matches['time']
                'Error code' = $matches['error']
                'Detail'     = $matches['details']
            }
        }
    }
}
$target = Join-Path -Path $root -ChildPath 'LogResults.csv'
# export the gathered data to a csv you can double-click to open in Excel
$Results | Export-Csv -Path $target -UseCulture -NoTypeInformation
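If some lines do not show up in the result, it may help to try the regex against a single line first. The sketch below uses the first line of your sample log; the variable names $pattern, $line and $parsed are only chosen for this illustration.

```powershell
# Same regex as in the script above, stored in a variable for testing
$pattern = '^(?<date>\d{2}/\d{2}/\d{4})\s+(?<time>\d{2}:\d{2}:\d{2})\s+(?<idleft>[^\s]+)\s+(?<idright>\d+)\s+(?<error>\d+)\s+(?<details>(.+))$'

# First line of the sample log from the question
$line = '04/14/2022 02:07:19 SLK 1234 12345 86177500 ERROR - UPDSERVS: SERVICE :SOFTWARESERVICE - IS NOT INSTALLED'

if ($line -match $pattern) {
    # -match fills the automatic $Matches table with the named groups
    $parsed = [PsCustomObject]@{
        'Client ID'  = '{0}-{1}' -f $Matches['idleft'], $Matches['idright']
        'Date'       = $Matches['date']
        'Time'       = $Matches['time']
        'Error code' = $Matches['error']
        'Detail'     = $Matches['details']
    }
    $parsed | Format-List
}
else {
    Write-Warning "Line does not match the pattern: $line"
}
```

For this line the Client ID comes out as SLK-1234 and the Error code as 12345. A line in a different layout falls into the else branch, which is a hint that the regex needs adjusting for that format.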
If you need, for example, each Client ID in its own csv file so that you can load them as separate Excel worksheets, you can do the following:
$Results | Group-Object 'Client ID' | ForEach-Object {
    $target = Join-Path -Path $root -ChildPath ('{0}.csv' -f $_.Name)
    $_.Group | Export-Csv -Path $target -UseCulture -NoTypeInformation
}
I'm not very familiar with the ImportExcel module, but I believe this might work:
$target = Join-Path -Path $root -ChildPath 'LogResults.xlsx'
$Results | Group-Object 'Client ID' | ForEach-Object {
    $_.Group | Export-Excel -Path $target -AutoSize -WorksheetName $_.Name
}
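Export-Excel is not built into PowerShell; it comes with the ImportExcel module from the PowerShell Gallery. A minimal sketch for making sure the module is available before running the snippet above, assuming you are allowed to install modules for the current user:

```powershell
# Install ImportExcel for the current user only if it is not present yet
if (-not (Get-Module -ListAvailable -Name ImportExcel)) {
    Install-Module -Name ImportExcel -Scope CurrentUser
}

# Load the module so Export-Excel can be used in this session
Import-Module -Name ImportExcel
```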