Extraction of all space-free strings

Question

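The trial script shown below is meant to extract the space-free strings from a text file F and write them, in the order they are found, to a result file Fr, one per line. On the whole it works, with three problems: the result file may contain line endings that skip a line, I do not know whether the script terminates properly, and, worst of all, it takes forever to do its job.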

$fic= "<F>"
$ficr="<Fr>"
$fics="<Fs>"
$cF=(gc $fic -encoding utf8)
clc $fics;clc $ficr
$Lfic=(gc $fic).length;
$MPfic=$Lfic-1;$Pfic=0..$MPfic
foreach($x in $Pfic){$llge=((gc $fic)[$x]).length;$mplge=$llge-1;$plge=0..$mplge;foreach($y in $plge)
                       {if($cF[$x][$y] -ne " "){$cF[$x][$y] >> $fics} 
                                               else {if($cF[$x][$y+1] -ne " ")
                                                       {(-join (gc $fics)) >> $ficr;clc $fics}else{while($cF[$x][$y+1] -eq " "){$y=$y+1}}
                                                    }
                       }
                     }

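1/ Does anyone know how to fix this script? (I would like to keep it as an illustration of the impractical coding one may run into.)

2/ Can anyone suggest more efficient code to do the job?

Example of input and output

For example, suppose the text in file F is as follows. The spaces have been replaced with vertical bars; in ordinary text that would make no sense, since a vertical bar is a character like any other, but here the bars show exactly where and how the spaces occur. The line indications ("line x") are not part of the text, and the text always starts at the beginning of the line (no leading white space). "Line 2" of the input is an empty line.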

line 1   %|10|prog|axil,|(les|prog|activés)  
line 2     
line 3   %|début%||||||||||||||||Ce|qu'il|faut:|<<~ZZZ_if_livre_op_prog_PX.txt~>>
line 4   %|à|partir|du|mot|<<~index~>>

then file Fr should contain:

line 1  %
line 2  10
line 3  prog
line 4  axil,
line 5  (les
line 6  prog
line 7  activés)
line 8  %
line 9  début%
line 10 Ce
line 11 qu'il
line 12 faut:
line 13 <<~ZZZ_if_livre_op_prog_PX.txt~>>
line 14 %
line 15 à
line 16 partir
line 17 du
line 18 mot
line 19 <<~index~>>

Answer 1

Your current script seems overly complicated. You could simplify it to (pseudocode):

foreach $line in F {
    if $line has a space {
        Write $line to Fs
    }
    else {
        Write $line to Fr
    }
}

In PowerShell that might look like this:

# read all lines from file
$lines = Get-Content $fic -Encoding utf8

# split into two groups - those that contain whitespace and those that don't
$withSpace,$withoutSpace = $lines.Where({$_ -match '\s'}, 'Split')

# write the lines with whitespace to $fics
$withSpace |Set-Content $fics

# write the lines without whitespace to $ficr
$withoutSpace |Set-Content $ficr
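
To see what the 'Split' mode of .Where() does on its own, here is a small sketch on a made-up array (the sample strings are illustrative and not taken from the question's file). Note that it partitions whole lines into two groups; it does not break a line into its individual words.

```powershell
# Illustrative sample lines (hypothetical, not from the question's file).
$lines = 'alpha', 'beta gamma', 'delta', 'eps zeta'

# In 'Split' mode, .Where() returns two collections:
# first the elements that satisfy the filter, then the remaining ones.
$withSpace, $withoutSpace = $lines.Where({ $_ -match '\s' }, 'Split')

$withSpace      # the lines that contain whitespace
$withoutSpace   # the lines that do not
```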

Answer 2

Do you just want to split on all whitespace?

-split (get-content spaces.txt)

To get rid of the empty lines, use -raw, which reads the whole file as a single string instead of one string per line.

-split (get-content -raw spaces.txt)
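
As a rough sketch of how this could be wired to the files from the question: the temp-file paths and the sample text below are assumptions made for illustration (the question only gives the placeholders <F> and <Fr>), and the Where-Object filter is just a precaution in case the input is empty or contains only whitespace.

```powershell
# Hypothetical paths; replace them with the real locations of F and Fr.
$fic  = Join-Path ([System.IO.Path]::GetTempPath()) 'F.txt'
$ficr = Join-Path ([System.IO.Path]::GetTempPath()) 'Fr.txt'

# Sample input loosely modelled on the question (real spaces, one empty line).
Set-Content -Path $fic -Encoding utf8 -Value @(
    '% 10 prog axil,  '
    ''
    '% start%        Ce qu''il faut:'
)

# Read the whole file as one string, then split on runs of whitespace.
$text   = Get-Content -Path $fic -Raw -Encoding utf8
$tokens = @(-split $text | Where-Object { $_ -ne '' })

# Write one token per line, in the order the tokens were found.
$tokens | Set-Content -Path $ficr -Encoding utf8
```

Because the file is read once and split in a single operation, this avoids the repeated per-character file reads and appends of the original loop.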

string

powershell

extract