Use a batch file to remove special characters from a CSV file

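I have a CSV file with 18 fields. I wrote a batch file to process the data. Everything works except removing the comma from the publisher "DEVILS DUE /1FIRST COMICS, LLC". That field is not parsed correctly. I have tried looking at other batch files for examples, but I am not familiar with the syntax.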

@echo off & Setlocal EnableDelayedExpansion
( FOR /f "tokens=1-18 delims=," %%A in ('More +4 datatest.csv') do (
rem H is the department code
rem S is the sales tax code
rem Q is the publisher code
    Set "H=%%H"
    Set "S=T"
    Set "Q=%%Q"
    if "%%Q"=="BOOM! STUDIOS" Set "Q=BOOM STUDIOS"
    if "%%Q"=="DEVILS DUE /1FIRST COMICS, LLC" Set "Q=DEVILS DUE"
    if "%%H"=="1" Set "H=1005" 
    if "%%H"=="1" Set "S=N"
    if "%%H"=="2" Set "H=1009" 
    if "%%H"=="2" Set "S=N"
    if "%%H"=="3" Set "H=1008"
    if "%%H"=="4" Set "H=1002"
    if "%%H"=="5" Set "H=1006"
    if "%%H"=="6" Set "H=1003"
    if "%%H"=="7" Set "H=1011"
    if "%%H"=="8" Set "H=1011"
    if "%%H"=="9" Set "H=1004"
    if "%%H"=="10" Set "H=1016"
    if "%%H"=="11" Set "H=1015"
    if "%%H"=="12" Set "H=1015"
    if "%%H"=="13" Set "H=1011"
    if "%%H"=="14" Set "H=1009" 
    if "%%H"=="14" Set "S=N"
    if "%%H"=="15" Set "H=1013"  
    if "%%H"=="16" Set "H=1017"
    echo "",%%~M,%%~N,%%~L,"","","","","",!H!,"","",ITEM,"","",%%~D,%%Q,"","",%%E,"",%%E,"","","","","","","","","","","","","","",%%A,"","","","","",!S!,N,"","",DIAMOND,%%B,"",""
  )
)>paygoinvoice.csv
@echo on

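The problem you are having comes from delims=, in the FOR loop: ANY comma is treated as a field separator, including the one inside DEVILS DUE /1FIRST COMICS, LLC.

Combined with tokens=, this splits the publisher across two tokens: in my example %%H = DEVILS DUE /1FIRST COMICS and %%I = LLC.

A quick and dirty fix (as far as I know) is to change every ", " into something else first, and then run the result through the main function. For my example I used 1Comma1. Your IF test then has to look for DEVILS DUE /1FIRST COMICS1Comma1 LLC.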

Fixed.Bat:

@echo off & Setlocal EnableDelayedExpansion

Rem | Replace all ", " with "1Comma1"
for /f "tokens=1,* delims=¶" %%A in ('"type datatest.csv"') do (
    SET string=%%A
    setlocal EnableDelayedExpansion
    SET modified=!string:, =1Comma1 !

    >> datatest.csv.TEMP echo(!modified!
    endlocal
)

Rem | Main .CSV Edit Function
( FOR /f "tokens=1-8* delims=," %%A in ('More +4 datatest.csv.TEMP') do (
    Set "ItemData=%%H"
    if "%%H"=="1" Set "ItemData=1005"
    if "%%H"=="3" Set "ItemData=1008"
    if "%%H"=="BOOM STUDIOS" Set "ItemData=NEW STUDIOS"
    if "%%H"=="DEVILS DUE /1FIRST COMICS1Comma1 LLC" Set "ItemData=DEVILS DUE"

    echo %%A,%%B,%%C,%%D,%%E,%%F,%%G,!ItemData!,%%I
  )
)>paygoinvoice.txt
del datatest.csv.TEMP

@echo on

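PS: The code I used in the example above is taken from your last post on this topic. Just add your new code where it belongs.

Also keep in mind that while EnableDelayedExpansion is active, any ! in the value of a FOR variable is removed when that variable is expanded. That is why the IF test above compares against BOOM STUDIOS rather than BOOM! STUDIOS.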
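The effect on ! can be checked in isolation. The sketch below is only an illustration: the scratch file name demo_bang.txt is made up for the demo, and it assumes %TEMP% is writable and that its path contains no exclamation marks.

```batch
@echo off
setlocal DisableDelayedExpansion

rem Hypothetical scratch file holding one publisher name.
set "demo=%TEMP%\demo_bang.txt"
> "%demo%" echo BOOM! STUDIOS

rem Delayed expansion off: the exclamation mark survives.
for /f "usebackq delims=" %%A in ("%demo%") do echo Disabled: %%A

rem Delayed expansion on: the lone exclamation mark is dropped.
setlocal EnableDelayedExpansion
for /f "usebackq delims=" %%A in ("%demo%") do echo Enabled: %%A
endlocal

del "%demo%"
```

This should print "Disabled: BOOM! STUDIOS" followed by "Enabled: BOOM STUDIOS". If you need to keep the !, read the value while delayed expansion is disabled and enable it only afterwards.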

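Since you never showed a real-world sample of your input file, it is hard to help.

The problems with for /f parsing CSV files are that it:

  1. does not honour double-quoted fields, so it also splits at commas contained in them,
  2. treats adjacent delimiters as a single delimiter and ignores leading delimiters, so empty fields shift the tokens that follow.

So the first issue applies here; whether the second does is unknown.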
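Both issues can be reproduced with a two-line test file. This is a minimal sketch; demo_tokens.csv is a hypothetical scratch file in %TEMP%, not part of your data.

```batch
@echo off
setlocal

rem Hypothetical scratch file: one line per issue.
set "demo=%TEMP%\demo_tokens.csv"
>  "%demo%" echo "COMICS, LLC","ColR"
>> "%demo%" echo a,,c,d

rem Show the first three tokens of every line, each wrapped in brackets.
for /f "usebackq tokens=1-3 delims=," %%A in ("%demo%") do echo [%%A] [%%B] [%%C]

del "%demo%"
```

This should print ["COMICS] [ LLC"] ["ColR"] for the first line and [a] [c] [d] for the second. The quoted field is cut in two at its comma, and the empty field between a and c disappears, so c moves up into the second token.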

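A workaround is to capture the problematic fields in the remainder token (*), pass them as arguments to a subroutine, which does honour quotes, and handle them there.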
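To see why a subroutine helps, here is a small sketch of how call splits its arguments. The label :ShowArgs is a made-up name for this demo only.

```batch
@echo off
setlocal

rem Pass two quoted CSV fields, separated by a comma, as call arguments.
call :ShowArgs "DEVILS DUE /1FIRST COMICS, LLC","ColR"
goto :eof

:ShowArgs
rem The tilde modifier strips the surrounding quotes from each argument.
echo first  = %~1
echo second = %~2
goto :eof
```

This should print "first  = DEVILS DUE /1FIRST COMICS, LLC" and "second = ColR". The comma inside the quotes stays part of the first argument, while the comma between the two quoted fields separates the arguments.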

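To simplify handling of the lookup values, there is a technique for expanding a list into an array; it is used for DepCode and STaxCode in the following batch (as I did in my answer to your earlier question):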

@echo off & Setlocal EnableDelayedExpansion

:: Build array DepCode[1..16]
Set i=0&Set "DepCode=,1005,1009,1008,1002,1006,1003,1011,1011,1004,1016,1015,1015,1011,1009,1013,1017"
Set "DepCode=%DepCode:,="&Set /a i+=1&Set "DepCode[!i!]=%"
:: Set DepCode

:: Build array STaxCode[1..16]
Set i=0&Set "STaxCode=,N,N,S,S,S,S,S,S,S,S,S,S,S,N,S,S"
Set "STaxCode=%STaxCode:,="&Set /a i+=1&Set "STaxCode[!i!]=%"
:: Set STaxCode

( FOR /f "tokens=1-16* delims=," %%A in ('More +4 SO_53917950.csv') do (
    rem H is the department code
    Set "H=!DepCode[%%~H]!"
    rem S is the sales tax code
    Set "S=!STaxCode[%%~H]!"
    rem Q is the publisher code 17th field and 18th field 
    Call :RemoveComma %%Q 

rem echo "",%%~M,%%~N,%%~L,"","","","","",!H!,"","",ITEM,"","",%%~D,"!PubCode!","","",%%E,"",%%E,"","","","","","","","","","","","","","",%%A,"","","","","",!S!,N,"","",DIAMOND,%%B,"",""
    echo "%%~A","%%~B","%%~C","%%~D","%%~E","%%~F","%%~G","!H!","%%~I","%%~J","%%~K","%%~L","%%~M","%%~N","%%~O","%%~P","!PubCode!","!R!","!S!"

  )
)>paygoinvoice.csv
Goto :Eof

:RemoveComma
Set "R=%~2"
:: remove comma from field
::Set "PubCode=%PubCode:,= %"

:: split field at first comma or slash/backslash
for /f "delims=,/\" %%a in (%1) do Set "PubCode=%%a" 
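The list-to-array expansion used for DepCode and STaxCode in the batch above is easier to follow on a short list. The sketch below uses a made-up variable Demo with three values; it assumes no other environment variable has a name starting with Demo.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical demo list. The leading comma makes the first value land in Demo[1].
set "i=0"
set "Demo=,red,green,blue"

rem Percent expansion replaces each comma with a closing quote, a counter
rem increment and the start of the next assignment, so this single line
rem turns into a chain of Set commands before it is executed.
set "Demo=%Demo:,="&set /a i+=1&set "Demo[!i!]=%"

rem List every variable whose name starts with Demo.
set Demo
```

This should list Demo[1]=red, Demo[2]=green and Demo[3]=blue. The plain Demo variable no longer exists afterwards, because the first command in the expanded chain assigns it an empty value.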

With this constructed input file SO_53917950.csv:

first  line to remove
second line to remove
third  line to remove
fourth line to remove
"HeadA","HeadB","HeadC","HeadD","HeadE","HeadF","HeadG","HeadH","HeadI","HeadJ","HeadK","HeadL","HeadM","HeadN","HeadO","HeadP","HeadQ","HeadR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","2","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","3","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","4","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","5","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","6","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","7","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","8","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","9","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","10","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","11","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","12","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","13","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","14","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","15","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","16","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"

the batch produces this output (processing is noticeably slower because of the extra call):

"HeadA","HeadB","HeadC","HeadD","HeadE","HeadF","HeadG","","HeadI","HeadJ","HeadK","HeadL","HeadM","HeadN","HeadO","HeadP","HeadQ","HeadR",""
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1005","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","N"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1009","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","N"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1008","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1002","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1006","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1003","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1011","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1011","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1004","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1016","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1015","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1015","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1011","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1009","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","N"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1013","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1017","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"