Regular expression over EDI File

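I have the following EDI file and need to extract the LOC+11 elements, not the LOC+7 ones. I need all the segments between them: the LOC segments repeat, but the segments between them do not.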

Currently my regex looks like LOC[^L]*(?:L(?!OC)[^L]*)* but I get 4 results, because it also matches the LOC+7 elements.

I only need 2 results. Can you help me?

> NAD+ST+14::92++Test' LOC+11+KOD23277::92' LOC+7+D77::92:Test' LIN+1++
> test AP:IN'IMD+F++12::272:K
> RIPPsadasdRIEM'RFF+ON:EN10514492'RFF+AAN:501'
> DTM+171:20220309:102'RFF+AIF:500'DTM+171:20220305:102'CTA+SC+12414:test,
> test'COM+melweasdsaanigge.frasdasdasdaicke1@test.de:EM'
> COM+?+49-561-490-4173:TE'COM+?+49-561-490-84173:FX' QTY+83:1000:PCE'
> QTY+70:66850:PCE'DTM+51:20080101:102'
> QTY+72:0:PCE'DTM+52:20080101:102'
> QTY+194:1000:PCE'DTM+50:20220224:102'
> RFF+AAU:2143276'DTM+171:20220218:102'
> QTY+194:1000:PCE'DTM+50:20220202:102'
> RFF+AAU:2138944'DTM+171:20220131:102'
> QTY+194:1000:PCE'DTM+50:20220105:102'
> RFF+AAU:2138943'DTM+171:20220103:102' SCC+24'
> QTY+113:1000:PCE'DTM+2:20220412:102'
> QTY+113:1000:PCE'DTM+2:20220503:102'
> QTY+113:1000:PCE'DTM+64:20220530:102'DTM+63:20220605:102'
> QTY+113:1000:PCE'DTM+64:20220620:102'DTM+63:20220626:102'
> QTY+113:1000:PCE'DTM+64:20220711:102'DTM+63:20220717:102'
> QTY+113:1000:PCE'DTM+64:20220801:102'DTM+63:20220807:102' GEI+3+37'
> 
> NAD+ST+14::92++test' LOC+11+KOD823226::92' LOC+7+D86::92:Test' LIN+2++
> test H:IN'IMD+F++12::272:K
> RIPPRIEM'RFF+ON:EN10662318'RFF+AAN:266'DTM+171:20220309:102'
> RFF+AIF:265'DTM+171:20220305:102'CTA+SC+12414:test,
> test'COM+test.test@test.de:EM'
> COM+?+49-561-490-4173:TE'COM+?+49-561-490-84173:FX' QTY+83:200:PCE'
> QTY+70:14319:PCE'DTM+51:20100101:102'
> QTY+72:0:PCE'DTM+52:20100101:102' QTY+194:200:PCE'DTM+50:20220126:102'
> RFF+AAU:2146871'DTM+171:20220121:102'
> QTY+194:200:PCE'DTM+50:20211210:102'RFF+AAU:2146914'DTM+171:20211209:102' QTY+194:200:PCE'DTM+50:20211129:102'RFF+AAU:2139927'DTM+171:20211124:102'SCC+24'
> QTY+113:200:PCE'DTM+2:20220503:102'
> QTY+113:200:PCE'DTM+64:20220606:102'DTM+63:20220612:102'
> QTY+113:200:PCE'DTM+64:20220718:102'DTM+63:20220724:102'
> QTY+113:200:PCE'DTM+64:20220829:102'DTM+63:20220904:102'
> QTY+113:200:PCE'DTM+64:20221010:102'DTM+63:20221016:102'
> 
> UNT+142+1'UNZ+1+2756'

You can use either of these patterns:

LOC\+11[^L]*(?:L(?!OC\+11)[^L]*)*
LOC\+11[\w\W]*?(?=LOC\+11|$)

See the regex demo.

Details:

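  • LOC\+11 - the literal LOC+11 string
  • [^L]*(?:L(?!OC\+11)[^L]*)* - any text up to the next occurrence of the LOC+11 substring (or the end of the string), written as an unrolled loop: zero or more non-L characters, followed by zero or more repetitions of an L that is not followed by OC+11 and then zero or more non-L characters.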
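To see the difference between the pattern from the question and the first pattern above, here is a minimal Python sketch. The `edi` string is a shortened stand-in for the file in the question (an assumption made for brevity), and the variable names are only illustrative.

```python
import re

# Shortened stand-in for the EDI text in the question (two NAD groups).
edi = (
    "NAD+ST+14::92++Test' LOC+11+KOD23277::92' LOC+7+D77::92:Test' LIN+1++\n"
    "QTY+83:1000:PCE' GEI+3+37'\n"
    "NAD+ST+14::92++test' LOC+11+KOD823226::92' LOC+7+D86::92:Test' LIN+2++\n"
    "QTY+83:200:PCE'\n"
    "UNT+142+1'UNZ+1+2756'"
)

# Pattern from the question: a new match starts at every LOC, including LOC+7.
original = re.compile(r"LOC[^L]*(?:L(?!OC)[^L]*)*")

# Pattern from the answer: a match starts at LOC+11 and only stops
# before the next LOC+11 (or at the end of the input).
unrolled = re.compile(r"LOC\+11[^L]*(?:L(?!OC\+11)[^L]*)*")

original_matches = original.findall(edi)
unrolled_matches = unrolled.findall(edi)

print(len(original_matches))  # 4
print(len(unrolled_matches))  # 2
```

Note that each match runs up to the next LOC+11, so the first match also contains the NAD segment that precedes the second LOC+11.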

Both patterns return the same results, but the first one is much faster, provided there are not too many L characters that are not followed by OC+11.
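If you want to check this on your own data, a rough sketch along these lines may help. The synthetic `text` is an assumption for illustration, not the real file, and the timings depend on the regex engine and the input, so no numbers are claimed here.

```python
import re
import timeit

unrolled = re.compile(r"LOC\+11[^L]*(?:L(?!OC\+11)[^L]*)*")
lazy = re.compile(r"LOC\+11[\w\W]*?(?=LOC\+11|$)")

# Synthetic input (hypothetical): one small group repeated many times.
group = (
    "NAD+ST+14::92++Test' LOC+11+KOD1::92' LOC+7+D77::92:Test' "
    "LIN+1++ QTY+83:1000:PCE' "
)
text = group * 500

unrolled_matches = unrolled.findall(text)
lazy_matches = lazy.findall(text)

# Both patterns should produce identical matches on this input.
print(unrolled_matches == lazy_matches)

# Time each pattern over the same input; compare the two numbers yourself.
t_unrolled = timeit.timeit(lambda: unrolled.findall(text), number=10)
t_lazy = timeit.timeit(lambda: lazy.findall(text), number=10)
print(f"unrolled: {t_unrolled:.4f}s  lazy: {t_lazy:.4f}s")
```

One caveat in Python: without re.MULTILINE, `$` also matches just before a newline at the very end of the string, so if the input ends with a newline the last match of the second pattern stops before it, while the first pattern includes it.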