Regex in Python: window.*} where * is a wildcard

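I have a huge string from which I need to remove every occurrence of window.*}, where * is a wildcard that matches anything.

I am using the following regular expression: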

data=re.sub("window.*?\}","",data)

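data contains the sample data provided below.

The problem is that it replaces the entire block between the first window and the last }. As you can see, I used *? for non-greedy matching, but it still does not work.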

I need to remove all functions such as window.onload from the string.

The sample data is as follows:

</td><td>Care Instructions</td><td></td><td>Tested</td><td></td><td>Maintenance</td><td></td><td>Lightweight</td><td></td><td>Water resistance</td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr><tr><td>3358</td><td>2012-12-18 11:09:21</td><td>5589043</td><td>UNPublished</td><td>SNK-300-SNORKEL</td><td>0</td><td>Tribord</td><td>542</td><td>433</td><td>14.5</td><td>Adult</td><td>any</td><td>window.onload = function(){window.parent.CKEDITOR._["contentDomReadyattribute85"]( window );}</td><td>
    snorkelers and scuba divers.
</td><td>window.onload = function(){window.parent.CKEDITOR._["contentDomReadyattribute111"]( window );}</td><td>2</td><td></td><td>FALSE</td><td>5000</td><td>128</td><td>MASKS & SNORKELS</td><td>11</td><td>Diving</td><td>519713</td><td>White</td><td></td><td>http://test.com/image/products/p_3358/zoom_asset_7454285.jpg</td><td>Freedom of movement</td><td>
    Soft hypollergenic black silicone mouthpiece.
</td><td>Maintenance</td><td></td><td>Reduced chafing</td><td>
    100% phthalate-free PVC tube, 100% silicone mouthpiece.
</td><td>Technical</td><td></td><td>Easy storage</td><td>
    Flexible tube that can positioned between mask strap and forehead.


</td><td>Composition</td><td>
    salt water resistant. Only rinse with fresh water before prolonged storage.
</td><td>Comfortable</td><td>
    Soft hypollergenic black silicone mouthpiece.
</td><td>Easy breathing</td><td></td><td>Anatomic design</td><td>
    Flexible tube that can positioned between mask strap and forehead.


</td><td>Care Instructions</td><td>
    salt water resistant. Only rinse with fresh water before prolonged storage.
</td><td>Mouth piece</td><td></td><td>Restriction of use</td><td></td><td>http://test.com/image/products/p_3358/zoom_asset_33225717.jpg</td><td>http://test.com/image/products/p_3358/zoom_asset_30448342.jpg</td><td>http://test.com/image/products/p_3358/zoom_asset_19099347.jpg</td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr><tr><td>3359</td><td>2014-01-09 18:32:31</td><td>8174997</td><td>Published</td><td>DYNAMO-LAMP</td><td>90</td><td>Tribord</td><td>999</td><td>649</td><td>14.5</td><td>Any</td><td>any</td><td></td><td>
    Snorkelling.Watertight to 5m, eco-friendly and economical.
</td><td></td><td>2</td><td></td><td>FALSE</td><td>333.7</td><td>129</td><td>SNORKELING KITS</td><td>11</td><td>Diving</td><td>1349256</td><td>Black</td><td></td><td>http://test.com/image/products/p_3359/big_800PX_29672977a.jpg</td><td>Battery</td><td>
    3 modes: Blinking, eco-friendly, and 100% (100% = 10minutes after 1min charge)
</td><td>Features</td><td></td><td>Swimming Depth</td><td>
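import re
# Match "window", then any characters other than "}", then the closing "}".
p = re.compile(r'window[^}]*}')
subst = ""
# test_str is the input string (called data in the question).
result = re.sub(p, subst, test_str)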

Try this. See the demo.

https://regex101.com/r/wZ0iA3/2
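
For a quick local check without the demo link, here is a minimal sketch of the same pattern applied to a short excerpt modeled on the sample data. The names sample, pattern and cleaned are only illustrative, and the excerpt is a shortened stand-in for the real input, not the full string.

```python
import re

# Short excerpt modeled on the sample data in the question (not the full input).
sample = (
    '<td>any</td><td>window.onload = function(){window.parent.CKEDITOR._'
    '["contentDomReadyattribute85"]( window );}</td><td>\n'
    '    snorkelers and scuba divers.\n'
    '</td><td>window.onload = function(){window.parent.CKEDITOR._'
    '["contentDomReadyattribute111"]( window );}</td><td>2</td>'
)

# "window", then any run of characters other than "}", then the first "}".
pattern = re.compile(r'window[^}]*}')

# Replace every match with the empty string.
cleaned = pattern.sub('', sample)
print(cleaned)
```

Because [^}] also matches newlines, the pattern can span several lines without re.DOTALL. It stops at the first } after window, so a function body that contains nested braces would only be removed up to its first closing brace.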