Parsing extended ASCII characters at the beginning of a line in Python

Question

I need to remove the first 3 or 4 extended ASCII characters from debug statements in Python, but so far I have not been able to. Here is an example:

ª!è[002:58:535]REGMICRO:Load: 36.6

ëª7è[001:40:971]HTTP_CLI:Http Client Mng not initialized.

I tried: ^.*[A-Za-z]+$

and

[\x80-\xFF]+HTTP_CLI:0 - Line written in.*

but everything is ignored and I get this error:

"20160922 15:16:28.549 : FAIL : UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in position 1: ordinal not in range(128) 20160922 15:16:28.551 : INFO : ${resulters} = ('FAIL', u"UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in position 1: ordinal not in range(128)") 20160922 15:16:28.553 : INFO : ('FAIL', u"UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in position 1: ordinal not in range(128)")"

Is anyone here working with RIDE and Python?

Thanks!

Answer 1

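To answer how to remove the characters before the square bracket with RF (if I understand the question correctly, and frankly I'm not sure I do): the regular expression you tried is not correct. Assuming you want everything from the first square bracket onward: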

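# Get Regexp Matches comes from the String library.
# Backslashes must be doubled in Robot Framework data, hence \\[ for a literal "[".
${line}=    Set Variable    ëª7è[001:40:971]HTTP_CLI:Http Client Mng not initialized.
${regx}=    Set Variable    ^.*?(\\[.*$)
${result}=  Get Regexp Matches      ${line}      ${regx}      1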

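The regular expression you want (the ${regx} line) reads: "from the start of the line, skip everything up to the first square bracket; the sequence from that bracket to the end of the line is group 1". A lazy `.*?` stops at the first `[`, whereas a greedy `.*` runs on to the last `[` in the line; the two give the same result when the line contains only one. The keyword "Get Regexp Matches" is then called with 1 as its group argument, so it returns the contents of group 1 (as a list, one entry per match).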

In Python:

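# -*- coding: utf-8 -*-
# (Python 2 needs the encoding declaration above for the non-ASCII literal.)
import re
line = "ëª7è[001:40:971]HTTP_CLI:Http Client Mng not initialized."
# Raw string so the backslash reaches the regex engine; lazy .*? stops at the first "[".
regx = r"^.*?(\[.*$)"
result = re.search(regx, line).group(1)  # the value of result is "[001:40:971]HTTP_CLI:Http Client Mng not initialized."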
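
The UnicodeEncodeError in the question suggests that text containing characters above U+007F is being implicitly encoded as ASCII somewhere along the way, which Python 2 does whenever it has to turn unicode into str without an explicit codec. I can't tell from the log where that happens, so the following is only a sketch under some assumptions: the debug line arrives as bytes that can be read as Latin-1, and every line carries a timestamp shaped like the two samples ([ddd:dd:ddd]). The helper name strip_prefix is made up for illustration. Note also that the junk prefix is not purely extended characters (the samples contain "7" and "!"), so a class like [\x80-\xFF]+ alone would not remove all of it; anchoring on the timestamp avoids that.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import re

# Assumed timestamp shape, taken from the two sample lines: [ddd:dd:ddd]
TIMESTAMP = re.compile(r"\[\d{3}:\d{2}:\d{3}\]")


def strip_prefix(line, encoding="latin-1"):
    """Return the line as text, starting at its first timestamp.

    strip_prefix and the latin-1 default are illustrative assumptions.
    """
    if isinstance(line, bytes):
        # Decode explicitly; latin-1 maps every byte 0x00-0xFF to one
        # character, so this step cannot fail on arbitrary input.
        line = line.decode(encoding)
    match = TIMESTAMP.search(line)
    # Leave the line unchanged when no timestamp is found.
    return line[match.start():] if match else line


raw = b"\xeb\xaa7\xe8[001:40:971]HTTP_CLI:Http Client Mng not initialized."
print(strip_prefix(raw))
```

Because the returned text is pure ASCII for lines like the samples, passing it on to Robot Framework or printing it should no longer trigger an ASCII encode error, provided the rest of the line contains no extended characters.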

regex

extended-ascii

python-2.7

robotframework