How to create a list of tuples containing matches from multiple regular expressions

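I'm currently working on an assignment that asks us to extract phone numbers, emails, and websites from a text document. The instructor wants the output as a list of tuples, each containing the start index, the length, and the match. For example: [(1,10,'0909900008'), (35,16,'contact@viva.com'), ...]. Since there are three different things to extract, how can I put them all into one list of tuples? I came up with three regular expressions, but I can't really get their matches into a single list. Should I create a new expression that describes all three? Thanks for your help.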

import re

result = []  # meant to hold the (start, length, match) tuples

# Match with RE; `string` holds the contents of the text document
email_pattern = r'[\w\.-]+@[\w\.-]+(?:\.[\w]+)+'
for match in re.finditer(email_pattern, string):
    print(match.start(), match.end() - match.start(), match.group())

phone_pattern = r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}'
for match in re.finditer(phone_pattern, string):
    print(match.start(), match.end() - match.start(), match.group())

# Raw string so the backslashes reach the regex engine unchanged
website_pattern = r'(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})'
for match in re.finditer(website_pattern, string):
    print(match.start(), match.end() - match.start(), match.group())

My output:

# Text document
should we use regex more often? let me know at 012345678@student.eng or bbx@gmail.com. To further notice, contact Khoi at 0957507468 or accessing
https://web.de or maybe www.google.com, or Mr.Q at 0912299922.

# Output
47 21 012345678@student.eng
72 13 bbx@gmail.com
122 10 0957507468
197 10 0912299922
146 14 https://web.de
170 15 www.google.com,

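Instead of printing each match, append it to the result list and print the list at the end. That is, change

print(match.start(), match.end() - match.start(), match.group())

to

result.append((match.start(), match.end() - match.start(), match.group()))

Do the same in the other two loops, and finally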

print(result)
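
Putting it together, here is a minimal sketch of that idea, assuming the three patterns from the question are used unchanged. The names PATTERNS and collect_matches are my own, not part of the assignment, and the final sort is an optional extra that orders the tuples by start index, which is what the example in the question appears to want.

```python
import re

# The three patterns from the question, unchanged (written as raw strings).
PATTERNS = [
    r'[\w\.-]+@[\w\.-]+(?:\.[\w]+)+',                       # email
    r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}',                 # phone
    r'(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}'
    r'|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}'
    r'|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}'
    r'|www\.[a-zA-Z0-9]+\.[^\s]{2,})',                      # website
]


def collect_matches(text):
    """Return (start, length, match) tuples for every pattern, in one list."""
    result = []
    for pattern in PATTERNS:
        for match in re.finditer(pattern, text):
            result.append((match.start(), match.end() - match.start(), match.group()))
    # Optional (my assumption): order by position in the text instead of by pattern.
    result.sort(key=lambda item: item[0])
    return result


# The sample text document from the question.
text = (
    "should we use regex more often? let me know at 012345678@student.eng "
    "or bbx@gmail.com. To further notice, contact Khoi at 0957507468 or accessing\n"
    "https://web.de or maybe www.google.com, or Mr.Q at 0912299922."
)

print(collect_matches(text))
```

The trailing comma in 'www.google.com,' is still there because the website pattern ends with [^\s]{2,}, which accepts any non-whitespace character. If that matters for the assignment, the pattern itself would need tightening.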