Extract numerical values that come directly after the "@" symbol from strings in a list
I have a list that looks like the following (for readability, I have placed each element on a new numbered line):
1. NOVO NORDISK A/S 216757.000 SHS @ 5.15000000 EXDTE-22MAR19 PAYDTE-26MAR19 TAXABLE RECLAIMABLE TAXED AT .270000%,
2. NOVO NORDISK A/S 205395.000 SHS @ 3.00000000 EXDTE-16AUG19 PAYDTE-20AUG19 TAXABLE RECLAIMABLE TAXED AT .270000%,
3. NOVO NORDISK A/S TAX RECLAIM PAID 79000.000 EX-DT:20MAR15 PY-DT:24MAR15 CURRENCY TIP # 1150790014348SIEMENS AG 89789.000 SHS @ 3.80000000 EXDATE-31JAN19 PAYDATE-04FEB19 TAXABLE RECLAIMABLE TAXED AT .263750%,
4. UNILEVER NV 171208.000 SHS @ 0.38720000 EXDATE-14FEB19 PAYDATE-20MAR19 TAXABLE W/H @ SOURCE TAXED AT .150000%,
5. ROYAL DUTCH SHELL PLC A SHS 568644.000 SHS @ 0.41810000 EXDTE-14FEB19 PAYDTE-25MAR19 TAXABLE W/H @ SOURCE TAXED AT .150000%,
I am trying to write a script that extracts the number that comes directly after the first "@" symbol.
How would I do this in Python?
Try:
import re

my_list = ['NOVO NORDISK A/S 216757.000 SHS @ 5.15000000 EXDTE-22MAR19 PAYDTE-26MAR19 TAXABLE RECLAIMABLE TAXED AT .270000%',
           'NOVO NORDISK A/S 205395.000 SHS @ 3.00000000 EXDTE-16AUG19 PAYDTE-20AUG19 TAXABLE RECLAIMABLE TAXED AT .270000%']
# [0] takes the first match in each string; this assumes every string contains one.
result = [re.findall(r'@ (\d+\.\d+)', s)[0] for s in my_list]
print(result)
# ['5.15000000', '3.00000000']
Read up on the regular expression module (re) in Python.
import re
l = '''NOVO NORDISK A/S 216757.000 SHS @ 5.15000000 EXDTE-22MAR19 PAYDTE-26MAR19 TAXABLE RECLAIMABLE TAXED AT .270000%,
NOVO NORDISK A/S 205395.000 SHS @ 3.00000000 EXDTE-16AUG19 PAYDTE-20AUG19 TAXABLE RECLAIMABLE TAXED AT .270000%,
NOVO NORDISK A/S TAX RECLAIM PAID 79000.000 EX-DT:20MAR15 PY-DT:24MAR15 CURRENCY TIP # 1150790014348SIEMENS AG 89789.000 SHS @ 3.80000000 EXDATE-31JAN19 PAYDATE-04FEB19 TAXABLE RECLAIMABLE TAXED AT .263750%,
UNILEVER NV 171208.000 SHS @ 0.38720000 EXDATE-14FEB19 PAYDATE-20MAR19 TAXABLE W/H @ SOURCE TAXED AT .150000%,
ROYAL DUTCH SHELL PLC A SHS 568644.000 SHS @ 0.41810000 EXDTE-14FEB19 PAYDTE-25MAR19 TAXABLE W/H @ SOURCE TAXED AT .150000%,'''
# Capture the decimal number following each "@"; the group excludes the whitespace in between.
print(re.findall(r'@\s*(\d*\.\d+)', l))
Edit: I didn't realize the input was a list of separate strings, so here is a version for that.
import re
mylist = ['NOVO NORDISK A/S 216757.000 SHS @ 5.15000000 EXDTE-22MAR19 PAYDTE-26MAR19 TAXABLE RECLAIMABLE TAXED AT .270000%',
'NOVO NORDISK A/S 205395.000 SHS @ 3.00000000 EXDTE-16AUG19 PAYDTE-20AUG19 TAXABLE RECLAIMABLE TAXED AT .270000%',
'NOVO NORDISK A/S TAX RECLAIM PAID 79000.000 EX-DT:20MAR15 PY-DT:24MAR15 CURRENCY TIP # 1150790014348SIEMENS AG 89789.000 SHS @ 3.80000000 EXDATE-31JAN19 PAYDATE-04FEB19 TAXABLE RECLAIMABLE TAXED AT .263750%',
'UNILEVER NV 171208.000 SHS @ 0.38720000 EXDATE-14FEB19 PAYDATE-20MAR19 TAXABLE W/H @ SOURCE TAXED AT .150000%',
'ROYAL DUTCH SHELL PLC A SHS 568644.000 SHS @ 0.41810000 EXDTE-14FEB19 PAYDTE-25MAR19 TAXABLE W/H @ SOURCE TAXED AT .150000%']
# search() stops at the first "@" that is followed by a decimal number in each string.
results = [re.search(r'@\s*(\d*\.\d+)', string).group(1) for string in mylist]
for res in results:
    print(res)
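If some entries might not contain an "@" followed by a number, both `re.findall(...)[0]` and `re.search(...).group()` will raise an exception. Below is a minimal sketch of one way to guard against that and to get numeric values instead of strings. The helper name `rate_after_at`, the `samples` list, and the choice of `Decimal` are my own assumptions, not something from the question; the second sample is just a shortened piece of the third entry, used to illustrate the no-match case.

```python
import re
from decimal import Decimal

# Hypothetical helper: the pattern requires digits on both sides of the dot.
RATE_AFTER_AT = re.compile(r'@\s*(\d+\.\d+)')

def rate_after_at(text):
    """Return the first decimal number that follows an "@" in text, or None."""
    match = RATE_AFTER_AT.search(text)
    return Decimal(match.group(1)) if match else None

samples = [
    'UNILEVER NV 171208.000 SHS @ 0.38720000 EXDATE-14FEB19 PAYDATE-20MAR19 TAXABLE W/H @ SOURCE TAXED AT .150000%',
    # Illustrative string with no "@" at all.
    'NOVO NORDISK A/S TAX RECLAIM PAID 79000.000 EX-DT:20MAR15 PY-DT:24MAR15',
]

rates = [rate_after_at(s) for s in samples]
print(rates)  # [Decimal('0.38720000'), None]
```

`Decimal` keeps the value exactly as written in the text; `float(match.group(1))` would work as well if exact decimal representation is not a concern.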