Extracting wanted numerical values from strings in Python
I have two lists of strings:
['Renewables\n', '17.9% (3,951 MW)\n']
['Solar\n', '27.4% (1,081 MW)\n', 'LATEST SYSTEM\n', 'GENERATION\n', '4,738 MW\n', 'THERMAL GENERATION\n', '(COAL, GAS, OTHER)\n', '54 %\n', 'RENEWABLE\n', 'GENERATION\n', '47.61 %\n']
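The only data I want from each list is the last percentage value, i.e. 17.9 and 47.61. I want to grab these numbers and use them later in the program, where the result depends on which number is larger. The lists are the output of a web scraper.
How can I extract just these values as floats so that I can use them later?
Edit
To be clear, I only want the last percentage value in each list; I don't need any of the MW values or the earlier percentages.
You can use a regular expression: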
import re

s = ['Renewables\n', '17.9% (3,951 MW)\n']
s1 = ['Solar\n', '27.4% (1,081 MW)\n', 'LATEST SYSTEM\n', 'GENERATION\n', '4,738 MW\n', 'THERMAL GENERATION\n', '(COAL, GAS, OTHER)\n', '54 %\n', 'RENEWABLE\n', 'GENERATION\n', '47.61 %\n']
# Match an integer or decimal number followed by optional whitespace and "%",
# searching only the last item of each list. A raw string avoids invalid-escape warnings.
final_results = [float(re.findall(r'\d+(?:\.\d+)?(?=\s*%)', i[-1])[0]) for i in [s, s1]]
Output:
[17.9, 47.61]
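If the last percentage is not guaranteed to sit in the final item of the list, a slightly more defensive variant can scan the items from the end. This is only a sketch: the helper name `last_percentage` and the constant `PERCENT_RE` are made up for illustration, and it assumes percentages are written as a plain number followed by optional whitespace and `%`.

```python
import re

# An integer or decimal number, then optional whitespace, then "%".
PERCENT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def last_percentage(lines):
    """Return the last percentage found in a list of strings as a float, or None."""
    # Walk the list backwards so the first hit is the last percentage overall.
    for line in reversed(lines):
        matches = PERCENT_RE.findall(line)
        if matches:
            return float(matches[-1])
    return None


s = ['Renewables\n', '17.9% (3,951 MW)\n']
s1 = ['Solar\n', '27.4% (1,081 MW)\n', 'LATEST SYSTEM\n', 'GENERATION\n',
      '4,738 MW\n', 'THERMAL GENERATION\n', '(COAL, GAS, OTHER)\n', '54 %\n',
      'RENEWABLE\n', 'GENERATION\n', '47.61 %\n']

renewables = last_percentage(s)
renewable_generation = last_percentage(s1)
```

Returning `None` when nothing matches lets the caller decide how to handle a scrape that produced no percentage, instead of failing with an `IndexError`.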
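Here is a solution without regular expressions that should fit your case well.
The code checks each string for a %, and if one is found, splits on it and takes the number before it.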
examples = ['Solar\n', '27.4% (1,081 MW)\n', 'LATEST SYSTEM\n', 'GENERATION\n', '4,738 MW\n', 'THERMAL GENERATION\n', '(COAL, GAS, OTHER)\n', '54 %\n', 'RENEWABLE\n', 'GENERATION\n', '47.61 %\n']
output = []
for each_string in examples:
    if "%" in each_string:
        # Keep the text before the first "%" and trim surrounding spaces
        number = each_string.split("%")[0].strip(" ")
        output.append(number)
# output == ['27.4', '54', '47.61']
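Since the question asks for floats and only for the last percentage in each list, the split-based idea can be taken one step further. The following is a sketch under the assumption that the text before each `%` is a plain number; the function name `percentages` and the variables `first`, `second` and `larger` are hypothetical.

```python
def percentages(lines):
    """Collect every value that precedes a '%' sign, converted to float."""
    values = []
    for line in lines:
        if "%" in line:
            text = line.split("%")[0].strip()
            try:
                values.append(float(text))
            except ValueError:
                # Skip lines where the text before "%" is not a plain number
                pass
    return values


s = ['Renewables\n', '17.9% (3,951 MW)\n']
s1 = ['Solar\n', '27.4% (1,081 MW)\n', 'LATEST SYSTEM\n', 'GENERATION\n',
      '4,738 MW\n', 'THERMAL GENERATION\n', '(COAL, GAS, OTHER)\n', '54 %\n',
      'RENEWABLE\n', 'GENERATION\n', '47.61 %\n']

# The last element of each result is the last percentage in that list.
first = percentages(s)[-1]
second = percentages(s1)[-1]
larger = max(first, second)
```

Indexing with `[-1]` raises an `IndexError` if a list contains no percentage at all, so a real scraper would probably want to check for an empty result first.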