Convert all strings in a list to float: works on a single list but not when applied to a DataFrame
I have a DataFrame of geolocated tweets, df_tweets. Each geolocation is stored as the string representation of a list in the column geo_code. It looks like this:
# Geocode values are stored as objects/strings
df_tweets.geo_code[0]
#Output:
'[-4.241751 55.858303]'
I tested converting one row of geo_code into a list of float longitude and latitude values:
# Converting string representation of list to list using strip and split
# Can't use json.loads() or ast.literal_eval() because there's no comma delimiter
#--- Test with one tweet ----#
ini_list = df_tweets.geo_code[0]
# Converting string to list, but it will convert
# the lon and lat values to strings
# i.e. ['-4.241751', '55.858303']
results = ini_list.strip('][').split(' ')
# So, we must convert string lon and lat to floats
results = list(map(float, results))
# printing final result and its type
print("final list", results)
print(type(results))
This gives me:
# Output:
final list [-4.241751, 55.858303]
<class 'list'>
Success! Except not. I wrapped it in a helper function:
def str_to_float_list(list_as_str):
    '''
    Function to convert a string representation
    of a list into a list of floats
    using strip and split, when you can't use json.loads()
    or ast.literal_eval() because there's no comma delimiter
    Parameter:
    list_as_str = string representation of a list.
    '''
    # Convert string to list
    str_list = list_as_str.strip('][').split(' ')
    # Convert strings inside list to float
    float_list = list(map(float, str_list[0]))
    return float_list
And when I run:
df_tweets['geocode'] = df_tweets['geo_code'].apply(str_to_float_list)
It raises a ValueError as soon as it reaches the minus sign (-). I don't understand why. What am I missing?
Here is the full error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-94-c1035312dc12> in <module>()
20
21
---> 22 df_tweets['geocode'] = df_tweets['geo_code'].apply(str_to_float_list)
1 frames
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-94-c1035312dc12> in str_to_float_list(list_as_str)
15
16 # Convert strings inside list to float
---> 17 float_list = list(map(float, str_list[0]))
18
19 return float_list
ValueError: could not convert string to float: '-'
On your line 17,
float_list = list(map(float, str_list[0]))
you don't need the index. Pass the whole list to map, like this:
float_list = list(map(float, str_list))
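The reason is that str_list[0] is a single string ('-4.241751'), not a list. map iterates over whatever it is given, and iterating over a string yields one character at a time, so float is called on '-' first and fails immediately. Had that succeeded, it would have gone on to '4', '.', and so on. Your single-row test worked because it passed the whole list (results) to map rather than its first element.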
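To make this concrete, here is a minimal sketch that reproduces the error on one sample string and then applies the fix. It uses plain Python so it runs without pandas. The use of split() with no argument is my own assumption, not part of the original code: it splits on any run of whitespace, so a value with two spaces between the numbers would not produce an empty string.

```python
def str_to_float_list(list_as_str):
    """Convert a string such as '[-4.241751 55.858303]' into a list of floats."""
    # Assumption: split() with no argument instead of split(' '),
    # so repeated spaces do not leave empty strings in the list.
    str_list = list_as_str.strip('][').split()
    # Pass the whole list to map, not str_list[0]
    return list(map(float, str_list))


sample = '[-4.241751 55.858303]'
str_list = sample.strip('][').split(' ')

# Indexing with [0] gives one string; iterating a string yields characters
print(list(str_list[0])[:3])        # ['-', '4', '.']

try:
    list(map(float, str_list[0]))   # the original bug
except ValueError as exc:
    print(exc)                      # could not convert string to float: '-'

print(str_to_float_list(sample))    # [-4.241751, 55.858303]
```

With the helper corrected this way, the original call df_tweets['geo_code'].apply(str_to_float_list) should work unchanged, provided every row has the same bracketed, space-separated format.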