TypeError: 'float' object is not iterable on a list in built in max function
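I am trying to use the max function and its key parameter to find the closest match for an approximate movie title among a list of actual movie titles.
If I define a sample list and test the function, it works...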
from difflib import SequenceMatcher as SM
movies = ['fake movie title', 'faker movie title', 'shaun died']
approx_title = 'Shaun of the Dead.'
max(movies, key = lambda title: SM(None, approx_title, title).ratio())
'shaun died'
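But I am trying to match against an entire column in a separate dataframe, so I tried converting the pandas Series to a list and running the same function. That raises a TypeError, even though I have checked the data types and both movies and movie_lst are lists.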
Old id New id Title Year Critics Score Audience Score Rating
NaN 21736.0 Peter Pan 1999.0 NaN 70.0 PG nothing objectionable
NaN 771471359.0 Dragonheart Battle for the Heartfire 2017.0 NaN 50.0 PG13
NaN 770725090.0 The Nude Vampire Vampire nue, La 1974.0 NaN 24.0 NR
2281.0 19887.0 Beyond the Clouds 1995.0 65.0 67.0 NR
10913.0 11286.0 Wild America 1997.0 27.0 59.0 PG violence
movie_lst = rt_info['Title'].tolist()
['Peter Pan',
'Dragonheart Battle for the Heartfire',
'The Nude Vampire Vampire nue, La',
'Beyond the Clouds',
'Wild America',
'Sexual Dependency',
'Body Slam',
'Hatchet II',
'Lion of the Desert Omar Mukhtar',
'Imagine That',
'Harold',
'A United Kingdom',
'Violent City The FamilyCitt violenta',
'Ratchet Clank',
'Wes Craven Presents Carnival of Souls',
'The Adventures of Ociee Nash',
'Blackfish',
'For Petes Sake',
'Daybreakers',
'The Big One',
'Godzilla vs Megaguirus',
'In a Lonely Place',
'Case 39', ...
]
max(movie_lst, key = lambda title: SM(None, approx_title, title).ratio())
TypeError Traceback (most recent call last)
<ipython-input-88-0022a3c1bdb9> in <module>()
----> 1 max(movie_lst, key = lambda title: SM(None, approx_title, title).ratio())
<ipython-input-88-0022a3c1bdb9> in <lambda>(title)
----> 1 max(movie_lst, key = lambda title: SM(None, approx_title, title).ratio())
/usr/lib/python3.4/difflib.py in __init__(self, isjunk, a, b, autojunk)
211 self.a = self.b = None
212 self.autojunk = autojunk
--> 213 self.set_seqs(a, b)
214
215 def set_seqs(self, a, b):
/usr/lib/python3.4/difflib.py in set_seqs(self, a, b)
223
224 self.set_seq1(a)
--> 225 self.set_seq2(b)
226
227 def set_seq1(self, a):
/usr/lib/python3.4/difflib.py in set_seq2(self, b)
277 self.matching_blocks = self.opcodes = None
278 self.fullbcount = None
--> 279 self.__chain_b()
280
281 # For each element x in b, set b2j[x] to a list of the indices in
/usr/lib/python3.4/difflib.py in __chain_b(self)
309 self.b2j = b2j = {}
310
--> 311 for i, elt in enumerate(b):
312 indices = b2j.setdefault(elt, [])
313 indices.append(i)
TypeError: 'float' object is not iterable
I'm confused as to why. Any help would be appreciated!
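I'm not a pandas expert and can't reproduce this, but it depends on how the file was read: some titles (for example the French film 11.6) look like floating-point numbers, so it is possible that some of the values are float instead of str (your question proves that it *is* possible :))
A good workaround is to force the data to strings like this: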
movie_lst = [str(x) for x in movie_lst]
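If a value is already a string, str() does not create a copy of it (see "Should I avoid converting to a string if a value is already a string?"), so this is efficient, and you are sure to end up with strings only.
Note that you can find the offenders by printing: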
[x for x in movie_lst if not isinstance(x,str)]
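To make the two steps above concrete, here is a minimal sketch that runs without pandas. The sample list is hypothetical: it assumes the real column contains a missing title (pandas stores missing values as NaN, which is a float) and a numeric-looking title, which may or may not be what your data holds. The helper name clean is made up for illustration. Note that a plain str(x) would turn NaN into the string 'nan' and leave it in the list as a candidate title, so this sketch drops NaN instead.

```python
from difflib import SequenceMatcher as SM

# Hypothetical sample: one missing value (NaN) and one title parsed as a float.
movie_lst = ['Peter Pan', float('nan'), 'Shaun of the Dead', 11.6]
approx_title = 'Shaun of the Dead.'

# Step 1: locate the entries that are not strings.
offenders = [x for x in movie_lst if not isinstance(x, str)]


def clean(values):
    """Drop NaN entries and convert everything else to str."""
    cleaned = []
    for x in values:
        # NaN is the only float value that is not equal to itself.
        if isinstance(x, float) and x != x:
            continue
        cleaned.append(str(x))
    return cleaned


# Step 2: match against the cleaned list only.
titles = clean(movie_lst)
best = max(titles, key=lambda title: SM(None, approx_title, title).ratio())
print(offenders)
print(best)
```

If you would rather fix this on the pandas side, rt_info['Title'].dropna().astype(str).tolist() should give a comparable list, assuming missing titles are the cause.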