Tuple iteration vectorization
I have some pandas code that iterates over a list of tuples, and I'm trying to vectorize it.
The list of tuples I'm iterating over looks like this:
[('Morden', 35672, 'Morden Hall Park, Surrey'),
('Morden', 73995, 'Morden Hall Park, Surrey'),
('Newbridge', 120968, 'Newbridge, Midlothian'),
('Stroud', 127611, 'Stroud, Gloucestershire')]
The working tuple-iteration code is:
for tuple_ in result_tuples:
    listing_looking_ins1.loc[:, 'looking_in']\
        [(listing_looking_ins1.listing_id == tuple_[1]) &
         (listing_looking_ins1.looking_in == tuple_[0])] = tuple_[2]
I tried writing a function to use with the apply method, but it doesn't work:
result_tuples_df = pd.DataFrame(result_tuples)
def replace_(row):
    row.loc[:, 'looking_in'][(listing_looking_ins1.listing_id
                              == result_tuples_df[1]) &
                             (listing_looking_ins1.looking_in == result_tuples_df[0])] \
        = result_tuples_df[2]
listing_looking_ins1.apply(replace_, axis=1)
Thanks!
You can convert your list of tuples to a DataFrame and merge it with the original:
# Column names follow the tuple layout: (looking_in, listing_id, result)
result_tuples_df = pd.DataFrame(result_tuples,
                                columns=['looking_in', 'listing_id', 'result'])
# With no `on` argument, merge joins on the shared columns listing_id and looking_in
df = listing_looking_ins1.merge(result_tuples_df)
print(df)
Output:
   listing_id looking_in                    result
0       35672     Morden  Morden Hall Park, Surrey
1       73995     Morden  Morden Hall Park, Surrey
2      120968  Newbridge     Newbridge, Midlothian
3      127611     Stroud   Stroud, Gloucestershire
Then, if you want the result in the looking_in column:
df.drop(columns='looking_in').rename(columns={'result': 'looking_in'})
Output:
   listing_id                looking_in
0       35672  Morden Hall Park, Surrey
1       73995  Morden Hall Park, Surrey
2      120968     Newbridge, Midlothian
3      127611   Stroud, Gloucestershire
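One caveat worth sketching: `merge` performs an inner join by default, so rows of `listing_looking_ins1` with no matching tuple are dropped, whereas the original loop leaves them unchanged. Below is a minimal sketch of a left merge that keeps those rows. The contents of `listing_looking_ins1` are an assumption (the real frame is not shown in the question), and the `'Leeds'` row with ID `999` is made up purely to illustrate an unmatched row.

```python
import pandas as pd

result_tuples = [
    ('Morden', 35672, 'Morden Hall Park, Surrey'),
    ('Morden', 73995, 'Morden Hall Park, Surrey'),
    ('Newbridge', 120968, 'Newbridge, Midlothian'),
    ('Stroud', 127611, 'Stroud, Gloucestershire'),
]

# Hypothetical stand-in for the real frame; the last row has no matching tuple.
listing_looking_ins1 = pd.DataFrame({
    'listing_id': [35672, 73995, 120968, 127611, 999],
    'looking_in': ['Morden', 'Morden', 'Newbridge', 'Stroud', 'Leeds'],
})

result_tuples_df = pd.DataFrame(
    result_tuples, columns=['looking_in', 'listing_id', 'result'])

# Left join keeps every original row; unmatched rows get NaN in 'result'.
merged = listing_looking_ins1.merge(
    result_tuples_df, on=['listing_id', 'looking_in'], how='left')

# Use the replacement where one exists, otherwise keep the old value.
merged['looking_in'] = merged['result'].fillna(merged['looking_in'])
updated = merged.drop(columns='result')
print(updated)
```

This assumes each `(listing_id, looking_in)` pair appears at most once in `result_tuples`; duplicate keys there would multiply the matching rows in the merged frame.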
P.S. In your code, you are setting values with:
listing_looking_ins1.loc[:,'looking_in'][...] = ...
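This is chained indexing, which may set the values on a copy of the DataFrame rather than on the DataFrame itself. See "How to deal with SettingWithCopyWarning in Pandas?" for why this happens and how to avoid it.
P.P.S. Since you asked about vectorization and using apply, you may also want to look at this answer on the performance of different operations.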
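As a rough illustration of the usual way to avoid chained indexing, the row mask and the column label can be passed to a single `.loc` call. This is only a sketch: the two-row `listing_looking_ins1` below is a hypothetical stand-in for the real frame, and just one tuple is applied.

```python
import pandas as pd

# Hypothetical stand-in for the real frame.
listing_looking_ins1 = pd.DataFrame({
    'listing_id': [35672, 120968],
    'looking_in': ['Morden', 'Newbridge'],
})

tuple_ = ('Morden', 35672, 'Morden Hall Park, Surrey')

# Build the boolean row mask first ...
mask = ((listing_looking_ins1['listing_id'] == tuple_[1]) &
        (listing_looking_ins1['looking_in'] == tuple_[0]))

# ... then select rows and column in one .loc call, so the assignment
# targets the original DataFrame rather than an intermediate object.
listing_looking_ins1.loc[mask, 'looking_in'] = tuple_[2]
print(listing_looking_ins1)
```

The same pattern drops into the original `for` loop unchanged, although the merge approach avoids the loop altogether.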