How to label with Pandas (.map)?
I want to label the rows of the table below using pandas (.map):
m2m_similarity.columns = ['MoviId 1','MoviId 2','similarity_score']
m2m_similarity.head(3)
I tried to assign labels along the lines of slightly similar, similar, and exactly:
m2m_similarity['analysis'] = m2m_similarity['similarity_score'].map({
0.1: 'slightly-similar', 0.2: 'slightly-similar', 0.3: 'slightly-similar', 0.4: 'slightly-similar',
0.5: 'similar', 0.6: 'similar', 0.7: 'similar', 0.8: 'similar',0.9: 'similar',
1.0: 'Exacly'
})
m2m_similarity.head(3)
The result is NaN.
Try using:
m2m_similarity['analysis'] = m2m_similarity['similarity_score'].replace({
0.1: 'slightly-similar', 0.2: 'slightly-similar', 0.3: 'slightly-similar', 0.4: 'slightly-similar',
0.5: 'similar', 0.6: 'similar', 0.7: 'similar', 0.8: 'similar',0.9: 'similar',
1.0: 'Exacly'
})
A better approach is:
m2m_similarity['analysis'] = m2m_similarity['similarity_score'].map(lambda s: 'Exacly' if round(s, 2) == 1 else ('similar' if round(s, 2) >= 0.5 else 'slightly-similar'))
This covers all the values in between as well, not only the exact keys listed in the dictionary.
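To make the difference concrete, here is a minimal sketch comparing the two approaches. The sample scores and the shortened `lookup` dictionary are made up for illustration, since the real `m2m_similarity` data is not shown, and the named function `label` is just the lambda above written out.

```python
import pandas as pd

# Hypothetical sample scores; the real m2m_similarity table is not shown.
scores = pd.Series([0.1, 0.37, 0.5, 0.83, 1.0])

# A dictionary passed to .map is an exact-match lookup:
# any score that is not literally one of the keys becomes NaN.
lookup = {0.1: 'slightly-similar', 0.5: 'similar', 1.0: 'Exacly'}
by_dict = scores.map(lookup)

# The same rule as the lambda above, written as a named function.
def label(s):
    s = round(s, 2)
    if s == 1:
        return 'Exacly'
    return 'similar' if s >= 0.5 else 'slightly-similar'

# A function passed to .map is called on every value, so nothing is left unlabeled.
by_rule = scores.map(label)

print(pd.DataFrame({'score': scores, 'by_dict': by_dict, 'by_rule': by_rule}))
```

In this sketch 0.37 and 0.83 have no key in the dictionary, so the dictionary version leaves them as NaN while the threshold version labels them.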
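In any case, make sure that similarity_score holds numbers rather than strings, and check whether the values are actually higher-precision floats of which you are only seeing the first digit.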
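One way to run both checks is sketched below. The frame `df` and its values are hypothetical, and it assumes the scores may have been read in as strings; `pd.to_numeric` with `errors='coerce'` is one option for converting them, not necessarily what the original data needs.

```python
import pandas as pd

# Hypothetical frame in which the scores arrived as strings.
df = pd.DataFrame({'similarity_score': ['0.1', '0.5234', '1.0', 'n/a']})
print(df['similarity_score'].dtype)  # object means strings, not numbers

# Convert to floats; errors='coerce' turns unparseable entries into NaN
# instead of raising an exception.
df['similarity_score'] = pd.to_numeric(df['similarity_score'], errors='coerce')

# Inspect the stored values themselves rather than a rounded display,
# to see whether a score such as 0.5 is really 0.5 or something like 0.5234.
full_values = df['similarity_score'].tolist()
print(full_values)
```

If the list shows more digits than the table display did, a dictionary keyed on 0.1, 0.2, and so on cannot match those scores exactly.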