Python, pandas print most frequent 1-1000 from csv
I have the following code:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
data1 = pd.read_csv('11-01 412-605.csv', low_memory=False)
d412 = pd.DataFrame(data1, columns=['size', 'price', 'date'])
new_df = pd.value_counts(d412['size']).reset_index()
new_df.columns = ['size', 'frequency']
print(new_df)
new_df.to_csv('empty.csv', index=False, header=True)
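Output:
[image: output]
However, I only want to print the values whose count is between 1 and 1000. How can I do that? Right now it prints every value.
I tried: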
new_df = pd.value_counts(d412['size']<1000).reset_index()
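But this doesn't work: the comparison d412['size']<1000 is evaluated first, so the result only counts how many sizes are True or False for "less than 1000" instead of filtering by frequency.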
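To illustrate why that attempt produces True/False counts, here is a minimal sketch. The values in `sizes` are made up for the example and are not taken from the CSV in the question.

```python
import pandas as pd

# Made-up values standing in for the 'size' column.
sizes = pd.Series([100, 100, 5000, 200])

# The comparison runs first and produces a boolean Series...
mask = sizes < 1000

# ...so value_counts tallies True/False rather than the sizes themselves.
bool_counts = mask.value_counts()
print(bool_counts)
```

The filter therefore has to be applied to the frequency column after counting, not to the raw sizes before counting.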
Try:
print(new_df.loc[new_df['frequency'] < 1000, :])
If I've misunderstood which column holds the counts, replace 'frequency' with 'size'.
Welcome to Stack Overflow!
From the reference for Series.value_counts, it's clear that value_counts() doesn't allow filtering the values. You can filter the data in a later step using DataFrame.loc, as others have also mentioned. So the following code will work:
new_df = pd.value_counts(d412['size']).reset_index()
new_df.columns = ['size', 'frequency']
print(new_df.loc[new_df['frequency'] <= 1000])
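If both ends of the range matter (counts from 1 up to 1000), `Series.between` is one way to express it. The sketch below uses a small made-up frame and a 1 to 2 range so the effect is visible; with the real data the bounds would be 1 and 1000.

```python
import pandas as pd

# Made-up sample data; the real 'size' column comes from the CSV in the question.
d412 = pd.DataFrame({"size": [100, 100, 100, 200, 200, 300]})

# Count each distinct size, then build a frame with explicit column names.
counts = d412["size"].value_counts()
new_df = pd.DataFrame({"size": counts.index, "frequency": counts.to_numpy()})

# Keep only rows whose count lies in an inclusive range
# (1..2 here for illustration; 1..1000 in the question).
filtered = new_df[new_df["frequency"].between(1, 2)]
print(filtered)
```

Building the frame from `counts.index` and `counts.to_numpy()` is a choice made for this sketch; it avoids depending on how `reset_index()` names the columns, which differs between pandas versions.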