Make a frequency histogram from a list with tuple elements
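I want to make a word frequency distribution, with the words on the x-axis and the frequency count on the y-axis.
I have the following list: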
example_list = [('dhr', 17838), ('mw', 13675), ('wel', 5499), ('goed', 5080),
('contact', 4506), ('medicatie', 3797), ('uur', 3792),
('gaan', 3473), ('kwam', 3463), ('kamer', 3447),
('mee', 3278), ('gesprek', 2978)]
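I tried to first convert it to a pandas DataFrame and then use pd.hist(),
as in the example below, but I can't figure it out. I think it is actually straightforward and I am probably missing something.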
import numpy as np
import matplotlib.pyplot as plt
word = []
frequency = []
for i in range(len(example_list)):
    word.append(example_list[i][0])
    frequency.append(example_list[i][1])
plt.bar(word, frequency, color='r')
plt.show()
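Depending on your matplotlib version, you can't pass word into matplotlib.pyplot.bar directly (only newer releases accept a list of strings there). However, you could create an indices array for bar and then replace these indices with the words using matplotlib.pyplot.xticks: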
import numpy as np
import matplotlib.pyplot as plt
indices = np.arange(len(example_list))
plt.bar(indices, frequency, color='r')
plt.xticks(indices, word, rotation='vertical')
plt.tight_layout()
plt.show()
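For reference, here is the same idea as a self-contained sketch that uses the Axes methods instead of the pyplot functions. The shortened list, the off-screen Agg backend, and the output file name word_frequency.png are my own assumptions for illustration, not part of the question:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; drop this line and use plt.show() for a window
import matplotlib.pyplot as plt
import numpy as np

# Shortened copy of the list from the question.
example_list = [('dhr', 17838), ('mw', 13675), ('wel', 5499), ('goed', 5080)]

# Split the (word, count) pairs into two parallel tuples.
word, frequency = zip(*example_list)

# One bar per word, placed at the integer positions 0..n-1.
indices = np.arange(len(example_list))

fig, ax = plt.subplots()
ax.bar(indices, frequency, color='r')

# Put a tick under every bar and label it with the matching word.
ax.set_xticks(indices)
ax.set_xticklabels(word, rotation='vertical')
ax.set_xlabel('word')
ax.set_ylabel('frequency')

fig.tight_layout()
fig.savefig('word_frequency.png')  # hypothetical file name
```

Keeping explicit fig and ax objects makes it easier to add axis labels or to draw several plots in one figure later on.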
The for loop that creates word and frequency could also be replaced by a simple zip and list unpacking:
word, frequency = zip(*example_list)
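To see what that one-liner produces, here is a small sketch on a shortened copy of the list (the shortening is mine):

```python
example_list = [('dhr', 17838), ('mw', 13675), ('wel', 5499)]

# zip(*example_list) is the same as zip(('dhr', 17838), ('mw', 13675), ('wel', 5499)):
# it pairs up all first elements and all second elements.
word, frequency = zip(*example_list)

print(word)       # ('dhr', 'mw', 'wel')
print(frequency)  # (17838, 13675, 5499)
```

Note that word and frequency are tuples here rather than lists, which is fine for plotting. With an empty example_list the unpacking raises a ValueError, so guard against that case if the list can be empty.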
Using pandas:
import pandas as pd
import matplotlib.pyplot as plt
example_list = [('dhr', 17838), ('mw', 13675), ('wel', 5499), ('goed', 5080), ('contact', 4506), ('medicatie', 3797), ('uur', 3792), ('gaan', 3473), ('kwam', 3463), ('kamer', 3447), ('mee', 3278), ('gesprek', 2978)]
df = pd.DataFrame(example_list, columns=['word', 'frequency'])
df.plot(kind='bar', x='word')
plt.show()
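As for why hist() did not help: DataFrame.hist bins the values of a numeric column, so it would show how the counts themselves are distributed rather than one bar per word. Since the list already holds the counts, a bar plot is the right fit. Below is a slightly fuller sketch of the pandas route. The shortened list, the explicit sort, the Agg backend, and the file name are assumptions of mine:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; drop this line and use plt.show() for a window
import matplotlib.pyplot as plt
import pandas as pd

# Shortened copy of the list from the question.
example_list = [('dhr', 17838), ('mw', 13675), ('wel', 5499), ('goed', 5080)]
df = pd.DataFrame(example_list, columns=['word', 'frequency'])

# Sort explicitly so the most frequent word comes first,
# even if the input list is not already ordered.
df = df.sort_values('frequency', ascending=False)

# One bar per row: words on the x-axis, counts on the y-axis.
ax = df.plot(kind='bar', x='word', y='frequency', color='r', legend=False)
ax.set_ylabel('frequency')

ax.figure.tight_layout()
ax.figure.savefig('word_frequency_pandas.png')  # hypothetical file name
```

df.plot returns the matplotlib Axes, so any further tweaking works the same way as in the pure matplotlib version.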