Change a list of dictionaries to feature vector and target in scikit

Question

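I have a list of dictionaries containing features and a classification label, which I read from a CSV. How can I split it into the numpy arrays that scikit requires for a classification task?
My code so far: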

from sklearn.feature_extraction import DictVectorizer

rowdicts = [{'feature1': 4, 'feature2': 2, 'target': "yes", "feature3": 0},
            {'feature1': 3, 'feature2': 2, 'target': "no", "feature3": 1}]

vec1 = DictVectorizer(sparse=False)
X = vec1.fit_transform(rowdicts)

For a classification task, what is a good way to remove the target label from the output of the vectorizer above?

Answer 1

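You can use get_feature_names to find out what each column means (in scikit-learn 1.0 and later the method is called get_feature_names_out; get_feature_names was removed in 1.2):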

print(vec1.get_feature_names())

Output:

['feature1', 'feature2', 'feature3', 'target=no', 'target=yes']
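Rather than reading the column positions off the printed list by eye, they can also be looked up by name. The following is a minimal sketch, assuming the target key is literally called 'target' as in the question; it uses the fitted vectorizer's feature_names_ attribute, and the names X_all and target_cols are only illustrative.

```python
from sklearn.feature_extraction import DictVectorizer

rowdicts = [
    {'feature1': 4, 'feature2': 2, 'target': "yes", "feature3": 0},
    {'feature1': 3, 'feature2': 2, 'target': "no", "feature3": 1},
]

vec1 = DictVectorizer(sparse=False)
X_all = vec1.fit_transform(rowdicts)

# feature_names_ holds the column names in the same order as the columns of X_all.
names = vec1.feature_names_

# String-valued entries are one-hot encoded as "<key>=<value>", so every
# column that came from the 'target' key starts with "target=".
target_cols = [i for i, name in enumerate(names) if name.startswith("target=")]
print(target_cols)  # [3, 4]
```

Note that the single 'target' key expands into one column per distinct label value, so there are two target columns here, not one.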

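Now we know that columns 3 and 4 (target=no and target=yes) both hold the one-hot encoded target, so we can delete them:

import numpy
X = numpy.delete(X, [3, 4], axis=1)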
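Another option is to separate the label from each dictionary before vectorizing, so that the target never enters the feature matrix at all. This is only a sketch under the assumption that every row has a 'target' key; feature_dicts, vec, and y are illustrative names, not part of the original code.

```python
import numpy as np
from sklearn.feature_extraction import DictVectorizer

rowdicts = [
    {'feature1': 4, 'feature2': 2, 'target': "yes", "feature3": 0},
    {'feature1': 3, 'feature2': 2, 'target': "no", "feature3": 1},
]

# Build label-free copies of the rows; the original dicts are left unchanged.
feature_dicts = [
    {key: value for key, value in row.items() if key != 'target'}
    for row in rowdicts
]

# Collect the labels in the same row order as the features.
y = np.array([row['target'] for row in rowdicts])

vec = DictVectorizer(sparse=False)
X = vec.fit_transform(feature_dicts)

print(vec.feature_names_)  # ['feature1', 'feature2', 'feature3']
print(X.shape)             # (2, 3)
print(y)
```

X and y can then be passed directly to a classifier's fit method, since scikit-learn classifiers accept string labels for y.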
