How to get the Cartesian product of OrdinalEncoder categories in scikit-learn?

Code:

import itertools
import pandas as pd
from sklearn import preprocessing


data = pd.read_csv("data.csv")
feature_cols = ['country', 'city', 'pay_type']
X = data[feature_cols]
y = data.label
oe = preprocessing.OrdinalEncoder()
X = oe.fit_transform(X)
print(oe.categories_)
for element in itertools.product(oe.categories_):
    print(element)

And the output:

[array(['Saudi Arabia'], dtype=object), array(['Dammam', 'Jeddah', 'Madinah', 'Makkah', 'Riyadh', 'Taif'],
      dtype=object), array(['COD', 'PREPAID'], dtype=object)]
(array(['Saudi Arabia'], dtype=object),)
(array(['Dammam', 'Jeddah', 'Madinah', 'Makkah', 'Riyadh', 'Taif'],
      dtype=object),)
(array(['COD', 'PREPAID'], dtype=object),)

From the output, we have 3 features, with the following values:

Country: 'Saudi Arabia'
City: 'Dammam', 'Jeddah', 'Madinah', 'Makkah', 'Riyadh', 'Taif'
Pay type: 'COD', 'PREPAID'

I want to get the Cartesian product of all the feature values, i.e.:

('Saudi Arabia', 'Dammam', 'COD')
('Saudi Arabia', 'Dammam', 'PREPAID')
('Saudi Arabia', 'Jeddah', 'COD')
...

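I tried itertools.product, but it only outputs 3 elements instead of the Cartesian product.

Can anyone give me some hints on how to get the Cartesian product of these feature values?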

You're close. You need to do the following:

...
for element in itertools.product(*oe.categories_):
    print(element)

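Basically, you have to unpack oe.categories_ with * so that each of its arrays is passed to itertools.product as a separate iterable. Without the *, itertools.product receives a single iterable (the list itself) and yields one 1-tuple per array, which is why you saw only 3 elements.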
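To see the difference without needing data.csv, here is a minimal sketch. It assumes plain Python lists as a stand-in for oe.categories_ (which really holds NumPy arrays), filled with the values from the output in the question; the variable names are illustrative only.

```python
import itertools

# Stand-in for oe.categories_: plain lists instead of NumPy arrays,
# using the category values shown in the question's output.
categories = [
    ["Saudi Arabia"],
    ["Dammam", "Jeddah", "Madinah", "Makkah", "Riyadh", "Taif"],
    ["COD", "PREPAID"],
]

# Without unpacking: product() gets ONE iterable (the outer list),
# so it yields a 1-tuple for each inner list.
without_star = list(itertools.product(categories))

# With unpacking: each inner list becomes its own argument,
# so product() yields every (country, city, pay_type) combination.
with_star = list(itertools.product(*categories))

print(len(without_star))  # 3
print(len(with_star))     # 12 (1 * 6 * 2)
for combo in with_star[:3]:
    print(combo)
```

The same unpacking works on the real oe.categories_, since itertools.product accepts any iterables, including NumPy arrays.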