How to avoid the error "TypeError: invalid data type for einsum" in Python
I am trying to load a CSV file into a NumPy array and use that array in LogisticRegression and similar estimators. Right now I am stuck on the error shown below:
import numpy as np
import pandas as pd
from sklearn import metrics, preprocessing
from sklearn.linear_model import LogisticRegression
dataset = pd.read_csv('../Bookie_test.csv').values
X = dataset[1:, 32:34]
y = dataset[1:, 14]
# normalize the data attributes
normalized_X = preprocessing.normalize(X)
# standardize the data attributes
standardized_X = preprocessing.scale(X)
model = LogisticRegression()
model.fit(X, y)
print(model)
# make predictions
expected = y
predicted = model.predict(X)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
I get this error:
> C:\Anaconda32\lib\site-packages\sklearn\utils\validation.py:332: UserWarning: The normalize function assumes floating point values as input, got object
>   "got %s" % (estimator, X.dtype))
> Traceback (most recent call last):
>   File "X:/test3.py", line 23, in <module>
>     normalized_X = preprocessing.normalize(X)
>   File "C:\Anaconda32\lib\site-packages\sklearn\preprocessing\data.py", line 553, in normalize
>     norms = row_norms(X)
>   File "C:\Anaconda32\lib\site-packages\sklearn\utils\extmath.py", line 65, in row_norms
>     norms = np.einsum('ij,ij->i', X, X)
> TypeError: invalid data type for einsum
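
The warning above hints at the likely cause: `X` has dtype `object` rather than a floating point dtype, which is what `DataFrame.values` returns when the frame mixes text and numeric columns. Below is a minimal sketch of that situation and one possible fix, an explicit cast to float. The small DataFrame and its column names are hypothetical stand-ins for the real CSV, and the cast assumes the selected columns hold only numeric values.

```python
import numpy as np
import pandas as pd
from sklearn import preprocessing

# Hypothetical stand-in for the CSV: a text column next to numeric ones
# makes DataFrame.values fall back to dtype=object.
df = pd.DataFrame({
    "team": ["a", "b"],
    "odds_home": [3.0, 1.0],
    "odds_away": [4.0, 1.0],
})
dataset = df.values
X = dataset[:, 1:3]
print(X.dtype)  # object

# Cast the selected columns to float before handing them to scikit-learn.
# This raises ValueError if a cell cannot be parsed as a number.
X_float = X.astype(np.float64)
normalized_X = preprocessing.normalize(X_float)
print(normalized_X)
```

Note also that `pd.read_csv` treats the first row of the file as the header by default, so slicing with `dataset[1:, ...]` skips the first data row rather than the header.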
I am new to Python and don't like this chain of transformations:
- load the CSV into Pandas
- convert Pandas to NumPy
- use NumPy in LogisticRegression
Is there a simpler way, such as:
- load into Pandas
- use the Pandas DataFrame directly in the ML method?
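Regarding the main question: thanks to Evert for the suggestion, I will check it.
Regarding #2: I found a good tutorial, http://www.markhneedham.com/blog/2013/11/09/python-making-scikit-learn-and-pandas-play-nice/, and got the expected result with pandas + sklearn.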
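
For illustration, here is a minimal sketch of the pandas + sklearn workflow. It is not the code from the linked tutorial. The DataFrame contents and column names are invented stand-ins for the real CSV; the idea is to select the feature columns by name, coerce them to numeric, and pass the DataFrame to scikit-learn without a manual conversion to NumPy.

```python
import pandas as pd
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for pd.read_csv('../Bookie_test.csv');
# the column names and values are invented for this example.
df = pd.DataFrame({
    "team": ["a", "b", "c", "d", "e", "f"],
    "odds_home": ["1.5", "2.9", "1.2", "3.4", "1.8", "4.0"],  # numbers stored as text
    "odds_away": [2.6, 1.4, 3.1, 1.3, 2.2, 1.1],
    "result": [1, 0, 1, 0, 1, 0],
})

# Select features by name and force a numeric dtype;
# unparseable cells become NaN instead of raising.
feature_cols = ["odds_home", "odds_away"]
X = df[feature_cols].apply(pd.to_numeric, errors="coerce")
y = df["result"]

# scikit-learn accepts the DataFrame directly.
X_scaled = StandardScaler().fit_transform(X)
model = LogisticRegression()
model.fit(X_scaled, y)

# Summarize the fit on the training data.
predicted = model.predict(X_scaled)
print(metrics.classification_report(y, predicted))
print(metrics.confusion_matrix(y, predicted))
```

With `errors="coerce"`, any NaN values would need to be dropped or imputed before fitting; the sample data here contains none.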