Prediction percentage comes out wrong (logical error)
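In the code below I am trying to predict the probability of diabetes. In the part where I compute the percentage of True and False values in the data frame, the code responsible looks correct to me, but it gives the wrong output.
Input:
Number of True: 268
Number of False: 500
Expected output:
True: 34.90% ------ False: 65.10%
34.90 + 65.10 = 100.00
What I get:
True: 34.90% ------ False: 50.00%
34.90 + 50.00 != 100
This is strange! Because we only have True and False (50%, 50%)
Here is my code: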
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
df = pd.read_csv('pima-data-Copy1.csv')
df.isnull().values.any()
dibetes_map = {True:1, False:0}
df['diabetes'] = df['diabetes'].map(dibetes_map)
num_true = len(df.loc[df['diabetes'] == True])
num_false = len(df.loc[df['diabetes'] == False])
print("number of true: {0} ({1:2.2f}%)".format(num_true, (num_true/(num_true + num_false))*100))
print("number of false: {0} ({1:2.2f}%)".format(num_false, (num_false/(num_false + num_false))*100))
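You have num_false + num_false in the denominator of the second print statement, where it should be num_true + num_false. That is why the False share comes out as 500 / (500 + 500) = 50%. Here are some suggested simplifications: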
import pandas as pd

df = pd.read_csv('pima-data-Copy1.csv')
# df.isnull().values.any()  # the result isn't used anywhere
# dibetes_map = {True: 1, False: 0}
# df['diabetes'] = df['diabetes'].map(dibetes_map)  # redundant, since you compare with True/False anyway
num_true = df['diabetes'].sum()   # True counts as 1, so the sum is the number of True rows
total = df['diabetes'].count()    # number of non-null rows
num_false = total - num_true
print("number of true: {0} ({1:2.2f}%)".format(num_true, (num_true / total) * 100))
print("number of false: {0} ({1:2.2f}%)".format(num_false, (num_false / total) * 100))
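As a cross-check, here is a minimal sketch that lets pandas compute the shares, so no denominator is typed by hand. It does not read the CSV: the `diabetes` Series below is a hypothetical stand-in built from the counts given in the question (268 True, 500 False), on the assumption that the real column holds booleans.

```python
import pandas as pd

# Hypothetical stand-in for df['diabetes'], built from the counts in the question
diabetes = pd.Series([True] * 268 + [False] * 500, name="diabetes")

# Raw counts per value, and the same counts as fractions of the total
counts = diabetes.value_counts().to_dict()
shares = (diabetes.value_counts(normalize=True) * 100).to_dict()

for label in (True, False):
    print("number of {0}: {1} ({2:.2f}%)".format(label, counts[label], shares[label]))
```

Because `normalize=True` divides every count by the same total, the two percentages add up to 100.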