Replace an attribute with the percentage of two attributes in Python
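As the title says, I want to compute the percentage of two attributes and replace the last attribute of the dataset with the new percentage.
The dataset is loaded as follows: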
dataset = pd.read_csv('2016-17-to-2018-19-immunizations.csv',encoding = "ISO-8859-1")
My current attempt is:
enrollment = newDataset.iloc[:,7:8]
count = newDataset.iloc[:,9:10]
newDataset.iloc[:,10:11] = count / enrollment
newDataset.to_csv(r'newDF.csv', index='None')
newDataset.head()
But the last attribute, PERCENT, is not replaced by the new percentage.
Any clue what I am doing wrong here?
Edit-1
Dataset.csv (first 5 records shown):
SCHOOL_YEAR,SCHOOL_CODE,COUNTY,PUBLIC_PRIVATE,CITY,SCHOOL_NAME,REPORTED,ENROLLMENT,CATEGORY,COUNT,PERCENT
2016-2017,52749,MONTEREY,PRIVATE,PRUNEDALE,PRUNEDALE CHRISTIAN ACADEMY,Y,10,PBE,,
2016-2017,52749,MONTEREY,PRIVATE,PRUNEDALE,PRUNEDALE CHRISTIAN ACADEMY,Y,10,HEPB,,
2016-2017,52749,MONTEREY,PRIVATE,PRUNEDALE,PRUNEDALE CHRISTIAN ACADEMY,Y,10,DTP,,
2016-2017,52749,MONTEREY,PRIVATE,PRUNEDALE,PRUNEDALE CHRISTIAN ACADEMY,Y,10,POLIO,,
2016-2017,52749,MONTEREY,PRIVATE,PRUNEDALE,PRUNEDALE CHRISTIAN ACADEMY,Y,10,Up-To-Date,,
Using a trimmed version of the sample Dataset.csv:
| | SCHOOL_YEAR | SCHOOL_CODE | ENROLLMENT | COUNT | PERCENT |
---|---|---|---|---|---|
0 | 2016-2017 | 52749 | 10 | 0.0 | NaN |
1 | 2016-2017 | 52749 | 10 | 0.0 | NaN |
2 | 2016-2017 | 52749 | 10 | 3.0 | NaN |
3 | 2016-2017 | 52749 | 10 | NaN | NaN |
4 | 2016-2017 | 52749 | 10 | 8.0 | NaN |
You can set the PERCENT column like this:
df.PERCENT = df.COUNT / df.ENROLLMENT
| | SCHOOL_YEAR | SCHOOL_CODE | ENROLLMENT | COUNT | PERCENT |
---|---|---|---|---|---|
0 | 2016-2017 | 52749 | 10 | 0.0 | 0.0 |
1 | 2016-2017 | 52749 | 10 | 0.0 | 0.0 |
2 | 2016-2017 | 52749 | 10 | 3.0 | 0.3 |
3 | 2016-2017 | 52749 | 10 | NaN | NaN |
4 | 2016-2017 | 52749 | 10 | 8.0 | 0.8 |
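One possible explanation for why the original `iloc` attempt had no visible effect (an assumption, since the full code is not shown): slices such as `iloc[:, 7:8]` return one-column DataFrames, and dividing two DataFrames aligns them on column labels. Because `COUNT` and `ENROLLMENT` are different labels, every cell of the result is NaN. Selecting the columns as Series avoids this. A minimal sketch, using a hand-built frame that mirrors the trimmed sample (so the column positions here are 2 and 3, not 7 and 9):

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the trimmed sample (values copied from the table).
df = pd.DataFrame({
    "SCHOOL_YEAR": ["2016-2017"] * 5,
    "SCHOOL_CODE": [52749] * 5,
    "ENROLLMENT": [10] * 5,
    "COUNT": [0.0, 0.0, 3.0, np.nan, 8.0],
    "PERCENT": [np.nan] * 5,
})

# Slice selection keeps the DataFrame type, one column each.
enrollment_df = df.iloc[:, 2:3]   # column ENROLLMENT
count_df = df.iloc[:, 3:4]        # column COUNT

# DataFrame / DataFrame aligns on column labels; the labels differ,
# so the result contains both columns and every value is NaN.
misaligned = count_df / enrollment_df

# Series / Series aligns only on the row index, so the division works.
df["PERCENT"] = df["COUNT"] / df["ENROLLMENT"]

# Multiply by 100 if a true percentage rather than a fraction is wanted.
percent_points = df["PERCENT"] * 100
```

Rows where `COUNT` is missing stay NaN in `PERCENT`, as in the result table.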
Note that in this case the PERCENT column already exists. If the target column does not exist yet (e.g., RATIO), we need to use bracket notation for the target (e.g., df['RATIO'] = df.COUNT / df.ENROLLMENT).
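The difference between the two notations can be checked on a small frame. This is a sketch with made-up two-row data; the names `created_by_attribute` and `created_by_bracket` are only illustrative:

```python
import warnings
import pandas as pd

df = pd.DataFrame({"ENROLLMENT": [10, 10], "COUNT": [3.0, 8.0]})

# Attribute assignment to a name that is not yet a column only sets a
# Python attribute on the object; pandas warns and adds no column.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.RATIO = df.COUNT / df.ENROLLMENT
created_by_attribute = "RATIO" in df.columns

# Bracket assignment creates the column.
df["RATIO"] = df["COUNT"] / df["ENROLLMENT"]
created_by_bracket = "RATIO" in df.columns
```

Bracket notation works for both new and existing columns, so using it everywhere avoids having to remember the distinction.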