Manipulate Excel with Python pandas
I have an Excel file in an odd format and want to put it into a suitable format with Python pandas. Right now the data is split into separate blocks, one per day, and it should be one continuous table.
When I read it with pandas' read_excel, I want to merge the daily blocks and remove the leading title-date row, going from this:
Unnamed: 1
NaN NaN
04Oct2020 (Sunday) NaN
date & time cars
04/10/2020 00:00:00 1
04/10/2020 00:01:00 2
to a suitable form like this:
date & time cars
04/10/2020 00:00:00 1
04/10/2020 00:01:00 2
.
.
05/10/2020 00:00:00 1
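(The dots indicate that the days follow one another in a single table.) How can I do this? I haven't managed it myself; any help is appreciated!
A very hacky approach that should work for your dataset.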
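import pandas as pd
# Labels of the summary rows and repeated header rows to throw away
exclude = ["Mean", "STDEV", "Median", "Min", "Max", "date & time"]
# Read the sheet and give the two columns fixed names
df = pd.read_excel("test.xls", names=["date_time", "cars"])
# Drop the excluded rows, then every row with a missing value (blank and title-date rows)
df = df[~df.date_time.isin(exclude)].dropna()
# Write the result without the row index (pandas 2.0+ can no longer write .xls; use a .xlsx name there)
df.to_excel("testoutput.xls", index=False)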
This writes the result to a new Excel file without the row index. The filtered DataFrame looks like this:
date_time cars
1 2020-10-04 00:00:00 1
2 2020-10-04 00:01:00 2
3 2020-10-04 00:02:00 3
4 2020-10-04 00:03:00 4
5 2020-10-04 00:04:00 5
6 2020-10-04 00:05:00 6
7 2020-10-04 00:06:00 7
17 2020-10-05 00:00:00 1
18 2020-10-05 00:01:00 2
19 2020-10-05 00:02:00 3
20 2020-10-05 00:03:00 4
21 2020-10-05 00:04:00 5
22 2020-10-05 00:05:00 6
23 2020-10-05 00:06:00 7
24 2020-10-05 00:07:00 8
25 2020-10-05 00:08:00 9
26 2020-10-05 00:09:00 10
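If the list of labels to exclude is hard to keep complete, a variant is to keep only the rows whose first column parses as a timestamp. The following is only a minimal sketch under some assumptions: the `raw` frame is a made-up in-memory stand-in for what `read_excel` returns, the helper name `keep_timestamp_rows` is hypothetical, and the timestamps are assumed to arrive as day/month/year strings as shown in the question. If Excel already stores them as real dates, the `format` argument would likely need to be dropped.

```python
import pandas as pd

# Hypothetical stand-in for the raw sheet: blank row, title-date rows,
# repeated headers, a summary row, and the actual readings.
raw = pd.DataFrame(
    {
        "date_time": [
            None,
            "04Oct2020 (Sunday)",
            "date & time",
            "04/10/2020 00:00:00",
            "04/10/2020 00:01:00",
            "Mean",
            "05Oct2020 (Monday)",
            "date & time",
            "05/10/2020 00:00:00",
        ],
        "cars": [None, None, "cars", 1, 2, 1.5, None, "cars", 1],
    }
)


def keep_timestamp_rows(df):
    """Keep only rows whose date_time value parses as dd/mm/yyyy hh:mm:ss."""
    # Anything that does not match the format becomes NaT instead of raising.
    parsed = pd.to_datetime(
        df["date_time"], format="%d/%m/%Y %H:%M:%S", errors="coerce"
    )
    mask = parsed.notna()
    out = df.loc[mask].copy()
    out["date_time"] = parsed[mask]                       # real datetime column
    out["cars"] = pd.to_numeric(out["cars"]).astype(int)  # counts as integers
    return out.reset_index(drop=True)                     # continuous 0..n-1 index


clean = keep_timestamp_rows(raw)
print(clean)
```

Compared with the exclude list, this does not depend on knowing every label that may appear in the first column, but it does depend on the timestamp format being consistent across the sheet.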