Assigning my own date to a time column of format HH:MM:SS.000 in pandas
I log data to a CSV file whose timestamps have the format HH:MM:SS.000. When I read this data into pandas with the code below, the parse_dates option automatically adds today's date to the column:
import pandas as pd
df = pd.read_csv('20-9-2019-ETH.csv', names=['Volume', 'Price', 'Time'],
                 index_col=2, parse_dates=True)
df.head()
                           Volume   Price
Time
2020-03-01 00:00:11.904  0.091683  217.60
2020-03-01 00:00:12.730  0.916826  217.60
2020-03-01 00:00:12.430  0.331441  217.60
2020-03-01 00:00:15.161  1.420000  217.59
2020-03-01 00:00:15.354  0.174274  217.57
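What do I need to do to specify which date should be used, rather than the date the file was created? Or perhaps I can remove the date completely and keep only the time in there? Either solution is fine, or both, so I can learn more! Thanks!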
A fairly straightforward solution is to read the data as strings and then parse it with pandas.to_timedelta.
Example code:
from io import StringIO
import pandas as pd
raw_data = \
"""
col_1,col_2
a val,00:00:11.904
another val,00:00:12.730
a third val,00:00:12.430
fourth val,00:00:15.161
fifth val,00:00:15.354
"""
df = pd.read_csv(StringIO(raw_data), header=0, dtype={"col_1": "string", "col_2": "string"})
print(f"{df}\n\n{df.dtypes}\n\n")
df["col_2"] = pd.to_timedelta(df["col_2"])
print(f"{df}\n\n{df.dtypes}")
Output:
         col_1         col_2
0        a val  00:00:11.904
1  another val  00:00:12.730
2  a third val  00:00:12.430
3   fourth val  00:00:15.161
4    fifth val  00:00:15.354

col_1    object
col_2    string
dtype: object


         col_1            col_2
0        a val  00:00:11.904000
1  another val  00:00:12.730000
2  a third val  00:00:12.430000
3   fourth val  00:00:15.161000
4    fifth val  00:00:15.354000

col_1             object
col_2    timedelta64[ns]
dtype: object
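To cover both options from the question, the same idea can be applied to the question's own column layout. This is only a sketch under assumptions: the three sample rows are copied from the question's output, and the base date 2019-09-20 is guessed from the file name 20-9-2019-ETH.csv, so replace it with the real recording date. Adding a pd.Timestamp to the parsed timedeltas gives full timestamps with the chosen date, while using the timedeltas directly gives a time-only index.

```python
from io import StringIO

import pandas as pd

# Sample rows in the question's layout (Volume, Price, Time), no header row.
raw_data = """0.091683,217.60,00:00:11.904
0.916826,217.60,00:00:12.730
0.331441,217.60,00:00:12.430
"""

# Read Time as plain text so pandas does not attach today's date to it.
df = pd.read_csv(
    StringIO(raw_data),
    names=["Volume", "Price", "Time"],
    dtype={"Time": str},
)

# Time of day parsed as an offset from midnight.
offsets = pd.to_timedelta(df["Time"])

# ASSUMPTION: the recording date is guessed from the file name.
base_date = pd.Timestamp("2019-09-20")

# Option 1: index holds the chosen date plus the time of day.
with_date = df.set_index(offsets + base_date).drop(columns="Time")

# Option 2: index holds only the time of day, with no date attached.
time_only = df.set_index(offsets).drop(columns="Time")

print(with_date)
print(time_only)
```

With the real file, passing the file name instead of StringIO(raw_data) should behave the same way, provided the Time column contains only values in the HH:MM:SS.000 format.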