How to change the datetime format of a column in a pandas DataFrame
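I have a DataFrame (df2). It has a column (date) containing dates and times in the format "Mon Aug 10 11:06:25 UTC 2015", and I need to change them to the format "Aug 10 11:06:25 2015".
I tried the code below, but I get an error: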
df2['date'] = pd.to_datetime(df2['date'], errors='coerce')
df2['date'] = df2['date'].dt.strftime('%b %d %H:%M:%S %Y')
df2
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2896 try:
-> 2897 return self._engine.get_loc(key)
2898 except KeyError:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'date'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-403-66f0c1caed0e> in <module>
1 df2 = df2.rename(columns = {'Mon Aug 10 07:56:39 UTC 2015': 'date'})
2
----> 3 df2['date'] = pd.to_datetime(df2['date'], errors='coerce')
4 df2['date'] = df2['date'].dt.strftime('%b %d %H:%M:%S %Y')
5 df2
~\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2978 if self.columns.nlevels > 1:
2979 return self._getitem_multilevel(key)
-> 2980 indexer = self.columns.get_loc(key)
2981 if is_integer(indexer):
2982 indexer = [indexer]
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2897 return self._engine.get_loc(key)
2898 except KeyError:
-> 2899 return self._engine.get_loc(self._maybe_cast_indexer(key))
2900 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2901 if indexer.ndim > 1 or indexer.size > 1:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'date'
The simplest way is:
import pandas as pd
df2['date'] = pd.to_datetime(df2['date'], errors='coerce')
df2['date'] = df2['date'].dt.strftime('%b %d %H:%M:%S %Y')
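I'm fairly sure this will solve your main problem (see the documentation).
From there, you can more easily manipulate the pd.Timestamp objects to display whatever format you want.
Good luck. Let me know whether this works for you or whether you need further help.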
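As a minimal sketch of the same idea, the input layout can be spelled out explicitly instead of leaving pandas to infer it. The sample rows below are hypothetical stand-ins for df2, and the format string assumes every value carries the literal text "UTC" and English day and month abbreviations:

```python
import pandas as pd

# Hypothetical sample data standing in for df2; the last row is deliberately malformed.
df2 = pd.DataFrame({"date": [
    "Mon Aug 10 11:06:25 UTC 2015",
    "Tue Aug 11 07:56:39 UTC 2015",
    "not a date",
]})

# Parse with an explicit format; "UTC" is matched as literal text.
# errors="coerce" turns unparseable values into NaT instead of raising.
parsed = pd.to_datetime(df2["date"], format="%a %b %d %H:%M:%S UTC %Y", errors="coerce")

# Format the parsed timestamps as strings; rows that failed to parse stay missing.
df2["date"] = parsed.dt.strftime("%b %d %H:%M:%S %Y")
print(df2)
```

Spelling out the format avoids relying on inference, and any row that does not match shows up as a missing value rather than being silently misread.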
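Edit: @AsraKhalid, I suspect the source of your error is actually the first line: df2 = df2.rename(columns = {'Mon Aug 10 07:56:39 UTC 2015': 'date'}). You probably think you are renaming the column, but the key does not match any existing column name (a typo, for example), and nothing is reported because df.rename ignores unmatched keys by default. That leaves df2 without a 'date' column, hence the KeyError: 'date'. Try changing the line to df2 = df2.rename(columns = {'Mon Aug 10 07:56:39 UTC 2015': 'date'}, errors="raise"). That way you will see whether 'Mon Aug 10 07:56:39 UTC 2015' is really a column in df2 or whether you misspelled it.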
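A small sketch of that check, using a hypothetical one-column frame rather than your actual data. The misspelled key here (a trailing space) is an assumption chosen only to show the difference between the default behaviour and errors="raise":

```python
import pandas as pd

# Hypothetical frame whose only column is named after a timestamp,
# as can happen when the first data row is read as the header.
df2 = pd.DataFrame({"Mon Aug 10 07:56:39 UTC 2015": ["Mon Aug 10 11:06:25 UTC 2015"]})

# Inspect the real column names; repr() exposes stray whitespace.
print([repr(c) for c in df2.columns])

# A key that matches no column (note the trailing space) is silently ignored by default.
unchanged = df2.rename(columns={"Mon Aug 10 07:56:39 UTC 2015 ": "date"})
print("date" in unchanged.columns)

# With errors="raise", the same mismatch is reported as a KeyError.
try:
    df2.rename(columns={"Mon Aug 10 07:56:39 UTC 2015 ": "date"}, errors="raise")
    raised = False
except KeyError as exc:
    raised = True
    print("rename failed:", exc)

# With the exact column name, the rename succeeds.
renamed = df2.rename(columns={"Mon Aug 10 07:56:39 UTC 2015": "date"}, errors="raise")
print(list(renamed.columns))
```

Printing df2.columns before renaming is usually the quickest way to see what the column is actually called.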
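You can use the pandas apply() method. Please also check the date formats. I don't understand why your timestamps contain the string UTC, but based on your question, try the following code: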
from datetime import datetime
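def change_date_string(date_string):
    # Drop the literal "UTC" so the remaining text matches the strptime format.
    date_string = str(date_string).replace('UTC', '')
    # Parse the original layout, then re-emit it without the weekday.
    date_object = datetime.strptime(date_string, "%a %b %d %H:%M:%S %Y").strftime('%b %d %H:%M:%S %Y')
    return date_object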
df2['date'] = df2['date'].apply(change_date_string)
Example:
from datetime import datetime
date_string = 'Mon Aug 10 11:06:25 UTC 2015'
date_string = str(date_string).replace('UTC', '')
date_object = datetime.strptime(date_string, "%a %b %d %H:%M:%S %Y").strftime('%b %d %H:%M:%S %Y')
print(date_object)
Output:
Aug 10 11:06:25 2015
Note that the output will be a string, not a datetime.
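Because the result is a string, it no longer sorts or compares chronologically. One possible arrangement, sketched here with hypothetical data and hypothetical column names (date_parsed, date_display), is to keep a datetime column for computation and add the formatted string only for display:

```python
import pandas as pd

# Hypothetical sample data; the later date comes first on purpose.
df2 = pd.DataFrame({"date": [
    "Fri Apr 01 08:30:00 UTC 2016",
    "Wed Sep 02 09:00:00 UTC 2015",
]})

# Keep a real datetime column for sorting, filtering and arithmetic.
df2["date_parsed"] = pd.to_datetime(df2["date"], format="%a %b %d %H:%M:%S UTC %Y")

# Add a string column used only for display.
df2["date_display"] = df2["date_parsed"].dt.strftime("%b %d %H:%M:%S %Y")

# Sorting on the datetime column is chronological...
by_time = df2.sort_values("date_parsed")["date_display"].tolist()
# ...while sorting on the strings is alphabetical.
by_text = df2.sort_values("date_display")["date_display"].tolist()
print(by_time)
print(by_text)
```

In this sample the two orderings differ, because "Apr" sorts before "Sep" alphabetically even though the April date is later.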