My DataFrame headers are not appearing properly in Python Google Colab
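I am trying to create a DataFrame in Google Colab this way because I plan to run the same analysis on about 1000 files later. My problem is that the header (the column names) is not being registered correctly. I have linked my code's output as well as the expected headers.
My output:
My code is as follows: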
import numpy as np
import pandas as pd
from pandas import DataFrame
import pandas_datareader as pdr
from pathlib import Path
import glob
import csv
import sys
import os
import io
# Load the Drive helper and mount
from google.colab import drive
# This will prompt for authorization.
drive.mount('/content/drive')
iter_changes = "Prediction"
PATH_TO_DRIVE_ML_DATA = "/content/drive/My Drive/Root_Work_Sample/inputs"
INPUT_PATH = PATH_TO_DRIVE_ML_DATA+"/work_sample"
OUTPUT_PATH = PATH_TO_DRIVE_ML_DATA+"/outputs/"+iter_changes
# check if directory already exists
if not os.path.exists(OUTPUT_PATH):
    os.makedirs(OUTPUT_PATH)
    print("Directory created", OUTPUT_PATH)
else:
    pass
    #raise Exception("Directory already exists. Don't override.")
df = pd.read_csv(os.path.join(INPUT_PATH, 'Root_Work_Sample_Stadardized_Test.csv'), engine='python')
#df = pd.read_csv(io.BytesIO(uploaded['Root_Work_Sample_Stadardized_Test.csv']))
print(df.shape)
print(df.columns)
display(df.head(5))
print(df.dtypes)
My output with the correct headers:
I'm not sure why this is happening, since I don't have access to your data, but one way to work around it is to hard-code the column names:
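header = ["your", "column", "names"]
# If the file already contains a header row, also pass header=0 so that
# row is replaced instead of being read in as the first row of data.
df = pd.read_csv(os.path.join(INPUT_PATH, 'Root_Work_Sample_Stadardized_Test.csv'),
                 engine='python',
                 names=header)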
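One thing worth checking with this approach is how `names` interacts with a header row that may already be in the file. The sketch below uses a small made-up CSV string (`csv_text` is a hypothetical stand-in, since the contents of the real file are unknown) to show the difference between passing `names` alone and passing it together with `header=0`.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV on Drive; the real file's contents are unknown.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

header = ["your", "column", "names"]

# names= alone: pandas treats the file as having no header row, so the
# original header line ("a,b,c") is kept as the first row of data.
df_extra_row = pd.read_csv(io.StringIO(csv_text), names=header)

# names= together with header=0: the file's first row is treated as the
# header and replaced by the supplied names.
df_replaced = pd.read_csv(io.StringIO(csv_text), names=header, header=0)

print(df_extra_row.shape)
print(df_replaced.shape)
```

If the old header text shows up as the first row of `df.head()`, the file probably does have its own header row, and `header=0` is likely the variant you want.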