How do I upload a file from my session to an Azure datastore?
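I downloaded a data file from my Azure datastore, preprocessed it, and now want to upload the processed file back to the datastore as a final CSV. How can I do that? I tried the following, but it gives me a directory error: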
datastore = ws.get_default_datastore()
datastore_paths_train = [(datastore, 'X.csv')]
traindata = Dataset.Tabular.from_delimited_files(path=datastore_paths_train)
train = traindata.to_pandas_dataframe()
#preprocessing the data
X, y = preprocess_data(train)
#splitting the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
#uploading data to datastore
print('Uploading data to datastore')
outputs_folder = './Scaling_data'
os.makedirs(outputs_folder, exist_ok=True)
datastore.upload(X_train, outputs_folder)
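How do I get a directory for my 'X_train'? I tried making it a Path object, but that did not work either. I may be going about this the wrong way; if there is any other way to upload a CSV to the datastore, I would be happy to learn it.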
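The signature of the datastore.upload method below shows that you need to specify the source directory containing the files to upload, not a DataFrame. For details, see the documentation for that method.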
upload(src_dir, target_path=None, overwrite=False, show_progress=True)
So you first need to save the DataFrame X_train to a local file. See the following example:
import os

outputs_folder = "./Scaling_data"
# create the local directory if it does not exist
os.makedirs(outputs_folder, exist_ok=True)
local_path = os.path.join(outputs_folder, "prepared.csv")
# save the DataFrame X_train to the local file ./Scaling_data/prepared.csv
X_train.to_csv(local_path)
# upload the files in src_dir to target_path in the datastore
datastore.upload(src_dir=outputs_folder, target_path=outputs_folder)
You can also take a look at this example.
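If you do this for several DataFrames, the save-then-upload steps can be wrapped in a small helper. The following is only a sketch under a few assumptions: `upload_dataframe` is a hypothetical name, not part of the Azure ML SDK; `datastore` is any object whose `upload` method accepts the parameters shown in the signature above; and `df` is any object with a pandas-style `to_csv` method. It writes the CSV into a temporary directory so that only that one file is uploaded.

```python
import os
import tempfile


def upload_dataframe(datastore, df, target_path, file_name="prepared.csv", overwrite=False):
    """Hypothetical helper: save df as a CSV locally, then upload its folder.

    Assumes datastore.upload(src_dir, target_path, overwrite, show_progress)
    behaves as in the signature quoted in the answer above.
    """
    # A temporary directory keeps unrelated local files out of the upload.
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, file_name)
        # index=False leaves the DataFrame's row index out of the CSV.
        df.to_csv(local_path, index=False)
        # upload expects a directory path, not a DataFrame or a file object.
        datastore.upload(
            src_dir=tmp_dir,
            target_path=target_path,
            overwrite=overwrite,
            show_progress=False,
        )
    # Relative path of the file inside the datastore, for later reference.
    return f"{target_path.rstrip('/')}/{file_name}"
```

With the variables from the question, a call might look like `upload_dataframe(datastore, X_train, "Scaling_data")`. Drop `index=False` if you need the row index preserved in the file.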