How to handle the maximum export size limit for files in the Drive API
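I'm trying to download some Google Docs files, and I need to convert them to the Microsoft Word MIME type with the export method. This worked fine until I hit a file larger than 10 MB. The API documentation says that is the size limit for exported documents, but I really need to download these files. Everything else in my script works; the only problem is this error:
"This file is too large to be exported.". Details: "This file is too large to be exported."
So, is there a way to avoid this limit, or to export the document to a content folder?
Edit: The documents I want to download are not public, so I think I need to authorize the request to get the content.
Edit 2: The script: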
import os
import pickle

import requests
from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

SCOPES = ['https://www.googleapis.com/auth/drive.file','https://www.googleapis.com/auth/drive','https://www.googleapis.com/auth/spreadsheets']

def main():
    #----------------------Google drive auth-----------------------------
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    # Call the Drive v3 API
    service = build('drive', 'v3', credentials=creds)
    sheets_service = build('sheets', 'v4', credentials=creds)
    # Call the Sheets API
    sheet = sheets_service.spreadsheets()
    # ID of the folder that contains the wanted files
    query = "'[ID OF THE FOLDER]' in parents"
    response = service.files().list(q=query,
                                    spaces='drive',
                                    fields='files(id, name, parents, webViewLink,exportLinks)').execute()
    baseURL = "https://docs.google.com/document/d/"
    for document in response['files']:
        downloadURL = baseURL + document["id"] + "/export?format=doc"
        r = requests.get(downloadURL)
        with open('pathtosabe, 'wb') as f:
            f.write(r.content)

main()
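From your following reply:
Well, that is the problem: I don't know how to use the access token in the request. The file is downloaded, but the content is shown as corrupted. I tried with a public document and the content was visible.
I think that when your Google Document is not publicly shared, the download may work if the access token is sent with the r = requests.get(downloadURL) request in your script. So, in this answer, I would like to propose a modified script that uses the access token retrieved by the authorization code in your script.
Modified script: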
creds = None
# The file token.pickle stores the user's access and refresh tokens, and is
# created automatically when the authorization flow completes for the first
# time.
if os.path.exists('token.pickle'):
    with open('token.pickle', 'rb') as token:
        creds = pickle.load(token)
if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    else:
        flow = InstalledAppFlow.from_client_secrets_file(
            'credentials.json', SCOPES)
        creds = flow.run_local_server(port=0)
    # Save the credentials for the next run
    with open('token.pickle', 'wb') as token:
        pickle.dump(creds, token)
# Call the Drive v3 API
service = build('drive', 'v3', credentials=creds)
sheets_service = build('sheets', 'v4', credentials=creds)
# Call the Sheets API
sheet = sheets_service.spreadsheets()
# ID of the folder that contains the wanted files
query = "'[ID OF THE FOLDER]' in parents"
response = service.files().list(q=query,
                                spaces='drive',
                                fields='files(id, name, parents, webViewLink,exportLinks)').execute()
access_token = creds.token  # Added
baseURL = "https://docs.google.com/document/d/"
for document in response['files']:
    downloadURL = baseURL + document["id"] + "/export?format=doc"
    r = requests.get(downloadURL, headers={'Authorization': 'Bearer ' + access_token})  # Modified
    with open('pathtosabe', 'wb') as f:  # Modified
        f.write(r.content)
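A file that opens as "corrupted" may simply be an HTML sign-in or error page that was saved in place of the document, which can happen when the request is not authorized. As a minimal sketch (the helper name `is_probably_document` is hypothetical and not part of any library), the status code and `Content-Type` of the response could be checked before writing anything to disk:

```python
def is_probably_document(status_code, content_type):
    """Return True when a response looks like an exported file.

    Hypothetical helper: it only inspects the status code and the
    Content-Type header, so it is a heuristic, not a guarantee.
    """
    if status_code != 200:
        return False
    # An HTML body usually means a sign-in or error page came back
    # instead of the exported document.
    return not content_type.lower().startswith("text/html")
```

Inside the loop it could be called as `is_probably_document(r.status_code, r.headers.get('Content-Type', ''))`, skipping or logging the document when it returns False.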
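- In your script, 'pathtosabe in with open('pathtosabe, 'wb') as f: is not enclosed in single quotes (the closing quote is missing). Please be careful about this. If you want to use pathtosabe as a variable, declare it and change the line to with open(pathtosabe, 'wb') as f:.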
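Note that a fixed path means every document in the folder is written to the same file, and that `r.content` holds the whole download in memory. The following is only a sketch under assumptions: `output_path` and `save_chunks` are hypothetical helper names, and the `.doc` extension is assumed from the `format=doc` parameter in the URL above.

```python
import os
import re


def output_path(directory, name, extension=".doc"):
    """Build a per-document file path from a Drive file name (hypothetical helper)."""
    # Replace characters that are unsafe in file names on common file systems.
    safe = re.sub(r'[\\/:*?"<>|]+', "_", name).strip() or "untitled"
    return os.path.join(directory, safe + extension)


def save_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the number of bytes written."""
    written = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written
```

With `requests`, the chunks could come from `r.iter_content(chunk_size=8192)` after calling `requests.get(downloadURL, headers=..., stream=True)`, and the path from `output_path('downloads', document['name'])`, where the `downloads` directory is an assumption and must already exist.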