How to use Get File Properties REST API for Azure Files Storage using Python
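I am trying to create a Python script that uses both the Python Azure SDK and the REST API to extract information about the files in my Azure Files storage account.
I am using the SDK to access the files in storage and get their names. With each name, I then want to make a REST API call to get the file properties, specifically the Last-Modified property. I tried to access the last-modified property through the SDK, but for some reason it always returns None.
I want to use the last-modified date to determine whether more than 24 hours have passed, and if so, delete the file. I am not sure whether it is possible to set some kind of auto-delete-after-a-period property on the file when I first create it and upload it to Azure. If it is, that would solve my problem anyway.
I have posted the code I am using below. When I try to make the HTTP request, I get the error "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature."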
import datetime
import requests
import json
import base64
import hmac
import hashlib
import urllib
from azure.storage.file import *

StorageAccountConnectionString = ""
fileshareName = "testFileShare"
storage_account_name = "testStorage"
storage_account_key = ""
api_version = "2018-03-28"

file_service = FileService(connection_string=StorageAccountConnectionString)
listOfStateDirectories = file_service.list_directories_and_files(fileshareName)
for state_directory in listOfStateDirectories:
    print("Cleaning up State Directory: " + state_directory.name)
    if(isinstance(state_directory, Directory)):
        listOfBridgeDirectories = file_service.list_directories_and_files(fileshareName, state_directory.name)
        for bridge_directory in listOfBridgeDirectories:
            if(isinstance(bridge_directory, Directory)):
                print("Cleaning up Bridge Directory: " + bridge_directory.name)
                path_to_bridge_directory = state_directory.name + "/" + bridge_directory.name
                listOfFilesAndFolders = file_service.list_directories_and_files(fileshareName, path_to_bridge_directory)
                for file_or_folder in listOfFilesAndFolders:
                    if isinstance(file_or_folder, File):
                        name_of_file = file_or_folder.name
                        # Get the time of the current request
                        request_time = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')
                        string_to_append_to_url = fileshareName + '/' + path_to_bridge_directory + '/' + name_of_file
                        # Parse the url to make sure everything is good
                        # string_to_append_to_url = urllib.parse.quote(string_to_append_to_url)
                        string_params = {
                            'verb': 'HEAD',
                            'Content-Encoding': '',
                            'Content-Language': '',
                            'Content-Length': '',
                            'Content-MD5': '',
                            'Content-Type': '',
                            'Date': '',
                            'If-Modified-Since': '',
                            'If-Match': '',
                            'If-None-Match': '',
                            'If-Unmodified-Since': '',
                            'Range': '',
                            'CanonicalizedHeaders': 'x-ms-date:' + request_time + '\nx-ms-version:' + api_version + '\n',
                            'CanonicalizedResource': '/' + storage_account_name + '/' + string_to_append_to_url
                        }
                        string_to_sign = (string_params['verb'] + '\n'
                                          + string_params['Content-Encoding'] + '\n'
                                          + string_params['Content-Language'] + '\n'
                                          + string_params['Content-Length'] + '\n'
                                          + string_params['Content-MD5'] + '\n'
                                          + string_params['Content-Type'] + '\n'
                                          + string_params['Date'] + '\n'
                                          + string_params['If-Modified-Since'] + '\n'
                                          + string_params['If-Match'] + '\n'
                                          + string_params['If-None-Match'] + '\n'
                                          + string_params['If-Unmodified-Since'] + '\n'
                                          + string_params['Range'] + '\n'
                                          + string_params['CanonicalizedHeaders']
                                          + string_params['CanonicalizedResource'])
                        signed_string = base64.b64encode(hmac.new(base64.b64decode(storage_account_key), msg=string_to_sign.encode('utf-8'), digestmod=hashlib.sha256).digest()).decode()
                        headers = {
                            'x-ms-date': request_time,
                            'x-ms-version': api_version,
                            'Authorization': ('SharedKey ' + storage_account_name + ':' + signed_string)
                        }
                        url = ('https://' + storage_account_name + '.file.core.windows.net/' + string_to_append_to_url)
                        print(url)
                        r = requests.get(url, headers=headers)
                        print(r.content)
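Note: some of the directories have spaces in their names, so I am not sure whether this affects the REST API call, since the URL will also contain spaces. If it does, how would I access the files whose URLs contain spaces?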
I try to access the last modified property using the SDK but it always returns None for some reason.
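Not all SDK APIs and REST APIs return the Last-Modified property in the response headers. Those that do not include the REST API List Directories and Files and the Python SDK API list_directories_and_files.
I tried to reproduce your issue using the SDK with the code below.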
generator = file_service.list_directories_and_files(share_name, directory_name)
for file_or_dir in generator:
    if isinstance(file_or_dir, File):
        print(file_or_dir.name, file_or_dir.properties.last_modified)
Because the list_directories_and_files method does not populate any properties on the File objects it returns, file_or_dir.properties.last_modified in the code above is None.
The REST APIs Get File, Get File Properties, and Get File Metadata, and the Python SDK APIs get_file_properties and get_file_metadata, do return the Last-Modified property in the response headers, so change the code as follows to get the last_modified property.
generator = file_service.list_directories_and_files(share_name, directory_name)
for file_or_dir in generator:
    if isinstance(file_or_dir, File):
        file_name = file_or_dir.name
        file = file_service.get_file_properties(share_name, directory_name, file_name, timeout=None, snapshot=None)
        print(file_or_dir.name, file.properties.last_modified)
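Building on get_file_properties, the 24-hour cleanup described in the question could be sketched roughly as follows. This is only an illustration under some assumptions: is_stale and delete_stale_files are hypothetical helper names, file_service is expected to be a FileService instance, file_cls is the SDK's File class passed in by the caller, and last_modified is assumed to be a datetime (a naive value is treated as UTC here).

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_modified, max_age=timedelta(hours=24), now=None):
    """Return True if last_modified lies more than max_age before now."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Assumption: a naive datetime is interpreted as UTC.
    if last_modified.tzinfo is None:
        last_modified = last_modified.replace(tzinfo=timezone.utc)
    return now - last_modified > max_age


def delete_stale_files(file_service, file_cls, share_name, directory_name,
                       max_age=timedelta(hours=24), now=None):
    """Delete files in one directory that are older than max_age.

    Returns the names of the files that were deleted.
    """
    deleted = []
    # Read the whole listing first so deleting does not disturb iteration.
    entries = list(file_service.list_directories_and_files(share_name, directory_name))
    for entry in entries:
        if not isinstance(entry, file_cls):
            continue  # skip subdirectories
        # The listing carries no Last-Modified value, so ask for it per file.
        props = file_service.get_file_properties(share_name, directory_name, entry.name)
        if is_stale(props.properties.last_modified, max_age, now):
            file_service.delete_file(share_name, directory_name, entry.name)
            deleted.append(entry.name)
    return deleted
```

With the legacy SDK used in this thread, a call might look like delete_stale_files(file_service, File, fileshareName, path_to_bridge_directory). It makes one extra request per file, which is acceptable for small directories but may be slow for very large ones.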
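Of course, calling the REST API achieves the same thing as using the SDK API. However, building the signature string for the Authorization header by hand is error-prone, and the resulting code is hard to read.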
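If the REST route is still preferred, the signing step can be isolated in a small helper. The sketch below is not an official implementation: string_to_sign_for_head and build_head_request are hypothetical names, and it assumes the Shared Key string-to-sign layout used in the question (the verb, eleven empty standard headers, the x-ms-* headers, then the canonicalized resource). It also assumes the path should be percent-encoded identically in the URL and in the canonicalized resource, which is how names with spaces would be handled.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.parse import quote


def string_to_sign_for_head(account_name, encoded_path, request_time, api_version):
    """Build the Shared Key string-to-sign for a HEAD request without a body."""
    canonicalized_headers = f"x-ms-date:{request_time}\nx-ms-version:{api_version}\n"
    canonicalized_resource = "/" + account_name + "/" + encoded_path
    # The verb line is followed by eleven empty standard-header lines.
    return "HEAD\n" + "\n" * 11 + canonicalized_headers + canonicalized_resource


def build_head_request(account_name, account_key, resource_path,
                       api_version="2018-03-28", now=None):
    """Return (url, headers) for a signed HEAD request on share/dir/file."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Locale-independent RFC 1123 date, e.g. "Thu, 02 Jan 2020 00:00:00 GMT".
    request_time = format_datetime(now, usegmt=True)
    # Percent-encode spaces and other unsafe characters; "/" is left as-is.
    encoded_path = quote(resource_path)
    to_sign = string_to_sign_for_head(account_name, encoded_path, request_time, api_version)
    digest = hmac.new(base64.b64decode(account_key),
                      to_sign.encode("utf-8"), hashlib.sha256).digest()
    headers = {
        "x-ms-date": request_time,
        "x-ms-version": api_version,
        "Authorization": "SharedKey " + account_name + ":" + base64.b64encode(digest).decode(),
    }
    url = "https://" + account_name + ".file.core.windows.net/" + encoded_path
    return url, headers
```

The request then has to be sent with the same verb that was signed, for example requests.head(url, headers=headers), and Last-Modified read from the response headers. The code in the question signs HEAD but sends the request with requests.get, and that mismatch is a likely cause of the authentication error.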