Using an Azure Function App in Data Factory to run a Python script

Question

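I am merging two CSV files from Blob Storage and uploading the result to Data Lake Storage (Gen2). The code works in PyCharm and VS Code, but I would like to run it in an Azure Data Factory pipeline using a Function App. When I try to run it in the pipeline, I get this error: "Operation on target Azure Function1 failed: Call to provided Azure function 'name' failed with status-'Unauthorized' and message - 'Invoking Azure function failed with HttpStatusCode - Unauthorized.'."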

import azure.functions as func
import pandas as pd
import logging
from azure.storage.blob import BlobServiceClient
from azure.storage.filedatalake import DataLakeServiceClient

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')

    STORAGEACCOUNTURL= 'https://storage.blob.core.windows.net/'
    STORAGEACCOUNTKEY= '****'
    LOCALFILENAME= ['file1.csv', 'file2.csv']
    CONTAINERNAME= 'inputblob'

    file1 = pd.DataFrame()
    file2 = pd.DataFrame()
    #download from blob

    blob_service_client_instance = BlobServiceClient(account_url=STORAGEACCOUNTURL, credential=STORAGEACCOUNTKEY)

    for i in LOCALFILENAME:
        with open(i, "wb") as my_blobs:
            blob_client_instance = blob_service_client_instance.get_blob_client(container=CONTAINERNAME, blob=i, snapshot=None)
            blob_data = blob_client_instance.download_blob()
            blob_data.readinto(my_blobs)
            if i == 'file1.csv':
                file1 = pd.read_csv(i)
            if i == 'file2.csv':
                file2 = pd.read_csv(i)

    # load

    # join the 2 dataframes into the final dataframe
    summary = pd.merge(left=file1, right=file2, on='key', how='inner')
        
    summary.to_csv(path_or_buf=r'path\summary.csv', index=False, encoding='utf-8')

    global service_client
            
    service_client = DataLakeServiceClient(account_url="https://storage.dfs.core.windows.net/", credential='****')
        
    file_system_client = service_client.get_file_system_client(file_system="outputdatalake")

    directory_client = file_system_client.get_directory_client("functionapp") 

    file_client = directory_client.create_file("merged.csv")
            
    # Read the merged file back and upload it; the with-block closes the file handle.
    with open(r"path\summary.csv", 'rb') as local_file:
        file_contents = local_file.read()

    file_client.upload_data(file_contents, overwrite=True)

    return func.HttpResponse("This HTTP triggered function executed successfully.")

Answer 1

I tried to reproduce this with a Python-based HTTP trigger and ran into the following error after the first deployment:

Call to provided Azure function 'HttpTriggerT' failed with status-'Unauthorized' and message - 'Invoking Azure function failed with HttpStatusCode - Unauthorized.'.

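Note: Try refreshing and restarting the Function App service after a deployment or a change. This is known to resolve some transient issues.

Ideally, you would use a function key (scoped to a single function) or a master/host key (valid for all functions in the Function App) to allow access. A managed identity provides secure access to the whole Function App.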
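Which key is needed depends on the `authLevel` of the HTTP trigger binding in the function's `function.json`. As a rough sketch, the snippet below reads that setting and reports which keys should satisfy it. The sample `function.json` content and the helper name `describe_auth` are illustrative assumptions, not taken from the question's project.

```python
import json

# Keys that satisfy each authLevel of an HTTP trigger binding.
REQUIRED_KEYS = {
    "anonymous": "no key required",
    "function": "function key, host key, or master key",
    "admin": "master key",
}


def describe_auth(function_json_text: str) -> str:
    """Report the authLevel of the httpTrigger binding in a function.json document."""
    config = json.loads(function_json_text)
    for binding in config.get("bindings", []):
        if binding.get("type") == "httpTrigger":
            level = binding.get("authLevel")
            if level is None:
                return "authLevel not set"
            return f"authLevel={level}: {REQUIRED_KEYS.get(level, 'unknown level')}"
    return "no httpTrigger binding found"


# Illustrative function.json for a Python HTTP trigger (assumed, not from the question).
SAMPLE_FUNCTION_JSON = """
{
  "scriptFile": "__init__.py",
  "bindings": [
    {"authLevel": "function", "type": "httpTrigger", "direction": "in",
     "name": "req", "methods": ["get", "post"]},
    {"type": "http", "direction": "out", "name": "$return"}
  ]
}
"""

print(describe_auth(SAMPLE_FUNCTION_JSON))
```

If the level is `function` or `admin` and the caller sends no matching key, an Unauthorized response like the one in the question is the expected result.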

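Using a function key

Navigate to your Function App > Functions > your_function > Function Keys

Copy the key and add it as the authorization in the Azure Function linked service.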
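To check the key outside of Data Factory, it can help to see what it looks like at the HTTP level. An HTTP-triggered function accepts the key in the `x-functions-key` request header (or as a `code` query parameter). The sketch below builds such a request with the standard library only. The helper names, the URL, and the key are placeholders I made up for illustration; substitute your own values.

```python
import json
import urllib.request


def build_function_request(function_url: str, function_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request that authenticates with a function key."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        function_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Azure Functions reads the key from this header.
            "x-functions-key": function_key,
        },
    )


def invoke(request: urllib.request.Request) -> tuple:
    """Send the request and return (status code, response text)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

A call would look like `invoke(build_function_request("https://<your-function-app>.azurewebsites.net/api/<your_function>", "<function-key>", {"name": "test"}))`. If this succeeds with the key but the pipeline still reports Unauthorized, the key stored in the linked service is the first thing to recheck.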

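Using a managed identity

In addition, I made the following changes to get it working.

Navigate to the deployed Function App, then Settings > Identity, and turn on the system-assigned managed identity.

Add an identity provider: Settings > Authentication > Microsoft Identity

Create a managed identity for ADF:

Add the credential to ADF

Finally, edit the Azure Function linked service

Get the resource ID from the AAD app registered as the identity provider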
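For intuition, the managed identity flow amounts to requesting an Azure AD token for that resource ID and sending it as a bearer token to the function. The following is a minimal sketch of the same idea in Python, assuming the `azure-identity` package is installed and the code runs on an Azure resource that has a managed identity. The helper names and the `api://<application-client-id>` form of the resource ID are my assumptions; this is not what Data Factory runs internally.

```python
import urllib.request


def token_scope(resource_id: str) -> str:
    """Turn a resource ID (Application ID URI) into its '/.default' scope form."""
    return resource_id.rstrip("/") + "/.default"


def bearer_request(function_url: str, access_token: str) -> urllib.request.Request:
    """Build a POST request that authenticates with an Azure AD bearer token."""
    return urllib.request.Request(
        function_url,
        data=b"{}",
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
    )


def acquire_token(resource_id: str) -> str:
    """Request a token for the resource using the managed identity of the host."""
    # Imported here so the helpers above work without azure-identity installed.
    from azure.identity import ManagedIdentityCredential

    credential = ManagedIdentityCredential()
    return credential.get_token(token_scope(resource_id)).token
```

With these pieces, `bearer_request(url, acquire_token("api://<application-client-id>"))` produces a request that the Function App's authentication layer can validate against the registered identity provider.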

Test the function call in the pipeline
