Using Apache Airflow Tool, Implement a DAG for a batch processing pipeline to get a directory from a remote system

How can I implement a DAG in Apache Airflow for the following Python code? The task performed by the code is to fetch a directory from a GPU server to the local system. The code works fine in a Jupyter Notebook. Please help me implement it in Airflow... I am new to this. Thanks.

```python
import pysftp
import os

myHostname = "hostname"
myUsername = "username"
myPassword = "pwd"

with pysftp.Connection(host=myHostname, username=myUsername, password=myPassword) as sftp:
    print("Connection successfully established ... ")
    src = '/path/src/'
    dst = '/home/path/path/destination'
    os.makedirs(dst, exist_ok=True)  # create destination; don't fail if it already exists
    sftp.get_d(src, dst, preserve_mtime=True)
    print("Fetched source images from GPU server to local directory")
# connection closed automatically at the end of the with-block
```

  • I admit there are not many examples, but they may help
  • For the SSH connection, see