Routing Python logging messages to CloudWatch in AWS Glue
I have written a Python (PySpark) library that I use in my AWS Glue scripts. The library logs in the usual way: import logging; log = logging.getLogger(__name__); log.info(message).
I want these logs to appear in CloudWatch when my Glue job runs. How do I route these Python log messages to CloudWatch?
You need to attach a CloudWatch handler to the Python logger so that its records are sent to CloudWatch.
One way is to use the CloudWatch handler provided by watchtower, which you import into Glue as a zip file in the usual manner.
import logging
import os
import sys

from watchtower import CloudWatchLogHandler
from awsglue.utils import getResolvedOptions

args = getResolvedOptions(sys.argv, ['JOB_NAME'])

# Prefix the log stream name with the job run ID so that the stream
# appears when you click 'see logs' in the Glue web UI.
job_run_id = args['JOB_RUN_ID']
lsn = f"{job_run_id}_custom"

# Set the default region so that boto3 creates the logs client in the right region.
os.environ["AWS_DEFAULT_REGION"] = "eu-west-1"

# Newer watchtower releases (2.0+) name these arguments log_group_name and log_stream_name.
cw = CloudWatchLogHandler(log_group="/aws-glue/jobs/logs-v2", stream_name=lsn)

# Demo: route the logs made by requests (via urllib3) to CloudWatch.
import requests
rlog = logging.getLogger("urllib3")
rlog.setLevel(logging.DEBUG)
rlog.handlers = []
rlog.addHandler(cw)

r = requests.get('https://www.bbc.co.uk/news')
# The logs generated by urllib3 will now appear in CloudWatch.
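The demo above targets urllib3, but the question is about a library that logs through logging.getLogger(__name__). The same idea should carry over: attach the handler once to the library's top-level logger and let the module loggers propagate up to it. Below is a minimal sketch of that propagation using only the standard library. The package name "mylib", the ListHandler class and the attach_handler helper are all hypothetical, and ListHandler is an in-memory stand-in for the watchtower handler (cw) so the snippet can run outside Glue.

```python
import logging


class ListHandler(logging.Handler):
    """In-memory stand-in for a CloudWatch handler (assumption, for illustration)."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        # Store the formatted record instead of shipping it anywhere.
        self.messages.append(self.format(record))


def attach_handler(logger_name, handler, level=logging.INFO):
    """Attach `handler` to the named logger; child loggers propagate up to it."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    if handler not in logger.handlers:  # avoid duplicate lines if called twice
        logger.addHandler(handler)
    return logger


# "mylib" is a hypothetical package name. In a Glue job you would pass the
# watchtower handler (cw) here instead of the in-memory stand-in.
handler = ListHandler()
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
attach_handler("mylib", handler)

# Inside the library, a module logs via logging.getLogger(__name__), which
# yields a name such as "mylib.transforms"; its records reach the "mylib" logger.
log = logging.getLogger("mylib.transforms")
log.info("loaded %d rows", 42)
log.debug("dropped, because the effective level is INFO")
```

Attaching at the package level means the library code itself does not need to know about CloudWatch; only the Glue script that sets up the handler does.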