How to replace environment variable values in a YAML file that is parsed by a Python script

I need to use the environment variable "PATH" in a YAML file that has to be parsed by a script.

This is the environment variable as set in my terminal:

$ echo $PATH
/Users/abc/Downloads/tbwork

This is my sample.yml:

---
Top: ${PATH}/my.txt
Vars:
- a
- b

When I parse this YAML file with my script, I don't see the actual value of the PATH variable.

This is my script:

import yaml

with open("sample.yml", "r") as stream:
    # safe_load_all yields one Python object per YAML document in the file
    for doc in yaml.safe_load_all(stream):
        for k, v in doc.items():
            print(k, "->", v)
        print()

Output:

Top -> ${PATH}/my.txt
Vars -> ['a', 'b']

Expected output:

Top -> /Users/abc/Downloads/tbwork/my.txt
Vars -> ['a', 'b']

If I am doing this wrong, can someone help me figure out the correct way?

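The PyYAML library does not resolve environment variables by default. You need to define an implicit resolver that matches the regular expression identifying an environment variable and runs a function to resolve it.

You can do this with yaml.add_implicit_resolver and yaml.add_constructor. In the code below, you define a resolver that matches ${ env variable } in a YAML value and calls the function path_constructor to look up the environment variable.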

import yaml
import re
import os

# The leading \$ must be escaped; a bare $ is the end-of-string anchor
path_matcher = re.compile(r'\$\{([^}^{]+)\}')
def path_constructor(loader, node):
  ''' Extract the matched value, expand env variable, and replace the match '''
  value = node.value
  match = path_matcher.match(value)
  env_var = match.group()[2:-1]
  return os.environ.get(env_var) + value[match.end():]

yaml.add_implicit_resolver('!path', path_matcher)
yaml.add_constructor('!path', path_constructor)

data = """
env: ${VAR}/file.txt
other: file.txt
"""

if __name__ == '__main__':
  p = yaml.load(data, Loader=yaml.FullLoader)
  print(os.environ.get('VAR')) ## /home/abc
  print(p['env']) ## /home/abc/file.txt

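Warning: do not run this if you are not the one who specifies the environment variables (or any other untrusted input), because FullLoader had remote code execution vulnerabilities as of July 2020.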
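
One more caveat with path_constructor above: os.environ.get returns None when the variable is not set, so the string concatenation fails with a TypeError. The sketch below is one possible variant, not part of the original answer. It registers the resolver on a SafeLoader subclass, which also sidesteps the FullLoader concern, and raises a descriptive error for unset variables. The names EnvLoader, env_matcher and env_constructor are made up for illustration.

```python
import os
import re
import yaml

# Same idea as path_matcher above: a value that starts with ${NAME}
env_matcher = re.compile(r'\$\{([^}^{]+)\}')


class EnvLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass so the global loaders stay untouched."""


def env_constructor(loader, node):
    value = loader.construct_scalar(node)
    match = env_matcher.match(value)
    name = match.group(1)
    if name not in os.environ:
        # Fail loudly instead of hitting "NoneType + str" later on
        raise KeyError(f"environment variable {name!r} is not set")
    return os.environ[name] + value[match.end():]


EnvLoader.add_implicit_resolver('!path', env_matcher, None)
EnvLoader.add_constructor('!path', env_constructor)

os.environ['VAR'] = '/home/abc'  # set here only so the example is reproducible
config = yaml.load("env: ${VAR}/file.txt\nother: file.txt\n", Loader=EnvLoader)
print(config['env'])  # /home/abc/file.txt
```

Whether a missing variable should raise, fall back to a default, or be left as-is depends on the project; raising is just the choice made in this sketch.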

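If you don't want to modify the global/default yaml loader, here is an alternative version that uses a new loader class.

More importantly, it correctly replaces interpolated strings that are not just the environment variable, e.g. path/to/${SOME_VAR}/and/${NEXT_VAR}/foo/bar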

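import os
import re
import yaml

# Match any scalar that contains at least one ${VAR} placeholder
path_matcher = re.compile(r'.*\$\{([^}^{]+)\}.*')

def path_constructor(loader, node):
    # os.path.expandvars substitutes every $VAR / ${VAR} it finds
    return os.path.expandvars(node.value)

class EnvVarLoader(yaml.SafeLoader):
    pass

EnvVarLoader.add_implicit_resolver('!path', path_matcher, None)
EnvVarLoader.add_constructor('!path', path_constructor)

# configPath is the path to your YAML file, defined elsewhere
with open(configPath) as f:
    c = yaml.load(f, Loader=EnvVarLoader)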

There is a nice library for this, envyaml. With it, it's very simple:

from envyaml import EnvYAML

# read file env.yaml and parse config
env = EnvYAML('env.yaml')

Using yaml's add_implicit_resolver and add_constructor worked for me, but only with the example above adapted like this:

import yaml
import re
import os
os.environ['VAR']="you better work"
# The leading \$ must be escaped; a bare $ is the end-of-string anchor
path_matcher = re.compile(r'\$\{([^}^{]+)\}')
def path_constructor(loader, node):

  ''' Extract the matched value, expand env variable, and replace the match '''
  print("i'm here")
  value = node.value
  match = path_matcher.match(value)
  env_var = match.group()[2:-1]
  return os.environ.get(env_var) + value[match.end():]

yaml.add_implicit_resolver('!path', path_matcher, None, yaml.SafeLoader)
yaml.add_constructor('!path', path_constructor, yaml.SafeLoader)

data = """
env: ${VAR}/file.txt
other: file.txt
"""

if __name__ == '__main__':
  p = yaml.safe_load(data)
  print(os.environ.get('VAR')) ## you better work
  print(p['env']) ## you better work/file.txt

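You can see how this is done in a write-up that led to the very small library pyaml-env, created for ease of use so that we don't repeat the same code in every project.

So, using that library, your example YAML becomes: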

---
Top: !ENV ${PATH}/my.txt
Vars:
- a
- b

and you parse it with parse_config:

from pyaml_env import parse_config
config = parse_config('path/to/config.yaml')

print(config)
# outputs the following, with the environment variables resolved
# {
#     'Top': '/Users/abc/Downloads/tbwork/my.txt',
#     'Vars': ['a', 'b']
# }

If you want, you can also use default values, like this:

---
Top: !ENV ${PATH:'~/data/'}/my.txt
Vars:
- a
- b
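
How a default value of this kind might be resolved can be sketched with a single re.sub call. This is an illustrative stand-in, not pyaml-env's actual implementation: the helper name resolve_env, the pattern, and the exact handling of the ${NAME:default} form (unquoted defaults, placeholders without a value left untouched) are all assumptions.

```python
import os
import re

# ${NAME} or ${NAME:default}; group 1 is the name, group 2 the optional default
ENV_PATTERN = re.compile(r'\$\{(\w+)(?::([^}]*))?\}')


def resolve_env(text, environ=None):
    """Replace every ${NAME} / ${NAME:default} in text (hypothetical helper)."""
    environ = os.environ if environ is None else environ

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in environ:
            return environ[name]
        if default is not None:
            return default
        # No value and no default: leave the placeholder untouched
        return match.group(0)

    return ENV_PATTERN.sub(substitute, text)


env = {'HOME_DIR': '/Users/abc'}
print(resolve_env('${HOME_DIR}/my.txt', env))            # /Users/abc/my.txt
print(resolve_env('${DATA_DIR:/tmp/data}/my.txt', env))  # /tmp/data/my.txt
print(resolve_env('${MISSING}/my.txt', env))             # ${MISSING}/my.txt
```

Passing the environment as a plain dict keeps the helper easy to test; leaving environ unset falls back to os.environ.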

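About the implementation, in short: for PyYAML to be able to resolve environment variables, we need three main things:

  1. A regex pattern for identifying environment variables, e.g. pattern = re.compile(r'.*?\$\{(\w+)\}.*?')

  2. A tag that signals there are one or more environment variables to resolve, e.g. !ENV.

  3. A function that the loader will use to resolve the environment variables.

A complete example: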

import os
import re
import yaml


def parse_config(path=None, data=None, tag='!ENV'):
    """
    Load a yaml configuration file and resolve any environment variables
    The environment variables must have !ENV before them and be in this format
    to be parsed: ${VAR_NAME}.
    E.g.:
    database:
        host: !ENV ${HOST}
        port: !ENV ${PORT}
    app:
        log_path: !ENV '/var/${LOG_PATH}'
        something_else: !ENV '${AWESOME_ENV_VAR}/var/${A_SECOND_AWESOME_VAR}'
    :param str path: the path to the yaml file
    :param str data: the yaml data itself as a stream
    :param str tag: the tag to look for
    :return: the dict configuration
    :rtype: dict[str, T]
    """
    # pattern for global vars: look for ${word}
    # (the $ and braces are escaped so they match literally)
    pattern = re.compile(r'.*?\$\{(\w+)\}.*?')
    loader = yaml.SafeLoader

    # the tag will be used to mark where to start searching for the pattern
    # e.g. somekey: !ENV somestring${MYENVVAR}blah blah blah
    loader.add_implicit_resolver(tag, pattern, None)

    def constructor_env_variables(loader, node):
        """
        Extracts the environment variable from the node's value
        :param yaml.Loader loader: the yaml loader
        :param node: the current node in the yaml
        :return: the parsed string that contains the value of the environment
        variable
        """
        value = loader.construct_scalar(node)
        match = pattern.findall(value)  # to find all env variables in line
        if match:
            full_value = value
            for g in match:
                full_value = full_value.replace(
                    f'${{{g}}}', os.environ.get(g, g)
                )
            return full_value
        return value

    loader.add_constructor(tag, constructor_env_variables)

    if path:
        with open(path) as conf_data:
            return yaml.load(conf_data, Loader=loader)
    elif data:
        return yaml.load(data, Loader=loader)
    else:
        raise ValueError('Either a path or data should be defined as input')
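
To see the tag-based approach in action without a config file, here is a condensed sketch of the same idea applied to an inline document. It differs from the function above in one respect: it registers the resolver on a SafeLoader subclass, because calling add_implicit_resolver on yaml.SafeLoader itself changes that class for every other caller in the process. The names EnvTagLoader and construct_env and the demo variable values are made up for illustration.

```python
import os
import re
import yaml

ENV_TAG = '!ENV'
ENV_PATTERN = re.compile(r'.*?\$\{(\w+)\}.*?')


class EnvTagLoader(yaml.SafeLoader):
    """Hypothetical subclass: keeps yaml.SafeLoader itself unmodified."""


def construct_env(loader, node):
    value = loader.construct_scalar(node)
    for name in ENV_PATTERN.findall(value):
        # Same fallback as the full example: an unset variable becomes its name
        value = value.replace(f'${{{name}}}', os.environ.get(name, name))
    return value


EnvTagLoader.add_implicit_resolver(ENV_TAG, ENV_PATTERN, None)
EnvTagLoader.add_constructor(ENV_TAG, construct_env)

os.environ['HOST'] = 'db.example.com'  # demo values only
os.environ['LOG_PATH'] = 'log/app'
os.environ.pop('NOT_SET_ANYWHERE', None)

data = """
database:
    host: !ENV ${HOST}
app:
    log_path: !ENV '/var/${LOG_PATH}'
    missing: !ENV ${NOT_SET_ANYWHERE}
"""
config = yaml.load(data, Loader=EnvTagLoader)
print(config['database']['host'])  # db.example.com
print(config['app']['log_path'])   # /var/log/app
print(config['app']['missing'])    # NOT_SET_ANYWHERE
```

The last line shows a consequence of os.environ.get(g, g): an unset variable is silently replaced by its own name, which may or may not be what you want.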

You can also do this from the terminal, like so:

ENV_NAME=test
cat << EOF > new.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${ENV_NAME}
EOF

Then run cat new.yaml to check the result:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
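
The same expand-first idea can be kept inside Python by running the raw text through os.path.expandvars before handing it to the parser. This is a minimal sketch, assuming every $NAME or ${NAME} in the file is meant to be substituted; the helper name load_expanded is made up here.

```python
import os
import yaml


def load_expanded(text):
    """Expand $NAME / ${NAME} in raw YAML text, then parse it (hypothetical helper)."""
    # References to unset variables are left unchanged by os.path.expandvars
    return yaml.safe_load(os.path.expandvars(text))


os.environ['ENV_NAME'] = 'test'  # demo value; normally set in the shell

raw = """
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${ENV_NAME}
"""
config = load_expanded(raw)
print(config['metadata']['name'])  # test
```

Because the substitution happens before parsing, a value that contains YAML-special characters (for example a colon followed by a space) can change the structure of the document, so this suits simple, trusted values.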