How to output only a capture group with sed

Question

I have an input file:


Werkzeug==2.0.2 # https://github.com/pallets/werkzeug
ipdb==0.13.9  # https://github.com/gotcha/ipdb
psycopg2==2.9.1  # https://github.com/psycopg/psycopg2
watchgod==0.7  # https://github.com/samuelcolvin/watchgod

# Testing
# ------------------------------------------------------------------------------
mypy==0.910  # https://github.com/python/mypy
django-stubs==1.8.0  # https://github.com/typeddjango/django-stubs
pytest==6.2.5  # https://github.com/pytest-dev/pytest
pytest-sugar==0.9.4  # https://github.com/Frozenball/pytest-sugar
djangorestframework-stubs==1.4.0  # https://github.com/typeddjango/djangorestframework-stubs

# Documentation
# ------------------------------------------------------------------------------
sphinx==4.2.0  # https://github.com/sphinx-doc/sphinx
sphinx-autobuild==2021.3.14 # https://github.com/GaretJax/sphinx-autobuild

# Code quality
# ------------------------------------------------------------------------------
flake8==3.9.2  # https://github.com/PyCQA/flake8
flake8-isort==4.0.0  # https://github.com/gforcada/flake8-isort
coverage==6.0.2  # https://github.com/nedbat/coveragepy
black==21.9b0  # https://github.com/psf/black
pylint-django==2.4.4  # https://github.com/PyCQA/pylint-django
pylint-celery==0.3  # https://github.com/PyCQA/pylint-celery
pre-commit==2.15.0  # https://github.com/pre-commit/pre-commit

# Django
# ------------------------------------------------------------------------------
factory-boy==3.2.0  # https://github.com/FactoryBoy/factory_boy

django-debug-toolbar==3.2.2  # https://github.com/jazzband/django-debug-toolbar
django-extensions==3.1.3  # https://github.com/django-extensions/django-extensions
django-coverage-plugin==2.0.1  # https://github.com/nedbat/django_coverage_plugin
pytest-django==4.4.0  # https://github.com/pytest-dev/pytest-django

I am trying to use this command to extract the part before the # from every line that starts with pytest:

sed -nE "s/(^pytest.+)#/\1/p" ./requirements/local.txt

Expected output

pytest==6.2.5  
pytest-sugar==0.9.4  
pytest-django==4.4.0

Actual output

pytest==6.2.5   https://github.com/pytest-dev/pytest
pytest-sugar==0.9.4   https://github.com/Frozenball/pytest-sugar
pytest-django==4.4.0   https://github.com/pytest-dev/pytest-django

Any help getting the expected output?

This reference did not help with this particular problem:

How can I output only captured groups with sed?

Answer 1

Using sed:

sed -nE 's/^(pytest[^=]*=[^[:blank:]]*).*/\1/p' file

pytest==6.2.5
pytest-sugar==0.9.4
pytest-django==4.4.0

However, a grep -o solution would be simpler:

grep -o '^pytest[^=]*=[^[:blank:]]*' file

pytest==6.2.5
pytest-sugar==0.9.4
pytest-django==4.4.0

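Explanation:

^pytest: matches pytest at the start of the line
[^=]*: matches 0 or more characters other than =
=: matches a single =
[^[:blank:]]*: matches 0 or more non-blank characters (anything other than a space or tab)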
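To see how the pieces above fit together, here is a minimal sketch that runs both commands on a three-line sample. The temporary file and the variable names are assumptions made for illustration only. In the sed version the pattern consumes the whole line and \1 puts back only the captured part; grep -o prints only the matched text, so it needs no capture group.

```shell
# Build a small sample in a temporary file (hypothetical input for illustration).
sample=$(mktemp)
printf '%s\n' \
  'mypy==0.910  # https://github.com/python/mypy' \
  'pytest==6.2.5  # https://github.com/pytest-dev/pytest' \
  'pytest-sugar==0.9.4  # https://github.com/Frozenball/pytest-sugar' > "$sample"

# sed: match the entire line, then replace it with capture group 1 only.
sed_out=$(sed -nE 's/^(pytest[^=]*=[^[:blank:]]*).*/\1/p' "$sample")

# grep -o: print only the part of each line that matched the pattern.
grep_out=$(grep -o '^pytest[^=]*=[^[:blank:]]*' "$sample")

printf '%s\n' "$sed_out"
rm -f "$sample"
```

Both variables should hold the same two lines, pytest==6.2.5 and pytest-sugar==0.9.4, with no trailing whitespace.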

Answer 2

Your regex is missing a pattern for the text after the #. This should fix it:

$ sed -nE "s/(^pytest.+)#.*/\1/p" ./requirements/local.txt
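The reason the URL survives in the original attempt is that s/// only replaces the text the pattern actually matched; anything after the match is left untouched. A minimal sketch on a single line, assuming the replacement in both commands is \1 (the variable names are illustrative):

```shell
line='pytest==6.2.5  # https://github.com/pytest-dev/pytest'

# Original attempt: the match stops at '#', so only 'pytest==6.2.5  #' is
# replaced by group 1 and the unmatched tail (the URL) stays in the output.
before=$(printf '%s\n' "$line" | sed -nE 's/(^pytest.+)#/\1/p')

# Fixed version: '.*' extends the match to the end of the line, so the
# whole line is replaced by group 1.
after=$(printf '%s\n' "$line" | sed -nE 's/(^pytest.+)#.*/\1/p')

printf '[%s]\n' "$before" "$after"
```

Note that the fixed version keeps the two spaces that precede the #, because .+ is greedy and they are part of the group.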

Answer 3

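First solution: With awk you could try the following. It uses awk's match function; it was written and tested in GNU awk and should work in any awk. In short, match applies the regex ^pytest[^ ]* to match from the leading pytest up to the first space, and the matched value is printed with awk's substr function.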

awk 'match($0,/^pytest[^ ]*/){print substr($0,RSTART,RLENGTH)}' Input_file
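To make the roles of RSTART and RLENGTH concrete, this small sketch prints both values next to the extracted substring for one sample line. The variable names are illustrative assumptions, not part of the answer above.

```shell
line='pytest-sugar==0.9.4  # https://github.com/Frozenball/pytest-sugar'

# match() sets RSTART (1-based start of the match) and RLENGTH (its length);
# substr() then cuts exactly that span out of the record.
awk_out=$(printf '%s\n' "$line" |
  awk 'match($0, /^pytest[^ ]*/) { print RSTART, RLENGTH, substr($0, RSTART, RLENGTH) }')

printf '%s\n' "$awk_out"   # prints: 1 19 pytest-sugar==0.9.4
```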

Second solution: With GNU awk, try the following, which makes use of its RS variable.

awk -v RS='(^|\n)pytest[^ ]*' 'RT{sub(/^\n*/,"",RT);print RT}' Input_file

Answer 4

As an alternative using awk, you can also set the field separator to # preceded by optional whitespace, and print the first column if it starts with pytest:

awk -F"[[:blank:]]*#" '/^pytest/ {print $1}' ./requirements/local.txt

Output

pytest==6.2.5
pytest-sugar==0.9.4
pytest-django==4.4.0

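If the # is not always present, you can also make the pattern more specific so that it matches the version digits, and then print the first field: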

awk '/^pytest[^[:blank:]]*==[0-9]+(\.[0-9]+)*/ {print $1}' file

Answer 5

A sed one-liner would be:

sed -e '/^pytest/!d' -e 's/[[:blank:]]*#.*//' file

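The first expression deletes the lines that do not start with pytest. The second removes the comment part (including the whitespace before the #), if there is one.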
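Because the two expressions are independent, this approach also copes with lines that carry no comment. A minimal sketch, where the pytest-cov entry is a hypothetical line added only to show that case:

```shell
# Three sample lines: one with a comment, one without (hypothetical), one non-pytest.
filtered=$(printf '%s\n' \
  'pytest==6.2.5  # https://github.com/pytest-dev/pytest' \
  'pytest-cov==3.0.0' \
  'black==21.9b0  # https://github.com/psf/black' |
  sed -e '/^pytest/!d' -e 's/[[:blank:]]*#.*//')

printf '%s\n' "$filtered"
```

The line without a # passes through unchanged, since the second expression simply finds nothing to substitute.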

Answer 6

Using sed

$ sed -n '/^pytest/s/#.*//p' input_file
pytest==6.2.5
pytest-sugar==0.9.4
pytest-django==4.4.0
