How to remove a .ds pattern from a string using Python
I have this string:
a="""SELECT
transform_abc.ds AS "ds",
SUM(transform_abc.dollars) AS "dollars",
transform_abc.unit AS "unit"
FROM fct_table_abc transform_abc
WHERE
(
transform_abc.is_charged > 0
OR transform_abc.account_status = 0
)
AND transform_abc.ds = '2020-02-20'
GROUP BY
transform_abc.ds,
transform_abc.unit"""
I need to remove the columns containing ds that come after SELECT and GROUP BY, but not the one after WHERE.
Expected output:
a="""SELECT
SUM(transform_abc.dollars) AS "dollars",
transform_abc.unit AS "unit"
FROM fct_table_abc transform_abc
WHERE
(
transform_abc.is_charged > 0
OR transform_abc.account_status = 0
)
AND transform_abc.ds = '2020-02-20'
GROUP BY
transform_abc.unit"""
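transform_abc is just a table name, and it could be any table name, so we can't hard-code it in the regex.
I'm not sure how to solve this.
Please find a solution below. It may not be syntactically correct, as I haven't run it, but the logic should hold.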
#!/usr/bin/python
import re
a="""SELECT
transform_abc.ds AS "ds",
SUM(transform_abc.dollars) AS "dollars",
transform_abc.unit AS "unit"
FROM fct_table_abc transform_abc
WHERE
(
transform_abc.is_charged > 0
OR transform_abc.account_status = 0
)
AND transform_abc.ds = '2020-02-20'
GROUP BY
transform_abc.ds,
transform_abc.unit"""
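# Raw strings keep the regex escapes intact; \.ds matches a literal ".ds".
# Note: these patterns assume the table alias contains an underscore.
a = re.sub(r'SELECT[\s]+([a-zA-Z0-9.])+_([a-zA-Z0-9.])+\.ds', 'SELECT', a)
a = re.sub(r'SELECT[\s]+[a-zA-Z]+[\s]+(")ds(",)', 'SELECT', a)
a = re.sub(r'GROUP BY[\s]+([a-zA-Z0-9.])+_([a-zA-Z0-9.])+\.ds,', 'GROUP BY', a)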
a will now contain the expected result.
Thanks, KM
Here's one way:
import re
a="""SELECT
transform_abc.ds AS "ds",
SUM(transform_abc.dollars) AS "dollars",
transform_abc.unit AS "unit"
FROM fct_table_abc transform_abc
WHERE
(
transform_abc.is_charged > 0
OR transform_abc.account_status = 0
)
AND transform_abc.ds = '2020-02-20'
GROUP BY
transform_abc.ds,
transform_abc.unit"""
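# \1 puts the clause keyword back; (?<=\s) and .*\n? drop the whole ds line, indented or not.
res = re.sub(r'((?:SELECT|GROUP BY)\s+(?:(?!WHERE)[\s\S])*?)(?<=\s)[\w.]+\.ds\b.*\n?', r'\1', a)
print(res)
Output: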
SELECT
SUM(transform_abc.dollars) AS "dollars",
transform_abc.unit AS "unit"
FROM fct_table_abc transform_abc
WHERE
(
transform_abc.is_charged > 0
OR transform_abc.account_status = 0
)
AND transform_abc.ds = '2020-02-20'
GROUP BY
transform_abc.unit
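As a rough sketch of how a substitution of this kind behaves with a different table alias, the snippet below wraps it in a small helper. The helper name `drop_ds_columns`, the alias `orders`, and the table `fct_orders` are made up for illustration. The pattern assumes one column per line and that the ds column is never the last one in its clause (otherwise a trailing comma would be left behind).

```python
import re

# Hypothetical helper, not part of the answer above: removes "<alias>.ds"
# column lines that follow SELECT or GROUP BY and leaves WHERE untouched.
DS_COLUMN = re.compile(
    r'((?:SELECT|GROUP BY)\s+(?:(?!WHERE)[\s\S])*?)'  # keyword plus any columns before the ds column
    r'(?<=\s)[\w.]+\.ds\b.*\n?'                        # the ds column line and its line break
)

def drop_ds_columns(sql):
    # \1 restores the keyword and the columns that preceded the ds column.
    return DS_COLUMN.sub(r'\1', sql)

# Made-up query using a different alias than the one in the question.
sample = """SELECT
orders.ds AS "ds",
SUM(orders.dollars) AS "dollars"
FROM fct_orders orders
WHERE
orders.ds = '2020-02-20'
GROUP BY
orders.ds,
orders.unit"""

cleaned = drop_ds_columns(sample)
print(cleaned)
```

The negative lookahead `(?!WHERE)` is what stops a match that starts at SELECT from reaching the ds comparison in the WHERE clause, so nothing in the pattern depends on the alias itself.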
If the SQL is on a single line, use:
((?:SELECT|GROUP BY)\s+(?:(?!WHERE)[\s\S])*?)\s*[\w.]+\.ds[^,]*,
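A minimal usage sketch for the single-line pattern, assuming the replacement is the backreference `\1` so that the SELECT or GROUP BY keyword is kept. The query and the alias `t` are invented for the example, and the final whitespace clean-up is an extra step that is not part of the pattern.

```python
import re

# Made-up single-line query; the alias "t" is an assumption for this example.
sql = ('SELECT t.ds AS "ds", SUM(t.dollars) AS "dollars" '
       "FROM fct_table_abc t WHERE t.ds = '2020-02-20' "
       'GROUP BY t.ds, t.unit')

pattern = r'((?:SELECT|GROUP BY)\s+(?:(?!WHERE)[\s\S])*?)\s*[\w.]+\.ds[^,]*,'

# \1 restores the keyword (and any columns listed before the ds column).
one_line = re.sub(pattern, r'\1', sql)

# Each removed column leaves a doubled space; collapse it. This assumes no
# string literal in the query contains consecutive spaces.
one_line = re.sub(r' {2,}', ' ', one_line)
print(one_line)
```

Because the pattern consumes the comma after the ds column, it relies on that column not being the last one in its clause.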