Lisp: remove the contents of one list from another list
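I have a list F of strings like this:
("hello word i'am walid" "goodbye madame")
=> this list contains two string elements
I also have another list, called S, like this: ("word" "madame")
=> this one contains two words
Now I want to remove the elements of list S from every string in list F, so the output should look like this: ("hello i'am walid" "goodbye")
I have already found this function: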
(defun remove-string (rem-string full-string &key from-end (test #'eql)
                      test-not (start1 0) end1 (start2 0) end2 key)
  "returns full-string with rem-string removed"
  (let ((subst-point (search rem-string full-string
                             :from-end from-end
                             :test test :test-not test-not
                             :start1 start1 :end1 end1
                             :start2 start2 :end2 end2 :key key)))
    (if subst-point
        (concatenate 'string
                     (subseq full-string 0 subst-point)
                     (subseq full-string (+ subst-point (length rem-string))))
        full-string)))
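Example:
(remove-string "walid" "hello i'am walid")
=> "hello i'am "
But there is a problem.
Example:
(remove-string "wa" "hello i'am walid")
=> "hello i'am lid"
But the output should be "hello i'am walid".
In other words, I want to remove only whole words from the string, not parts of words.
Please help me, thank you.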
The Common Lisp Cookbook provides this function:
(defun replace-all (string part replacement &key (test #'char=))
  "Returns a new string in which all the occurrences of the part
is replaced with replacement."
  (with-output-to-string (out)
    (loop with part-length = (length part)
          for old-pos = 0 then (+ pos part-length)
          for pos = (search part string
                            :start2 old-pos
                            :test test)
          do (write-string string out
                           :start old-pos
                           :end (or pos (length string)))
          when pos do (write-string replacement out)
          while pos)))
Using that function:
(loop for raw-string in '("hello word i'am walid" "goodbye madame")
      collect (reduce (lambda (source-string bad-word)
                        (replace-all source-string bad-word ""))
                      '("word" "madame")
                      :initial-value raw-string))
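Note that replace-all matches substrings, so on its own it would still strip the "wa" out of "walid", and it leaves the surrounding spaces behind. As a minimal sketch of a whole-word alternative using only standard Common Lisp, assuming words are separated by spaces and ignoring punctuation, the helpers below split each string into words, drop the unwanted ones, and join the rest. The names split-words, remove-words and remove-words-from-all are hypothetical and not part of the Cookbook.

```lisp
;; Hypothetical helpers for illustration; they assume words are separated
;; by spaces and do nothing special about punctuation.

(defun split-words (string)
  "Split STRING on spaces, dropping empty pieces."
  (loop with len = (length string)
        for start = 0 then (1+ end)
        for end = (or (position #\Space string :start start) len)
        when (< start end) collect (subseq string start end)
        while (< end len)))

(defun remove-words (string words)
  "Return STRING without any whole word that is STRING= to a member of WORDS.
The remaining words are joined with single spaces."
  (format nil "~{~A~^ ~}"
          (remove-if (lambda (word) (member word words :test #'string=))
                     (split-words string))))

(defun remove-words-from-all (strings words)
  "Apply REMOVE-WORDS to every element of STRINGS."
  (mapcar (lambda (string) (remove-words string words)) strings))

;; Usage with the lists F and S from the question:
;; (remove-words-from-all '("hello word i'am walid" "goodbye madame")
;;                        '("word" "madame"))
;; => ("hello i'am walid" "goodbye")
```

Because the comparison is made word by word, "wa" no longer matches inside "walid". The trade-off is that runs of spaces in the input are collapsed to one.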
You can use the cl-ppcre library for regular expressions. Its regex flavor understands the word boundary \b.
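The replacement could work like this (note that the backslash must be doubled inside a Lisp string literal):
(cl-ppcre:regex-replace-all "\\bwa\\b" "ba wa walid" "")
=> "ba  walid"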
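I guess you want to collapse all the whitespace around the removed word into a single space:
(cl-ppcre:regex-replace-all "\\s*\\bwa\\b\\s*" "ba wa walid" " ")
=> "ba walid"
See the cl-ppcre documentation for details.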
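If pulling in cl-ppcre is not an option, the same word-boundary idea can be approximated by hand. The following is only a sketch under stated assumptions: word-char-p, whole-word-at-p and remove-whole-word are hypothetical names, the word to remove is assumed to be non-empty and to start and end with a word character, and "word character" is taken to mean an alphanumeric or an underscore.

```lisp
;; Hand-rolled approximation of \bWORD\b; all names here are hypothetical.

(defun word-char-p (char)
  "Assumed definition of a word character: alphanumeric or underscore."
  (or (alphanumericp char) (char= char #\_)))

(defun whole-word-at-p (string start end)
  "True if the span [START, END) of STRING has no word character
directly before or after it."
  (and (or (zerop start)
           (not (word-char-p (char string (1- start)))))
       (or (= end (length string))
           (not (word-char-p (char string end))))))

(defun remove-whole-word (word string)
  "Return STRING with every whole-word occurrence of WORD removed.
WORD is assumed to be a non-empty string."
  (with-output-to-string (out)
    (loop with word-length = (length word)
          with copy-from = 0
          for pos = (search word string)
                  then (search word string :start2 (1+ pos))
          while pos
          when (and (>= pos copy-from)
                    (whole-word-at-p string pos (+ pos word-length)))
            ;; Copy the text before the match, then skip over the match.
            do (write-string string out :start copy-from :end pos)
               (setf copy-from (+ pos word-length))
          ;; Copy whatever follows the last removed match.
          finally (write-string string out :start copy-from))))

;; (remove-whole-word "wa" "ba wa walid")      => "ba  walid"
;; (remove-whole-word "wa" "hello i'am walid") => "hello i'am walid"
```

Like the first regex above, this leaves the spaces on both sides of the removed word in place, so collapsing them is a separate step.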
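Update: you have extended the question to punctuation. This is actually somewhat more complicated, because you now have three kinds of characters: alphanumerics, punctuation, and whitespace.
I can't give a complete solution here, but the outline I envision is to create boundary definitions between all three of these types. For that you need positive/negative lookaheads/lookbehinds. Then look at the string to be replaced, check whether it starts or ends with punctuation, and append or prepend the corresponding boundary to the effective expression.
To define the boundaries in a readable way, cl-ppcre's parse tree syntax might be useful.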
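To make the three character kinds concrete, here is a small sketch in plain Common Lisp. It only illustrates the boundary idea and is not a cl-ppcre solution: char-kind and boundary-p are hypothetical names, and treating every character that is neither alphanumeric nor whitespace as punctuation is an assumption.

```lisp
;; Hypothetical helpers illustrating boundaries between three character kinds.

(defun char-kind (char)
  "Classify CHAR as :alphanumeric, :whitespace or :punctuation.
Assumption: anything that is neither alphanumeric nor whitespace
counts as punctuation."
  (cond ((alphanumericp char) :alphanumeric)
        ((member char '(#\Space #\Tab #\Newline #\Return)) :whitespace)
        (t :punctuation)))

(defun boundary-p (string index)
  "True if INDEX is at either end of STRING or lies between two
characters of different kinds."
  (or (zerop index)
      (= index (length string))
      (not (eq (char-kind (char string (1- index)))
               (char-kind (char string index))))))

;; (boundary-p "word, next" 4) is true: "d" is alphanumeric, "," is punctuation.
;; (boundary-p "word, next" 2) is false: "o" and "r" are both alphanumeric.
```

A removal routine could then require boundary-p to hold at both ends of a candidate match, which also covers words that themselves start or end with punctuation.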