Lisp: remove the contents of one list from another list

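I have a list of strings, F, like this:

("hello word i'am walid" "goodbye madame") => this list contains two elements, both strings

I also have another list, S, like this: ("word" "madame") => it contains two words

Now I want to remove the elements of S from every string in F. The output should look like this: ("hello i'am walid" "goodbye")

I have already found this function: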

(defun remove-string (rem-string full-string &key from-end (test #'eql)
                      test-not (start1 0) end1 (start2 0) end2 key)
  "returns full-string with rem-string removed"
  (let ((subst-point (search rem-string full-string 
                             :from-end from-end
                             :test test :test-not test-not
                             :start1 start1 :end1 end1
                             :start2 start2 :end2 end2 :key key)))
    (if subst-point
        (concatenate 'string
                     (subseq full-string 0 subst-point)
                     (subseq full-string (+ subst-point (length rem-string))))
        full-string)))

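Example: (remove-string "walid" "hello i'am walid") => output "hello i'am " (with a trailing space)

But there is a problem.

Example: (remove-string "wa" "hello i'am walid") => output "hello i'am lid"

But the output should be "hello i'am walid". In other words, I only want to remove exact words from the string, not substrings of other words.

Please help me, thank you.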

The Common Lisp Cookbook provides this function:

(defun replace-all (string part replacement &key (test #'char=))
  "Returns a new string in which all the occurrences of the part
are replaced with replacement."
    (with-output-to-string (out)
      (loop with part-length = (length part)
            for old-pos = 0 then (+ pos part-length)
            for pos = (search part string
                              :start2 old-pos
                              :test test)
            do (write-string string out
                             :start old-pos
                             :end (or pos (length string)))
            when pos do (write-string replacement out)
            while pos))) 

Using that function:

(loop for raw-string in '("hello word i'am walid" "goodbye madame")
        collect (reduce (lambda (source-string bad-word)
                          (replace-all source-string bad-word ""))
                        '("word" "madame")
                     :initial-value raw-string))
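
Note that this removes substrings rather than whole words, so "wa" would still clip "walid", and the spaces around a removed word stay behind (the loop above yields "hello  i'am walid" and "goodbye "). If whole-word matching is what is wanted and words are only ever separated by single spaces, a library-free sketch could split each string into tokens and drop the unwanted ones. The names split-on-spaces and remove-words-from-string are made up for this illustration, and punctuation is not handled:

```lisp
(defun split-on-spaces (string)
  "Split STRING at space characters, dropping empty pieces."
  (loop with len = (length string)
        for start = 0 then (1+ end)
        for end = (or (position #\Space string :start start) len)
        when (< start end) collect (subseq string start end)
        while (< end len)))

(defun remove-words-from-string (words string)
  "Return STRING without the space-separated tokens that appear in WORDS."
  (format nil "~{~A~^ ~}"
          (remove-if (lambda (token) (member token words :test #'string=))
                     (split-on-spaces string))))

(mapcar (lambda (s) (remove-words-from-string '("word" "madame") s))
        '("hello word i'am walid" "goodbye madame"))
;; => ("hello i'am walid" "goodbye")
```

Because tokens are compared as a whole with string=, (remove-words-from-string '("wa") "hello i'am walid") leaves the string unchanged.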

You can use the cl-ppcre library for regular expressions. Its regex flavor understands the word boundary \b.

A replacement could work like this:

;; The backslash has to be doubled inside a Lisp string literal.
(cl-ppcre:regex-replace-all "\\bwa\\b" "ba wa walid" "")

=> "ba  walid"

I guess you want to collapse all whitespace around the removed word into a single space:

(cl-ppcre:regex-replace-all "\\s*\\bwa\\b\\s*" "ba wa walid" " ")

=> "ba walid"
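
To apply this to the two lists from the question, one option is to build a single alternation from the words in S and map over F. This is only a sketch: remove-words-with-regex is a hypothetical name, it assumes Quicklisp is available to load cl-ppcre, and it assumes the word list is non-empty. Leading and trailing spaces left by the replacement are trimmed afterwards.

```lisp
;; Assumption: Quicklisp is installed; any other way of loading cl-ppcre works too.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload "cl-ppcre"))

(defun remove-words-with-regex (words strings)
  "Remove every whole-word occurrence of WORDS from each string in STRINGS.
WORDS must be a non-empty list of strings."
  (let ((regex (format nil "\\s*\\b(?:~{~A~^|~})\\b\\s*"
                       ;; Escape regex metacharacters in the words themselves.
                       (mapcar #'cl-ppcre:quote-meta-chars words))))
    (mapcar (lambda (s)
              ;; Replace each match (with its surrounding whitespace) by one
              ;; space, then trim spaces left at either end.
              (string-trim " " (cl-ppcre:regex-replace-all regex s " ")))
            strings)))

(remove-words-with-regex '("word" "madame")
                         '("hello word i'am walid" "goodbye madame"))
;; => ("hello i'am walid" "goodbye")
```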

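See the cl-ppcre documentation for details.

Update: You have extended the question to include punctuation. That is actually a bit more complicated, because you now have three kinds of characters: alphanumerics, punctuation, and whitespace.

I can't give a complete solution here, but the outline I have in mind is to define boundaries between all three of these kinds. For that you need positive/negative lookaheads/lookbehinds. Then look at the string to be replaced, check whether it starts or ends with punctuation, and append or prepend the corresponding boundary to the effective expression.

To define the boundaries in a readable way, cl-ppcre's parse tree syntax might be useful.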
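
As a rough illustration of the parse tree syntax, the whitespace-collapsing regex from above can be written as a tree, and a stricter boundary can be expressed with lookarounds. This is not the full three-way boundary scheme outlined above, only a starting point; the variable names are invented here, and cl-ppcre is assumed to be loadable through Quicklisp.

```lisp
(ql:quickload "cl-ppcre") ; assumption: Quicklisp is available

;; Parse tree equivalent of "\\s*\\bwa\\b\\s*".
(defparameter *wa-as-word*
  '(:sequence
    (:greedy-repetition 0 nil :whitespace-char-class)
    :word-boundary "wa" :word-boundary
    (:greedy-repetition 0 nil :whitespace-char-class)))

(cl-ppcre:regex-replace-all *wa-as-word* "ba wa walid" " ")
;; => "ba walid"

;; A stricter boundary: "wa" must not touch any non-whitespace character,
;; so "wa," is left alone. Equivalent to the string regex "(?<!\\S)wa(?!\\S)".
(defparameter *wa-between-spaces*
  '(:sequence
    (:negative-lookbehind :non-whitespace-char-class)
    "wa"
    (:negative-lookahead :non-whitespace-char-class)))

(cl-ppcre:regex-replace-all *wa-between-spaces* "ba wa, wa walid" "")
```

Boundaries between punctuation and alphanumerics could be built the same way by swapping in other character classes inside the lookarounds.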