Common LISP：制作你自己的联合函数

Question

我正在尝试创建我自己的联合函数并意识到我有多么不喜欢 LISP。目标是给函数两个列表，它将 return 两者的集合理论联合。我尝试的解决方案变得越来越复杂，结果相同：无。无论我做什么我都无法改变它的结果。

我正在考虑在下面的 "removeDuplicates" 函数中构建一个单独的列表，但后来我不知道我是如何 return 使用递归的。我认为发生的事情是我的 "removeDuplicates" 函数最终 return 是一个空列表（如预期的那样）但是当递归展开（开始 returning 堆栈中的值）但我可能是错的。我总是难以详细理解递归。代码如下。

(defun rember (A LAT)
  (cond
   ((null LAT) ())
   ((EQ (car LAT) A) (cdr LAT))
   (T (cons (car LAT)(rember A (cdr LAT))))
   )
  )

(defun my_member (A LAT)
  (cond
   ((null LAT) nil)
   ((EQ (car LAT) A) T)
   (T (my_member A (cdr LAT)))
   )
  )

(defun removeDuplicates (L)
  (cond
   ((null L) '())
   ((my_member (car L) (cdr L)) (rember (car L) L) (removeDuplicates (cdr L)))
   (T (removeDuplicates (cdr L)))
   )
  )

(defun my_union (A B)
  (setq together(append A B))
  (removeDuplicates together)
  )

我知道大多数人不喜欢这种 LISP 代码格式，但我更喜欢它。与将所有右括号放在函数和条件块的末尾相比，它让我可以更好地了解括号的排列方式。

如果我运行 (my_union '(a b) '(b c))例如，结果是NIL。

Answer 1

当您在最后一个条件中递归调用 removeDuplicates 时，您没有将结果与列表的 car 组合，因此您将丢弃该元素。

您也没有使用 rember 的结果。

(defun removeDuplicates (L)
  (cond
   ((null L) '())
   ((my_member (car L) (cdr L)) 
    (cons (car L) 
          (removeDuplicates 
           (rember (car L) (cdr L)) 
           ))
    )
   (T (cons (car L) (removeDuplicates (cdr L))))
   )
  )

Answer 2

这是一个简单、明显的联合函数：

(defun union/tfb (&rest lists)
  ;; compute the union of any number of lists, implicitly using EQL
  (labels ((union/spread (l1 ls)
             ;; UNION/SPREAD just exists to avoid the impedance
             ;; mismatch in argument convention
             (if (null ls)
                 l1
               (let ((result l1))
                 (destructuring-bind (l2 . more) ls
                   (dolist (e l2 (union/spread result more))
                     (pushnew e result)))))))
    (union/spread (first lists) (rest lists))))

我认为这是相当自然的 CL，当然，使用像 CL 这样的语言的全部意义在于避免像这样无休止地重新发明轮子。

所以游戏规则可能会说你不允许使用 PUSHNEW：好吧，你可以很容易地用涉及 MEMBER:

的条件替换它

(defun union/tfb (&rest lists)
  ;; compute the union of any number of lists, implicitly using EQL
  (labels ((union/spread (l1 ls)
             ;; UNION/SPREAD just exists to avoid the impedance
             ;; mismatch in argument convention
             (if (null ls)
                 l1
               (let ((result l1))
                 (destructuring-bind (l2 . more) ls
                   (dolist (e l2 (union/spread result more))
                     ;; Really use PUSHNEW for this
                     (unless (member e result)
                       (setf result (cons e result)))))))))
    (union/spread (first lists) (rest lists))))

也许您也不允许使用 MEMBER：好吧，您可以轻松地编写一个谓词来满足您的需要：

(defun union/tfb (&rest lists)
  ;; compute the union of any number of lists, implicitly using EQL
  (labels ((union/spread (l1 ls)
             ;; UNION/SPREAD just exists to avoid the impedance
             ;; mismatch in argument convention
             (if (null ls)
                 l1
               (let ((result l1))
                 (destructuring-bind (l2 . more) ls
                   (dolist (e l2 (union/spread result more))
                     ;; Really use MEMBER for this, and in fact
                     ;; PUSHNEW
                     (unless (found-in-p e result)
                       (setf result (cons e result))))))))
           (found-in-p (e list)
             ;; is e found in LIST? This exists only because we're not
             ;; meant to use MEMBER
             (cond ((null list) nil)
                   ((eql e (first list)) t)
                   (t (found-in-p e (rest list))))))
    (union/spread (first lists) (rest lists))))

如果您希望结果是一个具有唯一元素的集合，即使第一个列表不是，您也可以轻松地做到这一点（注意 CL 的 UNION 不承诺这一点，您可以获得相同的结果(union/tfb '() ...) 的 UNION/TFB 的早期版本):

(defun union/tfb (&rest lists)
  ;; compute the union of any number of lists, implicitly using EQL
  (labels ((union/spread (l1 ls)
             ;; UNION/SPREAD just exists to avoid the impedance
             ;; mismatch in argument convention
             (if (null ls)
                 l1
               (let ((result l1))
                 (destructuring-bind (l2 . more) ls
                   (dolist (e l2 (union/spread result more))
                     ;; Really use MEMBER for this, and in fact
                     ;; PUSHNEW
                     (unless (found-in-p e result)
                       (setf result (cons e result))))))))
           (found-in-p (e list)
             ;; is e found in LIST? This exists only because we're not
             ;; meant to use MEMBER
             (cond ((null list) nil)
                   ((eql e (first list)) t)
                   (t (found-in-p e (rest list))))))
    (union/spread '() lists)))

最后，如果规则阻止您使用迭代构造和赋值，您也可以这样做：

(defun union/tfb (&rest lists)
  ;; compute the union of any number of lists, implicitly using EQL
  (labels ((union/spread (l1 ls)
             ;; UNION/SPREAD just exists to avoid the impedance
             ;; mismatch in argument convention
             (if (null ls)
                 l1
               (union/loop l1 (first ls) (rest ls))))
           (union/loop (result l more)
             ;; UNION/LOOP is just an iteration
             (if (null l)
                 (union/spread result more)
               (destructuring-bind (e . remainder) l
                 (union/loop (if (found-in-p e result)
                                 result
                               (cons e result))
                             remainder more))))
           (found-in-p (e list)
             ;; is e found in LIST? This exists only because we're not
             ;; meant to use MEMBER
             (cond ((null list) nil)
                   ((eql e (first list)) t)
                   (t (found-in-p e (rest list))))))
    (union/spread '() lists)))

所有这些变化的最终结果可能是非常纯粹的，但根本不是自然的 CL：类似的东西在 Scheme 中可能更自然（尽管不是无缘无故地将 MEMBER 替换为像这样的本土谓词）。

Answer 3

测试您的 Common Lisp 代码的一种方法是让您的解释器执行 TRACE 函数：

(trace removeDuplicates my_member rember)

为避免痕迹过多，请使用小示例。

首先，让我们尝试使用一个空列表；这是来自 REPL ("read eval print loop") 的示例，在 "SO" 包 (Whosebug) 中使用 SBCL 进行了测试；跟踪打印有点缩进，a 根据递归的深度编号。这里的调用不是递归的，会立即终止：

SO> (removeduplicates nil)
  0: (SO::REMOVEDUPLICATES NIL)
  0: REMOVEDUPLICATES returned NIL
NIL

这行得通，让我们尝试一个带有单例列表的示例，其中显然没有重复项：

SO> (removeduplicates '(1))
  0: (SO::REMOVEDUPLICATES (1))
    1: (SO::MY_MEMBER 1 NIL)
    1: MY_MEMBER returned NIL
    1: (SO::REMOVEDUPLICATES NIL)
    1: REMOVEDUPLICATES returned NIL
  0: REMOVEDUPLICATES returned NIL
NIL

removeDuplicate 调用 my_member， 正确地 returns nil，然后递归调用 removeDuplicates 和 nil， 正确地 returns nil。但是有一个问题，因为最外面的调用 returns nil 也是 不正确的 .

看trace，我们要回头看代码，找到调用了my_member的地方，然后递归调用了removeDuplicates。只有一个地方调用了my_member，作为对cond中第二个子句的测试；由于该测试的结果是 nil，因此将尝试下一个子句，在这种情况下默认情况：

(cond
   ...
   ;; this is the call to my_member (= nil)
   ((my_member (car L) (cdr L)) ...)

   ;; this is the recursive call
   (t (removeDuplicates (cdr L))))

cond的值是最后一个(removeDuplicates (cdr L))给的值，只是不保留L前面已有的元素。如果你正在改变一个序列，你可以只递归子序列并忽略前面的元素：在这种情况下，调用者仍然持有对原始序列的引用，这将通过你的函数的副作用删除它的元素。但是这里您遵循的是严格不可变的方法，您必须 重新构造一个列表 作为 return 值。

换句话说，removeDuplicates表示为：return一个新列表，其中包含与原始列表相同的元素，但没有重复。

所以要在(removeDuplicates (cdr L))前加上(car L)。

(defun removeDuplicates (L)
  (cond
    ((null L) '())
    ((my_member (car L) (cdr L)) (rember (car L) L) (removeDuplicates (cdr L)))
    (T (cons (car L)
             (removeDuplicates (rest L))))))

让我们测试一下：

SO> (removeduplicates '())
  0: (SO::REMOVEDUPLICATES NIL)
  0: REMOVEDUPLICATES returned NIL
NIL
SO> (removeduplicates '(1))
  0: (SO::REMOVEDUPLICATES (1))
    1: (SO::MY_MEMBER 1 NIL)
    1: MY_MEMBER returned NIL
    1: (SO::REMOVEDUPLICATES NIL)
    1: REMOVEDUPLICATES returned NIL
  0: REMOVEDUPLICATES returned (1)
(1)

您可以使用更长的列表（不重复）进行测试，结果是正确的，但轨迹更长。

现在，让我们添加重复项：

SO> (removeduplicates '(1 2 2 1))
  0: (SO::REMOVEDUPLICATES (1 2 2 1))
    1: (SO::MY_MEMBER 1 (2 2 1))
      2: (SO::MY_MEMBER 1 (2 1))
        3: (SO::MY_MEMBER 1 (1))
        3: MY_MEMBER returned T
      2: MY_MEMBER returned T
    1: MY_MEMBER returned T
    1: (SO::REMBER 1 (1 2 2 1))
    1: REMBER returned (2 2 1)
    1: (SO::REMOVEDUPLICATES (2 2 1))
      2: (SO::MY_MEMBER 2 (2 1))
      2: MY_MEMBER returned T
      2: (SO::REMBER 2 (2 2 1))
      2: REMBER returned (2 1)
      2: (SO::REMOVEDUPLICATES (2 1))
        3: (SO::MY_MEMBER 2 (1))
          4: (SO::MY_MEMBER 2 NIL)
          4: MY_MEMBER returned NIL
        3: MY_MEMBER returned NIL
        3: (SO::REMOVEDUPLICATES (1))
          4: (SO::MY_MEMBER 1 NIL)
          4: MY_MEMBER returned NIL
          4: (SO::REMOVEDUPLICATES NIL)
          4: REMOVEDUPLICATES returned NIL
        3: REMOVEDUPLICATES returned (1)
      2: REMOVEDUPLICATES returned (2 1)
    1: REMOVEDUPLICATES returned (2 1)
  0: REMOVEDUPLICATES returned (2 1)
(2 1)

结果正确（顺序无关紧要）。

到目前为止，我们的测试都很好。

您可能没有发现该函数中的另一个问题，即对 rember 的所有调用都是无用的，坦率地说，这不一定很容易通过跟踪发现。但是看看代码，如果你写的代码有很少的副作用应该很清楚，下面的子句调用 (rember ...) 什么都不做：

((my_member (car L) (cdr L)) (rember (car L) L) (removeDuplicates (cdr L)))

cond 子句具有语法 (TEST . BODY)，其中 BODY 是一个表达式序列，其计算结果类似于 PROGN：PROGN 的值是值在其最后一个子句中，所有中间子句仅用于它们的副作用。例如：

(progn
  (print "I am here")
  (* 10 3))

在上面，调用 PRINT return 是一个值，但它被丢弃了：封闭的 PROGN 的值为 30。

在您的代码中，rember 没有副作用，它的 return 值被丢弃。只需删除它：

(defun removeDuplicates (L)
  (cond
    ((null L) '())
    ((my_member (car L) (cdr L)) 
     (removeDuplicates (cdr L)))
    (T (cons (first L)
             (removeDuplicates (rest L))))))

我个人会写如下相同的代码：

(defun remove-duplicate-elements (list)
  (when list
    (let ((head (first list))
          (tail (remove-duplicate-elements (rest list))))
      (if (member head tail) tail (cons head tail)))))

Answer 4

这是一个 remove-dupes，它使用散列 table 在 O(n) 时间内从列表中删除重复项。它支持自定义相等函数（必须是eq、eql、equal或`equalp）和自定义测试函数，这样一个项目的任何方面都可以被视为关键.

(defun remove-dupes (list &key (test #'eql) (key #'identity))
  (let ((hash (make-hash-table :test test)))
    (loop for item in list
          for item-key = (funcall key item)
          for seen = (gethash item-key hash)
          unless seen collect item and
                      do (setf (gethash item-key hash) t))))

例如，假设我们有关联列表((a . 1) (a . 2) (b . 3) (c . 4) (b . 4))。我们想通过 car:

删除重复项

[1]> (remove-dupes '((a . 1) (a . 2) (b . 3) (c . 4) (b . 4)) :key #'car)
((A . 1) (B . 3) (C . 4))

仅报告最左边的 A、B 和 C 条目；重复项被抑制。现在让我们通过 cdr:

来完成

[2]> (remove-dupes '((a . 1) (a . 2) (b . 3) (c . 4) (b . 4)) :key #'cdr)
((A . 1) (A . 2) (B . 3) (C . 4))

(b . 4) 由于 4 值重复而被剔除。

但是，当 Common Lisp 提供了一个 remove-duplicates 函数（更不用说 union）时，为什么要做这一切。

remove-duplicates 比我这里的更通用：它处理序列，而不仅仅是列表，因此它适用于向量和字符串。它有更多的关键字参数。

Common LISP：制作你自己的联合函数

Common LISP: Make Your Own Union Function

lisp

union