从序列中删除所有重复元素

Question

Common Lisp 序列函数 remove-duplicates 留下每个重数的一个元素。以下类似函数 remove-equals 的目标是删除 all 个多重性。

但是，我想使用内置函数 remove-if（不是迭代），以及 SBCL 的散列 table 工具用于 :test 函数，以将时间复杂度保持在 O( n).眼前的问题是 SBCL 相等性测试需要是全局的，但测试还需要依赖于 remove-equals 的 key 参数。能不能写成同时满足这两个要求？

(defun remove-equals (sequence &key (test #'eql) (start 0) end (key #'identity))
  "Removes all repetitive sequence elements based on equality test."
  #.(defun equality-test (x y)
      (funcall test (funcall key x) (funcall key y)))
  #.(sb-ext:define-hash-table-test equality-test sxhash)
  (let ((ht (make-hash-table :test #'equality-test)))
    (iterate (for elt in-sequence (subseq sequence start end))
             (incf (gethash (funcall key elt) ht 0)))
    (remove-if (lambda (elt)
                 (/= 1 (gethash elt ht)))
               sequence :start start :end end :key key)))

Answer 1

免责声明：我发现@Sylwester 的回答更清晰、更清晰 - 更好（没有宏）。

然而，这只是假设（但不是一个好的做法）：

(ql:quickload :iterate)    ;; you forgot these - but they are necessary
(use-package :iterate)     ;; for your code to run - at least my definition
(ql:quickload :alexandria) ;; of 'minimal working example' is to include imports.

(defmacro remove-equals (sequence &key (test #'eql) (start 0) end (key #'identity))
  "Remove all repetitive sequence alements based on equality test."
  (alexandria:once-only (sequence test start end key) ; as hygyenic macro
    `(progn
       (defun equality-test (x y)
          (funcall ,test (funcall ,key x) (funcall ,key y)))
       (sb-ext:define-hash-table-test equality-test sxhash)
       (let ((ht (make-hash-table :test #'equality-test)))
          (iterate (for elt in-sequence (subseq ,sequence ,start ,end))
                   (incf (gethash (funcall ,key elt) ht 0)))
          (remove-if (lambda (elt)
                       (/= 1 (gethash (funcall ,key elt) ht)))
                     ,sequence :start ,start :end ,end :key ,key)))))

(remove-equals '(1 2 3 1 4 5 3) :test #'= :end 6)
;; WARNING: redefining COMMON-LISP-USER::EQUALITY-TEST in DEFUN
;; 
;; (2 3 4 5 3)

(describe 'equality-test) ;; shows new definition
;; COMMON-LISP-USER::EQUALITY-TEST
;;   [symbol]
;; 
;; EQUALITY-TEST names a compiled function:
;;   Lambda-list: (X Y)
;;   Derived type: (FUNCTION (T T) (VALUES BOOLEAN &OPTIONAL))
;;   Source form:
;;     (SB-INT:NAMED-LAMBDA EQUALITY-TEST
;;         (X Y)
;;       (BLOCK EQUALITY-TEST
;;         (FUNCALL #'= (FUNCALL #1=#<FUNCTION IDENTITY> X)
;;                  (FUNCALL #1# Y))))

警告总是会出现——如果你使用不止一个哈希表，这肯定会干扰并导致错误。所以我不推荐！

Answer 2

像这样的 read-time 计算函数不会按照您的想法进行。从您的代码中简化：

(defun foo (a b test)
  #.(defun equality-test (x y)
      (funcall test x y))
  (funcall #'equality-test a b))

这不可能行得通。

原因1：一个读取时间创建的函数没有访问周围代码的词法变量（这里没有办法参考 test，因为在阅读期间不存在具有函数 foo 的环境）

equality-test 中的 test 变量不是指词法变量。这是 undefined/undeclared.

原因 2：DEFUN 计算为一个符号

阅读和评估 read-time 代码后，代码如下所示：

(defun foo (a b test)
   equality-test
   (funcall #'equality-test a b))

嗯，equality-test 是一个未绑定的变量。这是运行时的错误。

原因 3：函数 equality-test 可能不存在

如果我们用文件编译器编译代码，函数equality-test是在读取表单时在compile-time环境中创建的，但它不会成为编译代码的一部分。

Answer 3

define-hash-table-test 的第三个参数将测试与散列函数相关联。使用 sxhash 达不到目的，因为它应该针对 test 函数进行定制。 (equal x y) 表示 (= (sxhash x) (sxhash))。因此，第二个参数应该是一个函数 test-hash，使得 (funcall test x y) 意味着 (= (test-hash x) (test-hash y))。仅仅拥有测试功能是不可能做到这一点的。通过记录它需要哈希支持来规避整个事情可能会更好：

(defun remove-duplicated (sequence &key (test #'eql) (start 0) end (key #'identity))
  "Removes all repetitive sequence elements based on equality test.
   equalily tests other than eq, eql, equal and equalp requires you
   add it to be allowed in a hash table eg. sb-ext:define-hash-table-test in SBCL"

  (let ((ht (make-hash-table :test test)))
    (iterate (for elt in-sequence (subseq sequence start end))
             (incf (gethash (funcall key elt) ht 0)))
    (remove-if (lambda (elt)
                 (/= 1 (gethash elt ht)))
               sequence :start start :end end :key key)))

现在，如果用户想要自定义测试，他们需要自己进行：

(defun car-equals (a b)
  (equal (car a) (car b)))

(defun car-equals-hash (p)
  (sxhash (car p)))

(sb-ext:define-hash-table-test car-equals car-equals-hash)

(car-equals '(1 2 3 4) '(1 3 5 7)) ; ==> t
(defparameter *ht* (make-hash-table :test 'car-equals))
(setf (gethash '(1 2 3 4) *ht*) 'found)
(gethash '(1 3 5 7) *ht*) ; ==> found

(remove-duplicated '((5 0 1 2) (5 1 2 3) (5 1 3 2) (5 2 3 4)) 
                   :test #'car-equals 
                   :key #'cdr) 
; ==> ((5 0 1 2) (5 2 3 4))

从序列中删除所有重复元素

Removing All Repetitive Elements from a Sequence

hashtable

sbcl

common-lisp

sequence