How do I use a &HashSet<&T> as an IntoIterator<Item = &T>?

Question

I have a function that takes a collection of &T (represented by IntoIterator), with the requirement that every element is unique.

fn foo<'a, 'b, T: std::fmt::Debug, I>(elements: &'b I)
where
    &'b I: IntoIterator<Item = &'a T>,
    T: 'a,
    'b: 'a,

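I would also like to write a wrapper function that works even if the elements are not unique, by first using a HashSet to remove the duplicates.

I tried the following implementation: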

use std::collections::HashSet;

fn wrap<'a, 'b, T: std::fmt::Debug + Eq + std::hash::Hash, J>(elements: &'b J)
where
    &'b J: IntoIterator<Item = &'a T>,
    T: 'a,
    'b: 'a,
{
    let hashset: HashSet<&T> = elements.into_iter().into_iter().collect();
    foo(&hashset);
}

However, the compiler does not seem happy with my assumption that HashSet<&T> implements IntoIterator<Item = &'a T>:

error[E0308]: mismatched types
  --> src/lib.rs:10:9
   |
10 |     foo(&hashset);
   |         ^^^^^^^^ expected type parameter, found struct `std::collections::HashSet`
   |
   = note: expected type `&J`
              found type `&std::collections::HashSet<&T>`
   = help: type parameters must be constrained to match other types
   = note: for more information, visit https://doc.rust-lang.org/book/ch10-02-traits.html#traits-as-parameters

I know I could use a HashSet<T> by cloning all the input elements, but I want to avoid unnecessary copying and memory use.

Answer 1

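Shepmaster pointed out a number of orthogonal issues with my code, but to solve the problem of using a HashSet<&T> as an IntoIterator<Item = &T>, I found that one workaround is to use a wrapper struct: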

use std::{collections::HashSet, ops::Deref};

struct Helper<T, D: Deref<Target = T>>(HashSet<D>);

struct HelperIter<'a, T, D: Deref<Target = T>>(std::collections::hash_set::Iter<'a, D>);

impl<'a, T, D: Deref<Target = T>> Iterator for HelperIter<'a, T, D>
where
    T: 'a,
{
    type Item = &'a T;
    fn next(&mut self) -> Option<Self::Item> {
        self.0.next().map(|x| x.deref())
    }
}

impl<'a, T, D: Deref<Target = T>> IntoIterator for &'a Helper<T, D> {
    type Item = &'a T;
    type IntoIter = HelperIter<'a, T, D>;
    fn into_iter(self) -> Self::IntoIter {
        HelperIter((&self.0).into_iter())
    }
}

It is used like this:

use std::{fmt::Debug, hash::Hash, marker::PhantomData};

struct Collection<T> {
    item: PhantomData<T>,
}

impl<T: Debug> Collection<T> {
    fn foo<I>(elements: I) -> Self
    where
        I: IntoIterator + Copy,
        I::Item: Deref<Target = T>,
    {
        for element in elements {
            println!("{:?}", *element);
        }
        for element in elements {
            println!("{:?}", *element);
        }
        return Self { item: PhantomData };
    }
}

impl<T: Debug + Eq + Hash> Collection<T> {
    fn wrap<I>(elements: I) -> Self
    where
        I: IntoIterator + Copy,
        I::Item: Deref<Target = T> + Eq + Hash,
    {
        let helper = Helper(elements.into_iter().collect());
        Self::foo(&helper);
        return Self { item: PhantomData };
    }
}

#[derive(Debug, Hash, PartialEq, Eq)]
struct Foo(i32);

fn main() {
    let v = vec![Foo(1), Foo(2), Foo(4)];
    Collection::<Foo>::wrap(&v);
}

I suspect some of this is more complicated than it needs to be, but I am not sure how to simplify it.

Answer 2

If you have a &HashSet<&T> and need an iterator of &T (not &&T) that you can process multiple times, you can use Iterator::copied to convert the iterator's &&T items into &T:

use std::{collections::HashSet, fmt::Debug, hash::Hash, marker::PhantomData};

struct Collection<T> {
    item: PhantomData<T>,
}

impl<T> Collection<T>
where
    T: Debug,
{
    fn foo<'a, I>(elements: I) -> Self
    where
        I: IntoIterator<Item = &'a T> + Clone,
        T: 'a,
    {
        for element in elements.clone() {
            println!("{:?}", element);
        }
        for element in elements {
            println!("{:?}", element);
        }
        Self { item: PhantomData }
    }
}

impl<T> Collection<T>
where
    T: Debug + Eq + Hash,
{
    fn wrap<'a, I>(elements: I) -> Self
    where
        I: IntoIterator<Item = &'a T>,
        T: 'a,
    {
        let set: HashSet<_> = elements.into_iter().collect();
        Self::foo(set.iter().copied())
    }
}

#[derive(Debug, Hash, PartialEq, Eq)]
struct Foo(i32);

fn main() {
    let v = vec![Foo(1), Foo(2), Foo(4)];
    Collection::<Foo>::wrap(&v);
}
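
To make the &&T versus &T distinction concrete, here is a minimal sketch. The names dedup_refs and sum_twice are hypothetical and not part of the question. Iterating a &HashSet<&T> yields &&T; Iterator::copied copies each inner reference out, giving &T, and the resulting iterator is Clone, so it can be walked more than once without cloning any T.

```rust
use std::collections::HashSet;
use std::hash::Hash;

// Hypothetical helper: deduplicates by reference, without cloning any `T`.
fn dedup_refs<'a, T, I>(elements: I) -> HashSet<&'a T>
where
    I: IntoIterator<Item = &'a T>,
    T: Eq + Hash + 'a,
{
    elements.into_iter().collect()
}

// Hypothetical consumer that requires `&i32` items and iterates twice.
fn sum_twice<'a, I>(elements: I) -> i32
where
    I: IntoIterator<Item = &'a i32> + Clone,
{
    let first: i32 = elements.clone().into_iter().sum();
    let second: i32 = elements.into_iter().sum();
    first + second
}
```

With `let v = vec![1, 2, 2, 4];` and `let set = dedup_refs(&v);`, the call `sum_twice(set.iter().copied())` type-checks, whereas passing `set.iter()` directly would not, because its items are `&&i32`.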

See also:

Using the same iterator multiple times in Rust

Note that the rest of this answer made the assumption that a struct named Collection<T> was a collection of values of type T. OP has clarified that this is not true.

That is not your problem, as your later examples show. It boils down to this:

struct Collection<T>(T);

impl<T> Collection<T> {
    fn new(value: &T) -> Self {
        Collection(value)
    }
}

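You are taking a reference to a type (&T) and trying to store it where a T is required; these are different types, which produces an error. You are using PhantomData for some reason and accepting references via an iterator, but the problem is the same.

In fact, PhantomData makes the problem harder to see, because you can make up values that do not work. For example, we never have any kind of string here, yet we "successfully" created the struct: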
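
As a rough sketch of how the reduced example could be made to type-check (an illustration only, not necessarily what the asker wants): either take the value by ownership, or accept a reference and clone it explicitly. The from_ref constructor is a hypothetical name introduced here.

```rust
struct Collection<T>(T);

impl<T> Collection<T> {
    // Takes ownership, so the stored type is exactly `T`.
    fn new(value: T) -> Self {
        Collection(value)
    }
}

impl<T: Clone> Collection<T> {
    // Hypothetical alternative: accept `&T` and clone to obtain a `T`.
    fn from_ref(value: &T) -> Self {
        Collection(value.clone())
    }
}
```

Note that calling `Collection::new(&n)` with a reference is also legal, but it produces a `Collection<&i32>` rather than a `Collection<i32>`: the reference itself becomes the stored type.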

use std::marker::PhantomData;

struct Collection<T>(PhantomData<T>);

impl Collection<String> {
    fn new<T>(value: &T) -> Self {
        Collection(PhantomData)
    }
}

Ultimately, your wrap function does not make sense either:

impl<T: Eq + Hash> Collection<T> {
    fn wrap<I>(elements: I) -> Self
    where
        I: IntoIterator<Item = T>,

This is equivalent to

impl<T: Eq + Hash> Collection<T> {
    fn wrap<I>(elements: I) -> Collection<T>
    where
        I: IntoIterator<Item = T>,

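That is, given an iterator of elements T, you will return a collection of those elements. However, you put them in a HashSet and iterate over a reference to it, which yields &T. Thus this function signature cannot be right.

It seems most likely that you want to accept an iterator of owned values instead: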

use std::{collections::HashSet, fmt::Debug, hash::Hash};

struct Collection<T> {
    item: T,
}

impl<T> Collection<T> {
    fn foo<I>(elements: I) -> Self
    where
        I: IntoIterator<Item = T>,
        for<'a> &'a I: IntoIterator<Item = &'a T>,
        T: Debug,
    {
        for element in &elements {
            println!("{:?}", element);
        }
        for element in &elements {
            println!("{:?}", element);
        }

        Self {
            item: elements.into_iter().next().unwrap(),
        }
    }
}

impl<T> Collection<T>
where
    T: Eq + Hash,
{
    fn wrap<I>(elements: I) -> Self
    where
        I: IntoIterator<Item = T>,
        T: Debug,
    {
        let s: HashSet<_> = elements.into_iter().collect();
        Self::foo(s)
    }
}

#[derive(Debug, Hash, PartialEq, Eq)]
struct Foo(i32);

fn main() {
    let v = vec![Foo(1), Foo(2), Foo(4)];
    let c = Collection::wrap(v);
    println!("{:?}", c.item)
}

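Here we place one trait bound directly on the generic iterator type and a second, higher-ranked trait bound on a reference to the iterator. This allows us to use a reference to the iterator as an iterator itself.

See also: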
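
The same pair of bounds can be seen in isolation in the following sketch. The function name count_twice_then_first is hypothetical; it borrows the collection twice through the higher-ranked bound and then consumes it through the plain bound.

```rust
// Hypothetical helper: iterates `elements` twice by reference, then by value.
fn count_twice_then_first<I, T>(elements: I) -> (usize, usize, Option<T>)
where
    // Plain bound: consuming `I` yields owned `T` values.
    I: IntoIterator<Item = T>,
    // Higher-ranked bound: for every lifetime 'a, `&'a I` yields `&'a T`.
    for<'a> &'a I: IntoIterator<Item = &'a T>,
{
    let mut first_pass = 0;
    for _ in &elements {
        first_pass += 1;
    }
    let mut second_pass = 0;
    for _ in &elements {
        second_pass += 1;
    }
    // Only now is `elements` moved and consumed.
    (first_pass, second_pass, elements.into_iter().next())
}
```

Calling `count_twice_then_first(vec![10, 20, 30])` works because both `Vec<i32>` and `&Vec<i32>` implement IntoIterator; a HashSet can be passed the same way, although which element comes first is then unspecified.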

Is there any way to return a reference to a variable created in a function?

lifetime

rust