Scala: Ending Stream.iterate

Question

After some trial and error I found a way to end a Stream.iterate (in my case, when standard input ends). To me, though, it looks more like an evil hack than a best-practice solution.

Before (does not end when standard input ends, because Stream.iterate runs forever):

val initialDocument = Document()
val in: Stream[Document] = Stream.iterate(Stream[Document]()) { documents =>
  val lastDocument: Document = documents.lastOption.getOrElse(initialDocument)
  val line: String = io.StdIn.readLine
  if(line != null) {
    line
      .split(";")
      .map(_.trim)
      .scanLeft(lastDocument)((document: Document, line: String) => document.processInput(line))
      .drop(1) // drop the seed
      .toStream
  } else {
    Stream.empty
  }
}.flatten
for(document <- in) {
  // do something with the document snapshot
}

After (now works as expected):

val initialDocument = Document()
val in: Stream[Document] = Stream.iterate(Stream[Option[Document]]()) { documents =>
  val lastDocument: Option[Document] = Some(documents.lastOption.flatten.getOrElse(initialDocument))
  val line: String = io.StdIn.readLine
  if(line != null) {
    line
      .split(";")
      .map(_.trim)
      .scanLeft(lastDocument)((document: Option[Document], line: String) => document.map(_.processInput(line)))
      .drop(1) // drop the seed
      .toStream
  } else {
    Stream(None) // "None" is used by "takeWhile" to see we have no more input
  }
}.flatten.takeWhile(_.isDefined).map(_.get)
for(document <- in) {
  // do something with the document snapshot
}

As you can see, several new Option-typed values have been introduced. Their only purpose is to tell takeWhile whether the end has been reached.
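
To isolate the sentinel idea from the Document handling, here is a minimal sketch. The names SentinelDemo, reader and readAll are hypothetical and not part of the code above; reader only simulates io.StdIn.readLine by returning null once a fixed list of lines is used up.

```scala
object SentinelDemo {
  // Hypothetical stand-in for io.StdIn.readLine:
  // hands out the given lines one by one, then returns null forever.
  def reader(lines: List[String]): () => String = {
    val it = lines.iterator
    () => if (it.hasNext) it.next() else null
  }

  // Option(null) is None, so None marks the end of input and
  // takeWhile cuts the otherwise infinite Stream at that point.
  def readAll(readLine: () => String): List[String] =
    Stream.continually(Option(readLine()))
      .takeWhile(_.isDefined)
      .map(_.get)
      .toList
}
```

Calling SentinelDemo.readAll(SentinelDemo.reader(List("a", "b"))) should return only the two lines, because evaluation stops at the first None.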

How can I write this functionality in a more elegant form?

Answer 1

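I wonder if things aren't getting a little over-complicated, with Streams within Streams, scanLeft() inside iterate(), and so on. Throwing in the Option type just to detect the end of the Stream smells a bit fishy.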

An Iterator has a natural ending condition. I wonder if something like this might work better.

class DocItr(private var prev :Document) extends Iterator[Document] {
  private var innerItr :Iterator[Document] = Iterator.empty // start empty (not null) so the first hasNext call is safe
  private var line     :String = _

  override def hasNext :Boolean = innerItr.hasNext || {
    line = io.StdIn.readLine
    Option(line).fold(false)(_.nonEmpty)
  }

  override def next() :Document = {
    if (!innerItr.hasNext) {
      innerItr = line.split(";")
                     .map(_.trim)
                     .scanLeft(prev)((doc: Document, in: String) =>
                                                       doc.processInput(in))
                     .drop(1) // drop the seed
                     .toIterator
    }
    prev = innerItr.next()
    prev
  }
}

for(document <- new DocItr(initialDocument)) {
  // do something with the document snapshot
}

I don't know if this actually works. I don't have your Document type to play with.

I changed the "continue" condition from line != null to Option(line).fold(false)(_.nonEmpty), so that it ends on any empty input, not just on null. That simply made it easier to test.
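
As a small sketch of how that condition behaves on its own (the names ContinueCondition and shouldContinue are made up for this illustration):

```scala
object ContinueCondition {
  // Option(line) turns null into None; fold(false)(...) then yields
  // false for None, and for Some(s) it yields s.nonEmpty.
  def shouldContinue(line: String): Boolean =
    Option(line).fold(false)(_.nonEmpty)
}
```

So null and the empty string both stop the iteration, while any non-empty line keeps it going. Option(line).exists(_.nonEmpty) should be an equivalent way to write the same check.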

Answer 2

If I understand correctly what you're doing, this will solve your problem in a much simpler way:

val in = Iterator
  .continually(io.StdIn.readLine())       // Read all lines from StdIn infinitely
  .takeWhile(_ != null)                   // Stop on EOI
  .flatMap(_.split(';'))                  // Iterator of sublines
  .map(_.trim)                            // Iterator of trimmed sublines
  .scanLeft(Document())(_ processInput _) // Iterator of a Document snapshot per subline
  .drop(1)                                // Drop the empty Document

for (document <- in) {
  // do something with the document snapshot
}

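Basically, first create a single lazy Iterator of trimmed line parts from the entire input, and then build the document snapshots from that iterator.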

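It's better to avoid Stream unless you really need its memoization. Stream is slow, and memoization makes memory leaks easy: as long as something holds a reference to the head, every element evaluated so far stays in memory. Iterator has all the same nice methods for creating finite or infinite lazy sequences and should be the preferred collection for that purpose.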
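
To try the pipeline without the real Document type or a terminal, here is a rough sketch under a few assumptions. The Document case class below is a hypothetical stand-in that just records the inputs it has seen, and snapshots takes any Iterator[String] in place of the stdin-backed one, so SnapshotDemo and its members are illustrative names only.

```scala
object SnapshotDemo {
  // Hypothetical stand-in for the asker's Document type:
  // each processed input is appended to an immutable history.
  final case class Document(inputs: Vector[String] = Vector.empty) {
    def processInput(in: String): Document = copy(inputs = inputs :+ in)
  }

  // `lines` plays the role of
  // Iterator.continually(io.StdIn.readLine()).takeWhile(_ != null)
  def snapshots(lines: Iterator[String]): Iterator[Document] =
    lines
      .flatMap(_.split(';').iterator)  // explicit .iterator, so no implicit Array conversion is needed
      .map(_.trim)
      .scanLeft(Document())((doc, in) => doc.processInput(in))
      .drop(1)                         // drop the empty seed Document
}
```

For example, SnapshotDemo.snapshots(Iterator("a; b", "c")) should yield three snapshots, one per sub-line, each containing all inputs processed up to that point. Nothing is retained once a snapshot has been consumed, which is the practical difference from the Stream version.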
