Extract Text Sections from a Cell

Question

Given a cell whose text comes from HTML, in the following format:

OA-1

Interpret products of whole numbers, e.g., interpret 5 × 7 as the total number of objects in 5 groups of 7 objects each. For example, describe a context in which a total number...

More

OA-2

Interpret whole-number quotients of whole numbers, e.g., interpret 56 ÷ 8 as the number of objects in each share when 56 objects are partitioned equally into 8 shares, or as a number ...

Goal: extract the list of header identifiers, so that the output looks like this: OA-1,OA-2...

I have pulled the data with the =importhtml function, as shown in the two examples in this MWE sheet.

Noting that char(10) is a line-feed (return) character, I am considering something like this pseudocode:

LEFT(Cell_with_text, FIND(CHAR(10), Cell_with_text) - 1) & "," & find_next_header

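Another approach might be to build a library of all the headers (e.g., "OA-1,OA-2...") and somehow find each instance in the cell, perhaps with a function that looks values up in an array?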
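
A minimal sketch of that library idea, written as plain JavaScript that could also be pasted into Google Apps Script. The function name `findKnownHeaders` and the `knownHeaders` list are hypothetical. It assumes each header sits on a line of its own, so it compares whole lines rather than substrings (otherwise `OA-1` would also be found inside `OA-10`).

```javascript
// Hypothetical helper: report which headers from a known library appear in the cell.
// Assumes every header occupies a line of its own (it is followed by CHAR(10)).
function findKnownHeaders(cellText, knownHeaders) {
  // Collect the trimmed lines of the cell for exact, whole-line comparison.
  const lines = new Set(
    String(cellText).split("\n").map((line) => line.trim())
  );
  // Keep the library's order and drop headers that never occur in the cell.
  return knownHeaders.filter((header) => lines.has(header)).join(",");
}
```

The drawback is that the library has to be maintained by hand: a header that is missing from `knownHeaders` is silently skipped.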

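Assumptions

Headers can be 3 to 7 characters long.
Headers do not always start with the same letter.
Headers always contain a dash, but it can be anywhere from the second position to the second-to-last position.
Each header is always followed by a char(10).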
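
Taken together, these assumptions may be enough to recognise a header without any library. A rough JavaScript sketch (usable from Apps Script; `extractHeaders` is a made-up name): a line counts as a header if it is 3 to 7 characters long, contains no whitespace, and has a dash that is neither its first nor its last character. The no-whitespace rule is an extra assumption of mine, added to keep short description lines out.

```javascript
// Hypothetical helper: pick out header lines using only the stated assumptions.
function extractHeaders(cellText) {
  return String(cellText)
    .split("\n")                      // CHAR(10) is the line feed "\n"
    .map((line) => line.trim())
    .filter((line) =>
      line.length >= 3 &&
      line.length <= 7 &&
      // No whitespace, and at least one character on each side of a dash.
      /^\S+-\S+$/.test(line)
    )
    .join(",");
}
```

Because the text is split on line feeds first, the check does not verify that the final line is followed by a char(10); that seemed an acceptable simplification for a sketch.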

Answer 1

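This formula splits all of the entries at once on CHAR(10), then keeps only the first column (which is the output you want), and finally applies JOIN().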

=JOIN(", ",INDEX(SPLIT(importhtml("https://contentexplorer.smarterbalanced.org/target/m-g3-c1-ta","list",3),CHAR(10)),,1))

Here is a sample sheet, viewable to all in perpetuity.
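
For anyone who finds the nested formula hard to read, here is roughly the same logic as a JavaScript sketch. It assumes the import yields one list item per row, and that SPLIT drops empty segments (its default behaviour), so the first column is the first non-empty line of each row. `joinFirstSegments` is a hypothetical name, not a built-in.

```javascript
// Rough, hypothetical equivalent of JOIN(", ", INDEX(SPLIT(rows, CHAR(10)),,1)).
// `rows` is an array of cell texts, or a 2D array such as Apps Script passes for a range.
function joinFirstSegments(rows) {
  return rows
    // Accept either ["text", ...] or [["text"], ...].
    .map((row) => String(Array.isArray(row) ? row[0] : row))
    // Split on line feeds and take the first non-empty piece, like SPLIT's first column.
    .map((text) => text.split("\n").find((part) => part !== "") || "")
    // Skip rows that had no text at all.
    .filter((first) => first !== "")
    .join(", ");
}
```

Unlike a pattern-based check, this relies purely on position: whatever comes first in each row is treated as the header.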
