What does 'context string for the given token' mean?

Question

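I'm writing a tokenizer assignment based on the nand2tetris course (in C++), and part of the assignment requires a context string. I'm not sure what that means, and I'm looking for a breakdown or some kind of pseudo/example code that shows what it means. (I think this is one of those situations where you stare at the bookshelf looking for a book that is right in front of you, but you can't see it because you've been looking for too long!)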

The instructions are:

Generate a context string for the given token. It shows the line before the token, the line containing the token, and a line with a ^ marking the token's position. Tab stops are every 8 characters in the context string, tabs are replaced by spaces (1 to 8) so that the next character starts on an 8 character boundary.

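I know this is probably a case of it being plain English rather than code, but I'm just a bit lost. Any help would be legendary, as I'm still very much a beginner at programming.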

I was thinking:

string token_context(Token token)
{
    return "previous line \n" + "token" + "somehow having 8 spaces and the ^ symbol where the token is" ;
}

Answer 1

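Think of the context you see in a compiler error message. The context string shows what surrounds the token, that is, its context. The question asks for three lines: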

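The line of text immediately preceding the line that contains the token.
The line of text that contains the token.
A line with a ^ in it. The ^ should sit directly below the actual token.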
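
As a rough illustration of those three lines, here is a minimal sketch. The helper name `simple_context` and its parameters are hypothetical and not part of the assignment, and it assumes neither line contains a tab, so the caret column is just the zero-based index of the token's first character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the assignment): builds the three-line
// context from plain strings. Assumes there are no tabs in either line,
// so `column` is the zero-based index of the token's first character.
std::string simple_context(const std::string& prior_line,
                           const std::string& token_line,
                           std::size_t column)
{
    std::string result = prior_line + "\n";      // line 1: the line before the token
    result += token_line + "\n";                 // line 2: the line containing the token
    result += std::string(column, ' ') + "^\n";  // line 3: padding spaces, then the caret
    return result;
}
```

Calling `simple_context("a", "b c", 2)` would place the caret under the `c`, two columns in.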

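The part about tabs is there to help you put the ^ in the right place. Basically, it says that a tab acts like a variable number of spaces: however many are needed so that the next character starts at a multiple of 8. For example, "ab\tc" should be treated the same as "ab      c" (ab, six spaces, c), because the tab (\t) sits in the third position, so it behaves like 6 spaces, and c ends up at column 8 counting from zero, i.e. it is the ninth character of the string.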
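
The tab rule above could be sketched roughly as follows. This is only one possible reading of the assignment text; the names `kTabWidth`, `expand_tabs` and `display_column` are made up for illustration.

```cpp
#include <cstddef>
#include <string>

// Tab width taken from the assignment text; the name is an assumption.
constexpr std::size_t kTabWidth = 8;

// Replace each tab with 1 to 8 spaces so that the next character
// starts on a multiple of kTabWidth.
std::string expand_tabs(const std::string& line)
{
    std::string out;
    for (char ch : line) {
        if (ch == '\t') {
            // Spaces needed to reach the next tab stop from the current width.
            std::size_t pad = kTabWidth - out.size() % kTabWidth;
            out.append(pad, ' ');
        } else {
            out.push_back(ch);
        }
    }
    return out;
}

// Column (zero-based) where the character at raw index `offset` lands
// once every tab before it has been expanded.
std::size_t display_column(const std::string& line, std::size_t offset)
{
    return expand_tabs(line.substr(0, offset)).size();
}
```

With these helpers, `display_column("ab\tc", 3)` gives the number of spaces to put in front of the ^ for the token `c`, and the same `expand_tabs` call can be applied to the two source lines before they go into the context string.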

Answer 2

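Please compile and run this code. I think it demonstrates the use of tabs and spaces as explained by @Jonathan Geisler.

Note that on your professor's system, a tab is assumed to take up 8 spaces. On my system, however, tabs are output 4 spaces wide. So I have a constant named tab_spacing that is set to 8. If you find that the caret is not in the right place, change that constant to 4 and try again.

Check the output in the debugger; I think it will make things clear.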

Output:

int index = 10;
if(index < 8 & index % 2 == 1) {
             ^

Process finished with exit code 0

Code:

#include <iostream>
#include <string>

using namespace std;

struct Token {
    string prior_line;
    string token_line;
    string token;
    size_t token_offset;
    size_t token_length;
};

const size_t tab_spacing{8};   // this has to be '4' on my system
string token_context(Token token)
{
  size_t tab_count = 0;
  size_t space_count = 0;

  string return_string{token.prior_line};  // 1st line: prior line
  return_string += "\n";                   // end of line 1,

  return_string += token.token_line;  // 2nd line, the one with the token
  return_string += "\n";              // end of line 2,

  // calculate tabs and spaces for line 3
  tab_count = token.token_offset / tab_spacing;  // tabs to get to token offset
  space_count = token.token_offset % tab_spacing;  // spaces to get to token offset

  // Build the 3rd line of the context string by inserting tabs
  for(size_t i = 0; i < tab_count; i++) {
    return_string += "\t";
  }

  // now insert the spaces
  for(size_t i = 0; i < space_count; i++) {
    return_string += " ";
  }
  // now, add the caret '^'
  return_string += "^";

  return_string += "\n";
  return return_string;
}

int main()
{
  string str1 = "int index = 10;";
  string str2 = "if(index < 8 & index % 2 == 1) {";
  size_t token_offset = 13;
  size_t token_length = 1;
  string token_str = "&";

  Token token;
  token.prior_line = str1;
  token.token_line = str2;
  token.token_offset = token_offset;
  token.token = token_str;
  token.token_length = token_length;

  std::cout << token_context(token) << std::endl;
  return 0;
}


c++

tokenize