从非递归上下文无关文法生成有限语言的算法

Algorithm to generate finite language from non-recursive context-free grammar

我正在寻找一种算法来从非递归上下文无关文法中生成完整的有限语言。该应用程序是为测试自动化生成一组可能的场景。

EBNF 语法示例(S 是起始规则):

S = Sender, Receiver;
Sender = Human | Machine;
Human = "user-type-1" | "user-type-2"
Machine = Access, Protocol;
Access = "internal" | "external";
Protocol = "soap" | "smtp";
Receiver = "local" | "remote";

应该生成一组句子,例如:

user-type-1 local
internal soap local
external smtp remote

到目前为止,我发现的示例和文献都是基于递归语法随机生成的示例。但我的问题更简单。 欢迎所有提示、名称或出版物链接。

谢谢,

S.

一种方法是用编程语言定义语法,然后编写代码迭代所有可能性。您的语法是使用变量和三个结构指定的:lit(x) 表示像 "local" 这样的文字,alt(a, b, c, ...) 表示选择 ab, c, ... 和 seq(a, b, ..., z) 表示来自 a 的一件事的序列,与来自 b 的一件事相连,依此类推。

这是你的语法形式。

Protocol = alt(lit("soap"), lit("smtp"))
Receiver = alt(lit("local"), lit("remote"))
Access = alt(lit("internal"), lit("external"))
Human = alt(lit("user-type-1"), lit("user-type-2"))
Machine = seq(Access, Protocol)
Sender = alt(Human, Machine)
S = seq(Sender, Receiver)

这里有一些完整的 (Python) 代码,这些代码使用 altseqlit 的精心选择的定义,使每个生产规则成为生成所有的函数它的可能性:

import itertools

def seq(*xs):
    def r():
        for f in itertools.product(*[x() for x in xs]):
            yield ' '.join(f)
    return r

def alt(*xs):
    def r():
        for x in xs:
            for f in x():
                yield f
    return r

def lit(x):
    def r():
        yield x
    return r

Protocol = alt(lit("soap"), lit("smtp"))
Receiver = alt(lit("local"), lit("remote"))
Access = alt(lit("internal"), lit("external"))
Human = alt(lit("user-type-1"), lit("user-type-2"))
Machine = seq(Access, Protocol)
Sender = alt(Human, Machine)
S = seq(Sender, Receiver)

for s in S():
    print s

您可以递归地生成一棵树,其分支将根据语法规则表示推导,其叶子将表示语法语言中的单词。恢复整个有限语言就像在生成叶子时保存叶子一样简单。

将每个节点表示为有序的符号集合(终结符或非终结符)。对于每个非终结符,递归地下降到一组新的节点,在那里进行所有可能的替换。继续,直到您的列表只包含终端符号,然后输出与您的节点对应的符号的有序连接。您的初始节点将始终为 [S]。示例:

S = Sender, Receiver;
Sender = Human | Machine;
Human = "user-type-1" | "user-type-2"
Machine = Access, Protocol;
Access = "internal" | "external";
Protocol = "soap" | "smtp";
Receiver = "local" | "remote";

[S]
 [Sender, ",", Receiver]
  [Human, ",", Receiver]
   ["user-type-1", ",", Receiver]
    ["user-type-1", ",", "local"]   ***
    ["user-type-1", ",", "remote"]  ***
   ["user-type-2", ",", Receiver]
    ["user-type-2", ",", "local"]   ***
    ["user-type-2", ",", "remote"]  ***
  [Machine, ",", Receiver]
   [Access, ",", Protocol, ",", Receiver]
    ["internal", ",", Protocol, ",", Receiver]
     ["internal", ",", "soap", ",", Receiver]
      ["internal", ",", "soap", ",", "local"]   ***
      ["internal", ",", "soap", ",", "remote"]  ***
     ["internal", ",", "smtp", ",", Receiver]
      ["internal", ",", "smtp", ",", "local"]   ***
      ["internal", ",", "smtp", ",", "remote"]  ***
    ["external", ",", Protocol, ",", Receiver]
     ["external", ",", "soap", ",", Receiver]
      ["external", ",", "soap", ",", "local"]   ***
      ["external", ",", "soap", ",", "remote"]  ***
     ["external", ",", "smtp", ",", Receiver]
      ["external", ",", "smtp", ",", "local"]   ***
      ["external", ",", "smtp", ",", "remote"]  ***