确定 python 0 和 1 列表中序列类型的最有效方法？

Question

假设我们有 python 个随机大小的列表，其值以 0 和 1 的随机序列排列。确定序列是否为以下类型之一（最多 3 个位置）以及 return 以下“序列类型”字符串之一的好方法是什么？:

0（只有0个）[0,0,0]
1（只有1个）[1,1]
01（以 0 开头然后遇到 1 x 之后的数字）[0,0,0,0,1]
10（10 的倒数）[1,0,0,0,0]
010（从0开始，遇到0后的1×位数，然后1后的0×位数）[0,0,0,1,0,1 ,1,0]
101（上面的倒数）[1,1,1,1,1,0,0,1,0,1,0,1]

我能想到简单的情况，然后是嵌套循环并保留计数器的天真方法，但是有没有更优雅的方法来做到这一点？

def sequence_type(sequence):
    
if 0 not in sequence:
    return '1'

elif 1 not in sequence:
    return '0'

else:
    if sequence[0] == 0:
        # loop through for sequence type 0xx
    elif sequence[0] == 1:
        # loop through for sequence type 1xx

编辑：我们不关心在检查类型时序列末尾的内容，而是在查看前 3 个“唯一”数字时序列是什么。

例如：[0,0,1,0,1,0] 是类型 010，因为：

0 是我们“起点”的第一个数字
然后我们往右走，看到又是一个0所以不唯一，跳过再往右走
然后遇到一个 1，我们记录这个，因为它是唯一的，再次向右移动
看到数字是 0 并且是唯一的（我们现在计算了 3 个数字），所以模式是 010。

Answer 1

您可以 return 通过逐个位来确定序列类型。类型的第一位数字始终等于第一位。如果在第一个位置之后找到反转位，则该类型将有两个或三个数字，第二个数字是第一个数字的倒数。如果该位出现在反转位置之后，则序列类型为 3 位（再次交替）：

def seqType(seq):
    bit = seq[0]                 # seq type starts with first bit
    try: p = seq.index(1-bit)    # search position of inverse bit
    except: return str(bit)      # not found --> single digit type
    if bit not in seq[p+1:]:     # search for initial bit after inverse
        return f"{bit}{1-bit}"   # not found --> two digit type
    return f"{bit}{1-bit}{bit}"  # --> 3 digit type

输出：

tests = ([0,0,0],[1,1],[0,0,0,0,1],[1,0,0,0,0],
         [0,0,0,1,0,1,1,0],[1,1,1,1,1,0,0,1,0,1,0,1])    
for seq in tests:
    print(seq,seqType(seq))
    
[0, 0, 0] 0
[1, 1] 1
[0, 0, 0, 0, 1] 01
[1, 0, 0, 0, 0] 10
[0, 0, 0, 1, 0, 1, 1, 0] 010
[1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1] 101

如果您想要更高级的方法，可以使用 zip 函数计算类型以压缩相同的连续位。压缩序列的前 3 位将对应序列类型：

def seqType(seq):
    return "".join(str(a) for a,b in zip(seq,seq[1:]+[2]) if a!=b)[:3])

或者，如果您喜欢递归解决方案：

def seqType(seq):
    if len(seq) == 1:  return str(seq[0])
    if seq[0]==seq[1]: return seqType(seq[1:])
    return str(seq[0]) + seqType(seq[1:])[:2]

Answer 2

如果我对你的理解正确，你正在解析 regular language, so you can implement it as a finite-state automaton。

这是 C 中的示例：

bool start(char *s) {
  if (*s) {
    if (*s++ == '0') A(s);
    else B(s);
  } else /* empty string doesn't match anything (or it's both a type "0" and type "1" */ 
}

bool A(char *s) {
  if (*s) {
    if (*s++ == '0') A(s);
    else C(s);
  } else /* "0" */
}

bool B(char *s) {
  if (*s) {
    if (*s++ == '0') D(s);
    else E(s)
  } else /* "1" */
}

bool C(char *s) {
  if (*s) {
    if (*s++ == '0') /* "010" */
    else /* Should be illegal. */
  } else /* "01" */
}

bool D(char *s) {
  if (*s) {
    if (*s++ == '0') G(s);
    else /* "101" */
  } else /* "10" */

bool E(char *s) {
  if (*s) {
    if (*s++ == '0') F(s);
    else E(s);
  } else /* "1" */
}

bool F(char *s) {
  if (*s) {
    if (*s++ == '0') /* illegal */
    else /* "101" */
  } else 
}

bool G(char *s) {
  if (*s) {
    if (*s++ == '0') G(s);
    else /* "101" */
  } else /* "10" */
}

这可能不是对有限状态自动机进行编码的最简洁的方式，但按照这些思路应该可行——有限自动机当然是优雅的结构。

你也可以正则表达式。

确定 python 0 和 1 列表中序列类型的最有效方法？

Most efficient way to determine the type of sequence in a python list of 0s and 1s?

python

algorithm

list

sequence