First Index of String in Infinite String - Java

Question

Introduction

I have an infinitely long string. Its length is, in our imagination, unlimited. Suppose the string contains a sequence like this:

"123456789..."

The dots after the digit 9 stand for the continuation of the sequence. Further along, it looks like this:

"...7891011121314..."

Requirement

The requirement is to find the index of the first occurrence of an input string (call it n) in the infinite string. Two examples:

Example 1

n = "3"

First Index of n = 2

Example 2

n = "910"

First Index of n = 8
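
The two examples use zero-based indexes. As a minimal sketch for checking them, the following builds a finite prefix of the sequence and calls String.indexOf on it. The class name SequencePrefix and the method prefix are made up for illustration and are not part of the question.

```java
final class SequencePrefix {
    // Builds the finite prefix "123...upTo" of the infinite sequence.
    static String prefix(int upTo) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= upTo; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String seq = prefix(20); // "1234567891011121314151617181920"
        System.out.println(seq.indexOf("3"));   // prints 2
        System.out.println(seq.indexOf("910")); // prints 8
    }
}
```

This only works when the searched string happens to occur inside the chosen prefix, which is exactly the limitation the question is about.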

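Problem

I wrote an algorithm to find the index of the string n. However, it is just a while loop that checks whether n occurs in the string built so far and, if it does not, appends the next numbers of the sequence one at a time. I want a better algorithm that finds the index of the first occurrence of n without relying on a loop, or with fewer iterations. At the very least, when the value of n is large (for example 123456790 or 62716855), the algorithm should not run for more than 2 seconds.

---Edit---

My code snippet: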

while(!num.contains(s)){
    num +=start.toString();
    start = start.add(BigInteger.ONE);
}

Here is my full code: My Full Code

Answer 1

Using String, we can do it like this:

String largeValue  = "2323254534534642342354346876985374";
String searchValue = "32545345346423423543468769";
if(largeValue.contains(searchValue)){
    System.out.println("The index is : "+largeValue.indexOf(searchValue));
}

Note: largeValue.indexOf(searchValue) can also be used on its own. It returns -1 if searchValue does not occur in largeValue; otherwise it returns the index of the first occurrence.

Answer 2

As I understand it, you can use the String.indexOf method:

int found = longString.indexOf(searchString);
if(found != -1) System.out.println("Found index is: " + found);

Answer 3

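My understanding is this: you have a sequence string such as String sequence = "1234567891011121314"; and you want to find the index of the first occurrence of an input string, say "910", in it. If my understanding is correct, Java has a built-in method for this, sequence.indexOf(input_string):

String str = "1234567891011121314";
String sub = "910";
System.out.println(str.indexOf(sub));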

Answer 4

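My idea is to build an index over the data. One way is to record every position at which each digit (0 to 9) occurs and to search using that index. Because the input text is large, a plain scan would be too slow; the approach below makes repeated searches over the same large text fast.

Example in C# (in Java, use HashMap<K, V>, ArrayList<T> and Character.getNumericValue(c)):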

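string input = "................";

// Map each digit (0-9) to the ascending list of positions where it occurs in input.
Dictionary<int, List<int>> numIndex = new Dictionary<int, List<int>>(10);

for (int digit = 0; digit < 10; digit++)
    numIndex.Add(digit, new List<int>(20));

for (int charIndex = 0; charIndex < input.Length; charIndex++) {
    char c = input[charIndex];
    // Subtract '0' to get the digit value; Convert.ToInt32(char) would return the character code.
    if (c >= '0' && c <= '9')
        numIndex[c - '0'].Add(charIndex);
}

int FindIndex(string nValue) {
    // nValue = "213654789";
    // Only positions holding the first digit of nValue can start a match.
    foreach (int indexValue in numIndex[nValue[0] - '0'])
    {
        // Positions are ascending, so once nValue no longer fits, no later candidate fits either.
        if (indexValue + nValue.Length > input.Length)
            break;
        if (nValue == input.Substring(indexValue, nValue.Length))
            return indexValue; // First Index Value Found
    }
    return -1;
}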
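
A possible Java translation of the same idea, using the HashMap, ArrayList and Character.getNumericValue mentioned above. This is a sketch under assumptions: the class name DigitIndex and the method findIndex are illustrative, and the text is assumed to fit in memory as a String.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DigitIndex {
    private final String input;
    // digit (0-9) -> ascending positions at which that digit occurs in input
    private final Map<Integer, List<Integer>> positions = new HashMap<>();

    DigitIndex(String input) {
        this.input = input;
        for (int d = 0; d < 10; d++) {
            positions.put(d, new ArrayList<>());
        }
        for (int i = 0; i < input.length(); i++) {
            int d = Character.getNumericValue(input.charAt(i));
            if (d >= 0 && d <= 9) { // skip anything that is not a decimal digit
                positions.get(d).add(i);
            }
        }
    }

    // Returns the first index of n in input, or -1 if n does not occur.
    int findIndex(String n) {
        if (n.isEmpty()) {
            return -1;
        }
        List<Integer> candidates = positions.get(Character.getNumericValue(n.charAt(0)));
        if (candidates == null) {
            return -1; // n does not start with a digit
        }
        for (int start : candidates) {
            // startsWith(prefix, offset) returns false when n would run past the end
            if (input.startsWith(n, start)) {
                return start;
            }
        }
        return -1;
    }
}
```

For example, new DigitIndex("1234567891011121314").findIndex("910") returns 8. The index pays off only when many searches run against the same text; for a single search, String.indexOf does the same work without the extra memory.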

Edit: adding the idea for an infinite buffer. This is pseudocode without the end-of-buffer checks, which I leave to the reader to add:

int charIndex = -1
char charValue 
string textValue = "456321587"
char[] textCompare = new char[textValue.Length]
while charValue = charsBuffer.ReadChar()
BEGIN
     charIndex = charIndex + 1 
     if textValue[0] == charValue 
     BEGIN
          int count = 1
          textCompare[0] = charValue 
          while count < textValue.Length
          BEGIN
              textCompare[count] = charsBuffer.ReadChar()
              count = count + 1
          END
          if textValue == new string(textCompare)
              return charIndex
          charIndex = charIndex + textValue.Length - 1
     END
END
return -1

charsBuffer.ReadChar() could be backed by a large text file buffer, a network text buffer, or any other large text buffer.
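
The pseudocode consumes the characters it compares, so a match that begins inside a failed comparison window can be missed (searching "456321587" in a stream that starts "4456321587", for instance). One way around this, sketched here in Java, is to swap in Knuth-Morris-Pratt matching, which never needs to re-read a character. This is a different technique from the pseudocode above; the class name StreamSearch is illustrative, and java.io.Reader stands in for charsBuffer.

```java
import java.io.IOException;
import java.io.Reader;

final class StreamSearch {
    // Returns the zero-based index of the first occurrence of pattern in the
    // character stream, or -1 if the stream ends before a match is found.
    static long indexOf(Reader in, String pattern) throws IOException {
        int m = pattern.length();
        if (m == 0) {
            return 0;
        }
        // fail[i] = length of the longest proper prefix of pattern[0..i]
        // that is also a suffix of pattern[0..i]
        int[] fail = new int[m];
        for (int i = 1, k = 0; i < m; i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        long consumed = 0; // number of characters read so far
        int matched = 0;   // length of the pattern prefix currently matched
        int c;
        while ((c = in.read()) != -1) {
            while (matched > 0 && c != pattern.charAt(matched)) {
                matched = fail[matched - 1];
            }
            if (c == pattern.charAt(matched)) {
                matched++;
            }
            consumed++;
            if (matched == m) {
                return consumed - m;
            }
        }
        return -1;
    }
}
```

A call such as StreamSearch.indexOf(new java.io.StringReader("4456321587"), "456321587") returns 1. On a truly endless stream the method returns only if the pattern eventually occurs, so a caller may want to add a limit on the number of characters read.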

Answer 5

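Below is a general description of how to approach this problem. Translating it into Java is still a challenge.

Your input string is essentially the infinite sequence of all natural numbers: 1 2 3 4 5 6 7 8 9 10 11 12 13 ....

I think the point of the exercise is to identify the first run of natural numbers in the input string that the substring n belongs to, and then to compute its index without actually constructing the big "infinite" string.

To do that, you have to try to split the substring n into a sequence of increasing numbers that have as few digits as possible.

First, check whether the substring n forms a sequence of one-digit numbers. That is the case, for example, if n == 345678. (Note that n may contain both one-digit and two-digit numbers, for example n == 345678910, and you should recognize that as well.)

If that step fails, look for a sequence of two-digit numbers. That is the case, for example, for n == 33343536. This can get trickier, because n == 2333435363 is also a sequence of two-digit numbers, but the leading and trailing numbers of the sequence (32 and 37) are truncated.

If that fails too, look for a sequence of three-digit numbers, and so on.

If you cannot find any sequence, treat the whole substring n as a single number in the big string.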
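
Each candidate split from the steps above can be verified without building the big string. The sketch below only covers that verification step: given a guessed first number and the number of its leading digits that are cut off, it checks whether n really appears there. The names SplitCheck, occursAt, first and skip are made up for illustration; enumerating the candidates is not shown.

```java
final class SplitCheck {
    // Returns true if n appears in the concatenation first, first+1, first+2, ...
    // beginning `skip` digits into `first` (skip = 0 means n starts with `first`).
    // Assumes first >= 1 and skip >= 0.
    static boolean occursAt(String n, long first, int skip) {
        StringBuilder sb = new StringBuilder();
        long k = first;
        // Append consecutive numbers only until the window can hold n.
        while (sb.length() < skip + n.length()) {
            sb.append(k++);
        }
        return sb.toString().startsWith(n, skip);
    }

    public static void main(String[] args) {
        System.out.println(occursAt("33343536", 33, 0));   // prints true
        System.out.println(occursAt("2333435363", 32, 1)); // prints true
        System.out.println(occursAt("33343536", 34, 0));   // prints false
    }
}
```

The work per check is proportional to the length of n, not to the position of n in the infinite string.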

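Now suppose n is 199319941995, and in the previous step you found that the first number in the sequence is 1993. What remains is to compute the index of the number 1993 in the input string. The one-digit numbers take up 1*9 indexes, the two-digit numbers take up 2*90 indexes, and the three-digit numbers take up 3*900 indexes. The four-digit numbers from 1000 to 1992 take up 4*993 indexes. So the zero-based index of 1993 is 1*9 + 2*90 + 3*900 + 4*993 = 6861, which is also the first index of the substring 199319941995.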
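
The digit counting for a number's position generalizes to any starting number. A minimal Java sketch, assuming zero-based indexes as in the question's examples and a number small enough that the arithmetic does not overflow a long; the names NumberOffset and indexOfNumber are illustrative.

```java
final class NumberOffset {
    // Zero-based index at which the decimal form of k (k >= 1) starts
    // in the string "123456789101112...".
    static long indexOfNumber(long k) {
        long index = 0;
        long blockStart = 1; // smallest number with `digits` digits: 1, 10, 100, ...
        int digits = 1;
        while (blockStart * 10 <= k) {
            // 9 * blockStart numbers have exactly `digits` digits
            index += 9 * blockStart * digits;
            blockStart *= 10;
            digits++;
        }
        // numbers blockStart .. k-1 share k's digit count
        return index + (k - blockStart) * digits;
    }

    public static void main(String[] args) {
        System.out.println(indexOfNumber(3));    // prints 2
        System.out.println(indexOfNumber(1993)); // prints 6861
    }
}
```

When n starts partway through its first number, the index of n is the index of that number plus the count of skipped leading digits.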

Answer 6

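Besides implementing a more advanced algorithm (such as Knuth-Morris-Pratt, mentioned in another answer), you can also use a finite state machine for string matching. Its advantage is that the cost of backtracking is much lower than with the simple while-loop solution, and it is easy to use through Java's standard regular expressions. The solution is: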

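import java.util.regex.Matcher;
import java.util.regex.Pattern;

CharSequence text = "1234567891011121314"; // the long sequence of text
String search = "910";                      // whatever you want to search for
// Pattern.quote makes the search string match literally.
Matcher matcher = Pattern.compile(Pattern.quote(search)).matcher(text);
// start() is only valid after find() has returned true.
int startIndex = matcher.find() ? matcher.start() : -1;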
Tags: java, string, algorithm, indexof, sequence
