Finding a substring of a string in Python without inbuilt functions

I'm trying to write some code to find a substring within a string. So far I have this:

main = "dedicated"
sub = "cat"
count = 0
for i in range(0, len(main)):
    match = True
    if sub[0] == main[i]:
        j = 0
        for j in range(0, len(sub)):
            if sub[j] != main[i+j]:
                match = False
                print("No substring")
                break
            else:
                count = count + 1
                if match == True and count == len(sub):
                    print("Substring")
                    print("Position start:", i)

Can anyone help me, give me pointers, or improve the code so that it works correctly with the bullet points above?

To fix your problem, add the following:

main = "this is an example"
sub = "example"
count = 0
done = False
for i in range(0, len(main)):
    match = True
    if sub[0] == main[i]:
        count = 0  # restart the match count for each candidate position
        for j in range(0, len(sub)):
            if sub[j] != main[i+j]:
                match = False
                print("No substring")
                break
            else:
                count = count + 1
                if match == True and count == len(sub):
                    print("Substring")
                    print("Position start:", i)
                    done = True
                    break
    if done == True:
        break

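Note that once the full match has been found you are done, so set a flag variable (done) and break out of the inner loop. Then check the flag and break out of the outer loop as well.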

However, you do still need to handle the case where sub would run past the end of main, for example:

main = "this is an example"
sub = "examples"

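In this case you need to check that the index i + j does not go out of range. I will leave that for you to figure out, since it was not part of the original question.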
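
One possible way to handle that bounds check, offered as a sketch rather than as the answerer's own solution, is to stop the outer loop at len(main) - len(sub), so that main[i + j] can never run past the end. The function name find_substring is hypothetical. Wrapping the search in a function also lets return take the place of the done flag.

```python
def find_substring(main, sub):
    """Return the start index of the first occurrence of sub in main, or -1."""
    # Only start positions that leave room for all of sub are tried,
    # so main[i + j] below is always in range.
    for i in range(len(main) - len(sub) + 1):
        match = True
        for j in range(len(sub)):
            if main[i + j] != sub[j]:
                match = False
                break
        if match:
            return i  # leaves both loops at once
    return -1


print(find_substring("this is an example", "example"))   # 11
print(find_substring("this is an example", "examples"))  # -1
```

When sub is longer than main, the range is empty and the function returns -1 without comparing anything.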

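def index(s, sub):
    start = 0  # candidate start position in s
    end = 0    # number of characters of sub matched so far
    # Stop once too few characters remain to hold sub; this also keeps
    # s[start + end] in range.
    while start + len(sub) <= len(s):
        if end == len(sub):
            return start
        if s[start + end] != sub[end]:
            # Mismatch: move to the next start position and start over.
            start += 1
            end = 0
            continue
        end += 1
    return -1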

Output:

>>> index("dedicate", 'cat')
4
>>> index("this is an example", 'example')
11
>>> index('hello world', 'word')
-1

s1 = "gabcdfahibdgsabc hi kilg hi"
s2 = "hi"
count = 0
l2 = len(s2)
for i in range(len(s1)):
    # Only slice and compare where the first character matches.
    if s1[i] == s2[0]:
        end = i + l2
        sub1 = s1[i:end]
        if s2 == sub1:
            count += 1
print(count)

def find_sub_str(sample, sub_str):
    count = 0
    for index in range(len(sample)):
        nxt_len = index + len(sub_str)
        if sample[index:nxt_len] == sub_str:
            count += 1
            # Report each match as it is found.
            print("Sub string present {} position start at index {}".format(
                sample[index:nxt_len], index))
    print("no of times substring present:", count)

find_sub_str("dedicate", "cat")
Sub string present cat position start at index 4
no of times substring present: 1

find_sub_str("nothing", "different")
no of times substring present: 0

find_sub_str("this is an example", "example")
Sub string present example position start at index 11
no of times substring present: 1

Consider the following input:

sub_str = "hij"
input_strs = "abcdefghij"

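The logic here is:

Take each group of consecutive characters of the main string, in order from index 0 to the end of the string, and check whether it equals the substring.

The iterations are as follows: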
Iteration 1:  abc
Iteration 2:  bcd
Iteration 3:  cde
Iteration 4:  def
Iteration 5:  efg
Iteration 6:  fgh
Iteration 7:  ghi
Iteration 8:  hij

When the main string has length 10 and the substring has length 3, at most 10 - 3 + 1 = 8 iterations are needed.
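
The list of iterations above can be reproduced with a short sketch. The generator name windows is hypothetical and is only meant to illustrate how the groups are formed.

```python
def windows(main_str, sub_len):
    """Yield (start_index, window) for each sub_len-long slice of main_str."""
    for start in range(len(main_str) - sub_len + 1):
        yield start, main_str[start:start + sub_len]


# Print every group that would be compared against a 3-character substring.
for iteration, (start, window) in enumerate(windows("abcdefghij", 3), 1):
    print("Iteration {}: {}".format(iteration, window))
```

For "abcdefghij" and a window length of 3 this prints the eight groups from abc to hij.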

Complexity:

Worst case complexity = LenOfMainString - LenOfSubString + 1 iterations
Best case complexity = 0 iterations, when LenOfSubString is greater than LenOfMainString
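
These figures count loop iterations. Each iteration may itself compare up to LenOfSubString characters, so the character-level cost can be higher. The sketch below, with the hypothetical name count_comparisons, counts both for a naive search that stops at the first match; it is an illustration, not part of the original answer.

```python
def count_comparisons(main_str, sub_str):
    """Return (iterations, character_comparisons) for a naive search
    that stops at the first match."""
    iterations = 0
    comparisons = 0
    for start in range(len(main_str) - len(sub_str) + 1):
        iterations += 1
        matched = True
        for offset in range(len(sub_str)):
            comparisons += 1
            if main_str[start + offset] != sub_str[offset]:
                matched = False
                break
        if matched:
            break
    return iterations, comparisons


print(count_comparisons("abcdefghij", "hij"))  # (8, 10)
print(count_comparisons("aaaaaaaaaa", "aab"))  # (8, 24)
```

Both searches take 8 iterations, but the second compares three characters at every start position before failing.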

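Note: this code checks whether the given string is present in the main string, i.e. whether it is a substring. It returns a boolean rather than the index, but it prints the index if a match is found and -1 otherwise.

Code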

def is_sub_string(main_str, sub_str):
    """
    @Summary: Check whether sub_str is a substring of main_str.
    @Param main_str(String): Main string in which to look for the substring.
    @Param sub_str(String): String to look for in the main string.
    @Return (Boolean): True if present else False.
    """
    # Only input_str_len - sub_len + 1 start positions need checking.
    # For example, with a 10-character main string and a 3-character
    # substring there are at most 8 iterations, because a match cannot
    # start in the last two characters.
    sub_len = len(sub_str)
    input_str_len = len(main_str)
    index = 0
    is_sub_string = False
    while index < input_str_len - sub_len + 1:
        # Compare sub_str with the sub_len characters of main_str that
        # start at index.
        if sub_str == main_str[index:index + sub_len]:
            is_sub_string = True
            break
        # Move on to the next start position.
        index += 1
    print("Total Iteration:", index + 1 if is_sub_string else index, end="\t")
    print("Is Substring:", is_sub_string, end="\t")
    print("Index:", index if is_sub_string else -1)
    return is_sub_string

Output

Case 01: the substring is at the start of the main string.

status = is_sub_string("abcdefghij", "abc")
>> Total Iteration: 1      Is Substring: True      Index: 0

Case 02: the substring is at the end of the main string.

status = is_sub_string("abcdefghij", "hij")
>> Total Iteration: 8      Is Substring: True      Index: 7

Case 03: the substring is not present in the main string.

status = is_sub_string("abcdefghij", "hix")
>> Total Iteration: 8      Is Substring: False     Index: -1

Case 04: the substring is longer than the main string.

status = is_sub_string("abcdefghij", "abcdefghijabcdefghij")
>> Total Iteration: 0      Is Substring: False     Index: -1

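You can halve the number of iterations by searching from the start and from the end of the string at the same time. Each iteration then makes two comparisons instead of one, so the total number of comparisons is not reduced.

Complexity

Worst case complexity = (LenOfMainString - LenOfSubString + 1)/2 iterations
Best case complexity = 0 iterations, when LenOfSubString is greater than LenOfMainString

Code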

def is_sub_string(main_str, sub_str):
    """
    @Summary: Check whether sub_str is a substring of main_str.
    @Param main_str(String): Main string in which to look for the substring.
    @Param sub_str(String): String to look for in the main string.
    @Return (Boolean): True if present else False.
    """
    # The loop runs at most (input_str_len - sub_len + 1)/2 times because
    # each iteration checks one window from the front and one from the back.
    sub_len = len(sub_str)
    input_str_len = len(main_str)
    index = 0
    is_sub_string = False
    find_index = -1
    while index < (input_str_len - sub_len + 1) / 2:
        # Front window: sub_len characters starting at index.
        if sub_str == main_str[index:index + sub_len]:
            is_sub_string = True
            find_index = index
            break
        # Debug trace: slice bounds of the back window, then both windows.
        print((index + sub_len) * -1, input_str_len - index, end="\t")
        print(main_str[(index + sub_len) * -1:input_str_len - index], main_str[index:index + sub_len])
        # Back window: sub_len characters ending index positions before the end.
        if sub_str == main_str[(index + sub_len) * -1:input_str_len - index]:
            is_sub_string = True
            find_index = (index + sub_len - input_str_len) * (-1)
            break
        # Move both windows one position towards the middle.
        index += 1
    print("Total Iteration:", index + 1 if is_sub_string else index, end="\t")
    print("Is Substring:", is_sub_string, end="\t")
    print("Index:", find_index)
    return is_sub_string

# abcd
# ab
t = int(input())  # number of test cases
while t > 0:
    str_Original = input()  # string to search in
    str_Find = input()      # string to search for
    i = 0
    j = 0
    count = 0
    while i < len(str_Original):
        for j in range(len(str_Find)):
            # Only compare while i + j is still inside str_Original.
            if (len(str_Original) - i - 1) >= j:
                if str_Original[i+j] == str_Find[j]:
                    # Reached the last character of str_Find: full match.
                    if j == len(str_Find) - 1:
                        print("Found String at index : " + str(i))
                        count += 1
                else:
                    break
        i += 1
    if count > 0:
        print("Count : " + str(count))
    else:
        print("Did not find any match")
    t -= 1

I think the most concise approach is the following:

string = "samitsaxena"
sub_string = "sa"

sample_list =[]

for i in range(0, len(string)-len(sub_string)+1):
   sample_list.append(string[i:i+len(sub_string)])

print(sample_list)
print(sample_list.count(sub_string))

The output is as follows:

['sa', 'am', 'mi', 'it', 'ts', 'sa', 'ax', 'xe', 'en', 'na']
2

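Look at the sample_list output.

The logic is that we build every substring of the main string whose length equals the length of the given substring.

We do this because we want to compare each of these substrings with the given substring.

You can change the values of string and sub_string in the code to try different combinations, which will also help you understand how the code works.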
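
The snippet above relies on slicing and on list.count. If those are also off limits, the same comparison can be sketched with plain loops. The name count_occurrences is hypothetical, and treating an empty sub_string as zero matches is an assumption made here, not something the original states.

```python
def count_occurrences(string, sub_string):
    """Count occurrences of sub_string in string, overlaps included."""
    if len(sub_string) == 0:
        return 0  # assumption: an empty pattern counts as no match
    count = 0
    for i in range(len(string) - len(sub_string) + 1):
        # Advance j while the characters at this start position agree.
        j = 0
        while j < len(sub_string) and string[i + j] == sub_string[j]:
            j += 1
        if j == len(sub_string):
            count += 1
    return count


print(count_occurrences("samitsaxena", "sa"))  # 2
```

Like the list-based version, this counts overlapping matches, so "aa" is found three times in "aaaa".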

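def count_substring(string, sub_string):
    # Despite its name, this returns the index of the first
    # case-insensitive match of sub_string in string, or -1 if none.
    string = string.lower()
    sub_string = sub_string.lower()

    for index, letter in enumerate(string):
        # Only slice and compare where the first character matches.
        if letter == sub_string[0]:
            temp_string = string[index:index+len(sub_string)]
            if temp_string == sub_string:
                return index
    return -1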