Reading files using system calls and printing lines

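This program reads a text file, "hello.txt", searches it for occurrences of the string w, and prints the line number and the whole line for each match. It also prints how many times the string w occurs in the file. The program compiles without errors. Here is the code: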

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int main() {

    int fd;
    char c;
    char str[152];
    int i = 0, j = 0;
    int bytesread;
    int flag = 1;
    int found = 0;
    int line = 1;
    int foundflag = 1;

    char w[] = {'h', 'e', 'l', 'l', 'o'};
    int len = strlen(w);

    if ((fd = open("hello.txt", O_RDONLY, 0)) != -1) { //if 1

        bytesread = read(fd, &c, 1);
        str[j] = c;
        j++;

        if (bytesread != -1) { //if 2

            while (bytesread != 0) { //while

                if (c == '\n')
                    line++;

                if (c == w[i]) { //if 3
                    i++;
                    flag = 0;
                } else if (flag == 0 || i == len) //end of f3
                { // else 3
                    i = 0;
                    flag = 1;
                }// end of else 3
                else if (flag == 1) {
                    while (read(fd, &c, 1)) {
                        str[j] = c;
                        j++;
                        if (c == ' ')
                            break;
                        if (c == '\n') {
                            line++;
                            break;
                        }
                    }
                }

                bytesread = read(fd, &c, 1);
                str[j] = c;
                j++;

                if ((c == ' ' || c == '\n') && flag == 0 && i == len) {
                    found++;
                    foundflag = 0;
                    printf("w was found in line %d.\n", line);
                }

                if ((c == '\n')&&(foundflag == 0)) {

                    for (j = 0; str[j] != '\n'; j += 5) {
                        printf("%c", str[j]);

                        if (str[j + 1] != '\n')
                            printf("%c", str[j + 1]);
                        else {
                            j++;
                            break;
                        }

                        if (str[j + 2] != '\n')
                            printf("%c", str[j + 2]);
                        else {
                            j += 2;
                            break;
                        }

                        if (str[j + 3] != '\n')
                            printf("%c", str[j + 3]);
                        else {
                            j += 3;
                            break;
                        }

                        if (str[j + 4] != '\n')
                            printf("%c", str[j + 4]);
                        else {
                            j += 4;
                            break;
                        }
                    }

                    for (; str[j] != '\n'; j++)
                        printf("%c", str[j]);

                    printf("\n");
                    j = 0;

                } else if (c == '\n')
                    foundflag = 1;

            } //end of while
            printf("w has occured %d times.\n", found);

        } else //end of if 2
            printf("couldn't read file.\n");

    } else //end of if 1
        printf("Couldn't open file for read.\n");

    close(fd);
} //end of main

This is the terminal output:

w was found in line 1.
hello
w was found in line 2.
w was found in line 6.
hello world
hellooooo
w has occured 3 times.

Here are the contents of "hello.txt":

hello
hello world
hallo
I'm here
we're here
hello
hellooooo

The line numbers printed in the output are 1, 2 and 6, but the output should look like this:

w was found in line 1.
hello
w was found in line 2.
hello world
w was found in line 6.
hello
w has occured 3 times.
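
  1. I suggest you read some C material. Your code suggests that you don't know the language very well yet.
  2. I won't change your code, because that would be hard.
  3. I'll post the relevant parts of my code and explain them.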

So, the code bits:

const char fname[] = "hello.txt";
const char w[] = "hello";

(...)    

while (read(fd, &buffer[i], 1) == 1) {
    /* end of line */
    if (buffer[i] == '\n' || buffer[i] == 0x0) {
        buffer[i] = 0;
        if (!strncmp(buffer, w, strlen(w))) {
            printf("w was found in line %d\n", line);
            puts(buffer);
            n++;
        }
        line++;
        i = 0;
        continue;
    }
    i++;
}

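Explanation

  1. while (read(fd, &buffer[i], 1) == 1): this reads one character from your fd (returned by an earlier open call) and stores it in buffer[i]. The thing to note here is that you should declare int i = 0 beforehand and make sure buffer is a defined array or a malloc-allocated memory region. The while keeps going until the number of bytes read is different from 1 (which is what we asked for).

  2. if (buffer[i] == '\n' || buffer[i] == 0x0): this if detects the end of a line. Pretty straightforward.

  3. buffer[i] = 0; if (!strncmp(buffer, w, strlen(w))): buffer[i] = 0 sets the last character of the current buffer to zero. That strips the \n we just read, so the line prints cleanly with puts. What I suggested in the comments is to use strncmp. This function is like strcmp, but it compares at most the given number of bytes. With it you can tell whether a string starts with the substring you are looking for. If the string is found, we print the line number, print the buffer itself, and increment n, the counter for how many times w was found. You should declare int n = 0; at the beginning of your code.

  4. line++; i = 0; continue;: this is inside the end-of-line if. It increments the line counter and sets i to zero. That matters because each new line is read into the buffer from scratch, so the buffer index must start at 0 again. continue forces the loop to repeat without executing the rest of the body.

  5. Finally, the rest of the while body is just i++. Since the while loop runs once per character, the buffer index must be incremented after each character is read.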
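
For reference, here is one way the loop above could be wrapped into a complete function. This is only a sketch under a few assumptions that are not in the snippet: the function name count_prefix_lines, the LINE_CAP limit, and the bounds check that truncates overlong lines are all made up for illustration. As in the loop above, a final line that does not end in '\n' is never examined.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define LINE_CAP 256  /* assumed maximum line length, not from the answer */

/* Hypothetical helper: reads fd one byte at a time and reports every
 * line that starts with w. Returns the number of matching lines. */
int count_prefix_lines(int fd, const char *w)
{
    char buffer[LINE_CAP];
    size_t wlen = strlen(w);
    int i = 0;     /* index of the slot the next byte is read into */
    int line = 1;  /* current line number, 1-based */
    int n = 0;     /* number of matching lines */

    while (read(fd, &buffer[i], 1) == 1) {
        /* end of line */
        if (buffer[i] == '\n' || buffer[i] == 0x0) {
            buffer[i] = 0;  /* replace '\n' with the string terminator */
            if (!strncmp(buffer, w, wlen)) {
                printf("w was found in line %d\n", line);
                puts(buffer);
                n++;
            }
            line++;
            i = 0;
            continue;
        }
        /* Stop advancing at the last slot so the terminator always fits;
         * lines longer than LINE_CAP - 1 characters are truncated. */
        if (i < LINE_CAP - 1)
            i++;
    }
    return n;
}
```

You would call it with the descriptor returned by open("hello.txt", O_RDONLY), print the returned count, and then close the descriptor.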


The file I tested with is the one you provided. The output I got is:

w was found in line 1
hello
w was found in line 2
hello world
w was found in line 6
hello
w was found 3 times
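
One caveat: strncmp only tests whether a line starts with w, so a line such as "hellooooo" would also match whenever it reaches the comparison (for example, if it ends with a newline), and a w in the middle of a line would be missed. If you want whole-word matches anywhere in the line, a check along these lines might work. It is a sketch, not part of the code above: the name line_has_word is hypothetical, and it assumes words are separated only by spaces, as in the question's code.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if w appears in line as a whole word,
 * where a word is delimited by a space or by either end of the line. */
int line_has_word(const char *line, const char *w)
{
    size_t wlen = strlen(w);
    const char *p = line;

    if (wlen == 0)
        return 0;

    while ((p = strstr(p, w)) != NULL) {
        int starts_word = (p == line) || (p[-1] == ' ');
        int ends_word = (p[wlen] == '\0') || (p[wlen] == ' ');
        if (starts_word && ends_word)
            return 1;
        p++;  /* resume the search one character further on */
    }
    return 0;
}
```

Inside the end-of-line branch, the strncmp test could then be swapped for line_has_word(buffer, w).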