Is there a variant of sscanf that takes pointers into the input string instead of buffers?
sscanf works like this:
#include <stdio.h>

int main(void) {
    char buf1[1024] = {0};
    char buf2[1024] = {0};
    char buf3[1024] = {0};
    const char *str = "abc, 123; xyz";
    // Field widths keep each conversion inside its 1024-byte buffer.
    sscanf(str, "%1023[^,], %1023[^;]; %1023s", buf1, buf2, buf3);
    printf("'%s' '%s' '%s'", buf1, buf2, buf3); // Prints: "'abc' '123' 'xyz'"
    return 0;
}
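I would like to know whether there is a function that neither copies the contents of str into buffers (buf1, buf2, buf3) nor allocates any new memory. Instead, it would only set pointers (ptr1, ptr2, ptr3) to the matching parts of str and null-terminate whatever follows each match.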
int main(void) {
    char *ptr1 = NULL;
    char *ptr2 = NULL;
    char *ptr3 = NULL;
    char str[] = "abc, 123; xyz"; // writable array, because the string is modified in place
    //
    // str = "abc, 123; xyz\0"
    //
    _sscanf(str, "%[^,], %[^;]; %s", &ptr1, &ptr2, &ptr3); // _sscanf is the hypothetical function
    //
    // str = "abc\0 123\0 xyz\0"
    //        ^     ^     ^
    //        ptr1  ptr2  ptr3
    //
    printf("'%s' '%s' '%s'", ptr1, ptr2, ptr3); // Prints: "'abc' '123' 'xyz'"
    return 0;
}
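I know that functions such as strtok_r and the regex.h library could be used, but I think this would be more convenient in cases where it is acceptable to modify the input string.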
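It's not pretty, but the %n specifier can be used to capture the index of the start and end of each token. Error checking would make sure that the index and end values are not -1.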
#include <stdio.h>

int main(void) {
    int index1 = -1;
    int end1 = -1;
    int index2 = -1;
    int end2 = -1;
    int index3 = -1;
    int end3 = -1;
    const char *str = "abc, 123; xyz";
    sscanf(str, " %n%*[^,]%n, %n%*[^;]%n; %n%*s%n", &index1, &end1, &index2, &end2, &index3, &end3);
    // Each precision is a token length: end offset minus start offset.
    printf("'%.*s' '%.*s' '%.*s'", end1 - index1, str + index1, end2 - index2, str + index2, end3 - index3, str + index3); // Prints: "'abc' '123' 'xyz'"
    return 0;
}
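There is no standardized variant that ends up with char * pointers to positions in the original string. There is a variant in POSIX that allocates memory for each string item and copies the data into it.
The functionality of sscanf() matches that of fscanf() and the other variants and, to a very large extent, what applies to one variant applies to all of them. However, what you are looking for could not be applied to the file-based variants, so it does not exist.
There is a variant of sscanf() that allocates memory for the strings. It is the POSIX 2008 variant of sscanf() with the m modifier.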
[CX] ⌦ The %c, %s, and %[ conversion specifiers shall accept an optional assignment-allocation character 'm', which shall cause a memory buffer to be allocated to hold the string converted including a terminating null character. In such a case, the argument corresponding to the conversion specifier should be a reference to a pointer variable that will receive a pointer to the allocated buffer. The system shall allocate a buffer as if malloc() had been called. The application shall be responsible for freeing the memory after usage. If there is insufficient memory to allocate a buffer, the function shall set errno to [ENOMEM] and a conversion error shall result. If the function returns EOF, any memory successfully allocated for parameters using assignment-allocation character 'm' by this call shall be freed before the function returns. ⌫
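The [CX] notation marks this as an extension over the C standard (so the m modifier is not part of standard C and is not supported everywhere), and the ⌦ and ⌫ symbols mark the extent of the extension.
So, if your implementation supports it (for example, Linux does; macOS Sierra does not), this variant of sscanf() allocates correctly sized buffers for you, and it requires char ** arguments.
The manual page on Linux says: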
An optional 'm' character. This is used with string conversions (%s, %c, %[), and relieves the caller of the need to allocate a corresponding buffer to hold the input: instead, scanf() allocates a buffer of sufficient size, and assigns the address of this buffer to the corresponding pointer argument, which should be a pointer to a char * variable (this variable does not need to be initialized before the call). The caller should subsequently free(3) this buffer when it is no longer required.
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char data[] = "The hills are alive with the sound of music";
    char *w[9];

    if (sscanf(data, "%ms %ms %ms %ms %ms %ms %ms %ms %ms",
               &w[0], &w[1], &w[2], &w[3], &w[4], &w[5], &w[6], &w[7], &w[8]) != 9)
    {
        fprintf(stderr, "Oops!\n");
        return 1;
    }
    printf("Forwards: %s\n", data);
    printf("Reversed:");
    for (int i = 8; i >= 0; i--)
        printf(" %s", w[i]);
    putchar('\n');
    for (int i = 0; i < 9; i++)
        free(w[i]);
    return 0;
}
Output:
Forwards: The hills are alive with the sound of music
Reversed: music of sound the with alive are hills The