How can I intercept a function from glibc and print the values of its parameters?

Question

In glibc, the function _IO_new_fopen() is called by the fopen() libcall. If I run the following code, is there any way to intercept _IO_new_fopen() when it is called and print the values of its parameters?

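For kernel functions this can be done with Jprobe; I am looking for a similar mechanism for functions in glibc. LD_PRELOAD is a related mechanism that lets us replace a glibc function with one we define ourselves, but it does not help with what I am trying to do.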

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
int main(void){

    char buff[100];
    int i, r1;
    FILE *f1 = fopen("text.txt", "wb");

    for(i = 0; i < 100; i++){
            buff[i] = 'b';
    }
    assert(f1);
    r1 = fwrite(buff, 1, 100, f1);
    printf("wrote %d elements\n", r1);
    fclose(f1);
}

Answer 1

In glibc, fopen is a [public] symbol alias for the [local symbol] _IO_new_fopen (i.e. they are the same function), so technically fopen does not call _IO_new_fopen: it is _IO_new_fopen.

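If you want to intercept the calls shown in your example (i.e. those coming from the app), then using LD_PRELOAD [along with dlopen, dlsym, etc.] and defining your own fopen will work. I have done this in my own code before.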

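You may simply be missing an entry point (i.e. you may need to define/intercept several symbols): fopen, fopen64, _IO_file_fopen, _IO_fopen, etc. If you run readelf -s on your executable, the plain fopen may show up as something else. My guess is that you will see fopen64. You may also need to take symbol versioning into account.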
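As a rough sketch of what covering more than one entry point could look like, here is an interposer that defines both fopen and fopen64 and forwards each to the next definition found by dlsym(RTLD_NEXT, ...). The helper names (resolve, log_and_open), the fopen_hits counter and the file names in the build comment are made up for illustration. It assumes a 64-bit glibc build without _FILE_OFFSET_BITS=64, where fopen and fopen64 are separate symbols, and it does not deal with symbol versioning.

```c
#define _GNU_SOURCE            /* must precede every #include; exposes RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Build and run (file names are placeholders):
 *   gcc -shared -fPIC -o libspy.so spy.c -ldl
 *   LD_PRELOAD=./libspy.so ./app
 */

typedef FILE *(*fopen_fn)(const char *, const char *);

/* Illustrative counter so the hook's effect is easy to observe. */
unsigned long fopen_hits;

/* Look up the next definition of `name` after this object, normally glibc's. */
static fopen_fn
resolve(const char *name)
{
    fopen_fn fn = (fopen_fn) dlsym(RTLD_NEXT, name);

    if (fn == NULL) {
        fprintf(stderr, "spy: cannot resolve %s\n", name);
        abort();
    }
    return fn;
}

/* Shared body: forward the call, then log the arguments and the result. */
static FILE *
log_and_open(const char *name, fopen_fn *real,
             const char *path, const char *mode)
{
    if (*real == NULL)
        *real = resolve(name);

    FILE *fp = (*real)(path, mode);

    fopen_hits++;
    fprintf(stderr, "%s(\"%s\", \"%s\") = %p\n", name, path, mode, (void *) fp);
    return fp;
}

FILE *
fopen(const char *path, const char *mode)
{
    static fopen_fn real;
    return log_and_open("fopen", &real, path, mode);
}

FILE *
fopen64(const char *path, const char *mode)
{
    static fopen_fn real;
    return log_and_open("fopen64", &real, path, mode);
}
```

Whichever of the two names readelf -s shows for your executable is the one that will actually fire; defining both just saves you from guessing.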

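You will not be able to intercept calls to fopen that are made from inside glibc itself, because they bypass the mechanism and go directly to the function. Intercepting those is rarely useful anyway, so more detail from you may be required.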

You can also look at fopencookie as a way to intercept the underlying read(2), write(2), etc. syscalls.
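To make the fopencookie idea a little more concrete, here is a minimal sketch of a stream whose write callback logs each request before forwarding it to write(2). The struct spy_cookie, spy_write and spy_open names are invented for this example. Note that the application (or an fopen interposer) has to create the stream through fopencookie for this to apply, and that the buffer the callback sees is the stream's internal buffer, not the pointer the caller handed to fwrite.

```c
#define _GNU_SOURCE            /* fopencookie is a GNU extension */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical cookie: the descriptor to forward to plus a byte counter. */
struct spy_cookie {
    int    fd;
    size_t bytes_written;
};

/* Called by stdio whenever it flushes buffered data for this stream. */
static ssize_t
spy_write(void *cookie, const char *buf, size_t size)
{
    struct spy_cookie *c = cookie;

    fprintf(stderr, "spy_write: buf=%p size=%zu\n", (const void *) buf, size);

    ssize_t n = write(c->fd, buf, size);
    if (n > 0)
        c->bytes_written += (size_t) n;

    /* A cookie write function reports errors as 0, never as a negative value. */
    return n < 0 ? 0 : n;
}

/* Wrap an already-open descriptor in a write-only logging stream. */
FILE *
spy_open(struct spy_cookie *c)
{
    cookie_io_functions_t io = {
        .read  = NULL,
        .write = spy_write,
        .seek  = NULL,
        .close = NULL,         /* fclose() will not close c->fd */
    };

    return fopencookie(c, "w", io);
}
```

Because stdio buffers, spy_write typically fires on fflush or fclose rather than once per fwrite call.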

Update:

Specifically, I am trying to print the address and the content of the buffer that is used by fread()/fwrite().

That is easy to do. Details below...

I think it should be able to be done by adding "printk()" somewhere and rebuild libc.so.6, but is there any approach that is more convenient (i.e. without rebuilding libc.so.6)?

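There is no need to rebuild libc; LD_PRELOAD will handle it. Bear in mind that printk lives in the kernel. There is no need to go there [and it probably would not work].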

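The address you want is the one passed to fread, not the buffer passed to read(2). The buffer passed to read(2) lives in the FILE struct, so it will not tell you much. Otherwise, you could run the whole program under strace(1) [or write your own custom version using ptrace(2)].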

You can intercept, trace, breakpoint, etc. as many functions as you like. You just need to create your own shared library (.so) and set LD_PRELOAD to it.

Here is a sample interceptor function [for fread]:

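#define _GNU_SOURCE             // for RTLD_NEXT; must come before any #include
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

// trapme -- put gdb breakpoint on this function
__attribute__((__noinline__)) void
trapme(void)
{

    // prevent the function call to this from being optimized away
    __asm__ __volatile__ ("" ::: "memory");
}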

// dumpme -- dump out a buffer
void
dumpme(const void *buf,size_t xlen)
{

    // dump data in whatever format you'd like ...
}

// fread -- intercept fread calls
size_t
fread(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    static size_t
    (*fread_real)(void *ptr,size_t size,size_t nmemb,FILE *stream) = NULL;
    size_t xlen;

    printf("fread_fake: ENTER ptr=%p size=%zu nmemb=%zu stream=%p\n",
        ptr,size,nmemb,(void *) stream);

    // do trap on some suspicious activity ...
#if 0
    if (ptr == ...)
        trapme();
#endif

    // locate the real symbol in glibc
    if (fread_real == NULL) {
        fread_real = dlsym(RTLD_NEXT,"fread");
        // abort if fread_real is still null (abort needs <stdlib.h>)
        if (fread_real == NULL)
            abort();
    }

    xlen = fread_real(ptr,size,nmemb,stream);

    // dump out the data
    if (xlen > 0)
        dumpme(ptr,xlen * size);

    // do trap on some suspicious activity ...
#if 0
    if (xlen == 372)
        trapme();
#endif

    printf("fread_fake: EXIT xlen=%zu\n",xlen);

    return xlen;
}
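The dumpme stub above leaves the output format open. One possible body, purely as an illustration, is a classic hex-plus-ASCII dump with 16 bytes per line. The dump_to helper and the choice of stderr as the destination are assumptions of this sketch, not part of the original code.

```c
#include <ctype.h>
#include <stdio.h>

/* dump_to -- hex + ASCII dump of buf[0..xlen) to `out`, 16 bytes per line */
static void
dump_to(FILE *out, const void *buf, size_t xlen)
{
    const unsigned char *p = buf;

    for (size_t off = 0; off < xlen; off += 16) {
        size_t n = xlen - off < 16 ? xlen - off : 16;

        /* offset column */
        fprintf(out, "%08zx  ", off);

        /* hex column, padded so the ASCII column always lines up */
        for (size_t i = 0; i < 16; i++) {
            if (i < n)
                fprintf(out, "%02x ", p[off + i]);
            else
                fputs("   ", out);
        }

        /* ASCII column, with '.' standing in for non-printable bytes */
        fputs(" |", out);
        for (size_t i = 0; i < n; i++)
            fputc(isprint(p[off + i]) ? p[off + i] : '.', out);
        fputs("|\n", out);
    }
}

/* dumpme -- same signature as the stub, so it can be dropped in as-is */
void
dumpme(const void *buf, size_t xlen)
{
    dump_to(stderr, buf, xlen);
}
```

Writing to stderr keeps the dump out of whatever the traced program sends to stdout; a log file opened once at startup would work just as well.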

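That is the basic mechanism. You can add whatever [nefarious] stuff you want. You can get fancy and add trap logic for when the buffer falls in some [bad] range or has interesting contents, i.e. a conditional statement that works like a conditional breakpoint. That way you can trigger a drop into gdb using a more elaborate test than gdb alone could express [to find a hard-to-find bug].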

You can also monitor fopen, remember the filenames, etc.
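One simple way to remember filenames is a small table keyed by the FILE pointer, filled in from an fopen interceptor and consulted from the fread/fwrite ones. The sketch below is only that table; remember_stream, stream_name, forget_stream and the fixed sizes are invented for illustration, and it is not thread-safe.

```c
#include <stdio.h>

#define MAX_TRACKED 64          /* arbitrary limit for this sketch */

struct tracked {
    FILE *fp;                   /* NULL marks a free slot */
    char  path[256];
};

static struct tracked tracked[MAX_TRACKED];

/* Call after the real fopen succeeded: record (or update) fp's path. */
void
remember_stream(FILE *fp, const char *path)
{
    struct tracked *slot = NULL;

    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].fp == fp) {          /* already known: reuse its slot */
            slot = &tracked[i];
            break;
        }
        if (tracked[i].fp == NULL && slot == NULL)
            slot = &tracked[i];             /* first free slot seen so far */
    }
    if (slot == NULL)
        return;                             /* table full: leave untracked */

    slot->fp = fp;
    snprintf(slot->path, sizeof slot->path, "%s", path);
}

/* Return the recorded path for fp, or "?" if it was never recorded. */
const char *
stream_name(FILE *fp)
{
    for (size_t i = 0; i < MAX_TRACKED; i++)
        if (tracked[i].fp == fp)
            return tracked[i].path;
    return "?";
}

/* Call from an fclose interceptor so a reused FILE pointer starts clean. */
void
forget_stream(FILE *fp)
{
    for (size_t i = 0; i < MAX_TRACKED; i++)
        if (tracked[i].fp == fp)
            tracked[i].fp = NULL;
}
```

With this in place, a trace line in the fread interceptor can print stream_name(stream) next to the raw stream pointer.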

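Getting this mechanism to work is not too difficult, but my advice is to write the interceptor for fopen first and get comfortable with the dlsym trick (rather than trying to debug it inside a loop).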

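You can create as many of these functions as you need. Start simple (e.g. fopen, fread, fwrite). Then add more as you find a "gap" in coverage (e.g. eventually you may find that intercepting fseek gives you the information you need).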
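For the fwrite side, a sketch along the same lines as the fread interceptor might look like this. The in_hook guard and the fwrite_logged counter are additions for this example. The guard is there because logging code inside the hook can end up calling fwrite again (compilers may rewrite fputs or a constant-string fprintf into an fwrite call), which would otherwise re-enter the hook.

```c
#define _GNU_SOURCE            /* must precede every #include; exposes RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

typedef size_t (*fwrite_fn)(const void *, size_t, size_t, FILE *);

static __thread int in_hook;   /* per-thread re-entrancy guard */
unsigned long fwrite_logged;   /* illustrative counter of logged calls */

/* fwrite -- intercept fwrite calls made by the application */
size_t
fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    static fwrite_fn fwrite_real;

    /* locate the real symbol in glibc */
    if (fwrite_real == NULL) {
        fwrite_real = (fwrite_fn) dlsym(RTLD_NEXT, "fwrite");
        if (fwrite_real == NULL)
            abort();
    }

    /* a call that originates from our own logging goes straight through */
    if (in_hook)
        return fwrite_real(ptr, size, nmemb, stream);

    in_hook = 1;
    fprintf(stderr, "fwrite_fake: ptr=%p size=%zu nmemb=%zu stream=%p\n",
            ptr, size, nmemb, (void *) stream);
    fwrite_logged++;
    in_hook = 0;

    return fwrite_real(ptr, size, nmemb, stream);
}
```

Here ptr is the caller's buffer, so its contents can be dumped before the call is forwarded; for fread the dump has to wait until the real function has filled the buffer.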

Update #2:

Here is a sample strace script with the options I like to use:

strace -ttt -i -f -etrace=all -o \
    /home/me/log/foobar.spysys \
    -eread=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 \
    -ewrite=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 \
    -x foobar

The -eread and -ewrite lists can be extended to as many units (file descriptors) as you need.

