How to compress multiple buffers in memory with boost into one and get its compressed size?

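I want to compress multiple buffers (in my case, video frames from different sources) into one new buffer using boost's zlib compression and then, later on, write everything to a single file on disk. I need these two steps because I want to prepend a header to the file that contains the final size of the compressed buffers (this will later serve as an offset for a parser). I want to achieve this with boost's iostreams library.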

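The following related questions arise:

a) Do I need to use filtering_stream or filtering_streambuf? I would expect the latter to already have some kind of buffering behavior.

b) How can I close the filtering_stream(buf) and write it to a buffer?

c) How can I read the final size of the compressed data? .tellg() is not implemented for these filtering_streams (as mentioned elsewhere on SO).

d) Can you have multiple sources, i.e. my three buffers, or do I need to combine them? (See below for my approach.)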

class Frame {
private:
    /* other things */
public:
    float buf1[3];
    float buf2[3];
    float buf3[4];
    /* more things */
};

int main() {
    Frame frame;
    
    using boost::iostreams bio;
    
    bio::filtering_streambuf<bio::input> in;
    in.push(bio::gzip_compressor());
    /* Could you also add the buffers individually? */
    in.push(bio::array_source(reinterpret_cast<const char*>(frame.buf1), 3 + 7 + 12 + (sizeof(float) * 3)));
    
    const char *compressed = /* How to close in and write the contents to this buffer? */
    int compressedSize = /* How to get this from in? in.tellg() does not work */
    
    std::stringstream headerInformation;
    headerInformation << "START";
    headerInformation << "END " << compressedSize;
    
    std::ofstream ofs("ouput.data", std::ofstream::out | std::ofstream::binary | std::ofstream::app);
    bio::filtering_ostream out;
    out.push(ofs);
    out.write(headerInformation.str(), headerInformation.str().length());
    out.write(compressed, compressedSize);
    
    boost::iostreams::close(out);
    boost::iostreams::close(in);
    
    return 0;
}

a) Do I need to use filtering_stream or filtering_streambuf? I would expect for the latter to have some kind of buffer behavior already.

Either would work. The stream adds text and locale capabilities on top of the streambuf, just as in the standard library.

b) How can I close the filtering_stream(buf) and write it to a buffer?

You could use an array_sink, a back_insert_device, a memory-mapped file, etc. See https://www.boost.org/doc/libs/1_72_0/libs/iostreams/doc/ ("Models").

c) How can I read the final size of the compressed data? .tellg() is not implemented for these filtering_streams (as mentioned somewhere else on SO)

Read it from your underlying output device/stream. Don't forget to flush/close the filter layer before doing so.

d) Can you have multiple sources, i.e. my three buffers or do I need to combine them? (see below for my approach).

You can do whatever you like.

Show me the code...

I'd turn it around and have the compressing filter write to an output buffer:

using RawBuffer = std::vector<char>;
using Device = bio::back_insert_device<RawBuffer>;

RawBuffer compressed_buffer; // optionally reserve some size

{
    bio::filtering_ostream filter;
    filter.push(bio::gzip_compressor());
    filter.push(Device{ compressed_buffer });

    filter.write(reinterpret_cast<char const*>(&frame.buf1),
                 sizeof(frame) - offsetof(Frame, buf1));
}

Or, using the filtering streambuf instead:

{
    bio::filtering_ostreambuf filter;
    filter.push(bio::gzip_compressor());
    filter.push(Device{ compressed_buffer });

    std::copy_n(reinterpret_cast<char const*>(&frame.buf1),
                sizeof(frame) - offsetof(Frame, buf1),
                std::ostreambuf_iterator<char>(&filter));
}

Now the answers to your questions are obvious:

const char *compressed = compressed_buffer.data();
int compressedSize = compressed_buffer.size();

I'd reduce the remaining code to:

{
    std::ofstream ofs("output.data", std::ios::binary | std::ios::app);
    ofs << "START";
    ofs << "END " << compressed_buffer.size();
    ofs.write(compressed_buffer.data(), compressed_buffer.size());
}

Consider not reopening the output stream for each frame :)

Live Demo

Live On Coliru

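#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <algorithm>   // std::copy_n
#include <cstddef>     // offsetof
#include <cstdio>      // std::remove
#include <fstream>
#include <iterator>
#include <type_traits> // std::is_trivial_v, std::is_standard_layout_v
#include <vector>
namespace bio = boost::iostreams;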

class Frame {
private:
    /* other things */
public:
    float buf1[3];
    float buf2[3];
    float buf3[4];
    /* more things */
};

int main() {
    Frame const frames[]{
        {
            { 1, 2, 3 },
            { 4, 5, 6 },
            { 7, 8, 9, 10 },
        },
        {
            { 11, 12, 13 },
            { 14, 15, 16 },
            { 17, 18, 19, 20 },
        },
        {
            { 21, 22, 23 },
            { 24, 25, 26 },
            { 27, 28, 29, 30 },
        },
    };

    // avoiding UB:
    static_assert(std::is_trivial_v<Frame> &&
                  std::is_standard_layout_v<Frame>);

    using RawBuffer = std::vector<char>;
    using Device = bio::back_insert_device<RawBuffer>;

    std::remove("output.data");
    std::ofstream ofs("output.data", std::ios::binary | std::ios::app);

    RawBuffer compressed_buffer; // optionally reserve some size

    for (Frame const& frame : frames) {
        compressed_buffer.clear(); // clear() keeps the capacity, so the allocation is reused; don't shrink_to_fit

        {
            bio::filtering_ostreambuf filter;
            filter.push(bio::gzip_compressor());
            filter.push(Device{ compressed_buffer });

            std::copy_n(reinterpret_cast<char const*>(&frame.buf1),
                        sizeof(frame) - offsetof(Frame, buf1),
                        std::ostreambuf_iterator<char>(&filter));
        }

        ofs << "START";
        ofs << "END " << compressed_buffer.size();
        ofs.write(compressed_buffer.data(), compressed_buffer.size());
    }
}

Which deterministically produces output.data:

00000000: 5354 4152 5445 4e44 2035 301f 8b08 0000  STARTEND 50.....
00000010: 0000 0000 ff63 6068 b067 6060 7000 2220  .....c`h.g``p." 
00000020: 6e00 e205 407c 0088 1f00 3183 2303 8300  n...@|....1.#...
00000030: 102b 3802 0058 a049 af28 0000 0053 5441  .+8..X.I.(...STA
00000040: 5254 454e 4420 3438 1f8b 0800 0000 0000  RTEND 48........
00000050: 00ff 6360 3070 6460 7000 e200 204e 00e2  ..c`0pd`p... N..
00000060: 0220 6e00 e20e 209e 00c4 3380 7881 2300  . n... ...3.x.#.
00000070: 763b 7371 2800 0000 5354 4152 5445 4e44  v;sq(...STARTEND
00000080: 2034 391f 8b08 0000 0000 0000 ff63 6058   49..........c`X
00000090: e1c8 c0b0 0188 7700 f101 203e 01c4 1780  ......w... >....
000000a0: f806 103f 00e2 1740 fcc1 1100 dfb4 6cde  ...?...@......l.
000000b0: 2800 0000                                (...
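
For completeness, a parser for this layout might look like the sketch below. It uses only the standard library and assumes exactly the format shown in the dump: the text `STARTEND `, a decimal size, then that many raw bytes. `read_records` is a hypothetical name. Note that the decimal size is only unambiguous because a gzip stream starts with the byte 0x1f, which is not a digit; a fixed-width binary size field would be a more robust choice.

```cpp
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reader: returns the still-compressed payload of each record.
std::vector<std::string> read_records(std::istream& in) {
    const std::string marker = "STARTEND ";
    std::vector<std::string> records;
    std::string head(marker.size(), '\0');

    // Each record starts with the fixed marker text.
    while (in.read(&head[0], static_cast<std::streamsize>(head.size())) &&
           head == marker) {
        std::size_t size = 0;
        if (!(in >> size)) break; // stops at the first non-digit byte

        // The size from the header tells us how far to read (or skip).
        std::string payload(size, '\0');
        if (!in.read(&payload[0], static_cast<std::streamsize>(size))) break;
        records.push_back(std::move(payload));
    }
    return records;
}
```

Opening output.data as a std::ifstream in binary mode and passing it to `read_records` should yield three payloads, each of which can then be fed to a gzip_decompressor.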