
Boost Beast: Read Content by Portions

Stack Overflow user

Asked 2022-06-06 14:45:12

Answers: 1, Views: 314, Followers: 0, Votes: 1

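I am trying to understand how to limit the amount of data read from the network when calling the 'read_some' function in Boost.Beast.

The starting point is the incremental read example in the Beast documentation. From the docs I understood that the data actually read is stored in the flat_buffer. I ran the following experiment:

Set the flat_buffer's maximum size to 1024
Connect to a relatively large (several KB) HTML page
Call read_some once
Turn the internet connection off
Try to read the page to the end

Since the buffer's capacity is not large enough to hold the entire page, the experiment should fail: I should not be able to read the whole page. Nevertheless, it finishes successfully. That means there is some additional buffer where the read data is stored. But what is it for, and how can I limit its size?

UPD: here is my source code: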

#include <boost/beast/core.hpp>
#include <boost/beast/http.hpp>
#include <boost/beast/version.hpp>
#include <boost/asio/strand.hpp>
#include <cstdlib>
#include <cstring>   // std::strcmp
#include <functional>
#include <iostream>
#include <memory>
#include <string>

namespace beast = boost::beast;         // from <boost/beast.hpp>
namespace http = beast::http;           // from <boost/beast/http.hpp>
namespace net = boost::asio;            // from <boost/asio.hpp>

using namespace http;

template<
        bool isRequest,
        class SyncReadStream,
        class DynamicBuffer>
void
read_and_print_body(
        std::ostream& os,
        SyncReadStream& stream,
        DynamicBuffer& buffer,
        boost::beast::error_code& ec ) {
    parser<isRequest, buffer_body> p;
    read_header( stream, buffer, p, ec );
    if ( ec )
        return;
    while ( !p.is_done()) {
        char buf[512];
        p.get().body().data = buf;
        p.get().body().size = sizeof( buf );
        read_some( stream, buffer, p, ec );
        if ( ec == error::need_buffer )
            ec = {};
        if ( ec )
            return;
        os.write( buf, sizeof( buf ) - p.get().body().size );
    }
}

int main(int argc, char** argv)
{
    try
    {
        // Check command line arguments.
        if(argc != 4 && argc != 5)
        {
            std::cerr <<
            "Usage: http-client-sync <host> <port> <target> [<HTTP version: 1.0 or 1.1(default)>]\n" <<
            "Example:\n" <<
            "    http-client-sync www.example.com 80 /\n" <<
            "    http-client-sync www.example.com 80 / 1.0\n";
            return EXIT_FAILURE;
        }
        auto const host = argv[1];
        auto const port = argv[2];
        auto const target = argv[3];
        int version = argc == 5 && !std::strcmp("1.0", argv[4]) ? 10 : 11;

        // The io_context is required for all I/O
        net::io_context ioc;

        // These objects perform our I/O
        boost::asio::ip::tcp::resolver resolver(ioc);
        beast::tcp_stream stream(ioc);

        // Look up the domain name
        auto const results = resolver.resolve(host, port);

        // Make the connection on the IP address we get from a lookup
        stream.connect(results);

        // Set up an HTTP GET request message
        http::request<http::string_body> req{http::verb::get, target, version};
        req.set(http::field::host, host);
        req.set(http::field::user_agent, BOOST_BEAST_VERSION_STRING);

        // Send the HTTP request to the remote host
        http::write(stream, req);

        // This buffer is used for reading and must be persisted
        beast::flat_buffer buffer;

        boost::beast::error_code ec;
        read_and_print_body<false>(std::cout, stream, buffer, ec);
    }
    catch(std::exception const& e)
    {
        std::cerr << "Error: " << e.what() << std::endl;
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}

Tags: c++, http, boost, boost-beast

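1 Answer

Stack Overflow user

Accepted answer

Answered 2022-06-06 20:03:41

The operating system's TCP stack obviously needs to buffer data, so that is most likely where the rest of the page was buffered.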

A way to test the scenario you describe:

Live On Coliru

#include <boost/beast.hpp>
#include <iostream>
#include <chrono>
#include <thread>
namespace net = boost::asio;
namespace beast = boost::beast;
namespace http = beast::http;
using net::ip::tcp;

void server()
{
    net::io_context ioc;
    tcp::acceptor acc{ioc, {{}, 8989}};
    acc.listen();

    auto conn = acc.accept();

    http::request<http::string_body> msg(
        http::verb::get, "/", 11, std::string(20ull << 10, '*'));
    msg.prepare_payload();

    http::request_serializer<http::string_body> ser(msg);

    size_t hbytes = write_header(conn, ser);
    // size_t bbytes = write_some(conn, ser);
    size_t bbytes = write(conn, net::buffer(msg.body(), 1024));

    std::cout << "sent " << hbytes << " header and " << bbytes << "/"
              << msg.body().length() << " of body" << std::endl;
    // closes connection
}

namespace {
    template<bool isRequest, class SyncReadStream, class DynamicBuffer>
        auto
        read_and_print_body(
                std::ostream& /*os*/,
                SyncReadStream& stream,
                DynamicBuffer& buffer,
                boost::beast::error_code& ec)
        {
            struct { size_t hbytes = 0, bbytes = 0; } ret;

            http::parser<isRequest, http::buffer_body> p;
            //p.header_limit(8192);
            //p.body_limit(1024);

            ret.hbytes = read_header(stream, buffer, p, ec);
            if(ec)
                return ret;
            while(! p.is_done())
            {
                char buf[512];
                p.get().body().data = buf;
                p.get().body().size = sizeof(buf);
                ret.bbytes += http::read_some(stream, buffer, p, ec);
                if(ec == http::error::need_buffer)
                    ec = {};
                if(ec)
                    break;
                //os.write(buf, sizeof(buf) - p.get().body().size);
            }
            return ret;
        }
}

void client()
{
    net::io_context ioc;
    tcp::socket conn{ioc};
    conn.connect({{}, 8989});

    beast::error_code ec;
    beast::flat_buffer buf;
    auto [hbytes, bbytes] = read_and_print_body<true>(std::cout, conn, buf, ec);

    std::cout << "received hbytes:" << hbytes << " bbytes:" << bbytes
              << " (" << ec.message() << ")" << std::endl;
}

int main()
{
    std::jthread s(server);

    std::this_thread::sleep_for(std::chrono::seconds(1));
    std::jthread c(client);
}

Prints

sent 41 header and 1024/20480 of body
received 1065 bytes of message (partial message)

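Side Note

You start your question with:

"I am trying to understand how to limit the amount of data read from the network"

That is built into Beast.

"when calling the 'read_some' function in Boost.Beast."

To limit the total amount of data read, you don't have to use read_some in a loop (http::read already does exactly that, by definition).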

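For example, using the sample above, if you replace 20ull << 10 (20 KiB) with 20ull << 20 (20 MiB), you will exceed the default body size limit (1 MiB for requests, which is what the client parses here):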

http::request<http::string_body> msg(http::verb::get, "/", 11,
                                     std::string(20ull << 20, '*'));

Prints (Live On Coliru)

sent 44 header and 1024/20971520 of body
received hbytes:44 bbytes:0 (body limit exceeded)

You can also set your own parser limits:

http::parser<isRequest, http::buffer_body> p;
p.header_limit(8192);
p.body_limit(1024);

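Prints (Live On Coliru)

sent 41 header and 1024/20480 of body
received hbytes:41 bbytes:0 (body limit exceeded)

As you can see, it even knows to reject the request right after reading the headers, using the Content-Length information from the headers.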

Votes: 1

Page content provided by Stack Overflow; translation support by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/72519383
