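I am trying to understand how I can limit the amount of data that is read from the internet by calling the 'read_some' function in Boost.Beast.
The starting point is the incremental read example in the Beast documentation. From the documentation I understood that the data actually read is stored in the flat_buffer. I ran the following experiment:
Since the capacity of the buffer is not enough to hold the whole page, my experiment should fail: I should not be able to read the whole page. Nevertheless, it finishes successfully. That implies there is an additional buffer where the read data is stored. But what is it used for, and how can I limit its size?
UPD: Here is my source code: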
#include <boost/beast/core.hpp>
#include <boost/beast/http.hpp>
#include <boost/beast/version.hpp>
#include <boost/asio/strand.hpp>
#include <cstdlib>
#include <cstring> // std::strcmp
#include <functional>
#include <iostream>
#include <memory>
#include <string>

namespace beast = boost::beast; // from <boost/beast.hpp>
namespace http = beast::http;   // from <boost/beast/http.hpp>
namespace net = boost::asio;    // from <boost/asio.hpp>
using namespace http;

template<
    bool isRequest,
    class SyncReadStream,
    class DynamicBuffer>
void
read_and_print_body(
    std::ostream& os,
    SyncReadStream& stream,
    DynamicBuffer& buffer,
    boost::beast::error_code& ec ) {
    parser<isRequest, buffer_body> p;
    read_header( stream, buffer, p, ec );
    if ( ec )
        return;
    while ( !p.is_done()) {
        // Hand the parser a fresh 512-byte window for each chunk of the body
        char buf[512];
        p.get().body().data = buf;
        p.get().body().size = sizeof( buf );
        read_some( stream, buffer, p, ec );
        // need_buffer only means the window is full, so it is not an error here
        if ( ec == error::need_buffer )
            ec = {};
        if ( ec )
            return;
        os.write( buf, sizeof( buf ) - p.get().body().size );
    }
}

int main(int argc, char** argv)
{
    try
    {
        // Check command line arguments.
        if(argc != 4 && argc != 5)
        {
            std::cerr <<
                "Usage: http-client-sync <host> <port> <target> [<HTTP version: 1.0 or 1.1(default)>]\n" <<
                "Example:\n" <<
                "    http-client-sync www.example.com 80 /\n" <<
                "    http-client-sync www.example.com 80 / 1.0\n";
            return EXIT_FAILURE;
        }
        auto const host = argv[1];
        auto const port = argv[2];
        auto const target = argv[3];
        int version = argc == 5 && !std::strcmp("1.0", argv[4]) ? 10 : 11;

        // The io_context is required for all I/O
        net::io_context ioc;

        // These objects perform our I/O
        boost::asio::ip::tcp::resolver resolver(ioc);
        beast::tcp_stream stream(ioc);

        // Look up the domain name
        auto const results = resolver.resolve(host, port);

        // Make the connection on the IP address we get from a lookup
        stream.connect(results);

        // Set up an HTTP GET request message
        http::request<http::string_body> req{http::verb::get, target, version};
        req.set(http::field::host, host);
        req.set(http::field::user_agent, BOOST_BEAST_VERSION_STRING);

        // Send the HTTP request to the remote host
        http::write(stream, req);

        // This buffer is used for reading and must be persisted
        beast::flat_buffer buffer;
        boost::beast::error_code ec;
        read_and_print_body<false>(std::cout, stream, buffer, ec);
    }
    catch(std::exception const& e)
    {
        std::cerr << "Error: " << e.what() << std::endl;
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}

Answered 2022-06-06 20:03:41
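The operating system's TCP stack obviously needs to buffer data, so that is most likely where the data is being buffered.
Here is a way to test the scenario you want:
Live On Coliru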
#include <boost/beast.hpp>
#include <chrono>
#include <iostream>
#include <thread>
namespace net = boost::asio;
namespace beast = boost::beast;
namespace http = beast::http;
using net::ip::tcp;

void server()
{
    net::io_context ioc;
    tcp::acceptor acc{ioc, {{}, 8989}};
    acc.listen();

    auto conn = acc.accept();

    // Announce a 20 KiB body, but send only the first 1 KiB of it
    http::request<http::string_body> msg(
        http::verb::get, "/", 11, std::string(20ull << 10, '*'));
    msg.prepare_payload();

    http::request_serializer<http::string_body> ser(msg);

    size_t hbytes = write_header(conn, ser);
    // size_t bbytes = write_some(conn, ser);
    size_t bbytes = write(conn, net::buffer(msg.body(), 1024));

    std::cout << "sent " << hbytes << " header and " << bbytes << "/"
              << msg.body().length() << " of body" << std::endl;
    // closes connection
}

namespace {
    template<bool isRequest, class SyncReadStream, class DynamicBuffer>
    auto
    read_and_print_body(
        std::ostream& /*os*/,
        SyncReadStream& stream,
        DynamicBuffer& buffer,
        boost::beast::error_code& ec)
    {
        struct { size_t hbytes = 0, bbytes = 0; } ret;

        http::parser<isRequest, http::buffer_body> p;
        //p.header_limit(8192);
        //p.body_limit(1024);

        ret.hbytes = read_header(stream, buffer, p, ec);
        if(ec)
            return ret;

        while(! p.is_done())
        {
            char buf[512];
            p.get().body().data = buf;
            p.get().body().size = sizeof(buf);
            ret.bbytes += http::read_some(stream, buffer, p, ec);
            if(ec == http::error::need_buffer)
                ec = {};
            if(ec)
                break;
            //os.write(buf, sizeof(buf) - p.get().body().size);
        }
        return ret;
    }
}

void client()
{
    net::io_context ioc;
    tcp::socket conn{ioc};
    conn.connect({{}, 8989});

    beast::error_code ec;
    beast::flat_buffer buf;
    auto [hbytes, bbytes] = read_and_print_body<true>(std::cout, conn, buf, ec);

    std::cout << "received hbytes:" << hbytes << " bbytes:" << bbytes
              << " (" << ec.message() << ")" << std::endl;
}

int main()
{
    std::jthread s(server);
    // Give the server a moment to start listening before the client connects
    std::this_thread::sleep_for(std::chrono::seconds(1));
    std::jthread c(client);
}

Prints
sent 41 header and 1024/20480 of body
received 1065 bytes of message (partial message)

Side Notes
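You start your question with:
"I am trying to understand how I can limit the amount of data that is read from the internet"
That is built in to Beast.
"by calling 'read_some' function in boost beast."
To limit the total amount of data read, you do not have to call read_some in a loop (http::read already does exactly that, by definition).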
For example, in the client/server test program above, if you replace 20ull<<10 (20 KiB) with 20ull<<20 (20 MiB), you will exceed the default size limit:
http::request<http::string_body> msg(http::verb::get, "/", 11,
    std::string(20ull << 20, '*'));

Prints (Live On Coliru)
sent 44 header and 1024/20971520 of body
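received hbytes:44 bbytes:0 (body limit exceeded)

You can also set your own parser limits:
http::parser<isRequest, http::buffer_body> p;
p.header_limit(8192);
p.body_limit(1024);

Prints (Live On Coliru)
sent 41 header and 1024/20480 of body
received hbytes:41 bbytes:0 (body limit exceeded)
As you can see, it even knows to reject the request right after reading the headers, using the Content-Length information from the headers.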
https://stackoverflow.com/questions/72519383