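When you perform a large number of deletes (for example, in a queue system), the RocksDB wiki recommends using CompactOnDeletionCollector to speed up compaction and reclaim the deleted space sooner.
A comment in the RocksDB code also says that the collector "marks an SST file as need-compaction", but it is not clear when that compaction actually happens or how it is prioritized over regular compactions.
This is how I configured the collector: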
options.table_properties_collector_factories.emplace_back(
    rocksdb::NewCompactOnDeletionCollectorFactory(10000, 7500, 0.5));
However, it is not clear when the compaction is supposed to take place.
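For example, I wrote a sample program to compare the compaction stats and SST file sizes with and without the collector factory, but I saw no difference in any of them.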
#include <chrono>
#include <rocksdb/db.h>
#include <rocksdb/write_batch.h>
#include <rocksdb/options.h>
#include <rocksdb/utilities/table_properties_collectors.h>
#include <cstdlib>   // system
#include <iostream>
#include <string>    // std::string, std::to_string
using namespace std;
int main() {
    const std::string state_dir_path = "/tmp/typesense-data";
    system("rm -rf /tmp/typesense-data && mkdir -p /tmp/typesense-data");

    rocksdb::DB *db;
    rocksdb::Options options;
    rocksdb::WriteOptions write_options;

    // create the DB if it's not already present
    options.create_if_missing = true;
    options.write_buffer_size = 1*1048576;
    options.level0_file_num_compaction_trigger = 2;
    options.max_write_buffer_number = 1;

    // no difference when the following is commented out
    options.table_properties_collector_factories.emplace_back(
        rocksdb::NewCompactOnDeletionCollectorFactory(10000, 7500, 0.5));

    write_options.disableWAL = true;

    rocksdb::Status s = rocksdb::DB::Open(options, state_dir_path, &db);
    if(!s.ok()) {
        std::cout << "Error while initializing store: " << s.ToString() << std::endl;
        return 1;
    }

    for(size_t i = 0; i < 100000; i++) {
        db->Put(write_options, "RL_" + std::to_string(i), "HELLO123HELLO123HELLO123HELLO123");
    }

    std::cout << "Deleting keys..." << std::endl;
    db->DeleteRange(rocksdb::WriteOptions(), db->DefaultColumnFamily(), "RL_", "RL`");
    std::cout << "Done deleting keys..." << std::endl;

    std::string stats;
    db->GetProperty("rocksdb.stats", &stats);
    std::cout << "Stats: " << stats << std::endl;

    delete db;
    return 0;
}
Posted on 2022-08-09 17:00:03
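CompactOnDeletionCollector works on DB::Delete() operations and has no effect on DB::DeleteRange(). In addition, it is only triggered once an SST file has been created. You may need to wait for that to happen naturally, or trigger a Flush() to speed things up.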
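To make the difference between point deletes and range deletes concrete, here is a small standalone sketch of the counting rule. It is a simplified model and not RocksDB's actual implementation. It assumes the rule as I understand it from the header's description: a file is marked when at least `trigger` deletion entries are seen in any `window` consecutive entries, or when deletions reach `ratio` of all entries. `EntryKind` and `NeedsCompaction` are hypothetical names made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical entry kinds for this model; RocksDB uses its own entry types.
enum class EntryKind { Put, Delete, RangeDelete };

// Simplified model of the collector's decision for one SST file.
// Returns true when any `window` consecutive entries contain at least
// `trigger` point deletions, or when point deletions make up at least
// `ratio` of all entries (ratio <= 0 disables the ratio check).
// Range-delete entries are deliberately not counted.
bool NeedsCompaction(const std::vector<EntryKind>& entries, std::size_t window,
                     std::size_t trigger, double ratio) {
    std::size_t in_window = 0;      // point deletions inside the current window
    std::size_t total_deletes = 0;  // point deletions in the whole file
    for (std::size_t i = 0; i < entries.size(); ++i) {
        if (entries[i] == EntryKind::Delete) {
            ++in_window;
            ++total_deletes;
        }
        // Drop the entry that has just slid out of the window.
        if (i >= window && entries[i - window] == EntryKind::Delete) {
            --in_window;
        }
        if (in_window >= trigger) {
            return true;
        }
    }
    if (ratio > 0.0 && !entries.empty()) {
        return static_cast<double>(total_deletes) / entries.size() >= ratio;
    }
    return false;
}
```

With the settings from the question (10000, 7500, 0.5), a sequence of 100000 puts followed by a single range-delete entry never reaches the trigger in this model, while the same puts followed by 100000 point deletes does. In the real database the entries would also be split across several SST files by flushes, which this sketch ignores.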
https://stackoverflow.com/questions/73294390