So I read this: https://thehoard.blog/how-kafkas-storage-internals-work-3a29b02e026
about Kafka's storage internals, and came up with two questions:
Posted on 2018-09-21 18:00:14
This is an interesting one!
Admittedly, my understanding of Kafka's internals is limited, but I will give it a try anyway.
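Regarding the first question:
I looked at the source code of OffsetIndex.scala. Each time a new entry is added, the offset part of the entry in the index file appears to be computed in the relativeOffset() method. To add to this, the description in the source code reads as follows: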
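An index that maps offsets to physical file locations for a particular log segment. This index may be sparse: that is it may not hold an entry for all messages in the log.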
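So, going by the reference article you shared, it is probably because of this sparse nature of the index that:
Offset look up uses binary search to find the nearest offset less than or equal to the target offset.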
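To make that lookup concrete, here is a minimal Python sketch of a "largest offset less than or equal to the target" search over a sparse index. It is only an illustration, not Kafka's code: the function name `lookup` is made up, the entries are the (offset, position) pairs from the index dump further down in this answer, and the base offset of 180 is an assumption based on the abbreviated file name.

```python
from bisect import bisect_right

# (offset, position) pairs copied from the index dump in this answer.
INDEX = [(217, 4107), (254, 8214), (291, 12321), (328, 16428),
         (365, 20535), (402, 24642), (439, 28749)]


def lookup(index, target, base_offset=180):
    """Return the entry with the largest offset <= target.

    base_offset=180 is an assumption taken from the segment file name.
    When the target is below every indexed offset, fall back to the
    start of the segment: (base_offset, 0).
    """
    offsets = [offset for offset, _ in index]
    # bisect_right performs the binary search; the slot just before the
    # insertion point holds the largest offset that is <= target.
    slot = bisect_right(offsets, target) - 1
    if slot < 0:
        return (base_offset, 0)
    return index[slot]


print(lookup(INDEX, 300))  # nearest indexed entry at or below offset 300
```

Because the index is sparse, the result is usually not the target itself, only a nearby starting point in the log file.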
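From the article's explanation, it looks as if the offsets in the index simply increase sequentially, but that is not necessarily the case. For example, I created a topic and looked at the contents of its log and index files.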
Contents of the 000---180.index file (note that the offsets here do not increase sequentially):
offset: 217 position: 4107
offset: 254 position: 8214
offset: 291 position: 12321
offset: 328 position: 16428
offset: 365 position: 20535
offset: 402 position: 24642
offset: 439 position: 28749
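Contents of the 000---180.log file (here the offsets do increase sequentially):
To go easy on the eyes, I have used three dots (...) to stand for the lines that fall between the offsets present in the index.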
offset: 217 position: 4107 CreateTime: 1537550091903 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
offset: 218 position: 4218 CreateTime: 1537550092908 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
offset: 219 position: 4329 CreateTime: 1537550093910 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
...
offset: 253 position: 8103 CreateTime: 1537550127960 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
offset: 254 position: 8214 CreateTime: 1537550128961 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
offset: 255 position: 8325 CreateTime: 1537550129962 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
...
offset: 289 position: 12099 CreateTime: 1537550164007 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
offset: 290 position: 12210 CreateTime: 1537550165008 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
offset: 291 position: 12321 CreateTime: 1537550166009 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
offset: 292 position: 12432 CreateTime: 1537550436878 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
...
offset: 327 position: 16317 CreateTime: 1537550471917 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
offset: 328 position: 16428 CreateTime: 1537550472919 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
offset: 329 position: 16539 CreateTime: 1537550473920 isvalid: true keysize: 0 valuesize: 43 magic: 2 compresscodec: NONE producerId: -1 sequence: -1 isTransactional: false headerKeys: []
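The spacing in the two dumps (an index entry every 37 offsets, 4107 bytes apart) looks consistent with the broker's `index.interval.bytes` setting, whose default is 4096. The Python sketch below is a simplified model of that behaviour, not the broker's actual append path. It assumes that the default interval was in effect, that the segment's base offset is 180, and that every record was written as its own 111-byte batch (the gap between consecutive positions in the log dump). The function name `build_index` is made up for illustration.

```python
def build_index(base_offset, record_count, record_size,
                index_interval_bytes=4096):
    """Simulate appending fixed-size, single-record batches to one segment.

    Assumptions (not taken from the Kafka source verbatim): an index entry
    is written for an incoming record only when more than
    index_interval_bytes have accumulated since the previous entry.
    """
    entries = []
    position = 0                # byte position of the next record in the log
    bytes_since_last_entry = 0  # bytes appended since the last index entry
    for offset in range(base_offset, base_offset + record_count):
        if bytes_since_last_entry > index_interval_bytes:
            entries.append((offset, position))
            bytes_since_last_entry = 0
        position += record_size
        bytes_since_last_entry += record_size
    return entries


# 260 records of 111 bytes each, covering offsets 180 through 439.
for offset, position in build_index(180, 260, 111):
    print(f"offset: {offset} position: {position}")
```

Under those assumptions, 36 records (3996 bytes) stay below the threshold and 37 records (4107 bytes) cross it, which would explain why only every 37th offset shows up in the index.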
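For the second question:
I think the example above should clarify this. Yes, the position in the index is the position in that partition's segment log file. In the case of a fetch request, once the binary search has found the nearest offset, control moves to that position in the segment log and the log is read forward from there.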
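The second half of that fetch path, reading forward from the indexed position until the requested offset turns up, can be sketched in Python as follows. This is a toy model under stated assumptions: the log is represented as a list of (offset, position) pairs rather than bytes in a file, only the records actually shown in the dump are included, and `find_record` is a hypothetical name.

```python
# (offset, position) pairs for the records shown in the log dump; the
# records hidden behind "..." are deliberately left out.
RECORDS = [(217, 4107), (218, 4218), (219, 4329),
           (253, 8103), (254, 8214), (255, 8325),
           (289, 12099), (290, 12210), (291, 12321), (292, 12432),
           (327, 16317), (328, 16428), (329, 16539)]


def find_record(records, start_position, target_offset):
    """Scan forward from start_position for the first offset >= target.

    start_position is the byte position returned by the index lookup.
    Returns an (offset, position) pair, or None if nothing qualifies.
    """
    for offset, position in records:
        if position < start_position:
            continue  # before the indexed position, never examined
        if offset >= target_offset:
            return (offset, position)
    return None


# The index entry for offset 254 points at position 8214, so a fetch for
# offset 255 starts scanning there instead of at the top of the segment.
print(find_record(RECORDS, 8214, 255))
```

The point of the index is to keep this scan short: at most one index interval's worth of log has to be read, rather than the whole segment.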
I hope this helps!
https://stackoverflow.com/questions/52338890