Kafka Log Storage (5): LogSegment
Published: 2019-03-01


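To keep a Log from growing too large, Kafka splits it into multiple log files, each of which corresponds to one LogSegment. A LogSegment wraps a FileMessageSet (the message data) and an OffsetIndex (the index over that data). The LogSegment class is implemented as follows: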

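LogSegment class structure

class LogSegment(val log: FileMessageSet,      // the file holding this segment's messages
                 val index: OffsetIndex,       // the offset index for this segment
                 val baseOffset: Long,         // lower bound on the offsets in this segment
                 val indexIntervalBytes: Int,  // approximate number of bytes between index entries
                 val rollJitterMs: Long,       // jitter used for time-based segment rolling
                 time: Time) extends Logging {

  // Bytes appended since the last index entry was added
  private var bytesSinceLastIndexEntry = 0

  // Creation time of this segment
  var created: Long = time.milliseconds
}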

The append method

def append(offset: Long, messages: ByteBufferMessageSet): Unit = {
  if (messages.sizeInBytes > 0) {
    trace("Inserting %d bytes at offset %d at position %d".format(
      messages.sizeInBytes, offset, log.sizeInBytes()))

    // Add an index entry only once more than indexIntervalBytes
    // have been written since the previous entry
    if (bytesSinceLastIndexEntry > indexIntervalBytes) {
      index.append(offset, log.sizeInBytes())
      this.bytesSinceLastIndexEntry = 0
    }

    // Append the messages to the log file
    log.append(messages)
    this.bytesSinceLastIndexEntry += messages.sizeInBytes
  }
}
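The effect of the `bytesSinceLastIndexEntry` check is that the index is sparse: only some appends produce an index entry. The following is a minimal sketch of that behavior, not Kafka code. `SparseIndexSketch` and `SparseIndexDemo` are hypothetical names, and the sketch tracks only sizes and index entries instead of real message bytes.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified stand-in for a log segment: it records sizes and
// index entries only, not the message bytes themselves.
final class SparseIndexSketch(indexIntervalBytes: Int) {
  // (offset, physical position) pairs, analogous to OffsetIndex entries
  val entries = ArrayBuffer.empty[(Long, Int)]
  private var logSizeInBytes = 0
  private var bytesSinceLastIndexEntry = 0

  def append(offset: Long, messageSizeInBytes: Int): Unit = {
    if (messageSizeInBytes > 0) {
      // Same rule as LogSegment.append: index only after more than
      // indexIntervalBytes have been written since the last entry
      if (bytesSinceLastIndexEntry > indexIntervalBytes) {
        entries += ((offset, logSizeInBytes))
        bytesSinceLastIndexEntry = 0
      }
      logSizeInBytes += messageSizeInBytes
      bytesSinceLastIndexEntry += messageSizeInBytes
    }
  }
}

object SparseIndexDemo {
  // Ten appends of 40 bytes each, at offsets 0 to 9, with a 100-byte interval
  def build(): SparseIndexSketch = {
    val segment = new SparseIndexSketch(indexIntervalBytes = 100)
    (0L until 10L).foreach(offset => segment.append(offset, 40))
    segment
  }
}
```

With these numbers, `SparseIndexDemo.build().entries` holds three entries, for offsets 3, 6 and 9 at positions 120, 240 and 360. Each entry records the file size before the triggering append, which is the position where that message starts.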

The read method

def read(startOffset: Long,
         maxOffset: Option[Long],
         maxSize: Int,
         maxPosition: Long = size): FetchDataInfo = {
  if (maxSize < 0) {
    throw new IllegalArgumentException("Invalid max size for log read (%d)".format(maxSize))
  }

  val logSize = log.sizeInBytes
  // Translate startOffset into a physical position in the log file
  val startPosition = translateOffset(startOffset)

  // The start offset is already past the end of this segment
  if (startPosition == null) {
    return null
  }

  val offsetMetadata = new LogOffsetMetadata(
    startOffset,
    this.baseOffset,
    startPosition.position
  )

  // Nothing to read if the caller asked for zero bytes
  if (maxSize == 0) {
    return FetchDataInfo(offsetMetadata, MessageSet.Empty)
  }

  // Compute the number of bytes to read
  val length = maxOffset match {
    case None =>
      // No upper offset bound: limited only by maxPosition and maxSize
      min((maxPosition - startPosition.position).toInt, maxSize)
    case Some(offset) =>
      // An upper bound below the start offset yields an empty result
      if (offset < startOffset) {
        return FetchDataInfo(offsetMetadata, MessageSet.Empty)
      }

      // Translate maxOffset into a position; fall back to the end of the log
      val mapping = translateOffset(offset, startPosition.position)
      val endPosition = if (mapping == null) {
        logSize
      } else {
        mapping.position
      }

      min(min(maxPosition, endPosition) - startPosition.position, maxSize).toInt
  }

  FetchDataInfo(offsetMetadata, log.read(startPosition.position, length))
}
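The length calculation in `read` can be isolated as a small pure function. The sketch below assumes the offsets have already been translated to byte positions. `ReadLengthSketch` and `readLength` are hypothetical names, and `endPosition` stands for the position resolved from `maxOffset` (with `None` meaning no `maxOffset` was supplied).

```scala
object ReadLengthSketch {
  // Number of bytes to read, starting at startPosition.
  // endPosition = None      -> no upper offset bound was given
  // endPosition = Some(end) -> byte position that maxOffset resolved to
  //                            (the caller passes the log size if the lookup failed)
  def readLength(startPosition: Int,
                 maxSize: Int,
                 maxPosition: Long,
                 endPosition: Option[Long]): Int =
    endPosition match {
      case None =>
        math.min((maxPosition - startPosition).toInt, maxSize)
      case Some(end) =>
        // The read may not pass maxPosition, the end position, or maxSize bytes
        math.min(math.min(maxPosition, end) - startPosition, maxSize.toLong).toInt
    }
}
```

For example, `readLength(100, 500, 1000L, Some(300L))` is 200 because the end position is the tightest bound, while `readLength(100, 500, 250L, Some(300L))` is 150 because `maxPosition` is.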

The recover method

def recover(maxMessageSize: Int): Int = {
  // Clear the index and resize it to its maximum size so it can be rebuilt
  index.truncate()
  index.resize(index.maxIndexSize)

  var validBytes = 0
  var lastIndexEntry = 0

  val iter = log.iterator(maxMessageSize)

  try {
    while (iter.hasNext) {
      val entry = iter.next
      // Validate the message; throws on a corrupt message
      entry.message.ensureValid()

      // Rebuild index entries at the same interval used by append
      if (validBytes - lastIndexEntry > indexIntervalBytes) {
        // For a compressed message, index the offset of its first inner message
        val startOffset = entry.message.compressionCodec match {
          case NoCompressionCodec =>
            entry.offset
          case _ =>
            ByteBufferMessageSet.deepIterator(entry).next().offset
        }

        index.append(startOffset, validBytes)
        lastIndexEntry = validBytes
      }

      validBytes += MessageSet.entrySize(entry.message)
    }
  } catch {
    case e: CorruptRecordException =>
      logger.warn("Found invalid messages in log segment %s at byte offset %d: %s.".format(
        log.file.getAbsolutePath, validBytes, e.getMessage))
  }

  // Drop everything after the last valid message and return the number of bytes removed
  val truncated = log.sizeInBytes - validBytes
  log.truncateTo(validBytes)
  index.trimToValidSize()
  truncated
}
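The recovery loop can be illustrated on in-memory data. This is a rough sketch under simplifying assumptions, not the Kafka implementation: `RecoverSketch` and `Entry` are hypothetical, a boolean `valid` flag stands in for the `ensureValid()` check, and compressed messages are ignored.

```scala
import scala.collection.mutable.ArrayBuffer

object RecoverSketch {
  // Hypothetical log entry: logical offset, size on disk, and whether it passes validation
  final case class Entry(offset: Long, sizeInBytes: Int, valid: Boolean)

  // Returns (rebuilt index entries, valid bytes, truncated bytes)
  def recover(entries: Seq[Entry], indexIntervalBytes: Int): (List[(Long, Int)], Int, Int) = {
    val index = ArrayBuffer.empty[(Long, Int)]
    var validBytes = 0
    var lastIndexEntry = 0
    var corrupt = false
    val iter = entries.iterator

    // Scan until the end of the log or the first invalid entry
    while (!corrupt && iter.hasNext) {
      val entry = iter.next()
      if (!entry.valid) {
        corrupt = true
      } else {
        // Rebuild a sparse index entry once enough valid bytes have been seen
        if (validBytes - lastIndexEntry > indexIntervalBytes) {
          index += ((entry.offset, validBytes))
          lastIndexEntry = validBytes
        }
        validBytes += entry.sizeInBytes
      }
    }

    // Everything after the last valid entry would be truncated
    val totalBytes = entries.map(_.sizeInBytes).sum
    (index.toList, validBytes, totalBytes - validBytes)
  }
}
```

Given five 60-byte entries at offsets 0 to 4 where the entry at offset 3 is invalid, and an interval of 100 bytes, the sketch keeps 180 valid bytes, rebuilds one index entry `(2, 120)`, and reports 120 truncated bytes. The valid entry at offset 4 is discarded too, because the scan stops at the first corrupt entry.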

Reposted from: http://gjxx.baihongyu.com/
