Rowles.LeanCorpus.Codecs.Postings
Classes
BlockPostingsWriter
Writes postings in packed block format (v3). Doc IDs and frequencies are written in 128-int delta-encoded packed blocks, with VarInt encoding for the tail (the remaining values, fewer than 128). Skip data is emitted after every block for efficient Advance().
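The block layout described above can be sketched as follows. This is an illustrative Python sketch, not the library's C# implementation: the names `to_deltas` and `split_blocks` are hypothetical, and two details are assumptions that the description does not confirm, namely that deltas run continuously across block boundaries and that the skip entry for a block is its last doc ID.

```python
BLOCK_SIZE = 128  # block size stated in the description


def to_deltas(doc_ids):
    """Convert increasing doc IDs to gaps; the first gap is relative to 0."""
    deltas = []
    prev = 0
    for doc_id in doc_ids:
        deltas.append(doc_id - prev)
        prev = doc_id
    return deltas


def split_blocks(doc_ids):
    """Split a postings list into full 128-value blocks plus a tail.

    Returns (blocks, tail, skips). Full blocks would be bit-packed, the tail
    would be VarInt-encoded, and skips holds one entry per full block
    (assumed here to be the last doc ID in that block).
    """
    deltas = to_deltas(doc_ids)
    full = len(deltas) // BLOCK_SIZE * BLOCK_SIZE
    blocks = [deltas[i:i + BLOCK_SIZE] for i in range(0, full, BLOCK_SIZE)]
    tail = deltas[full:]
    skips = [doc_ids[i + BLOCK_SIZE - 1] for i in range(0, full, BLOCK_SIZE)]
    return blocks, tail, skips


doc_ids = list(range(0, 600, 2))  # 300 doc IDs
blocks, tail, skips = split_blocks(doc_ids)
print(len(blocks), len(tail), skips)
```

With 300 postings, two full blocks are produced and the remaining 44 values fall into the tail. A reader advancing to a target doc ID could compare it against each skip entry and decode only the block that may contain it.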
PackedIntCodec
Frame-of-Reference bit-packing codec for blocks of 128 integers. Packs values using the minimum number of bits needed for the largest value in the block.
Output format: [numBits : 1 byte][packed data : numBits × 16 bytes]. When numBits is 0 (all values are zero), the output is a single byte.
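The size in the output format follows from the block size: 128 values at numBits bits each occupy 128 × numBits bits, which is numBits × 16 bytes. A minimal Python sketch of that layout is shown below. The function names are hypothetical, and the bit order inside the packed data (value 0 in the lowest bits, little-endian bytes) is an assumption; the actual codec may order bits differently.

```python
BLOCK_SIZE = 128


def pack_block(values):
    """Pack 128 non-negative ints as [numBits : 1 byte][numBits * 16 bytes]."""
    if len(values) != BLOCK_SIZE:
        raise ValueError("expected exactly 128 values")
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    # Minimum width that fits the largest value in the block.
    num_bits = max(v.bit_length() for v in values)
    if num_bits == 0:
        return bytes([0])  # all zeros: header byte only
    acc = 0
    for i, v in enumerate(values):
        acc |= v << (i * num_bits)  # assumed bit order
    return bytes([num_bits]) + acc.to_bytes(num_bits * 16, "little")


def unpack_block(data):
    """Inverse of pack_block."""
    num_bits = data[0]
    if num_bits == 0:
        return [0] * BLOCK_SIZE
    acc = int.from_bytes(data[1:1 + num_bits * 16], "little")
    mask = (1 << num_bits) - 1
    return [(acc >> (i * num_bits)) & mask for i in range(BLOCK_SIZE)]


values = [i % 7 for i in range(BLOCK_SIZE)]  # largest value is 6, so 3 bits
packed = pack_block(values)
print(len(packed))
```

Here the largest value needs 3 bits, so the packed block is 1 + 3 × 16 = 49 bytes, compared with 512 bytes for 128 raw 32-bit integers.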
PostingsReader
Reads delta-encoded postings lists written by PostingsWriter.
PostingsWriter
Writes delta-encoded postings lists for a given term. Deltas are encoded as variable-length integers (VarInt/LEB128) for compactness.
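As an illustration of why delta plus VarInt encoding is compact, the Python sketch below encodes the gaps between consecutive doc IDs as unsigned LEB128. It covers only the doc ID stream; how PostingsWriter interleaves frequencies and positions is not described here, so that part is left out. The function names are hypothetical.

```python
def encode_varint(n):
    """Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last."""
    if n < 0:
        raise ValueError("n must be non-negative")
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)
            return bytes(out)


def encode_postings(doc_ids):
    """Encode increasing doc IDs as VarInt gaps (first gap relative to 0)."""
    out = bytearray()
    prev = 0
    for doc_id in doc_ids:
        out += encode_varint(doc_id - prev)
        prev = doc_id
    return bytes(out)


def decode_postings(data):
    """Inverse of encode_postings."""
    doc_ids = []
    value = shift = doc_id = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            doc_id += value  # undo the delta
            doc_ids.append(doc_id)
            value = shift = 0
    return doc_ids


encoded = encode_postings([5, 130, 131, 1000])
print(len(encoded), decode_postings(encoded))
```

The four doc IDs encode to 5 bytes, because every gap except the last fits in a single byte; stored as fixed 32-bit integers they would take 16 bytes.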
StreamingPostingsMerger
Streaming k-way merge of per-segment postings into a single merged segment. Iterates terms in sorted order across all source segments without ever materialising the full term-doc map in memory. Per-term position data is the only buffered state, bounded by the document frequency of one term.
StreamingPostingsMerger.Source
One source segment for the merge. The DocIdMap maps source local doc IDs to merged doc IDs; entries containing -1 are dropped (deleted).
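The merge and the doc ID remapping can be sketched together. The Python below is a simplified model, not the library's API: each source is a hypothetical `(terms, doc_id_map)` pair, where `terms` is a list of `(term, local_doc_ids)` already sorted by term, and positions are omitted. Skipping a term whose postings are all deleted is also an assumption.

```python
import heapq


def merge_sources(sources):
    """Yield (term, merged_doc_ids) in term order across all sources.

    Only the postings of the term currently being merged are buffered.
    """
    def stream(index, terms):
        for term, local_ids in terms:
            yield term, index, local_ids

    streams = [stream(i, terms) for i, (terms, _) in enumerate(sources)]
    # heapq.merge consumes the sorted streams lazily (k-way merge).
    merged = heapq.merge(*streams, key=lambda item: (item[0], item[1]))

    current, buffered = None, []
    for term, index, local_ids in merged:
        if term != current:
            if buffered:
                yield current, sorted(buffered)
            current, buffered = term, []
        doc_id_map = sources[index][1]
        for local_id in local_ids:
            merged_id = doc_id_map[local_id]
            if merged_id != -1:  # -1 marks a deleted document
                buffered.append(merged_id)
    if buffered:
        yield current, sorted(buffered)


segment_a = ([("apple", [0, 1]), ("cat", [1])], [0, -1])  # local doc 1 deleted
segment_b = ([("apple", [0]), ("bird", [0, 1])], [1, 2])
result = list(merge_sources([segment_a, segment_b]))
print(result)
```

In this example the only posting for "cat" belongs to a deleted document, so the term does not appear in the merged output, while "apple" combines surviving postings from both segments.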
Structs
BlockPostingsEnum
Block-at-a-time postings iterator (v3 format). Reads packed blocks of 128 doc IDs written by BlockPostingsWriter. Only the current block is decoded, keeping memory at a constant ~1 KB (2 × 128 ints) regardless of postings list length.
PostingsEnum
Forward-only cursor over a postings list. Decodes doc IDs and frequencies once into ArrayPool-rented buffers, then yields (DocId, Freq) pairs via MoveNext(). Optionally decodes positions when created via CreateWithPositions(IndexInput, long, byte).
Lifetime contract: When using the lazy position path, this struct holds a raw byte* pointer into a memory-mapped IndexInput. The source input (_sourceInput) must remain open and un-disposed for the entire lifetime of this PostingsEnum. Callers must not dispose the IndexInput while any PostingsEnum referencing it is still alive.
StreamingPostingsMerger.Result
Result of a streaming merge: the sorted term list and the per-term .pos offsets needed to write the .dic file.
TermPostingMetadata
Metadata returned by FinishTerm() for storage in the term dictionary.