
Public namespace Rowles.LeanCorpus.Codecs.TermVectors

Classes

Internal class TermVectorsReader

Reads per-document term vectors from .tvd/.tvx files using memory-mapped I/O.
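As a rough illustration of how an offset index supports random access through a memory map, the Python sketch below looks up one document's record. It is not the library's .NET implementation. The function name `read_field_count`, the little-endian byte order, and the demo file names are assumptions made for this example. The demo records hold only a fieldCount followed by placeholder bytes, not real field data.

```python
import mmap
import os
import struct
import tempfile


def read_field_count(tvx_path, tvd_path, doc_id):
    """Return the fieldCount at the start of doc_id's record in the .tvd file.

    Assumes little-endian integers: .tvx = [docCount:int32][int64 offsets...].
    """
    with open(tvx_path, "rb") as fx, open(tvd_path, "rb") as fd:
        with mmap.mmap(fx.fileno(), 0, access=mmap.ACCESS_READ) as tvx:
            with mmap.mmap(fd.fileno(), 0, access=mmap.ACCESS_READ) as tvd:
                (doc_count,) = struct.unpack_from("<i", tvx, 0)
                if not 0 <= doc_id < doc_count:
                    raise IndexError(f"doc_id {doc_id} out of range")
                # Offsets start after the 4-byte docCount, 8 bytes per document.
                (offset,) = struct.unpack_from("<q", tvx, 4 + 8 * doc_id)
                (field_count,) = struct.unpack_from("<i", tvd, offset)
                return field_count


# Demo: two hypothetical records; the zero bytes are placeholders, not field data.
with tempfile.TemporaryDirectory() as tmp:
    tvx_path = os.path.join(tmp, "demo.tvx")
    tvd_path = os.path.join(tmp, "demo.tvd")
    records = [struct.pack("<i", 3) + b"\x00" * 5, struct.pack("<i", 1)]
    offsets = [0, len(records[0])]
    with open(tvd_path, "wb") as f:
        f.write(b"".join(records))
    with open(tvx_path, "wb") as f:
        f.write(struct.pack("<i", len(offsets)))
        f.write(struct.pack("<2q", *offsets))
    counts = [read_field_count(tvx_path, tvd_path, d) for d in range(2)]
```

Because each offset sits at a fixed position in .tvx, finding a document costs one index read and one jump into .tvd, with no scan over earlier documents.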

Internal class TermVectorsStreamWriter

Streaming variant of TermVectorsWriter for the merge path. Per-document term vectors are appended directly to .tvd, and .tvx is written on dispose. Only the offsets list is buffered in memory (8 bytes per document).
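The streaming pattern described above can be sketched in Python as follows. This is an illustration under assumptions, not the library's code: the class name `StreamingTermVectorSketch`, its methods, and the little-endian byte order are hypothetical, and each document is passed in as already-encoded bytes.

```python
import io
import struct


class StreamingTermVectorSketch:
    """Appends encoded documents to a .tvd stream; writes .tvx on close."""

    def __init__(self, tvd, tvx):
        self._tvd = tvd
        self._tvx = tvx
        self._offsets = []  # the only buffered state: one int64 per document
        self._position = 0  # number of bytes written to .tvd so far

    def add_document(self, encoded_doc):
        # Record where this document starts, then append it immediately.
        self._offsets.append(self._position)
        self._tvd.write(encoded_doc)
        self._position += len(encoded_doc)

    def close(self):
        # .tvx layout: [docCount:int32] followed by one int64 offset per doc.
        count = len(self._offsets)
        self._tvx.write(struct.pack("<i", count))
        self._tvx.write(struct.pack(f"<{count}q", *self._offsets))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


# Demo with in-memory streams and opaque placeholder payloads.
tvd_stream, tvx_stream = io.BytesIO(), io.BytesIO()
with StreamingTermVectorSketch(tvd_stream, tvx_stream) as writer:
    writer.add_document(b"abc")
    writer.add_document(b"de")
```

Memory use grows only with the number of documents, since the document payloads never accumulate in the writer.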

Internal class TermVectorsWriter

Writes per-document term vectors to .tvd (data) and .tvx (offset index) files.

Format:

.tvx: [docCount:int32] [long[] offsets into .tvd]
.tvd, per doc: [fieldCount:int32]
    per field: [fieldName:string] [termCount:int32]
        per term: [term:string] [freq:int32] [posCount:int32] [positions:int32[]]
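To make the per-document .tvd layout concrete, here is a minimal Python round trip of a single record. It is a sketch, not the library's .NET implementation. The format above does not say how strings are encoded, so this example assumes a 4-byte length prefix followed by UTF-8 bytes, and it assumes little-endian integers. The helper names (`encode_doc`, `decode_doc`) are hypothetical.

```python
import io
import struct


def _write_string(out, text):
    # Assumed string encoding: int32 byte length, then UTF-8 bytes.
    data = text.encode("utf-8")
    out.write(struct.pack("<i", len(data)))
    out.write(data)


def _read_string(inp):
    (length,) = struct.unpack("<i", inp.read(4))
    return inp.read(length).decode("utf-8")


def encode_doc(fields):
    """fields maps a field name to a list of (term, freq, positions) tuples."""
    out = io.BytesIO()
    out.write(struct.pack("<i", len(fields)))            # fieldCount
    for name, terms in fields.items():
        _write_string(out, name)                         # fieldName
        out.write(struct.pack("<i", len(terms)))         # termCount
        for term, freq, positions in terms:
            _write_string(out, term)                     # term
            out.write(struct.pack("<ii", freq, len(positions)))  # freq, posCount
            out.write(struct.pack(f"<{len(positions)}i", *positions))
    return out.getvalue()


def decode_doc(data):
    inp = io.BytesIO(data)
    (field_count,) = struct.unpack("<i", inp.read(4))
    fields = {}
    for _ in range(field_count):
        name = _read_string(inp)
        (term_count,) = struct.unpack("<i", inp.read(4))
        terms = []
        for _ in range(term_count):
            term = _read_string(inp)
            freq, pos_count = struct.unpack("<ii", inp.read(8))
            positions = list(struct.unpack(f"<{pos_count}i", inp.read(4 * pos_count)))
            terms.append((term, freq, positions))
        fields[name] = terms
    return fields


doc = {"body": [("lean", 2, [0, 7]), ("corpus", 1, [1])]}
encoded = encode_doc(doc)
decoded = decode_doc(encoded)
```

Every variable-length part is preceded by its count, so a reader can walk the record front to back without delimiters or backtracking.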

Structs

Public struct TermVectorEntry

A single term vector entry: term text, frequency in the document, and positions.