
Types of Optimizations Supported in OLake

OLake provides three types of optimizations that can be performed on a table, depending on the level of optimization required.

  1. Lite – Performs a lightweight optimization by converting equality delete files into positional delete files and merging small files. Files smaller than 1/8 of the target file size are merged into new files of up to about 1/8 of that target. For example, with a 128 MB target, files under 16 MB are merged into files of roughly 16 MB.

  2. Medium – Applies all deletes and merges data files. Output files typically fall between 1/8 of the target file size and the target file size itself, depending on how much data is available to merge. For example, with a 128 MB target file size, merged files might be anywhere between 16 MB and 128 MB.

  3. Full – Performs the deepest level of optimization by rewriting data files so that they align with the configured target file size. This results in a complete copy-on-write (COW) rewrite of the table’s data files, producing an optimal file layout. Full optimization is typically used when tables have accumulated significant fragmentation or when maximum query performance is required.
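The size arithmetic behind Lite and Medium can be sketched in a few lines of Python. This is only an illustration of the 1/8 rule described above, not OLake's implementation: the function names (`small_file_threshold`, `lite_merge_candidates`, `medium_output_range`) are hypothetical, and sizes are assumed to be in bytes.

```python
MB = 1024 * 1024  # bytes per mebibyte (assumed unit for this sketch)


def small_file_threshold(target_file_size: int) -> int:
    """Size below which a file counts as 'small': 1/8 of the target."""
    return target_file_size // 8


def lite_merge_candidates(file_sizes, target_file_size):
    """Return the file sizes that Lite would consider for merging."""
    threshold = small_file_threshold(target_file_size)
    return [size for size in file_sizes if size < threshold]


def medium_output_range(target_file_size: int):
    """Return the (low, high) range in which Medium output files typically fall."""
    return (small_file_threshold(target_file_size), target_file_size)
```

With a 128 MB target, `small_file_threshold(128 * MB)` evaluates to 16 MB, which matches the worked example in the list above.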

Optimization precedence

When more than one type of optimization is due to run on a table at the same time, only the highest level runs: Full overrides Medium and Lite; Medium overrides Lite. For example, if Full, Medium, and Lite are all due together, Full runs alone; if Medium and Lite are due together, Medium runs alone.
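The precedence rule amounts to picking the maximum of an ordered set. The sketch below models it with an ordered enum; `OptimizationType` and `resolve_due` are hypothetical names chosen for illustration and are not part of OLake's API.

```python
from enum import IntEnum


class OptimizationType(IntEnum):
    # Higher value means higher precedence (assumed encoding for this sketch).
    LITE = 1
    MEDIUM = 2
    FULL = 3


def resolve_due(due):
    """Return the single optimization that runs, or None if nothing is due."""
    return max(due, default=None)


# Full, Medium, and Lite are all due together: only Full runs.
winner = resolve_due({OptimizationType.LITE, OptimizationType.MEDIUM, OptimizationType.FULL})
```

Using an `IntEnum` keeps the ordering explicit in one place, so the resolver does not need a chain of conditionals.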

Choosing the Right Optimization Type

| Optimization Type | Output | What it Does | Cost Incurred | When to Use |
| --- | --- | --- | --- | --- |
| Lite | Equality delete files are converted to positional delete files and small files are merged | Improves query engine compatibility without rewriting data files | Low | Use when the table has too many small files and you want lightweight maintenance with low compute. |
| Medium | Deletes are applied and data files are merged; output sizes fall between 1/8 of target file size and the target file size itself | Reduces fragmentation by merging data files into larger files up to the target size | Medium | Use when you need more than Lite: deletes fully applied and files merged toward the target size without a full table rewrite. |
| Full | Data files are completely rewritten into files aligned with the target file size | Performs a full copy-on-write rewrite of the table to produce the most optimal file layout | High | Use when tables are heavily fragmented or when maximum query performance and optimal file layout are required. |

