MongoDBStateConfigDetails

Last updated:3/4/2025|... min read

Key	Data Type	Description	Sample Value
`type`	string	Identifies the type of state stored. Typically, it is set to `"STREAM"`.	`"STREAM"`
`streams`	array	An array containing state objects for each stream.	`[ { ... }, { ... } ]`
`stream`	string	The unique identifier for the stream whose state is recorded.	`"stream_8"` or `"stream_0"`
`namespace`	string	The namespace or logical grouping the stream belongs to.	`"otter_db"`
`sync_mode`	string	Indicates the active synchronization mode for the stream. This value may be empty or contain a specific mode.	`""` (empty string) or sync modes like `"cdc"`, `"full_refresh"`, `"incremental"` (WIP)
`state`	object	Contains the resume token or offset. This token determines the point until which data has been synced.	`{ "_data": "8267B34D61000000022B0429296E1404" }`

Refer here for more about sync modes.

How It Works

Resume Token / Offset:
The value stored in the state object (in the _data field) represents the cursor token, resume token (in MongoDB), or offset (in other databases) indicating the last processed record.
Incremental Syncing:
By keeping track of the token, OLake can start the next sync run from this point, ensuring that previously processed records are not re-fetched.
Multiple Streams:
Each stream in the streams array maintains its own synchronization state. This allows OLake to handle multiple data sources or partitions independently.

Efficiency:
Incremental synchronization reduces data transfer and processing by only fetching new or changed records.
Data Consistency:
Tracking the synchronization state prevents duplicate processing, ensuring that data remains consistent.
Flexibility:
The state mechanism supports various data sources (e.g., MongoDB with resume tokens, other databases with offsets), making it adaptable to different backend systems.

Your success with OLake is our priority. Don’t hesitate to contact us if you need any help or further clarification!