Key | Data Type | Description | Sample Value
type | string | Identifies the type of state stored. For streaming replication, it is typically set to "STREAM". | "STREAM"
streams | array | An array that contains one or more stream state objects. Each object represents the replication state for a specific table or partition. | [ { ... } ]
stream | string | Within each stream object, this specifies the unique identifier (often the table name) whose state is being tracked. | "sample_table"
namespace | string | Indicates the database name or logical grouping the stream belongs to. | "main"
sync_mode | string | Represents the active synchronization mode for the stream. It may be left empty or specify a mode such as "cdc". | "" (empty string)
state | object | Contains the details required to resume replication. This nested object holds the exact point up to which data has been processed. | { "binlog_file": "mysql-bin.000003", "binlog_position": 1027, "chunks": [], "server_id": 1000 }
binlog_file | string | (Nested in state) The name of the binary log file from which replication will resume. | "mysql-bin.000003"
binlog_position | integer | (Nested in state) The specific position in the binary log file indicating where to resume. | 1027
chunks | array | (Nested in state) An array for storing chunk information, useful for managing segmented or large datasets. | []
server_id | integer | (Nested in state) The identifier of the source MySQL server that generated the binary logs, used for ensuring correct replication tracking. | 1000
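
To make the layout concrete, the sketch below assembles the sample values from the table into a single document and prints it as JSON. The nesting (stream, namespace, sync_mode, and state sitting side by side inside each entry of streams) is an assumption inferred from the descriptions above, and the variable names sample_state and state_json are illustrative only, not part of OLake.

```python
import json

# Sample values copied from the table above.
# Assumption: every stream object holds "stream", "namespace",
# "sync_mode", and a nested "state" object at the same level.
sample_state = {
    "type": "STREAM",
    "streams": [
        {
            "stream": "sample_table",
            "namespace": "main",
            "sync_mode": "",
            "state": {
                "binlog_file": "mysql-bin.000003",
                "binlog_position": 1027,
                "chunks": [],
                "server_id": 1000,
            },
        }
    ],
}

# Serialize to JSON text, the form in which such a state would be stored.
state_json = json.dumps(sample_state, indent=2)
print(state_json)
```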

How It Works

  • State Tracking:
    The type field declares the kind of state (here, a streaming state), while the streams array holds one or more stream objects. Each stream object tracks the replication state for a particular table or partition.

  • Resuming Synchronization:
    The state object inside each stream contains fields such as binlog_file and binlog_position, which tell the system exactly where to resume data replication. This prevents already synced records from being reprocessed.

  • Handling Data Chunks:
    The chunks field, although empty in this sample, can be used to manage segmented data, which is useful when handling large datasets.

  • Source Identification:
    The server_id field helps identify which MySQL server’s binary logs are being tracked, ensuring consistency in multi-server replication setups.
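
The points above can be sketched as a small reader that pulls the resume position out of each stream. This is a minimal illustration, assuming the state is stored as JSON with the nesting described in the table. ResumePoint, resume_points, and single_source are hypothetical helpers written for this example, not OLake functions.

```python
import json
from typing import List, NamedTuple


class ResumePoint(NamedTuple):
    """Where replication would resume for one stream (hypothetical helper type)."""
    namespace: str
    stream: str
    binlog_file: str
    binlog_position: int
    server_id: int
    pending_chunks: int


def resume_points(state_text: str) -> List[ResumePoint]:
    """Parse a state document and return one resume point per stream."""
    doc = json.loads(state_text)
    if doc.get("type") != "STREAM":
        raise ValueError("expected a state document of type STREAM")
    points = []
    for entry in doc.get("streams", []):
        state = entry.get("state", {})
        points.append(
            ResumePoint(
                namespace=entry.get("namespace", ""),
                stream=entry["stream"],
                binlog_file=state["binlog_file"],
                binlog_position=state["binlog_position"],
                server_id=state["server_id"],
                # Assumption: a non-empty chunks array means segmented work remains.
                pending_chunks=len(state.get("chunks", [])),
            )
        )
    return points


def single_source(points: List[ResumePoint]) -> bool:
    """True when every stream tracks binary logs from the same MySQL server."""
    return len({p.server_id for p in points}) <= 1


# Sample document built from the values in the table (nesting is assumed).
SAMPLE = """
{
  "type": "STREAM",
  "streams": [
    {
      "stream": "sample_table",
      "namespace": "main",
      "sync_mode": "",
      "state": {
        "binlog_file": "mysql-bin.000003",
        "binlog_position": 1027,
        "chunks": [],
        "server_id": 1000
      }
    }
  ]
}
"""

points = resume_points(SAMPLE)
for p in points:
    print(f"{p.namespace}.{p.stream}: resume at {p.binlog_file}:{p.binlog_position}")
print("single source server:", single_source(points))
```

Returning a typed record per stream keeps the resume position, the source server, and the pending chunk count together, so a caller can check them before replication restarts.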

For more about sync modes, refer to the sync modes documentation.


Need Assistance?

If you have any questions or uncertainties about setting up OLake, contributing to the project, or troubleshooting any issues, we’re here to help. You can:

  • Email Support: Reach out to our team at hello@olake.io for prompt assistance.
  • Join our Slack Community: We discuss the roadmap, triage bugs, help people debug the issues they are facing, and more.
  • Schedule a Call: If you prefer a one-on-one conversation, schedule a call with our CTO and team.

Your success with OLake is our priority. Don’t hesitate to contact us if you need any help or further clarification!