Features Overview (for MongoDB connector)
Native BSON Extraction:
We pull data from MongoDB in its raw BSON form and decode it on the ETL side. This preserves type fidelity and speeds up extraction.
Parallel Chunking:
Instead of serial reads, we split big collections into smaller virtual chunks to parallelize the process.
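To make the chunking idea concrete, here is a minimal sketch in Python. It assumes integer `_id` values and uses an in-memory list as a stand-in for a collection. The names `split_into_chunks`, `read_chunk`, and `parallel_read` are hypothetical and are not the connector's actual API; a real reader would issue a range query per chunk instead of filtering a list.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(min_key: int, max_key: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split the inclusive key range [min_key, max_key] into half-open chunks [lo, hi)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    lo = min_key
    while lo <= max_key:
        hi = min(lo + chunk_size, max_key + 1)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def read_chunk(collection: list[dict], bounds: tuple[int, int]) -> list[dict]:
    """Stand-in for a range query such as {_id: {$gte: lo, $lt: hi}}."""
    lo, hi = bounds
    return [doc for doc in collection if lo <= doc["_id"] < hi]


def parallel_read(collection: list[dict], chunk_size: int, workers: int = 4) -> list[dict]:
    """Read every chunk concurrently and return the documents in chunk order."""
    if not collection:
        return []
    keys = [doc["_id"] for doc in collection]
    chunks = split_into_chunks(min(keys), max(keys), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the input chunks.
        results = pool.map(lambda bounds: read_chunk(collection, bounds), chunks)
        return [doc for chunk_docs in results for doc in chunk_docs]
```

Because the chunks are non-overlapping and together cover the whole key range, each document is read exactly once regardless of how many workers run.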
CDC Cursor Preservation:
CDC offsets are never lost. Even if a large collection is added later, we take a full snapshot of it without interrupting the ongoing incremental sync.
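One way to picture this is to keep the CDC position and the per-collection snapshot status as separate pieces of state. The sketch below is illustrative only: `SyncState` and its methods are hypothetical, and the integer offset is a simplification standing in for whatever resume position the CDC stream actually uses (a MongoDB change stream, for example, exposes a resume token).

```python
from dataclasses import dataclass, field


@dataclass
class SyncState:
    """Hypothetical connector state: one CDC offset plus per-collection snapshot status."""
    cdc_offset: int = 0
    snapshot_done: dict[str, bool] = field(default_factory=dict)

    def add_collection(self, name: str) -> None:
        # A new collection starts with a pending snapshot; the CDC offset is untouched.
        self.snapshot_done.setdefault(name, False)

    def apply_cdc_event(self, offset: int) -> None:
        # Offsets only move forward, so a replayed event never rewinds the cursor.
        self.cdc_offset = max(self.cdc_offset, offset)

    def finish_snapshot(self, name: str) -> None:
        self.snapshot_done[name] = True

    def pending_snapshots(self) -> list[str]:
        return [name for name, done in self.snapshot_done.items() if not done]


state = SyncState()
state.add_collection("orders")
state.finish_snapshot("orders")
state.apply_cdc_event(120)

# A large collection is added later: its snapshot is pending,
# while incremental sync keeps advancing from the same offset.
state.add_collection("events")
state.apply_cdc_event(121)
```

Keeping the two concerns apart means adding a collection only creates snapshot work and never resets the incremental position.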
Custom Alerts:
Configurable alerts for schema changes let you address issues quickly, preventing data corruption and silent failures.
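As a rough illustration of what a schema-change check could look like, the sketch below compares the field types of an incoming document against a previously recorded schema. The functions `infer_schema` and `diff_schema` are hypothetical helpers, and using Python type names as the schema is an assumption made for brevity; how the alerts are delivered is left out.

```python
def infer_schema(doc: dict) -> dict[str, str]:
    """Map each top-level field to the name of its Python type."""
    return {key: type(value).__name__ for key, value in doc.items()}


def diff_schema(known: dict[str, str], observed: dict[str, str]) -> list[str]:
    """Return one alert message per new field or changed type."""
    alerts = []
    for field_name, type_name in observed.items():
        if field_name not in known:
            alerts.append(f"new field '{field_name}' ({type_name})")
        elif known[field_name] != type_name:
            alerts.append(
                f"type change on '{field_name}': {known[field_name]} -> {type_name}"
            )
    return alerts


known = infer_schema({"_id": 1, "price": 10})
alerts = diff_schema(known, infer_schema({"_id": 2, "price": "10", "tag": "x"}))
# alerts == ["type change on 'price': int -> str", "new field 'tag' (str)"]
```

Fields that are simply absent from a document are not flagged here, since optional fields are common in MongoDB collections.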
Open Format for Freedom:
By embracing Parquet and Iceberg, we sidestep vendor lock-in and enable multi-engine querying.
Resumable Sync:
Both the full snapshot and CDC can be paused and resumed whenever the customer requires.
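Resumability generally comes down to recording progress after each unit of work. The following sketch assumes the snapshot walks documents in `_id` order and stores the last copied `_id` in a checkpoint dictionary. `run_snapshot` and the `max_batches` parameter (used here only to simulate an interruption) are hypothetical.

```python
from typing import Optional


def run_snapshot(docs: list[dict], checkpoint: dict, batch_size: int,
                 max_batches: Optional[int] = None) -> list[dict]:
    """Copy documents past the checkpoint in _id order, updating it after each batch."""
    last_id = checkpoint.get("last_id")
    remaining = sorted(
        (d for d in docs if last_id is None or d["_id"] > last_id),
        key=lambda d: d["_id"],
    )
    copied = []
    for batch_no, start in enumerate(range(0, len(remaining), batch_size)):
        if max_batches is not None and batch_no >= max_batches:
            break  # simulate an interruption or a user-requested pause
        batch = remaining[start:start + batch_size]
        copied.extend(batch)
        # Record progress only after the whole batch has been copied.
        checkpoint["last_id"] = batch[-1]["_id"]
    return copied


docs = [{"_id": i} for i in range(1, 11)]
checkpoint: dict = {}
first_run = run_snapshot(docs, checkpoint, batch_size=3, max_batches=2)  # stops early
second_run = run_snapshot(docs, checkpoint, batch_size=3)  # resumes after the checkpoint
```

In a real deployment the checkpoint would be persisted somewhere durable; a plain dictionary is used here to keep the example self-contained.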
Snapshot Time Estimates:
We show an estimated completion time for the initial snapshot, giving you better insight into what's going on under the hood.
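A simple way to produce such an estimate is to extrapolate from the throughput observed so far. The sketch below assumes a constant copy rate and a known total document count; `estimate_remaining_seconds` is a hypothetical helper, not necessarily how the connector computes its figure.

```python
from typing import Optional


def estimate_remaining_seconds(docs_done: int, docs_total: int,
                               elapsed_seconds: float) -> Optional[float]:
    """Extrapolate remaining time from the average rate so far; None if no data yet."""
    if docs_done <= 0 or elapsed_seconds <= 0:
        return None
    rate = docs_done / elapsed_seconds  # documents per second
    return max(docs_total - docs_done, 0) / rate


# 250,000 of 1,000,000 documents copied in 50 seconds:
eta = estimate_remaining_seconds(250_000, 1_000_000, 50.0)  # 150.0 seconds
```

The estimate gets more reliable as the snapshot progresses, since early throughput can differ from the long-run average.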