Zero Pipeline Failures, 50% Faster Loads: How Xeno Rebuilt Their Data Foundation on OLake

· 9 min read
Merlyn Mathew
Product Manager, OLake

TL;DR

Xeno, an AI-powered customer engagement platform for retailers, ran its MySQL CDC replication on Stitch and then AWS DMS, but kept hitting broken replication whenever its schema changed, with recovery often forcing full table reloads. After ruling out Fivetran on row-based pricing, Xeno migrated its MySQL CDC pipelines to OLake, self-hosted on Kubernetes via Helm.

  • Full-load time dropped from approximately 24 hours on AWS DMS to around 13 hours on OLake, a nearly 50% reduction.
  • Schema changes no longer break pipelines or trigger full reloads.
  • OLake processes 40 to 50 GB daily, with most pipelines syncing hourly and high-priority pipelines every five minutes.
  • Every MySQL CDC pipeline that previously ran on DMS now runs on OLake; MongoDB CDC is next.

Figure: AWS DMS vs OLake, before and after

About Xeno

Xeno is an AI-powered customer engagement platform built for retailers and consumer brands. Operating across India, Xeno helps hundreds of fashion, beauty, QSR, and retail brands unify customer data, run personalized campaigns, and drive repeat purchases across WhatsApp, SMS, email, Instagram, and Facebook.

The Problem: An AWS DMS Pipeline That Couldn't Keep Up With Schema Changes

Xeno ran its MySQL CDC replication first on Stitch and then on AWS DMS, but replication kept breaking whenever the schema changed, and recovery often forced full table reloads.

The breaking point wasn't data volume. It was change.

Every fast-moving engineering team makes routine schema changes: adding a column, modifying a table structure, updating indexes. On AWS DMS, these changes frequently broke replication pipelines. What should have taken minutes of database maintenance turned into hours of incident response.

The recovery process compounded the pain:

  • No visibility into root causes. DMS offered limited logging, leaving engineers to comb through opaque errors with no clear path to resolution.
  • Full table reloads as the only fix. For larger tables, these reloads stretched past 17 hours, spiking infrastructure costs and leaving downstream dashboards displaying stale data.
  • Workarounds that created more problems. Duplicate tables, custom views, and manual interventions kept data flowing, but added fragility and complexity to an already brittle system.

The engineering team found themselves spending more time keeping pipelines alive than building the data products the business actually needed.

Figure: How a schema change breaks an AWS DMS pipeline

Evaluating the Obvious Alternatives

Fivetran was the natural first alternative to evaluate: managed, reliable, and well-regarded. It addressed the technical problems with AWS DMS. But when the team modeled future data volumes and growth, the row-based pricing became a long-term cost concern. Solving an operational problem while introducing a financial one wasn't the trade-off they were looking for.

They needed a self-hosted alternative that delivered managed-platform reliability without unpredictable scaling costs, and without handing over control of their infrastructure.

From POC to Production: Evaluating OLake as an AWS DMS Alternative

Xeno wasn't looking for another replication tool. They'd already been through two. What they needed was a platform that could handle the specific failures that had cost them months of engineering time: schema changes breaking pipelines, opaque errors with no recovery path, and full reloads that stretched into days.

Their requirements going into the OLake evaluation were clear:

  • Reliable CDC replication that survives routine schema changes
  • Visibility into pipeline failures and logs
  • Elimination of full-table reloads during recovery
  • Lower operational overhead and infrastructure costs

Getting started was unexpectedly smooth. Deploying OLake via Helm on Kubernetes took minimal effort, the UI handled most configuration without custom scripting, and the documentation was clear enough that the team moved quickly from setup to testing real workloads.

The POC wasn't without hiccups: the team hit issues with larger partitioned tables and performance tuning early on. What stood out wasn't that problems arose but how fast they were resolved. The OLake team engaged directly over Slack, diagnosed issues, and shipped fixes within one to two business days. For a team that had spent months navigating slow AWS support cycles, this was a meaningful contrast.

By the end of the POC, Xeno had confidence in both the platform and the team behind it.

The migration followed a deliberate, staged approach, transitioning MySQL workloads from AWS DMS to OLake incrementally and validating reliability and performance at each step. Today, every MySQL CDC pipeline that previously ran on DMS runs on OLake.

The results have been immediate and measurable. OLake now processes 40 to 50 GB of data daily, with most pipelines syncing hourly and high-priority pipelines ingesting every five minutes.

For the engineering team, the impact is straightforward: they've stopped managing pipelines and started building with data.

Why OLake Won as an AWS DMS Alternative

OLake addressed both sides of the equation that other tools couldn't solve simultaneously: operational reliability and cost efficiency. Four things stood out during evaluation.

1. Schema evolution that just works. The core failure mode with DMS, routine schema changes breaking pipelines, was where OLake differed most. The platform handles schema (DDL) changes alongside ongoing DML traffic, without triggering reloads or requiring manual intervention. For a team whose source systems were constantly evolving, this was the deciding factor.

2. Self-hosted, predictable costs. Unlike row-based SaaS pricing, OLake deploys on Xeno's own Kubernetes environment via Helm. There's no per-row cost, no cross-region transfer markup, and no pricing surprises as data volumes grow, giving the team cost visibility that scaled with the business, not against it.

3. Significantly faster full loads. Full-load performance improved from approximately 24 hours on AWS DMS to around 13 hours on OLake, a nearly 50% reduction. This directly cut recovery times, reduced infrastructure costs during load windows, and meant downstream dashboards waited far less for fresh data.

4. Support that moved as fast as they needed. During the POC, Xeno hit issues with partitioned tables and performance tuning. What mattered wasn't that problems arose but that the OLake team diagnosed and resolved them within one to two business days. For a team that had spent months navigating slow AWS support cycles, this responsiveness was a meaningful signal of what production support would look like.

Figure: Schema evolution, AWS DMS vs OLake

In Their Own Words

"Since moving to OLake, our MySQL pipelines have been stable for three months now. We no longer need repeated full reloads for schema changes, and the full-load performance improvement alone has been significant."

Shiv Kumar
Lead Platform Engineer, Xeno

Aman Seth
DevSecOps Lead, Xeno

What's Next

With MySQL pipelines stable, Xeno now views OLake as the strategic ingestion layer for their broader data ecosystem, not just a DMS replacement.

MongoDB migration is the immediate next step. Xeno plans to replicate the MySQL playbook for MongoDB CDC replication.

Data archival on S3 is the longer-term opportunity, using OLake not just for replication but as a foundation for cost-effective historical storage.

The question has shifted from "Can OLake handle our workloads?" to "How far can we take it?"

Conclusion

For Xeno's data team, the measure of success was never uptime percentages or load times. It was whether they could trust their pipelines enough to stop thinking about them.

Three months into production, that trust has been earned. OLake handles the complexity of a constantly evolving data environment so the engineering team doesn't have to, and that shift, more than any individual metric, is what makes it a long-term foundation for Xeno's data strategy.

Frequently Asked Questions

Q1. Why do schema changes break AWS DMS pipelines?

AWS DMS maps each source table to a fixed target structure when a task starts, and its CDC has limited support for propagating DDL. When a column is added, a table is altered, or an index changes, the source no longer matches what the task expects, and replication errors out instead of adapting. Limited logging then makes the failure hard to diagnose, and the usual way back is a full table reload, which for Xeno stretched past 17 hours on larger tables.

Q2. Is OLake a good AWS DMS alternative for MySQL CDC?

For Xeno it was. OLake handles schema (DDL) changes alongside ongoing DML traffic without triggering full reloads or manual intervention, and every MySQL CDC pipeline that previously ran on AWS DMS now runs on OLake, processing 40 to 50 GB daily.

Q3. How does OLake handle schema evolution differently from AWS DMS?

When a column is added, a table is altered, or an index changes, OLake applies the change and keeps replicating with no full reload and no manual intervention. AWS DMS often interrupts replication on those same routine changes, and recovery typically means a full table reload, which is slow and costly on large tables.
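The contrast described above can be illustrated with a toy model. This is a minimal sketch, not OLake's or AWS DMS's actual implementation: `FixedTarget`, `EvolvingTarget`, `SchemaMismatch`, and the sample events are all hypothetical names invented for illustration. It shows only the general idea that a target with a frozen column set fails on an added column, while a target that widens its schema additively keeps accepting events.

```python
from dataclasses import dataclass, field


class SchemaMismatch(Exception):
    """Raised when an event carries a column the target does not know."""


@dataclass
class FixedTarget:
    """Hypothetical target whose column set is frozen when the task starts."""
    columns: set
    rows: list = field(default_factory=list)

    def apply(self, event: dict) -> None:
        unknown = set(event) - self.columns
        if unknown:
            # Replication stops here; recovery would mean reloading the table.
            raise SchemaMismatch(f"unexpected columns: {sorted(unknown)}")
        self.rows.append(dict(event))


@dataclass
class EvolvingTarget:
    """Hypothetical target that widens its schema when new columns appear."""
    columns: list
    rows: list = field(default_factory=list)

    def apply(self, event: dict) -> None:
        for name in event:
            if name not in self.columns:
                self.columns.append(name)  # additive change, no reload
        self.rows.append(dict(event))

    def read(self) -> list:
        # Rows written before a column existed read back as None for it.
        return [{c: row.get(c) for c in self.columns} for row in self.rows]


# Illustrative change events: the second one arrives after an ADD COLUMN.
events = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com", "loyalty_tier": "gold"},
]

evolving = EvolvingTarget(columns=["id", "email"])
for e in events:
    evolving.apply(e)

fixed = FixedTarget(columns={"id", "email"})
fixed_failed = False
try:
    for e in events:
        fixed.apply(e)
except SchemaMismatch:
    fixed_failed = True
```

In this sketch the fixed target stops after the first event, while the evolving target ends up with three columns and backfills the new one with `None` for the earlier row. Real systems also have to deal with type changes, renames, and drops, which this example deliberately leaves out.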

Q4. How does OLake pricing compare to Fivetran?

Fivetran uses row-based pricing, which Xeno found difficult to forecast as data volumes grew. OLake is open-source, self-hosted on the customer's own Kubernetes environment via Helm, with no per-row cost and no cross-region transfer markup, giving more predictable costs at scale.

Q5. How much faster are full loads on OLake versus AWS DMS?

For Xeno's MySQL workloads, full-load time dropped from approximately 24 hours on AWS DMS to around 13 hours on OLake. Separately, in a benchmark moving 4.0 billion rows from PostgreSQL to Parquet files in S3, OLake completed the full load in 1 hour 59 minutes versus 9 hours 8 minutes on AWS DMS, about 4.6 times faster and roughly 78% less time. OLake sustained around 558,765 rows/sec against 122,000 rows/sec on AWS DMS.
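The ratios quoted in this article can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the article (Xeno's approximate 24-hour and 13-hour full loads, and the benchmark durations and row rates); the helper names `reduction_pct` and `speedup` are illustrative.

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage of time saved going from `before` to `after`."""
    return (before - after) / before * 100


def speedup(before: float, after: float) -> float:
    """How many times faster `after` is than `before`."""
    return before / after


# Xeno's full load, in hours (both figures are approximate in the article).
xeno_reduction = reduction_pct(24, 13)

# Benchmark durations, converted to minutes.
dms_minutes = 9 * 60 + 8      # 9 h 8 min  -> 548
olake_minutes = 1 * 60 + 59   # 1 h 59 min -> 119
bench_reduction = reduction_pct(dms_minutes, olake_minutes)
bench_speedup = speedup(dms_minutes, olake_minutes)

# Ratio of the sustained throughputs (rows/sec) reported for the benchmark.
rate_ratio = 558_765 / 122_000

print(f"Xeno full load: {xeno_reduction:.1f}% less time")
print(f"Benchmark: {bench_reduction:.1f}% less time, {bench_speedup:.1f}x faster")
print(f"Throughput ratio: {rate_ratio:.2f}x")
```

With approximate inputs of 24 and 13 hours, the Xeno reduction works out to about 46%, which the article rounds to "nearly 50%". The benchmark figures come out at about 78% less time and about 4.6 times faster, matching the numbers quoted above.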

Q6. Does OLake support MongoDB CDC as well as MySQL?

Yes. Xeno started with MySQL CDC and plans a MongoDB migration as the immediate next step, applying the same playbook used for its MySQL pipelines.