GSoC Project Ideas

Browse project ideas for Google Summer of Code at OLake. Pick a Small, Medium, or Large project and discuss it with the mentors before submitting your proposal.

Looking for how to create a proposal? Make sure to read our proposal guidelines page for expectations, evaluation criteria, and where to ask questions.

Project ideas

Prometheus Metrics for OLake

Add a Prometheus-compatible /metrics endpoint so users can monitor job health, throughput, and failures.

Size: Small (~90 hours); Difficulty: Easy; Language: Go

Mentors: Vaibhav, Vikash, Akshay, Nayan

Summary

Add a Prometheus-compatible HTTP endpoint (GET /metrics) to OLake so users can monitor job health and throughput using Prometheus and Grafana.

Problem statement

Today, operators have limited, non-standardized visibility into per-job sync throughput, job success/failure counts, and request volume trends. Prometheus-style metrics are the de facto standard for production monitoring and alerting.

Goals and deliverables

  • A /metrics endpoint: can be enabled or disabled, with a configurable bind address and port; works with existing OLake deployment modes (local and containerized)
  • Core metrics: olake_rows_synced_total{job="..."}, olake_job_runs_total{job="...",status="success|failed"}, olake_requests_total{job="..."}
  • Documentation: how to enable metrics, example Prometheus scrape config, example PromQL queries (throughput, error rate, request rate)
  • Tests: /metrics returns the Prometheus text exposition format; at least one test verifies that counters increment on lifecycle events

Optional stretch: Additional metrics (to be finalized with mentors): start/finish timestamps, per-stream table counts, lag metrics for CDC.

Implementation sketch

  • Use Prometheus Go client; register counters in a small metrics package.
  • Update counters from job lifecycle events and request-handling paths.
  • Expose via existing HTTP server or dedicated metrics server (configurable).

Timeline (example)

  • Community Bonding: validate metric names/labels with mentors, locate job lifecycle hooks and request paths in code, draft docs and example dashboards.
  • Week 1–2: endpoint + scaffolding
  • Week 3–4: job lifecycle counters (rows + run status)
  • Week 5–6 (midterm): request counter + first docs + basic tests
  • Week 7–8: CI polish, more tests, docs + examples, optional stretch metrics

PostgreSQL TOAST Support in OLake

Correctly ingest UPDATE/DELETE events when pgoutput omits unchanged TOASTed values (columns sent with the 'u' marker byte instead of data).

Size: Medium (~175 hours); Difficulty: Medium; Languages: Go, Java

Mentors: Vaibhav, Vikash
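
To make the TOAST problem concrete, here is a minimal Go sketch of how the 'u' marker appears in a pgoutput TupleData payload and one possible way to handle it. The wire layout (a 16-bit column count, then one kind byte per column, with a 32-bit length and value only for 't' and 'b') follows the PostgreSQL logical replication message format. Everything else is an assumption for illustration: the Column type, ParseTupleData, MergeUnchanged, and the idea of filling gaps from a previously seen row are hypothetical and are not taken from OLake's code or the mentors' intended design.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Column kinds in a pgoutput TupleData payload.
const (
	kindNull      = 'n'
	kindUnchanged = 'u' // unchanged TOASTed value: no data follows
	kindText      = 't'
	kindBinary    = 'b'
)

// Column is a hypothetical decoded column: its kind byte plus raw data.
type Column struct {
	Kind byte
	Data []byte
}

// ParseTupleData decodes the column list of one TupleData payload.
func ParseTupleData(buf []byte) ([]Column, error) {
	if len(buf) < 2 {
		return nil, errors.New("tuple data too short for column count")
	}
	n := int(binary.BigEndian.Uint16(buf))
	buf = buf[2:]
	cols := make([]Column, 0, n)
	for i := 0; i < n; i++ {
		if len(buf) < 1 {
			return nil, errors.New("tuple data ends before column kind")
		}
		kind := buf[0]
		buf = buf[1:]
		switch kind {
		case kindNull, kindUnchanged:
			cols = append(cols, Column{Kind: kind})
		case kindText, kindBinary:
			if len(buf) < 4 {
				return nil, errors.New("tuple data ends before column length")
			}
			l := int(binary.BigEndian.Uint32(buf))
			buf = buf[4:]
			if l < 0 || l > len(buf) {
				return nil, errors.New("column length exceeds remaining data")
			}
			cols = append(cols, Column{Kind: kind, Data: buf[:l]})
			buf = buf[l:]
		default:
			return nil, fmt.Errorf("unknown column kind %q", kind)
		}
	}
	return cols, nil
}

// MergeUnchanged builds the row for an UPDATE. Columns marked 'u' are taken
// from prev when available; otherwise their names are returned in missing so
// the caller can leave them untouched instead of overwriting them with NULL.
func MergeUnchanged(names []string, cols []Column, prev map[string][]byte) (map[string][]byte, []string) {
	row := make(map[string][]byte, len(cols))
	var missing []string
	for i, c := range cols {
		if i >= len(names) {
			break
		}
		switch c.Kind {
		case kindUnchanged:
			if v, ok := prev[names[i]]; ok {
				row[names[i]] = v
			} else {
				missing = append(missing, names[i])
			}
		case kindNull:
			row[names[i]] = nil
		default:
			row[names[i]] = c.Data
		}
	}
	return row, missing
}

func main() {
	// Two columns: id sent as text "7", body sent as unchanged TOAST.
	raw := []byte{0, 2, 't', 0, 0, 0, 1, '7', 'u'}
	cols, err := ParseTupleData(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	prev := map[string][]byte{"body": []byte("large text kept from an earlier event")}
	row, missing := MergeUnchanged([]string{"id", "body"}, cols, prev)
	fmt.Printf("id=%s body=%q missing=%v\n", row["id"], row["body"], missing)
}
```

The important point is that 'u' must be distinguished from 'n': treating an unchanged TOASTed column as NULL would silently erase data in the destination.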

Apache Iceberg v3 Support with Deletion Vector–based CDC

Upgrade OLake to Iceberg v3 and implement CDC updates/deletes using deletion vectors stored in Puffin files.

Size: Large (~350 hours); Difficulty: Hard; Languages: Go, Java

Mentors: Vaibhav, Vikash, Ankit