GSoC Project Ideas
Browse project ideas for Google Summer of Code at OLake. Pick a Small, Medium, or Large project and discuss it with the mentors before submitting your proposal.
Not sure how to write a proposal? Read our proposal guidelines page for expectations, evaluation criteria, and where to ask questions.
Project ideas
Prometheus Metrics for OLake
Add a Prometheus-compatible /metrics endpoint so users can monitor job health, throughput, and failures.
Mentors: Vaibhav, Vikash, Akshay, Nayan
Summary
Add a Prometheus-compatible HTTP endpoint (GET /metrics) to OLake so users can monitor job health and throughput using Prometheus and Grafana.
Problem statement
Today, operators have limited, non-standardized visibility into per-job sync throughput, job success/failure counts, and request volume trends. Prometheus-style metrics are the de facto standard for production monitoring and alerting.
Goals and deliverables
- A /metrics endpoint: can be enabled or disabled, with a configurable bind address and port; works with the existing OLake deployment modes (local and containerized)
- Core metrics: olake_rows_synced_total{job="..."}, olake_job_runs_total{job="...",status="success|failed"}, olake_requests_total{job="..."}
- Documentation: how to enable metrics, example Prometheus scrape config, example PromQL queries (throughput, error rate, request rate)
- Tests: /metrics returns the Prometheus text exposition format; at least one test verifies that counters increment on lifecycle events
Optional stretch: Additional metrics (to be finalized with mentors): start/finish timestamps, per-stream table counts, lag metrics for CDC.
Implementation sketch
- Use the Prometheus Go client; register the counters in a small metrics package.
- Update the counters from job lifecycle events and request-handling paths.
- Expose them via the existing HTTP server or a dedicated metrics server (configurable).
Timeline (example)
- Community Bonding: validate metric names/labels with mentors, locate job lifecycle hooks and request paths in code, draft docs and example dashboards.
- Week 1–2: endpoint + scaffolding
- Week 3–4: job lifecycle counters (rows + run status)
- Week 5–6 (midterm): request counter + first docs + basic tests
- Week 7–8: CI polish, more tests, docs + examples, optional stretch metrics
PostgreSQL TOAST Support in OLake
Correctly ingest UPDATE/DELETE events when pgoutput omits unchanged TOASTed values (sent as the column marker byte 'u').
Mentors: Vaibhav, Vikash
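To illustrate the 'u' marker in the TOAST project: in a pgoutput TupleData section each column starts with a kind byte, where 'n' means null, 'u' means an unchanged TOASTed value with no bytes following, and 't' or 'b' is followed by a 4-byte length and the value. The sketch below decodes such a section and shows one possible way to handle 'u' columns, by copying the value from a previously known version of the row. The `Column`, `parseTupleData`, and `fillUnchanged` names are hypothetical, and the merge strategy is an assumption for illustration, not the approach chosen for OLake.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Column kind bytes used in a pgoutput TupleData section.
const (
	kindNull      byte = 'n'
	kindUnchanged byte = 'u' // unchanged TOASTed value; no data follows
	kindText      byte = 't'
	kindBinary    byte = 'b'
)

// Column holds one decoded column; Data is nil for 'n' and 'u'.
type Column struct {
	Kind byte
	Data []byte
}

var errShort = errors.New("tuple data truncated")

// parseTupleData decodes: Int16 column count, then per column a kind byte
// and, for 't'/'b', an Int32 length followed by that many bytes.
func parseTupleData(buf []byte) ([]Column, error) {
	if len(buf) < 2 {
		return nil, errShort
	}
	n := int(binary.BigEndian.Uint16(buf))
	buf = buf[2:]
	cols := make([]Column, 0, n)
	for i := 0; i < n; i++ {
		if len(buf) < 1 {
			return nil, errShort
		}
		kind := buf[0]
		buf = buf[1:]
		switch kind {
		case kindNull, kindUnchanged:
			cols = append(cols, Column{Kind: kind})
		case kindText, kindBinary:
			if len(buf) < 4 {
				return nil, errShort
			}
			size := int(binary.BigEndian.Uint32(buf))
			buf = buf[4:]
			if size < 0 || size > len(buf) {
				return nil, errShort
			}
			cols = append(cols, Column{Kind: kind, Data: buf[:size]})
			buf = buf[size:]
		default:
			return nil, fmt.Errorf("unknown column kind %q", kind)
		}
	}
	return cols, nil
}

// fillUnchanged replaces every 'u' column in next with the column at the
// same position in prev (assumed to be the last known version of the row).
func fillUnchanged(prev, next []Column) ([]Column, error) {
	if len(prev) != len(next) {
		return nil, errors.New("column count mismatch")
	}
	out := make([]Column, len(next))
	for i, c := range next {
		if c.Kind == kindUnchanged {
			c = prev[i]
		}
		out[i] = c
	}
	return out, nil
}

func main() {
	// Two columns: id = "7" as text, payload = unchanged TOASTed value.
	msg := []byte{0, 2, 't', 0, 0, 0, 1, '7', 'u'}
	next, err := parseTupleData(msg)
	if err != nil {
		panic(err)
	}
	prev := []Column{
		{Kind: kindText, Data: []byte("7")},
		{Kind: kindText, Data: []byte("large json document")},
	}
	merged, err := fillUnchanged(prev, next)
	if err != nil {
		panic(err)
	}
	for i, c := range merged {
		fmt.Printf("col %d kind=%c data=%q\n", i, c.Kind, c.Data)
	}
}
```

Treating 'u' as a distinct kind rather than as null is the key point: writing null for such a column would silently erase a value that never changed at the source.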
Apache Iceberg v3 Support with Deletion Vector–based CDC
Upgrade OLake to Iceberg v3 and implement CDC updates/deletes using deletion vectors stored in Puffin files.
Mentors: Vaibhav, Vikash, Ankit