About Me
With over a decade in data engineering, I specialize in architecting, developing, and optimizing scalable end-to-end data platforms. My expertise spans cloud-native platforms such as Snowflake, automated ELT/ETL pipelines built with Python, dbt, and Airflow, and Generative AI integration for automating complex data enrichment. I am passionate about delivering systems that enhance data quality, reduce operational costs, and drive powerful business intelligence.
Technical Skills
Cloud & DevOps
Data Warehousing & Databases
Data Processing & Big Data
Programming & Frameworks
Core Concepts
Data Ingestion & Integration
Professional Experience
GenAI-Powered Audience Analytics Platform for Tableau
- Architected a backend data platform for a Tableau Extension, enabling GenAI-driven audience segmentation directly within dashboards.
- Engineered a Python (FastAPI) backend to translate dashboard interactions into on-the-fly ELT jobs in Snowflake, reducing time-to-insight from days to under a minute.
- Integrated Google Gemini Pro to provide natural language summaries and analysis of user-defined audience segments.
- Built a dynamic SQL generation engine to construct complex queries automatically, enabling open-ended, user-defined persona creation.
- Optimized performance by 90% by pre-calculating key metrics in Snowflake, ensuring low-latency API responses.
Unified Nielsen Media Analytics Platform
- Architected a centralized ELT platform to unify disparate Nielsen media datasets in Snowflake.
- Engineered an automated ingestion framework (Python/Shell) for 500+ GB/month from SFTP, S3, and APIs.
- Designed a layered Snowflake architecture with dbt, improving BI dashboard query performance by 300%.
- Developed a dynamic schema generation engine, reducing manual DDL maintenance by over 20 hours/month.
AI-Powered Persona Generation Platform
- Built an end-to-end pipeline automating marketing persona creation by enriching Snowflake data with GenAI models.
- Orchestrated an ETL pipeline integrating with LLM APIs (Vertex AI) to generate thousands of personas, saving 100+ hours of manual work per quarter.
- Engineered a fault-tolerant batch system with checkpointing, achieving a 99.9% completion rate for over 1 million records.
Reverse ETL for AdTech Platform Sync
- Built a reverse ETL pipeline to sync audience segments from Snowflake to The Trade Desk (TTD) via REST API.
- Automated delivery of 20,000+ segments, reducing data latency by 95% for near real-time campaigns.
- Engineered a scalable, config-driven design for 7+ markets, enabling rapid onboarding with zero code changes.
- Implemented resilient API integration with exponential backoff, achieving a 99.95% data delivery success rate.
Athlete Performance & Analytics Platform
- Architected a data platform to analyze athlete data from unstructured and semi-structured sources (HTML, CSV).
- Developed an ETL pipeline to parse over 10 million performance records from web-scraped files and APIs.
- Engineered an entity resolution algorithm that creates unified profiles, improving data accuracy by over 80%.
- Built optimized MongoDB aggregation pipelines, reducing public dashboard query latency from minutes to under 2 seconds.
Automated Cloud & On-Prem Infrastructure (IaC)
- Designed an IaC framework using Ansible and Terraform to automate MongoDB and ELK cluster deployments.
- Reduced new cluster deployment time from 3 days to under 2 hours with a fully automated Ansible playbook.
- Integrated Terraform and Ansible into a robust CI/CD pipeline for infrastructure.
- Optimized database performance via automated system-level tuning, improving query throughput by 25%.
Self-Service Natural Language Reporting Platform
- Built a Streamlit app using GenAI (Gemini) to translate natural language queries into SQL for Snowflake.
- Enabled non-technical users to query data in plain English, increasing data accessibility across the business by 400%.
- Developed a self-service schema discovery feature, reducing ad-hoc query requests to the data team.
Presentations
Making GenAI Trustworthy at Nielsen
Nielsen Global Data Science Annual Conference 2025
Presented a novel approach to Generative AI implementation, highlighting a Tableau extension that addresses key challenges of AI accuracy and data overload for enterprise users.
Project Leadership & Facilitation
Project Lead & Facilitator, Nielsen
August - September 2025
- Spearheaded a workshop to demystify generative AI for non-technical sales teams.
- Designed and executed a 3-day virtual/in-person workshop for over 30 professionals across ANZ.
- Developed a curriculum on advanced prompting techniques tailored for a non-technical audience.
- Guided 9 cross-functional teams to build and present AI-powered solutions to real business problems.
- Shifted team mindset from AI apprehension to proactive engagement, resulting in 9 'Prompt Playbooks'.