Kamalakar Peta

Platform Blueprints

Enterprise-grade data platform designs and reference patterns

Medallion Lakehouse Architecture

Bronze-Silver-Gold layered data platform on Databricks with Delta Lake and Unity Catalog governance.

Databricks · Delta Lake · Unity Catalog · AWS

Microsoft Fabric End-to-End Pipeline

Unified analytics platform leveraging OneLake, Data Factory, and Power BI for enterprise reporting.

Microsoft Fabric · OneLake · Power BI · Data Factory

Real-Time Streaming Data Platform

Event-driven architecture with Kafka, Spark Structured Streaming, and Delta Live Tables.

Kafka · Spark Streaming · DLT · EventHub

Data Mesh on Cloud

Domain-oriented decentralized data ownership with federated governance and self-serve infrastructure.

Data Mesh · Domain-Driven · Governance

Professional Path

A journey through data engineering leadership across global financial enterprises

Current

Assistant Director, Data Engineering

Moody's Analytics

Leading data platform modernization using Databricks on AWS — building lakehouse architectures, Delta Live Tables pipelines, and Unity Catalog governance frameworks.

FactSet

Engineered high-throughput data pipelines processing financial market data at scale with Spark and cloud-native services.

Franklin Templeton

Built ETL frameworks and data warehousing solutions for investment analytics and portfolio management systems.

S&P Global

Developed data integration solutions and analytics dashboards for credit risk and market intelligence platforms.

Project Journey

Evolving with the data stack — from Python & SQL pipelines to Databricks and Microsoft Fabric — building the platforms that serve analysts, data scientists and ML engineers

2014

Structured Finance Pricing Pipeline

Python 2.7 · SQL — Data Engineer

Aggregated and normalized real-time vendor pricing for US structured finance securities into one consistent store for the pricing desk.

View repository →

2016

Market Data Pipeline & Feature Platform

Python · pandas — Data Engineer

Conformed multi-vendor market data into analysis-ready, point-in-time feature tables serving analysts and data scientists.

View repository →
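A point-in-time feature table pairs each event with the latest observation known at that moment, so no future data leaks into the features. The snippet below is a minimal, standard-library sketch of that idea and is not taken from the repository. The function names (`as_of`, `build_feature_rows`), the field names, and the integer timestamps are all hypothetical, and it assumes each symbol's observations are already sorted by time.

```python
from bisect import bisect_right


def as_of(observations, query_time):
    """Return the latest value with timestamp <= query_time, or None.

    `observations` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    times = [t for t, _ in observations]
    idx = bisect_right(times, query_time)  # count of observations at or before query_time
    return observations[idx - 1][1] if idx else None


def build_feature_rows(events, prices_by_symbol):
    """Attach the as-of price to each (event_time, symbol) pair.

    Hypothetical layout: `prices_by_symbol` maps a symbol to its sorted
    (timestamp, price) history.
    """
    rows = []
    for event_time, symbol in events:
        history = prices_by_symbol.get(symbol, [])
        rows.append(
            {
                "time": event_time,
                "symbol": symbol,
                "price_as_of": as_of(history, event_time),
            }
        )
    return rows
```

An event that precedes the first observation, or that refers to an unknown symbol, gets `None` rather than a back-filled value, which keeps the table honest about what was known at the time.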

2018

Error Telemetry Pipeline (DE for ML)

Python · NLP — Data Engineer → Senior Data Engineer

Built the ingestion, feature-engineering and train/serve-parity pipeline feeding downstream error-prediction and root-cause models.

View repository →

2019

Yield Curve Outlier Detection

AWS · Streamlit · Terraform — Senior Data Engineer

First cloud build: a deployed data-quality tool for yield-curve validation, provisioned on AWS (EC2/S3/Redshift) with Infrastructure as Code.

View repository →

2021

Platform Usage Analytics

Azure (ADF · Synapse · ADLS Gen2) · Power BI — Senior Data Engineer

Orchestrated an Azure data platform with a zoned lake and star schema, serving always-current self-service BI to stakeholders.

View repository →

2022

Customs & Trade Analytics Lakehouse

Databricks · PySpark · Delta · Unity Catalog — Senior DE → Staff Data Engineer

Moved to the lakehouse: a medallion architecture enriching the Orbis data product with large-scale international-trade analytics.

View repository →

2023

Grant Data Integration Pipeline

Databricks · Delta · Great Expectations — Staff Data Engineer

End-to-end automated integration with data quality as a first-class concern — validation gates, quarantine and a reusable ingestion template.

View repository →
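The validation-gate and quarantine pattern can be sketched in plain Python. This is an illustrative sketch only: the project itself lists Great Expectations, which is not used here, and the rule names, record fields (`grant_id`, `amount`), and the `validation_gate` function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical rules: a readable name plus a predicate over one record.
Rule = tuple[str, Callable[[dict], bool]]

RULES: list[Rule] = [
    ("grant_id present", lambda r: bool(r.get("grant_id"))),
    (
        "amount is non-negative number",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]


@dataclass
class GateResult:
    passed: list[dict] = field(default_factory=list)
    quarantined: list[dict] = field(default_factory=list)


def validation_gate(records, rules=RULES) -> GateResult:
    """Split records into those that pass every rule and those that fail any."""
    result = GateResult()
    for record in records:
        failures = [name for name, check in rules if not check(record)]
        if failures:
            # Quarantined records keep their payload plus the reasons they failed.
            result.quarantined.append({**record, "_failed_rules": failures})
        else:
            result.passed.append(record)
    return result
```

Only `passed` records would flow on to the next layer. Keeping the failed rule names on each quarantined record makes later triage and replay easier than simply dropping bad rows.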

2025

Financial Research RAG

Databricks GenAI · Mosaic AI Vector Search · MLflow — Data & AI Platform Engineering

The team's first RAG proof-of-concept over financial research — prototyping grounded, cited retrieval on Databricks Mosaic AI Vector Search.

View repository →

2026 · Present

Enterprise Lakehouse on Microsoft Fabric

Microsoft Fabric · OneLake · Direct Lake — Data & AI Platform Engineering

A unified Fabric lakehouse serving BI (Direct Lake), analysts (SQL) and data science from one governed Gold layer — mirroring the Databricks build to show both platforms.

View repository →

Technical Blog

Deep dives into data architecture, Databricks, and Microsoft Fabric

Jan 15, 2025 • Databricks

Building a Medallion Architecture on Databricks (AWS)

The Medallion Architecture has become the de facto standard for organizing data in a lakehouse. In this post, we expl...

Data Architect Telugu

Empowering the Telugu tech community — enterprise data engineering, demystified.
Data engineering in Telugu: in our language, for us.

Series

తెలుగు · Telugu

Databricks Lakehouse — Zero to Production

డేటాబ్రిక్స్ లేక్‌హౌస్ — మొదటి నుండి ప్రొడక్షన్ వరకు

Your complete roadmap to mastering Databricks — clusters, notebooks, Delta Lake, Unity Catalog, and production-grade pipelines. Let's build this together!

Series

తెలుగు · Telugu

Microsoft Fabric — The Unified Analytics Revolution

మైక్రోసాఫ్ట్ ఫ్యాబ్రిక్ — యూనిఫైడ్ అనలిటిక్స్ విప్లవం

OneLake, Lakehouses, Data Factory, and Direct Lake mode — everything you need to architect modern analytics in Fabric. This changes the game.

Deep Dive

తెలుగు · Telugu

Medallion Architecture — Bronze, Silver & Gold Explained

మెడాలియన్ ఆర్కిటెక్చర్ — బ్రాంజ్, సిల్వర్ & గోల్డ్ వివరణ

The architecture pattern powering modern lakehouses. I'll walk you through real-world implementations with Delta Live Tables on Databricks.

Masterclass

తెలుగు · Telugu

PySpark for Data Engineers — Interview & Beyond

డేటా ఇంజనీర్ల కోసం పైస్పార్క్ — ఇంటర్వ్యూ & అంతకు మించి

Not just interview prep — real production patterns. Transformations, window functions, performance tuning, and the questions top companies actually ask.

Tutorial

తెలుగు · Telugu

Unity Catalog — Enterprise Data Governance

యూనిటీ క్యాటలాగ్ — ఎంటర్‌ప్రైజ్ డేటా గవర్నెన్స్

Access control, data lineage, and quality enforcement at scale. I'll show you how to set up governance that actually works across multi-cloud Databricks.

Hands-On

తెలుగు · Telugu

Delta Live Tables — Declarative ETL Pipelines

డెల్టా లైవ్ టేబుల్స్ — డిక్లరేటివ్ ETL పైప్‌లైన్స్

Stop writing boilerplate. DLT lets you declare your pipeline logic and handles orchestration, quality, and recovery. Let me show you how the pros do it.

Tech Stack

Tools and technologies I build with daily

Cloud Platforms

Microsoft Azure · AWS · Microsoft Fabric · Databricks

Data Engineering

Apache Spark · PySpark · Delta Lake · Delta Live Tables · Apache Kafka · Airflow

Architecture Patterns

Medallion Architecture · Data Mesh · Lakehouse · Data Vault 2.0 · Star Schema

Governance & Security

Unity Catalog · Purview · RBAC · Data Lineage · Data Quality

Languages & Tools

Python · SQL · Scala · Terraform · Git · Docker

Analytics & BI

Power BI · Databricks SQL · Azure Synapse · dbt

About Me

Building the future of enterprise data, one architecture at a time

I'm a hands-on Staff Data Engineer with deep expertise in designing, building, and owning enterprise-scale data platforms. My career spans senior data-engineering roles at some of the world's most respected financial-services firms — Moody's Analytics, FactSet, Franklin Templeton, and S&P Global.

Currently at Moody's Analytics as Assistant Director, Data Engineering (a Staff-level IC role), I build and own the governed Databricks (on AWS) and Microsoft Fabric lakehouse platform my engineering org builds on — design and CI/CD standards adopted across 12+ production pipelines, Unity Catalog governance, and FinOps that cut cross-environment compute ~35%. Most recently I built the team's first GenAI/RAG proof-of-concept on Mosaic AI Vector Search, and I'm extending into AI/ML platform engineering (feature stores, model serving, MLOps).

Beyond work, I create Telugu-language tutorials on YouTube, making data engineering concepts accessible to the Telugu-speaking tech community worldwide.

11+ Years Experience

4 Tier-1 Firms

50+ Platforms Delivered

Building Data & AI Platforms at Enterprise Scale

Professional Path

Assistant Director, Data Engineering

Moody's Analytics

Product Specialist II / Senior Data Engineer

FactSet

Research Analyst / Data Engineer

Franklin Templeton

Data Researcher II / Data Engineer

S&P Global
