Sarath Chandrika K
I'm a Data Engineer
Turning data mess into data magic... ⏳⌨️
ABOUT
Hey there! I’m Chandrika — part data whisperer, part cloud explorer, and full-time believer that good data pipelines make the world a better place 🛠️☁️
I’ve been in the data game for over 6 years, wrangling bytes, streaming events, and optimizing SQL like it's a form of poetry. Whether it's building real-time fraud detection pipelines with Kafka + Spark, modernizing massive workloads on GCP & AWS, or migrating legacy systems into sleek, cloud-native setups — I do it all with a mix of logic, creativity, and just a little caffeine. ☕

When I'm not deep in DAGs or debugging a PySpark job at 2 AM (just kidding… mostly), you'll find me:
- 🎨 Crafting something with glitter and glue (yes, I'm serious — arts & DIY is my zen)
- 💃 Dancing my heart out to a good beat
- 🧩 Solving tech puzzles or making architecture diagrams look like art
Skills
Python
SQL
System Design
Spark
Tableau
AWS
GCP
RESUME
Summary
Experienced and driven Senior Data Engineer with 7+ years of hands-on expertise delivering scalable data solutions for Fortune 500 enterprises. Specializes in designing cloud-native data platforms, building real-time pipelines, and modernizing legacy systems into lakehouse architectures across AWS, Azure, and GCP. Skilled in tools like Apache Spark, Kafka, Airflow, dbt, Snowflake, and Databricks, with a strong focus on solving business problems ranging from fraud detection to predictive insights.
• Phone: +1(678) 776-9033
• Email: sarathk8901@gmail.com
Core Competencies
Programming & Scripting: Python, SQL, Scala, Java, Shell Scripting, Bash, JSON, YAML
Big Data & Streaming: Apache Spark, Kafka, Flink, Hadoop, Hive, HBase, Sqoop, Trino
ETL & Integration: Airflow, NiFi, Glue, ADF, Dataflow, dbt, Informatica, Talend
Cloud Platforms: AWS (S3, Redshift, Glue, EMR), Azure (ADF, Synapse, Data Lake), GCP (BigQuery, Dataflow)
Lakehouse & Storage: Databricks, Delta Lake, Iceberg, Hudi, Snowflake, BigQuery, S3, ADLS
Databases: Snowflake, Redshift, BigQuery, PostgreSQL, MySQL, SQL Server, Oracle, MongoDB
Education
Bachelor of Engineering – Electrical and Electronics Engineering
Jawaharlal Nehru Technological University, Hyderabad (JNTUH)
Professional Experience
Senior Data Engineer
May 2024 – Present
Hartford Financial Services Group
- Designed and implemented scalable Spark-based pipelines for ingesting and transforming daily financial transaction data securely.
- Developed unsupervised anomaly detection logic using Python and statistical methods to identify potential credit fraud (a simplified sketch of this pattern follows this list).
- Integrated the data pipeline with existing banking systems using Kafka, Lambda, and S3 for real-time ingestion.
- Optimized data partitioning, filtering, and joins, cutting Spark job execution time by approximately 30%.
- Created ETL monitoring dashboards using Power BI and Datadog to track pipeline health and anomaly alerts.
- Coordinated with DevOps, Compliance, and Risk teams to align architecture with SOX and GDPR standards.
- Built reproducible and reusable data quality tests using Great Expectations for automated rule-based validation.
- Enabled real-time risk scoring and flagging by integrating machine learning models within Spark streaming pipelines.
- Conducted root cause analysis and resolved data inconsistencies, reducing manual QA rework by over 40%.
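For illustration, a stripped-down version of the Kafka-to-Spark-Streaming anomaly flag pattern described above might look like the sketch below; the broker address, topic, schema, S3 paths, and 3-sigma threshold are placeholders rather than the production values.

```python
# Sketch only: Kafka -> Spark Structured Streaming -> per-account 3-sigma anomaly flag.
# Broker, topic, schema, and S3 paths below are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("txn-anomaly-sketch").getOrCreate()

txn_schema = StructType([
    StructField("txn_id", StringType()),
    StructField("account_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Stream raw transaction events from Kafka and parse the JSON payload.
txns = (spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "transactions")
        .load()
        .selectExpr("CAST(value AS STRING) AS json")
        .select(F.from_json("json", txn_schema).alias("t"))
        .select("t.*"))

# Stream-static join against precomputed per-account stats (mean/std of amount),
# then flag transactions more than 3 standard deviations from the account mean.
stats = spark.read.parquet("s3://example-bucket/account_stats/")
flagged = (txns.join(stats, "account_id", "left")
           .withColumn("is_anomaly",
                       F.abs(F.col("amount") - F.col("mean_amt")) > 3 * F.col("std_amt")))

# Persist flagged events for downstream alerting and dashboards.
(flagged.writeStream
 .format("parquet")
 .option("path", "s3://example-bucket/flagged_txns/")
 .option("checkpointLocation", "s3://example-bucket/checkpoints/flagged_txns/")
 .outputMode("append")
 .start())
```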
Data Engineer
Apr 2018 – Apr 2024
Coforge
- Migrated legacy Oracle and MySQL data into AWS S3 using Glue, Athena, and automated NiFi connectors.
- Built ETL pipelines with Airflow and dbt to automate ingestion, transformation, and model deployment for SaaS data.
- Reduced average report refresh time by 50% by optimizing SQL queries and using partitioned datasets in Athena.
- Developed Looker dashboards that provided stakeholders with on-demand operational insights across global business units.
- Implemented data profiling and validation checks using dbt tests and custom Python logic for critical KPIs.
- Conducted hands-on training sessions to upskill internal teams on dbt, Airflow, and cloud-native data workflows.
- Led the architecture and deployment of a global Azure-based data lakehouse for IT asset lifecycle analytics.
- Designed Delta Lake ingestion layers using Auto Loader and PySpark to manage asset logs and usage metrics (see the sketch after this list).
- Reduced query execution time by 40% through optimal partitioning, caching, and columnar file formats (Parquet).
- Built predictive models for hardware failure trends using Python, XGBoost, and time-series anomaly scoring.
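For reference, a minimal Auto Loader ingestion sketch in the spirit of the lakehouse work above; it assumes a Databricks runtime where `spark` is already defined, and the storage paths, schema location, and table name are placeholders.

```python
# Sketch only: Databricks Auto Loader streaming asset logs into a partitioned Delta table.
# Paths, schema location, and table name are illustrative placeholders.
from pyspark.sql import functions as F

asset_logs = (spark.readStream
              .format("cloudFiles")                      # Auto Loader source (Databricks)
              .option("cloudFiles.format", "json")
              .option("cloudFiles.schemaLocation", "abfss://lake@account.dfs.core.windows.net/_schemas/asset_logs")
              .load("abfss://lake@account.dfs.core.windows.net/raw/asset_logs/"))

bronze = (asset_logs
          .withColumn("ingest_ts", F.current_timestamp())
          .withColumn("ingest_date", F.to_date("ingest_ts")))

# Append into a date-partitioned Delta bronze table; the checkpoint makes the stream
# restartable, and availableNow runs it as an incremental batch instead of a continuous stream.
(bronze.writeStream
 .format("delta")
 .option("checkpointLocation", "abfss://lake@account.dfs.core.windows.net/_checkpoints/asset_logs_bronze")
 .partitionBy("ingest_date")
 .trigger(availableNow=True)
 .toTable("it_assets.bronze_asset_logs"))
```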
PROJECTS
My Projects
Financial Transaction Data Pipeline & Anomaly Detection
Built a robust real-time pipeline using Apache Spark, AWS Glue, and Kafka to ingest and process financial transactions. Integrated anomaly detection logic to flag fraudulent transactions, reducing detection time from 24 hours to under 30 minutes. Ensured compliance with SOX and GDPR while delivering interactive dashboards in Power BI.
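A rough example of the rule-based validation used alongside this pipeline (Great Expectations, mentioned in the resume section), shown with the library's simple Pandas-based API; the column names and bounds are hypothetical.

```python
# Sketch only: rule-based checks on a batch of transactions with Great Expectations
# (legacy Pandas-based API). Column names and bounds are hypothetical.
import great_expectations as ge
import pandas as pd

batch = pd.read_parquet("daily_transactions.parquet")   # placeholder input
txns = ge.from_pandas(batch)

txns.expect_column_values_to_not_be_null("txn_id")
txns.expect_column_values_to_be_unique("txn_id")
txns.expect_column_values_to_not_be_null("account_id")
txns.expect_column_values_to_be_between("amount", min_value=0, max_value=1_000_000)

results = txns.validate()
if not results.success:
    raise ValueError("Data quality checks failed; see expectation results for details.")
```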
Enterprise Data Lakehouse Platform
Architected and led the deployment of a scalable Azure-based lakehouse using Databricks and Delta Lake for IT asset analytics. Centralized 100+ data sources, improved asset utilization visibility, and enabled predictive usage analytics. Reduced provisioning cycles by 70% and saved $1.2M/year in licensing costs.
Modern Data Lake Migration & Report Optimization
Migrated legacy MySQL and Oracle reporting systems to an AWS-based data lake using Glue, dbt, and Athena. Automated hourly-refresh ETL pipelines with Airflow and Looker dashboards. Reduced report refresh times by 50% and enabled self-service analytics, saving $80K annually.
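As a rough illustration, a skeletal hourly Airflow DAG for this kind of refresh (Glue extract, dbt run, Athena partition repair); it assumes Airflow 2.x with the Amazon provider installed, and the job names, connection IDs, paths, and bucket are placeholders.

```python
# Sketch only: hourly Airflow DAG chaining a Glue extract, dbt transformations,
# and an Athena partition repair. Names, connections, and paths are placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.amazon.aws.operators.athena import AthenaOperator
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

default_args = {"retries": 2, "retry_delay": timedelta(minutes=5)}

with DAG(
    dag_id="hourly_reporting_refresh",
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
    default_args=default_args,
) as dag:

    extract = GlueJobOperator(
        task_id="extract_from_legacy_dbs",
        job_name="legacy_db_to_s3",                      # placeholder Glue job
        aws_conn_id="aws_default",
    )

    transform = BashOperator(
        task_id="run_dbt_models",
        bash_command="dbt run --project-dir /opt/dbt/reporting --target prod",
    )

    repair_partitions = AthenaOperator(
        task_id="repair_athena_partitions",
        query="MSCK REPAIR TABLE reporting.transactions",
        database="reporting",
        output_location="s3://example-bucket/athena-results/",
        aws_conn_id="aws_default",
    )

    extract >> transform >> repair_partitions
```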
CONTACT
Location:
Texas, USA
Email:
sarathk8901@gmail.com
Call:
+1(678) 776-9033