About me
I'm a Data Engineer with 5 years of experience architecting enterprise-scale data platforms that deliver strategic business insights. I specialize in building robust ETL/ELT pipelines, designing Azure-based lakehouse architectures, and enabling advanced analytics through performant, well-governed data systems.
My toolkit includes Python, PySpark, SQL, Azure Data Factory, Databricks, and Apache Airflow, with strong expertise in Delta Lake, Unity Catalog, and MLOps integration. I thrive on optimizing data workflows for scale, reliability, and cost efficiency, ensuring 99.9% data accuracy and system availability across cloud environments. From automating CI/CD pipelines to delivering clean, production-ready data for machine learning models, I bridge data engineering, governance, and business value with a product-driven mindset.
What I'm doing
- Data Engineering & Pipelines: Designing and orchestrating scalable ETL/ELT pipelines with Databricks, ADF, and PySpark, optimized for performance, reliability, and cost efficiency across Azure.
- Cloud Data Architecture: Implementing Azure-based lakehouse solutions with Delta Lake, Unity Catalog, and medallion architecture for unified analytics and governed data access (a short sketch of this pattern follows the list).
- Data Enablement for AI & ML: Preparing clean, production-grade datasets and feature stores that accelerate machine learning workflows, experimentation, and model deployment in collaboration with data science teams.
- Analytics & BI Enablement: Supporting advanced analytics and decision-making through dimensional modeling, curated datasets, and interactive dashboards in Power BI.
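
To give a concrete flavor of the pipeline and lakehouse work above, here is a minimal PySpark sketch of a bronze-to-silver refinement step in a Delta Lake medallion layout. All names (the /lakehouse/... path, the orders table, and its columns) are hypothetical placeholders for illustration, not taken from any specific project.

```python
# Minimal bronze -> silver medallion step on Delta Lake.
# Paths, table names, and columns below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("bronze-to-silver")
    .getOrCreate()
)

# Bronze: raw ingested events, loaded as-is (schema assumed for illustration).
bronze = spark.read.format("delta").load("/lakehouse/bronze/orders")

# Silver: deduplicated, typed, quality-checked records ready for analytics.
silver = (
    bronze
    .dropDuplicates(["order_id"])                        # drop replayed events
    .withColumn("order_ts", F.to_timestamp("order_ts"))  # enforce types
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .filter(F.col("order_id").isNotNull())               # basic quality gate
)

# Persist as a managed Delta table; overwrite keeps the sketch idempotent.
(
    silver.write.format("delta")
    .mode("overwrite")
    .saveAsTable("silver.orders")
)
```

A gold layer would typically aggregate such a silver table into curated, analytics-ready datasets for BI tools like Power BI.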
Tools & Technologies
Azure
Databricks
ADF
Fabric
Python
SQL / T-SQL
Apache Airflow
Azure DevOps
Unity Catalog
SQL Server
Snowflake
Power BI
