Janani Shanmugam

Senior Data Engineer
Chennai

Summary

Senior Data Engineer with 6 years of experience in AWS (2x certified), Python, PySpark, Kafka, and SQL, along with DevOps (CI/CD pipelines). Self-motivated, with a strong work ethic and a proactive approach to tasks and responsibilities. Works effectively both independently and as part of a team, and adapts readily to different work dynamics and environments.

Overview

6 years of professional experience
4 Certificates

Work History

Senior Data Engineer

Kloutix Solutions
06.2025 - Current
  • Client: Salesforce.
  • Developed an ETL framework that dynamically instantiates data tasks at Airflow runtime by designing a configuration-based pipeline for 50+ datasets.
  • Wrote Python code that migrates Hive tables to Iceberg tables through Airflow pipelines and creates a reference table in Snowflake, reducing Snowflake storage cost to 62%.
  • Migrated data from DEA to MOPS S3 buckets, establishing infrastructure-level permissions and access control.
  • Worked with the marketing analytics team on the Marketing Cloud model logic, mostly around the attribution, website session, web visitor, and paid media tables. Conducted thorough data quality checks, rectified inconsistencies, and ensured the reliability of the data.
  • Monitored DAG failure alerts in production and resolved the underlying issues.
  • Automated the backfilling process in a marketing project, reducing manual effort.
  • Tools: Airflow, Snowflake, AWS (Terraform), Python, SQL.

Senior Data Engineer

ACCENTURE
11.2022 - 06.2025
  • Client: Generali.
  • Developed the Accenture accelerator framework from scratch, using Data Vault modeling for different data sources with Azure Data Factory and Databricks.
  • Client: Bayer, the maker of Aspirin.
  • Developed middleware for SAP to Tableau dashboard, which involved data handling, modeling, and transformations using Python.
  • Developed an ETL framework using a Glue job for data processing and transformation from SAP to different databases (RDS and S3).
  • Built a pipeline between EDH (Apache Kafka), RDS, and S3 using Lambda (Python), automated with Terraform.
  • Built Kafka consumers on ECS for streamlining data from the EDH portal to RDS, Redshift, and S3.
  • Reengineered existing ETL workflows to improve performance by identifying bottlenecks and optimizing code accordingly.
  • Handled production issues such as EMR and Airflow throttling issues, cost increases, data count mismatches, and AWS service failures.
  • Completed unit testing for the Lambda and Glue jobs, following GxP process guidelines.
  • Tools: AWS (Lambda, RDS, Redshift, S3, Athena, EMR, Step Functions, Glue, CloudWatch), GitHub, GitLab.

Data Engineering Analyst

ACCENTURE
08.2021 - 09.2022
  • Client: Novartis – the world’s largest pharmaceutical company.
  • Developed the SF-EBX ETL framework, a data pipeline that pushes metadata from Snowflake to EBX through an API gateway on a schedule, automated with a DevOps pipeline. Built a similar setup with AWS Glue and AWS Lambda.
  • Centree Ontology – Set up the Centree architecture, which deals with ontologies and runs ECS on EC2, and built a DevOps pipeline to automate it.
  • China data management – Built pipelines for Redshift, Glue connection, and Glue crawler creation.
  • Delivered the emergency change request for the SF-EBX framework on time. Collaborated with various teams on issue handling.
  • Tools: Jenkins, AWS Cloud, Bitbucket, Databricks, Snowflake, and Airflow.

Projects

Final Year B.E Project
09.2020 - 03.2021
  • The main aim of this project was to reduce the feature set using PCA (principal component analysis) and the pattern matching technique.
  • Classified the arrhythmia dataset with a machine learning classifier (Naïve Bayes) and with neural network-based classification.
  • Classified the Fashion-MNIST and MNIST (handwritten digit images) datasets by training a neural network model.
  • Classified the image classes in CIFAR-10 (color images) using a CNN (convolution and max pooling) in PyTorch.
  • Tools: ML, AI neural networks, Python, and Big Data.

Education

B.E - Electronics and Instrumentation

MIT - Anna University
Chennai
09.2021

Skills

AWS Cloud

Python programming

PySpark - Databricks, AWS Glue

SQL - Snowflake, PostgreSQL

Apache Kafka

Airflow, Postman

CI/CD practices - Terraform, Jenkins, YAML

Version Control - GitHub, GitLab

Data visualization - Power BI

ETL Development, Data modeling

Agile methodology, JIRA, Confluence

Machine Learning

C,C

Accomplishments

  • Received the ACE (Accenture Celebrates Excellence) award twice, in May 2023 and Jan 2024, for sparkling growth in ATCI
  • Recognized for Exemplify Client-Centricity 2023
  • Received the Wizards at Work award in the Star-Awards, Mar-Aug 2022
  • Awarded first prize for a short film in the eWIT competition at AU, 2019

Certification

AWS Certified Data Analytics – Specialty, DAS-C01, 2024
