
Sumana Devireddy

Mount Juliet, TN

Summary

AWS Data Engineer with around 9 years of experience implementing data warehouse and database applications with Informatica ETL, along with data modeling and reporting tools, across the Apache Hadoop ecosystem, Teradata, Oracle, DB2, SQL Server, and Hadoop MapReduce. Automates data workflows using Python, Airflow, and AWS services; manages data infrastructure with Terraform for cloud-native solutions; and builds and optimizes Spark jobs in Databricks for large-scale data processing, streamlining deployment and monitoring through CI/CD tools.

Overview

9 years of professional experience

Work History

DataOps Engineer

Asurion
06.2023 - Current
  • Work with Apache Spark for distributed data processing and transformation within Databricks, writing optimized Spark jobs in Python (PySpark), Scala, and SQL for large-scale data transformations.
  • Set up monitoring and alerting for Databricks clusters and jobs, integrating with tools such as AWS CloudWatch for real-time insight into cluster and job performance.
  • Manage Databricks logs to ensure visibility into job failures, resource usage, and performance bottlenecks.
  • Orchestrate complex ETL/ELT pipelines with Apache Airflow, ensuring reliable data flow and task dependencies.
  • Automate infrastructure provisioning using Terraform, managing cloud resources such as AWS S3, EC2, RDS, and Lambda in data pipelines.
  • Perform cost analysis on AWS resources.
  • Provide Starburst Presto support and set up a new Hive metastore using Apache Spark.
  • Implement CI/CD pipelines for data workflows, automating deployment and testing with Python-based scripts in Jenkins and GitHub Actions.

AWS Data Engineer / Databricks

Cigna
08.2022 - 05.2023
  • Set up monitoring tools like CloudWatch alarms to ensure optimal performance of the cloud environment.
  • Developed real-time streaming solutions using Kinesis Firehose, Streams and Analytics along with Apache Kafka and Flink on EMR clusters.
  • Developed and maintained AWS EC2, S3, EMR clusters with Spark and Hive for data processing.
  • Designed and implemented ETL pipelines using Glue, Athena, and other AWS services.
  • Created ETL pipelines in Databricks and automated them with a Terraform framework within the CI/CD pipeline.
  • Created external tables in AWS Glue and accessed the data through Databricks.
  • Identified and corrected bugs for maintaining cloud stack functionality.
  • Traced and corrected potential network issues by analyzing systematic vulnerabilities.
  • Automated Delta data deletes and inserts in the RTIM tool.

AWS Data Engineer

Capital One
08.2019 - 07.2022
  • Developed Spark applications with Scala and Python and implemented Apache Spark for data processing from various streaming sources.
  • Worked with different file formats, reading and writing CSV, Parquet, JSON, and Avro data for analytics and loading it into Spark for ETL transformations.
  • Created Lambda functions for SNS alerts and EMR instances to run applications on a cron schedule.
  • Loaded and maintained data in DB2, Aurora MySQL, Snowflake, and Postgres databases.
  • Created AWS infrastructure, events, S3 bucket policies, IAM roles, EC2 instances, and SNS topics.
  • Migrated an existing on-premises application to AWS, using EC2 and S3 for small data set processing and storage, and maintained the Hadoop cluster on AWS EMR.
  • Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and pair RDDs.
  • Created Lambda functions to automate scheduled jobs in the EMR cluster with Step Functions.
  • Participated in AWS TREx activities.

Database Analyst

Wells Fargo
08.2015 - 07.2019
  • Processed data into HDFS by developing solutions that analyzed it with MapReduce, Pig, and Hive and produced summary results from Hadoop for downstream systems.
  • Developed a complete ETL process in Teradata by writing stored procedures and complex SQL queries.
  • Developed various ETL transformation scripts using Hive to create refined datasets for analytics use cases.
  • Improved the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Imported and exported data using stream processing platforms such as Flume and Kafka.
  • Developed UNIX shell scripts to load large numbers of files into HDFS from the local file system.

Education

Master's - Information Technology

VIU
05.2015

Bachelor's - Electronics and Communications

JNTU
05.2013

Skills

  • Spark SQL, PySpark
  • Python, Scala, Java
  • Databricks
  • MySQL, Cassandra
  • PL/SQL, Snowflake
  • Control-M
  • GitHub
  • Linux/Unix
  • Agile, SDLC
  • AWS Cloud Services: S3, EC2, EMR, SNS, Lambda, CloudWatch, Athena
  • Teradata Admin
