
Sai Prasad Nayani

Hyderabad

Summary

Big Data & Hadoop Developer with 8+ years of experience in building scalable data engineering solutions. Skilled in Hadoop ecosystem (Hive, Pig, HBase, Sqoop, Spark, PySpark, Spark SQL) with strong expertise in Python, SQL, and ETL pipeline development. Proven track record in data ingestion, transformation, and workflow automation using HDFS, AWS, Azure, and BMC Control-M to deliver high-performance, large-scale analytics solutions.

Overview

9 years of professional experience
1 Certification

Work History

Sr Developer

Cognizant
Hyderabad
11.2020 - Current

Data Engineer at Blue Cross Blue Shield.

  • Developed and executed ETL processes for integrating data lake assets into ORMB and multi-cloud environments (AWS, Azure).
  • Implemented data transformation workflows on Hive tables, consolidating data from various sources for analysis.
  • Developed Talend jobs for file transfers between servers, leveraging Talend FTP components.
  • Played a key role in developing scalable data lake architecture using Hadoop technologies (Hive, HBase, PySpark).
  • Developed HQL pipelines for event joins and pre-aggregations, reducing downstream query latency and improving analytics efficiency.
  • Streamlined data processing by developing Control-M workflows for automated batch scheduling, reducing manual effort and errors.
  • Implemented real-time data ingestion pipelines by integrating Kafka with Spark Streaming and PySpark for storage in HDFS.
  • Processed JSON data with PySpark by encoding and decoding objects to build and modify Spark DataFrames.
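A minimal stdlib-only sketch of the JSON encode/decode pattern described above (record fields are hypothetical; in PySpark the decoded dicts would feed `spark.createDataFrame` to build DataFrames):

```python
import json

def decode_events(raw_lines):
    """Parse newline-delimited JSON strings into row dicts."""
    return [json.loads(line) for line in raw_lines]

def encode_events(rows):
    """Serialize row dicts back to compact JSON strings."""
    return [json.dumps(row, sort_keys=True) for row in rows]

# Hypothetical sample records
raw = ['{"member_id": 1, "claim": 120.5}', '{"member_id": 2, "claim": 80.0}']
rows = decode_events(raw)

# Add a derived column before re-encoding, as a DataFrame transform would
for row in rows:
    row["claim_cents"] = int(row["claim"] * 100)

encoded = encode_events(rows)
```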

Big Data Hadoop Developer

Blue Cross Blue Shield
Chicago
03.2018 - 09.2020
  • Designed and optimized PySpark and Spark SQL code for high-performance data processing on Apache Spark.
  • Designed and implemented HQL queries to perform transformations, event-based joins, and pre-aggregations, ensuring optimized data storage in HDFS.
  • Designed and implemented Hive tables on HDFS, leveraging HiveQL for data querying and processing.
  • Developed and deployed AWS Lambda functions in Python to handle scalable computation workloads.
  • Aggregated large-scale datasets using Spark and staged them in HDFS to enable downstream data analysis.
  • Redesigned Hive/SQL workflows into efficient Spark transformations with PySpark DataFrames and RDDs, reducing processing time and improving maintainability.
  • Partnered with the QA lead to create test plans and cases and to streamline defect identification and resolution.
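The Hive-to-Spark rewrites above center on replacing SQL GROUP BY passes with in-memory aggregations; a stdlib-only sketch of that pre-aggregation step (field names are hypothetical, and a PySpark version would use `df.groupBy(...).agg(...)` instead):

```python
from collections import defaultdict

def pre_aggregate(events, key_field, value_field):
    """Group event dicts by a key and sum a numeric value,
    mirroring a SQL GROUP BY ... SUM(...) pass."""
    totals = defaultdict(float)
    for event in events:
        totals[event[key_field]] += event[value_field]
    return dict(totals)

# Hypothetical claim events keyed by plan type
events = [
    {"plan": "PPO", "amount": 100.0},
    {"plan": "HMO", "amount": 50.0},
    {"plan": "PPO", "amount": 25.0},
]
summary = pre_aggregate(events, "plan", "amount")
```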

Hadoop Developer

Walmart
Bentonville
07.2017 - 02.2018
  • Tracked processing time for multiple fill indicators in the health and wellness project.
  • Developed reports using BI tools to highlight trends and key performance indicators.
  • Performed data analysis, transformations, and aggregations on source datasets, loading results into Hive tables for downstream use.
  • Delivered aggregated data into SQL for downstream consumption in BI tools, supporting business decision-making.
  • Participated in daily Agile Scrum meetings, contributing to system architecture design and use case development.
  • Analyzed data sources, designed source-to-target mappings, and projected storage capacity requirements for Hadoop environments.
  • Collaborated with clients to capture requirements and scope timelines for developing advanced Hive queries in logistics systems.
  • Delivered scalable solutions using Microsoft Azure, managing project timelines and deliverables effectively.
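The source-to-target mapping work above can be sketched as a small mapping table applied per record (column names and transforms here are hypothetical illustrations, not the actual Walmart schema):

```python
# Hypothetical source-to-target mapping: source column -> (target column, transform)
MAPPING = {
    "cust_nm": ("customer_name", str.strip),
    "ord_amt": ("order_amount", float),
}

def apply_mapping(source_row, mapping):
    """Project one raw source record onto the target schema,
    renaming columns and applying each column's transform."""
    return {tgt: fn(source_row[src]) for src, (tgt, fn) in mapping.items()}

row = apply_mapping({"cust_nm": "  Acme ", "ord_amt": "42.50"}, MAPPING)
```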

Hadoop Developer

Blue Cross Blue Shield
Phoenix
11.2016 - 07.2017
  • Created an Enterprise Data Hub to enable data analytics across business units using Cloudera Hadoop.
  • Defined, designed, and developed Java applications, leveraging Hadoop frameworks, including Cascading and Hive.
  • Coordinated with the offshore development team for application development and unit testing.
  • Developed workflows using Oozie to execute MapReduce jobs and Hive queries.
  • Loaded log data into HDFS directly using Flume for efficient data processing.
  • Migrated applications from on-premises data centers to AWS Public Cloud after architecture analysis.
  • Built reusable Hive UDF libraries to support complex business requirements in querying.
  • Provisioned Azure Data Lake Store and Analytics while leveraging U-SQL for cross-service queries.
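Reusable Hive UDF libraries in this stack are typically written in Java, but the same per-row logic can also be expressed as a Hive TRANSFORM streaming script; a minimal Python sketch of that pattern (the tab-separated column layout is hypothetical):

```python
def transform_line(line):
    """Uppercase the second tab-separated field of one Hive row,
    leaving the other fields intact."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 2:
        fields[1] = fields[1].upper()
    return "\t".join(fields)

def run(rows):
    """Apply the transform to each streamed row. In production, Hive's
    TRANSFORM clause would pipe rows in on stdin and read stdout."""
    return [transform_line(row) for row in rows]
```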

Education

Master of Science - Information Systems

Stratford University
Fairfax, VA, US
07.2016

Bachelor's - Information Technology

Vardaman College of Engineering
Telangana, India
03.2013

Skills

Core big data and Hadoop ecosystem

  • Hadoop Distributed File System (HDFS)
  • MapReduce concepts
  • Hive (HiveQL, Hive tables, partitioning, bucketing)
  • HBase (NoSQL storage and retrieval)
  • Pig (data flow scripting)
  • Sqoop / Flume (for data ingestion)

Spark and PySpark

  • PySpark (RDDs, DataFrames, Datasets, Spark SQL, UDFs)
  • Spark Streaming (real-time data ingestion and processing)
  • Performance tuning (partitioning, caching, broadcast variables)
  • Integration with Kafka for streaming pipelines

ETL and data engineering

  • Data ingestion, transformation, and aggregation
  • Source to target mapping and schema design
  • Pre-aggregations and event joins
  • Building batch and streaming data pipelines
  • Workflow automation using BMC Control-M, Oozie, or Airflow
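The workflow automation skills above come down to running batch jobs in dependency order, which Control-M and Oozie manage declaratively; a stdlib-only sketch of that scheduling idea using `graphlib` (job names and dependencies are hypothetical):

```python
from graphlib import TopologicalSorter

# Hypothetical batch jobs mapped to their upstream dependencies,
# as a Control-M or Oozie workflow definition would declare them
jobs = {
    "ingest": set(),
    "transform": {"ingest"},
    "aggregate": {"transform"},
    "load": {"aggregate"},
}

# Resolve a valid execution order that respects every dependency
order = list(TopologicalSorter(jobs).static_order())
```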

Certification

Databricks Certified Associate Data Engineer

Data engineering on Microsoft Azure

Timeline

Sr Developer

Cognizant
11.2020 - Current

Big Data Hadoop Developer

Blue Cross Blue Shield
03.2018 - 09.2020

Hadoop Developer

Walmart
07.2017 - 02.2018

Hadoop Developer

Blue Cross Blue Shield
11.2016 - 07.2017

Master of Science - Information Systems

Stratford University

Bachelor's - Information Technology

Vardaman College of Engineering