Sankha Suvra Ghosal

Data Science Consultant
Kolkata

Summary

An experienced analytics professional with over 9 years in the CPG and Retail sectors, providing data-driven insights to global clients. I am proficient in tools like R, Python, SQL, and BayesiaLab, with expertise in Market Mix Modeling, A/B Testing, Linear and Logistic Regression, Optimization, Bayesian Networks, and machine learning techniques such as Decision Trees, NLP, Random Forests, XGBoost, and LightGBM. Skilled in forecasting with FB Prophet, I also have significant experience with AWS services, including SageMaker, Glue, and CloudWatch. Currently, I am working on Generative AI projects, backed by a strong academic foundation in Statistics.

Overview

9 years of professional experience
2 years of post-secondary education
3 languages

Work History

Decision Science Consultant

Accenture Solution Private Limited
Kolkata
07.2021 - Current

Project 1: Sales Decomposition for a Leading QSR across Markets

Objective: Identify key sales drivers for various product categories across markets and build a predictive model to forecast sales for the next 3 months.

Approach:

  • Worked on the development of a robust data pipeline and built the initial model using Ordinary Least Squares (OLS) regression.
  • Enhanced the modeling framework with gradient descent-based model fitting for improved predictive accuracy.
  • Implemented monthly model refreshes and conducted comparative analysis against the baseline model.
  • Applied FB Prophet for time-series forecasting of base sales.
  • Executed the entire project in Python, with reporting outputs stored in AWS S3.

Outcome:

  • Delivered insights into feature-level sales impact across products and time periods, enabling targeted decision-making.
  • Helped the client identify actionable interventions to improve sales and supported forward-looking planning through accurate forecasting and resource allocation.

Project 2: Data Migration & Deployment for Global Petroleum Company

Objective: Migrate code, reporting jobs, and models for multiple markets of a leading global petroleum company to the client’s production environment.

Approach:

  • Led the end-to-end migration of PySpark/Python shell jobs, Power BI reports, and models from the CIS environment to the client environment using Azure DevOps.
  • Adjusted existing code and job configurations, and built deployment pipelines for seamless integration.
  • Configured job triggers via Octagon to execute pipelines on AWS Glue, with outputs written to an AWS S3 bucket.
  • Monitored job status and health using AWS monitoring tools.
  • Migrated static tables via DDP, moving them to the desired AWS S3 bucket using custom-built PySpark/Python pipelines, also deployed through Azure DevOps.

Outcome:

  • Successfully enabled the client to execute all datamart and model reports with improved reliability, timeliness, and operational efficiency.
  • Ensured a smooth transition to the production environment without disruptions to business reporting.

Project 3: Data Management Rules for Smart Supply Chain – Pharmaceutical Industry

Objective: Develop data quality rules for the Data Management Substream to identify anomalies within the Smart Supply Chain of a leading Pharmaceutical Client.

Approach:

  • Led the design and implementation of rule-based logic in R to detect anomalies across multiple data sources including Kinaxis, Athena, and P08.
  • Identified GMIDs/materials that deviated from the defined rules, classifying them as data anomalies for further action.
  • Focused on automating the identification process to ensure proactive resolution of data integrity issues within the supply chain pipeline.

Outcome:

  • Empowered the client to detect and rectify data discrepancies effectively, significantly enhancing data quality and reliability for future supply chain operations.
  • Strengthened downstream decision-making by ensuring cleaner, rule-compliant datasets.

Project 4: Supply Chain Optimization for a Global Tech Leader

Objective: Improve the accuracy of predicted product arrival times (iETA) by optimizing the supply chain process for a global technology leader.

Approach:

  • Worked on the development and integration of custom logic within data cleaning and prediction pipelines to enhance Turnaround Time (TAT) accuracy across each product lifecycle milestone.
  • Merged refined TAT values with milestone dates to derive accurate and robust iETA predictions.
  • Explored and evaluated alternative Machine Learning algorithms beyond Gradient Boosting (GBM) to improve predictive performance.
  • Designed and tested simulation architecture across key CCMOTs (Start Country, Destination Country, Mode of Transport) to validate the newly implemented logic.
  • Entire solution was executed in R and deployed using AWS services including SageMaker, Step Functions, and CloudWatch for monitoring and orchestration.

Outcome:

  • Achieved significant improvement in delivery date prediction accuracy, enabling on-time product delivery and restoring customer trust and goodwill.
  • Strengthened the client’s supply chain reliability and operational efficiency through actionable insights and enhanced forecasting capabilities.

Project 5: Product Catalog Enrichment using Gen AI – Global E-commerce Giant

Objective: Automate and enhance Product Information Management for a leading global e-commerce company using Generative AI, with a focus on improving product categorization and attribute extraction for better search relevance.

Approach:

  • Leveraged Generative AI models hosted on GCP, utilizing Google's LLM APIs (text-bison 0.1, text-bison 0.2, Gemini Flash 1.5) for LLM-based vector search to automate product categorization.
  • Extracted product attributes using text and image embeddings, enriching product metadata to improve search performance.
  • Played a pivotal role in the Validation POD, responsible for validating input and output data, updating records as needed, checking and optimizing match rates by testing different model combinations, and conducting manual sample validations to ensure accuracy and reliability.
  • Refined Gen AI prompts to better align model outputs with specific business requirements and analytical needs.
  • Took full ownership of project delivery, ensuring on-time completion of milestones with agility, efficiency, and high-quality results.

Outcome:

  • Successfully implemented an automated, scalable Gen AI pipeline for product categorization and metadata enrichment.
  • Enhanced match rate accuracy and improved search relevance, contributing to better product discoverability.
  • Delivered validated, client-ready outputs with high confidence and accuracy, strengthening stakeholder trust and onshore collaboration.

Project 6: Demand Transfer Analysis for a Global CPG Giant

Objective:
Identify and quantify lost sales and retained sales by analyzing demand transfer patterns for rationalized SKUs—both within the brand and across competitive SKUs.

Approach:

  • Led the development of an end-to-end Machine Learning pipeline using Random Forest and XGBoost to extract important attributes from historical sales data.
  • Identified similar SKUs for each rationalized product across both brand and competitor portfolios, based on attribute similarity.
  • Calculated the sales contribution of each relevant attribute for every rationalized SKU to understand drivers of substitution.
  • Built a custom Python algorithm to quantify how sales were transferred from each discontinued SKU to its corresponding similar SKUs, measuring retained vs. lost sales.
  • Collaborated with the data engineering team by supporting data extraction via web scraping, enriching the dataset with external product and pricing information.

Outcome:

  • Delivered a comprehensive demand transfer framework enabling the business to take strategic actions in assortment planning, portfolio optimization, and competitive benchmarking.
  • Empowered stakeholders with clear visibility into SKU cannibalization, brand loyalty, and competitive leakage, supporting data-driven decision-making.

Senior Analyst

Tesco
01.2021 - 07.2021

Project: Reporting and Tracker Development for Tesco Business Units

Objective:
Develop daily reports and seasonal trackers to support Buyers, Suppliers, and cross-functional teams within Tesco for informed sales and promotion decisions.

Approach:

  • Worked on the end-to-end creation of custom reports and dashboards for various Tesco business units.
  • Utilized Excel and SQL (Teradata) to extract, transform, and generate actionable insights tailored for different stakeholders.
  • Built seasonal trackers to monitor sales trends, promotion effectiveness, and product performance across categories.

Outcome:

  • Delivered timely, data-driven insights to internal teams, enabling better decisions on sales strategies, stock movement, and promotion planning.
  • Strengthened collaboration across departments through reliable and consistent reporting outputs.

Analytics Consultant

Bridge I2i Analytics Solutions
Bengaluru
05.2018 - 01.2021

Project 1: Performance Prediction for Liquor Manufacturing Company

Objective:

Predict the impact of various KPIs on the performance ratings of employees and BU heads for a global liquor manufacturing company.

Approach:

  • Performed data exploration followed by statistical modeling using Ordered Logit, Multinomial Logit, Logistic Regression, and Machine Learning algorithms such as Random Forest and XGBoost.
  • Created word clouds for BU heads to visualize summarized performance insights. Tools used included R and Python.

Outcome:

  • Identified efficient employees and recommended them for promotions or future roles. Enabled the client to make informed decisions regarding employee development and talent management.

Project 2: Supply Chain Optimization

Objective:

Identify key factors contributing to dispatch rate drop in FMCG supply chain management.

Approach:

  • Conducted data exploration and feature engineering, and applied statistical modeling techniques including Logistic Regression, Random Forest, and XGBoost to identify critical factors.
  • Performed anomaly-based deviation analysis and used a tree-based approach for factor importance. The entire process was executed in R.

Outcome:

  • Helped the client understand factors impacting dispatch rate drop and provided actionable insights to prevent recurrence.

Project 3: Brand Structure Analysis

Objective:

Predict and identify key factors affecting the equity variable, which influences sales for a liquor brand in a specific country for a global liquor manufacturer.

Approach:

  • Applied Factor Analysis and Path Analysis (Structural Equation Modelling, SEM) to identify significant equity variables and assess how factors affect equity and sales, both directly and indirectly. The process was executed in R.

Outcome:

  • Enabled the client to understand subjective brand aspects that impact equity and sales in a defined market environment.

Project 4: Control Group Matching and Lift Calculation for Sales Campaigns

Objective:

Identify a matching control group of stores for exposed stores and calculate lift in sales volume based on campaign and non-campaign periods.

Approach:

  • Merged exposed data with control data, identified control groups using correlation and Euclidean distance, and calculated lift across different time periods using Python.
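
A minimal sketch of how control matching and lift could be computed, not the project's actual code. It assumes each store is represented by a list of pre-campaign sales volumes, that a control candidate must first pass a correlation threshold and is then ranked by Euclidean distance, and that lift is the exposed store's growth relative to its matched control's growth. The function names, the 0.8 threshold, and the sample figures are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sales series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def euclidean(x, y):
    """Euclidean distance between two equal-length sales series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def match_control(exposed_pre, candidates_pre, min_corr=0.8):
    """Return the candidate store closest in level (Euclidean distance)
    among those whose pre-period trend correlates with the exposed store.
    The min_corr threshold is an assumed, illustrative value."""
    eligible = {
        store: series
        for store, series in candidates_pre.items()
        if pearson(exposed_pre, series) >= min_corr
    }
    if not eligible:
        return None
    return min(eligible, key=lambda s: euclidean(exposed_pre, eligible[s]))

def lift(exposed_pre, exposed_post, control_pre, control_post):
    """Relative lift: exposed growth divided by control growth, minus 1."""
    mean = lambda v: sum(v) / len(v)
    exposed_growth = mean(exposed_post) / mean(exposed_pre)
    control_growth = mean(control_post) / mean(control_pre)
    return exposed_growth / control_growth - 1

# Hypothetical weekly sales volumes for one exposed store and three candidates.
exposed_pre = [100, 110, 120, 130]
candidates_pre = {
    "C1": [98, 112, 118, 131],   # similar trend and level
    "C2": [130, 120, 110, 100],  # opposite trend
    "C3": [200, 220, 240, 260],  # same trend, very different level
}
best = match_control(exposed_pre, candidates_pre)
print(best)
```

Filtering on correlation before ranking on distance keeps stores that move with the exposed store and then prefers those of comparable size; other combinations of the two measures are equally plausible.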

Outcome:

  • Helped the client understand the control population for exposed stores and evaluate campaign effectiveness through lift analysis.

Project 5: White Space Analysis

Objective:

Identify white space opportunities to increase sales and quantify the opportunity number for different SKUs produced by the company.

Approach:

  • Identified SKUs below defined thresholds in Rate of Sale (ROS), Distribution, and Assortment. Calculated opportunity potential using three distinct algorithms. The entire process was implemented in Python.

Outcome:

  • Helped the client pinpoint SKUs with sales potential across ROS, Distribution, and Assortment, and quantified the potential gain.

Project 6: Invoice Image Data Extraction using OCR and NLP Techniques

Objective:

Extract various fields from invoice images using OCR and NLP techniques.

Approach:

  • Extracted information using Azure API, which provided bounding box coordinates for text elements in the image. Applied a rule-based algorithm to retrieve relevant fields. The process was implemented using Python.

Outcome:

  • Delivered a final data set containing all relevant invoice fields, enabling the client to use it for further business analysis.

Business Analyst

Genpact India Private Ltd
03.2017 - 05.2018

Project: Digital Campaign Impact Analysis for an FMCG Product

Objective:
Measure the impact of a digital campaign on sales lift, occasions, dollars spent per occasion, and penetration.

Approach:
Prepared exposure data for exposed households and merged POS and FSP data. Built statistical models for occasion, dollar per occasion, and penetration using Poisson Regression, Gamma Regression, and Logistic Regression. Applied ANCOVA for scoring. Tools used included Hive, R, Julia, and SAS.

Outcome:
Calculated individual lift from three models and generalized lift, and delivered consumer diagnostics to evaluate campaign effectiveness.

Analyst

Rainman Consulting Pvt Limited
11.2015 - 02.2017

Project 1: Marketing Strategy Impact Analysis

Objective:
Determine the impact of marketing strategies on sales and improve spending effectiveness through optimized resource allocation.

Approach:
Applied linear modeling and non-linear optimization techniques to estimate parameters and assess the effectiveness of different media and marketing variables.

Outcome:
Provided insights on the influence of marketing levers, identified optimal deployment levels, and recommended budget allocation for maximum impact.

Project 2: Constituency Segmentation using K-Means Clustering

Objective:
Categorize 350 constituencies based on 6 years of vote share data across 4 political parties.

Approach:
Applied K-Means Clustering to segment constituencies into Unipolar, Bipolar, Multi-Polar, and Divided-Unipolar groups based on voting patterns.
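
A minimal sketch of K-Means on vote-share vectors, written in plain Python for illustration only. It assumes each constituency is reduced to one vector of party vote shares (the project used 6 years of data, so the real feature vectors would be longer), uses a deterministic farthest-point initialisation chosen here for reproducibility, and leaves the naming of clusters (Unipolar, Bipolar, and so on) to manual inspection of the centroids. The function name and sample data are hypothetical.

```python
import math

def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable, so centroids are stable too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical vote shares for 4 parties in 6 constituencies.
vote_shares = [
    [0.70, 0.10, 0.10, 0.10],  # one dominant party
    [0.68, 0.12, 0.10, 0.10],
    [0.45, 0.45, 0.05, 0.05],  # two-party contest
    [0.44, 0.46, 0.05, 0.05],
    [0.25, 0.25, 0.25, 0.25],  # evenly split
    [0.26, 0.24, 0.25, 0.25],
]
labels, centroids = kmeans(vote_shares, k=3)
print(labels)
```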

Outcome:
Provided insights on vote share variations across constituencies over time, supporting strategic political analysis and planning.

Education

Master of Science - Statistics

University of Kalyani
West Bengal
09.2013 - 09.2015

Skills

Problem-solving

MS Office

R

Python

SQL

AWS (Basic)

Azure DevOps (Basic)

Linear Regression

Logistic Regression

Decision Tree

Random Forest

XGBoost

Gen AI (Basic)

LightGBM

FB Prophet

Additional Information

AWARDS AND RECOGNITION
Client Value Creation - Well Done!
Client Value Creation - HighFlyer!
Client Value Creation - Great Work!
Completed Workera Assessments

Timeline

Decision Science Consultant

Accenture Solution Private Limited
07.2021 - Current

Senior Analyst

Tesco
01.2021 - 07.2021

Analytics Consultant

Bridge I2i Analytics Solutions
05.2018 - 01.2021

Business Analyst

Genpact India Private Ltd
03.2017 - 05.2018

Analyst

Rainman Consulting Pvt Limited
11.2015 - 02.2017

Master of Science - Statistics

University of Kalyani
09.2013 - 09.2015