Pranjali Chaudhari — Data Scientist & AI/ML Engineer

About Me

I turn data into decisions.

I'm a final-year M.Tech Data Science student who came from a Mechanical Engineering background — which means I approach data problems like engineering problems: understand the system, find the failure point, fix it.

At DATASMITH AI Solutions, I built production data pipelines and supported ML model development on real business datasets. That experience taught me the difference between a model that scores well on paper and one that actually performs under messy, real-world conditions.

My current work spans LLM security — evaluating how adversarial attacks break instruction-tuned models and building defenses — and medical imaging AI, applying explainable deep learning to cervical cancer detection.

I'm looking for roles in AI/ML Engineering, Data Science, or Python Development where I can ship useful things, work on hard problems, and keep growing.

pranjalichaudhari0408@gmail.com

linkedin.com/in/pranjali-chaudhari

github.com/Pranjali0408

+91 77560 68761

Pune, Maharashtra, India

Certifications

Generative AI Fundamentals — Databricks

Python for Data Science & ML — Udemy

Python for Everybody — Coursera

Full Stack Web Dev Bootcamp — Udemy

Technical Skills

What I bring to the table

Languages

Python, SQL, Java, JavaScript

Python is my primary tool — from data wrangling to model training to scripting experiments.

Machine Learning

Scikit-learn, XGBoost, SVM, Feature Engineering, SMOTE, Cross-validation

Used in industry pipelines at DATASMITH and across cervical cancer risk prediction projects.

Deep Learning

PyTorch, CNN, ResNet, Transfer Learning, Grad-CAM, XAI

Built and fine-tuned CNNs for medical image classification; applied explainability techniques.

LLMs & NLP

LLMs, RAG, Prompt Engineering, HuggingFace, FLAN-T5, Transformers

Built a RAG-based domain chatbot; evaluated LLM robustness under adversarial prompt attacks.

Data & Analytics

Pandas, NumPy, EDA, Matplotlib, Power BI, SQL

End-to-end data work from raw CSV cleaning to interactive Power BI dashboards tracking KPIs.

Tools & Practices

Git, Jupyter, Google Colab, VS Code, Federated Learning, AI Safety

Comfortable in collaborative dev workflows, reproducible experiment setups, and distributed ML.

Experience

Where I've worked

Aug 2025 – Nov 2025

Full-time Internship

AI / Data Science Intern

DATASMITH AI Solutions — Pune

Built end-to-end data preprocessing and cleaning pipelines for structured real-world datasets — handled missing values, format inconsistencies, and outlier treatment to ensure reliable downstream model inputs.
Supported ML model training and evaluation workflows: cross-validation, hyperparameter search, and benchmarking classification and regression models across different business datasets.
Applied advanced feature engineering using Pandas and NumPy — polynomial features, interaction terms, and mutual-information-based selection — improving precision, recall, and F1 across tasks.
Maintained clean, version-controlled Python code in Git, wrote documentation, and delivered findings through clear visualisations and summary reports for stakeholders.

Python, Pandas, NumPy, Scikit-learn, Feature Engineering, ML Pipelines, Git
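The missing-value and outlier handling described above can be illustrated with a small sketch. This is not the internship code: median imputation and Tukey-fence clipping are assumed here as one common choice, the helper names are hypothetical, and it uses only the Python standard library rather than Pandas.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_iqr(values, k=1.5):
    """Clip values to the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Made-up sensor-style readings with two gaps and one extreme value.
raw = [12.0, None, 11.5, 13.0, 250.0, 12.5, None, 11.0]
cleaned = clip_iqr(impute_median(raw))
print(cleaned)
```

Clipping keeps the row count stable for downstream models, whereas dropping outliers would shrink the dataset; which one is appropriate depends on the dataset.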

Projects

Things I've built

2025

AI Chatbot — Bhagavad Gita & Chanakya Niti

Built a domain-specific conversational AI using Retrieval-Augmented Generation (RAG). Rather than relying on raw LLM memory — which hallucinates freely on niche knowledge — this system embeds philosophical texts into a vector store and retrieves relevant passages before generating responses. The result: accurate, grounded answers that stay true to the source material across all 18 chapters of the Gita and Chanakya's key texts. Implemented chunking strategy, similarity search tuning, and prompt templates to preserve authentic meaning in responses.

Python, LLMs, RAG, Prompt Engineering, Vector DB, HuggingFace
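The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration, not the chatbot's code: a bag-of-words count vector stands in for a learned embedding, a Python list stands in for the vector store, and the passages, prompt wording, and function names are all placeholders.

```python
import math
import re
from collections import Counter


def chunk(text, size=12, overlap=4):
    """Split text into word windows of `size` words that overlap by `overlap`."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy stand-in for an embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


PROMPT = (
    "Answer using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def build_prompt(question, passages, k=1):
    """Fill the prompt template with the retrieved context."""
    context = "\n".join(retrieve(question, passages, k))
    return PROMPT.format(context=context, question=question)


# Placeholder passages, not quotations from the source texts.
passages = [
    "Placeholder passage about performing duty without attachment to results.",
    "Placeholder passage about choosing allies and managing wealth wisely.",
]
prompt = build_prompt("What does the text say about duty and results?", passages)
print(prompt)
```

The prompt would then be passed to the language model. Grounding the answer in retrieved text is what limits hallucination on niche material.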

2026

Privacy-Preserving Crop Disease Prediction

Designed a federated learning workflow for agricultural disease prediction — each farm node trains locally, no raw data ever centralised. Compared FedAvg vs FedProx convergence under non-IID conditions, studied communication overhead, and implemented differential privacy mechanisms to quantify the accuracy–privacy trade-off.

Federated Learning, PyTorch, Differential Privacy, FedAvg, Python
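To make the FedAvg and FedProx comparison concrete, here is a minimal sketch of the two ideas on plain Python lists. It is an illustration under assumptions, not the project code: the weight vectors, sample counts, and the value of `mu` are invented, and a real workflow would operate on PyTorch tensors.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def fedprox_penalty(local_w, global_w, mu=0.01):
    """Proximal term (mu / 2) * ||w - w_global||^2 added to the local loss."""
    return 0.5 * mu * sum((l - g) ** 2 for l, g in zip(local_w, global_w))


# Three hypothetical farm nodes with different amounts of local data.
farm_weights = [[0.2, 1.0], [0.4, 0.0], [0.6, 2.0]]
farm_sizes = [100, 300, 100]

global_w = fedavg(farm_weights, farm_sizes)
print(global_w)
print(fedprox_penalty(farm_weights[2], global_w, mu=0.1))
```

Only parameter vectors leave each node, never raw data. The FedProx penalty discourages a node from drifting far from the global model, which is why it is often tried when data is non-IID.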

2025–Present

Cervical Cancer Detection — Multimodal DL

CNN-based classifier for cervical cell images (SIPaKMeD dataset) fused with clinical risk features. Applied Grad-CAM and LIME to generate interpretable saliency maps that highlight which image regions drove the model's decision — critical for medical AI adoption. Review paper accepted at ETFI 2026 (IEEE).

PyTorch, CNN, Grad-CAM, XAI, Transfer Learning

2025–2026

LLM Security — Adversarial Attack Defense

Evaluated FLAN-T5 and Phi-3 Mini under 120+ adversarial prompts. Built a 4-layer adaptive defense framework (ARML-Defense) combining lexical filtering, semantic embeddings, a weighted risk score R = αK + βS + γC, and output validation. Achieved 93.3% defense accuracy while preserving 92.3% usability, compared with 23.1% usability under hard restriction. One paper accepted (ICAIS 2026); a second is under review.

FLAN-T5, Phi-3 Mini, AI Safety, NLP, HuggingFace, Python
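The scoring formula R = αK + βS + γC can be sketched as follows. This page does not define K, S, or C, so the sketch simply assumes three component scores in [0, 1]; the weights, thresholds, and three-way decision are hypothetical values chosen for illustration and are not taken from ARML-Defense.

```python
def risk_score(k, s, c, alpha=0.4, beta=0.4, gamma=0.2):
    """Weighted sum R = alpha*K + beta*S + gamma*C, each component in [0, 1]."""
    for name, value in (("k", k), ("s", s), ("c", c)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return alpha * k + beta * s + gamma * c


def decide(r, block_at=0.7, sanitize_at=0.4):
    """Map a risk score to an action using assumed thresholds."""
    if r >= block_at:
        return "block"
    if r >= sanitize_at:
        return "sanitize"
    return "allow"


# Hypothetical component scores for two prompts.
print(decide(risk_score(0.9, 0.8, 0.5)))  # high-risk prompt
print(decide(risk_score(0.1, 0.1, 0.0)))  # benign prompt
```

A graded response like this is one way a defense can avoid refusing benign prompts, in contrast to a hard restriction that blocks anything suspicious.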

2023

Packaging Industry — Data Analysis & KPI Dashboard

Cleaned production datasets from a packaging company — resolved inconsistent formats, imputed missing values, standardised KPI definitions across departments. Built interactive Power BI dashboards tracking production throughput, defect rates, yield loss, and machine downtime patterns. Reduced manual reporting time significantly by enabling real-time visibility for operations teams. This project sharpened my ability to translate messy operational data into clean, decision-ready visuals.

Power BI, Pandas, EDA, Data Visualisation, KPI Design, Excel

Research & Publications

Published work

Accepted (Camera Ready) | Springer LNCS → Web of Science

Evaluating Security and Robustness of Instruction-Tuned LLMs Against Adversarial Prompt Attacks

ICAIS 2026 — NMIMS University, Indore

120-prompt controlled evaluation across 4 attack types on FLAN-T5-base. Lightweight lexical sanitization cut the adversarial attack success rate from 23.33% to 15.83%, a 32.14% relative reduction, without model retraining.
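The 32.14% figure is a relative reduction, not a percentage-point drop. The short check below reproduces it. The raw counts (28 and 19 successful attacks out of 120) are inferred from the reported percentages and are an assumption, not numbers quoted from the paper.

```python
TOTAL_PROMPTS = 120
baseline_successes = 28  # assumption: 28 / 120 = 23.33%
defended_successes = 19  # assumption: 19 / 120 = 15.83%

baseline_rate = baseline_successes / TOTAL_PROMPTS
defended_rate = defended_successes / TOTAL_PROMPTS

absolute_drop = baseline_rate - defended_rate  # in rate units
relative_drop = absolute_drop / baseline_rate  # share of baseline removed

print(f"{baseline_rate:.2%} -> {defended_rate:.2%}")
print(f"absolute: {absolute_drop * 100:.2f} points, relative: {relative_drop:.2%}")
```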

Decision Pending

Adaptive Risk-Aware Multi-Layer Defense Framework for Securing LLMs Against Adversarial Prompt Injection

Conference — Jamia Millia Islamia, New Delhi

ARML-Defense: 4-layer adaptive framework with R = αK + βS + γC scoring. 93.3% defense accuracy, F1 = 0.94, 7.7% false positive rate on FLAN-T5 and Phi-3 Mini. 26.6-point improvement over Hard Restriction.

Accepted & Presented | IEEE Xplore → Scopus + Web of Science

A Comprehensive Review on Multimodal and Explainable Deep Learning for Cervical Cancer Detection

ETFI 2026 — DESPU, Pune (IEEE Proceedings)

Systematic review of 20+ IEEE papers on CNN architectures, hybrid ML-DL models, Vision Transformers, and XAI (Grad-CAM, SHAP, LIME) for cervical cancer screening. Identifies key research gaps in multimodal integration and interpretability benchmarking.

ICAIS 2026 (Springer LNCS): published in Springer LNCS (Scopus indexed), with potential inclusion in Web of Science CPCI.
ETFI 2026 (IEEE): IEEE Xplore, Scopus, and Web of Science (Clarivate ESCI/SCI), subject to conference-level inclusion.
Jamia Millia (ARML-Defense): decision awaited.

Education

Academic background

2024 – Present

M.Tech in Data Science (Information Technology)

Marathwada Mitra Mandal's College of Engineering (MMCOE), Pune

CGPA: 9.56 / 10

Focus: ML, Deep Learning, LLM Security, Federated Learning, Research Methodology. Published 2 conference papers during this program.

2019 – 2023

B.Tech in Mechanical Engineering

Vishwakarma Institute of Information Technology (VIIT), Pune

CGPA: 8.51 / 10

Strong analytical and systems-thinking foundation. Transitioned to Data Science leveraging engineering problem-solving skills.

2019

HSC (Class XII)

Maharashtra State Board

84.92%

2017

SSC (Class X)

Maharashtra State Board

93.20%

2 International Papers
ICAIS 2026 (Springer) + ETFI 2026 (IEEE)

98 / 100 in Cyber Security
Academic coursework score

32.14% Attack Reduction
Published, peer-reviewed result

Contact

Open to opportunities

I'm actively looking for AI/ML Engineering, Data Science, and Python Development roles. I'm also happy to discuss internships, research collaborations, or just an interesting problem.

Drop me an email or connect on LinkedIn — I typically reply within 24 hours.

pranjalichaudhari0408@gmail.com | linkedin.com/in/pranjali-chaudhari | github.com/Pranjali0408 | +91 77560 68761
