Soumyadip Chanda +91-7029965995
Data Science linkedin: Soumyadip’s Profile
NSHM Knowledge Campus, Durgapur github: Soumyadip004
Summary
As a budding data science enthusiast, I am fascinated by the potential of data to reveal valuable insights,
and I am dedicated to expanding my expertise through continuous education and exploration of emerging
trends. An effective communicator, strategic thinker, and collaborative problem-solver, I am passionate about
staying at the forefront of technology and seek to leverage my skills and experience to make a meaningful
impact in the data science industry.
Education
Degree | Institute | Board / University | CGPA/Percentage | Year
B.Tech in Data Science | NSHM Knowledge Campus, Durgapur | MAKAUT | 8.6 (Till 7th Sem) | 2024-2025
Senior Secondary | Kalyani Public School | CBSE | 74% | 2021
Matriculation | Stratford Day School | ICSE | 80% | 2019
Experience
• Cognifyz Technologies April 2024 - May 2024
Data Science Intern Virtual
◦ Built a regression model to predict the aggregate ratings of restaurants from correlated features
in an online food-ordering system. The project targets highly scalable, high-throughput applications that help users make
decisions.
◦ Operational Mechanisms: Machine Learning | Data Visualizations | Statistical Analysis | Model Deployment |
MS-Excel | Model Evaluation.
• Prodigy Infotech July 2024 - August 2024
Data Science Intern Virtual
◦ Analyzed sentiment patterns in social media data to understand public opinion and attitudes towards specific
topics or brands. The model tokenizes user text and predicts the sentiment of reviews for a particular
product.
◦ Operational Mechanisms: NLP | Data Visualizations | Market Basket Analysis | Statistical Analysis | Machine
Learning | Data Mining.
Projects
• Multi-Disease Predictor:
Developed a machine learning model to predict liver disease, Parkinson’s, breast cancer, heart disease, and kidney disease. [§]
◦ Developed a machine learning model capable of predicting 5+ diseases with 85–95% accuracy.
◦ Optimized feature selection and data preprocessing, reducing false positives by 15%. Built an interactive web
application using Streamlit.
• MCQ Generator:
An NLP-based system to automatically generate multiple-choice questions from text input. [§]
◦ Implemented deep learning and NLP techniques (TF-IDF, T5, spaCy) to extract key concepts and generate
contextually relevant questions. Improved question generation accuracy by 20%, reducing errors in distractor
selection.
◦ Built a web-based interface using Flask, improving usability and accessibility.
• Deepfake Detector:
A deep learning model to detect AI-generated deepfake images with an accuracy of 90%. [§]
◦ Implemented transfer learning using ImageNet-based architectures (EfficientNet, Xception, ResNet) for feature
extraction. Integrated confidence score for each prediction.
◦ Utilized TensorFlow, Keras, OpenCV, and Streamlit for model deployment. Optimized pre-processing, reducing
false positives by 15%.
• LLM Vision-Based Q&A System:
Built an AI system with LLaMA 3.2/3.1 Vision, delivering 90%+ accurate context-aware responses. [§]
◦ Implemented image-to-text interaction, processing 100+ image types with an average inference time of <2 seconds.
◦ Optimized API response time, reducing latency by 30% and improving request handling efficiency.
Project Portfolio
• 25+ Data Science Projects
◦ A curated collection of 25+ projects across Machine Learning, Deep Learning, NLP, and Computer Vision, featuring
predictive modeling, image classification, sentiment analysis, and recommender systems.
◦ GitHub Repository: [Click Here]
Business Intelligence Dashboards
• 15+ Interactive Dashboards
◦ Developed interactive dashboards using Power BI to visualize complex datasets, enabling data-driven
decision-making. Projects include financial analytics, healthcare insights, and machine learning model
interpretability.
◦ GitHub Repository: [Click Here]
Technical Feats
• Programming Languages: Python (Advanced) | C (Intermediate) | SQL | Java (Intermediate)
• Technologies: Flask | HTML | MongoDB | Streamlit | Power BI | NLP | LangChain | Hugging Face | Transformers
• ML Workflow: Pandas | OpenCV | NumPy | Matplotlib | Seaborn | Plotly | scikit-learn | Mlxtend | TensorFlow |
Web Scraping (Beautiful Soup)
• Developer Tools: Git & GitHub | Docker | VS Code | Jupyter Notebook | Google Colab | PyCharm
• Operating Systems: Windows | Linux (Fedora)
Competency Covers
• EDA • Feature Engineering • Transfer Learning • Transformer Models (BERT, GPT, LLaMA)
• Deep Learning • Data Mining • Large Language Models (LLMs)
• Statistical Modeling • Data Visualization
• DBMS • Computer Vision
• Model Deployment
Certifications
• Database Management System: NPTEL (IIT Kharagpur)
• Effective Writing: NPTEL (IIT Roorkee)
• Data Science with Python: SkillUp by Simplilearn
• Enterprise Data Science: IBM
• Data Visualisation: Empowering Business with Effective Insights: TATA
• Introduction to Generative AI Studio: Google Cloud (SkillUp)
Achievements
• Campus Ambassador for Skolar (2022): Served as a campus ambassador at our college, raising awareness of many
lucrative courses and enrolling a large number of students in them.
• Documentation on Secondary-level Virtual Library System (VLS): The SRS describes the
requirements involved in developing a Virtual Library System (VLS).
Research Publications
• Comparative Analysis of Machine Learning Algorithms for Liver Disease Prediction: Accepted in Springer | Presented
at Analytics Global Conference 2025
◦ Conducted a comprehensive evaluation of multiple ML models, analyzing their predictive performance on liver disease
diagnosis. The study optimized feature selection, hyperparameter tuning, and model interpretability to enhance
classification accuracy.
Declaration
I affirm that all information provided in this resume is true, accurate, and complete to the best of my knowledge. I
pledge that every detail provided reflects my true qualifications, experiences, and achievements. I understand the
importance of honesty and integrity in the recruitment process and commit to upholding these values.