I am a passionate data scientist and machine learning enthusiast pursuing a B.Tech in Mechanical Engineering at NIT Jamshedpur. I specialize in transforming complex data into actionable insights and building predictive models that drive business value.
My experience includes generative AI development, data analysis, machine learning, and natural language processing. I'm proficient in Python, SQL, and various data science libraries including Pandas, NumPy, and scikit-learn. I enjoy solving challenging problems and continuously learning new techniques in the ever-evolving field of data science and AI.
Built and deployed three AI-driven applications: a multimodal PDF page analyzer, a real-time audio transcription system, and a time-series data visualization dashboard.
Tech Stack: Python, FastAPI, PyTorch, Streamlit, NumPy, SciPy, Plotly, SQLite, PyAudio, OpenRouter API, and configparser for modular configuration.
Designed and implemented a conversational AI chatbot using LangChain and Hugging Face for PDF document retrieval and question answering. Developed a conversational QA chain that handles both quantitative and qualitative questions about employee resumes.
Used FAISS for efficient vector storage and retrieval, and integrated a conversation buffer memory. Implemented robust PDF processing and information extraction to support accurate responses.
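The retrieval step behind a chatbot like this can be illustrated with a minimal sketch. It is a brute-force stand-in for what a vector index such as FAISS does at scale, not FAISS's API and not the project's code; the function names, chunk labels, and three-dimensional vectors are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vector, chunks, k=2):
    """Return the texts of the k chunks most similar to the query.

    chunks is a list of (text, embedding_vector) pairs. Real embeddings
    would come from an embedding model; here they are made up.
    """
    ranked = sorted(
        chunks,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Hypothetical resume chunks with toy embeddings.
chunks = [
    ("skills", [1.0, 0.0, 0.0]),
    ("education", [0.0, 1.0, 0.0]),
    ("experience", [0.9, 0.1, 0.0]),
]
print(top_k([1.0, 0.0, 0.0], chunks, k=2))
```

The retrieved chunks would then be passed to the language model as context for answering the question.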
Current CGPA: 7.62
Percentage: 93.5%
Percentage: 95.0%
Here are some of my data science and machine learning projects that demonstrate my skills and expertise.
Developed a machine learning model to predict customer purchases of a Wellness Tourism package, improving marketing efficiency. Analyzed 4,888 customer records and compared 5 classification models, achieving 90.26% accuracy with a Gradient Boosting classifier.
Analyzed a dataset of 1,000 customers to calculate credit scores and segment them by creditworthiness. Used a formula that weights payment history, credit utilization, number of accounts, and education and employment status. Applied KMeans clustering to divide customers into four segments.
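The score-then-segment idea can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: the weights are hypothetical (the actual formula is not given here), every feature is assumed to be pre-scaled to [0, 1] with higher meaning more creditworthy, and a small hand-written one-dimensional k-means stands in for a library KMeans.

```python
import random

# Hypothetical weights; the real formula's weights are not stated.
WEIGHTS = {
    "payment_history": 0.35,
    "credit_utilization": 0.30,  # assumed already inverted: 1.0 = low utilization
    "num_accounts": 0.15,
    "education": 0.10,
    "employment": 0.10,
}

def credit_score(customer):
    """Weighted sum of features that are each scaled to [0, 1]."""
    return sum(WEIGHTS[name] * customer[name] for name in WEIGHTS)

def kmeans_1d(values, k, iterations=50, seed=0):
    """Lloyd's algorithm on scalar values; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)  # start from k of the data points
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest center.
        labels = [
            min(range(k), key=lambda c: abs(v - centers[c])) for v in values
        ]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return centers, labels

# Toy scores standing in for the 1,000 customers.
scores = [0.10, 0.12, 0.40, 0.42, 0.70, 0.72, 0.90, 0.95]
centers, labels = kmeans_1d(scores, k=4)
```

Each label is a segment index; sorting the centers gives an ordering from least to most creditworthy.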
Developed predictive models for the Fire Weather Index (FWI) using the Algerian forest fires dataset. Ridge regression gave the best results, with a mean absolute error of 0.564 and an R² of 0.9843 (98.43%). The models were validated with cross-validation, and careful feature selection contributed to the results.
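For readers unfamiliar with the metrics quoted here, the sketch below shows ridge regression on a single feature together with mean absolute error and R². It is a simplified illustration, not the project's code: the real models used several features, and the function names, toy data, and `alpha` values are assumptions.

```python
def fit_ridge_1d(xs, ys, alpha=1.0):
    """One-feature ridge regression with an unpenalized intercept.

    Minimizes sum((y - (slope * x + intercept))^2) + alpha * slope^2.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    intercept = y_mean - slope * x_mean
    return slope, intercept

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Fraction of variance in y_true explained by the predictions."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Toy data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_ridge_1d(xs, ys, alpha=0.0)  # alpha=0 is ordinary least squares
predictions = [slope * x + intercept for x in xs]
print(mean_absolute_error(ys, predictions), r2_score(ys, predictions))
```

With `alpha` above zero the slope shrinks, trading a little training error for stability; cross-validation is the usual way to choose `alpha`.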
Analyzed the Google Play Store dataset using Python, pandas, and NumPy, focusing on app ratings, category distributions, and user reviews. Visualized key trends with Matplotlib and Seaborn, highlighting correlations between app size, installation count, and content ratings to provide data-driven recommendations.
Developed a comprehensive production dashboard using Excel Pivot Tables to monitor and optimize manufacturing metrics across regions, product types, and management teams. Analyzed key metrics including Units Produced, Total Cost, and Production Cost Per Unit to track performance and drive cost reduction.
Performed customer segmentation analysis for an e-commerce company using Power BI. Used DAX for calculations such as total sales and average sales to identify key customer segments and optimize marketing strategies.
Besides data science, photography is one of my passions. Here's a collection of my favorite shots capturing moments and scenes that inspire me.
I'm open to data science opportunities, collaborations, and freelance projects. Feel free to reach out if you're looking for someone passionate about turning data into actionable insights.