I’m a Data Science graduate student at Stevens passionate about building end-to-end ML systems, data-driven products, and scalable analytics pipelines that solve real-world problems.
- 🤖 Machine Learning & Applied AI
- 🏗️ Data Engineering & Big Data (Spark, Hadoop)
- ☁️ Cloud & Databases (AWS, MongoDB, SQL)

| Project | Description | Tech Stack | Outcome |
|---|---|---|---|
| ETF Recommender System | Built a recommender engine to identify structurally similar ETFs based on holdings overlap and weight distribution. Stored the full ETF master and holdings data in MongoDB for scalable querying. Computed similarity using Jaccard and cosine metrics. | Python, Pandas, MongoDB, PyMongo, NumPy, SciPy | Identified high-similarity ETF alternatives (e.g., IYY → ITOT/IWB/ILCB), enabling portfolio diversification and risk-aligned fund substitution. |
| Automated Fruit Grading System | Designed a computer vision model to classify fruits by size, color, ripeness, and quality. Developed a real-time inference pipeline and built a user interface for use in market settings. | Python, OpenCV, TensorFlow/Keras, CNNs, Gradio | Improved grading accuracy to ~95%, helping automate manual grading and support fair pricing for farmers. Inspired by real agricultural challenges. |
| Weather Forecasting & Landslide Prediction | Built predictive models for rainfall trends and integrated environmental indicators to forecast landslide risk. Conducted feature engineering and multi-model comparison. | Python, Pandas, Scikit-Learn, Random Forest, XGBoost, GIS Datasets | Achieved high predictive accuracy and demonstrated how data-driven alerts can support disaster prevention planning. |
| Quantitative Stock Forecasting System | Developed an algorithmic forecasting system for stock price movements using time-series modeling and statistical indicators. Evaluated trend signals for trading decisions. | Python, NumPy, Pandas, yFinance API, Statsmodels, Scikit-Learn | Generated directional signals useful for backtesting trading strategies and understanding market volatility. |
| Crime Resource Allocation & Predictive Analytics (911 NYPD Dataset) | Built a predictive pipeline to forecast emergency call hotspots and optimized patrol allocation for response efficiency. Conducted spatiotemporal analysis & visualization. | Python, R, ggplot2, dplyr, Pandas, Jupyter | Highlighted crime density patterns and demonstrated how data-driven deployment improves urban safety resource management. |
| Financial Stress & Default Risk Predictor (Satellite + ESG Data) | Integrated satellite imagery + ESG features + financial indicators to estimate corporate credit deterioration risk. Applied feature encoding, dimensionality reduction, and gradient boosting models. | Python, XGBoost, Pandas, Scikit-Learn, Geospatial Imagery, ESG Data APIs | Built a multimodal model capable of improving early-warning risk detection for investment due diligence. |
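
The ETF Recommender row mentions Jaccard and cosine similarity over holdings. As a rough illustration of those two metrics only (not the project's actual code), here is a minimal sketch. It assumes each fund is represented as a dict mapping ticker to portfolio weight; the function names, fund names, tickers, and weights are all hypothetical.

```python
import numpy as np


def jaccard_similarity(holdings_a, holdings_b):
    """Tickers held by both funds divided by tickers held by either fund."""
    a, b = set(holdings_a), set(holdings_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_similarity(weights_a, weights_b):
    """Cosine of the angle between two weight vectors over the combined ticker set."""
    tickers = sorted(set(weights_a) | set(weights_b))
    # A ticker missing from a fund contributes a weight of 0.0.
    va = np.array([weights_a.get(t, 0.0) for t in tickers])
    vb = np.array([weights_b.get(t, 0.0) for t in tickers])
    denom = np.linalg.norm(va) * np.linalg.norm(vb)
    return float(va @ vb / denom) if denom else 0.0


# Hypothetical toy holdings (ticker -> weight); not real ETF data.
fund_x = {"AAA": 0.5, "BBB": 0.3, "CCC": 0.2}
fund_y = {"AAA": 0.4, "BBB": 0.4, "DDD": 0.2}

print(jaccard_similarity(fund_x, fund_y))  # 2 shared tickers / 4 total = 0.5
print(round(cosine_similarity(fund_x, fund_y), 3))
```

Jaccard looks only at which tickers overlap, while cosine also accounts for how heavily each ticker is weighted, so the two can rank the same pair of funds differently.
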

Using data & intelligence to create real impact, not just models.