A collection of data analysis, data science, and machine learning projects spanning public policy, finance, healthcare, and business analytics.
| Project | Description | Tools |
|---|---|---|
| 🦠 COVID-19 Data Analysis | Global COVID-19 trend analysis with EDA, predictive modeling, and an interactive Streamlit dashboard | Python, Pandas, Streamlit |
| 📈 Gold / USD Price Tracker | Automated collection of historical Gold/USD OHLC price data from Yahoo Finance, saved as Excel | Python, yfinance, Pandas |
| 📉 Telco Customer Churn Analysis | End-to-end churn prediction with EDA, machine learning, and dual dashboards in Python and R | Python, Scikit-learn, R, Shiny |
| 🌲 NY State Park Attendance Analysis | 20-year visitor trend analysis for NY State Parks with ARIMA forecasting and an interactive Shiny dashboard | R, Shiny, ggplot2, forecast |
| 🏔️ NY Licensed Guides Analysis | Geographic and activity analysis of 2,516 licensed outdoor guides in New York State | R, Shiny, plotly |
| 📊 NY Labor Market Analysis | Time series analysis of NY employment and unemployment from the 2008 financial crisis through 2019, with ARIMA forecasting | R, Shiny, forecast |
- Data layout: each project uses `data/raw/` for source data and `data/processed/` for derived outputs.
- Reports: R Markdown reports live in `reports/` and render HTML snapshots committed to the repo.
- Entrypoints: apps use `app.R` (Shiny) or `app.py` (Streamlit) at the project root.
- Dependencies: Python projects include `requirements.txt`; R projects include `requirements.R`.
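
The layout conventions above can be checked with a short script. The following is a minimal sketch, not part of the repository: the function name `check_project_layout` and the message wording are hypothetical, and it assumes every project ships a dashboard entrypoint, so a project without an app would be reported as missing one.

```python
from pathlib import Path

# Hypothetical helper for illustration; these names are not defined by the repo.
EXPECTED_DIRS = ("data/raw", "data/processed")

# Entrypoint file mapped to the dependency file expected next to it.
ENTRYPOINTS = {"app.py": "requirements.txt", "app.R": "requirements.R"}


def check_project_layout(project_dir):
    """Return a list of human-readable problems found in one project folder."""
    root = Path(project_dir)
    problems = []

    # Data layout: data/raw/ for source data, data/processed/ for derived outputs.
    for rel in EXPECTED_DIRS:
        if not (root / rel).is_dir():
            problems.append(f"missing directory: {rel}/")

    # Entrypoints: app.R (Shiny) or app.py (Streamlit) at the project root.
    found = [name for name in ENTRYPOINTS if (root / name).is_file()]
    if not found:
        problems.append("missing entrypoint: app.R or app.py")

    # Dependencies: each entrypoint found should have its matching file.
    for name in found:
        deps = ENTRYPOINTS[name]
        if not (root / deps).is_file():
            problems.append(f"missing dependency file: {deps}")

    return problems
```

Calling `check_project_layout` on a project folder returns an empty list when all of the checked conventions hold. The `reports/` directory is deliberately left out of the check, since only projects with R Markdown reports are expected to have it.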
This repository uses a mix of tools depending on the project:
- Python — Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, Streamlit, yfinance
- R — ggplot2, Shiny, plotly, tidyverse, forecast, leaflet, R Markdown
Prince Peter Yalley