Data Scientist with 5+ years delivering end-to-end solutions for retail and adjacent industries. I cover the full stack, from data engineering/ETL through modeling (ML/DL/NLP) and validation to production/MLOps, primarily in Python and SQL. Strong track record of building scalable pipelines (BigQuery, Databricks, Spark) and running cloud-native deployments (Google Cloud, Azure, AWS), with technical metrics tied to business outcomes.
🔎 NLP at scale: customer review mining, sentiment/topic classification, and LLM workflows (prompting/embeddings) for actionable insights.
☁️ Cloud & Production: data orchestration, experiment tracking, and model deployment on Vertex AI and Azure ML.
📈 Impact-driven analytics: predictive modeling (churn, recommendation, time series) and executive dashboards (Power BI, Qlik Sense, Looker Studio) for decision-making.
Data Scientist (5+ yrs): NLP, forecasting & recommenders in production; Python/SQL, BigQuery/Databricks/Spark, Vertex AI/Azure ML; business-impact focus.
Programming Languages
- Python, R, SQL
Data & ETL
- Pandas, NumPy, PySpark, Spark, Hadoop
- SQL Server, BigQuery, Databricks
Statistics & Mathematics
- Descriptive & inferential stats, probability, regression, hypothesis testing
Machine Learning
- Regression, Classification, Clustering (K-Means, DBSCAN, Hierarchical)
- Ensemble methods: Random Forest, XGBoost, LightGBM, CatBoost
- Feature engineering & selection, dimensionality reduction (PCA, t-SNE)
- Recommenders, anomaly detection, hyperparameter optimization (Optuna, Bayesian optimization)
Time Series
- ARIMA/SARIMA, Prophet
- Deep Learning: LSTM, GRU
Deep Learning
- MLPs, CNNs, RNNs
- GANs (DCGAN, Pix2Pix, CycleGAN, DeOldify)
- Transfer Learning (VGG, ResNet, MobileNet, EfficientNet)
NLP & Transformers
- Preprocessing (tokenization, lemmatization, n-grams, embeddings)
- BERT (sequence classification), BART (summarization/generation)
- ChemBERTa (chemical/molecular data, SMILES strings)
- Sentiment analysis, classification, summarization, NER
LLMs (Large Language Models)
- Fine-tuning, Prompt Engineering, RAG
- Embedding-based classifiers, LangChain, LlamaIndex, Gemini, LLaMA
Cloud & MLOps
- Google Cloud (BigQuery, Vertex AI), Azure ML, AWS SageMaker
- Model deployment, CI/CD, monitoring, pipelines (Airflow, Prefect, Kubeflow)
BI & Visualization
- Power BI, Qlik Sense, Looker Studio, Excel
- Matplotlib, Seaborn, Plotly
- CNN Car Damage Severity (MobileNetV2 + YOLOv8) – severity classification and damage detection.
Repo: https://github.com/RafaelGallo/CNN_Car_Damage_Severity_MobileNetV2_YoloV8
Deep Learning · CV · Transfer Learning · Detection & Classification
- Bank Customer Churn – ML Pipeline – churn prediction with feature engineering & validation.
Repo: https://github.com/RafaelGallo/ML_Bank-Customer-Churn
Classical ML · Classification · sklearn Pipeline
- Auto Insurance Claims – Perceptron/MLP – supervised modeling for claims/propensity.
Repo: https://github.com/RafaelGallo/Auto-Insurance-Claims-Neural-Networks-Perceptron-MLP
Neural Nets · MLP · Evaluation
- DeOldify – GAN Colorization – colorizing B/W images with DeOldify.
Repo: https://github.com/RafaelGallo/GAN_Generative-Adversarial-Network_DeOldify
GAN · Computer Vision · Inference
- Plant Disease Recognition – CNN (MobileNetV2/VGG16) – leaf disease classification.
Repo: https://github.com/RafaelGallo/Plant-disease-recognition---CNN-MobileNetV2-VGG16
Deep Learning · Transfer Learning · Images
- Breast Tumor Cell Nuclei – CNN Segmentation – nuclei/cell segmentation in pathology.
Repo: https://github.com/RafaelGallo/Breast-Tumor-Cell-Nuclei-Convolutional-Neural-Network-Segmentation
Segmentation · CNN · Healthcare
- Skin Cancer – U-Net VGG16 – skin lesion segmentation with U-Net (VGG16 encoder).
Repo: https://github.com/RafaelGallo/Convolutional-Neural-Network-Skin-Cancer-U-Net-VGG16
U-Net · Transfer Learning · Medical Imaging
- LLM Prompt Engineering – Llama 3 (Sentiment/Climate) – prompt design and sentiment analysis.
Repo: https://github.com/RafaelGallo/LLM_Engineering_prompt_LLama3_Sentiment_analysis_climate
LLM · Prompting · NLP
- Kaggle Projects
Work hosted in Kaggle_comp: notebooks and production pipelines covering EDA, feature engineering, modeling, and submission for real-world challenges.
Python · scikit-learn · Ensemble models · EDA · visualization · documented kernels
- FIAP Cognitive Environments
Hands-on work with cognitive environments, agents, computer vision and NLP.
Python · CV · NLP · Heuristics
Repo: https://github.com/RafaelGallo/FIAP_Cognitive_Environments
- Deep Learning Neural Networks – FIAP MBA
Deep Learning models (CNNs, LSTM/GRU), tuning, callbacks, metrics and best practices.
TensorFlow/Keras · PyTorch · Regularization · EDA
Repo: https://github.com/RafaelGallo/DeepLearning_Rede_Neural_FIAP_MBA
- Trading Bot – LLM, Q-Learning and RNN
Trading agent with LLM (prompting) + Q-Learning + RNN, plus backtesting.
Reinforcement Learning · Time Series · Finance · Backtest
Repo: https://github.com/RafaelGallo/robo_trader_LLM_Qlearning_RNN
- Portfolio – FIAP MBA Data Science
MBA portfolio (EDA, classical ML, visualization and reports).
Pandas · scikit-learn · BI · Storytelling
Repo: https://github.com/RafaelGallo/portiforio_FIAP_MBA_DataScience
- Practical Projects – Data Science Academy
A collection of mini-projects from the Data Science Academy training: big data with R and Azure ML, Python/Spark, Machine Learning, Business Analytics, visualization, and data engineering (Hadoop/Spark).
Tasks include churn analysis, recommender systems, fraud detection, sentiment analysis, time series forecasting, and dashboards.
Repo: https://github.com/RafaelGallo/Projetos_dsa
- Deep Learning Specialization (DeepLearning.AI – Coursera)
Hands-on implementations across five modules, covering everything from basic neural networks to CNNs, Seq2Seq, LSTMs/GRUs, and optimization strategies.
Python · TensorFlow/Keras · Backpropagation · CNN · RNN · Regularization
Repo: https://github.com/RafaelGallo/Deep-Learning-Specailization-DeepLearningAI-Coursera
- LinkedIn: https://www.linkedin.com/in/SEU_USUARIO
- E-mail: [email protected]
- GitHub: https://github.com/RafaelGallo