📍 Montes Claros, Minas Gerais - Brazil
📧 icaetanodiniz@gmail.com
💼 LinkedIn
📱 +55 38 988636216
Data Scientist and Machine Learning Engineer with over 4 years of experience developing scalable, data-driven solutions. Currently pursuing a Ph.D. in Data Science at PUC-Rio, focusing on scalable generative AI architectures and anomaly detection.
🎓 Academic Background:
- Passed both the IME and ITA entrance exams
- Gold medalist at the Desafio PUC-Rio Olympiad (full scholarship recipient)
- Master's in Applied Mathematics from PUC-Rio
- Bachelor's in Mathematics from PUC-Rio
🏢 Industry Experience:
- Worked on high-impact AI initiatives with Petrobras, Intel, Embraer, and Eletrobras
- Delivered solutions in: Generative AI, RAG, Computer Vision, LLMs, NLP, and Anomaly Detection
- Full ML lifecycle expertise: modeling, deployment, CI/CD pipelines, MLOps, cloud-native infrastructure
Machine Learning Staff Researcher and Engineer @ HVAR (Sep 2025 - Present)
- Research and development on LLMs, RAG, and GraphRAG solutions
- Working with Llama and Gemini for text-to-SQL applications
- Designing advanced ML solutions within Databricks
- Deep Learning: Neural Networks, CNNs, RNNs, Transformers
- Classical ML: Random Forest, XGBoost, SVM, Decision Trees, KMeans
- NLP & LLMs: Llama, Gemini, GPT, RAG, Text-to-SQL
- Computer Vision: Object Detection, Image Classification
- Anomaly Detection: Isolation Forest, Stochastic Outlier Selection (SOS), Local Outlier Factor (LOF)
- Data Processing: Pandas, NumPy, Elasticsearch
- Visualization: Matplotlib, Streamlit
- Version Control & CI/CD: Git, GitHub Actions
- Deployment: BentoML, Flask
- Other: Unity Catalog, Unix Systems
- Developed anomaly detection system for petroleum well time series
- Achieved 80%+ detection rate using SOS, Isolation Forest, and LOF
- Enabled proactive maintenance and optimized resource allocation
- Built experimental RAG framework for SQL query generation
- Leveraged Llama, Gemini, and GPT models
- Automated natural language to SQL conversion
- Implemented ML-based fraud detection system
- Achieved 4x higher accuracy compared to manual methods
- Utilized clustering, regression, and decision trees
- Built end-to-end ML pipeline for default risk prediction
- Reduced debt by 50%+ and recovered $1M in losses
- Automated data pipelines with AWS and PySpark
Ph.D. in Data Science (2023 - 2026)
Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio)
M.Sc. in Applied Mathematics (2021 - 2022)
PUC-Rio | Thesis: Random Forest for Reservoir Simulation
B.Sc. in Mathematics (2018 - 2020)
PUC-Rio
Telecommunications Engineering (2015 - 2017)
Instituto Militar de Engenharia (IME)
- Portuguese: Native
- English: Advanced
I'm always interested in collaborating on innovative ML and AI projects. Feel free to reach out!
⭐️ From igorconsulting


