Tendai TendaiShoko

Hey, I'm Tendai 👋

Senior Data Scientist • AI/ML Engineer • MLSecOps

Building intelligent systems that solve real-world problems.

🤖 AI & ML Expertise

Core Competencies

Deep Learning — CNNs, Transformers, Vision Transformers (ViT), attention mechanisms
Computer Vision — Object detection, target re-identification, image segmentation, feature extraction
Natural Language Processing — Text classification, sentiment analysis, named entity recognition, embeddings
Generative AI — LLM fine-tuning, RAG architectures, prompt engineering, multi-modal models
Foundation Models — CLIP, ALIGN, GPT, BERT; model fusion and adaptation
Knowledge Distillation — Model compression, teacher-student architectures, edge deployment
MLSecOps — Secure ML pipelines, model monitoring, drift detection, responsible AI

Techniques & Methods

Mixture-of-Experts (MoE) architectures
Contrastive learning & triplet loss
Transfer learning & domain adaptation
Hyperparameter optimization (Optuna, Ray Tune)
Model interpretability (SHAP, LIME, Grad-CAM)
A/B testing for ML models

🔧 Tech Stack

ML & Deep Learning

Generative AI & LLMs

Cloud & MLOps

Data & Databases

CI/CD & Tools

🚀 Key Projects

MoE-KD: Foundation Model Fusion for Real-Time Re-Identification

Developed a novel Mixture-of-Experts framework that dynamically fuses the CLIP and ALIGN foundation models, then distills their knowledge into a compact student network for edge deployment. Achieved a 50% reduction in inference time while maintaining competitive accuracy on the VeRi-776 (63.5% mAP) and Market-1501 (76.1% mAP) benchmarks.

PyTorch CLIP ALIGN Knowledge Distillation Computer Vision
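
For readers new to the idea, here is a minimal, framework-free sketch of the two ingredients named above: a softmax gate that mixes expert embeddings, and a temperature-softened distillation loss. It is an illustration only, not the project's code. The function names, the gate logits, and the temperature of 2.0 are all assumptions made for this example.

```python
import math

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_experts(expert_embeddings, gate_logits):
    """Weighted sum of expert embeddings; weights come from a softmax gate.

    expert_embeddings: one embedding (list of floats) per expert, same length.
    gate_logits: one score per expert (hypothetical output of a gating network).
    """
    weights = softmax(gate_logits)
    dim = len(expert_embeddings[0])
    return [
        sum(w * emb[i] for w, emb in zip(weights, expert_embeddings))
        for i in range(dim)
    ]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

In a real system the gate logits would be produced per input by a small learned network, and the distillation term would typically be combined with a task loss such as cross-entropy or triplet loss.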

Multi-Camera Target Re-Identification System

Built an end-to-end re-ID pipeline for matching targets across non-overlapping camera networks. Implemented triplet loss with hard negative mining, cross-camera domain adaptation, and real-time inference optimization for surveillance applications processing 6,000+ frames/second.

PyTorch OpenCV CUDA TensorRT Docker
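
As a rough illustration of triplet loss with hard negative mining, the sketch below uses the common "batch-hard" variant: for each anchor it picks the farthest same-identity sample and the closest different-identity sample within the batch. It is written in plain Python for readability and is not the production implementation; the function names and the margin of 0.3 are assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Mean over anchors of max(0, margin + hardest_pos - hardest_neg).

    hardest_pos: largest distance to a sample with the same label.
    hardest_neg: smallest distance to a sample with a different label.
    Anchors without at least one positive and one negative are skipped.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        positives = [
            euclidean(anchor, other)
            for j, (other, other_label) in enumerate(zip(embeddings, labels))
            if other_label == label and j != i
        ]
        negatives = [
            euclidean(anchor, other)
            for other, other_label in zip(embeddings, labels)
            if other_label != label
        ]
        if not positives or not negatives:
            continue
        losses.append(max(0.0, margin + max(positives) - min(negatives)))
    return sum(losses) / len(losses) if losses else 0.0
```

When identities are already well separated, every term is clipped to zero and the loss vanishes; it only grows when some negative sits closer to the anchor than its hardest positive plus the margin.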

Large-Scale Social Sentiment Analysis Pipeline

Engineered a distributed NLP pipeline processing millions of tweets for immigration sentiment analysis in South Africa. Implemented custom BERT fine-tuning, multi-label classification, and temporal trend analysis. Published in IEEE ICTAS 2024.

Transformers BERT NLP Spark AWS

Generative AI Document Intelligence System

Designed a RAG-based system for enterprise document Q&A with multi-format ingestion (PDF, DOCX, images), hybrid search (dense + sparse retrieval), and hallucination mitigation. Deployed on Azure with autoscaling to handle 10K+ daily queries.

LangChain Azure OpenAI Pinecone FastAPI Kubernetes
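
One common way to combine dense and sparse retrieval results is reciprocal rank fusion (RRF). The sketch below shows that technique as an assumption about how hybrid search could be wired up; the description above does not say which fusion method the system uses. The function name, the document IDs, and the constant k=60 are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results from a dense (vector) and a sparse (keyword) retriever.
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

RRF needs only ranks, not raw scores, so it sidesteps the problem of calibrating cosine similarities against keyword-match scores.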

Real-Time ML Fraud Detection Platform

Architected a streaming ML pipeline for transaction fraud detection processing 50K+ events/second. Implemented online learning with concept drift detection, feature stores, and model versioning with automated retraining triggers.

Kafka Flink SageMaker Feature Store MLflow
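
To make "concept drift detection" concrete, here is a small sketch of the Page-Hinkley test, a classic detector for an upward shift in the mean of a stream (for example, a model's running error rate). It is offered as one possible approach, not as the detector the platform uses; the class name and the default delta and threshold values are assumptions.

```python
class PageHinkley:
    """Page-Hinkley test for detecting an increase in a stream's mean."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated drift magnitude per observation
        self.threshold = threshold  # alarm level for the test statistic
        self.count = 0
        self.mean = 0.0
        self.cumulative = 0.0       # running sum of (x - mean - delta)
        self.minimum = 0.0          # smallest cumulative value seen so far

    def update(self, value):
        """Consume one observation; return True if drift is signalled."""
        self.count += 1
        self.mean += (value - self.mean) / self.count
        self.cumulative += value - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold

def first_alarm(stream, **kwargs):
    """Index of the first drift alarm in the stream, or None if none fires."""
    detector = PageHinkley(**kwargs)
    for index, value in enumerate(stream):
        if detector.update(value):
            return index
    return None
```

In a streaming pipeline, an alarm like this could serve as the trigger for retraining; the threshold trades detection delay against false alarms.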

Automated MLOps Pipeline with Security Controls

Built a complete MLSecOps framework including automated model training, vulnerability scanning, bias detection, model signing, and secure deployment. Integrated with CI/CD for continuous model delivery with governance controls.

GitHub Actions Docker Kubernetes SageMaker Trivy

📚 Publications

Enhancing Target Re-Identification via Model Fusion and Knowledge Distillation of Pre-trained Foundation Models
SACAIR 2025 — Novel MoE-KD framework for efficient real-time re-identification using foundation models.

Analyzing the Perception of Immigrants in South Africa: A Machine Learning Approach to Aggregate Twitter Sentiment Data
IEEE ICTAS 2024

🎓 Education & Certifications

MSc Artificial Intelligence — University of Johannesburg

📫 Connect

Always learning. Always building.
