PavanBollepalli/README.md

Hey, I'm Pavan 👋


I build AI-powered systems with production-grade backend architecture. Currently shipping RAG pipelines, hybrid vector search, and real-time market intelligence.


Portfolio LinkedIn Email


🧠 About Me

class Pavan:
    role      = "Full-Stack Developer & AI/ML Engineer"
    education = "B.Tech CS (AI & ML) - VVIT, Graduating May 2026"
    
    certifications = [
        "Google Cloud Associate Cloud Engineer",
        "AWS Certified Cloud Practitioner",
    ]
    
    achievements = [
        "1st Place - ACM Programming Contest (200+ participants)",
        "Open Source Contributor - Wikimedia Foundation",
    ]
    
    currently_building = "AI-powered GitHub Repository Analyzer"
    
    fun_fact = "I optimized a RAG pipeline from 9.6s to 0.48s and it felt better than any game win"

🚀 Flagship Project

⚡ SkillVector: AI Career Intelligence Platform

A full-stack system that generates personalized, multi-phase learning paths using RAG, hybrid vector search, and U.S. labor market data. It is not a wrapper around ChatGPT but a complete AI pipeline with measured performance.

πŸ” Technical Deep Dive (click to expand)

3-Layer RAG Retrieval Cache

L0  In-Memory (Python dict)     →  ~0ms       (role + query tuple key)
L1  pgvector Hybrid Search      →  1 DB trip  (HNSW cosine + B-tree metadata filter)
L2  Tavily Live Web Fetch       →  1-3s       (only for cache misses, parallelized)
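
The lookup order above can be sketched as a small tiered retriever. This is a minimal illustration, not SkillVector's actual code: the class name, the `search_db` and `fetch_web` callables, and the convention that a database miss returns `None` are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (role, query), mirroring the L0 key described above


class TieredRetriever:
    """Try L0 (dict), then L1 (database), then L2 (web); backfill L0 afterwards."""

    def __init__(
        self,
        search_db: Callable[[str, str], Optional[List[str]]],
        fetch_web: Callable[[str, str], List[str]],
    ) -> None:
        self._memory: Dict[Key, List[str]] = {}
        self._search_db = search_db  # hypothetical L1 lookup, returns None on a miss
        self._fetch_web = fetch_web  # hypothetical L2 live fetch

    def retrieve(self, role: str, query: str) -> Tuple[str, List[str]]:
        """Return the layer that answered and the documents it produced."""
        key = (role, query)
        if key in self._memory:
            return "L0", self._memory[key]

        layer = "L1"
        docs = self._search_db(role, query)
        if docs is None:
            # Only a miss in both cache layers pays for the slow web fetch.
            layer = "L2"
            docs = self._fetch_web(role, query)

        self._memory[key] = docs  # the next identical request is served from L0
        return layer, docs


# Stub backends so the sketch runs without a database or network access.
retriever = TieredRetriever(
    search_db=lambda role, query: None,
    fetch_web=lambda role, query: [f"web result for {query}"],
)
print(retriever.retrieve("ml engineer", "pytorch"))
print(retriever.retrieve("ml engineer", "pytorch"))
```

The first call falls through to the web stub and the second is answered from the in-memory dict, which is the difference between the cold and warm paths.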

Measured Performance:

| Scenario | RAG Latency | Total Generation |
|---|---|---|
| Cold start (0 cache hits) | 9.6s | 18.1s |
| Warm (L0 in-memory hit) | 0.48s | 10.1s |
| Improvement | 95% reduction | 44% reduction |
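
The improvement figures follow directly from the cold and warm timings. A short check, using only the numbers reported above (the helper name is an assumption):

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100.0


rag = percent_reduction(9.6, 0.48)     # RAG latency, cold vs. warm
total = percent_reduction(18.1, 10.1)  # total generation time, cold vs. warm

print(f"RAG latency:      {rag:.1f}% reduction")
print(f"Total generation: {total:.1f}% reduction")
```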

Hybrid Vector Search (pgvector)

  • 1024-dim Mistral embeddings with HNSW cosine index
  • B-tree metadata filter on target_role partitions search space per role
  • N vector lookups batched into 1 SQL query via UNION ALL
  • Language-aware cache bypass: non-English queries always fetch fresh results
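
One way to batch N vector lookups into a single round trip is to build one `UNION ALL` statement with a branch per query embedding. The sketch below only builds the SQL text and parameter list. The table name `documents`, the columns `id`, `content`, and `embedding`, and the psycopg-style `%s` placeholders are assumptions for illustration; `target_role` and the pgvector cosine-distance operator `<=>` come from the description above.

```python
from typing import List, Sequence, Tuple


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format floats in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_batched_search(
    role: str, embeddings: Sequence[Sequence[float]], top_k: int = 5
) -> Tuple[str, List[object]]:
    """Build one UNION ALL query that runs a role-filtered search per embedding."""
    if not embeddings:
        raise ValueError("need at least one embedding")

    branches: List[str] = []
    params: List[object] = []
    for idx, embedding in enumerate(embeddings):
        vector = to_vector_literal(embedding)
        # Each branch is parenthesized so it can carry its own ORDER BY and LIMIT.
        branches.append(
            "(SELECT %s::int AS query_idx, id, content, "
            "embedding <=> %s::vector AS distance "
            "FROM documents "          # hypothetical table name
            "WHERE target_role = %s "  # metadata filter narrows the search per role
            "ORDER BY embedding <=> %s::vector "
            "LIMIT %s)"
        )
        params.extend([idx, vector, role, vector, top_k])

    return "\nUNION ALL\n".join(branches), params


sql, params = build_batched_search("data engineer", [[0.1, 0.2], [0.3, 0.4]], top_k=3)
print(sql)
print(params)
```

The `query_idx` column lets the caller split the combined result set back into one list per original query.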

Concurrency & Safety

  • pg_try_advisory_xact_lock(user_id) prevents duplicate generation when the same user sends concurrent requests
  • Locking is per user, so 100 different users can generate paths fully in parallel
  • Test answers stay server-side: correct MCQ answers are never sent to the frontend
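
The per-user guard can be sketched with PostgreSQL's `pg_try_advisory_xact_lock`, which returns immediately with `true` or `false` and releases the lock automatically when the transaction ends. The function name `generate_once`, the `generate` callback, and the stub cursor are assumptions for this example; a real caller would pass a DB-API cursor that is inside an open transaction.

```python
from typing import Any, Callable, Optional


def generate_once(cursor: Any, user_id: int, generate: Callable[[], str]) -> Optional[str]:
    """Run `generate` only if no other transaction holds this user's lock."""
    cursor.execute("SELECT pg_try_advisory_xact_lock(%s)", (user_id,))
    (acquired,) = cursor.fetchone()
    if not acquired:
        # Another request for the same user is already generating; skip this one.
        return None
    # The lock is held until the surrounding transaction commits or rolls back.
    return generate()


class StubCursor:
    """Stand-in for a DB-API cursor so the sketch runs without a database."""

    def __init__(self, lock_free: bool) -> None:
        self._lock_free = lock_free

    def execute(self, sql: str, params: tuple) -> None:
        self.last_call = (sql, params)

    def fetchone(self) -> tuple:
        return (self._lock_free,)


print(generate_once(StubCursor(True), 42, lambda: "new learning path"))
print(generate_once(StubCursor(False), 42, lambda: "new learning path"))
```

Because the lock key is the user ID, requests from different users never contend with each other; only a second request from the same user is turned away.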

O*NET Market Intelligence

  • Fuzzy matches roles to SOC codes across 900+ occupations
  • Extracts Hot Technology skills, knowledge domains, work activities
  • LLM fallback for modern roles absent from O*NET
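
Role-to-SOC matching with a fallback can be approximated using the standard library's `difflib`. This is a sketch under stated assumptions: the three-entry lookup table is a tiny illustrative sample rather than the O*NET dataset, the 0.8 cutoff is arbitrary, and the `"llm_fallback"` marker only signals where a caller would hand the role to an LLM.

```python
import difflib
from typing import Dict, Optional, Tuple

# Tiny illustrative sample of occupation titles and SOC codes (not the full dataset).
SAMPLE_OCCUPATIONS: Dict[str, str] = {
    "software developers": "15-1252",
    "data scientists": "15-2051",
    "information security analysts": "15-1212",
}


def match_role(
    role: str,
    occupations: Dict[str, str] = SAMPLE_OCCUPATIONS,
    cutoff: float = 0.8,
) -> Tuple[str, Optional[str], str]:
    """Return (title, soc_code, source) for the closest occupation title.

    `source` is "onet" for a fuzzy match and "llm_fallback" when nothing in the
    table is similar enough, in which case `soc_code` is None.
    """
    normalized = role.strip().lower()
    candidates = difflib.get_close_matches(
        normalized, list(occupations), n=1, cutoff=cutoff
    )
    if candidates:
        title = candidates[0]
        return title, occupations[title], "onet"
    # No close title: the caller would route this role to the LLM path instead.
    return normalized, None, "llm_fallback"


print(match_role("Software Developer"))
print(match_role("Prompt Engineer"))
```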

📂 Other Projects

| Project | What it does | Stack |
|---|---|---|
| NextVentures | Startup discovery platform with SSR, GitHub OAuth, and real-time CMS sync | Next.js 14, TypeScript, Sanity, NextAuth |
| HandShake | Real-time ASL gesture recognition using a CNN with MediaPipe hand landmark detection | TensorFlow, OpenCV, MediaPipe, Flask |
| PrepWise | AI interview prep with real-time speech analysis and feedback | Node.js, FastAPI, React, Socket.io |

💼 Experience

Open Source Contributor, Wikimedia Foundation

Jul 2024 – Sep 2025 · Remote

  • Built community insights dashboard (Python + Streamlit + MySQL) surfacing editor contribution trends
  • Reduced SQL query execution time ~25% via composite indexing on replicated production datasets
  • Authored reusable Python query modules with unit tests, eliminating ad-hoc analysis scripts
  • Contributions merged after international code reviews under Wikimedia engineering standards

🛠️ Tech Stack

Languages

Python TypeScript JavaScript Java SQL

Backend

FastAPI Node.js Express SQLAlchemy

Frontend

Next.js React Tailwind Three.js

Data & AI

PostgreSQL MySQL MongoDB TensorFlow

Cloud & DevOps

GCP AWS Docker Linux Git


πŸ† Certifications & Recognition

GCP AWS

🥇 1st Place, ACM Programming Contest (200+ participants)



Building things that work at scale, not just things that compile.
