
Tries-Based Dictionary: AI-Powered Educational Story Generation

Kotlin Android Python TensorFlow

A research project combining Compressed Trie data structures, BERT-based semantic analysis, and Google Gemini AI to create an intelligent educational story generation system for elementary school children learning English.


🎯 Project Overview

This repository contains a comprehensive educational platform that generates contextually relevant, fill-in-the-blank stories for children aged 6-10. The system uniquely combines three key technologies:

  1. Compressed Tries - Efficient storage and retrieval of 5000+ vocabulary words
  2. BERT Neural Networks - Semantic word ranking for contextual relevance
  3. Google Gemini AI - Dynamic story generation with educational content

The platform includes both an Android mobile application and a command-line backend prototype that demonstrates the core story generation architecture.

πŸ—οΈ Backend Story Generation Architecture

System Overview

graph TB
    A[Story Context] --> B[Compressed Trie]
    B --> C[Unused Words Filter]
    C --> D[BERT Model]
    D --> E[Top Relevant Words]
    E --> F[Gemini API]
    F --> G[Generated Story Template]
    G --> H[Word Placement Algorithm]
    H --> I[Final Story with Blanks]

Core Architecture Components

1. Compressed Trie Implementation (compressed_tries.kt)

  • Memory Efficient: Stores 5K+ words with shared prefixes
  • Fast Retrieval: O(m) search complexity where m = word length
  • Auto-completion: Levenshtein distance-based suggestions
  • Multiple Definitions: Supports words with multiple meanings
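The ideas above can be sketched in Python. This radix-trie sketch is illustrative only (the repository's actual implementation lives in compressed_tries.kt and is written in Kotlin); the Levenshtein helper stands in for the distance-based suggestion feature:

```python
class RadixTrie:
    """Compressed (radix) trie: each edge stores a whole substring,
    so chains of single-child nodes collapse into one edge."""

    def __init__(self):
        self.children = {}   # edge label -> RadixTrie
        self.is_word = False

    @staticmethod
    def _common_prefix(a, b):
        i = 0
        while i < min(len(a), len(b)) and a[i] == b[i]:
            i += 1
        return i

    def insert(self, word):
        for label, child in list(self.children.items()):
            k = self._common_prefix(label, word)
            if k == 0:
                continue
            if k == len(label):
                # whole edge matched; descend with the remainder
                if k == len(word):
                    child.is_word = True
                else:
                    child.insert(word[k:])
                return
            # edges diverge mid-label: split the edge at position k
            mid = RadixTrie()
            mid.children[label[k:]] = child
            del self.children[label]
            self.children[label[:k]] = mid
            if k == len(word):
                mid.is_word = True
            else:
                mid.insert(word[k:])
            return
        # no edge shares a prefix: add a fresh edge for the whole word
        node = RadixTrie()
        node.is_word = True
        self.children[word] = node

    def search(self, word):
        # O(m) in the word length m
        if word == "":
            return self.is_word
        for label, child in self.children.items():
            if word.startswith(label):
                return child.search(word[len(label):])
        return False

    def all_words(self, acc=""):
        out = [acc] if self.is_word else []
        for label, child in self.children.items():
            out.extend(child.all_words(acc + label))
        return out


def levenshtein(a, b):
    """Edit distance, used to rank near-miss suggestions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    trie = RadixTrie()
    for w in ["test", "team", "tea", "toast"]:
        trie.insert(w)
    print(trie.search("tea"))        # True
    print(sorted(trie.all_words()))  # ['tea', 'team', 'test', 'toast']
```

Splitting edges on insert is what keeps shared prefixes like "te" stored once, which is where the memory savings over a plain trie come from.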

2. BERT-Based Word Ranking (rank_words.py)

  • Model: bert-base-uncased for masked language modeling
  • Context-Aware: Analyzes story context to predict relevant words
  • Semantic Scoring: Uses transformer attention for word relevance
  • Duplicate Prevention: Ensures no word repetition across story levels
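As an illustration of the ranking and duplicate-prevention steps, the sketch below turns per-word scores into probabilities and drops already-used words. The logits dict here is made-up stand-in data; in rank_words.py the scores would come from a BERT masked-language-model forward pass at the [MASK] position:

```python
import math

def rank_candidates(logits, candidates, used_words, top_k=3):
    """Rank candidate words by softmax probability, skipping words
    already used in earlier story levels.

    logits: dict mapping word -> raw score (stand-in for BERT output)
    """
    # numerically stable softmax over the candidate scores
    m = max(logits[w] for w in candidates)
    exp = {w: math.exp(logits[w] - m) for w in candidates}
    z = sum(exp.values())
    probs = {w: exp[w] / z for w in candidates}
    # duplicate prevention: never re-offer a word from a previous level
    fresh = [w for w in candidates if w not in used_words]
    return sorted(fresh, key=lambda w: probs[w], reverse=True)[:top_k]


if __name__ == "__main__":
    scores = {"dog": 5.0, "cat": 4.0, "car": 1.0}
    print(rank_candidates(scores, ["dog", "cat", "car"],
                          used_words={"dog"}, top_k=2))  # ['cat', 'car']
```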

3. Gemini AI Story Generation (testing.main.kts)

  • Model: gemini-2.0-flash-lite for natural language generation
  • Educational Focus: Prompts optimized for grade-level vocabulary
  • Contextual Continuity: Maintains story coherence across levels
  • Controlled Output: Generates exactly 3 blanks per story segment
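Because large language models do not always honor output constraints, a guard for the "exactly 3 blanks" contract is useful. The helper below is a hypothetical sketch, not code from testing.main.kts:

```python
MASK = "[MASK]"

def count_blanks(template):
    """Number of mask placeholders in a generated story segment."""
    return template.count(MASK)

def has_expected_blanks(template, expected=3):
    """Reject (and re-prompt for) segments with the wrong blank count."""
    return count_blanks(template) == expected


if __name__ == "__main__":
    segment = "The [MASK] ran to the [MASK] and found a [MASK]."
    print(has_expected_blanks(segment))  # True
```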

Data Flow Pipeline

  1. Initialization

• Load 5K+ words from the Oxford 5000 CSV
    • Initialize BERT model and tokenizer
    • Set up Gemini API connection
  2. Story Level Generation

    fun playNextLevel() {
        val unusedWords = getAllWordsFromTrie() - usedWords
        val story = generateDynamicStory(currentContext, unusedWords)
        displayStory(story)
    }
  3. Dynamic Story Creation Process

    • Context Analysis: Current story context passed to BERT
    • Word Filtering: Unused words from trie filtered by relevance
    • Template Generation: Gemini creates story with [MASK] placeholders
    • Word Placement: BERT selects best words for each mask
    • Story Assembly: Final story with blanks and answers
  4. Educational Game Mechanics

    data class Story(
        val text: String,           // Complete story text
        val blankPositions: List<Int>, // Indices of words to blank out
        val answers: List<String>      // Correct answers for blanks
    )

📱 Android Application

Screenshots

Screenshots in the repository show the Home Screen, Dictionary Lookup, Word Detail, Story World, and a Story Level Example.

Features

  • Interactive Dictionary: Trie-based word lookup with auto-completion
  • Story Mode: Dynamic Story Generation
  • Educational Design: Child-friendly UI with Jetpack Compose
  • Offline Capability: Local trie storage for fast performance

Tech Stack

  • UI: Jetpack Compose
  • Database: Room with compressed trie storage
  • Architecture: MVVM with Repository pattern
  • Background: WorkManager for data processing

🚀 Getting Started

Prerequisites

# For Android App
- Android Studio Arctic Fox or newer
- Kotlin 1.9+
- Android SDK 24+

# For Backend Architecture
- Python 3.8+
- Kotlin 1.9+
- transformers library
- torch library

Backend Setup

  1. Install Python Dependencies

    pip install torch transformers numpy pandas
  2. Set Up Gemini API

    # Add your Gemini API key to testing.main.kts
    val apiKey = "your-gemini-api-key-here"
  3. Run the Story Generator

    cd architecture
kotlin testing.main.kts

Android App Setup

  1. Clone and Build

    git clone https://github.com/ahmedsilat44/Tries-Based-Dictionary.git
    cd Tries-Based-Dictionary
    ./gradlew assembleDebug
  2. Install on Device

    ./gradlew installDebug

📊 Performance Metrics

Benchmark charts in the repository compare memory usage, insert time, delete time, and search time for the trie implementations.

🔬 Research Contributions

Novel Architecture

  • Scalable Hybrid Design: Unlike most AI storytelling systems, our approach integrates classical data structures (compressed tries) with modern transformer models, enabling both memory efficiency and educationally targeted word selection.

Educational Impact

  • Contextual Learning: Words selected based on semantic relevance
  • Engagement: AI-generated content keeps stories fresh and interesting

πŸ—‚οΈ Project Structure

├── app/                          # Android application
│   ├── src/main/java/           # Kotlin source files
│   │   ├── compressed_tries.kt  # Compressed trie implementation
│   │   ├── MainActivity.kt      # Main app entry point
│   │   └── ui/                  # Compose UI components
│   └── build.gradle.kts         # Android build configuration
├── architecture/                # Backend story generation system
│   ├── testing.main.kts        # Main backend application
│   ├── rank_words.py           # BERT-based word ranking
│   ├── words.txt               # 5K word dictionary
│   └── The_Oxford_3000.txt     # Curated vocabulary list
├── trie-implement.kts          # Basic trie implementation
├── compressed-trie-implement.kts # Advanced compressed trie
└── README.md                   # This documentation

πŸ› οΈ Development Status

✅ Completed Features

  • Compressed trie implementation with 5K+ words
  • BERT-based contextual word ranking
  • Gemini AI story template generation
  • Android app with Jetpack Compose UI
  • End-to-end story generation pipeline
  • Educational game mechanics

🎯 Future Enhancements

Although active development has concluded, several potential extensions could further improve the system:

  • Output Processing: More robust parsing of Gemini API responses for [MASK] tokens
  • Model Fine-tuning: Better alignment of BERT word relevance with educational content
  • Prompt Optimization: Refining Gemini prompts for consistent story quality
  • Progressive Difficulty: Implementing intentional progression in story complexity
  • Multi-language support for international users
  • Advanced difficulty progression algorithms
  • Teacher dashboard for progress tracking
  • Voice narration and audio features
  • Multiplayer collaborative story creation
  • In-Game Story Generation: Option for learners to dynamically generate brand-new stories during gameplay
  • Additional Minigames: Introduce vocabulary puzzles, matching games, and quizzes to boost gamification and long-term engagement

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

📚 Citations

If you use this research in your work, please cite:

@misc{tries-dictionary-2025,
  title={Tries-Based Dictionary: AI-Powered Educational Story Generation},
  author={Silat, Ahmed and Zahid, Taha and Ul Hasan, Minhaj and Samad, Maria},
  year={2024},
  url={https://github.com/ahmedsilat44/Tries-Based-Dictionary}
}
