j0rGeT/fast3d

Fast3D - Multimodal Text/Image-to-3D Generation System

A Python system for generating 3D models from text descriptions, images, or both. It combines neural networks, computer-vision techniques, and procedural geometry algorithms.

Features

Core Capabilities

  • 🔤 Text-to-3D: Generate 3D models from natural language descriptions using neural networks
  • 🖼️ Image-to-3D: Convert 2D images to 3D models using depth estimation and computer vision
  • 🔄 Multimodal Generation: Combine text and images for enhanced 3D model creation
  • 🎨 Multiple Algorithms: Neural, procedural, and hybrid generation approaches

Technical Features

  • Neural Networks: CLIP text encoding, NeRF-based 3D representation, diffusion models
  • Computer Vision: Depth estimation, stereo vision, photogrammetry, point cloud processing
  • Advanced Processing: Attention-based feature fusion, semantic understanding, material properties
  • Export Formats: OBJ, STL, PLY, GLB with materials and textures
  • Quality Control: Post-processing, mesh optimization, and validation

Installation

pip install -r requirements.txt

Dependencies

Core Dependencies:

  • numpy >= 1.21.0
  • trimesh >= 3.15.0
  • open3d >= 0.16.0
  • torch >= 1.12.0
  • torchvision >= 0.13.0
  • scipy >= 1.9.0

Machine Learning:

  • transformers >= 4.20.0
  • diffusers >= 0.21.0
  • sentence-transformers >= 2.2.0
  • scikit-learn >= 1.1.0

Computer Vision:

  • opencv-python >= 4.5.0
  • Pillow >= 8.3.0

3D Processing:

  • mcubes >= 0.1.0
  • point-cloud-utils >= 0.29.0

Additional:

  • matplotlib >= 3.5.0
  • noise >= 1.2.2

Quick Start

Unified Multimodal Interface

from multimodal_3d_generator import MultiModal3DGenerator

# Create multimodal generator
generator = MultiModal3DGenerator()

# Generate from text
mesh, metadata = generator.generate_from_text(
    "a futuristic robot with metallic armor",
    output_path="robot.obj"
)

# Generate from image
mesh, metadata = generator.generate_from_image(
    "path/to/image.jpg",
    output_path="from_image.obj"
)

# Generate from both text and image
mesh, metadata = generator.generate_from_multimodal(
    "a blue metallic sculpture",
    "path/to/reference.jpg",
    output_path="multimodal.obj"
)

Text-to-3D Generation

from neural_text_to_3d import NeuralTextTo3D

# Neural generation
neural_gen = NeuralTextTo3D()
mesh = neural_gen.generate_from_text("a crystal castle with towers")

Image-to-3D Generation

from image_to_3d import ImageTo3DConverter

# Image-to-3D conversion
img_gen = ImageTo3DConverter()

# From single image
mesh = img_gen.single_image_to_3d("photo.jpg", method="depth_estimation")

# From stereo pair
mesh = img_gen.stereo_images_to_3d("left.jpg", "right.jpg")

Web Application

# Start web server
./start.sh dev

# Or manually
python app.py

# Visit http://localhost:5000 in your browser

Interactive Demo

# Command line interactive mode
python demo.py comprehensive

# Quick generation
python demo.py --prompt "a medieval sword"
python demo.py --image "path/to/image.jpg"

# Test web API
python run_demo.py

Supported Text Features

Shape Keywords

  • Basic shapes: cube, box, sphere, ball, cylinder, cone, pyramid, torus, ring, plane
  • Complex objects: building, house, tower, car, tree, mountain, rock, crystal, gem, bottle, vase

Size Modifiers

  • Scale: tiny, small, medium, large, huge, massive, enormous
  • Dimensions: tall, thin, thick, wide, narrow, long, flat

Materials

  • Metals: metal, metallic, steel, iron, gold, silver
  • Natural: wood, stone, rock, crystal, glass
  • Synthetic: plastic, ceramic, fabric, leather

Surface Properties

  • Texture: smooth, rough, bumpy, textured, polished, coarse
  • Finish: shiny, glossy, matte, dull, reflective
  • Transparency: transparent, clear, translucent

Style Descriptors

  • Modern: modern, contemporary, minimalist, clean, simple
  • Traditional: ancient, old, vintage, ornate, decorative, elaborate
  • Geometric: angular, sharp, pointed, faceted, curved, twisted
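
The keyword lists above suggest a vocabulary lookup over the prompt. The following is a minimal sketch of that idea, assuming simple whole-word matching; the real TextProcessor may work differently, and `VOCAB` and `extract_features` are hypothetical names that cover only four of the categories.

```python
import re

# Keywords copied from the lists above, grouped by category.
VOCAB = {
    "shape": {"cube", "box", "sphere", "ball", "cylinder", "cone", "pyramid",
              "torus", "ring", "plane", "building", "house", "tower", "car",
              "tree", "mountain", "rock", "crystal", "gem", "bottle", "vase"},
    "scale": {"tiny", "small", "medium", "large", "huge", "massive", "enormous"},
    "material": {"metal", "metallic", "steel", "iron", "gold", "silver", "wood",
                 "stone", "rock", "crystal", "glass", "plastic", "ceramic",
                 "fabric", "leather"},
    "finish": {"shiny", "glossy", "matte", "dull", "reflective"},
}


def extract_features(prompt):
    """Return {category: [matched keywords in prompt order]}."""
    words = re.findall(r"[a-z]+", prompt.lower())
    features = {category: [] for category in VOCAB}
    for word in words:
        for category, keywords in VOCAB.items():
            if word in keywords and word not in features[category]:
                features[category].append(word)
    return features


if __name__ == "__main__":
    print(extract_features("a large shiny metallic sphere"))
```

Note that "rock" and "crystal" appear in both the shape and material lists, so this lookup reports them under both categories; a real parser would need context to disambiguate.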

Architecture

Frontend Components

  • Web Interface: Modern responsive UI with Bootstrap and Three.js
  • Real-time Communication: Socket.IO for live progress updates
  • 3D Visualization: Three.js-based model viewer with interactive controls
  • File Upload: Drag-and-drop image upload with preview

Backend API

  • Flask Server: RESTful API with multimodal endpoints
  • Background Processing: Async task handling with real-time updates
  • Model Storage: Automatic file management and serving
  • Error Handling: Graceful fallbacks and user feedback
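
The background-processing pattern described above can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the project's server code: `TaskRegistry` and `fake_generation` are hypothetical, and status is polled here, whereas the web app is described as pushing updates over Socket.IO.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor


class TaskRegistry:
    """Runs jobs on worker threads and keeps a pollable status per task."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self._status = {}
        self._futures = {}

    def _update(self, task_id, **fields):
        with self._lock:
            self._status[task_id].update(fields)

    def submit(self, job, *args):
        """Queue job(report, *args); calling report(percent) records progress."""
        task_id = uuid.uuid4().hex
        with self._lock:
            self._status[task_id] = {"state": "queued", "progress": 0}

        def run():
            self._update(task_id, state="running")
            try:
                result = job(lambda pct: self._update(task_id, progress=pct), *args)
            except Exception as exc:
                # Record the failure instead of letting it vanish in the worker.
                self._update(task_id, state="failed", error=str(exc))
            else:
                self._update(task_id, state="done", progress=100, result=result)

        self._futures[task_id] = self._pool.submit(run)
        return task_id

    def status(self, task_id):
        """Return a snapshot copy of the task's current status."""
        with self._lock:
            return dict(self._status[task_id])

    def wait(self, task_id, timeout=None):
        """Block until the task finishes, then return its final status."""
        self._futures[task_id].result(timeout=timeout)
        return self.status(task_id)


def fake_generation(report, prompt):
    """Stand-in for a real generation call; reports two progress steps."""
    report(50)
    report(90)
    return f"{prompt}.obj"
```

A request handler would call `submit` and return the task ID immediately, leaving the client to follow progress through `status` or a push channel.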

Core 3D Generation

  1. TextProcessor: Extracts 3D modeling parameters from natural language
  2. GeometryGenerator: Creates and modifies 3D geometry based on extracted features
  3. ModelExporter: Handles exporting to various 3D file formats
  4. NeuralTextTo3D: Neural network-based text-to-3D generation
  5. ImageTo3DConverter: Computer vision-based image-to-3D conversion
  6. MultiModal3DGenerator: Unified multimodal generation system
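
The first three components form a parse, build, export pipeline. The sketch below mimics that flow in plain Python to show how the stages hand data to each other. Every function name and the `SCALES` factors are hypothetical stand-ins; the real TextProcessor, GeometryGenerator, and ModelExporter classes are not shown in this README.

```python
# Hypothetical size factors; the real mapping is not documented here.
SCALES = {"tiny": 0.25, "small": 0.5, "medium": 1.0, "large": 2.0, "huge": 4.0}


def process_text(prompt):
    """Stage 1 (TextProcessor stand-in): pick a size factor, default 1.0."""
    for word in prompt.lower().split():
        if word in SCALES:
            return {"size": SCALES[word]}
    return {"size": 1.0}


def generate_box(params):
    """Stage 2 (GeometryGenerator stand-in): cube centred on the origin."""
    h = params["size"] / 2
    # Vertex index is 4*xi + 2*yi + zi, where each bit selects -h (0) or +h (1).
    vertices = [(x, y, z) for x in (-h, h) for y in (-h, h) for z in (-h, h)]
    # Two triangles per side, wound counter-clockwise when seen from outside.
    faces = [
        (0, 1, 3), (0, 3, 2),  # x = -h
        (4, 6, 7), (4, 7, 5),  # x = +h
        (0, 4, 5), (0, 5, 1),  # y = -h
        (2, 3, 7), (2, 7, 6),  # y = +h
        (0, 2, 6), (0, 6, 4),  # z = -h
        (1, 5, 7), (1, 7, 3),  # z = +h
    ]
    return vertices, faces


def export_obj(vertices, faces):
    """Stage 3 (ModelExporter stand-in): Wavefront OBJ text, 1-based indices."""
    lines = [f"v {x} {y} {z}" for x, y, z in vertices]
    lines += [f"f {a + 1} {b + 1} {c + 1}" for a, b, c in faces]
    return "\n".join(lines) + "\n"


def run_pipeline(prompt):
    """Chain the three stages: text -> parameters -> geometry -> OBJ text."""
    return export_obj(*generate_box(process_text(prompt)))


if __name__ == "__main__":
    print(run_pipeline("a large cube"))
```

Keeping each stage a plain function of its input makes it easy to swap one stage, for example replacing the box generator with a neural one, without touching the others.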

Examples

Basic Examples

# `generator` is assumed to be a TextTo3DGenerator instance (see API Reference)
# Simple shapes
generator.generate("a blue cube")
generator.generate("a large yellow sphere")
generator.generate("a thin red cylinder")

# With surface properties
generator.generate("a rough metallic sphere")
generator.generate("a smooth glass cube")
generator.generate("a bumpy stone surface")

# Complex descriptions
generator.generate("a twisted green cone with textured surface")
generator.generate("a tall modern building with clean lines")

Advanced Examples

# `generator` is assumed to be an AdvancedTextTo3D instance (see API Reference)
# Semantic objects
generator.generate_advanced("an ancient stone tower with weathered surfaces")
generator.generate_advanced("a modern glass skyscraper with reflective facade")
generator.generate_advanced("a wooden treasure chest with metal hinges")

# Material properties
generator.generate_advanced("a shiny metallic robot with angular design")
generator.generate_advanced("a translucent crystal formation")
generator.generate_advanced("a matte ceramic vase with smooth curves")

Batch Processing

# Generate multiple models (batch_generate is listed under TextTo3DGenerator)
prompts = [
    "a red sports car",
    "a medieval castle",
    "a futuristic spaceship",
    "a natural rock formation"
]

meshes = generator.batch_generate(
    prompts, 
    output_dir="generated_models",
    format="obj"
)
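
The README does not say how batch output files are named. If predictable names are needed, one option is to derive them from the prompts, as in this sketch. The helpers `slugify` and `plan_outputs` are hypothetical and make no claim about what `batch_generate` does internally.

```python
import re
from pathlib import Path


def slugify(prompt, max_len=40):
    """Reduce a prompt to a filesystem-safe stem, e.g. 'a_red_sports_car'."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return slug[:max_len].rstrip("_") or "model"


def plan_outputs(prompts, output_dir, fmt="obj"):
    """Map each prompt to a unique path inside output_dir."""
    seen = {}
    paths = []
    for prompt in prompts:
        stem = slugify(prompt)
        count = seen.get(stem, 0)
        seen[stem] = count + 1
        # Repeated prompts get a numeric suffix so files are not overwritten.
        name = stem if count == 0 else f"{stem}_{count}"
        paths.append(Path(output_dir) / f"{name}.{fmt}")
    return paths


if __name__ == "__main__":
    for path in plan_outputs(["a red sports car", "a medieval castle"], "generated_models"):
        print(path)
```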

API Reference

TextTo3DGenerator

Methods

  • generate(text_prompt, output_path=None, format='obj'): Generate 3D model from text
  • batch_generate(text_prompts, output_dir, format='obj'): Generate multiple models
  • get_model_info(mesh): Get technical information about generated model
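
The fields returned by `get_model_info` are not documented here. As an illustration of the kind of technical information such a call could report, the sketch below computes counts, bounds, and a watertightness check from raw vertex and face lists. The function `model_info` and its dictionary keys are assumptions, not the library's actual output.

```python
from collections import Counter


def model_info(vertices, faces):
    """Summarise a triangle mesh given as vertex tuples and index triples."""
    mins = tuple(min(v[i] for v in vertices) for i in range(3))
    maxs = tuple(max(v[i] for v in vertices) for i in range(3))

    # Count how many triangles use each undirected edge.
    edge_use = Counter()
    for a, b, c in faces:
        for u, w in ((a, b), (b, c), (c, a)):
            edge_use[(min(u, w), max(u, w))] += 1

    return {
        "vertex_count": len(vertices),
        "face_count": len(faces),
        "bounds": (mins, maxs),
        "extents": tuple(hi - lo for lo, hi in zip(mins, maxs)),
        # A closed surface has every edge shared by exactly two triangles.
        "watertight": bool(faces) and all(n == 2 for n in edge_use.values()),
    }


if __name__ == "__main__":
    tetra_vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
    tetra_faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
    print(model_info(tetra_vertices, tetra_faces))
```

The watertight flag matters mostly for 3D printing, where open meshes tend to fail slicing.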

AdvancedTextTo3D

Methods

  • generate_advanced(text_prompt, output_path=None): Generate with advanced features
  • analyze_prompt(text_prompt): Extract and return features from text

Output Formats

  • OBJ: Wavefront OBJ format with materials
  • STL: STereoLithography format for 3D printing
  • PLY: Polygon File Format with color support
  • GLB: Binary glTF format for web applications
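
To make the STL entry concrete, the following sketch writes the ASCII variant of the format by hand. It is for illustration only and assumes nothing about ModelExporter; `facet_normal` and `to_ascii_stl` are hypothetical helpers. In practice a library such as trimesh would normally handle export.

```python
def facet_normal(a, b, c):
    """Unit normal of triangle (a, b, c) via the cross product (b-a) x (c-a)."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = (nx * nx + ny * ny + nz * nz) ** 0.5
    if length == 0:
        return (0.0, 0.0, 0.0)  # degenerate triangle
    return (nx / length, ny / length, nz / length)


def to_ascii_stl(vertices, faces, name="model"):
    """Serialise a triangle mesh as ASCII STL text."""
    lines = [f"solid {name}"]
    for a, b, c in faces:
        triangle = (vertices[a], vertices[b], vertices[c])
        n = facet_normal(*triangle)
        lines.append(f"  facet normal {n[0]:.6e} {n[1]:.6e} {n[2]:.6e}")
        lines.append("    outer loop")
        for x, y, z in triangle:
            lines.append(f"      vertex {x:.6e} {y:.6e} {z:.6e}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_ascii_stl([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)]))
```

STL stores only triangles and their normals, which is why it suits 3D printing but carries no material or color data, unlike OBJ, PLY, or GLB.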

Performance Notes

  • Complexity: Higher complexity values add detail at the cost of longer generation time
  • Resolution: The default resolution is a compromise between output quality and generation speed
  • Batch Processing: batch_generate is more efficient than repeated single calls when generating multiple models

Limitations

  • Generated models are based on geometric primitives and procedural techniques
  • Complex organic shapes may require additional post-processing
  • Material properties are approximated and may need refinement for specific applications

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Submit a pull request

License

MIT License. See the LICENSE file for details.