A comprehensive Python system for generating 3D models from text descriptions, images, or both using advanced neural networks, geometric algorithms, and machine learning techniques.
- 🔤 Text-to-3D: Generate 3D models from natural language descriptions using neural networks
- 🖼️ Image-to-3D: Convert 2D images to 3D models using depth estimation and computer vision
- 🔄 Multimodal Generation: Combine text and images for enhanced 3D model creation
- 🎨 Multiple Algorithms: Neural, procedural, and hybrid generation approaches
- Neural Networks: CLIP text encoding, NeRF-based 3D representation, diffusion models
- Computer Vision: Depth estimation, stereo vision, photogrammetry, point cloud processing
- Advanced Processing: Attention-based feature fusion, semantic understanding, material properties
- Export Formats: OBJ, STL, PLY, GLB with materials and textures
- Quality Control: Post-processing, mesh optimization, and validation
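To make the depth-estimation idea behind Image-to-3D more concrete, here is a minimal sketch of back-projecting a depth map into 3D points with a pinhole camera model. It is not taken from this project's code: the function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions for illustration, and a real pipeline would likely use NumPy or Open3D rather than plain lists.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth map into camera-space points (pinhole model).

    depth[v][u] is the distance along the optical axis at pixel (u, v).
    fx, fy are focal lengths in pixels; cx, cy is the principal point.
    All names here are hypothetical, not part of the project's API.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Non-positive depth is treated as "no measurement".
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    tiny_depth = [[1.0, 2.0], [0.0, 4.0]]
    for point in depth_to_points(tiny_depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0):
        print(point)
```

The resulting point list is the kind of input a surface-reconstruction step could turn into a mesh.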
pip install -r requirements.txt

Core Dependencies:
- numpy >= 1.21.0
- trimesh >= 3.15.0
- open3d >= 0.16.0
- torch >= 1.12.0
- torchvision >= 0.13.0
- scipy >= 1.9.0
Machine Learning:
- transformers >= 4.20.0
- diffusers >= 0.21.0
- sentence-transformers >= 2.2.0
- scikit-learn >= 1.1.0
Computer Vision:
- opencv-python >= 4.5.0
- Pillow >= 8.3.0
3D Processing:
- mcubes >= 0.1.0
- point-cloud-utils >= 0.29.0
Additional:
- matplotlib >= 3.5.0
- noise >= 1.2.2
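After installing, it can be useful to confirm that the installed packages meet the minimum versions listed above. The following is a rough sketch using only the standard library; the helper names (`parse_version`, `is_older`, `check_requirements`) are illustrative and not part of this project, only a subset of the dependencies is shown, and the version parsing is deliberately simplified (it ignores pre-release and local-version suffixes).

```python
from importlib import metadata

# Subset of the minimum versions from the dependency list above.
MINIMUMS = {
    "numpy": "1.21.0",
    "trimesh": "3.15.0",
    "open3d": "0.16.0",
    "torch": "1.12.0",
}


def parse_version(text):
    """Turn '1.21.0' into (1, 21, 0), ignoring suffixes such as '+cu118'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_older(installed, minimum):
    """Return True if `installed` is strictly older than `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so '1.21' equals '1.21.0'
    b += (0,) * (width - len(b))
    return a < b


def check_requirements(minimums):
    """Map each distribution name to 'ok', 'missing', or 'too old (found X)'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if is_older(installed, minimum):
            report[name] = f"too old (found {installed})"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for package, status in check_requirements(MINIMUMS).items():
        print(f"{package}: {status}")
```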
from multimodal_3d_generator import MultiModal3DGenerator
# Create multimodal generator
generator = MultiModal3DGenerator()
# Generate from text
mesh, metadata = generator.generate_from_text(
    "a futuristic robot with metallic armor",
    output_path="robot.obj"
)
# Generate from image
mesh, metadata = generator.generate_from_image(
    "path/to/image.jpg",
    output_path="from_image.obj"
)
# Generate from both text and image
mesh, metadata = generator.generate_from_multimodal(
    "a blue metallic sculpture",
    "path/to/reference.jpg",
    output_path="multimodal.obj"
)

from neural_text_to_3d import NeuralTextTo3D
# Neural generation
neural_gen = NeuralTextTo3D()
mesh = neural_gen.generate_from_text("a crystal castle with towers")

from image_to_3d import ImageTo3DConverter
# Image-to-3D conversion
img_gen = ImageTo3DConverter()
# From single image
mesh = img_gen.single_image_to_3d("photo.jpg", method="depth_estimation")
# From stereo pair
mesh = img_gen.stereo_images_to_3d("left.jpg", "right.jpg")

# Start web server
./start.sh dev
# Or manually
python app.py
# Visit http://localhost:5000 in your browser

# Command line interactive mode
python demo.py comprehensive
# Quick generation
python demo.py --prompt "a medieval sword"
python demo.py --image "path/to/image.jpg"
# Test web API
python run_demo.py

- Basic shapes: cube, box, sphere, ball, cylinder, cone, pyramid, torus, ring, plane
- Complex objects: building, house, tower, car, tree, mountain, rock, crystal, gem, bottle, vase
- Scale: tiny, small, medium, large, huge, massive, enormous
- Dimensions: tall, thin, thick, wide, narrow, long, flat
- Metals: metal, metallic, steel, iron, gold, silver
- Natural: wood, stone, rock, crystal, glass
- Synthetic: plastic, ceramic, fabric, leather
- Texture: smooth, rough, bumpy, textured, polished, coarse
- Finish: shiny, glossy, matte, dull, reflective
- Transparency: transparent, clear, translucent
- Modern: modern, contemporary, minimalist, clean, simple
- Traditional: ancient, old, vintage, ornate, decorative, elaborate
- Geometric: angular, sharp, pointed, faceted, curved, twisted
- Web Interface: Modern responsive UI with Bootstrap and Three.js
- Real-time Communication: Socket.IO for live progress updates
- 3D Visualization: Three.js-based model viewer with interactive controls
- File Upload: Drag-and-drop image upload with preview
- Flask Server: RESTful API with multimodal endpoints
- Background Processing: Async task handling with real-time updates
- Model Storage: Automatic file management and serving
- Error Handling: Graceful fallbacks and user feedback
- TextProcessor: Extracts 3D modeling parameters from natural language
- GeometryGenerator: Creates and modifies 3D geometry based on extracted features
- ModelExporter: Handles exporting to various 3D file formats
- NeuralTextTo3D: Neural network-based text-to-3D generation
- ImageTo3DConverter: Computer vision-based image-to-3D conversion
- MultiModal3DGenerator: Unified multimodal generation system
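As a rough illustration of how a component like TextProcessor might map a prompt onto the keyword vocabulary listed earlier, the sketch below does a simple keyword lookup. It is an assumption about the general approach, not the project's actual implementation: `VOCABULARY` holds only a subset of the listed keywords, and `extract_features` is a hypothetical name.

```python
import re

# Illustrative subset of the keyword vocabulary; the project's real
# tables and category names may differ.
VOCABULARY = {
    "shape": ["cube", "box", "sphere", "ball", "cylinder", "cone",
              "pyramid", "torus", "ring", "plane"],
    "scale": ["tiny", "small", "medium", "large", "huge", "massive",
              "enormous"],
    "material": ["metal", "metallic", "steel", "iron", "gold", "silver",
                 "wood", "stone", "glass", "plastic", "ceramic"],
    "texture": ["smooth", "rough", "bumpy", "textured", "polished",
                "coarse"],
}


def extract_features(prompt):
    """Return {category: [matched keywords]} in the order they appear."""
    words = re.findall(r"[a-z]+", prompt.lower())
    features = {category: [] for category in VOCABULARY}
    for word in words:
        for category, keywords in VOCABULARY.items():
            if word in keywords and word not in features[category]:
                features[category].append(word)
    return features


if __name__ == "__main__":
    print(extract_features("a rough metallic sphere"))
```

A lookup like this only recognizes exact keywords; phrases outside the vocabulary would need the neural or semantic paths described above.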
# Simple shapes
generator.generate("a blue cube")
generator.generate("a large yellow sphere")
generator.generate("a thin red cylinder")
# With surface properties
generator.generate("a rough metallic sphere")
generator.generate("a smooth glass cube")
generator.generate("a bumpy stone surface")
# Complex descriptions
generator.generate("a twisted green cone with textured surface")
generator.generate("a tall modern building with clean lines")

# Semantic objects
generator.generate_advanced("an ancient stone tower with weathered surfaces")
generator.generate_advanced("a modern glass skyscraper with reflective facade")
generator.generate_advanced("a wooden treasure chest with metal hinges")
# Material properties
generator.generate_advanced("a shiny metallic robot with angular design")
generator.generate_advanced("a translucent crystal formation")
generator.generate_advanced("a matte ceramic vase with smooth curves")

# Generate multiple models
prompts = [
    "a red sports car",
    "a medieval castle",
    "a futuristic spaceship",
    "a natural rock formation"
]
meshes = generator.batch_generate(
    prompts,
    output_dir="generated_models",
    format="obj"
)

- generate(text_prompt, output_path=None, format='obj'): Generate 3D model from text
- batch_generate(text_prompts, output_dir, format='obj'): Generate multiple models
- get_model_info(mesh): Get technical information about generated model

- generate_advanced(text_prompt, output_path=None): Generate with advanced features
- analyze_prompt(text_prompt): Extract and return features from text
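The exact fields returned by `get_model_info(mesh)` are not documented here. As a hedged sketch of the kind of technical information such a call could report, the hypothetical helper below summarizes a mesh given as plain vertex and face lists; the name `summarize_mesh` and the returned keys are assumptions, not the project's API.

```python
def summarize_mesh(vertices, faces):
    """Summarize a triangle mesh given as plain Python lists.

    vertices: list of (x, y, z) tuples
    faces:    list of (i, j, k) vertex-index tuples
    Hypothetical helper; the real get_model_info may return other fields.
    """
    if not vertices:
        raise ValueError("mesh has no vertices")
    # Axis-aligned bounding box, computed per coordinate.
    bounds_min = tuple(min(v[axis] for v in vertices) for axis in range(3))
    bounds_max = tuple(max(v[axis] for v in vertices) for axis in range(3))
    return {
        "vertex_count": len(vertices),
        "face_count": len(faces),
        "bounds_min": bounds_min,
        "bounds_max": bounds_max,
        "extents": tuple(hi - lo for lo, hi in zip(bounds_min, bounds_max)),
    }


if __name__ == "__main__":
    # A small tetrahedron as sample data.
    tetra_vertices = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
    tetra_faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
    print(summarize_mesh(tetra_vertices, tetra_faces))
```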
- OBJ: Wavefront OBJ format with materials
- STL: STereoLithography format for 3D printing
- PLY: Polygon File Format with color support
- GLB: Binary glTF format for web applications
- Complexity: Higher complexity values increase generation time but add detail
- Resolution: Default resolution balances quality and performance
- Batch Processing: More efficient for generating multiple models
- Generated models are based on geometric primitives and procedural techniques
- Complex organic shapes may require additional post-processing
- Material properties are approximated and may need refinement for specific applications
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
MIT License - see LICENSE file for details

# fast3d