Intern Robotics

58 repositories

GenManip
Public
[CVPR 2025] Official implementation of "GenManip: LLM-driven Simulation for Generalizable Instruction-Following Manipulation"
robotics simulation manipulation isaac-sim
Python
•3•123•4•0•Updated Dec 16, 2025
MMSI-Video-Bench
Public
Python
•0•14•0•0•Updated Dec 16, 2025
InternNav
Public
InternRobotics' open platform for building generalized navigation foundation models.
Jupyter Notebook
•53•487•8•2•Updated Dec 15, 2025
internvla-n1-dualvln.github.io
Public
JavaScript
•0•0•0•0•Updated Dec 10, 2025
NavDP
Public
Official implementation of the paper: "NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance"
Python
•18•305•5•0•Updated Dec 9, 2025
MeshCoder
Public
Jupyter Notebook
•
MIT License
•20•415•7•0•Updated Dec 8, 2025
MMSI-Bench
Public
[arXiv 2025] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
Python
•0•63•0•0•Updated Dec 4, 2025
G2VLM
Public
G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
3d-reconstruction spatial-reasoning mllms spatial-intelligence 3d-llms spatial-understanding
Python
•
Apache License 2.0
•4•218•2•0•Updated Nov 27, 2025
EgoThinker
Public
Official implementation of EgoThinker at NeurIPS 2025
Python
•0•22•2•0•Updated Nov 25, 2025
EgoHOD
Public
Official implementation of EgoHOD at ICLR 2025; 14 EgoVis Challenge Winners in CVPR 2024
Python
•
Apache License 2.0
•2•31•1•0•Updated Nov 25, 2025
MV-CoLight
Public
[NeurIPS 2025] MV-CoLight: Efficient Object Compositing with Consistent Lighting and Shadow Generation
Python
•
MIT License
•2•13•2•0•Updated Nov 21, 2025
interndata-a1.github.io
Public
HTML
•0•1•0•0•Updated Nov 20, 2025
InternVLA-M1
Public
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
robotics vision-language-model vision-language-action-model
Python
•
MIT License
•15•310•10•0•Updated Nov 11, 2025
CronusVLA
Public
[AAAI26 oral] CronusVLA: Towards Efficient and Robust Manipulation via Multi-Frame Vision-Language-Action Modeling
Python
•
MIT License
•2•65•0•0•Updated Nov 4, 2025
internrobotics.github.io
Public
Documentation of Intern Robotics Platform & Toolkits
Python
•4•2•0•2•Updated Nov 4, 2025
StreamVLN
Public
Official implementation of the paper: "StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling"
Python
•21•339•18•1•Updated Nov 2, 2025
Aether
Public
[ICCV 2025 & ICCV 2025 RIWM Outstanding Paper] Aether: Geometric-Aware Unified World Modeling
navigation multi-modal video-generation video-prediction embodied-ai visual-planning 4d-reconstruction foundation-models world-model 4d-generation
Python
•
MIT License
•6•553•0•0•Updated Oct 26, 2025
internvla-m1.github.io
Public
Astro
•0•1•0•0•Updated Oct 23, 2025
Humanoid-Goalkeeper
Public
[arXiv 2025] Official implementation of "Humanoid Goalkeeper: Learning from Position Conditioned Task-Motion Constraints"
Python
•
Other
•6•124•0•0•Updated Oct 22, 2025
F1-VLA
Public
F1: A Vision Language Action Model Bridging Understanding and Generation to Actions
Python
•9•141•3•0•Updated Oct 20, 2025
InternScenes
Public
[NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts.
dataset scene-generation embodied-ai interactive-scenes
Python
•6•203•4•0•Updated Oct 17, 2025
AdaMimic
Public
[arXiv 2025] Official implementation of "Towards Adaptable Humanoid Control via Adaptive Motion Tracking"
Python
•11•188•7•0•Updated Oct 17, 2025
.github
Public
•4•0•0•0•Updated Oct 16, 2025
InternManip
Public
An all-in-one robot manipulation learning suite for training and evaluating policy models on various datasets and benchmarks.
Python
•
MIT License
•10•165•6•0•Updated Oct 15, 2025
PhysHSI
Public
Official implementation of the paper: "PhysHSI: Towards a Real-World Generalizable and Natural Humanoid-Scene Interaction System"
robotics humanoid
Python
•
Other
•11•213•5•1•Updated Oct 14, 2025
InternHumanoid
Public
A versatile, all-in-one toolbox for whole-body humanoid robot control.
Python
•
MIT License
•2•149•1•0•Updated Oct 10, 2025
ARTDECO
Public
ARTDECO unifies 3D foundation priors with structured scene representations, enabling robust and generalizable 3D reconstruction of diverse real-world scenes using only monocular video.
•8•134•0•0•Updated Oct 10, 2025
InstructVLA
Public
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
Python
•4•83•0•0•Updated Sep 30, 2025
MesaTask
Public
[NeurIPS 2025 Spotlight] MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning
scene-generation embodied-ai 3d-dataset
Python
•
MIT License
•4•68•3•0•Updated Sep 29, 2025
OST-Bench
Public
[NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
Python
•1•69•2•0•Updated Sep 29, 2025