    Repositories list

    • GenManip

      Public
      [CVPR 2025] Official implementation of "GenManip: LLM-driven Simulation for Generalizable Instruction-Following Manipulation"
      Python
      312340 Updated Dec 16, 2025
    • Python
      01400 Updated Dec 16, 2025
    • InternNav

      Public
      InternRobotics' open platform for building generalized navigation foundation models.
      Jupyter Notebook
      5348782 Updated Dec 15, 2025
    • JavaScript
      0000 Updated Dec 10, 2025
    • NavDP

      Public
      Official implementation of the paper: "NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance"
      Python
      1830550 Updated Dec 9, 2025
    • MeshCoder

      Public
      Jupyter Notebook
      2041570 Updated Dec 8, 2025
    • MMSI-Bench

      Public
      [arXiv 2025] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
      Python
      06300 Updated Dec 4, 2025
    • G2VLM

      Public
      G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
      Python
      421820 Updated Nov 27, 2025
    • EgoThinker

      Public
      Official implementation of EgoThinker at NeurIPS 2025
      Python
      02220 Updated Nov 25, 2025
    • EgoHOD

      Public
      Official implementation of EgoHOD at ICLR 2025; 14 EgoVis Challenge Winners in CVPR 2024
      Python
      23110 Updated Nov 25, 2025
    • [NeurIPS 2025] MV-CoLight: Efficient Object Compositing with Consistent Lighting and Shadow Generation
      Python
      21320 Updated Nov 21, 2025
    • HTML
      0100 Updated Nov 20, 2025
    • InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
      Python
      15310100 Updated Nov 11, 2025
    • CronusVLA

      Public
      [AAAI26 oral] CronusVLA: Towards Efficient and Robust Manipulation via Multi-Frame Vision-Language-Action Modeling
      Python
      26500 Updated Nov 4, 2025
    • internrobotics.github.io

      Public
      Documentation of Intern Robotics Platform & Toolkits
      Python
      4202 Updated Nov 4, 2025
    • StreamVLN

      Public
      Official implementation of the paper: "StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling"
      Python
      21339181 Updated Nov 2, 2025
    • Aether

      Public
      [ICCV 2025 & ICCV 2025 RIWM Outstanding Paper] Aether: Geometric-Aware Unified World Modeling
      Python
      655300 Updated Oct 26, 2025
    • Astro
      0100 Updated Oct 23, 2025
    • [arxiv 2025] Official implementation of "Humanoid Goalkeeper: Learning from Position Conditioned Task-Motion Constraints"
      Python
      612400 Updated Oct 22, 2025
    • F1-VLA

      Public
      F1: A Vision Language Action Model Bridging Understanding and Generation to Actions
      Python
      914130 Updated Oct 20, 2025
    • [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts.
      Python
      620340 Updated Oct 17, 2025
    • AdaMimic

      Public
      [arxiv 2025] Official implementation of "Towards Adaptable Humanoid Control via Adaptive Motion Tracking"
      Python
      1118870 Updated Oct 17, 2025
    • .github

      Public
      4000 Updated Oct 16, 2025
    • An All-in-one robot manipulation learning suite for policy models training and evaluation on various datasets and benchmarks.
      Python
      1016560 Updated Oct 15, 2025
    • PhysHSI

      Public
      Official implementation of the paper: "PhysHSI: Towards a Real-World Generalizable and Natural Humanoid-Scene Interaction System"
      Python
      1121351 Updated Oct 14, 2025
    • A versatile, all-in-one toolbox for whole-body humanoid robot control.
      Python
      214910 Updated Oct 10, 2025
    • ARTDECO

      Public
      ARTDECO unifies 3D foundation priors with structured scene representations, enabling robust and generalizable 3D reconstruction of diverse real-world scenes using only monocular video.
      813400 Updated Oct 10, 2025
    • InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
      Python
      48300 Updated Sep 30, 2025
    • MesaTask

      Public
      [NeurIPS 2025 Spotlight] MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning
      Python
      46830 Updated Sep 29, 2025
    • OST-Bench

      Public
      [NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
      Python
      16920 Updated Sep 29, 2025