GSoC'23 @wikimedia developer @mdgspace
- Bengaluru, Karnataka, India
- (UTC +05:30)
- https://nik-55.github.io
- in/nikhilmahajan123
- @m_nik55
- https://medium.com/@nik.xyz.in
- https://huggingface.co/nik-55
Starred repositories
World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty
Official implementation for WorldScore: A Unified Evaluation Benchmark for World Generation
Holistic Evaluation of Multimodal LLMs on Spatial Intelligence
A construction kit for reinforcement learning environment management.
[ICLR 2026] Trace Anything: Representing Any Video in 4D via Trajectory Fields
Code, data and weights for the paper **What drives success in physical planning with Joint-Embedding Predictive World Models?**
Code for "Evaluating Robot Policies in a World Model".
A Comprehensive Survey on World Models for Embodied AI
Ray tracing and hybrid rasterization of Gaussian particles
Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934
HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Code implementation of the paper "World-in-World: World Models in a Closed-Loop World"
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"
SkyRL: A Modular Full-stack RL Library for LLMs
[NeurIPS 2025] The official repository of "Sekai: A Video Dataset towards World Exploration"
SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds
GigaWorld-0: World Models as Data Engine to Empower Embodied AI
A generative world for general-purpose robotics & embodied AI learning.
ENACT is a benchmark that evaluates embodied cognition through world modeling from egocentric interaction. It is designed to be simple and have a scalable dataset.
Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation
[NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
[Arxiv 2025] In-Video Instructions: Visual Signals as Generative Control
HoloMotion: A Foundation Model for Whole-Body Humanoid Control


