Daiki Matsunaga dematsunaga

3 followers · 2 following

https://sites.google.com/view/daikieddymatsunaga

Achievements

Highlights

Pro

Stars

namhkoh / gelab-env

Forked from summonerloong/gelab-engine

Python 1 Updated Mar 24, 2026

junho328 / openvla-oft

Python 1 Updated Jan 21, 2026

KAIST-AILab / R-CVAE

Implementation code of Reasoning Test-time Compute with CVAE

Python 12 7 Updated Dec 17, 2025

KAIST-AILab / GC-DPO

Python 13 9 Updated Dec 17, 2025

KAIST-AILab / SR-GRPO

Python 14 10 Updated Dec 17, 2025

KAIST-AILab / VL-DNP

Python 12 10 Updated Dec 17, 2025

promotion-kim / TMT

Python 15 13 Updated Dec 10, 2025

ggoggam / gdpo

Code for GFlowNet-DPO (Direct Preference Optimization) EMNLP 2024 Main

Python 19 8 Updated Feb 22, 2026

junho328 / MARLLM

Multi-Agent Reinforcement Learning in LLMs

Python 1 Updated Oct 24, 2025

TsinghuaC3I / MARTI

A Framework for LLM-based Multi-Agent Reinforced Training and Inference

Python 468 48 Updated Feb 19, 2026

wenshuaizhao / optimappo

This is the code for Optimistic Multi-Agent Policy Gradient.

Python 12 1 Updated Sep 5, 2024

marl-book / slides

Lecture slides for the MARL book (www.marl-book.com)

TeX 162 35 Updated May 14, 2025

marl-book / codebase

Official code repo for the MARL book (www.marl-book.com)

Python 621 104 Updated Mar 30, 2025

twni2016 / pomdp-baselines

Simple (but often Strong) Baselines for POMDPs in PyTorch, ICML 2022

Python 344 48 Updated Aug 22, 2024

albertruaz / MAPPO-in-TinyHanabi

Running MAPPO model in Tiny-Hanabi using on-policy environment

Python 1 Updated Jan 20, 2025

denisyarats / pytorch_sac

PyTorch implementation of Soft Actor-Critic (SAC)

Jupyter Notebook 594 110 Updated Dec 5, 2021

PKU-RL / FOP-DMAC-MACPF

Python 14 3 Updated Mar 5, 2023

p-christ / Deep-Reinforcement-Learning-Algorithms-with-PyTorch

PyTorch implementations of deep reinforcement learning algorithms and environments

Python 5,929 1,211 Updated Jul 25, 2024

albertruaz / Sparta-in-open-spiel

Running sparta model in Tiny-Hanabi using open spiel environment

Python 1 Updated Nov 23, 2024

marlbenchmark / on-policy

This is the official implementation of Multi-Agent PPO (MAPPO).

Python 1,941 374 Updated Jul 18, 2024

google-deepmind / open_spiel

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.

C++ 5,104 1,108 Updated Mar 26, 2026

bramgrooten / DeepRL-for-Hanabi

[ALA 2022] Analyzing the Deep RL algorithms SPG, VPG, PPO on the cooperative card game Hanabi.

Python 5 3 Updated Mar 17, 2022

secury / APAC

Python 6 1 Updated Nov 1, 2022

opendilab / LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Python 1,556 188 Updated Mar 28, 2026

google-deepmind / mctx

Monte Carlo tree search in JAX

Python 2,603 207 Updated Sep 2, 2025

facebookresearch / BenchMARL

BenchMARL is a library for benchmarking Multi-Agent Reinforcement Learning (MARL). BenchMARL allows to quickly compare different MARL algorithms, tasks, and models while being systematically ground…

Python 593 120 Updated Feb 7, 2026

dematsunaga / alberdice

Official PyTorch implementation of AlberDICE

Python 23 14 Updated Dec 8, 2023

RDLLab / posggym

A collection of environments and reference agents for planning and reinforcement learning research in partially observable, multi-agent environments.

Python 30 7 Updated Jun 2, 2025

ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKT…

Python 3,887 842 Updated May 29, 2022

udacity / deep-reinforcement-learning

Repo for the Deep Reinforcement Learning Nanodegree program

Jupyter Notebook 5,153 2,371 Updated Nov 16, 2023