zzp1012

Zhanpeng Zhou zzp1012

Ph.D. candidate in Computer Science at Shanghai Jiao Tong University.

134 followers · 214 following

Shanghai Jiao Tong University
Shanghai, China
https://zzp1012.github.io
@zhanpeng_zhou

Achievements

Starred repositories

lobehub / lobe-vidol

🧸 Lobe Vidol - Making Virtual Idols Accessible for EveryOne

TypeScript · 899 stars · 123 forks · Updated Mar 8, 2025

deepseek-ai / Engram

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Python · 3,414 stars · 233 forks · Updated Jan 14, 2026

OpenCausaLab / CauScientist

Implementation for paper CauScientist: Teaching LLMs to Respect Data for Causal Discovery.

4 stars · Updated Jan 18, 2026

Unakar / Spectral-Sphere-Optimizer

Spectral Sphere Optimizer

Python · 90 stars · 1 fork · Updated Jan 14, 2026

f-dangel / sirfshampoo

[ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496)

Python · 15 stars · 1 fork · Updated Nov 4, 2024

FFTYYY / mhc-lite

mHC-lite: You Don’t Need 20 Sinkhorn-Knopp Iterations

Python · 58 stars · 1 fork · Updated Jan 12, 2026

tokenbender / mHC-manifold-constrained-hyper-connections

implementations and experimentation on mHC by deepseek - https://arxiv.org/abs/2512.24880

Python · 274 stars · 22 forks · Updated Jan 4, 2026

shehper / scaling_laws

Forked from karpathy/nanoGPT

An open-source implementation of Scaling Laws for Neural Language Models using nanoGPT

Python · 53 stars · 7 forks · Updated Dec 8, 2023

ByteDance-Seed / Seed-1.8

Jupyter Notebook · 204 stars · 3 forks · Updated Dec 19, 2025

a-usually / Label-Noise-SGD

[AAAI2026 (oral)] On the Learning Dynamics of Two-layer Linear Networks with Label Noise SGD (Open Source Code)

Jupyter Notebook · 4 stars · Updated Nov 17, 2025

OptimAI-Lab / Minimalist_LLM_Pretraining

A Minimalist Optimizer Design for LLM Pretraining

Python · 14 stars · Updated Aug 7, 2025

jingyaogong / minimind

🚀🚀 [LLM] Train a small 26M-parameter GPT entirely from scratch in just 2 hours! 🌏

Python · 38,226 stars · 4,583 forks · Updated Jan 18, 2026

damek / specgd

Code to generate figures of paper "When do spectral gradient updates help in deep learning?"

Python · 13 stars · Updated Dec 3, 2025

Thinklab-SJTU / GenSCO

Python · 4 stars · Updated Jan 2, 2026

GeeeekExplorer / nano-vllm

Nano vLLM

Python · 11,197 stars · 1,463 forks · Updated Nov 3, 2025

jingyaogong / minimind-v

🚀 [LLM] Train a 26M-parameter vision-language model (VLM) from scratch in just 1 hour! 🌏

Python · 6,195 stars · 665 forks · Updated Jan 18, 2026

MiniMax-AI / MiniMax-M2

MiniMax-M2, a model built for Max coding & agentic workflows.

2,317 stars · 183 forks · Updated Nov 13, 2025

princeton-nlp / tree-of-thought-llm

[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Python · 5,808 stars · 596 forks · Updated Jan 16, 2025

llm-merging / LLM-Merging

LLM-Merging: Building LLMs Efficiently through Merging

Jupyter Notebook · 209 stars · 43 forks · Updated Sep 24, 2024

karpathy / nanochat

The best ChatGPT that $100 can buy.

Python · 40,999 stars · 5,314 forks · Updated Jan 29, 2026

ZHZisZZ / dllm

dLLM: Simple Diffusion Language Modeling

Python · 1,673 stars · 166 forks · Updated Jan 6, 2026

meituan-longcat / LongCat-Flash-Chat

1,284 stars · 63 forks · Updated Jan 21, 2026

facebookresearch / iGSM

The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Process" (arxiv 2407.20311) and "Physics of Language Models Part 2…

Python · 84 stars · 8 forks · Updated Jan 12, 2025

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python · 19,699 stars · 2,029 forks · Updated Jan 13, 2026

stepfun-ai / Step3

447 stars · 10 forks · Updated Aug 10, 2025

sii-research / predictive-consistency-learning

Forked from Thinklab-SJTU/predictive-consistency-learning

[ICML 2025] Generative Modeling Reinvents Supervised Learning: Label Repurposing with Predictive Consistency Learning

Python · 2 stars · Updated Jul 14, 2025

facebookresearch / PhysicsLM4

Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality

HTML · 315 stars · 16 forks · Updated Jan 5, 2026

cloneofsimo / min-max-gpt

Minimal (400 LOC) implementation, Maximum (multi-node, FSDP) GPT training

Python · 132 stars · 5 forks · Updated Apr 17, 2024

cloneofsimo / ezmup

Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD-only; don't use it for Adam.

Python · 85 stars · 4 forks · Updated Jul 28, 2024

cloneofsimo / minSAE

Python · 30 stars · Updated Dec 2, 2024

Starred topics

Awesome Lists