yangzy39/README.md

Hi there 👋

My name is Ziyi Yang (杨子逸). You can call me Ziyi.

  • 🌱 I’m currently learning at Sun Yat-sen University as a third-year MS student (expected to graduate in 2026), advised by Prof. Xiaojun Quan. Before this, I received my Bachelor's degree (2019-2023, computer science and technology) from Sun Yat-sen University. I am currently an intern at Tongyi Lab, Alibaba Group (2025.05-now).
  • πŸ€” My primary research interests lie at several key areas in LLM post-training. These include heterogeneous model fusion, with a focus on integrating diverse LLMs into a stronger one; advanced preference learning algorithms such as DPO and SimPO; the development of large reasoning models (LRMs) capable of adaptive thinking; and novel reinforcement learning (RL) methodologies, particularly in long-context reasoning and mutli-agent self-play scenarios. My representative publications are listed below.
  • πŸ”­ I’m actively seeking algorithm jobs focused on LLM mid-training & post-training, with interest in discovering novel mutli-task training paradigm, advanced RL algorithm (e.g., multi-agent self-play), and scalable reward system for non-verifiable tasks (e.g., rubric as rewards, generative verifier).
  • πŸ“« How to reach me: E-mail

View my homepage.

Pinned

  1. WRPO (Public)

    [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion

    Python · 3 stars · 2 forks

  2. FuseChat-3.0 (Public)

    Forked from SLIT-AI/FuseChat-3.0

    Python 2

  3. arcee-ai/mergekit (Public)

    Tools for merging pretrained large language models.

    Python · 6.8k stars · 666 forks

  4. tatsu-lab/alpaca_eval (Public)

    An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

    Jupyter Notebook · 1.9k stars · 302 forks

  5. 18907305772/FuseAI (Public)

    FuseAI Project

    Python · 88 stars · 47 forks