dyth/README.md

David Yu-Tung Hui / 許宇同

There are multiple ways to write my name. In Latin script, my surname is "Hui" and my first name is "David Yu-Tung." In Traditional Chinese characters, my family name is "許" and my given name is "宇同." Most people call me "David." Others call me "宇同" or "Yu-Tung."

I am currently unemployed. I used to be an AI researcher in deep reinforcement learning. I wrote two works improving the optimization stability of off-policy gradient-based Q-learning algorithms.

  1. Stabilizing Q-Learning for Continuous Control
    David Yu-Tung Hui
    MSc Thesis, University of Montreal, 2022
    I derived a deep reinforcement learning algorithm from mathematical first principles. The SACLite loss functions follow from the principle of maximum entropy, and a neural-tangent-kernel-inspired analysis justifies the use of LayerNorm. Unlike baseline actor-critic algorithms, my algorithm did not diverge in high-dimensional continuous control.
    [.pdf] [Errata]

  2. Double Gumbel Q-Learning
    David Yu-Tung Hui, Aaron Courville, Pierre-Luc Bacon
    Spotlight at NeurIPS 2023
    We showed that Q-learning with function approximation has two previously unnoticed heteroscedastic Gumbel noise sources. An algorithm accounting for these noise sources attained almost twice the aggregate asymptotic performance of the popular SAC baseline.
    [.pdf] [Reviews] [Poster (.png)] [5-min talk] [1-hour seminar] [Code (GitHub)] [Errata]

The best way to contact me is by email. My email address is listed in one of my written works.

Pinned

  1. doublegum (Public)
    NeurIPS 2023 Spotlight
    Python, 10 stars, 5 forks

  2. causal-entropic-forces (Public)
    Python reimplementation of Wissner-Gross & Freer, 2013
    Jupyter Notebook, 13 stars, 5 forks