rdspring1

Ryan Spring rdspring1

I contribute to PyTorch, Lightning-AI Thunder, and Nvidia/Fuser.

97 followers · 47 following

cutlass Public
Forked from NVIDIA/cutlass

CUDA Templates for Linear Algebra Subroutines

C++ Other Updated Dec 8, 2025
lightning-thunder Public
Forked from Lightning-AI/lightning-thunder

Source to source compiler for PyTorch. It makes PyTorch programs faster on single accelerators and distributed.

Python Apache License 2.0 Updated Dec 6, 2025
NvFuser Public
Forked from NVIDIA/Fuser

A Fusion Code Generator for NVIDIA GPUs

C++ Other Updated Oct 15, 2025
vllm Public
Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python Apache License 2.0 Updated Jul 21, 2025
simplegemm Public
Forked from bertmaher/simplegemm

Cuda MIT License Updated Jul 17, 2025
pytorch Public
Forked from pytorch/pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python Other Updated Jul 10, 2025
AITemplate Public
Forked from facebookincubator/AITemplate

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python Apache License 2.0 Updated Jul 27, 2023
vector-search-class-notes Public
Forked from edoliberty/vector-search-class-notes

Class notes for the course "Long Term Memory in AI - Vector Search and Databases" COS 495 @ Princeton Fall 2023

TeX MIT License Updated Jun 14, 2023
Auto-GPT Public
Forked from Significant-Gravitas/AutoGPT

An experimental open-source attempt to make GPT-4 fully autonomous.

Python MIT License Updated Apr 3, 2023
twitter-algorithm-ml Public
Forked from twitter/the-algorithm-ml

Source code for Twitter's Recommendation Algorithm

Python GNU Affero General Public License v3.0 Updated Apr 1, 2023
nvprims-torchdynamo Public
Forked from pytorch/torchdynamo

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.

Python BSD 3-Clause "New" or "Revised" License Updated Nov 9, 2022
Autodiff-Puzzles Public
Forked from srush/Autodiff-Puzzles

Jupyter Notebook MIT License Updated Oct 31, 2022
Autopilot-TensorFlow Public
Forked from SullyChen/Autopilot-TensorFlow

A TensorFlow implementation of this Nvidia paper: https://arxiv.org/pdf/1604.07316.pdf with some changes

Jupyter Notebook MIT License Updated Oct 10, 2022
tutel Public
Forked from microsoft/Tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation

Python MIT License Updated Sep 3, 2022
micrograd Public
Forked from karpathy/micrograd

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

Jupyter Notebook MIT License Updated Aug 29, 2022
minGPT Public
Forked from karpathy/minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Python MIT License Updated Aug 5, 2022
Optimizing-SGEMM-on-NVIDIA-Turing-GPUs Public
Forked from yzhaiustc/Optimizing-SGEMM-on-NVIDIA-Turing-GPUs

Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.

Python GNU General Public License v3.0 Updated Jun 18, 2022
RzLinear Public
Forked from apd10/RzLinear

A compressed alternative to matrix multiplication using state-of-the-art compression ROBE-Z

Python 1 MIT License Updated May 18, 2022
LSH_DeepLearning Public

Scalable and Sustainable Deep Learning via Randomized Hashing

deep-learning neural-network parallel-computing locality-sensitive-hashing randomised-algorithms

Java 94 22 Apache License 2.0 Updated May 16, 2022
cuda-training-series Public
Forked from olcf/cuda-training-series

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Cuda Updated Apr 4, 2022
xla Public
Forked from pytorch/xla

Enabling PyTorch on Google TPU

C++ Other Updated Feb 9, 2022
Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F Public
Forked from yzhaiustc/Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F

Stepwise optimizations of DGEMM on CPU, reaching performance faster than Intel MKL eventually, even under multithreading.

C GNU General Public License v3.0 Updated Feb 3, 2022
mongoose Public
Forked from HazyResearch/mongoose

A Learnable LSH Framework for Efficient NN Training

Python MIT License Updated Jul 22, 2021
Optimizing-DGEMV-on-Intel-CPUs Public
Forked from yzhaiustc/Optimizing-DGEMV-on-Intel-CPUs

Highly optimized DGEMV on CPU with both serial and parallel performance better than MKL and OpenBLAS.

C GNU General Public License v3.0 Updated May 24, 2021
dlrm_ssm Public
Forked from yanzhoupan/dlrm_ssm

Python MIT License Updated Mar 24, 2021
cs231n Public
Forked from AutomanHan/standford-cs231n-2018

Solutions to Stanford CS231n Spring 2018 Course Assignments.

Jupyter Notebook Updated Nov 18, 2020
Count-Sketch-Optimizers Public

A compressed adaptive optimizer for training large-scale deep learning models using PyTorch

hashing deep-learning neural-network pytorch transformer imagenet count-min-sketch

Python 26 13 Apache License 2.0 Updated Nov 26, 2019
MISSION Public

MISSION: Ultra Large-Scale Feature Selection using Count-Sketches

hashing feature-extraction large-scale-learning compressive-sensing dna-metagenomics count-sketches

C++ 13 6 Apache License 2.0 Updated Oct 6, 2019
PyTorch_GBW_LM Public

PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

nlp machine-learning deep-learning gpu torch pytorch lstm

Python 123 20 Apache License 2.0 Updated Aug 22, 2019
LSH-Mutual-Information Public

Use LSH Sampling for Mutual Information Estimation

Python 5 Apache License 2.0 Updated Aug 3, 2019
