Tags: NVIDIA/kvpress
Release v0.4.0

- Add CURPress - CUR decomposition-based KV cache compression (#150)
- Add Compactor press for enhanced compression capabilities (#143)
- Add decoding press functionality for compression during decoding (#139)
- Add AIME25 and Math500 benchmark datasets for evaluation (#142)
- Add post_init_from_model hook to BasePress for model-specific initialization (#163)
- Move tests to GPU for faster CI (#132)
- Improve needle-in-haystack test (#133)
- Update README and documentation (#162)
- Update docstrings (#159)
- Update decoding notebook (#156)
- Move utils, clean and fix imports (#160)
- Fix LongBench-v2 benchmark (#161)
- Fix kvzip press access to past_key_values
- Fix ComposedPress (#148)
- Fix imports (#144)