Stars
Data collection code for NeurIPS 2021 Datasets and Benchmarks Track paper: 'The Met Dataset: Instance-level Recognition for Artworks'
Code for ICIP 2017 paper: 'Single Shot Object Detection with Top-Down Refinement'
Code for AAAI 2022 Oral paper: 'Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment'
Code for CVPR 2022 Oral paper: 'Few-Shot Object Detection with Fully Cross-Transformer'
Code for ICCV 2021 paper: 'Query Adaptive Few-Shot Object Detection with Heterogeneous Graph Convolutional Networks'
Context-Transformer: Tackling Object Confusion for Few-Shot Detection, AAAI 2020
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
Video Representation Learning by Dense Predictive Coding. Tengda Han, Weidi Xie, Andrew Zisserman.
[arXiv 2019] "Contrastive Multiview Coding", also contains implementations for MoCo and InstDis
SCOPS: Self-Supervised Co-Part Segmentation (CVPR'19)
The code for Expectation-Maximization Attention Networks for Semantic Segmentation (ICCV'2019 Oral)
[ICCV 2019 (Oral)] Temporal Attentive Alignment for Large-Scale Video Domain Adaptation (PyTorch)
OpenMMD is an OpenPose-based application that converts videos of real people into motion files (.vmd) that can directly drive animated movies of 3D models (e.g. Miku, Anmicius).
PyTorch repo for DeepGCNs (ICCV'2019 Oral, TPAMI'2021), DeeperGCN (arXiv'2020) and GNN1000 (ICML'2021): https://www.deepgcns.org
A curated list of awesome architecture search resources
Starter code for working with the YouTube-8M dataset.
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
Awesome Person Re-identification
W-TALC: Weakly-supervised Temporal Activity Localization and Classification
3D ResNets for Action Recognition (CVPR 2018)
A collection of important graph embedding, classification and representation learning papers with implementations.
A curated list of action recognition and related area resources
A list of papers on skeleton-based action recognition.
Translate images to unseen domains at test time with a few example images.
Scaling and Benchmarking Self-Supervised Visual Representation Learning