|
Deep Residual Learning for Image Recognition
|
2016
|
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
|
9
|
|
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
|
2020
|
Ying Cheng
Ruize Wang
Zhihao Pan
Rui Feng
Yuejie Zhang
|
5
|
|
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
|
2018
|
Andrew Owens
Alexei A. Efros
|
5
|
|
Audio-Visual Event Localization in Unconstrained Videos
|
2018
|
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
|
4
|
|
Rethinking the Inception Architecture for Computer Vision
|
2016
|
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jon Shlens
Zbigniew Wojna
|
3
|
|
Densely Connected Convolutional Networks
|
2017
|
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
|
3
|
|
Ambient Sound Provides Supervision for Visual Learning
|
2016
|
Andrew Owens
Jiajun Wu
Josh H. McDermott
William T. Freeman
Antonio Torralba
|
3
|
|
Deep Multimodal Clustering for Unsupervised Audiovisual Learning
|
2019
|
Di Hu
Feiping Nie
Xuelong Li
|
3
|
|
Temporal Convolutional Networks for Action Segmentation and Detection
|
2017
|
Colin Lea
M. D. Flynn
René Vidal
Austin Reiter
Gregory D. Hager
|
3
|
|
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
|
2020
|
Yapeng Tian
Dingzeyu Li
Chenliang Xu
|
3
|
|
Learning Deep Features for Discriminative Localization
|
2016
|
Bolei Zhou
Aditya Khosla
Àgata Lapedriza
Aude Oliva
Antonio Torralba
|
3
|
|
Attention Is All You Need
|
2017
|
Ashish Vaswani
Noam Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan N. Gomez
Łukasz Kaiser
Illia Polosukhin
|
3
|
|
Weakly Supervised Action Localization by Sparse Temporal Pooling Network
|
2018
|
Phuc Nguyen
Bohyung Han
Ting Liu
Gautam Prasad
|
3
|
|
A Closer Look at Spatiotemporal Convolutions for Action Recognition
|
2018
|
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
|
3
|
|
Very Deep Convolutional Networks for Large-Scale Image Recognition
|
2014
|
Karen Simonyan
Andrew Zisserman
|
3
|
|
Spatiotemporal Pyramid Network for Video Action Recognition
|
2017
|
Yunbo Wang
Mingsheng Long
Jianmin Wang
Philip S. Yu
|
2
|
|
Dynamic Convolution: Attention Over Convolution Kernels
|
2020
|
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Dongdong Chen
Lu Yuan
Zicheng Liu
|
2
|
|
Spatiotemporal Modeling for Crowd Counting in Videos
|
2017
|
Feng Xiong
Xingjian Shi
Dit-Yan Yeung
|
2
|
|
The Lasso Method for Variable Selection in the Cox Model
|
1997
|
Robert Tibshirani
|
2
|
|
Bayesian Loss for Crowd Count Estimation With Point Supervision
|
2019
|
Zhiheng Ma
Xing Wei
Xiaopeng Hong
Yihong Gong
|
2
|
|
Emerging Properties in Self-Supervised Vision Transformers
|
2021
|
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
|
2
|
|
ADCrowdNet: An Attention-Injective Deformable Convolutional Network for Crowd Understanding
|
2019
|
Ning Liu
Yongchao Long
Changqing Zou
Qun Niu
Li Pan
Hefeng Wu
|
2
|
|
Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization
|
2018
|
Bruno Korbar
Du Tran
Lorenzo Torresani
|
2
|
|
Temporal Pyramid Network for Action Recognition
|
2020
|
Ceyuan Yang
Yinghao Xu
Jianping Shi
Bo Dai
Bolei Zhou
|
2
|
|
UNet++: A Nested U-Net Architecture for Medical Image Segmentation
|
2018
|
Zongwei Zhou
Md Mahfuzur Rahman Siddiquee
Nima Tajbakhsh
Jianming Liang
|
2
|
|
A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling
|
2019
|
Yun Wang
Juncheng Li
Florian Metze
|
2
|
|
Attention Is All You Need
|
2017
|
Ashish Vaswani
Noam Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan N. Gomez
Łukasz Kaiser
Illia Polosukhin
|
2
|
|
Focal Loss for Dense Object Detection
|
2017
|
Tsung-Yi Lin
Priya Goyal
Ross Girshick
Kaiming He
Piotr Dollár
|
2
|
|
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
|
2017
|
João Carreira
Andrew Zisserman
|
2
|
|
CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes
|
2018
|
Yuhong Li
Xiaofan Zhang
Deming Chen
|
2
|
|
Generating High-Quality Crowd Density Maps Using Contextual Pyramid CNNs
|
2017
|
Vishwanath A. Sindagi
Vishal M. Patel
|
2
|
|
Dynamic Temporal Pyramid Network: A Closer Look at Multi-scale Modeling for Activity Detection
|
2019
|
Da Zhang
Xiyang Dai
Yuan-Fang Wang
|
2
|
|
ScreenerNet: Learning Self-Paced Curriculum for Deep Neural Networks
|
2018
|
Taehoon Kim
Jonghyun Choi
|
2
|
|
MentorNet: Regularizing Very Deep Neural Networks on Corrupted Labels
|
2017
|
Lu Jiang
Zhengyuan Zhou
Thomas Leung
Li-Jia Li
Li Fei-Fei
|
2
|
|
Squeeze-and-Excitation Networks
|
2018
|
Jie Hu
Li Shen
Gang Sun
|
2
|
|
Switching Convolutional Neural Network for Crowd Counting
|
2017
|
Deepak Babu Sam
Shiv Surya
R. Venkatesh Babu
|
2
|
|
Divide and Grow: Capturing Huge Diversity in Crowd Images with Incrementally Growing CNN
|
2018
|
Deepak Babu Sam
Neeraj N Sajjan
R. Venkatesh Babu
Mukundhan Srinivasan
|
2
|
|
Audio Set Classification with Attention Model: A Probabilistic Perspective
|
2018
|
Qiuqiang Kong
Yong Xu
Wenwu Wang
Mark D. Plumbley
|
2
|
|
Leveraging Unlabeled Data for Crowd Counting by Learning to Rank
|
2018
|
Xialei Liu
Joost van de Weijer
Andrew D. Bagdanov
|
2
|
|
Crowd Counting using Deep Recurrent Spatial-Aware Network
|
2018
|
Lingbo Liu
Hongjun Wang
Guanbin Li
Wanli Ouyang
Liang Lin
|
2
|
|
Numerical Continuation Methods: An Introduction
|
1992
|
Layne T. Watson
Eugene L. Allgower
Kurt Georg
|
2
|
|
Learning to Separate Object Sounds by Watching Unlabeled Video
|
2018
|
Ruohan Gao
Rogério Feris
Kristen Grauman
|
2
|
|
Locality-Constrained Spatial Transformer Network for Video Crowd Counting
|
2019
|
Yanyan Fang
Biyun Zhan
Wandi Cai
Shenghua Gao
Bo Hu
|
2
|
|
SMOTE: Synthetic Minority Over-sampling Technique
|
2002
|
Nitesh V. Chawla
Kevin W. Bowyer
Lawrence Hall
W. Philip Kegelmeyer
|
2
|
|
Zoom-in-Net: Deep Mining Lesions for Diabetic Retinopathy Detection
|
2017
|
Zhe Wang
Yanxin Yin
Jianping Shi
Wei Fang
Hongsheng Li
Xiaogang Wang
|
1
|
|
Numerical Continuation Methods
|
1990
|
Eugene L. Allgower
Kurt Georg
|
1
|
|
See, Hear, and Read: Deep Aligned Representations
|
2017
|
Yusuf Aytar
Carl Vondrick
Antonio Torralba
|
1
|
|
The Kinetics Human Action Video Dataset
|
2017
|
Andrew Zisserman
João Carreira
Karen Simonyan
Will Kay
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
T.C. Green
Trevor Back
|
1
|
|
Adaptive Self-Quantization of Wavelet Subtrees: A Wavelet-Based Theory of Fractal Image Compression
|
1995
|
Geoffrey M. Davis
|
1
|
|
Feature Pyramid Networks for Object Detection
|
2017
|
Tsung-Yi Lin
Piotr Dollár
Ross Girshick
Kaiming He
Bharath Hariharan
Serge Belongie
|
1
|