Video Recognition

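Also called Action Recognition. Video recognition, or video classification, is the task of classifying a sequence of consecutive frames in a video. The input can be the whole video or a single clip taken from it.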

Dataset

Benchmark

3D-Conv

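Conventional 2D-conv works well on single-frame images, but when applied to multi-frame video it loses the temporal relationships between frames. The same holds for other data with an ordering along a third axis, such as the slices of a CT or MRI medical volume.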

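In conclusion: the core difference between 3D and 2D convolution is the shape of the output feature map, which is three-dimensional (time or depth, height, width) for 3D-conv and two-dimensional (height, width) for 2D-conv.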
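The shape difference can be checked with a small calculation. The sketch below is only an illustration: the helper name `conv_output_shape`, the 16-frame 112x112 clip, and the kernel sizes are assumptions chosen for the example, not values from these notes. It applies the standard output-size formula `(n + 2 * padding - k) // stride + 1` along each axis the kernel slides over.

```python
from typing import Sequence, Tuple


def conv_output_shape(
    input_shape: Sequence[int],
    kernel_shape: Sequence[int],
    stride: int = 1,
    padding: int = 0,
) -> Tuple[int, ...]:
    """Per-channel output shape of a convolution that slides over every
    axis listed in input_shape (hypothetical helper for illustration)."""
    if len(input_shape) != len(kernel_shape):
        raise ValueError("kernel must have one size per input axis")
    return tuple(
        (n + 2 * padding - k) // stride + 1
        for n, k in zip(input_shape, kernel_shape)
    )


# Assumed example clip: 16 frames of 112 x 112 pixels.
frames, height, width = 16, 112, 112

# 2D-conv: the kernel slides over (height, width) only. If the frames are
# stacked as input channels, they are summed away, so one output channel
# is a flat 2D map and the frame order is no longer visible in the output.
out_2d = conv_output_shape((height, width), (3, 3))

# 3D-conv: the kernel also slides along the time axis, so one output
# channel keeps a time dimension.
out_3d = conv_output_shape((frames, height, width), (3, 3, 3))

print("2D-conv output per channel:", out_2d)
print("3D-conv output per channel:", out_3d)
```

With these assumed sizes the 2D case yields a two-element shape and the 3D case a three-element shape, which is the distinction stated in the conclusion above.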

Method

We separate the method into two parts, extraction and classification, as introduced by (Unsupervised Learning from Video with Deep Neural Embeddings).
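The two-part split can be pictured as two independent functions, one mapping a clip to a feature vector and one mapping a feature vector to a label. The sketch below is a toy stand-in and not the model from the cited paper: `extract_features`, `classify`, the mean-intensity feature, and the nearest-centroid rule are all hypothetical choices made only to show the structure.

```python
from typing import Dict, List

Frame = List[List[float]]  # one grayscale frame, height x width
Clip = List[Frame]         # a sequence of frames


def extract_features(clip: Clip) -> List[float]:
    """Stage 1 (toy stand-in): mean intensity of each frame,
    giving one number per frame."""
    features = []
    for frame in clip:
        total = sum(sum(row) for row in frame)
        count = sum(len(row) for row in frame)
        features.append(total / count)
    return features


def classify(features: List[float], centroids: Dict[str, List[float]]) -> str:
    """Stage 2 (toy stand-in): label of the nearest centroid
    by squared Euclidean distance."""
    def sq_dist(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(features, centroids[label]))


# Assumed two-frame clip that goes from dark to bright.
clip = [
    [[0.1, 0.1], [0.1, 0.1]],
    [[0.9, 0.9], [0.9, 0.9]],
]
# Assumed class centroids in the same feature space.
centroids = {
    "brightening": [0.0, 1.0],
    "darkening": [1.0, 0.0],
}

label = classify(extract_features(clip), centroids)
print(label)
```

Keeping the two stages separate means the extractor can be swapped (for example, for a learned embedding) without touching the classifier, as long as both agree on the feature dimension.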