Paper Notes - Multi-Teacher Knowledge Distillation - 2

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks (https://arxiv.org/pdf/2004.05937.pdf). Learning from Multiple Teacher Networks, KDD 2017: efficient knowledge distillation from an ensemble of teachers. Interspeech 2017: take a weighted average of the teachers' logits; the weighted average and the student's logit...

12 min · Michelia-zhx
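
A minimal sketch of the weighted-logit-averaging scheme the excerpt attributes to the Interspeech 2017 paper, assuming a standard temperature-scaled KL distillation loss; the weights `alphas` and temperature `T` are illustrative choices, not values from the paper.

```python
# Hedged sketch: distill a student from the weighted average of several
# teachers' logits. `alphas` and `T` are illustrative, not from the paper.
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, alphas, T=4.0):
    """KL divergence between the student and a weighted average of teacher logits."""
    # Weighted average of the teachers' logits (weights assumed to sum to 1).
    avg_logits = sum(a * t for a, t in zip(alphas, teacher_logits_list))
    soft_targets = F.softmax(avg_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # Standard temperature-scaled distillation loss.
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)

# Toy usage: two teachers, batch of 8, 10 classes.
student = torch.randn(8, 10)
teachers = [torch.randn(8, 10), torch.randn(8, 10)]
loss = multi_teacher_kd_loss(student, teachers, alphas=[0.6, 0.4])
```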

Paper Notes - Object Detection

Definitions. Image Classification: given an image, output the class of the target object in it. Object Localization: given an image, output the bounding box of the object in it. Object Detection: given an image, output both the bounding boxes and the classes of the objects in it. The R-CNN model family uses region proposal methods, first...

6 min · Michelia-zhx
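
As a hedged aside on the bounding boxes these definitions revolve around: a common convention, assumed here rather than stated in the post, is (x1, y1, x2, y2) corner coordinates, with intersection-over-union as the standard way to compare a predicted box against a ground-truth box.

```python
# Illustrative only: IoU for two boxes in the assumed (x1, y1, x2, y2) format.
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)       # overlap area, 0 if disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))          # 0.142857...
```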

Paper Notes - Self-Attention

The self-attention mechanism and its code. What BERT, RoBERTa, ALBERT, SpanBERT, DistilBERT, SesameBERT, SemBERT, SciBERT, BioBERT, MobileBERT, TinyBERT, and CamemBERT have in common is the self-attention mechanism. The self-attention mechanism is not just what earns an architecture the name "BERT"; more precisely, it is based on...

2 min · Michelia-zhx
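
A minimal sketch of the scaled dot-product self-attention the note refers to, for a single head with illustrative projection matrices; real BERT-family models add multiple heads, masking, and per-layer learned parameters.

```python
# Hedged sketch: single-head scaled dot-product self-attention.
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """softmax(QK^T / sqrt(d)) V over a sequence of token embeddings."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v          # project input to Q, K, V
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5  # pairwise token affinities
    weights = F.softmax(scores, dim=-1)          # attention distribution per token
    return weights @ v                           # weighted sum of values

# Toy usage: sequence of 5 tokens with dimension 16.
x = torch.randn(5, 16)
w_q, w_k, w_v = (torch.randn(16, 16) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)           # shape (5, 16)
```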

Paper Notes - Self-Supervised Learning

Why self-supervised learning: the main difficulty lies in acquiring data and its annotations. Definition: Self-supervised learning is a method that poses the following question to formulate an unsupervised learning problem as a supervised one: “Can we design the task in such a way that we can generate virtually unlimited labels from our existing images and use that to learn the representations?” Replace...

5 min · Michelia-zhx
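
To make the "virtually unlimited labels" idea concrete, here is a hedged sketch of one classic pretext task, rotation prediction (RotNet-style; the choice of task is mine, not necessarily the post's): every unlabeled image yields four labeled examples for free.

```python
# Hedged sketch: generate supervised labels from unlabeled images by rotating
# them and asking the model to predict the rotation (RotNet-style pretext task).
import torch

def make_rotation_batch(images):
    """Rotate each image by 0/90/180/270 degrees; the rotation index is the label."""
    rotated, labels = [], []
    for k in range(4):                       # k quarter-turns
        rotated.append(torch.rot90(images, k, dims=(2, 3)))
        labels.append(torch.full((images.size(0),), k, dtype=torch.long))
    return torch.cat(rotated), torch.cat(labels)

# Toy usage: 8 unlabeled 3x32x32 images become 32 labeled examples.
x = torch.randn(8, 3, 32, 32)
inputs, targets = make_rotation_batch(x)     # (32, 3, 32, 32), (32,)
```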

Paper Notes - Vision Transformer

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In the overall implementation, the paper reuses the original BERT transformer structure unchanged; the main work lies in converting the image into token-like inputs. The original paper...

6 min · Michelia-zhx
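
A minimal sketch, under assumed sizes, of the image-to-token step the excerpt describes: cut the image into non-overlapping 16x16 patches and linearly embed each one, so a BERT-style transformer can treat patches as words. ViT additionally prepends a class token and adds position embeddings, omitted here.

```python
# Hedged sketch: ViT-style patch tokenization with illustrative sizes.
import torch
import torch.nn as nn

patch, dim = 16, 768
to_patches = nn.Unfold(kernel_size=patch, stride=patch)  # cut non-overlapping 16x16 patches
embed = nn.Linear(3 * patch * patch, dim)                # one linear projection per patch

x = torch.randn(1, 3, 224, 224)                          # a single RGB image
patches = to_patches(x).transpose(1, 2)                  # (1, 196, 768) raw patch pixels
tokens = embed(patches)                                  # (1, 196, dim) "words" for the transformer
```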