Paper Notes - Self-Attention

Self-Attention mechanism and code. What BERT, RoBERTa, ALBERT, SpanBERT, DistilBERT, SesameBERT, SemBERT, SciBERT, BioBERT, MobileBERT, TinyBERT, and CamemBERT all have in common is the self-attention mechanism. Self-attention is not only what allows an architecture to be called a "BERT"; more precisely, it is based on...
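For quick reference, a minimal sketch of scaled dot-product self-attention in plain NumPy; the names (self_attention, w_q, w_k, w_v) are illustrative, not taken from any particular BERT implementation.

```python
# A minimal sketch of scaled dot-product self-attention in plain NumPy.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_k) projections."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v        # project each token to query/key/value
    scores = q @ k.T / np.sqrt(k.shape[-1])    # pairwise similarities, scaled by sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the key dimension
    return weights @ v                          # each output is a weighted sum of values

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                     # 4 tokens, model dimension 8
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)          # shape (4, 8)
```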

2 min · Michelia-zhx

Paper Notes - Self-Supervised Learning

Why self-supervised learning: the main bottleneck is obtaining data and, above all, its annotations. Definition: Self-supervised learning is a method that poses the following question to formulate an unsupervised learning problem as a supervised one: “Can we design the task in such a way that we can generate virtually unlimited labels from our existing images and use that to learn the representations?” Replace...
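As one concrete illustration of generating labels from the data itself, here is a sketch of a classic pretext task (rotation prediction); the helper name make_rotation_batch is hypothetical, not from the reviewed paper.

```python
# A sketch of one classic pretext task (rotation prediction): the labels are
# generated from the images themselves, so annotation is free and unlimited.
import numpy as np

def make_rotation_batch(images):
    """images: (n, h, w) array -> rotated copies plus the rotation index as label."""
    xs, ys = [], []
    for img in images:
        for k in range(4):               # 0, 90, 180, 270 degrees
            xs.append(np.rot90(img, k))  # the self-generated input
            ys.append(k)                 # the self-generated label
    return np.stack(xs), np.array(ys)

imgs = np.zeros((2, 32, 32))             # stand-in for real unlabeled images
x, y = make_rotation_batch(imgs)         # x: (8, 32, 32), y: (8,)
```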

5 min · Michelia-zhx

Paper Notes - Vision Transformer

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In its overall implementation, the paper reuses the original BERT Transformer architecture unchanged; the main work lies in converting an image into token-like inputs. The paper...
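A sketch of the patch-to-token step, assuming 16x16 patches and pure NumPy; the actual ViT additionally applies a learned linear projection and adds position embeddings plus a [class] token.

```python
# Turn an image into a sequence of flattened 16x16 patch "words" (NumPy sketch).
import numpy as np

def image_to_patch_tokens(img, patch=16):
    """img: (h, w, c) with h and w divisible by patch -> (num_patches, patch*patch*c)."""
    h, w, c = img.shape
    grid = img.reshape(h // patch, patch, w // patch, patch, c)
    tokens = grid.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    return tokens                        # each row is one flattened patch, i.e. one "word"

img = np.zeros((224, 224, 3))
tokens = image_to_patch_tokens(img)      # (196, 768): a 14x14 grid of 768-dim tokens
```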

6 min · Michelia-zhx

Paper Notes - Week 1

Mining Typhoon Knowledge with Neural Networks – Zhi-Hua Zhou, Shi-Fu Chen, Zhao-Qian Chen - 1999. Problems to be solved: two drawbacks of neural networks – they require large amounts of data and long training times, and the knowledge a network learns cannot be used directly for decision making. Fast neural model - FTART (Field Theory...

6 min · Michelia-zhx

Paper Notes - Week 2

Multi-Instance Multi-Label Learning with Application to Scene Classification – Zhi-Hua Zhou, Min-Ling Zhang, NIPS 2006. Multi-instance: one example contains multiple instances but corresponds to only one label; Multi-label: one example corresponds to multiple...
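A sketch of the MIML data representation: each example is a bag of instance feature vectors paired with a set of labels. The feature values and label names below are made up for illustration.

```python
# MIML representation sketch: an example is a (bag of instances, label set) pair.
import numpy as np

rng = np.random.default_rng(0)
bag = rng.normal(size=(3, 5))            # one example = 3 region instances, 5 features each
labels = {"mountain", "sky"}             # labels attach to the bag, never to one instance

dataset = [(bag, labels)]                # a MIML dataset: list of (bag, label set) pairs
for instances, label_set in dataset:
    print(instances.shape, sorted(label_set))   # (3, 5) ['mountain', 'sky']
```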

5 min · Michelia-zhx