Bias-Variance Decomposition

Bias–variance tradeoff: The bias–variance dilemma, or bias–variance problem, is the conflict in trying to simultaneously minimize these two sources of error that prevent supervised learning algorithms from generalizing beyond their training set: - The...

1 min · Michelia-zhx

Paper Notes - Attention

1. Attention mechanism 1a. Background: The NMT model we know best is the classic Seq2Seq. This post starts from a Seq2Seq model and then goes on to look at how Attent...

3 min · Michelia-zhx

Paper Notes - Self-Attention

Self-Attention mechanism and code: What BERT, RoBERTa, ALBERT, SpanBERT, DistilBERT, SesameBERT, SemBERT, SciBERT, BioBERT, MobileBERT, TinyBERT, and CamemBERT have in common is the self-attention mechanism. Self-attention is not only the reason an architecture gets called "BERT"; more precisely, it is based on...

2 min · Michelia-zhx