Notes - Convex Optimization and Approximation (EE 227C)

1 Convexity 1.1 Convex sets $\textbf{Definition 1.1}$ (Convex set). A set $K \subseteq \mathbb{R}^n$ is convex if the line segment between any two points in $K$ is also contained in $K$. Formally, for all $x, y\in K$ and all scalars $\gamma\in [0,1]$, we have $\gamma x+(1-\gamma)y\in K$. $\textbf{Theorem 1.2}$ (Separation Theorem). Let $C, K \subseteq \mathbb{R}^n$ be convex sets with empty intersection $C \cap K = \emptyset$. Then there exists a point $a\in \mathbb{R}^n$ and a number $b\in \mathbb{R}$ such that...

1 min · Michelia-zhx

Notes - Convex Optimization: Algorithms and Complexity

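1 Introduction Definition 1.1 (Convex sets and convex functions). The problem we care about is minimizing a convex function defined over a convex set. 1.1 Convex optimization problems in machine learning $$\min_{x\in\mathbb{R}^n}\sum_{i=1}^m f_i(x) + \gamma\mathcal{R}(x)$$ The objective being minimized is a cost term plus a model-complexity term. Classification, SVM: $f_i(x) =...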

2 min · Michelia-zhx

Paper Notes - Active Learning

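Active learning is also known as query learning or optimal experimental design. By designing a suitable query function, active learning repeatedly selects samples from the unlabeled data, has them labeled, and adds them to the training set. Effective data selection strategies for active learning...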

3 min · Michelia-zhx

Paper Notes - Attention

1. The attention mechanism 1a. Background The NMT model we are most familiar with is the classic Seq2Seq. This post starts from a Seq2Seq model and then looks at how Attent...

3 min · Michelia-zhx

Paper Notes - Multi-Teacher Knowledge Distillation - 1

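Learning from Multiple Teacher Networks http://library.usc.edu.ph/ACM/KKD%202017/pdfs/p1285.pdf Loss: the cross-entropy between the averaged softmax outputs of the teachers and the student; the relative dissimilarity of intermediate-layer representations (applies only to MTKD), with triplets $(q_i...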

13 min · Michelia-zhx