
Latent Alignment and Variational Attention

Abstract: Neural attention has become central to many state-of-the-art models in natural language processing and related domains. Attention networks are an easy-to-train and effective method for softly simulating alignment; however, the approach does not marginalize over latent alignments in a probabilistic sense. This property makes it difficult to compare attention to other alignment approaches, to compose it with probabilistic models, and to perform posterior inference conditioned on observed data. A related latent approach, hard attention, fixes these issues, but is generally harder to train and less accurate. This work considers variational attention networks, alternatives to soft and hard attention for learning latent variable alignment models, with tighter approximation bounds based on amortized variational inference. We further propose methods for reducing the variance of gradients to make these approaches computationally feasible. Experiments show that for machine translation and visual question answering, inefficient exact latent variable models outperform standard neural attention, but these gains go away when using hard attention based training. On the other hand, variational attention retains most of the performance gain but with training speed comparable to neural attention.
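The abstract's key distinction, that soft attention feeds the decoder the *expected* context while an exact latent alignment model marginalizes the *likelihood* over alignments, can be illustrated numerically. The following is a minimal numpy sketch, not the paper's method; the toy decoder likelihood and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 4, 3                                # source length, hidden size (toy values)
memory = rng.normal(size=(T, d))           # encoder states m_1..m_T
scores = rng.normal(size=T)
p = np.exp(scores) / np.exp(scores).sum()  # alignment distribution p(z)

def likelihood(context):
    # stand-in decoder likelihood given a context vector (illustrative only)
    return float(np.exp(-np.sum(context ** 2)))

# Soft attention: likelihood of the expected context, f(E_p[m_z])
soft = likelihood(p @ memory)

# Exact latent alignment: expected likelihood, sum_z p(z) f(m_z)
exact = sum(p[z] * likelihood(memory[z]) for z in range(T))

# The two generally differ, which is why soft attention is not a marginal likelihood
print(soft, exact)
```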


Implicit Autoencoders

Abstract: In this paper, we describe the “implicit autoencoder” (IAE), a generative autoencoder in which both the generative path and the recognition path are parametrized by implicit distributions. We use two generative adversarial networks to define the reconstruction and the regularization cost functions of the implicit autoencoder, and derive the learning rules based on maximum-likelihood learning. Using implicit distributions allows us to learn more expressive posterior and conditional likelihood distributions for the autoencoder. Learning an expressive conditional likelihood distribution enables the latent code to only capture the abstract and high-level information of the data, while the remaining information is captured by the implicit conditional likelihood distribution. For example, we show that implicit autoencoders can disentangle the global and local information, and perform deterministic or stochastic reconstructions of the images. We further show that implicit autoencoders can disentangle discrete underlying factors of variation from the continuous factors in an unsupervised fashion, and perform clustering and semi-supervised learning.


Deep Learning Open Courses

National Taiwan University, Hung-yi Lee: Deep Learning (2017):
http://speech.ee.ntu.edu.tw/~tlkagk/courses_MLDS17.html

Stanford, Andrew Ng: Machine Learning
http://cs229.stanford.edu/

Stanford, Andrew Ng: Deep Learning
https://www.deeplearning.ai/

Stanford, Fei-Fei Li: CS231n: Convolutional Neural Networks for Visual Recognition
http://cs231n.stanford.edu/
http://cs231n.github.io/

Deep learning for NLP open course, Stanford, Richard Socher, CS224d: Deep Learning and Natural Language Processing:
http://cs224d.stanford.edu

Building Your Own Neural Network with TensorFlow (Morvan Python tutorial): https://morvanzhou.github.io/tutorials/machine-learning/tensorflow/

Wasserstein Auto-Encoders

Abstract: We propose the Wasserstein Auto-Encoder (WAE)—a new algorithm for building a generative model of the data distribution. WAE minimizes a penalized form of the Wasserstein distance between the model distribution and the target distribution, which leads to a different regularizer than the one used by the Variational Auto-Encoder (VAE) [1]. This regularizer encourages the encoded training distribution to match the prior. We compare our algorithm with several other techniques and show that it is a generalization of adversarial auto-encoders (AAE) [2]. Our experiments show that WAE shares many of the properties of VAEs (stable training, encoder-decoder architecture, nice latent manifold structure) while generating samples of better quality, as measured by the FID score.
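One concrete form of the regularizer the abstract mentions is a kernel MMD penalty between the aggregate encoded distribution and the prior (the paper's WAE-MMD variant). Below is a hedged numpy sketch of just that penalty; the RBF kernel, bandwidth, sample counts, and λ are illustrative choices, and the reconstruction term is omitted.

```python
import numpy as np

def mmd(x, y, scale=1.0):
    """Biased MMD^2 estimate with an RBF kernel between two sample sets."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * scale ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(1)
encoded = rng.normal(loc=0.5, size=(256, 2))  # stand-in for samples of q(z)
prior   = rng.normal(size=(256, 2))           # p(z) = N(0, I)

penalty = mmd(encoded, prior)
lam = 10.0
# WAE-style objective would be: reconstruction cost + lam * penalty
print(lam * penalty)
```

The penalty shrinks toward zero as the encoded distribution matches the prior, which is exactly the "encourages the encoded training distribution to match the prior" behavior described above.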


Decision Tree

A decision tree is a machine learning algorithm for classification/regression that makes decisions based on a tree structure. A decision tree consists of a root node, internal nodes, and leaf nodes. The root node holds the full sample set, internal nodes correspond to attribute tests, and leaf nodes correspond to decision outcomes.
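The node roles described above can be made concrete with a minimal from-scratch classifier: internal nodes store an (attribute, threshold) test chosen by information gain, and leaves store a majority class. This is a toy sketch, assuming numeric features and axis-aligned splits; the data and depth limit are illustrative.

```python
import numpy as np
from collections import Counter

def entropy(y):
    counts = np.bincount(y)
    p = counts[counts > 0] / len(y)
    return -(p * np.log2(p)).sum()

def best_split(X, y):
    # pick the (feature, threshold) test with the highest information gain
    best = (None, None, -1.0)
    base = entropy(y)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue
            gain = base - (left.mean() * entropy(y[left]) +
                           (~left).mean() * entropy(y[~left]))
            if gain > best[2]:
                best = (j, t, gain)
    return best

def build(X, y, depth=0, max_depth=3):
    # leaf node: pure subset or depth limit reached -> majority class
    if len(set(y)) == 1 or depth == max_depth:
        return Counter(y).most_common(1)[0][0]
    j, t, gain = best_split(X, y)
    if j is None or gain <= 0:
        return Counter(y).most_common(1)[0][0]
    left = X[:, j] <= t
    # internal node: (feature index, threshold, left subtree, right subtree)
    return (j, t, build(X[left], y[left], depth + 1, max_depth),
                  build(X[~left], y[~left], depth + 1, max_depth))

def predict(node, x):
    while isinstance(node, tuple):      # walk from the root to a leaf
        j, t, lo, hi = node
        node = lo if x[j] <= t else hi
    return node

# toy data: class 1 when the second feature is large
X = np.array([[0.1, 0.2], [0.9, 0.8], [0.7, 0.9], [0.2, 0.7], [0.8, 0.3]])
y = np.array([0, 1, 1, 0, 0])
tree = build(X, y)
print(predict(tree, np.array([0.85, 0.9])))  # -> 1
```

The root receives the full sample set, each recursive call narrows it down, and prediction is just a walk from the root to a leaf, matching the structure described above.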


Variational Inference: A Unified Framework of Generative Models and Some Revelations

We reinterpret variational inference from a new perspective. In this way, we can easily prove that the EM algorithm, VAE, GAN, AAE, and ALI (BiGAN) are all special cases of variational inference. The proof also reveals that the loss of the standard GAN is incomplete, which explains why GANs need to be trained cautiously. From this, we derive a regularization term that improves the stability of GAN training.
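The unifying object behind the EM/VAE connection claimed here is the standard evidence lower bound; writing it out (standard notation, not taken from the paper itself) shows where both algorithms live:

```latex
\log p_\theta(x)
  = \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
    - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\Vert\, p(z)\right)}_{\text{ELBO}}
  + \mathrm{KL}\!\left(q_\phi(z \mid x) \,\Vert\, p_\theta(z \mid x)\right)
```

A VAE maximizes the ELBO jointly in $(\theta, \phi)$ with an amortized $q_\phi$, while EM's E-step sets $q$ to the exact posterior $p_\theta(z \mid x)$, driving the last KL term to zero; viewing adversarial models through the same bound is the paper's contribution.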
