diff --git a/README.md b/README.md
index 3661c45d..d66a7472 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@ This repository contains code (training / metrics / plotting) to investigate dis
* **Standard VAE Loss** from [Auto-Encoding Variational Bayes](https://arxiv.org/abs/1312.6114)
* **β-VAEH** from [β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework](https://openreview.net/pdf?id=Sy2fzU9gl)
* **β-VAEB** from [Understanding disentangling in β-VAE](https://arxiv.org/abs/1804.03599)
-* **FactorVAE** from [Disentangling by Factorising](https://arxiv.org/pdf/1802.05983.pdf)
+* **FactorVAE** from [Disentangling by Factorising](https://arxiv.org/abs/1802.05983)
* **β-TCVAE** from [Isolating Sources of Disentanglement in Variational Autoencoders](https://arxiv.org/abs/1802.04942)
Notes:
@@ -274,8 +274,8 @@ The losses differ in their estimates of each of these terms and the hyperparamet
* [**Standard VAE Loss**](https://arxiv.org/abs/1312.6114): α=β=ɣ=1. Each term is computed exactly by a closed form solution (KL between the prior and the posterior). Tightest lower bound.
* [**β-VAEH**](https://openreview.net/pdf?id=Sy2fzU9gl): α=β=ɣ>1. Each term is computed exactly by a closed form solution. Simply adds a hyper-parameter (β in the paper) before the KL.
* [**β-VAEB**](https://arxiv.org/abs/1804.03599): α=β=ɣ>1. Same as **β-VAEH** but only penalizes the 3 terms once they deviate from a capacity C which increases during training.
-* [**FactorVAE**](https://arxiv.org/pdf/1802.05983.pdf): α=ɣ=1, β>1. Each term is computed exactly by a closed form solution. Simply adds a hyper-parameter (β in the paper) before the KL. Adds a weighted Total Correlation term to the standard VAE loss. The total correlation is estimated using a classifier and the density-ratio trick. Note that ɣ in their paper corresponds to β+1 in our framework.
-* [**β-TCVAE**](https://arxiv.org/pdf/1802.05983.pdf): α=ɣ=1 (although can be modified), β>1. Conceptually equivalent to FactorVAE, but each term is estimated separately using minibatch stratified sampling.
+* [**FactorVAE**](https://arxiv.org/abs/1802.05983): α=ɣ=1, β>1. Adds a weighted Total Correlation term to the standard VAE loss. The total correlation is estimated using a classifier and the density-ratio trick. Note that β in our framework equals ɣ+1, where ɣ is the Total Correlation weight in their paper.
+* [**β-TCVAE**](https://arxiv.org/abs/1802.04942): α=ɣ=1 (although these can be modified), β>1. Conceptually equivalent to FactorVAE, but each term is estimated separately using minibatch stratified sampling.