
Conversation


@fehiepsi fehiepsi commented Jun 4, 2019

Currently, when the input of MVN is a precision matrix, we take its inverse to convert it to a covariance matrix. This, however, can easily make the covariance matrix non-positive-definite, which triggers a Cholesky error.

For example,

import torch
torch.manual_seed(0)
x = torch.randn(10)
P = torch.exp(-(x - x.unsqueeze(-1)) ** 2)
torch.distributions.MultivariateNormal(loc=torch.ones(10), precision_matrix=P)

will trigger RuntimeError: cholesky_cpu: U(8,8) is zero, singular U.

This PR uses some math tricks (ref) so that we only take the inverse of a triangular matrix, hence increasing stability.
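The idea can be sketched as follows (a minimal sketch using today's torch.linalg API, which postdates this PR; precision_to_scale_tril is my name for the helper, not necessarily the one added here): factor the precision matrix directly and invert only a triangular factor, never the full matrix.

```python
import torch

def precision_to_scale_tril(P):
    # Sketch of the trick: if P = Sigma^{-1}, flip P along both axes and
    # take the Cholesky factor of the flipped matrix.  Un-flipping that
    # factor yields an *upper*-triangular U with P = U @ U.T, so
    # Sigma = U^{-T} @ U^{-1} and the lower-triangular Cholesky factor of
    # Sigma is U^{-T}.  We obtain it by solving a triangular system --
    # no full matrix inverse (and no Cholesky of a computed inverse).
    Lf = torch.linalg.cholesky(torch.flip(P, (-2, -1)))
    U_t = torch.flip(Lf, (-2, -1)).transpose(-2, -1)  # = U.T, lower triangular
    Id = torch.eye(P.shape[-1], dtype=P.dtype)
    return torch.linalg.solve_triangular(U_t, Id, upper=False)  # = U^{-T}
```

The returned factor L satisfies L @ L.T == inverse(P) up to round-off, but is computed without ever forming that inverse.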

cc @fritzo, @neerajprad , @ssnl

@pytorchbot pytorchbot added the module: distributions Related to torch.distributions label Jun 4, 2019

fritzo commented Jun 6, 2019

@pytorchbot merge this please

@pytorchbot pytorchbot added the merge-this-please Was marked for merge with @pytorchbot merge this please label Jun 6, 2019
@facebook-github-bot left a comment

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot

@ezyang merged this pull request in f8cab38.

@rojinsafavi

Is this error fixed? I am getting this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-8-7babe3f1c55d> in <module>
      1 mu2 = mean1
      2 std2 = Cov1
----> 3 q = MultivariateNormal(mu2, std2)

/projects/sysbio/projects/rojin/anaconda3/lib/python3.6/site-packages/torch/distributions/multivariate_normal.py in __init__(self, loc, covariance_matrix, precision_matrix, scale_tril, validate_args)
    140             if precision_matrix is not None:
    141                 self.covariance_matrix = torch.inverse(precision_matrix).expand_as(loc_)
--> 142             self._unbroadcasted_scale_tril = torch.cholesky(self.covariance_matrix)
    143 
    144     def expand(self, batch_shape, _instance=None):

RuntimeError: cholesky_cpu: For batch 0: U(13,13) is zero, singular U.

@fehiepsi
Contributor Author

@rojinsafavi This PR enhances the stability for precision_matrix. It looks like your std2 is not positive definite. You may be able to fix it by adding a small term to the diagonal of std2.

@rojinsafavi

So I'm actually using the same matrix that I use in R's kldiv function, and it works fine over there. Is something different here?

@rojinsafavi

@rojinsafavi This PR enhances the stability for precision_matrix. It looks like your std2 is not positive definite. You may be able to fix it by adding a small term to the diagonal of std2.

This is the diagonal, no negative number:

array([0.5891058 , 0.7394688 , 0.84647965, 0.7394722 , 0.77156126, 0.5949623 , 0.7069786 , 1.4996048 , 0.49080923, 0.90659595, 0.29516238, 0.83300734, 0.81732637, 0.71747035, 0.6509524 , 0.797131 , 1.103374 , 0.5067505 , 1.1981595 , 1.0099905 , 0.9176539 , 0.70608026, 0.42208812, 1.0714566 , 0.7964946 ], dtype=float32)

@fehiepsi
Contributor Author

I think a positive diagonal is not equivalent to positive definiteness. You can check whether std2 is positive definite in single precision (float32) by using torch.eig(std2) and checking that all the eigenvalues are positive.
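As a concrete illustration (my example, not from the thread): a symmetric matrix with a strictly positive diagonal can still have a negative eigenvalue. Modern PyTorch provides torch.linalg.eigvalsh for symmetric matrices (torch.eig, mentioned above, has since been deprecated):

```python
import torch

# Every diagonal entry is positive (both are 1), yet the matrix is not
# positive definite: its eigenvalues are 3 and -1.
M = torch.tensor([[1.0, 2.0],
                  [2.0, 1.0]])
eigenvalues = torch.linalg.eigvalsh(M)  # for symmetric/Hermitian matrices
print(eigenvalues)              # one eigenvalue is negative
print(bool((eigenvalues > 0).all()))  # False -> not positive definite
```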

@rojinsafavi

rojinsafavi commented Aug 14, 2019

I think a positive diagonal is not equivalent to positive definiteness. You can check whether std2 is positive definite in single precision (float32) by using torch.eig(std2) and checking that all the eigenvalues are positive.

Thanks for the help; I was not using the right covariance matrix. The result is a bit different from what I get in R, though: in R the KL is 1.955039, while in torch it is tensor(1.9948). Just to make sure I'm doing everything as I should, here are the steps:


mu1 = mean0
std1 = Cov0
p = torch.distributions.MultivariateNormal(mu1, std1)

mu2 = mean1
std2 = Cov1
q = torch.distributions.MultivariateNormal(mu2, std2)

torch.distributions.kl_divergence(p, q).mean()
tensor(1.9948)

Cov0.txt
Cov1.txt
mean0.txt
mean1.txt

@fehiepsi
Contributor Author

@rojinsafavi I think the way you use kl_divergence is correct.
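For what it's worth, kl_divergence can be sanity-checked against the closed-form expression for two multivariate normals, KL(p||q) = 0.5 * [ tr(Σq⁻¹ Σp) + (μq − μp)ᵀ Σq⁻¹ (μq − μp) − k + log det Σq − log det Σp ]. Below is a standalone sketch with made-up inputs (not the attached Cov0/Cov1 files):

```python
import torch
from torch.distributions import MultivariateNormal, kl_divergence

torch.manual_seed(0)
k = 4
A = torch.randn(k, k, dtype=torch.float64)
B = torch.randn(k, k, dtype=torch.float64)
Sigma_p = A @ A.T + k * torch.eye(k, dtype=torch.float64)  # random PD covariance
Sigma_q = B @ B.T + k * torch.eye(k, dtype=torch.float64)
mu_p = torch.randn(k, dtype=torch.float64)
mu_q = torch.randn(k, dtype=torch.float64)

p = MultivariateNormal(mu_p, covariance_matrix=Sigma_p)
q = MultivariateNormal(mu_q, covariance_matrix=Sigma_q)

# Closed-form KL(p || q) for multivariate normals.
Sq_inv = torch.linalg.inv(Sigma_q)
diff = mu_q - mu_p
kl_manual = 0.5 * (
    torch.trace(Sq_inv @ Sigma_p)
    + diff @ Sq_inv @ diff
    - k
    + torch.logdet(Sigma_q)
    - torch.logdet(Sigma_p)
)
print(kl_divergence(p, q), kl_manual)  # should agree to round-off
```

Doing the comparison in float64, as here, also rules out single-precision round-off when reconciling results with another implementation such as R's.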
