Solve X A = B on CUDA for Positive Semi-Definite A
If we want to solve
\[AX=B\]
with $A$ and $B$ both dense, it is straightforward to call cusolverDn<t>potrf() followed by cusolverDn<t>potrs(). But what if we want to solve
\[XA=B\]
where $X,B\in\mathbb{R}^{m\times n}$ and $A\in\mathbb{R}^{n\times n}$?
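We could first call cusolverDn<t>potrf() to compute the Cholesky decomposition
\[A = U^{\dagger} U,\]
where $U$ is upper triangular. Note that this step requires $A$ to be positive definite, not merely semi-definite; otherwise potrf reports the failure through its devInfo output. Substituting the factorization gives $X U^{\dagger} U = B$, so with $X' = X U^{\dagger}$ we can call the BLAS-3 function cublas<t>trsm() twice, the first time to solve
\[X' U = B,\]
the second time to solve
\[X U^{\dagger} = X'.\]
Example code is given below.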
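The algebra of the two triangular solves can be sanity-checked on the host before involving the GPU. The following is a minimal CPU reference sketch in C++, not the cuSOLVER/cuBLAS code itself. It assumes real double precision and column-major storage, and the helper names `cholesky_upper`, `trsm_right_upper`, and `solve_xa_eq_b` are hypothetical, introduced only for illustration. The two calls to `trsm_right_upper` are meant to mirror cublas<t>trsm() with `CUBLAS_SIDE_RIGHT`, `CUBLAS_FILL_MODE_UPPER`, `CUBLAS_DIAG_NON_UNIT`, and `CUBLAS_OP_N` for the first solve, then `CUBLAS_OP_T` (`CUBLAS_OP_C` in the complex case) for the second.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// All matrices are column-major, as in cuBLAS/cuSOLVER: M(i, j) = M[i + j * ld].

// Upper Cholesky factorization A = U^T U, in place on the upper triangle of the
// n x n matrix A; the strictly lower triangle is left untouched.
// Returns false if a non-positive pivot appears (A is not positive definite).
bool cholesky_upper(std::vector<double>& A, int n) {
    assert(static_cast<int>(A.size()) == n * n);
    for (int j = 0; j < n; ++j) {
        double d = A[j + j * n];
        for (int k = 0; k < j; ++k) d -= A[k + j * n] * A[k + j * n];
        if (d <= 0.0) return false;
        const double ujj = std::sqrt(d);
        A[j + j * n] = ujj;
        for (int i = j + 1; i < n; ++i) {
            double s = A[j + i * n];
            for (int k = 0; k < j; ++k) s -= A[k + j * n] * A[k + i * n];
            A[j + i * n] = s / ujj;
        }
    }
    return true;
}

// Right-sided triangular solve, in place on the m x n matrix B:
//   transpose == false: solve X U   = B  (forward sweep over columns)
//   transpose == true : solve X U^T = B  (backward sweep over columns)
// Only the upper triangle of U is read.
void trsm_right_upper(bool transpose, int m, int n,
                      const std::vector<double>& U, std::vector<double>& B) {
    assert(static_cast<int>(B.size()) == m * n);
    for (int step = 0; step < n; ++step) {
        const int c = transpose ? n - 1 - step : step;
        for (int r = 0; r < m; ++r) {
            double s = B[r + c * m];
            if (!transpose) {
                for (int k = 0; k < c; ++k) s -= B[r + k * m] * U[k + c * n];
            } else {
                for (int k = c + 1; k < n; ++k) s -= B[r + k * m] * U[c + k * n];
            }
            B[r + c * m] = s / U[c + c * n];
        }
    }
}

// Solve X A = B for X. On success A holds U in its upper triangle and B holds X.
bool solve_xa_eq_b(std::vector<double>& A, int n, std::vector<double>& B, int m) {
    if (!cholesky_upper(A, n)) return false;   // A = U^T U
    trsm_right_upper(false, m, n, A, B);       // X' U   = B
    trsm_right_upper(true, m, n, A, B);        // X  U^T = X'
    return true;
}
```

Like the library routines it imitates, this sketch works in place: the factor overwrites the upper triangle of `A` and the solution overwrites `B`. For example, with $A = \begin{pmatrix}4 & 2\\ 2 & 3\end{pmatrix}$ and $B = \begin{pmatrix}8 & 8\end{pmatrix}$, the factor is $U = \begin{pmatrix}2 & 1\\ 0 & \sqrt{2}\end{pmatrix}$ and the solution is $X = \begin{pmatrix}1 & 2\end{pmatrix}$.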