MATH Seminar

Title: A parallel solver for inverse tranport problems
Seminar: Numerical Analysis and Scientific Computing
Speaker: Dr. Andreas Mang of The University of Texas at Austin
Contact: Lars Ruthotto, lruthotto@emory.edu
Date: 2016-04-22 at 1:00PM
Venue: W306
Download Flyer
Abstract: In this talk we will discuss fast algorithms for large-scale inverse transport problems. These type of problems appear in numerous areas like weather prediction, ocean physics, or the reconstruction of porous media flows. In this talk we will consider the problem of diffeomorphic image registration with applications in medical imaging. We use a PDE constrained optimization formulation, where the constraints are the transport equations for a given scalar field, which in our case is a grayscale image. We will compare semi-Lagrangian and spectral collocation schemes, discuss a two-level Hessian preconditioner, and showcase a parallel implementation on distributed memory architectures.. We invert for a velocity field that governs the transport equation for the image deformation. The objective functional consists of an L^2 mismatch term and a regularization functional that penalizes the H^k-norm of the velocity field and its divergence. We discretize the optimality conditions using a pseudospectral discretization in space with a Fourier basis. We use a globalized, preconditioned, inexact, matrix-free, reduced space Newton-Krylov method to solve for an optimal, diffeomorphic flow field. The reduced Hessian is an ill-conditioned, dense, and compact operator; efficiently solving this system represents a significant challenge. We introduce and analyze a semi-Lagrangian formulation that, together with a nested two-level Hessian preconditioner, yields a 20x speedup compared to the state of the art. We will showcase convergence results, and assess strong and weak scalability of our solver for problem sizes of up to 25 billion unknowns on up to 1024 compute nodes on the Texas Advanced Computing Center's systems. As a highlight: We can invert for half a billion unknowns in 140 seconds on one node with 16 MPI tasks. We can reduce the time to solution by 20x for 32 compute nodes with 512 MPI tasks. This is joint work with George Biros and Amir Gholami

See All Seminars