Current Projects

Scalable LU Factorizations

February 2020 - Present

Experimenting with approaches to improve the performance of distributed, GPU-accelerated LU factorizations. Such factorizations require partial pivoting for numerical stability; however, searching for and applying the pivots introduces significant overhead. The primary branch of this work to date uses Randomized Butterfly Transforms to shuffle the matrix so that pivoting is unnecessary. Other efforts have included optimizing the pivoted and non-pivoted implementations of LU in SLATE, a dense linear algebra library for distributed, heterogeneous systems.

Atom for Common Lisp

April 2019 - Present

I maintain two packages for developing Lisp (especially Common Lisp) in the Atom text editor. The first is SLIMA, which provides interactive Common Lisp development, based on the Emacs plugin SLIME. It is a fork of Steve Levine’s Atom-Slime. The second is Lisp-Paredit, which provides commands for editing any S-expression-based language and was originally developed by Jon Spalding.

Past Projects

Scalable Interpolation for Thermal-Fluids Applications

May 2021 - September 2021

Ported interpolation routines to the OCCA runtime system in NekRS, a GPU-accelerated, spectral-element code for simulating fluids and their temperature. These routines were used to implement both particle tracking and multiple, overlapped meshes.

Mixed precision GMRES

August 2019 - June 2021

Exploration of how using different precisions for different parts of the solver affects the performance and convergence of GMRES. The work primarily focused on achieving the accuracy of a double-precision GMRES implementation while selectively using lower precision to reduce data movement costs.

Reducing Memory Access Costs using Data Compression in Conjugate Gradient

May 2017 - April 2019

Exploration into whether the performance of sparse linear solvers (specifically Conjugate Gradient) can be improved by reducing data movement using compression.


July 2017 - May 2019

An implementation of Trilinos’s Petra Object Model in Julia. The project explored how well Julia works for distributed, high-performance computing.