Math PhD @ UIUC

I have one work on distributed computation.

Abbreviation | Title |
---|---|

PlutoCharon20 | Parity-Checked Strassen Algorithm |

PlutoCharon20 deals with distributed matrix-matrix multiplication (MMM) where the workers might crash or straggle. By MMM we mean that we want to compute $C=AB$, where $A, B$ are huge matrices. By crashing and straggling we mean that some entry multiplication, for instance $A_{12}\times B_{23}$, might be available very late, if at all. This makes the computation slow.

To compensate, one needs to hire more-than-necessary workers and ask them to do redundant computations. A possibility to generate redundancy is to draw random vectors $g, h$ and then ask extra workers to compute $(gA)\times(Bh)$ on top of $A\times B$. Once this is done, the associativity equation $(gA)(Bh)=gCh$ will give you some parity checks that help you recover the missing entries of $C$.

The contribution of PlutoCharon20 is three-fold.
One: We obverse that the computation of $A\times B$
can be carried-out by fast matrix multiplication (FMM).
This construction is named **Pluto codes**.
Two: Applying Pluto codes recursively,
we obtain a code that behaves like tensor product codes.
Three: The computation of $(gA)\times(Bh)$,
if $g, h$ are matrices, can be carried-out by FMM.
This is named **Charon construction**.