Skip to content

Paper Implementation: Quantum algorithm for persistent Betti numbers and topological data analysis #837

@Death0004

Description

@Death0004

Team members: Devin Bae, Sabarikirishwaran Ponnambalam (@Sabarikirishwaran).

1. The research paper you plan to implement:

"Quantum algorithm for persistent Betti numbers and topological data analysis" by Ryu Hayakawa (2022).

2. A technical approach outlining how you will execute it:

We will implement an algorithm for estimating the persistent Betti numbers of a simplicial pair $K → L$ by block encoding the persistent Laplacian given that the nullity of the $q$-th persistent Laplacian equals to the $q$-th persistent Betti numbers and conditioned that:

  1. K is a combinatorially dense $q$-simplex: $d^K_q = n^K_q/\genfrac{(}{)}{0pt}{}{n}{q+1} \in Ω(\frac{1}{poly(n)})$
  2. The $q$-th Laplacian has an inverse polynomial gap: $\lambda^q_{min} \in Ω(\frac{1}{poly(n)})$
  3. The up-Laplacian has an inverse polynomial gap: $\gamma^q_{min} \in Ω(\frac{1}{poly(n)})$.

For a persistent pair $K → L$, Mémoli et al. (2022) demonstrate that the $q$-th persistent Betti numbers $\beta^{K,L}_q$ are equal to the nullity of the $q$-th persistent Laplacian:

$$\text{For each integer } q≥0, \beta^{K,L}_q = \text{nullity}(\Delta^{K,L}_q)$$

Here the $q$-th persistent Laplacian for a subspace $C^L_q$ is given as:

$$\Delta^{K,L}q:= \partial^{L,K}_{q+1} \circ (\partial^{L,K}_{q+1})^* + (\partial^K_q)^* \circ \partial^K_q$$

Where:

$$\begin{align} \partial^{L,K}_{q+1}\circ(\partial^{L,K}_{q+1})^*=\Delta^{K,L}_{q,up} \\\ \\\ \text{ and } \\\ \\\ (\partial^K_q)^*\circ\partial^K_q=\Delta^{K,L}_{q,down} \end{align}$$ $$\begin{align} \text{ and } \\\ \\\ ∂^{L,K}_q \text{ is the restriction on the boundary operator } ∂^K_q \text{ to } C^{L,K}_q \text{ so that the image of } ∂^{L,K}_q \text{ is contained in: } C^{L,K}_{q-1} \\\ \end{align}$$

This algorithm relies on block-encoding to encode the (oftentimes) large $\Delta^{K,L}_{q}$ matrices into unitaries. Many classical algorithms face limitations in analyzing high-dimensional persistent Betti numbers because the number of simplices can grow superpolynomially with the number of vertices.

FIRST STEP: In order to block encode the persistent Laplacian one can implement an Oracle $O^K_q$ which, given a candidate set of vertices, tells you if they form a $q$-simplex in $K$. We can implement a state preparation unitary $\tilde{U_s}$ approximating the following unitary $U_s |0^{(n+1)} ⟩ = |\phi^K_q⟩ \otimes |1⟩$, by constructing a “Dicke state:”

$$P_q|0^n⟩= \dfrac{1}{\sqrt{\genfrac{(}{)}{0pt}{}{n}{q+1}}}\sum_{x \in W_q}|x⟩$$

Which is a uniform superposition over all $q$-simplices given by:

$$| \phi^K_q⟩ := \frac{1}{\sqrt{n^K_q}} \sum_{i \in [n^K_q]} |\sigma^K_q(i)⟩$$

Here, the “Dicke-state” essentially prepares a uniform superposition of $n$-bit strings defined as the set ($W_q$) of Hamming weight $q+1$ bit strings.

SECOND STEP: After the state preparation is finished, we can now start block-encoding the persistent Laplacian by converting the boundary operators into block-encoded unitaries using the Oracles: $O^K_q$ and $O^K_{q+1}$ that return whether a simplex is contained in $K$ and $L$ for any $\sigma \in {0,1}^n$ and $a \in {0,1}$. Recall that the persistent Laplacian can be split into 2 components:

$$\Delta^{K,L}_q=\Delta^{K,L}_{q,up} + \Delta^{K,L}_{q,down}$$

One can decompose $Δ^{K,L}_{q,up}$ into submatrices:

$$\begin{align} &\Delta_1 := \Delta^L_{q,up}([n^K_q],[n^K_q]) \\\ &\Delta_2 := \Delta^L_{q,up}([n^K_q],I^L_K) \\\ &\Delta_3 := \Delta^L_{q,up}(I^L_K,[n^K_q]) \\\ &\Delta_4 := \Delta^L_{q,up}(I^L_K,I^L_K) \end{align}$$

Allowing us to define the following expression:

$$\Delta^{K,L}_{q,up}=\Delta^L_{q,up}/\Delta^L_{q,up}(I^L_K,I^L_K)=\Delta_1-\Delta_2\Delta_4^+\Delta_3$$

Where $Δ^L_{q,up}/Δ^L_{q,up}(I^L_K,I^L_K)$ is the Schur complement.

This in turn enables us to “remove” unwanted components of the up-portion of the persistent Laplacian belonging to $K$ (ie. all $(q+1)$-simplices whose boundaries are not contained in $K$). In order to implement the Schur component in a circuit, we must block-encode another matrix inversion (using QSVT) to approximate the pseudo-inverse.

THIRD STEP: Once the complete persistent Laplacian $Δ^{K,L}_q$, is fully block-encoded, we can block encode a projector onto the zero-energy space of:

$$Δ^{K,L}_q: \Pi = \sum_{j: \lambda_j = 0}| \psi_j⟩⟨\psi_j|$$

Where $\lambda^q_{min}$ is the smallest non-zero eigenvalue of $Δ^{K,L}_{q}$.

To do this, we can use a unitary $\tilde{U_\Pi}$ which is a QSVT of $\tilde{\Delta}^{K,L}_{q}/ \beta$ with respect to the rectangle function:

$$\begin{align} P^{(SV)}(\tilde{\Delta}^{K,L}_q\beta)=(⟨0^{6b+10}|\otimes I_n) \tilde{U_\Pi} (|0^{6b+10}⟩\otimes I_n) \\\ \end{align}$$

FOURTH STEP: Finally, the algorithm measures whether the state (the uniform superposition on $q$-simplices) is projected onto the kernel by $\Pi$ which returns 1 with a probability: $\dfrac{\text{nullity}(\Delta^{K,L}_q)}{|S^K_q|}$

This gives the the fraction of $q$-simplices lying in the zero-eigenspace (aka normalized persistent Betti number) and by repeating with standard sampling (Chernoff bounds) one can get an additive-error estimate of $\beta^{K,L}_q/S^K_q$.

3. A high-level example demonstrating key concepts:

Vietoris-Rips (VR) complexes are a common object of analysis in topological data analysis (TDA). For a given set $S$ of $n$ points where $n \in \mathbb{R}^d$, a VR complex $R_{\epsilon}(S)$ with the scale $\epsilon$ can be defined as the set of all $\sigma \in S$ such that the largest distance between any points in the VR complex are at most $2 \epsilon$.

The membership Oracle $O^K_q$ in this algorithm can check if a candidate set of vertices forms a simplex in $K$.

Metadata

Metadata

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions