Exploring the use of spectral seriation to uncover dynamics in embryonic development : a geometric and probabilistic approach (2023)
Understanding the dynamics of embryonic development is crucial to finding treatments for conditions such as aging and cancer. The development of an embryo can be represented as a curve in the Wasserstein space, and to construct this curve, static snapshots of gene expression profiles are obtained at n selected time points. Since the measurement techniques for obtaining these snapshots are destructive, we have to infer the developmental trajectory using a series of static snapshots of gene expression profiles taken at different time points t₁, t₂, ..., tₙ. To obtain these snapshots, multiple embryos are allowed to develop until each of the desired time points is reached, and the gene expression profile is then captured. However, to reconstruct the curve we need to know which embryo had reached which developmental stage; this information is lost during the measurements. To overcome this, a pairwise similarity function between profiles can be defined, and the profiles can be arranged so that the more similar they are, the closer they are placed together. This is part of a larger class of problems known as the “seriation” problem. In this thesis, the feasibility of using the “spectral seriation” method proposed by Atkins et al. is investigated to recover the order of the profiles based on their similarity, which enables the construction of the curve.

The gene expression profile of an embryo can be seen as a probability measure on a compact set. Although the exact measures are unknown, they can be approximated empirically using m samples. In this thesis, we demonstrate that, under reasonable assumptions and with sufficient time points and samples per time point, the spectral seriation method can be effective in sequencing the data. Additionally, we provide tools to determine the number of time points and samples per time point needed to achieve a desired error bound. Furthermore, we investigate how the geometric properties of the curve representing the embryonic development can affect our ability to sequence the data.
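
As an illustration of the idea described above, the following is a minimal sketch of spectral seriation on toy data, not the thesis's own implementation. Everything specific in it is an assumption made for the example: the "profiles" are one-dimensional Gaussian samples whose mean drifts with time, the similarity is exp(-W₂) with W₂ the Wasserstein-2 distance between empirical measures, and the names `spectral_seriation`, `wasserstein2_1d`, `n`, and `m` are hypothetical. The ordering step follows the general recipe of Atkins et al.: build the graph Laplacian of the similarity matrix and sort the items by the entries of its Fiedler vector.

```python
import numpy as np


def spectral_seriation(similarity):
    """Return an ordering of the items by sorting the Fiedler vector."""
    similarity = np.asarray(similarity, dtype=float)
    # Graph Laplacian L = D - A, where D holds the row sums of A.
    laplacian = np.diag(similarity.sum(axis=1)) - similarity
    # eigh returns eigenvalues in ascending order, so column 1 is the
    # eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    _, eigenvectors = np.linalg.eigh(laplacian)
    return np.argsort(eigenvectors[:, 1])


def wasserstein2_1d(x, y):
    """W2 distance between two 1-D empirical measures of equal sample size."""
    # In one dimension the optimal coupling matches sorted samples.
    return np.sqrt(np.mean((np.sort(x) - np.sort(y)) ** 2))


# Toy data (assumption): n time points, m samples each, mean equal to time.
rng = np.random.default_rng(0)
n, m = 8, 300
samples = [rng.normal(loc=t, scale=0.1, size=m) for t in range(n)]

# Forget the time labels by shuffling the snapshots.
perm = rng.permutation(n)
shuffled = [samples[i] for i in perm]

# Pairwise distances and a similarity that decays with distance.
distances = np.array([[wasserstein2_1d(a, b) for b in shuffled] for a in shuffled])
similarity = np.exp(-distances)

# Recover the order; an eigenvector's sign is arbitrary, so the result
# may come out in forward or reversed time order.
order = spectral_seriation(similarity)
recovered = perm[order]
print(recovered)
```

Because the sign of an eigenvector is arbitrary, the method can only recover the order up to reversal; deciding which end is the start of development needs outside information.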