IIn most unsupervised learning tasks, a notion of distance or similarity between data points is both crucial and usually not directly available as an input. For example, the efficiency of tasks like clustering and dimension reduction depend on the distance chosen. Furthermore, in many applications data lie on a lower dimensional manifold (unknown).
The Fermat distance takes into account the underlying structure (manifold) of the data and the underlying density from which the points are sampled.
Consider:
Given two points $x, y$ and a parameter $\beta > 0$, we define the Fermat distance as
$$ \mathscr D (x, y) = \inf_{\Gamma} \int_\Gamma \frac{1}{f^\beta} d \ell , $$
where the minimization is performed all over posible paths $\Gamma$ that connect $x$ with $y$.
Definition (Fermat distance estimator)
For $\alpha > 1$ and given two points $ x, y\in \mathbb{X}_n$, we define the Fermat distance estimator as:
$$ \mathscr{D}_{\mathbb{X}_n} (x, y) = \min_{ \begin{align} ({q}_1, ...,{q}_K ) \in \mathbb{X}_n^K \\ q_1 = x, \, q_K = y, \, K \geq 2 \end{align}} \, \sum_{i=1}^{K-1} \, \ell ( q_{i+1}, q_i)^\alpha.$$
Remaks