This Blog is Untitled: July 2019

Distribution of segments on Gaussian chain

The probability distribution function of $P_i(\mathbf{r}_i-\mathbf{r}_\mathrm{cm})$ of ideal chain is evaluated, for $P_i$ represents the probability distribution fucntion of $i$th segment with respect to its centre of mass on the ideal chain. An ideal chain is modeled as multidimensional random walk, where the steps are independent, and the distribution of a step at mean length $b$ is given by $P(\mathbf{r})\sim\mathcal{N}(0,b^2)$. Let $\mathbf{b}_i=\mathbf{r}_{i+1}-\mathbf{r}_{i}$ be the $i$th bond vector, then we have

$$\mathbf{r}_i=\sum_{j=1}^{i-1} \mathbf{b}_j$$

and the centre of mass $\mathbf{r}_\mathrm{cm}=\frac{1}{N}\sum_i \mathbf{r}_i$ is

$$\mathbf{r}_\mathrm{cm}=\frac{1}{N}\sum_{j=1}^{N-1}(N-j)\mathbf{b}_j$$

then

$$\mathbf{r}_i-\mathbf{r}_\mathrm{cm}=\sum_{j=1}^{i-1}\frac{j}{N}\mathbf{b}_j+\sum_{j=i}^{N-1}\frac{N-j}{N}\mathbf{b}_j$$

If variable $X$ is a Gaussian random variable with scale of $\sigma^2$, then $aX$ is a Gaussian random variable with scale of $a^2\sigma^2$, we can then write the characteristic function for $P_i(\mathbf{r}_i-\mathbf{r}_\mathrm{cm})$ of a $d$ dimensional ideal chain:

$$\phi_i(\mathbf{q})=\Pi_{j} \phi_{\mathbf{b}_j'}(\mathbf{q})=\exp\left(-\frac{1}{2}\mathbf{q}^T\left(\sum_{j=1}^{N-1}\Sigma_j\right)\mathbf{q}\right)$$

with $\mathbf{b}'_j=\frac{j}{N}\mathbf{b}_j$ for $j\le i-1$ and $\frac{N-j}{N}\mathbf{b}_j$ for $i\le j \le N-1$; $\phi_{\mathbf{b}_j'}=-\exp(-0.5\mathbf{q}^T\Sigma\mathbf{q})$ is the characteristic function of the probability distribution fucntion of bond $j$. $\Sigma_j=\frac{j^2 b^2}{d N^2}\mathbb{I}_d$ for $j\le i-1$ and $\Sigma_j=\frac{(N-j)^2 b^2}{d N^2}\mathbb{I}_d$ for $i\le j \le N-1$, and $\mathbb{I}_d$ is the $d$-dimensional identical matrix, and therefore: (For convenience, I will set $b=1$ in the later calculations.)

$$\phi_i(\mathbf{q})=$$ $$\exp \left(-\frac{\left(6 i^2-6 i (N+1)+2 N^2+3 N+1\right) \left(q_x^2+q_y^2+q_z^2\right)}{36 N}\right)$$

The corresponding distribution of this characteristic function is still a Gaussian distribution with $\Sigma=\frac{b^2}{3} \mathbb{I}_3$, where the equivalent bond length $b^2=\frac{\left(6 i^2-6 i (N+1)+2 N^2+3 N+1\right)}{6 N}$. The 6th moment is $\frac{1}{N}\sum_{i=1}^N \langle(\mathbf{r}_i-\mathbf{r}_\mathrm{cm})^6\rangle=\frac{58 N^6-273 N^4+462 N^2-247}{1944 N^3}$. At large $N$, only leading term matters, which is $\frac{29}{972} N^3$. For $N=20$, the result is $235.886$, which is in accord with the simulation. Another example is for $N=5$, the $R_g^2$ is $0.8$ rather than $5/6=0.8\dot{3}$ in this case.

Here is the simulation code:

ch = np.random.normal(size=(100000,20,3),scale=1/3**0.5)
ch[:,0,:]=0
ch = ch.cumsum(axis=1)
ch -= ch.mean(axis=1,keepdims=1)
m6 = np.mean(np.linalg.norm(ch,axis=-1)**6)

Simple Examples of Parallel Computing of Cython

It is more convenient to calculate some properties during a molecular simulation process by accessing data from API of the molecular simulation program than calculating after the whole simulation progress and dumping all the coordinates. Especially, for sharply fluctuated properties such as Virial, RDF, etc., require massive frames to calculate. It is expensive to dump densely. For some Python-friendly molecular simulation softwares, e.g., lammps, hoomd-blue, galamost, etc.; it is easily to embed customized Python functions into the simulation control scripts. However, Python has a poor performance in massive calculations, numba.cuda.jit does dramatically boost up the program, however, due to unknown reasons, GPU-accelerated molecular simulation softwares like hoomd-blue and galamost produce errors during simulations if one uses numba.cuda.jit compiled functions. Therefore, I try to use Cython to generate C-functions to accelerate the calculations. Here I put a simple example of a pairwise property calculation function (RDF) using Cython-parallel method, it's also a learning note of mine:

# cython: language_level=3, boundscheck=False, wraparound=False, cdivision=True
import numpy as np
cimport numpy as np
import cython
from cython.parallel import prange, parallel
cimport openmp
from libc.math cimport floor,sqrt,pow
from libc.stdlib cimport malloc, free
import multiprocessing


cdef long * unravel_index_f(long i, long[:] dim) nogil:
    cdef long k, d
    d = dim.shape[0]
    cdef long * ret = <long *> malloc(d * sizeof(long))
    for k in range(d):
        ret[k] = i % dim[k]
        i = (i - ret[k]) / dim[k]
    return ret


cdef long ravel_index_f(long * vec, long[:] dim) nogil:
    cdef long ret, d, tmp, k
    d = dim.shape[0]
    ret = (vec[0] + dim[0]) % dim[0]
    tmp = dim[0]
    for k in range(1,d):
        ret += ((vec[k] + dim[k]) % dim[k]) * tmp
        tmp *= dim[k]
    return ret


cdef long jth_neighbour(long * veci, long * vecj, long[:] dim) nogil:
    cdef long ret, d, tmp, k
    d = dim.shape[0]
    ret = (veci[0] + vecj[0] - 1 + dim[0]) % dim[0]
    # re-ravel tmpi + tmpj - 1 to cell_j
    # -1 for indices from -1, -1, -1 to 1, 1, 1 rather than 0,0,0 to 2,2,2
    tmp = dim[0]
    for k in range(1, d):
        ret += ((veci[k] + vecj[k] - 1 + dim[k]) % dim[k]) * tmp
        tmp *= dim[k]
    return ret


cdef double pbc_dist(double[:] x, double[:] y, double[:] b) nogil:
    cdef long i, d
    cdef double tmp=0, r=0
    d = b.shape[0]
    for i in range(d):
        tmp = x[i]-y[i]
        tmp = tmp - b[i] * floor(tmp/b[i]+0.5)
        r = r + pow(tmp, 2)
    return sqrt(r)


cdef long cell_id(double[:] p, double[:] box, long[:] ibox) nogil:
    # In the Fortran way
    cdef long ret, tmp, i, n
    n = p.shape[0]
    ret = <long> floor((p[0] / box[0] + 0.5) * ibox[0])
    tmp = ibox[0]
    for i in range(1, n):
        ret = ret + tmp * <long> floor((p[i] / box[i] + 0.5) * ibox[i])
        tmp = tmp * ibox[i]
    return ret


cdef void linked_cl(double[:, :] pos, double[:] box, long[:] ibox, long[:] head, long[:] body) nogil:
    cdef long i, n, ic
    n = pos.shape[0]
    for i in range(n):
        ic = cell_id(pos[i], box, ibox)
        body[i] = head[ic]
        head[ic] = i


def rdf(double[:,:] x, double[:,:] y, double[:] box, double bs, long nbins):
    cdef long i, j, k, l, n, m, d3, d, ic, jc
    cdef np.ndarray[np.double_t, ndim=2] ret
    cdef long[:] head, body, ibox, dim
    cdef double r, r_cut
    cdef long ** j_vecs
    cdef long * veci
    cdef int num_threads, thread_num
    num_threads = multiprocessing.cpu_count()
    r_cut = nbins * bs
    n = x.shape[0]
    d = x.shape[1]
    m = y.shape[0]
    d3 = 3 ** d
    ibox = np.zeros((d,), dtype=np.int64)
    for i in range(d):
        ibox[i] = <long> floor(box[i] / r_cut + 0.5)
    head = np.zeros(np.multiply.reduce(ibox), dtype=np.int64) - 1
    body = np.zeros((m,), dtype=np.int64) - 1
    linked_cl(y, box, ibox, head, body)
    ret = np.zeros((num_threads, nbins), dtype=np.float)
    dim = np.zeros((d,), dtype=np.int64) + 3
    j_vecs = <long **> malloc(sizeof(long *) * d3)
    for i in range(d3):
        j_vecs[i] = unravel_index_f(i, dim)  
    with nogil, parallel(num_threads=num_threads):
        for i in prange(n, schedule='dynamic'):
            ic = cell_id(x[i], box, ibox)
            thread_num = openmp.omp_get_thread_num()
            veci = unravel_index_f(ic, ibox)
            for j in range(d3):
                jc = jth_neighbour(veci, j_vecs[j], ibox)
                k = head[jc]
                while k != -1:
                    r = pbc_dist(x[i], y[k], box)
                    if r < r_cut:
                        l = <long> (r/bs)
                        ret[thread_num, l]+=1
                    k = body[k]
            free(veci)  # arrays created by malloc should be freed. (in unravel_index_f)
    free(j_vecs)   # without calling free(), memory leak would happen.
    return np.sum(ret, axis=0)

Notes:

def means Python function and cdef means C functions;
Arrays declared by double[:] etc. are memoryslice and can be conveniently initialized by np.zeros or reading an np.ndarray as function parameter;
Without GIL, any calling of Python function is forbidden, the array must be initialized and declared in the C method;
In prange loop, variables are thread-local (see the generated .c file, in section #pragma omp parallel), there is no explicit way to call an atomic operation in Cython (using with gil results in very low efficiency), hence the result is generated as ret = np.zeros((n, nbins), ... so that in each thread i that ret[i] are independent, or generate ret as (num_threads, data structure...), just keep independent amongst threads and in ith loop, call openmp.omp_get_thread_num() to get the current thread's id;
<long> is a type casting.

This RDF function uses a linked cell list algorithm to reduce the calculation from $O(n^2)$ to $O(n)$, a small ranged RDF ($r<r_\mathrm{cut}$) is calculated, the RDF on $r\ge r_\mathrm{cut}$ can be calculated using an FFT algorithm by setting $r_\mathrm{bin}<\frac{r_\mathrm{cut}}{2}$

The second example is calculation of histogram N-d array by modulus of indices: $\int f(\mathbf{r})\delta(|\mathbf{r}|-r)\mathrm{d}\mathbf{r}$:

# cython: language_level=3, boundscheck=False, wraparound=False, cdivision=True
import numpy as np
cimport numpy as np
import cython
from cython.parallel import prange, parallel
cimport openmp
from libc.math cimport floor,sqrt,pow
from libc.stdlib cimport malloc, free
import multiprocessing


cdef double unravel_index_f_r(long i, long[:] dim) nogil:
    # unravel in Fortran way
    cdef long k, d, tmp
    cdef double r
    d = dim.shape[0]
    for k in range(d):
        tmp = i % dim[k]
        r += <double> (tmp)**2
        i = (i - tmp) / dim[k]
    return sqrt(r)



def hist_to_r(double[:] x, long[:] shape, double dr, double bs, double rc):
    cdef long i, n, j, n_bins
    cdef np.ndarray[np.double_t, ndim=2] ret, count
    cdef int num_threads, thread_num
    n_bins = <long> (rc / bs)
    num_threads = multiprocessing.cpu_count()
    ret = np.zeros((num_threads, n_bins), dtype=np.float64)
    count = np.zeros((num_threads, n_bins), dtype=np.float64)
    n = x.shape[0]
    with nogil, parallel(num_threads=num_threads):
        for i in prange(n, schedule='dynamic'):
            j = <long> (unravel_index_f_r(i, shape)*dr/bs)
            thread_num = openmp.omp_get_thread_num()
            if j < n_bins:
                ret[thread_num, j] += x[i]
                count[thread_num, j] += 1
    return np.sum(ret,axis=0), np.sum(count,axis=0)

In function unravel_index_f_r, the index is unraveled in the Fortran way, therefore, calling hist_to_r requires x.ravel(order='F').

Anisotropy of ideal chain

The Gaussian chain is isotropic when averaged over conformation space and all orientations. Therefore, a Gaussian chain is dealt as sphere with radius of $R_g$, it’s radius of gyration. In the 2nd chapter of Rubinstein’s Polymer Physics, an exercise shows that the $R_g^2$ is asymmetric if one set the coordinate frame on its end-to-end vector, $\mathbf{R}_{ee}$. i.e., the $\mathbf{R}_{ee}$ vector is set as the $x$-axis therefore $\mathbf{R}_{ee}=(R,0,0)^T$. Then the 3 components of $\mathbf{R}_{ee}$ are $\frac{Nb^2}{36}$, $\frac{Nb^2}{36}$ on $y$, $z$ direction and $\frac{Nb^2}{36}+\frac{R^2}{12}$ on $x$ direction. Here I make a very simple proof.

Consider a Gaussian chain is fixed between $(0,0,0)^T$ and $\mathbf{R}_{ee}$. It’s actually a Brownian bridge, and the distribution is a multivariate Gaussian with mean at $\frac{i}{N}\mathbf{R}$ with variance $\frac{i(N-i)}{N}b^2$, the proof is simple:

$$P_{0\mathbf{R}}(\mathbf{r},n)=\frac{G(\mathbf{r},0,n)G(\mathbf{R},\mathbf{r},N-n)}{G(0,\mathbf{R},N)}$$

$G(a,b,n)$ represents distribution of a Gaussian chain ends at $a,b$ with segment length $n$. The meaning is straightforward: it’s the probability of a length $n$ Gaussian chain start from $0$ and end at $\mathbf{r}$ connected with another $N-n$ Gaussian chain which started at $\mathbf{R}$ and stopped at $\mathbf{r}$, and the whole chain is an Gaussian chain with length $N$ and $\mathbf{R}_{ee}=\mathbf{R}$. It is easily to show the distribution of such chain

$$P_{0\mathbf{R}}(\mathbf{r},n)=G\left(\mathbf{r}-\frac{n}{N}\mathbf{R}, 0,\frac{n(N-n)}{N}\right)$$

is equivalent to a Gaussian chain segment ends with $\mathbf{r}$ and $\frac{n}{N}\mathbf{R}$ with equivalent length $\frac{n(N-n)}{N}$. The $R_g^2$ is then

$$\begin{align}R_g^2=&\frac{1}{2N^2}\int_0^N\langle(\mathbf{r}_i-\mathbf{r_j})^2\rangle\mathrm{d}i\mathrm{d}j\\
=&\frac{1}{N^2}\int_0^N\int_j^N \frac{(i-j)^2 R^2}{N^2}+\frac{(i-j)(N-(i-j))}{N}b^2\mathrm{d}i\mathrm{d}j\\ =&\frac{R^2+nb^2}{12}\end{align}$$

In this equivalent method, $\langle (\mathbf{r}_i-\mathbf{r}_j)^2\rangle$ is considered as $(\frac{i}{N}-\frac{j}{N})^2R^2+\frac{|i-j|(N-|i-j|)}{N}$ from the equivalent Gaussian chain. This is because despite the chain is ‘fixed’, it is still a Gaussian chain, which means translation invariance, $\langle (\mathbf{r}_i-\mathbf{r}_j)^2\rangle$ depends only on $|i-j|$, therefore, $P_{0\mathbf{R}}(\mathbf{r},n)$ gives the probability of segment $\mathbf{r}_n-\mathbf{r}_0$, which is any $n$-segment on the Gaussian chain. I tried calculating $P(\mathbf{r}_i-\mathbf{r}_j)$ from the convolution of $P_{0\mathbf{R}}$. This is incorrect because $\mathbf{r}_i$, $\mathbf{r}_j$ are correlated, convolution of $P_{0\mathbf{R}}$ results in dependence of $i$ and $j$ in $\langle (\mathbf{r}_i-\mathbf{r}_j)^2\rangle$, violates the translation invariance of the chain.

Now if $R^2=Nb^2$, we see that $R_g^2=\frac{1}{6}Nb^2$; and we have $\frac{Nb^2}{36}+\frac{R^2_{x,y,z}}{12}=\frac{Nb^2+R^2}{36}$ for each dimension, especially, if $\mathbf{R}_{ee}=(R,0,0)^T$, we have $\frac{Nb^2}{36}$ in $y$ and $z$ direction and $\frac{Nb^2}{36}+\frac{R^2}{12}$ in $x$-direction.

Simple simulation code:

rg2_ree = rg2 = 0
for _ in  range(5000):  
    ch = np.random.normal(size=(1000,3))  
    ch = np.append(np.zeros((1,3)),np.cumsum(ch, axis=0),axis=0)  
    ch = ch - ch.mean(axis=0)  
    ree = ch[-1]-ch[0]  
    ree = ree/np.linalg.norm(ree)  
    rg2_ree += np.sum(ch.dot(ree)**2)  
    rg2 += np.sum(ch**2)
rg2_ree/rg2
# Result is 0.666...

Flory characteristic ratio of hindered rotation chain

This is an exercise on Rubinstein's textbook of Polymer Physics.

Definition of Flory characteristic ratio:
$$C_\infty:=\lim_{n\to\infty}\frac{\langle \mathbf{R}^2 \rangle}{nb^2}$$
where $\langle \mathbf{R}^2 \rangle$ is the mean square end-to-end vector, $b$ is Kuhn length of the polymer and $n$ is corresponding chain length.
Calculation of $\langle \mathbf{R}^2\rangle$:
Let $\mathbf{r}_i$ be the bond vector between $i$ and $(i-1)$th particle, and $\mathbf{r}_1=\mathbf{R}_1-\mathbf{R}_0\equiv0$; for an $n$-bead chain,:
$$\begin{align}\langle \mathbf{R}^2 \rangle&=\left\langle\left(\sum_{i=1}^n \mathbf{r}_i^2\right)\cdot\left(\sum_{j=1}^n \mathbf{r}_j^2\right)\right\rangle\\
&=\sum_i\sum_j\langle \mathbf{r}_i\cdot\mathbf{r}_j\rangle\end{align}$$
Calculation of $\langle\mathbf{r}_i\cdot\mathbf{r}_j\rangle$:
Now consider a local coordinate frame that $\mathbf{r}_i=(b,0,0)^T$, then $\mathbf{r}_{i+1}$ could be considered as rotating $\mathbf{r}_i$ $\theta$ by $z$-axis for bond angle and $\phi_{i}$ by $x$-axis for torsion. The rotation matrix is thus
$$\mathbf{T}_i:=\left(
\begin{array}{ccc}
\cos (\theta ) & -\sin (\theta ) & 0 \\
\sin (\theta ) \cos (\phi_i) & \cos (\theta ) \cos (\phi_i) & -\sin (\phi_i) \\
\sin (\theta ) \sin (\phi_i ) & \cos (\theta ) \sin (\phi_i ) & \cos (\phi_i ) \\
\end{array}\right)$$
or $\pi-\theta$, $\pi - \phi$ according to the definitions of bond angle $\theta$ and torsion angle $\phi$. In every reference coordinate frame $R_i$, the bond vector is $(b,0,0)^T$, and in $R_i$, $\mathbf{r}_{i+1}=\mathbf{T}_i\cdot\mathbf{r}_i=(b \cos (\theta ),b \sin (\theta ) \cos (\phi_i ),b \sin (\theta ) \sin (\phi_i ))^T$. Now consider a simple case of $\mathbf{r}_i\cdot\mathbf{r}_{i+2}$, the $\mathbf{r}_{i+2}$ is calculated in the $(i+1)$th reference frame $R_{i+1}$ where $\mathbf{r}_{i+1}$ is $(b,0,0)^T$ and $\mathbf{r}_{i+2}=\mathbf{T}_{i+1}\cdot\mathbf{r}_{i+1}$, here we must 'rotate' one of the vector into the other frame to do the inner product, i.e. rotate $\mathbf{r}_i=(b,0,0)^T$ from $R_i$ to $R_{i+1}$ or vice versa. Now note that $\mathbf{r}_{i+1}$ in the $R_i$ frame is $(b \cos (\theta ),b \sin (\theta ) \cos (\phi_i ),b \sin (\theta ) \sin (\phi_i ))^T$ and $(b,0,0)^T$ in the $R_{i+1}$ frame, therefore if we want to transform some vector in $R_i$ into $R_{i+1}$, we need a transform matrix to transform the $(b \cos (\theta ),b \sin (\theta ) \cos (\phi_i ),b \sin (\theta ) \sin (\phi_i ))^T$ into $(b,0,0)^T$ which is $\mathbf{T}_i^{-1}$, or $\mathbf{T}_i^T$, because in frame $R_i$, we obtain $\mathbf{r}_{i+1}$ by $\mathbf{T}_i\cdot(b,0,0)^T$ and $\mathbf{T}_i$ is a rotation matrix, which is orthonormal. By transform $\mathbf{r}_i$ into $R_{i+1}$, the inner product $\mathbf{r}_i\cdot\mathbf{r}_{i+2}$ is
$$\langle\mathbf{T}_i^T\mathbf{r}_i,\mathbf{T}_{i+1}\mathbf{r}_{i+1}\rangle=$$ $$\langle\mathbf{r}_i,\mathbf{T}_i\mathbf{T}_{i+1}\mathbf{r}_{i+1}\rangle=(b,0,0)\mathbf{T}_i\mathbf{T}_{i+1}(b,0,0)^T$$
By induction, $\mathbf{r}_i\cdot\mathbf{r}_j=(b,0,0)\mathbf{T}_i\mathbf{T}_{i+1}\mathbf{T}_{i+2}\cdots\mathbf{T}_{j-1}(b,0,0)^T$, since $\theta$ for every bonds are same and $\phi_{1,2,\dots,n}$ are independently distributed, let $\mathbf{T}$ be the average transform matrix of ${\mathbf{T}_i}$:
$$\begin{align}\mathbf{T}&:=\frac{\int_{-\pi}^{\pi}\mathbf{T}_i(\phi)\exp{(-U(\phi)/k_BT)}\mathrm{d}\phi}{\int_{-\pi}^{\pi}\exp{(-U(\phi)/k_BT)}\mathrm{d}\phi}\\
&=\left(
\begin{array}{ccc}
\cos (\theta ) & -\sin (\theta ) & 0 \\
\sin (\theta ) \langle\cos (\phi) \rangle & \cos (\theta )\langle \cos (\phi)\rangle &0 \\
0& 0& \langle\cos (\phi)\rangle \\
\end{array}\right)\end{align}$$
We therefore have $\langle\mathbf{r}_i\cdot\mathbf{r}_j\rangle=(b,0,0)\mathbf{T}^{|j-i|}(b,0,0^T)=b^2(\mathbf{T}^{|j-i|})_{11}$. Here every $\sin(\phi)$ term vanishes in the integral because $U$ is even.
Now the Flory characteristic ratio
$$\begin{align}C_\infty&=\lim_{n\to\infty}\frac{1}{nb^2}\sum_i\sum_j \mathbf{r}_i\cdot\mathbf{r}_j\\
&=\lim_{n\to\infty}\frac{1}{n}(\sum_i\sum_j T^{|j-i|})_{11}\\
&=\lim_{n\to\infty}\left(-\frac{1}{n}\frac{-2 \mathbf{T}^{n+1}+\mathbf{T}^2 n+2 \mathbf{T}-n\mathbf{I}}{(\mathbf{I}-\mathbf{T})^2}\right)_{11}\\
&=\left(\frac{\mathbf{I}+\mathbf{T}}{\mathbf{I}-\mathbf{T}}\right)_{11}\\
&=\frac{1+\cos(\theta)}{1-\cos(\theta)}\frac{1-\langle\cos(\phi)\rangle}{1+\langle\cos(\phi)\rangle}\end{align}$$
Here we have $\frac{\bullet}{n}=0$ and $\mathbf{T}^n=0$ at $n\to\infty$, because $\mathbf{T}$ is constant and correlation between 2 beads goes to $0$ as distance goes to infinity.

RTFM, pls RTFM!!!

I was trying to write a parallel version C-extension with Cython several days ago, to calculate Coulomb energy of $n$ points. I wrote the code as follows:

@cython.cdivision(True)
@cython.boundscheck(False)
@cython.wraparound(False)
def u(double[:,:] x):
    cdef long i
    cdef long j,k
    cdef double ret=0,tmp
    with nogil, parallel():
        for i in prange(x.shape[0]-1, schedule='static'):
            for j in range(i + 1, x.shape[0]):
                tmp = 0
                for k in range(x.shape[1]):
                    tmp = tmp + pow(x[i,k]-x[j,k],2)
                ret = ret + 1 / sqrt(tmp)
    return ret