Advanced examples

This page demonstrates advanced examples of using SPICE.

Controlling for parcels in SPICE-NS

Several arguments in spice.effective_sample_size_estimation() and spice.covariance_estimation() can be used to control the behavior of parcellations used in SPICE-NS, including: x_parc (y_parc), max_clusters, min_clusters, and min_cluster_size.

x_parc (and y_parc) determines whether SPICE or SPICE-NS is used, and also how data x (and y) are pacellated in SPICE-NS. They can be one of the following 3 scenarios:

\(None\): SPICE will be used.
String \('auto'\): SPICE-NS will be used, with parcellation in map is determined by data-driven spatial clustering, conditioned on parameters including: max_clusters, min_clusters, and min_cluster_size
int ndarray \((N,)\) with \(k\) unique integers ranging from \(0\) to \(k-1\), each indicate a parcellation: SPICE-NS will be used, and parcellation is determined as use-specified. A warning will be raised if the number of user-specified parcellation is larger than the number of data-driven determined number conditioned on parameters including: max_clusters, min_clusters, and min_cluster_size.

max_clusters (default 10) and min_cluster_size (default 500) jointly determines the largest number of parcels can be used in data-driven parcellation with spatial clustering. The maximum number is the smaller of max_clusters or \(floor(N / {min\_cluster\_size})\). The min_cluster_size is included such that the average size (i.e., number of observations) of parcels is not too small to get accurate and reliable variogram estimation.

min_clusters (default 1) determines the smallest number of parcels can be used, which should be always set to 1 in practice. This argument is used in our study to investigate the impact of forced parcellation.

Note

Use x_parc and y_parc to manually set the choice of parcellations in SPICE-NS

Warning

May need to change the input of max_clusters and min_cluster_size when apply SPICE-NS with data-driven spatial clustering for small data patches, depending on your data.

Run with distance matrix (python only)

SPICE can be run when the spatial coordinates of observations are missing but the pairwise distances \(D\) is known.

SPICE-NS with data-driven spatial clustering can be run when the spatial coordinates of observations are missing but the pairwise distances \(D\) and the spatial dimension \(dim\) is known, where clusters are determined using K-Medoids clustering implemented with Scikit-learn-extra.

Examples below:

import spice
# SPICE
pef, rX, nef, run_status, n_parc, p_naive, fc_para1, fc_para2 = spice.effective_sample_size_estimation(x, y, D=D)
# SPICE-NS
pef, rX, nef, run_status, n_parc, p_naive, fc_para1, fc_para2 = spice.effective_sample_size_estimation(x, y, D=D, dim=dim, xparc='auto', yparc='auto')

Large-scale pairwise evaluation

The spice.effective_sample_size_estimation() funciton internally estiamtes the covariance matrices of spatial maps, which can increase computational cost when evaluating pairwise associations between a large number of spatial maps because of repeatitive covariance estimation. spice.covariance_estimation() can be used in this case to improve computational efficiency, by decoupling covariance estimation and significance inference. In brief, the covariance matrices estimated can be saved and reloaded for later inference of statistical significance.

Below is an example of running this decoupled pipeline of SPICE, to infer significance of associations between 3 different spatial maps.

Step 1: covariance estimation

MATLAB
python

spatial_maps = {x, y, z};
mapnames = {'x', 'y', 'z'};
cov_path = '.' % save covariance matrices in the current path
for i = 1:length(spatial_maps)
    mapi = spatial_maps{i};
    % computed covariance matrix may contain NaN if the spatial data contains NaN or Inf
    covmat = covariance_estimation(x,coord) % change function parameters for SPICE-NS and other settings
    save(fullfile(cov_path, sprintf('%s_cov.mat', mapnames{i})), 'covmat')

import spice
import numpy as np
import os

spatial_maps = [x, y, z]
mapnames = ['x', 'y', 'z']
cov_path = '.' # save covariance matrix in the current dir
for _, (mapi, namei) in enumerate(zip(spatial_maps, mapnames)):
    covmat = spice.covariance_estimation(mapi, coord) # change function parameters for SPICE-NS and other settings
    np.save(os.path.join(cov_path, f'{namei}_cov.npy'), covmat)

Step 2: infer significance

MATLAB
python

spatial_maps = {x, y, z};
mapnames = {'x', 'y', 'z'};
cov_path = '.'
ps = zeros(length(spatial_maps)) % initiate for saving
rXs = zeros(length(spatial_maps)) % initiate for saving
for i = 1:length(spatial_maps)-1
    fpr j = i+1:length(spatial_maps)
        mapi = spatial_maps{i};
        mapj = spatial_maps{j};
        covi = load(fullfile(cov_path, sprintf('%s_cov.mat', mapnames{i})));
        covj = load(fullfile(cov_path, sprintf('%s_cov.mat', mapnames{j})));
        % because data mapi and mapj, and covariance matrices covi and covj may contain NaN and Inf, remove these data points
        valid = isfinite(mapi) & isfinite(mapj);
        covi = covi(valid, valid);
        covj = covj(valid, valid);
        [rXs(i, j), ~] = corr(mapi(valid), mapj(valid));
        [nef, run_status] = covs2nef(covi,covj)
        ps(i,j) = nef2p(rXs(i,j), nef)

import spice
import numpy as np
import os
from scipy.stats import pearsonr

spatial_maps = [x, y, z]
mapnames = ['x', 'y', 'z']
savepath = '.'
n_maps = len(spatial_maps)
rXs = np.zeros((n_maps, n_maps)) # initiate for saving
ps = np.zeros((n_maps, n_maps)) # initiate for saving
for i in np.arange(n_maps-1):
    for j in np.arange(i+1, n_maps):
        mapi = spatial_maps[i]
        namei = mapnames[i]
        mapj = spatial_maps[j]
        namej = spatial_maps[j]
        covi = np.load(os.path.join(cov_path, f'{namei}_cov.npy'))
        covj = np.load(os.path.join(cov_path, f'{namej}_cov.npy'))
        valid = np.logical_and(np.isfinite(mapi), np.isfinite(mapj))
        rX, p_naive = pearsonr(x[valid], y[valid])
        nef = cov2nef(covi[np.ix_(valid,valid)],covj[np.ix_(valid,valid)])
        pef = nef2p(rX, nef) if nef>2 else np.nan
        rXs[i,j], ps[i,j] = rX, pef

Impact of spatial trends

Coming soon.