Downloading¶
All CNeuroMod data are distributed as DataLad repositories on GitHub under the courtois-neuromod organisation. cneuromod.all is the master meta-repository tracking all datasets and their derivatives.
Repository structure¶
cneuromod.all follows YODA principles: each experimental paradigm is a top-level folder (e.g. hcptrt/, friends/, anat/) containing independent git submodules for each data component:
| Submodule | Contents |
|---|---|
| bids | Raw BIDS data |
| fmriprep | fMRIPrep derivatives |
| | MRIQC quality reports |
| | Physiological recordings |
| | Additional content |
Prerequisites¶
DataLad (Linux, macOS, Windows)
A GitHub account with an SSH key configured (instructions)
S3 credentials from the CNeuroMod data manager, required to download restricted data (see Access)
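Before cloning, it can help to confirm that the command-line prerequisites are on your PATH. The following is a minimal sketch, not part of the CNeuroMod tooling: `check_tool` is a hypothetical helper name, and the list of tools assumes a standard DataLad installation, which relies on git and git-annex.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it succeeds if the named command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# DataLad relies on git and git-annex being installed alongside it.
for tool in git git-annex datalad; do
    check_tool "$tool" || echo "install $tool before continuing" >&2
done

# Optional manual check of the SSH key (requires network access):
#   ssh -T git@github.com
```

The loop only reports what is missing and does not abort, so all three tools are checked in one pass.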
Clone the meta-repository¶
# Clone metadata only — no data files are downloaded
datalad clone git@github.com:courtois-neuromod/cneuromod.all.git
cd cneuromod.all
Important: Do not use datalad install -r or git submodule update --init --recursive. Submodules re-expose their own sub-submodules at differing versions for provenance tracking; recursive installation triggers a massive redundant filesystem traversal.
Install a submodule¶
Submodules are installed individually as needed. This fetches git history but no annexed data files:
datalad get -n hcptrt/bids
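If you need several components of one paradigm, the same command can be repeated in a loop. This is a sketch under assumptions: `install_components` and the `DATALAD` override variable are hypothetical names introduced here for illustration, and the component names passed in must match submodules that actually exist in the paradigm folder.

```shell
#!/bin/sh
# Hypothetical override: set DATALAD="echo datalad" to preview the commands
# without running them.
DATALAD="${DATALAD:-datalad}"

# install_components PARADIGM COMPONENT...
# Installs each PARADIGM/COMPONENT submodule without fetching annexed data.
install_components() {
    paradigm="$1"
    shift
    for component in "$@"; do
        # -n installs the subdataset only; no file content is downloaded
        $DATALAD get -n "$paradigm/$component" || return 1
    done
}

# Example (run from the cneuromod.all root):
#   install_components hcptrt bids fmriprep
```

Because each submodule is named explicitly, this stays clear of the recursive installation warned against above.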
Download data files¶
Once a submodule is installed, retrieve actual data files with datalad get:
# Run these from the cneuromod.all root; each target submodule must already be installed
# Get all files in a submodule (the subshell keeps your working directory unchanged)
(cd hcptrt/bids && datalad get .)
# Get a specific subset (e.g. all MNI-space functional outputs)
datalad get hcptrt/fmriprep/sub-*/ses-*/func/*space-MNI152NLin2009cAsym_*
# Use -J n to download with n parallel jobs
datalad get -J 4 hcptrt/bids/sub-01
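To restrict a download to one participant and one output space, the glob can be assembled from its parts. The helper below is a hypothetical sketch (`func_pattern` is not a DataLad or CNeuroMod command); it only reproduces the path layout used in the subset example above and assumes that layout holds for the paradigm you pass in.

```shell
#!/bin/sh
# func_pattern PARADIGM SUBJECT SPACE
# Prints a glob matching one subject's functional derivatives in one space.
func_pattern() {
    printf '%s/fmriprep/sub-%s/ses-*/func/*space-%s_*\n' "$1" "$2" "$3"
}

# Example (run from the cneuromod.all root, with hcptrt/fmriprep installed).
# The substitution is left unquoted on purpose so the shell expands the glob:
#   datalad get -J 4 $(func_pattern hcptrt 01 MNI152NLin2009cAsym)
```

The glob only matches once the submodule is installed, because installation is what creates the file entries it expands against.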
S3 credentials¶
Restricted data are stored on an Amazon S3 fileserver. Set your credentials as environment variables in the terminal session you will download from, replacing the placeholders with the keys you received:
export AWS_ACCESS_KEY_ID=<s3_access_key>
export AWS_SECRET_ACCESS_KEY=<s3_secret_key>
Credentials are provided by the data manager after approval of an access request (see Access).
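Exported variables last only for the current shell session, so a download started from a new terminal will not see them. A small guard can catch this before a long transfer begins. This is a sketch: `require_s3_credentials` is a hypothetical helper, and it only checks that both variables are non-empty, not that the keys are valid.

```shell
#!/bin/sh
# Hypothetical helper: fail early if either S3 variable is unset or empty.
require_s3_credentials() {
    if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "S3 credentials are not set in this shell" >&2
        return 1
    fi
}

# Example:
#   require_s3_credentials && datalad get hcptrt/bids/sub-01
```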
Versioning¶
Cloning gives you the latest stable release by default. To reproduce results from a specific release:
git checkout 2020
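A mistyped release name makes `git checkout` fail with a generic error, so it can be worth verifying the name first. The wrapper below is a hypothetical sketch (`checkout_release` is not an existing command) and assumes only that a release name resolves to a commit in the cloned repository, whether it is published as a tag or as a branch.

```shell
#!/bin/sh
# checkout_release NAME
# Checks out NAME only if git can resolve it to a commit.
checkout_release() {
    if git rev-parse --verify --quiet "$1^{commit}" >/dev/null 2>&1; then
        git checkout "$1"
    else
        echo "unknown release: $1 (list candidates with: git tag --list)" >&2
        return 1
    fi
}

# Example (run inside the cneuromod.all clone):
#   checkout_release 2020
```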
Updates¶
To pull the latest release into an existing clone:
datalad update -r --merge --reobtain-data
The --reobtain-data flag automatically re-downloads files you had previously retrieved if they were modified upstream.