Frequently Asked Questions

What is MOSAIC?

MOSAIC (Meta-Organized Stimuli And Imaging data for Computational modeling) is an fMRI dataset aggregation framework. By aggregating datasets, the computational neuroscience field can reach a scale and diversity of data that no single dataset can provide. MOSAIC is community-driven, meaning anyone can upload their MOSAIC-compliant fMRI datasets for the community to use.

What is the difference between MOSAIC-compatible and MOSAIC-compliant?

A group of dataset files is MOSAIC-compatible if the files were preprocessed with the same preprocessing pipeline (e.g., the same versions of fMRIPrep and GLMsingle). A dataset file is MOSAIC-compliant if it meets certain formatting requirements. All files that are MOSAIC-compatible with each other are also, by definition, MOSAIC-compliant.

How do I tell which dataset files are MOSAIC-compatible with each other?

Files that share the same preprocessing pipeline and beta pipeline are MOSAIC-compatible. The download page of this portal allows you to filter by metadata (dataset name, subject ID, GitHub URL, preprocessing pipeline, and beta pipeline) so you can quickly find compatible files.
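If you export or copy the metadata for several files, the compatibility rule above can be sketched in a few lines of Python. This is only an illustration: the record layout, field names, and file names below are hypothetical and are not the portal's actual schema.

```python
from collections import defaultdict

# Hypothetical metadata records, as one might copy them from the download page.
# File names, field names, and version strings are illustrative assumptions.
records = [
    {"file": "datasetA_sub-01.h5", "preprocessing": "fMRIPrepv23.2.0", "beta": "GLMsinglev1.2"},
    {"file": "datasetB_sub-01.h5", "preprocessing": "fMRIPrepv23.2.0", "beta": "GLMsinglev1.2"},
    {"file": "datasetC_sub-01.h5", "preprocessing": "fMRIPrepv20.2.0", "beta": "GLMsinglev1.2"},
]


def group_compatible(records):
    """Group file names by their (preprocessing pipeline, beta pipeline) pair."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["preprocessing"], rec["beta"])].append(rec["file"])
    return dict(groups)


groups = group_compatible(records)
for (prep, beta), names in groups.items():
    print(prep, beta, names)
```

Files that end up in the same group share both pipelines and are therefore MOSAIC-compatible with each other under the definition above.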

Different preprocessed versions of a dataset may be uploaded that use the same preprocessing and beta pipelines (e.g., a version that uses different parameters or fixes a bug). In this case, the new and old versions are both still MOSAIC-compatible with the other datasets sharing the same preprocessing and beta pipelines. Use your discretion in choosing which version to use, and state in your manuscript exactly which files you used by noting their CRC32 hashes.
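To record the hash of a file you downloaded, a CRC32 checksum can be computed with Python's standard library. The following is a minimal sketch; the function name is our own, and the 8-digit lowercase hexadecimal output format is an assumption, so compare it against however the portal displays its hashes.

```python
import zlib


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC32 of a file as an 8-digit lowercase hex string.

    The file is read in chunks so that large files need not fit in memory.
    The hex formatting is an assumption about how hashes are reported.
    """
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            # zlib.crc32 accepts the running checksum as its second argument.
            crc = zlib.crc32(chunk, crc)
    return format(crc & 0xFFFFFFFF, "08x")
```

Calling crc32_of_file on each file you use and listing the results in your manuscript lets readers confirm they have the same versions.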

Does MOSAIC introduce its own novel preprocessing pipeline?

No. MOSAIC does not introduce a novel preprocessing pipeline or recommend any one in particular. The purpose is simply to preprocess all fMRI data into a common format (e.g., beta-stimulus pairs) while, to the best of our ability, controlling for scanner-, dataset-, and software-specific variability. In this way, MOSAIC can evolve with advancements in preprocessing techniques.

Can any fMRI dataset be uploaded?

Yes. The original MOSAIC paper limited the initial 8 datasets to passive-vision, event-related designs. This choice was made to obtain high-quality beta-stimulus pairs. For example, long-form movie datasets are not well suited to GLMs for obtaining well-defined beta-stimulus pairs. That being said, we hope MOSAIC can grow beyond the original scope. Feel free to preprocess, document, and upload any fMRI dataset.

Who can upload a dataset to MOSAIC?

Anyone. You do not have to be an author of a dataset to upload a MOSAIC-compatible version of a dataset. Note that the uploaded dataset has to follow strict formatting rules and contain the required publicly visible metadata or else it will be rejected. Namely, all uploaded files must have:

Owner name: Name of the person (you) who preprocessed the dataset. Must be one individual taking responsibility.
Owner email: Email of the "owner" to contact in case of questions.
GitHub URL: URL to a public GitHub repository that contains code (and ideally quality checks) used to preprocess the uploaded data.
Preprocessing pipeline: Which preprocessing pipeline was used to preprocess the data (e.g., fMRIPrepv23.2.0).
Beta pipeline: Which beta estimation pipeline was used (e.g., GLMsinglev1.2).
Publication URL: URL to the original publication of the dataset for citation purposes.

Preprocessing and uploading a dataset to MOSAIC is not intended to be easy; it is intended to be transparent and standardized. fMRI dataset preprocessing is technical and time-consuming. That being said, this procedure should not be impossible either. Follow the GitHub repositories of other successfully uploaded MOSAIC-compatible datasets as guides, and reach out to those owners or us with questions.
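Before uploading, it can help to check locally that none of the six required metadata fields listed above is missing. The sketch below is only an illustration: the dictionary keys are hypothetical names we chose for the fields, and the portal's own validation may be stricter.

```python
# Hypothetical key names for the six required metadata fields listed above.
REQUIRED_FIELDS = (
    "owner_name",
    "owner_email",
    "github_url",
    "preprocessing_pipeline",
    "beta_pipeline",
    "publication_url",
)


def missing_metadata(metadata):
    """Return the required fields that are absent or blank in `metadata`."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Example with placeholder values; two fields are deliberately left out.
example = {
    "owner_name": "Jane Doe",
    "owner_email": "jane.doe@example.org",
    "preprocessing_pipeline": "fMRIPrepv23.2.0",
    "beta_pipeline": "GLMsinglev1.2",
}
print(missing_metadata(example))
```

An empty list means every required field has a value; it does not check that the values themselves are valid.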

Is data downloaded from this portal safe?

We run rigorous security checks on all uploaded data for malware and ensure to the best of our ability that the files are safe for download. You can see the website code here. This is open source - feel free to suggest improvements.

Is MOSAIC the best way to preprocess fMRI data?

MOSAIC is a great way to preprocess your fMRI data, but might not be the best way. The purpose of MOSAIC is cross-dataset compatibility, not to optimize quality. Thus, you might be able to obtain higher quality data or take advantage of smaller voxel resolution using a custom preprocessing pipeline tailored towards a specific dataset. We hope that datasets get preprocessed in various ways to best support different research objectives, with MOSAIC preprocessing being one such preprocessing derivative.

I collected an fMRI dataset. What is the best way to share it?

We highly recommend that you first and foremost make your raw data BIDS-compatible and upload it to OpenNeuro (or similar dataset sharing service). From there, you and other researchers can easily preprocess your dataset with different pipelines, including pipelines that make it MOSAIC-compatible with other datasets.

Is there a file size limit?

Yes, the limit for a single file is 30 GB. This limit mainly exists to prevent unintended large uploads. If you are trying to upload a dataset that exceeds this limit, contact us and we can raise the limit.