### Guideline for fMRI articles

Because neuroimaging research is still a young field, no norm exists on how to present data. This page was originally set up to list the information an article should report so that a reader can reproduce the experiment, which is the minimum expected in all disciplines (most of this information is also reported in Poldrack et al. 2008, NeuroImage 40). It now also covers other things such as data presentation, labelling, etc., i.e. all the things that make a good paper (format-wise; the content cannot be guaranteed).

**Experimental Paradigm**

The paradigm corresponds to the way stimuli are presented to the subjects and to the tasks subjects have to do. First, define the type of fMRI design used (blocked, event-related, mixed) and the parameters of this design (e.g. the probability of the different ISIs or the transition matrix). Second, give the number of sessions and then the number of blocks, trials or experimental units per session.

**Imaging parameters**

- Magnet strength

- For anatomical and/or functional data: image dimension, voxel size, number of slices and interslice skip if any, orientation (axial, sagittal, coronal, other) and volume coverage (from z= to z=)

- For functional data: sequence, flip angle, TE, TR, FOV, order of acquisition of slices (sequential or interleaved), plus the number of experimental sessions and the number of volumes per experimental session

**Preprocessing (computational neuro-anatomy)**

- Slice timing correction: software version; order and type of interpolant used and reference slice.

- Intra-subject registration information (also called coregistration): type of motion correction used (minimally, software version; ideally, image similarity metric and optimization method used), reference volume and interpolation method.

- Inter-subject registration parameters (also called normalization): the intersubject registration method used should be specified (affine/nonlinear) with the number of parameters: 9 or 12 parameters for affine transformations; the deformation parameterization for non-linear transformations (e.g. in AIR, a polynomial order is specified; in SPM, a DCT basis size is specified, 3x2x3) as well as the non-linear regularization setting (e.g. "a little" in SPM). Furthermore, the resulting voxel size (resampling) and the interpolation method should be reported. In addition, object image information (the image used to determine the transformation to the atlas) and template or atlas information need to be reported. In particular, the image properties of the object image (T1 or T2, segmented grey image or not) and its orientation with reference to the functional data should be given. For the template image, the reference atlas name (Talairach, MNI), the modality and the resolution should also be described (e.g. "SPM2's MNI, T1 2x2x2"; "SPM2's MNI Gray Matter template 2x2x2").

- Spatial smoothing: specify whether it was applied at the 1st level, the 2nd level, or both; the kernel type (e.g. Gaussian); and the width (e.g. 12 mm FWHM; note that for a Gaussian kernel it is not enough to write 12 mm, since the kernel has infinite extent).

+ Report the order in which the various pre-processing steps have been performed (for complex analyses, a workflow diagram is useful to visualize the different steps).
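To illustrate why "12 mm" alone is ambiguous for a Gaussian kernel, here is a minimal sketch converting a reported FWHM into the Gaussian standard deviation, in mm and in voxels. The function names and the 3 x 3 x 3.5 mm voxel size are made up for the example; only the FWHM/sigma relation itself is standard.

```python
import math

# FWHM = 2 * sqrt(2 * ln 2) * sigma for a Gaussian (factor is about 2.3548).
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel width from FWHM (mm) to standard deviation (mm)."""
    return fwhm_mm / FWHM_PER_SIGMA


def sigma_in_voxels(fwhm_mm, voxel_size_mm):
    """Express the kernel standard deviation in voxel units along each axis."""
    sigma_mm = fwhm_to_sigma(fwhm_mm)
    return tuple(sigma_mm / size for size in voxel_size_mm)


# Hypothetical example: 12 mm FWHM kernel on 3 x 3 x 3.5 mm voxels.
sigma_mm = fwhm_to_sigma(12.0)
sigma_vox = sigma_in_voxels(12.0, (3.0, 3.0, 3.5))
```

A kernel reported as "12 mm" could be read as either the FWHM or the standard deviation, and the two differ by a factor of more than two, hence the need to state which one is meant.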

**Statistical modeling** for mass-univariate analyses

1. Intra-subject fMRI Modeling (1st level)

- Statistical model (GLM, non parametric) and software version used.

- Filtering (especially, the cut-off frequency of the high-pass filter)

- Autocorrelation modeling (e.g. for SPM2, 'Approximate AR(1) autocorrelation estimated at omnibus F-significant voxels (P<0.001), then pooled over whole brain'; for FSL, 'Regularized autocorrelation function estimated at each voxel').

- Hemodynamic response function used (e.g. SPM's canonical HRF; SPM's gamma basis; Glover's gamma HRF) and whether it was assumed or estimated

- Number of regressors (conditions) modeled and how. It may be useful to give the regressors explicit names rather than refer only to the underlying psychological concepts.

- Additional regressors used (e.g. motion, behavioral covariates)

- Drift modeling (e.g. DCT with cut off of X seconds; cubic polynomial)

- Estimation method: OLS, OLS with variance-correction (G-G correction or equivalent), or whitening.

- Contrast construction. Exactly what terms are subtracted from what.

- Assumption analysis: Analysis of residuals (SPMd - Tom Nichols)
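As an illustration of what "DCT with cut off of X seconds" means for drift modeling, the sketch below builds a set of low-frequency discrete cosine regressors. It is a plain-Python sketch, not any package's implementation: the rule used for the number of regressors (keep the components whose period is longer than the cut-off) is my assumption of the usual convention, and the function name and parameter values are made up.

```python
import math


def dct_drift_basis(n_scans, tr_s, cutoff_s):
    """Return low-frequency DCT-II regressors (one list per regressor).

    Component k has a period of 2 * n_scans * tr_s / k seconds; only the
    components with a period longer than cutoff_s are kept. The constant
    term (k = 0) is left out because the model usually has its own intercept.
    """
    # Assumed convention: number of components including the constant term.
    n_basis = int(math.floor(2.0 * n_scans * tr_s / cutoff_s + 1.0))
    scale = math.sqrt(2.0 / n_scans)  # makes each regressor unit-norm
    basis = []
    for k in range(1, n_basis):
        regressor = [
            scale * math.cos(math.pi * (2 * t + 1) * k / (2.0 * n_scans))
            for t in range(n_scans)
        ]
        basis.append(regressor)
    return basis


# Hypothetical session: 200 volumes, TR = 2 s, high-pass cut-off = 128 s.
drift = dct_drift_basis(n_scans=200, tr_s=2.0, cutoff_s=128.0)
```

Reporting the cut-off together with the TR and the number of volumes lets a reader rebuild exactly this set of nuisance regressors.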

2. Between subjects fMRI Modeling (2nd level)

- Statistical model and software version used

- Whether first level intersubject variances are assumed to be homogeneous (SPM & simple summary stat methods: yes; FSL: no).

- If multiple measurements per subject, method to account for within subject correlation. (e.g. SPM: 'Within-subject variance-covariance matrix estimated at F-significant voxels (P<0.001), then pooled over whole brain')

- Assumption analysis: Analysis of residuals (SPMd - Tom Nichols)

- When complex designs are used, a graphical representation of the design matrix and a description of the contrasts in terms of columns could be provided as supplementary on-line information.

**Activation maps**

- Type of search region considered, and the volume in voxels. If not whole brain, how region was found; method for constructing region should be independent of present statistic image. How many voxels corrected for?

- How were anatomical locations (e.g. Brodmann areas) determined? (e.g. Talairach Daemon, Talairach atlas, manual inspection of individuals' anatomy, etc.). If MNI converted to Talairach, what was the method? E.g. Brett's mni2tal?

- Reporting coordinates is good practice, as it helps meta-analysis; coordinates can, however, be reported as supplementary material, with only labels reported in the manuscript. Labelling should be performed using various tools. For a single subject, visualize on the subject's anatomy and use a reference atlas, avoiding relying entirely on coordinates. For a group, preferably use the average MRI brain derived from your group, or use the template used for normalization. Labelling is then better done using probabilistic atlases, thus reporting not just the anatomical region but also the likelihood that an activation is in a particular region. Finally, if one wants to associate activated regions with Brodmann areas, it is best to use a probabilistic atlas again; for now the only one available for humans is the one in the Anatomy toolbox (SPM/FSL) - see *Devlin and Poldrack 2007 NeuroImage 37 p1033-1041*.

- Inferences about significant hemispheric asymmetry require formal tests of the Hemisphere x Condition (or Hemisphere x Group) interaction (cf. *Wilke and Lidzba, 2007 NeuroImage 163*). It is inappropriate to infer a significant asymmetry from main effects (of condition or group) that are significant in only one hemisphere.

- Uncorrected inference is not acceptable, unless a single voxel can be identified a priori. Depending on the method used, results should report either the voxel-wise significance or the cluster-wise significance. For voxel-wise significance, the value corrected for Familywise Error (FWE) or False Discovery Rate (FDR) must be reported. If FWE is found by random field theory (e.g. with SPM), list the smoothness in mm FWHM and the RESEL count. If not uniquely specified by the use of a given software package and version, the method for finding significance must be reported (e.g. "Internal software was used to construct statistic maps and thresholded at FDR<0.05 (*Benjamini & Hochberg 1995*)"). For cluster-wise significance, list the cluster-defining threshold (e.g. P=0.01) and what the corrected cluster significance was (e.g. "Statistic images assessed for cluster-wise significance; with a cluster-defining threshold of P=0.01 the 0.05 FWE-corrected critical cluster size was 103."). Again, if significance is determined with random field theory, then the smoothness and RESEL count must be supplied. For both methods, cluster size should be reported either in number of voxels or in mm^3 (to facilitate comparison between studies).

- Post-hoc analyses should be justified and in particular it should be indicated how the reported analyses are not biased by the selection process (see circularity issues - Kriegeskorte et al 2009 Nature Neuroscience 12)
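Two of the reporting items above can be made concrete with a short sketch: the Benjamini & Hochberg step-up rule behind an "FDR<0.05" threshold, and the conversion of a cluster size from voxels to mm^3. This is a plain-Python illustration under my own naming, not the code of any particular package, and the p-values are invented.

```python
def bh_threshold(p_values, q=0.05):
    """Benjamini-Hochberg step-up rule.

    Returns the largest sorted p-value p_(i) with p_(i) <= q * i / m,
    or None if no p-value passes. Every p-value at or below the returned
    threshold is declared significant at FDR level q.
    """
    m = len(p_values)
    threshold = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= q * rank / m:
            threshold = p
    return threshold


def cluster_volume_mm3(n_voxels, voxel_size_mm):
    """Convert a cluster size in voxels to a volume in mm^3."""
    x, y, z = voxel_size_mm
    return n_voxels * x * y * z


# Invented p-values, for illustration only.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
thr = bh_threshold(p_vals, q=0.05)                 # 0.008 for these values
vol = cluster_volume_mm3(103, (2.0, 2.0, 2.0))     # 824.0 mm^3
```

Because the same cluster of 103 voxels is 824 mm^3 at 2 mm isotropic but 2781 mm^3 at 3 mm isotropic, stating the voxel size or the volume is what makes cluster sizes comparable across studies.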

**Figures**

- Indicate which software (and version) was used to visualize the results and possibly which software was used for editing the figures.

- Indicate whether the neurological or radiological convention is used.

- If the threshold used for inference and the threshold used for visualization in figures differ, clearly state so and list each.

- Continuous activation maps should be made available, and displayed in the paper or supplementary material. Thresholded statistic maps can be seriously misleading, both because they exclude sub-threshold but possibly broad patterns, and because they do not reveal the analysis mask. A reader automatically equates the absence of a suprathreshold blob with no activation, yet would think differently on finding that there was no data in that entire region (possibly due to susceptibility artifacts). For more on the merits of unthresholded images, see Jernigan et al. 2003 Hum Brain Mapp. 19.

- For thresholded and continuous maps, always display the color scale and its meaning, i.e. not just activation/deactivation but T/F/Z values.

- The recommended color scale for activation maps is the heated body scale (HBS), going from dark to bright but stopping at yellow (do not include white - see Christen et al. 2013 NeuroImage 73 p30-39). For other types of maps, use a color scale other than HBS.

- If significant interactions (e.g., Group x Condition) or other complex contrasts are observed, bar plots of % signal change or the like would be helpful. If bar plots are used, error bars should be included. If the contrast is within-subjects (repeated-measures), the appropriate within-subjects (repeated-measures) error terms should be used (Masson & Loftus, 2009).

- Analyses of zero-order, partial, or part correlations between brain activity and other measures (e.g., paper-and-pencil measures, task performance) mandate the inclusion of scatter plots, preferably with CIs.
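For the within-subjects error bars mentioned above, one common approach is to base the error term on the subject-by-condition interaction, so that overall differences between subjects do not inflate the bars. The sketch below follows that idea as I understand it; the function name and the toy values are made up, and the half-width of a confidence interval would still need the critical t value for (n - 1)(k - 1) degrees of freedom from a statistics package.

```python
import math


def within_subject_se(data):
    """Standard error based on the subject-by-condition interaction.

    data[i][j] is the value (e.g. % signal change) for subject i in
    condition j. Returns sqrt(MS_interaction / n_subjects), one value
    shared by all conditions.
    """
    n = len(data)        # number of subjects
    k = len(data[0])     # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    subj_means = [sum(row) / k for row in data]
    cond_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]
    # Residual after removing subject and condition main effects.
    ss_inter = sum(
        (data[i][j] - subj_means[i] - cond_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_inter = ss_inter / ((n - 1) * (k - 1))
    return math.sqrt(ms_inter / n)


# Toy data: 3 subjects x 2 conditions (invented values).
toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 3.0]]
se = within_subject_se(toy)
```

When subjects differ only by a constant offset, the interaction residuals vanish and this error term is zero, which is exactly the between-subject variability that ordinary error bars would wrongly display.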

**Data curation and reproducible results**

Data curation refers to the management activities required to maintain research data over the long term, so that it remains available for reuse and preservation. As you analyze data and prepare your article, think about (1) how you are going to store the data, the metadata describing the experiment and how to 'read' the data, the analysis scripts, and possibly intermediate and final results; and (2) how to make sure anybody can reproduce your research. Data curation is now a requirement of many funding bodies, and a plethora of information can be found on the UK Digital Curation Centre website.

Even if you follow the recommendations on that page, it is likely that a full replication of your work is not possible just from reading your article. A valuable thing to do is to keep a workflow (and save it along with the data). This can be a Word document, but it is even better as an executable script. In Matlab, see for instance the pipeline system by Pierre Bellec: The pipeline system for Octave and Matlab (PSOM): a lightweight scripting framework and execution engine for scientific workflows; and of course you can use Matlab-specific formatting to make the code easily readable. In Python, the notebook is pretty useful, along with the workflows provided in nipype. Here is some information I found useful for imaging, taken from Sandve et al. (2013).

1. Keep track (on paper) of every step from data collection to results. At each step, record what was done with what software (and version), and also archive the software versions used.

2. Minimize manual editing ; for instance if you need to change the format, this is better if done automatically via some software -- even copy/paste can be done in a script.

3. For analyses that involve a random number generator, save the seed or state of the system, so that the exact same result can be obtained.

4. Store raw data along with summaries and plots ; for instance for a ROI analysis store the raw data (the values obtained for each subject and condition) along with the script computing the mean, 95% CI and making the plot.

5. Comment scripts not only to describe what is computed but also to state which hypothesis is tested, or which question is answered, by that specific bit of code.
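Points 1 and 3 above can be sketched in a few lines of Python: record the seed and the software versions next to the result, so that a rerun gives the exact same numbers. The function name, the JSON layout and the placeholder "analysis" (five random draws) are all made up for illustration.

```python
import json
import platform
import random
import sys


def run_with_provenance(seed, out_path):
    """Run a toy random analysis and save the result with its provenance."""
    rng = random.Random(seed)  # dedicated generator, seeded explicitly
    # Placeholder for a real analysis step that relies on random numbers.
    result = [rng.gauss(0.0, 1.0) for _ in range(5)]
    record = {
        "seed": seed,                      # lets anyone regenerate the result
        "python_version": sys.version,     # software (and version) used
        "platform": platform.platform(),
        "result": result,
    }
    with open(out_path, "w") as handle:
        json.dump(record, handle, indent=2)
    return record
```

Calling `run_with_provenance(42, "roi_bootstrap.json")` twice (the file name is hypothetical) writes the same `result` both times, which is what makes the saved seed useful.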

**Data sharing**

Data sharing is greatly beneficial to the community and to authors! Indeed, it has been shown (Piwowar et al. 2007) to increase the trust in, interest in, and citation of your paper, as the paper can be cited simply because others use the data. Consider whether and how you want to share your data (see Poline et al. 2012, Frontiers in Neuroinformatics). Large or unusual datasets can be published on their own in dedicated journals, while coordinates, result maps only, or even all the data can be submitted to databases.