Based on the Junker report, we will consider the basic outline of a statistical report.
The introduction should provide the following.
See Nolan and Stoudt chapter 3 (on Sakai) for some great suggestions here.
Here’s an example of a codebook excerpt (from Nolan & Stoudt).
This is very helpful for a programmer, who needs to map numeric codes to their meanings. Someone reading your paper does not need to know (or care) that 3 = “once did, not now” or that 99 = “unknown.”
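To make the programmer's side of this concrete, here is a minimal sketch of how a codebook might be applied in code. Only the codes 3 and 99 come from the excerpt above; the other codes, the variable name `STATUS_LABELS`, and the helper `decode` are hypothetical and used purely for illustration.

```python
# Hypothetical codebook for one categorical variable.
# Only codes 3 and 99 appear in the text; codes 1 and 2 are assumed.
STATUS_LABELS = {
    1: "never",              # assumed for illustration
    2: "currently",          # assumed for illustration
    3: "once did, not now",
    99: "unknown",
}


def decode(codes, labels):
    """Replace numeric codes with readable labels.

    Codes that are absent from the codebook are flagged rather than
    silently dropped, so they can be investigated during data cleaning.
    """
    return [labels.get(code, f"unrecognized code {code}") for code in codes]


raw = [3, 99, 1, 3, 7]
decoded = decode(raw, STATUS_LABELS)
print(decoded)
```

Tables and figures in the paper would then use the decoded labels, while the numeric codes stay in the codebook or appendix.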
Here is a corresponding publication-quality table (Nolan & Stoudt).
Often, instead of the table above, you will see a “Table 1” at the beginning of the results section, providing simple descriptive statistics of the variables.
For the previous example, your table would also include % missing or unknown for each variable.
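As a rough sketch of how the entries of such a table could be computed, the example below summarizes one numeric and one coded variable, including the percentage missing or unknown. The records, the variable names, and the convention that `None` marks a missing value are all assumptions made for illustration; only the code 99 = “unknown” comes from the example above.

```python
from statistics import mean, stdev

# Hypothetical records: None marks a missing value, 99 an "unknown" code.
records = [
    {"age": 34, "status": 1},
    {"age": None, "status": 3},
    {"age": 51, "status": 99},
    {"age": 47, "status": 3},
]


def percent_missing(values, unknown_codes=()):
    """Percentage of values that are None or carry an 'unknown' code."""
    n_bad = sum(v is None or v in unknown_codes for v in values)
    return 100.0 * n_bad / len(values)


ages = [r["age"] for r in records]
statuses = [r["status"] for r in records]
observed_ages = [a for a in ages if a is not None]

# Summary statistics use observed values only; missingness is reported separately.
table1 = {
    "age": {
        "mean": mean(observed_ages),
        "sd": stdev(observed_ages),
        "pct_missing": percent_missing(ages),
    },
    "status": {
        "pct_missing": percent_missing(statuses, unknown_codes=(99,)),
    },
}

for variable, summary in table1.items():
    print(variable, summary)
```

Reporting the missing percentage alongside the summary statistics lets a reader see how many observations each summary is actually based on.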
Data cleaning is an important part of the processing pipeline and should be clearly documented. Be sure to address aspects such as any observations excluded from analysis (and why), quantification and treatment of missing values, transformations and derivations of variables, aggregation of observations to a coarser level of granularity (e.g., averaging exposures over a full day), and merges of multiple data sets.
For example, in the second case study, some observations are collected at 700 Hz, while others are collected at 4 Hz. You’ll need to describe how the scales were aligned for analysis.
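One possible way to align the two scales, shown here only as a sketch, is to average the 700 Hz signal in consecutive blocks so that it yields one value per 4 Hz sample. This is not necessarily the approach used in the case study: block averaging is just one option (interpolating the slower signal or filtering before downsampling are others), and the function name `downsample_by_mean` and the toy signal are hypothetical.

```python
def downsample_by_mean(signal, fast_hz=700, slow_hz=4):
    """Average consecutive blocks of a fast signal to match a slower rate.

    Assumes both signals start at the same time and that the fast rate
    is an exact multiple of the slow rate.
    """
    if fast_hz % slow_hz != 0:
        raise ValueError("this sketch assumes fast_hz is a multiple of slow_hz")
    block = fast_hz // slow_hz       # 700 // 4 = 175 fast samples per slow sample
    n_blocks = len(signal) // block  # an incomplete trailing block is dropped
    return [
        sum(signal[i * block:(i + 1) * block]) / block
        for i in range(n_blocks)
    ]


# One second of a made-up 700 Hz signal.
fast = [float(i) for i in range(700)]
aligned = downsample_by_mean(fast)
print(len(aligned), aligned)
```

Whatever method you choose, the report should state it explicitly, including how incomplete blocks and any timestamp offsets between devices were handled.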
In general, you want the methods section of the manuscript to describe an analytic process that is computationally reproducible, so every data processing and modeling step should be clear. The model should be very clearly described using mathematical notation (pay attention to indices!). A reader should be able to read this section and then code the analysis, potentially with help from the appendix for things like variable definitions (that is, the paper itself doesn’t need to tell you that 3 = “once did, not now,” only that it was a category you included as a predictor or outcome in a model).
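As an illustration of the level of detail intended, here is how a simple model might be written out. The model itself is hypothetical (a random-intercept linear model chosen only as an example, not one taken from the case studies); the point is that every index, its range, and every distributional assumption is stated.

```latex
Y_{ij} = \beta_0 + \beta_1 x_{ij} + b_i + \varepsilon_{ij},
\qquad i = 1, \dots, n, \quad j = 1, \dots, m_i,

b_i \overset{\text{iid}}{\sim} N(0, \tau^2), \qquad
\varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2), \qquad
b_i \perp \varepsilon_{ij}.
```

Here $Y_{ij}$ is the outcome for observation $j$ on subject $i$, $x_{ij}$ is the corresponding predictor, $m_i$ is the number of observations on subject $i$, and $b_i$ is a subject-specific intercept. A reader can tell from the indices alone which quantities vary across subjects and which vary within a subject.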
Often these sections are combined. Here you describe the results of your modeling, any validation/sensitivity analysis, and your main findings. Generally multiple helpful tables and figures are included in this section.
Come back to the questions raised in the introduction, perhaps adding details that emerged from the analysis. Raise new questions and identify future work and areas for further investigation here.
This can include material not central to understanding your manuscript, such as