4  Research I

4.1 No data = not reproducible

Computational reproducibility :=

“a second investigator (including the original researcher in the future) can recreate the final reported results of the project, including key quantitative findings, tables, and figures, given only a set of files and written instructions” (Kitzes et al., 2018, p. xxii)

Same data + same analysis = same results
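This equation can be sketched in code. The snippet below is a minimal toy illustration (not from any cited paper): a "second investigator" reruns the same analysis on the same data and must obtain the identical result. The analysis function, the example data, and the use of a fixed random seed are all assumptions for illustration.

```python
import hashlib
import json
import random

def analysis(data, seed=42):
    """Toy 'analysis': a bootstrapped mean with a fixed random seed."""
    rng = random.Random(seed)  # fixed seed -> deterministic resampling
    resamples = [
        sum(rng.choices(data, k=len(data))) / len(data)
        for _ in range(1000)
    ]
    return round(sum(resamples) / len(resamples), 6)

data = [2.1, 3.4, 2.9, 3.8, 2.5]  # the shared raw data

# Two independent runs: "original researcher" and "second investigator"
result_1 = analysis(data)
result_2 = analysis(data)
assert result_1 == result_2  # same data + same analysis = same results

# A checksum of the shared data lets the second investigator verify
# they truly start from identical inputs before rerunning the analysis.
checksum = hashlib.sha256(json.dumps(data).encode()).hexdigest()
print(result_1, checksum[:12])
```

In practice the same idea is enforced by sharing the exact data files (with checksums), the exact analysis code, and a pinned software environment.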


Why reproducibility?
It allows independent researchers to assess the analytic choices, assumptions, and implementations that led to a set of scientific claims,
and to check those claims for validity and generalizability (Clyburne-Sherin et al., 2019; Obels et al., 2020)


4.2 No FAIR data = reproducibility tedious

But computational reproducibility isn’t as easy as it sounds (Artner et al., 2021):

  • checked 232 primary statistical claims
  • from articles in 3 journals
  • only where the data had been provided and were accessible (33%, 25%, and 26% of articles per journal)

Vagueness Makes Assessing Reproducibility a Nightmare

“most successful reproductions are predominantly the result of tedious and time-consuming work”

“information about the provided raw data was often difficult to understand, and information about the relevant variables, data manipulations, and the used statistical model was often vague or inaccurate”

(Artner et al., 2021, p. 12)


4.3 No data = barrier to replication

  • Evidence, e.g., from replication attempts in cancer biology (Errington et al., 2021)
  • Due to various barriers, only 50 of the 193 replication experiments could be conducted at all
  • Missing data = major barrier to computing the parameters needed for replication

Data were open for only 4 of the 193 experiments

Questions to be answered at the end?
Please put them here!