Bogna Liziniewicz
The present study investigates the state of open data (defined as research data that anyone can freely access and use) in Psychology. The project compares the quality and reusability of public and requested datasets. It also explores researchers’ attitudes towards data sharing. Expanding on existing research, which has largely focused on publicly available psychological data, it investigates both data sources simultaneously, making it possible to evaluate the persistence of past findings and estimate their source-based proportion. Additionally, investigating the most recent attitudes towards open data makes it possible to identify changes in opinion, enabling novel approaches to the discussion of open science. The study assesses the quality and reusability of psychological research data through their different components, including the use of universal file formats and the presence of supporting information. Attitudes are investigated via a questionnaire adapted from previous studies and extended with open-ended questions, to capture any opinions not investigated previously. The results suggest a significantly better state of public data compared to requested data. In addition, the attitudes align with previous findings: positive attitudes towards data sharing, despite various fears. The study highlights the importance of not only encouraging open science practices, but also ensuring better quality of requested data.
Are you interested in learning about the importance of Open Science in Psychology?
If so, start with the introductory video below - and explore the contents of this website to feed your curiosity!
Please click on this link to view the video transcript.
The science behind Open Science
and its importance to Psychology
Open Science stands for research transparency. This means that it aims to make as much of the research process as possible available to people. For example, when a research paper is published, the authors link a database to it. A database is a storage space where anonymised datasets and supplementary information can be uploaded for independent researchers to access. This way, one scientist's data can be reused by their independent colleagues to replicate a study or to analyse the same data to answer a new research question.
The idea vs. the harsh reality
Open Science is an amazing phenomenon. It protects scientific integrity and helps fight the replication crisis - the low rate at which researchers are able to redo someone else's study and obtain similar results. Thanks to Open Science, it is possible to verify research and ensure its credibility: if many people can access the same data and reanalyse it, obtaining similar results, it is less likely that the data was made up or that the analyses were manipulated.
However, past research has found that, across many scientific disciplines, the practices of Open Science are far from ideal.
What do we know?
Across many fields of research, Open Science calls for improvement.
The available scientific data
Across different areas of research, the findings are consistent. In many cases, independent scientists cannot reuse the available data:
- Roche et al. (2015) - More than half of the publicly available datasets in Ecology and Evolution are incomplete, which prevents them from being reused by other scientists.
- Hardwicke et al. (2018) - Mandatory data sharing policies of scientific journals (Psychology) result in more datasets being made available publicly.
- Hardwicke et al. (2021) - Research data are publicly available for fewer than 15% of the 250 Psychology papers examined (published between 2014 and 2017).
- Towse et al. (2021) - Only 4% of the 1900 papers published in Psychology journals between 2014 and 2017 have publicly available datasets. Over two thirds of these datasets are non-reusable and over half are incomplete.
- Savage & Vickers (2009) and Wicherts et al. (2006) - Datasets available upon request are often difficult to obtain, both in Medicine and in Psychology.
- Vines et al. (2014) - Time impacts research data availability: the older the dataset, the less likely it is that it can be obtained.
Attitudes towards Open Science:
- Tenopir et al. (2015) - scientists want to share their research data, although they also list reasons why they sometimes decide not to.
- Houtkoop et al. (2018) - the most commonly listed reasons not to share one's own research data include: lack of time; legal inability; fear of data misuse; preference to share the dataset upon request; lack of specific training.
- Wicherts et al. (2011) - fears concern the possibility of someone else's alternative analysis on one's data resulting in non-significant results.
- Martone et al. (2018) - fear of being taken advantage of: another researcher accessing one's own data and using them for a different project, then publishing the results before the original authors can.
The present study
Aims of the project:
The present study has two aims:
1. Investigating and comparing the quality and reusability of publicly available and requested research data.
This extends the existing research on open data to research data available upon request, making it possible to compare their state with that of publicly available data.
2. Following up on researchers' attitudes towards data sharing.
This is done to investigate the changes in said attitudes over time. This way, it is possible to identify any new approaches to the topic, which may help explain why the percentage of available data is so low, despite the positive attitudes.
“This study has been reviewed and approved by a member of the Psychology department or the Faculty of Science and Technology Research Ethics Committee at Lancaster University.”
Pre-registration
Research transparency is a crucial component of Open Science. Because of this, the first step in turning the idea of this study into action was the process of pre-registering it on the Open Science Framework platform. This way, the aims of this piece of research were made clear before any data was collected. In addition, it is now possible to access any materials used in this study through the platform, following this link.
The process:
Part 1 - Investigating and comparing the quality and reusability of publicly available and requested research data.
IDENTIFYING THE JOURNALS AND TIMEFRAME OF INTEREST
British Journal of Psychology & Psychological Science
2016-2020 (Early: Jan 2016 - Jun 2018; Late: Jul 2018 - Dec 2020)
RANDOMLY SAMPLING THE PAPERS
Target: 100 papers (50 per journal). Using random.org to avoid bias.
SEARCHING FOR DATASETS
A manual process for each paper. If no dataset was found, the author was contacted via email (Requested datasets category).
OBTAINING DATA
ASSESSING THE DATA USING SCALES FOR COMPLETENESS AND REUSABILITY CREATED BY ROCHE ET AL. (2015).
The same scales were used for the previous investigation of open data in Psychology by Towse et al. (2021).
Flowchart 1. The stages of obtaining data for Part 1 of the project.
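The sampling and time-window stages above can be sketched in code. This is only an illustration under stated assumptions: the study drew its random numbers from random.org, whereas the sketch swaps in Python's pseudo-random `random` module, and the function names, the seed and the paper identifiers are all hypothetical. Only the window boundaries and the target of 50 papers per journal come from the flowchart.

```python
import random
from datetime import date

# Window boundaries taken from the flowchart:
# Early = Jan 2016 - Jun 2018, Late = Jul 2018 - Dec 2020.
EARLY = (date(2016, 1, 1), date(2018, 6, 30))
LATE = (date(2018, 7, 1), date(2020, 12, 31))


def sample_papers(paper_ids, k=50, seed=None):
    """Draw k distinct papers at random (a stand-in for the random.org draw)."""
    rng = random.Random(seed)
    return rng.sample(list(paper_ids), k)


def time_window(published):
    """Label a publication date as 'Early' or 'Late'."""
    if EARLY[0] <= published <= EARLY[1]:
        return "Early"
    if LATE[0] <= published <= LATE[1]:
        return "Late"
    raise ValueError(f"{published} is outside the 2016-2020 timeframe")


# Hypothetical identifiers: the real lists of eligible papers are not shown here.
journals = {
    "British Journal of Psychology": [f"BJP-{i}" for i in range(400)],
    "Psychological Science": [f"PS-{i}" for i in range(400)],
}
sample = {name: sample_papers(ids, k=50, seed=1) for name, ids in journals.items()}
```

Fixing a seed makes the pseudo-random draw repeatable, which a true random number service does not offer; whether that trade-off is desirable depends on the aims of the sampling.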
Part 2 - Investigating the attitudes towards data sharing.
1. The process started with creating a survey based on a questionnaire on data sharing attitudes from Houtkoop et al.'s (2018) study.
2. Once the questionnaire was created, a survey invitation list was created. The target participants were the authors whose papers were used in Part 1 of the present study. Their participation was anonymous and voluntary. The target number of participants was N=100.
3. Step 3 consisted of sending out the initial survey invitation via email, followed by a single reminder to complete the study 2 weeks after the first email.
THE QUESTIONNAIRE
BENEFITS OF DATA SHARING
BARRIERS TO DATA SHARING
FEARS OF DATA SHARING
Flowchart 2. The questionnaire consisted of 3 subscales, each focused on a different aspect of attitudes towards data sharing. In addition, there was one open-ended question which served to capture any other observations that had not been mentioned in the closed-ended questions.
Analyses and findings:
Part 1
The final total number of papers investigated was N=60.
A 2x2 between-subjects ANOVA (Data Category x Time Window) was used to analyse the data.
The results suggest a significant effect of Data Category on Data Completeness.
Similarly, there is a significant effect of Data Category on Data Reusability.
For each component, publicly available data was in a better state than the data obtained via request.
For each component, the effect of Time Window was not significant.
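For readers unfamiliar with this kind of analysis, the following is a minimal sketch of how the F statistics of a 2x2 between-subjects design are computed. It is not the analysis script used in the study. It assumes equal numbers of datasets in every cell (the real sample may have been unbalanced), the function name is hypothetical, and the scores are made-up numbers chosen to keep the arithmetic easy to follow. Obtaining p-values would additionally require the F distribution from a statistics package.

```python
from statistics import mean


def two_by_two_anova(cells):
    """Return F statistics for a balanced 2x2 between-subjects design.

    `cells` maps (level_of_factor_a, level_of_factor_b) to a list of scores.
    Every cell must hold the same number of scores (at least two).
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    sizes = {len(scores) for scores in cells.values()}
    if len(cells) != 4 or len(levels_a) != 2 or len(levels_b) != 2 or len(sizes) != 1:
        raise ValueError("expected a balanced 2x2 design")
    n = sizes.pop()
    if n < 2:
        raise ValueError("each cell needs at least two scores")

    grand = mean(x for scores in cells.values() for x in scores)
    cell_mean = {key: mean(scores) for key, scores in cells.items()}
    mean_a = {a: mean(cell_mean[(a, b)] for b in levels_b) for a in levels_a}
    mean_b = {b: mean(cell_mean[(a, b)] for a in levels_a) for b in levels_b}

    # Sums of squares for the two main effects, the interaction and the error term.
    ss_a = n * 2 * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = n * 2 * sum((m - grand) ** 2 for m in mean_b.values())
    ss_ab = n * sum(
        (cell_mean[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
        for a in levels_a
        for b in levels_b
    )
    ss_within = sum((x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores)

    # Each effect has 1 degree of freedom, so its mean square equals its sum of squares.
    ms_within = ss_within / (4 * n - 4)
    if ms_within == 0:
        raise ValueError("no within-cell variance; F is undefined")
    return {
        "factor_a": ss_a / ms_within,
        "factor_b": ss_b / ms_within,
        "interaction": ss_ab / ms_within,
    }


# Made-up completeness scores, NOT the study's data.
scores = {
    ("public", "early"): [4, 5, 6],
    ("public", "late"): [5, 6, 7],
    ("requested", "early"): [1, 2, 3],
    ("requested", "late"): [2, 3, 4],
}
f_values = two_by_two_anova(scores)
```

Here `factor_a` plays the role of Data Category and `factor_b` the role of Time Window. In this toy example the gap between public and requested scores is large relative to the spread within each cell, so the F value for `factor_a` is far larger than the one for `factor_b`.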
Part 2
The final total number of participants after the exclusion of incomplete responses was N=17 (Response rate: 20%).
The survey responses from the closed-ended questions were analysed using factor analysis.
Thematic Analysis was used to analyse the responses to the open-ended question in the survey.
Although the factor analysis was not significant, the responses are suggestive, as they align with past findings. The qualitative responses also revealed similar trends. However, the low survey response rate and small sample size call for repeating this part of the study with a larger number of participants.
Qualitative analysis:
Among the themes found through qualitative analysis, there were:
- Fear of data sharing despite a willingness to share. The fear arises from the possibility of criticism from other researchers who find errors in the dataset or take a different approach to a specific set of data, which can produce different results and thus invalidate the original findings.
- Concerns about lack of standards for data sharing.
- Finding errors in someone else's data is seen as a benefit, because it helps to pick up what the original researcher may have missed.
The next steps:
The results suggest the need to research datasets available upon request in greater detail. This way, it will be possible to pick up on the reasons behind their quality being poorer in comparison to publicly available datasets.
It might be due to the lack of standards for open data - that is to say, perhaps the existing policies implemented by various scientific journals still call for improvement. Of course, it does not mean that the policies did not change the situation - as we know from past research, mandates on data sharing contributed to greater availability of research data.
It is also worth pointing out that perhaps the policies should address the perceived fears of and barriers to data sharing. Although this study's quantitative analyses of the survey responses were not significant, they do fit the trends listed in past research. If we consider the previous, significant themes in the perceptions of open science, it will be possible to amend the policies or include new rules to protect scientists from their data being misused.
Open Science is an amazing phenomenon which helps to improve all fields of research. Hopefully the state of open data continues to improve so that everyone can benefit from it.
Acknowledgements:
I would like to thank my research supervisor, Professor John Towse, for being an invaluable source of support throughout the course of this research project.
References:
Haahr, M. (2022, February 28). RANDOM.ORG: True Random Number Service. Retrieved from https://www.random.org
Hardwicke, T. E., Mathur, M. B., MacDonald, K., Nilsonne, G., Banks, G. C., Kidwell, M. C., Hofelich Mohr, A., Clayton, E., Yoon, E. J., Henry Tessler, M., Lenne, R. L., Altman, S., Long, B., & Frank, M. C. (2018). Data availability, reusability, and analytic reproducibility: Evaluating the impact of a mandatory open data policy at the journal Cognition. Royal Society Open Science, 5(8), 180448. https://doi.org/10.1098/rsos.180448
Hardwicke, T. E., Thibault, R. T., Kosie, J. E., Wallach, J. D., Kidwell, M. C., & Ioannidis, J. P. A. (2021). Estimating the Prevalence of Transparency and Reproducibility-Related Research Practices in Psychology (2014–2017). Perspectives on Psychological Science, 174569162097980. https://doi.org/10.1177/1745691620979806
Houtkoop, B. L., Chambers, C., Macleod, M., Bishop, D. V. M., Nichols, T. E., & Wagenmakers, E.-J. (2018). Data Sharing in Psychology: A Survey on Barriers and Preconditions. Advances in Methods and Practices in Psychological Science, 1(1), 70–85. https://doi.org/10.1177/2515245917751886
Martone, M. E., Garcia-Castro, A., & VandenBos, G. R. (2018). Data sharing in psychology. American Psychologist, 73(2), 111–125. https://doi.org/10.1037/amp0000242
Roche, D. G., Kruuk, L. E. B., Lanfear, R., & Binning, S. A. (2015). Public Data Archiving in Ecology and Evolution: How Well Are We Doing? PLOS Biology, 13(11), e1002295. https://doi.org/10.1371/journal.pbio.1002295
Savage, C. J., & Vickers, A. J. (2009). Empirical Study of Data Sharing by Authors Publishing in PLoS Journals. PLoS ONE, 4(9), e7078. https://doi.org/10.1371/journal.pone.0007078
Tenopir, C., Dalton, E. D., Allard, S., Frame, M., Pjesivac, I., Birch, B., Pollock, D., & Dorsett, K. (2015). Changes in Data Sharing and Data Reuse Practices and Perceptions among Scientists Worldwide. PLOS ONE, 10(8), e0134826. https://doi.org/10.1371/journal.pone.0134826
Towse, J. N., Ellis, D. A., & Towse, A. S. (2021). Opening Pandora’s Box: Peeking inside Psychology’s data sharing practices, and seven recommendations for change. Behavior Research Methods, 53(4), 1455–1468. https://doi.org/10.3758/s13428-020-01486-1
Vines, T. H., Albert, A. Y. K., Andrew, R. L., Débarre, F., Bock, D. G., Franklin, M. T., Gilbert, K. J., Moore, J.-S., Renaut, S., & Rennison, D. J. (2014). The Availability of Research Data Declines Rapidly with Article Age. Current Biology, 24(1), 94–97. https://doi.org/10.1016/j.cub.2013.11.014
Wicherts, J. M., Bakker, M., & Molenaar, D. (2011). Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results. PLoS ONE, 6(11), e26828. https://doi.org/10.1371/journal.pone.0026828
Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61(7), 726–728. https://doi.org/10.1037/0003-066X.61.7.726