Papers by Deborah Wiltshire

IASSIST Quarterly, Mar 28, 2024
Social science and humanities research infrastructures allow the sharing and safe use of confidential, sensitive data for research via physical safe havens. In recent years there has been a shift towards virtual data enclaves or Remote Desktop systems that offer fewer physical controls. These controls need to be replaced with other safeguards, including mandatory 'Safe Researcher' training. This training aims to ensure that researchers are equipped with the knowledge required to use secure data safely. Developing training is resource intensive, so canonical training materials are an economical approach to providing standardized, high-quality training. The Social Sciences and Humanities Open Cloud project deliverable 'Training materials of workshop for secure data facility professionals' had two objectives. The first was the development of a set of canonical training materials that Trusted Research Environments (TREs) could use as a framework on which to build their own training course. The second objective was to hold a virtual workshop where the training materials could be demonstrated to a credible audience to gather feedback to inform the future development of the materials. We have now developed the canonical materials, building on the wealth of expertise and experience of UK-based TREs. These training materials were then demonstrated at a virtual, two-hour Stakeholder Workshop that we organized in September 2021. Following our demonstration of the materials, we facilitated small group discussions to gather vital feedback. The discussion groups formed a consensus that the materials were both comprehensive and clearly structured and would be a valuable resource to the TRE community.
Combined survey and web tracking data have great potential for social-scientific research. They allow linking information on online behavior with data on reported offline behavior, opinions, and attitudes. At the same time, ethical, legal, and technical challenges make it difficult to disseminate linked web tracking data to the scientific community. This whitepaper aims to address these challenges by providing guidance for researchers and archivists, discussing legal, practical, and ethical aspects, disclosure risks, and establishing a framework for publishing web tracking data. Recommendations for best practices are also provided based on experiences from a research project funded by the German Consortium for the Social, Behavioural, Educational and Economic Sciences.
Zenodo (CERN European Organization for Nuclear Research), Apr 5, 2022
CERN European Organization for Nuclear Research - Zenodo, Apr 30, 2022

CERN European Organization for Nuclear Research - Zenodo, Apr 29, 2022
This white paper provides the necessary basis for understanding the requirements and specifications for remote access to sensitive data (data with potentially harmful effects in the event of their disclosure) in the social sciences and the humanities (SSH). It is the result of the work implemented in SSHOC Task 5.4 Remote Access to Sensitive Data. It is intended to provide guidance and recommendations to the EOSC stakeholders for future infrastructure investment for remote access to sensitive data in the SSH. To ensure that this guidance can, in fact, be implemented, the recommendations are based on the knowledge of numerous data professionals who have direct experience planning, implementing, managing, and sustaining diverse forms of remote access and secure facilities. In doing so, our goal has been to maintain the vision of expanding such infrastructure, while remaining grounded in the practicalities of operating such facilities in a sustainable manner. In this domain, it is now recognized that the ideal of "open data" needs to be balanced with privacy and other factors that can require moderating access to sensitive data, as reflected in the EU Commission's (2016) stance of "as open as possible, as closed as necessary." Developments in the past five years have advanced data access, primarily through "safe enclaves", i.e., physical rooms that provide security for data access (see Glossary). This represents a major improvement for data accessibility, but international, comparative, efficient research requires augmenting the research infrastructure by enabling remote access to data from a researcher's desktop. Solutions have operated for several years (e.g., UK Data Archive Secure Lab, ICPSR Virtual Data Enclave), but most of these still face limitations on the scope of data available, geographic limitations, etc.
More recently, new infrastructures have been developed, some spanning several countries. These efforts are commendable and represent major improvements. However, limited resources and complex legal variations (national implementations of GDPR), as well as other factors, have prevented implementation of a broader solution. As countries across Europe look at the emerging multinational infrastructures, it is crucial to address the need for a European answer, at scale, with sustainable funding. The recommendations offered here are guided by our observations that most successful infrastructures embody two features: 1) they are human as well as technical, and 2) they are neither purely centralised nor decentralised, but well-crafted hybrids.

An important aspect of the UK Data Service's work is to promote statistical literacy and engagement with quantitative survey data. We provide support by creating user guides and webinars as well as through help desk support. There is often a gap between the skills learned using bespoke data, and the reality of using survey data for analysis. The queries we receive indicate a need for greater understanding of survey design and data collection, especially with longitudinal surveys which have many constraints including fieldwork procedures, respondent burden, confidentiality, and software limitations. There is some disparity between researchers' expectations and the analysis of data obtained from a repository. Having previously worked on Understanding Society, I now work in the UK Data Service User Support Team which gives me an understanding of the challenges faced by researchers and how to address those challenges. I am collaborating with the Understanding Society team to pro...

Social science and humanities research infrastructures allow the sharing and safe use of confidential data for research. In recent years there has been a shift towards virtual data enclaves or Remote Access or Remote Desktop systems that offer fewer physical controls. These controls need to be replaced with other safeguards, including training that is often mandatory. This training aims to ensure that researchers are equipped with the knowledge required to use secure/legally controlled data safely. Developing training is resource intensive, so canonical training materials are an economical approach to providing standardised, high-quality materials for researchers. As development moves towards remote access connections that allow access across international borders, having some commonalities in the training that services offer will allow secure data access facilities across the world to be confident that researchers have received high-quality training regardless of where they trained. The SSHOC Task 5....

On 5 July Cochrane Pregnancy & Childbirth and the UK Data Service held a half-day workshop aimed at demystifying some of the conceptions and misconceptions around sharing and archiving data derived from academically run clinical trials. There is no proposal to share any personal data without appropriate security, as outlined below. The purpose of this workshop was to discuss and identify situations where it might be appropriate to share data and how best practice would make this process secure and safe. On behalf of the UK Data Service (https://www.ukdataservice.ac.uk/), Louise has led discussions with a range of interest groups and policymakers on the practicalities of archiving and sharing data from completed clinical trials. The conversations address calls from funders to improve data sharing of research they have supported and requests from the clinical trials community for practical steps that can be taken to safely archive data. Why share clinical trials data? Researchers in both the academic and public sectors are experiencing an increased emphasis on demonstrating research integrity and reproducibility; funders, journals and professional bodies concerned with research conduct expect data usage to be transparent and reproducible. Louise reports:

A number of Research Data Centres (RDCs) run secure environments which provide valuable access to rich sets of data which, because of their level of detail, carry a significant risk of disclosing the identity of individuals or organisations. Many pioneering areas of research are advanced by using these highly sensitive data. RDC use is on the rise, also due to the growing range of new data available exclusively via this route. One of the foundations of RDCs is the concept of safe outputs whereby results of analysis, undertaken within secure environments, are only released once they have passed a thorough Statistical Disclosure Control (SDC). However, SDC checks can be time and labour-intensive. Therefore, a challenge for many RDCs is to make the process of output checking more efficient while retaining the rigour of disclosure control and excellent service for users. This panel session will feature three RDCs based in the UK, Germany and the United States of America, an...
Integrative Bioinformatics
Training researchers who want to access confidential, legally controlled data in data protection and statistical disclosure is a key part of ensuring the safe use of these data and an important task for secure data facility professionals. But developing suitable training materials can be resource intensive. We have developed a set of canonical training materials that cover a wide range of relevant topics from understanding data access, the Five Safes framework to key statistical disclosure control principles. These materials are designed to be used by a wide range of safe data access facilities to form the basis of their researcher training programme. We have designed these materials so that individual facilities can adapt the content to suit the needs of their service and researchers with minimal additional resource requirements.

This study investigates whether there is an association between economic activity in women and union dissolution in the UK. This study looks at both individual-level and aggregate-level trends by posing a number of research questions. Using a series of Cox Proportional Hazard and Piecewise Constant models to analyse individual-level data from the British Household Panel Survey and Understanding Society surveys, this study has found only weak and inconsistent evidence of an association between women’s economic activity and union dissolution. Examining these data for separate union cohorts, this study has found some initial evidence that the relationship between economic activity and union dissolution may be changing over time. The final stage of the analysis in this study looked at aggregate trends in economic activity and divorce and found some evidence of an association at the aggregate level, although due to data restrictions this was not conclusive. Following a discussion of the ...

Human Genomics, Apr 26, 2018
Genomic and biosocial research data about individuals is rapidly proliferating, bringing the potential for novel opportunities for data integration and use. The scale, pace and novelty of these applications raise a number of urgent sociotechnical, ethical and legal questions, including optimal methods of data storage, management and access. Although the open science movement advocates unfettered access to research data, many of the UK's longitudinal cohort studies operate systems of managed data access, in which access is governed by legal and ethical agreements between stewards of research datasets and researchers wishing to make use of them. Amongst other things, these agreements aim to respect the reasonable expectations of the research participants who provided data and samples, as expressed in the consent process. Arguably, responsible data management and governance of data and sample use are foundational to the consent process in longitudinal studies and are an important s...

International Journal of Population Data Science
Robust and standardised licensing and governance frameworks are used to ensure that datasets intended for research use are made available under the terms and conditions specified by a data owner. The UK Data Service makes use of the Five Safes framework to operate its 3-tier data access policy and to ensure that data classified as personal data can be made available via appropriate legal gateways. This set of principles has gained traction with national statistics around the world, yet it is remarkably absent in the narrative of data access for health research. The health domain tends to focus on ‘data sharing agreements’, and less on training around trust, security and disclosure. The concept of a Safe Health Researcher is missing, yet is appealing. The UK Data Service has been piloting such a half-day course with colleagues in the health domain. The training helps consolidate the less well-defined idea of a ‘bona fide’ researcher, typically required by funders such as the UK’s Me...