2019, Frontiers in Medicine
https://doi.org/10.3389/fmed.2019.00034…
14 pages
1 file
For over a decade the term “Big data” has been used to describe the rapid increase in volume, variety and velocity of information available, not just in medical research but in almost every aspect of our lives. As scientists, we now have the capacity to rapidly generate, store and analyse data that, only a few years ago, would have taken many years to compile. However, “Big data” no longer means what it once did. The term has expanded and now refers not just to large data volumes, but to our increasing ability to analyse and interpret those data. Tautologies such as “data analytics” and “data science” have emerged to describe approaches to the volume of available information as it grows ever larger. New methods dedicated to improving data collection, storage, cleaning, processing and interpretation continue to be developed, although not always by, or for, medical researchers. Exploiting new tools to extract meaning from large volume information has the potential to drive real change in clinical practice, from personalized therapy and intelligent drug design to population screening and electronic health record mining. As ever, where new technology promises “Big Advances,” significant challenges remain. Here we discuss both the opportunities and challenges posed to biomedical research by our increasing ability to tackle large datasets. Important challenges include the need for standardization of data content, format, and clinical definitions, a heightened need for collaborative networks with sharing of both data and expertise and, perhaps most importantly, a need to reconsider how and when analytic methodology is taught to medical researchers. We also set “Big data” analytics in context: recent advances may appear to promise a revolution, sweeping away conventional approaches to medical science. However, their real promise lies in their synergy with, not replacement of, classical hypothesis-driven methods.
The generation of novel, data-driven hypotheses based on interpretable models will always require stringent validation and experimental testing. Thus, hypothesis-generating research founded on large datasets adds to, rather than replaces, traditional hypothesis driven science. Each can benefit from the other and it is through using both that we can improve clinical practice.
BMC Medical Research Methodology, 2019
Background: Use of big data is becoming increasingly popular in medical research. Since big data-based projects differ notably from classical research studies, both in terms of scope and quality, a debate is apt as to whether big data require new approaches to scientific reasoning different from those established in statistics and philosophy of science. Main text: The progressing digitalization of our societies generates vast amounts of data that also become available for medical research. Here, the big promise of big data is to facilitate major improvements in the treatment, diagnosis and prevention of diseases. An ongoing examination of the idiosyncrasies of big data is therefore essential to ensure that the field stays congruent with the principles of evidence-based medicine. We discuss the inherent challenges and opportunities of big data in medicine from a methodological point of view, particularly highlighting the relative importance of causality and correlation in commercial and medical research settings. We make a strong case for upholding the distinction between exploratory data analysis facilitating hypothesis generation and confirmatory approaches involving hypothesis validation. An independent verification of research results will be ever more important in the context of big data, where data quality is often hampered by a lack of standardization and structuring. Conclusions: We argue that it would be both unnecessary and dangerous to discard long-established principles of data generation, analysis and interpretation in the age of big data. While many medical research areas may reasonably benefit from big data analyses, they should nevertheless be complemented by carefully designed (prospective) studies.
Review of Public Administration and Management, 2015
Big Data is one of the trending topics in information management, marketing and also in healthcare. Big Data describes a new generation of technologies and architectures that allow us to extract value from massive data volumes and types by enabling high-velocity capture, discovery and analysis of distributed data. Big Data is characterized by the so-called four V's: volume and complexity of data, velocity of collecting, storing, processing and analyzing data, variety in relation to different types of data (structured, unstructured and semistructured), and veracity or 'data assurance' concerning data quality, integrity and credibility [1,2]. The International Medical Informatics Association (IMIA) working group on "Data Mining and Big Data Analytics" defined Big Data as data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it [3]. Nevertheless, it is essential to consider that several aspects of the concept and its application can vary by domain, depending on what kinds of software tools are available in each case and what size and types of datasets are more common in a particular field; each type of Big Data requires particular analysis methods [4].
Journal of the American Medical Informatics Association, 2009
Advances in high-throughput and mass-storage technologies have led to an information explosion in both biology and medicine, presenting novel challenges for analysis and modeling. With regards to multivariate analysis techniques such as clustering, classification, and regression, large datasets present unique and often misunderstood challenges. The authors' goal is to provide a discussion of the salient problems encountered in the analysis of large datasets as they relate to modeling and inference to inform a principled and generalizable analysis and highlight the interdisciplinary nature of these challenges. The authors present a detailed study of germane issues including high dimensionality, multiple testing, scientific significance, dependence, information measurement, and information management with a focus on appropriate methodologies available to address these concerns. A firm understanding of the challenges and statistical technology involved ultimately contributes to better science. The authors further suggest that the community consider facilitating discussion through interdisciplinary panels, invited papers and curriculum enhancement to establish guidelines for analysis and reporting.
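Among the challenges the abstract above names, multiple testing is perhaps the most mechanical to illustrate. A minimal sketch of Benjamini-Hochberg false discovery rate (FDR) control, a standard remedy when thousands of hypotheses are tested on one large dataset; the p-values are invented for illustration and this is not a method from the paper:

```python
# Benjamini-Hochberg procedure: control the expected proportion of false
# discoveries at level alpha across m simultaneous hypothesis tests.

def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a boolean list marking which hypotheses are rejected at FDR alpha."""
    m = len(pvalues)
    # Sort p-values, remembering their original positions.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= (k/m) * alpha.
    threshold_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            threshold_rank = rank
    # Reject every hypothesis whose sorted rank is at or below that threshold.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= threshold_rank:
            rejected[idx] = True
    return rejected

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(pvals, alpha=0.05))
# -> [True, True, False, False, False, False, False, False]
```

Note that a naive per-test threshold of 0.05 would have called five of these eight results significant; the step-up correction keeps only two.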
Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences
Sabina Leonelli, University of Exeter, [email protected] Debates about the emergence, significance and long-term impact of 'big data' have become ubiquitous across most scientific disciplines. Thanks to new technologies for generating and storing information, data production is said to have increased on an unprecedented scale, together with the expectation that data should be made freely accessible to global research networks as a common resource from which new knowledge can be harvested (as often emphasised by editorials in Nature and Science over the last decade). The biological and biomedical sciences are no exception, and are in fact widely seen as fields where the difficulties and potential rewards of handling big datasets are most pronounced. This is partly due to the huge diversity in the types of available data and organisms from which data are taken. The complexity of biological and biomedical phenomena is also seen as particularly challenging, especially given the rise of systemic/integrative approaches wishing to understand how entities and processes at different levels of organisation, ranging from genes and cells to organisms, populations and ecosystems, shape and construct each other. Further, the social and economic stakes in these areas are enormous, not only because of the vast investment of resources devoted to them by both the public and the private sectors, but also due to the tantalizing promises attached to biological and biomedical discovery. Biologists are expected to yield an understanding of life that helps humans to make sense of themselves and their role in their environment, while clinicians are charged with providing improved
Applied Sciences, 2021
Data science and machine learning are buzzwords of the early 21st century. Now pervasive throughout human civilization, how do these concepts translate to use by researchers and clinicians in the life-science and medical field? Here, we describe a software toolkit, just large enough in scale that it can be maintained and extended by a small team, optimised for problems that arise in small/medium laboratories. In particular, this system may be managed, from data ingestion through statistics preparation to predictions, by a single person. At the system’s core is a graph-type database, so that it is flexible in terms of irregular, constantly changing data types, as such data types are common during explorative research. At the system’s outermost shell, the concept of ’user stories’ is introduced to help the end-user researchers perform various tasks separated by their expertise: these range from simple data input and curation, through statistics, and finally to predictions via machine learning algorithm...
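The appeal of a graph-type store for irregular, evolving research data can be sketched in a few lines: nodes carry arbitrary key/value properties, and typed edges link them, so new attribute keys and relations can appear at any time without a schema migration. This is a hypothetical illustration, not the paper's actual toolkit; all names (GraphStore, "assayed_by", etc.) are invented:

```python
# Minimal property-graph store: schema-free nodes plus typed edges.

class GraphStore:
    def __init__(self):
        self.nodes = {}   # node_id -> dict of properties (keys vary per node)
        self.edges = []   # (source_id, relation, target_id) triples

    def add_node(self, node_id, **props):
        # New property keys can appear at any time -- no fixed schema.
        self.nodes.setdefault(node_id, {}).update(props)

    def add_edge(self, source, relation, target):
        self.edges.append((source, relation, target))

    def neighbours(self, node_id, relation):
        # All targets reachable from node_id via the given relation.
        return [t for s, r, t in self.edges if s == node_id and r == relation]

store = GraphStore()
store.add_node("sample42", organism="mouse", tissue="liver")
store.add_node("assay7", platform="RNA-seq")      # different keys: fine
store.add_edge("sample42", "assayed_by", "assay7")
print(store.neighbours("sample42", "assayed_by"))  # -> ['assay7']
```

A relational table would force every record into one column set up front; here each node keeps only the properties it actually has, which is why graph models suit explorative work where the data model changes weekly.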
Clinical Rheumatology, 2020
Health informatics and biomedical computing have introduced the use of computer methods to analyze clinical information and provide tools to assist clinicians during the diagnosis and treatment of diverse clinical conditions. With the amount of information that can be obtained in the healthcare setting, new methods to acquire, organize, and analyze the data are being developed each day, including new applications in the world of big data and machine learning. In this review, we first present the most basic concepts in data science, including the structural hierarchy of information and how it is managed. A section is dedicated to discussing topics relevant to the acquisition of data, importantly the availability and use of online resources such as survey software and cloud computing services. Along with digital datasets, these tools make it possible to create more diverse models and facilitate collaboration. Afterwards, we describe concepts and techniques in machine learning used to process and analyze health data, especially those most widely applied in rheumatology. Overall, the objective of this review is to aid in the comprehension of how data science is used in health, with a special emphasis on the relevance to the field of rheumatology. It provides clinicians with basic tools on how to approach and understand new trends in health informatics analysis currently being used in rheumatology practice. If clinicians understand the potential use and limitations of health informatics, this will facilitate interdisciplinary conversations and continued projects relating to data, big data, and machine learning.
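To make "machine learning applied to clinical data" concrete, here is one of the simplest supervised techniques, a one-nearest-neighbour classifier: a new patient is assigned the label of the most similar patient already seen. The feature choice ([ESR, CRP] lab values) and all numbers are invented for illustration and do not come from the review:

```python
import math

def nearest_neighbour_predict(train_X, train_y, x):
    """Predict the label of x as the label of its closest training point."""
    distances = [math.dist(row, x) for row in train_X]
    return train_y[distances.index(min(distances))]

# Hypothetical lab values per patient: [ESR (mm/h), CRP (mg/L)]
train_X = [[10, 2], [12, 3], [55, 40], [60, 35]]
train_y = ["inactive", "inactive", "active", "active"]

print(nearest_neighbour_predict(train_X, train_y, [50, 30]))  # -> active
```

Real clinical models are of course far more elaborate (regularized regression, gradient boosting, neural networks), but they share this structure: labelled historical records in, a prediction for a new record out, and the same caveats about data quality and validation the review emphasises.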
Molecular cancer research : MCR, 2016
The Cancer Target Discovery and Development (CTD2) Network was established to accelerate the transformation of "Big Data" into novel pharmacological targets, lead compounds, and biomarkers for rapid translation into improved patient outcomes. It rapidly became clear in this collaborative network that a key central issue was to define what constitutes sufficient computational or experimental evidence to support a biologically or clinically relevant finding. This manuscript represents a first attempt to delineate the challenges of supporting and confirming discoveries arising from the systematic analysis of large-scale data resources in a collaborative work environment and to provide a framework that would begin a community discussion to resolve these challenges. The Network implemented a multi-Tier framework designed to substantiate the biological and biomedical relevance as well as the reproducibility of data and insights resulting from its collaborative activities. The sa...
American Journal of Bioethics, 2023
1 To make the central point of our paper, our preferred term is nonhypothesis-driven approach. But such an approach can go by many other names: data-driven, big data, agnostic, unbiased, untargeted, hypothesis-free, or holistic. When intended to (later on) generate hypotheses and/or identify causal connections, it can also be called: discovery-based, discovery-driven, exploratory, or hypothesis-generating.
…minimalization, bias, and the perceived conflict between data science and clinical medicine. Furthermore, we argue that these conclusions each actively enable, rather than impede, the ethical and epistemic value of the further development of data-driven statistical/AI-based models as a crucial emerging technology for biomedical research and innovation.
SAGE Open Medicine, 2020
Universally, the volume of data has increased, with the collection rate doubling every 40 months since the 1980s. “Big data” is a term that was introduced in the 1990s to describe data sets too large to be handled with common software. Medicine is a major field predicted to increase the use of big data in 2025. Big data in medicine may be used by commercial, academic, government, and public sectors. It includes biologic, biometric, and electronic health data. Examples of biologic data include biobanks; biometric data may include individual wellness data from devices; electronic health data include the medical record; and other data include demographics and images. Big data has also contributed to changes in research methodology. Changes in the clinical research paradigm have been fueled by large-scale biological data harvesting (biobanks), which is developed, analyzed, and managed by cheaper computing technology (big data), supported by greater flexibility in study design (real-world data)...
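The "doubling every 40 months" figure implies exponential growth, and it is worth seeing what that compounds to. A quick back-of-envelope calculation (the 40-month period is from the abstract; the 360-month span below is just an example):

```python
# Cumulative growth factor for a quantity that doubles every fixed period.

def growth_factor(months, doubling_period=40):
    """Factor by which a doubling-every-`doubling_period`-months quantity grows."""
    return 2 ** (months / doubling_period)

# 360 months (30 years) at one doubling per 40 months = nine doublings.
print(growth_factor(360))  # -> 512.0
```

So three decades of such growth multiplies data volume roughly five-hundred-fold, which is why tooling that was adequate in the 1990s cannot simply be scaled up today.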