JMIR Bioinformatics and Biotechnology
Methods, web-based platforms, open data and open software tools for big data analytics, machine learning-based predictive models using genomic and imaging data, and information retrieval in biology and medicine.
JMIR Bioinformatics and Biotechnology is the official journal of the MidSouth Computational Biology and Bioinformatics Society.
Editor-in-Chief:
Ece D. Uzun, MS, Ph.D., FAMIA, Director of Clinical Bioinformatics, Lifespan Academic Medical Center; Associate Director, Center for Clinical Cancer Informatics and Data Science (CCIDS); and Associate Professor of Pathology and Laboratory Medicine, Alpert Medical School, Brown University, USA
CiteScore 2.2
Recent Articles

Background: Non-small cell lung cancer (NSCLC) is one of the leading causes of cancer-related mortality worldwide. PD-1 immunotherapy has shown promising results in the treatment of NSCLC; however, not all patients respond effectively to this treatment. Identifying predictive biomarkers for PD-1 therapy response is critical to improving patient outcomes and optimizing treatment strategies. Traditional methods of biomarker discovery often fall short in terms of accuracy and comprehensiveness, given their inability to effectively capture dependencies in multi-dimensional data. Recent advancements in deep learning provide a powerful approach to analyze complex genomic data and identify novel biomarkers that may predict therapeutic responses.

Plant-derived exosome-like nanovesicles (P-ELNs) effectively deliver bioactive compounds due to their high biocompatibility and low immunogenicity. Liquid chromatography-mass spectrometry (LC-MS) can profile compounds in complex samples, but analysis of the large datasets it produces remains limited by traditional methods. Recent advances in large language models (LLMs) and domain-specific systems now enhance Chinese biomedical data processing and cross-modal pharmaceutical research.

Manual abstraction of unstructured clinical data is often necessary for granular clinical outcomes research, but it is time-consuming and can be of variable quality. Large language models (LLMs) show promise in medical data extraction, yet integrating them into research workflows remains challenging and poorly described.



Bladder cancer is a disease marked by complex perturbations in gene networks and is heterogeneous in histology, mutations, and prognosis. Advances in high-throughput sequencing technologies, genome-wide association studies, and bioinformatics methods have yielded greater insight into the pathogenesis of complex diseases. Network biology-based approaches have been used to identify complex protein-protein interactions (PPIs) that can point to potential drug targets. There is a need to better understand PPIs specific to urothelial carcinoma.

Sensitivity, expressed as percent positive agreement (PPA) with a reference assay, is a primary metric for evaluating lateral-flow antigen tests (ATs), which are typically benchmarked against quantitative reverse transcription polymerase chain reaction (qRT-PCR). In SARS-CoV-2 diagnostics, ATs detect nucleocapsid protein, whereas qRT-PCR measures viral RNA copy numbers. Because observed PPA depends on the underlying viral-load distribution (proxied by the cycle threshold, or Ct, which is inversely related to load), study-specific sampling can bias sensitivity estimates. Cohort differences, such as enrichment for high- or low-Ct specimens, therefore complicate cross-test comparisons, and real-world datasets often deviate from regulatory guidance to sample across the full concentration range. Although logistic models relating test positivity to Ct are well described, they are seldom used to re-weight results to a standardized reference viral-load distribution. As a result, reported sensitivities remain difficult to compare across studies, limiting both accuracy and generalizability.
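The re-weighting idea in this abstract can be illustrated with a minimal Python sketch. It is not the article's method: the logistic coefficients, the Ct bins, and both Ct distributions below are hypothetical values chosen only to show how the same test yields different PPA estimates under different viral-load mixes. The function names `p_positive` and `standardized_ppa` are likewise assumptions made for illustration.

```python
import math

def p_positive(ct, b0, b1):
    """Logistic model: probability that the antigen test is positive at a given Ct."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * ct)))

def standardized_ppa(b0, b1, ct_weights):
    """Average the modeled positivity over a Ct distribution.

    ct_weights maps a Ct value to a non-negative weight; the weights are
    normalized here, so raw specimen counts are acceptable.
    """
    total = sum(ct_weights.values())
    return sum(w * p_positive(ct, b0, b1) for ct, w in ct_weights.items()) / total

# Hypothetical coefficients (not taken from any study). The negative slope
# means positivity falls as Ct rises, with 50% positivity at Ct = 30.
B0, B1 = 15.0, -0.5

# Hypothetical Ct distributions, given as specimen counts per Ct bin.
low_ct_cohort = {18: 5, 22: 4, 26: 3, 30: 2, 34: 1}   # enriched for high viral load
reference = {18: 1, 22: 1, 26: 1, 30: 1, 34: 1}       # assumed standardized reference

ppa_cohort = standardized_ppa(B0, B1, low_ct_cohort)
ppa_reference = standardized_ppa(B0, B1, reference)

print(f"PPA under the low-Ct cohort mix: {ppa_cohort:.3f}")
print(f"PPA re-weighted to the reference mix: {ppa_reference:.3f}")
```

Under these assumed inputs, the cohort enriched for low-Ct specimens gives a higher PPA than the same model averaged over the reference distribution, which is the sampling bias the abstract describes. In practice the coefficients would be fitted to paired AT and qRT-PCR results rather than fixed by hand.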

Integrating clinical, genomic, and social determinants of health (SDoH) data is essential for advancing precision medicine and addressing cancer health disparities. However, existing bioinformatics tools often lack the flexibility to perform equity-driven analyses or require significant programming expertise.

Ninety percent of the 65,000 human diseases are infrequent, collectively affecting ~400 million people, substantially limiting cohort accrual. This low prevalence constrains the development of robust transcriptome-based machine learning (ML) classifiers. Standard data-driven classifiers typically require cohorts of over 100 subjects per group to achieve clinical accuracy while managing high-dimensional input (~25,000 transcripts). These requirements are infeasible for micro-cohorts of ~20 individuals, where overfitting becomes pervasive.

The systemic treatment of cancer typically requires the use of multiple anticancer agents in combination and/or sequentially. Clinical narrative texts often contain extensive descriptions of the temporal sequencing of systemic anticancer therapy (SACT), setting up an important task that may be amenable to automated extraction of SACT timelines.