What is CirRFKB
CirRFKB (Circadian-related Risk Factors Knowledgebase for Cancer) is a comprehensive, manually curated knowledgebase that systematically catalogs and organizes circadian rhythm-related risk factors associated with cancer initiation, progression, and outcomes. It was designed to bridge the gap between chronobiology and oncology by integrating multidisciplinary knowledge from genetics, behavior, environment, and physiology.
CirRFKB draws data from 471 original research articles published up to Dec 31, 2023, retrieved from the PubMed database. Each entry in the knowledgebase includes detailed information about the type of circadian risk factor, associated cancer type, study context, and impact classification (risk, protective, non-influencing, or unclear).
In total, CirRFKB contains:
- 1,449 single factors and 340 combined factors, forming 4,052 standardized records
- Classification into genetic (681), environmental (106), physiological (244), and behavioral (418) categories
- Inclusion of 46 cancer types, with breast, prostate, and lung cancers being the most studied
CirRFKB serves as a centralized knowledge platform for researchers, clinicians, and public health experts to explore how circadian disruption influences cancer risk. It also supports hypothesis generation, risk prediction, and the development of personalized prevention strategies.
Why CirRFKB
Despite growing interest in the circadian clock’s role in cancer, current knowledge remains fragmented and inconsistent. Studies vary widely in terms of cancer types, study design, exposure assessments, and population characteristics. Furthermore, circadian-related findings are often buried in complex literature, making it difficult to extract and apply meaningful insights.
CirRFKB was developed to address these challenges by offering:
- Systematic data curation from high-quality, peer-reviewed human studies
- Standardized classification of circadian risk factors across cancer types
- User-friendly web access for browsing, searching, and downloading structured data
- Support for precision medicine by linking risk factors to clinical relevance
By aggregating evidence from diverse disciplines, CirRFKB allows researchers to explore not only individual risk factors but also complex interactions between genes, behaviors, and environments. It highlights both well-established mechanisms and research gaps, fostering new investigations in cancer prevention and chronotherapy.
Data Curation
CirRFKB was built through a rigorous and transparent data curation pipeline to ensure high-quality, reliable, and clinically relevant information. The following five steps were applied to construct the database:
- Literature Retrieval: A comprehensive PubMed search was conducted using circadian- and cancer-related keywords. Articles published up to Dec 31, 2023 were included.
- Screening & Selection: Of 5,941 initial articles, only original human research with full-text access and clear relevance to circadian risk and cancer was retained after a strict exclusion process.
- Manual Extraction: Trained curators manually extracted risk factor details from 471 studies, including cancer type, classification, population features, and reported outcomes.
- Standardization: Terminologies were normalized using authoritative resources, including:
  - NCBI Gene
  - UniProt
  - miRBase
  - ICD-10 (for cancer types)
- Data Modeling: Each record contains 50+ structured fields covering literature metadata, biological context, sample data, and statistical conclusions for easy querying and filtering.
This comprehensive curation process ensures that all entries in CirRFKB are traceable, reproducible, and suitable for downstream applications including computational modeling, cancer risk scoring, and clinical decision-making in precision oncology.
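As a rough illustration of the Data Modeling step, a single record could be represented as a small typed structure. This is a minimal sketch, not the actual CirRFKB schema: every field name below is hypothetical, only a handful of the 50+ fields are shown, and the example values are placeholders. The four impact classes and the four factor categories are the ones named in this document.

```python
from dataclasses import dataclass
from enum import Enum


class Impact(Enum):
    # The four impact classes named in the knowledgebase description.
    RISK = "risk"
    PROTECTIVE = "protective"
    NON_INFLUENCING = "non-influencing"
    UNCLEAR = "unclear"


@dataclass(frozen=True)
class FactorRecord:
    # All field names are hypothetical; the real schema has 50+ fields.
    pmid: str
    factor_name: str
    factor_category: str
    cancer_type: str
    icd10_code: str
    impact: Impact


def normalize_category(raw: str) -> str:
    """Map a free-text category label to one of the four categories."""
    allowed = {"genetic", "environmental", "physiological", "behavioral"}
    label = raw.strip().lower()
    if label not in allowed:
        raise ValueError(f"unknown factor category: {raw!r}")
    return label


record = FactorRecord(
    pmid="00000000",  # placeholder, not a real article
    factor_name="example factor",
    factor_category=normalize_category(" Behavioral "),
    cancer_type="breast cancer",
    icd10_code="C50",  # ICD-10 code for malignant neoplasm of breast
    impact=Impact.UNCLEAR,
)
print(record)
```

Using an enumeration for the impact class and a validated category label keeps free-text variants out of the stored records, which is the kind of consistency the Standardization step aims for.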
Knowledgebase Contents
CirRFKB provides a rich and structured resource composed of manually curated data on circadian-related cancer risk factors. The current release includes:
- 471 curated articles
- 4,052 total records
- 1,449 single factors and 340 combined factors
- Classified by category:
  - Genetic factors: 681
  - Behavioral factors: 418
  - Physiological factors: 244
  - Environmental factors: 106
- Classified by effect:
  - Risk factors: 323
  - Protective factors: 254
  - Non-influencing factors: 291
  - Unclear factors: 921
- Covers 46 cancer types, with breast, prostate, and lung cancers being the most studied
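The release figures can be cross-checked with simple arithmetic. The sketch below uses only the counts listed above. Reading the effect classes as covering single and combined factors together is an inference from the totals, not something stated explicitly here.

```python
# Counts as reported in the current release summary.
single_factors = 1449
combined_factors = 340

by_category = {
    "genetic": 681,
    "behavioral": 418,
    "physiological": 244,
    "environmental": 106,
}
by_effect = {
    "risk": 323,
    "protective": 254,
    "non-influencing": 291,
    "unclear": 921,
}

category_total = sum(by_category.values())
effect_total = sum(by_effect.values())

# The four categories account for every single factor.
assert category_total == single_factors
# The effect classes add up to single plus combined factors (inferred).
assert effect_total == single_factors + combined_factors

# Share of factors whose effect is still classified as unclear.
unclear_share = by_effect["unclear"] / effect_total
print(f"{unclear_share:.1%} of classified factors have an unclear effect")
```

Slightly more than half of the classified factors fall in the unclear class, which is consistent with the research gaps the knowledgebase aims to highlight.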
Platform Features
CirRFKB provides a responsive and interactive web interface tailored for efficient access, visualization, and contribution. Users can:
- Search by factor name, gene, or cancer type
- Filter results by factor category and biological effect
- Visualize trends using built-in statistics and plots
- Download the complete dataset for offline analysis
- Submit new evidence or corrections via the submission portal
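For offline analysis of the downloaded dataset, the same search and filter operations can be reproduced in a few lines. This is a sketch under assumptions: the export format (CSV here), the column names, and the sample rows are all made up for illustration and may not match the actual download.

```python
import csv
import io

# Stand-in for a downloaded export; column names and rows are made up.
SAMPLE_EXPORT = """factor_name,factor_category,cancer_type,effect
example factor A,behavioral,breast cancer,risk
example factor B,genetic,prostate cancer,unclear
example factor C,behavioral,lung cancer,protective
example factor D,behavioral,breast cancer,unclear
"""


def filter_records(rows, *, category=None, effect=None, cancer_type=None):
    """Keep rows matching every criterion that is not None."""
    criteria = {
        "factor_category": category,
        "effect": effect,
        "cancer_type": cancer_type,
    }
    active = {k: v.lower() for k, v in criteria.items() if v is not None}
    return [
        row for row in rows
        if all(row[col].strip().lower() == want for col, want in active.items())
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
matches = filter_records(rows, category="behavioral", cancer_type="breast cancer")
for row in matches:
    print(row["factor_name"], row["effect"])
```

To run this against a real export, replace the in-memory string with an opened file and adjust the column names to those in the downloaded file.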
Who Should Use CirRFKB?
CirRFKB is built to support diverse users across scientific and clinical domains. The database is especially useful for:
- Biomedical researchers investigating circadian biology and cancer
- Clinicians exploring chronotherapeutic risk factors in patient care
- Data scientists developing predictive or risk scoring models
- Students and educators in life science, public health, and precision medicine
Development Team
CirRFKB is the result of an international collaboration between research teams at:
- West China Hospital, Sichuan University (China)
- University of A Coruña (Spain)
Corresponding Authors:
- Prof. Bairong Shen, Sichuan University
- Prof. Juan Ramón Rabuñal Dopico, University of A Coruña
The platform was built by a multidisciplinary team with expertise in critical care medicine, chronobiology, oncology, systems genetics, and biomedical informatics. Their combined efforts ensure both scientific rigor and practical utility of the knowledgebase.
How to Cite
If you use CirRFKB in your research, please cite the following work:
Wang J, Zong H, Zhang Y, et al. CirRFKB: A Knowledgebase of Circadian-Related Risk Factors for Cancer Pathogenesis and Personalized Medicine. Computational and Structural Biotechnology Journal. 2025.
Contact Us
If you have any questions, suggestions, or feedback, please contact us:
E-mail: Bairong Shen (bairong.shen@scu.edu.cn); Jiao Wang (wjiao815@163.com)
Address: D5-A10, No.2222 Xinchuan Road, Chengdu 610041, Sichuan, China
Affiliation: Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University