Neha H Suresh, Chaitra A Kilpady, Akshata A Kamath, Ashikh Ahmed, Tinku Thomas, Anura V Kurpad, Ambily Sivadas, CobVar—a comprehensive resource of vitamin B12-associated genomic variants, Database, Volume 2025, 2025, baaf049, https://doi.org/10.1093/database/baaf049
Abstract
The importance of vitamin B12 (cobalamin) in numerous biological processes, including DNA synthesis and cellular energy production, underscores the need for therapeutic and public health strategies to address B12 insufficiency/deficiency in the population. Genetic variations in pathways influencing cobalamin absorption, transport, and metabolism can affect various direct and indirect measures of vitamin B12 status. To facilitate a structured approach to studying these genetic factors, we aimed to systematically curate and create a user-friendly web database that offers comprehensive data on genetic variants influencing B12 biomarkers. A PubMed search was performed for 5 B12 traits [total serum/plasma B12, holotranscobalamin (active B12), total transcobalamin, holohaptocorrin, and methylmalonic acid] resulting in 493 research publications, of which 47 relevant publications were reviewed further. The database backend was built using MongoDB and the web interface was coded in PHP, JavaScript, HTML, and CSS on an Apache HTTP server. We have manually curated and compiled the Cobalamin Associated Genetic Variant (CobVar) database, comprising a total of 324 genetic variant associations for 5 different vitamin B12 traits involving 222 unique genetic variants and 84 genes identified across several genome-wide association studies and candidate gene studies. About one-third of the total genetic variant associations have been reported in >1 independent study and 15 variants in >1 ethnic group. The FUT2 gene showed the maximum number of associations for total serum/plasma B12 (N = 39), followed by MTHFR (N = 24) and TCN2 (N = 23). The database is accessible online at https://datatools.sjri.res.in/VBG/. CobVar is a vital resource for researchers and nutritionists, offering quick access to the latest developments in B12-related genetic variant research and serving as a valuable tool for advancing personalized treatment.
Database URL: https://datatools.sjri.res.in/VBG/
Introduction
Cobalamin, commonly known as vitamin B12, is an indispensable water-soluble micronutrient, primarily found in animal food sources. It is essential for various physiological processes, including normal functioning of the nervous system and the development and maturation of red blood cells. It serves as a cofactor for two critical enzymes: cytosolic methionine synthase, necessary for nucleic acid synthesis through the regeneration of tetrahydrofolate, and mitochondrial methylmalonyl-CoA mutase, which supports the citric acid cycle and heme synthesis [1,2]. It also plays key roles in DNA synthesis, the methylation cycle, and energy production, and is crucial for maintaining cell membranes and the myelin sheath, which protects nerves and ensures proper transmission. Vitamin B12 deficiency can manifest in both clinical and sub-clinical forms, with significant health implications. Clinical B12 deficiency (usually defined as a total serum B12 of <150 pmol/L) may manifest as neuropathy, cognitive impairment, and anaemia [3], while sub-clinical deficiency (usually defined as a total serum B12 of <200 pmol/L), though asymptomatic, is equally important to detect as it can still cause metabolic disruptions over time [4].
Assessing vitamin B12 status involves several biomarkers that provide insight into both circulating levels and the functional impact of deficiency. Total circulating B12 is a commonly used measure that reflects the overall concentration of vitamin B12 in the blood, though it may not always accurately reflect tissue availability or early deficiency [5]. Holotranscobalamin (holoTC), the biologically active form of B12 bound to transcobalamin, offers a more precise indicator of B12 available for cellular uptake, making it a sensitive marker for detecting early deficiency. Holohaptocorrin, another form of B12 bound to haptocorrin, represents a storage form of B12 that is not readily available to cells, and its clinical utility is less direct compared to holoTC. Functional biomarkers like methylmalonic acid (MMA) and total homocysteine (tHcy) accumulate when B12 levels are insufficient. Elevated MMA is a specific marker for B12 deficiency, as B12 is required for its metabolism. tHcy, while indicative of B12 deficiency, can also be elevated due to low levels of folate, riboflavin, or vitamin B6. These functional biomarkers are especially useful for detecting subclinical B12 deficiency and early changes in B12 status. Together, these biomarkers provide a comprehensive evaluation of an individual’s vitamin B12 status.
Vitamin B12 biomarkers exhibit significant inter-individual variability, influenced by several factors, including age, diet, gastrointestinal health, pregnancy, other comorbidities and medications, and genetic factors [6]. Studies show that 59% of the variability in plasma vitamin B12 levels can be attributed to genetic factors [7]. Over the years, numerous candidate gene studies and genome-wide association studies (GWAS) have identified multiple genetic markers linked to vitamin B12 status. Candidate gene studies, which focus on specific genes involved in B12 pathways, have uncovered polymorphisms in genes such as FUT2, TCN2, and MTHFR, all of which play crucial roles in B12 absorption, distribution, metabolism/utilization, and excretion (ADME) [8–10]. TCN1, CUBN, and TCN2 are involved in the initial absorption and transport phases, ensuring that B12 reaches tissues where it is needed (Fig. 1). The metabolism of cobalamin associated C gene (MMACHC), MTR, MTHFR, and MMUT facilitate B12’s metabolic functions in DNA synthesis and mitochondrial energy production, while CD320 aids in cellular uptake. More recently, large-scale GWAS have expanded this understanding by scanning the entire genome, leading to the identification of several novel genetic loci associated with various B12 biomarkers [11].

Figure 1. A summary schematic of the key genes involved in the absorption, transport, and metabolic utilization of B12 across different body compartments. TCN1: transcobalamin I (haptocorrin); TCN2: transcobalamin II; GIF: gastric intrinsic factor (IF); FUT2: fucosyltransferase 2; FUT6: fucosyltransferase 6; CUBN: cubilin; PON1: paraoxonase 1; CD320 (TCblR): transcobalamin receptor; MMACHC: methylmalonic aciduria and homocystinuria type C; LMBRD1: LMBR1 domain containing 1; ABCD4: ATP binding cassette subfamily D member 4; MTHFR: methylenetetrahydrofolate reductase; MTR: 5-methyltetrahydrofolate-homocysteine methyltransferase; MTRR: methionine synthase reductase; DNMT2: DNA methyltransferase 2; MS4A3: membrane spanning 4-domains A3; MUT: methylmalonyl-CoA mutase; MMAB: methylmalonic aciduria type B (adenosyltransferase); MMAA: methylmalonic aciduria type A (GTPase); CLYBL: citrate lyase beta-like; PRELID2: PRELI domain containing protein 2.
Despite the progress made through these diverse and isolated studies, there remains a significant gap in consolidating this information into a unified resource. Currently, there is no comprehensive, curated database that compiles all known genetic variants associated with vitamin B12 status and biomarkers, along with evidence of replication across multiple studies and ethnic groups. This lack of integration poses a challenge for efficiently accessing and interpreting genetic data, limiting its potential application in precision nutrition and clinical care. To address this, we aimed to systematically curate all B12-associated genetic variants reported in the literature and develop an online centralized database, making this valuable information easily accessible for researchers, nutritionists, and clinicians alike.
Materials and methods
Data curation from published peer reviewed literature
We searched in PubMed (http://www.ncbi.nlm.nih.gov/pubmed) and identified 493 original research publications (as of 25 August 2024) for curation of B12-associated genetic variants. The search string utilized a combination of MeSH Terms and plain text of relevant keywords—(humans[MeSH Terms]) AND ((‘vitamin b12/blood’[MeSH Terms] OR Vitamin B12/metabolism[MeSH Terms] OR Vitamin B12 Deficiency/genetics[MeSH Terms] OR Vitamin B12 Deficiency/blood[MeSH Terms]) AND ((association study, genome wide[MeSH Terms] OR Genetic Association Studies[MeSH Terms]) OR ((polymorphism, single nucleotide[MeSH Terms] OR Polymorphism, Genetic[MeSH Terms]) AND (genotype[MeSH Terms])))) OR ((‘Vitamin B12’[Title] OR B12[Title] OR Cobalamin[Title] OR homocysteine[Title]) AND (vitamin B12 [Title/Abstract] OR cobalamin[Title/Abstract]) AND (Allele[Title/Abstract] OR Polymorphism[Title/Abstract] OR Polymorphisms[Title/Abstract])) NOT (review[Publication Type]).
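A search string like the one above can also be submitted programmatically through the NCBI E-utilities ESearch endpoint. The following is a minimal sketch of how such a request URL could be assembled; it is not the authors' procedure, the helper name `build_esearch_url` is hypothetical, and the example query is a shortened string assembled from terms in the search above rather than the full query.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint for querying Entrez databases.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, retmax: int = 500) -> str:
    """Return an ESearch URL asking PubMed for up to `retmax` PMIDs as JSON.

    `build_esearch_url` is a hypothetical helper written for this sketch.
    """
    params = {
        "db": "pubmed",     # search the PubMed database
        "term": term,       # Boolean query with field tags such as [MeSH Terms]
        "retmax": retmax,   # maximum number of PMIDs returned
        "retmode": "json",  # JSON instead of the default XML
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Shortened illustration only; NOT the full search string used for curation.
example_term = (
    "(humans[MeSH Terms]) AND (Vitamin B12 Deficiency/genetics[MeSH Terms]) "
    "NOT (review[Publication Type])"
)
example_url = build_esearch_url(example_term)
```

The sketch only builds the URL; fetching it and parsing the returned PMID list is left out so that the example runs without network access.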
After manual inspection of the 493 publications, 446 articles were excluded as they did not report primary genetic associations with B12-related traits or focused only on functional or mechanistic aspects without reporting variant-level association results. Forty-seven relevant publications were systematically reviewed to collate the reported genetic variant associations with five B12-related traits (total serum/plasma B12, holoTC, holoHC, MMA, and totalTC). Various parameters from the publications including the method of study, study design and geographical location, ethnicity, etc. were documented in a pre-formatted tabular template.
Database and interface design
The backend of the Cobalamin Associated Genetic Variant (CobVar) database is built with open-source web technologies (MongoDB, PHP, and JavaScript) and hosted on an Apache HTTP server, creating a robust system for managing and querying datasets of genetic variants. MongoDB serves as the primary database, allowing for flexible data storage across various fields related to genetic associations, such as ‘dbSNP ID’, ‘Gene’, and ‘Phenotype/Trait’. PHP handles server-side processing: it receives requests, executes queries, and dynamically generates responses based on user input. This integration enables efficient data retrieval and ensures that users can access relevant information swiftly.
On the client side, JavaScript enhances user interaction and experience by enabling dynamic features such as real-time updates and interactive charts. The use of AJAX further optimizes performance, providing an efficient way to load data asynchronously while maintaining a responsive interface. The key annotations related to the genes and the genetic variants have been presented with external links to databases such as GeneCards and dbSNP.
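To make the document-store model concrete, the sketch below shows what one association record and an equality query over it might look like. It is illustrative only: the three field names ‘dbSNP ID’, ‘Gene’, and ‘Phenotype/Trait’ come from the description above, while the remaining key, the record shape, and the helpers `matches` and `find` are assumptions. The matching is done in plain Python to mimic a MongoDB equality filter, so no database server is needed to run it.

```python
from typing import Any

# One hypothetical association record. Only the first three field names are
# taken from the text; "PubMed ID" and the overall shape are assumptions.
association = {
    "dbSNP ID": "rs1801133",
    "Gene": "MTHFR",
    "Phenotype/Trait": "Total serum/plasma B12",
    "PubMed ID": None,  # placeholder; no specific study is implied
}


def matches(document: dict[str, Any], query: dict[str, Any]) -> bool:
    """Mimic a MongoDB equality filter: every queried field must be equal."""
    return all(document.get(field) == value for field, value in query.items())


def find(collection: list[dict[str, Any]], query: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the documents in `collection` that satisfy `query`."""
    return [doc for doc in collection if matches(doc, query)]


collection = [association]
hits = find(collection, {"Gene": "MTHFR", "Phenotype/Trait": "Total serum/plasma B12"})
```

In a real deployment the same query dictionary would be handed to the database driver instead of the in-memory `find` helper.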
To keep up with the rapid advancements in this field, the CobVar database will be updated annually. Each version will be time-stamped and versioned, even in cases where no new variant annotations are identified, to ensure transparency. Updates will include newly published data from both peer-reviewed articles and preprints indexed in PubMed and bioRxiv/medRxiv.
Results
Data curation
The database catalogues a total of 324 genetic variant associations for 5 distinct vitamin B12-related traits involving 222 unique variants and 84 genes. The data were systematically curated from a total of 47 peer-reviewed publications, comprising 34 candidate gene studies and 13 GWAS (Fig. 2).

Figure 2. A schematic workflow of data collection and database integration for the CobVar database.
The studies were conducted primarily across five ethnic groups, with individuals of European ancestry, represented by population groups such as Irish, Icelandic, Danish, Greek, Norwegian, and Canadian, comprising the largest proportion (43%) followed by admixed American ancestry (primarily Brazilian population) and those of South Asian ancestry (which included Indian and Pakistani populations). In our dataset, 33.6% of the total genetic variant associations have been reported in >1 independent study. This includes two total B12 associations, methylenetetrahydrofolate reductase (MTHFR) gene variant rs1801133 and transcobalamin II (TCN2) gene variant rs1801198, which were replicated in >10 independent studies. An additional seven total B12 associations were replicated in ≥5 studies [rs526934: transcobalamin I (TCN1); rs602662: fucosyltransferase 2 (FUT2); rs601338: FUT2; rs1805087: 5-methyltetrahydrofolate-homocysteine methyltransferase (MTR); rs1801131: MTHFR; rs492602: FUT2; rs1141321: methylmalonyl-CoA mutase (MMUT)]. Two variants, rs1801133 (MTHFR) and rs1805087 (MTR), were replicated in >3 ethnic groups, 7 variants in >2 ethnic groups, and 15 variants in >1 ethnic group.
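Replication summaries of this kind can be derived by grouping curated rows by variant and trait. The sketch below shows one way to do so under stated assumptions: an "independent study" is taken to mean a distinct PubMed ID, and the rows are made-up placeholders, not CobVar records.

```python
from collections import defaultdict

# Toy rows of (variant, trait, PubMed ID, ethnic group).
# All values are placeholders invented for illustration.
rows = [
    ("rsA", "total B12", "PMID-1", "European"),
    ("rsA", "total B12", "PMID-2", "South Asian"),
    ("rsA", "total B12", "PMID-2", "South Asian"),  # duplicate row, same study
    ("rsB", "holoTC", "PMID-3", "European"),
    ("rsC", "MMA", "PMID-4", "European"),
]

studies = defaultdict(set)        # (variant, trait) -> distinct PubMed IDs
ethnic_groups = defaultdict(set)  # variant -> distinct ethnic groups
for variant, trait, pmid, ethnicity in rows:
    studies[(variant, trait)].add(pmid)
    ethnic_groups[variant].add(ethnicity)

# Associations reported in more than one distinct study.
replicated = {key for key, pmids in studies.items() if len(pmids) > 1}
share_replicated = len(replicated) / len(studies)

# Variants reported in more than one ethnic group.
multi_ethnic = {v for v, groups in ethnic_groups.items() if len(groups) > 1}
```

Using sets rather than counters means that a study listed twice for the same association is counted once.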
The largest number of associations curated was for circulating levels of total serum/plasma B12, with a total of 155 variants (48%; Fig. 3). HoloTC showed 68 associations followed by holoHC with 67 associations. Maximum variant overlap was observed for total B12 and holoHC.

Figure 3. An overlap analysis of genetic variant associations across B12 traits.
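An overlap analysis across traits reduces to pairwise set intersections of the variants reported for each trait. The following is a minimal sketch with placeholder identifiers; the sets are constructed for illustration and do not reproduce the counts reported above.

```python
from itertools import combinations

# Toy variant sets per trait; identifiers are placeholders, not CobVar data.
variants_by_trait = {
    "total B12": {"rs1", "rs2", "rs3", "rs4"},
    "holoTC": {"rs2", "rs5"},
    "holoHC": {"rs1", "rs3", "rs6"},
}

# Shared variants for every unordered pair of traits.
overlap = {
    (a, b): variants_by_trait[a] & variants_by_trait[b]
    for a, b in combinations(variants_by_trait, 2)
}

# Trait pair sharing the most variants.
largest_pair = max(overlap, key=lambda pair: len(overlap[pair]))
```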
Several genes involved in the absorption, distribution, metabolism/utilization, and excretion pathways of vitamin B12 have been associated with different B12 traits. The FUT2 gene showed the maximum number of associations for total serum/plasma B12, with 39 variants, followed by MTHFR (24 variants) and TCN2 (23 variants) (Fig. 4a). Similarly, for holoTC, the cubilin gene (CUBN) showed the maximum number of associations (14 variants) followed by TCN2 (12 variants) and CD320 molecule (4 variants) (Fig. 4b). HoloHC showed maximum associations with FUT2 gene (13 variants), followed by Izumo Sperm-Oocyte Fusion 1 gene (IZUMO1) (18 variants) and CUBN (5 variants) (Fig. 4c). The MMA phenotype showed maximum associations with HIBCH (two variants), while totalTC showed one variant association each in the CUBN, FUT2, MMUT, MTR, TCN1, and TCN2 genes (Fig. 4d and e).

Figure 4. Distribution of genes reported to be associated with different B12 traits: (a) total plasma/serum B12, (b) holoTC (active B12), (c) holohaptocorrin (holoHC), (d) MMA, and (e) total transcobalamin (totalTC) levels. The number in parentheses next to each gene name represents the count of reported variants within that gene.
Database and web interface features
The database features a user-friendly interface that enables users to textually search based on various query terms, including dbSNP ID, chromosome-position, gene name, and specific B12 traits (Fig. 5). The search results are displayed in a concise table of matching genetic variant associations, with complete information accessible by clicking on individual rows (Fig. 6a). The variant associations are organized into two main sections: the Variant section (Fig. 6b) and the Association section (Fig. 6c). The Variant section provides key annotations related to the genetic variant, with external links to databases such as dbSNP and GeneCards for further details. The Association section presents a series of study results (referenced by PubMed IDs), displaying relevant association statistics and study details in a sequential format. The study details include ethnicity, geographical location, specific B12 trait, and association statistics including effect size and P-values. The Home page includes an interactive pie chart summarizing the gene distribution across various trait associations, along with a brief overview of key data statistics. Additionally, a comprehensive user manual is available to guide users in navigating and effectively utilizing the database. The database can be freely accessed at https://datatools.sjri.res.in/VBG/.
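A text search that accepts several kinds of query terms has to decide which field a given input refers to. The sketch below illustrates one possible classification step. The input formats (an `rs` prefix for dbSNP IDs, `chromosome:position` for coordinates), the trait labels, and the function `classify_query` are assumptions made for this example; the parsing rules actually used by the database are not described in the text.

```python
import re

# Assumed input formats, chosen for illustration only.
RSID = re.compile(r"^rs\d+$", re.IGNORECASE)
CHROM_POS = re.compile(r"^(?:chr)?([0-9]{1,2}|X|Y|MT?):(\d+)$", re.IGNORECASE)
TRAITS = {"total b12", "holotc", "holohc", "mma", "totaltc"}


def classify_query(text: str) -> str:
    """Guess which search field a free-text query refers to."""
    q = text.strip()
    if RSID.match(q):
        return "dbSNP ID"
    if CHROM_POS.match(q):
        return "chromosome-position"
    if q.lower() in TRAITS:
        return "trait"
    # Anything else is treated as a gene symbol in this sketch.
    return "gene name"
```

For example, `classify_query("rs1801133")` returns `"dbSNP ID"` and `classify_query("MTHFR")` falls through to `"gene name"`.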

Figure 5. Screenshots of the CobVar database application featuring the Home page and the different search query formats that can be used.

Figure 6. Screenshots of (a) the expanded database query results, (b) the detailed view of the Variant section, and (c) the Association section.
Discussion
The CobVar database offers a centralized platform that consolidates curated genetic data linked to B12 homeostasis, enabling a deeper understanding of the genetic factors influencing B12 absorption, transport, clearance, and utilization. The database helps users examine variant associations that have been replicated across multiple studies and diverse populations, enabling assessment of reproducibility and strength of evidence, while ensuring transparency for research and potential clinical use. While the curation process was population-agnostic, the current evidence remains skewed towards European-ancestry cohorts; therefore, users are advised to interpret associations observed in underrepresented populations with caution, particularly when replication across studies or ethnic groups is limited. The database can also be utilized to generate polygenic risk scores for populations by aggregating the impact of multiple genetic variants associated with vitamin B12 biomarkers, enabling more accurate assessments of genetic predispositions to B12 deficiency/insufficiency and other related health outcomes, including anaemia and neurological disorders.
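In its simplest additive form, a polygenic score for an individual is the sum, over variants, of the reported effect size multiplied by the number of effect alleles carried (0, 1, or 2). The sketch below illustrates that calculation only; the variant identifiers and weights are placeholders, and `polygenic_score` is a hypothetical helper, not a feature of the database. Effect sizes drawn from different studies, traits, or units would need harmonizing before being combined in practice.

```python
def polygenic_score(dosages: dict[str, float], weights: dict[str, float]) -> float:
    """Sum effect size x effect-allele dosage over variants present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared)


# Placeholder per-allele effect sizes (not values from CobVar).
weights = {"rsA": 0.5, "rsB": -0.25, "rsC": 0.125}

# Placeholder genotype of one individual, coded as effect-allele counts.
# "rsD" has no weight and "rsC" was not genotyped, so both are skipped.
dosages = {"rsA": 2, "rsB": 1, "rsD": 2}

score = polygenic_score(dosages, weights)
```

Restricting the sum to variants present in both inputs keeps the score defined when genotype data are incomplete, at the cost of making scores less comparable between individuals with different missing variants.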
The database will also enable nutritionists to develop personalized nutrition strategies by incorporating genetic data into dietary recommendations. As genetic testing becomes more accessible, understanding individual differences in B12 status can help tailor supplementation and dietary plans based on a person’s genetic profile. This approach can optimize B12 intake, ensuring that patients and clients receive the most effective nutritional advice, particularly those at risk for deficiency due to genetic factors. This database thus serves as a valuable tool for advancing both precision nutrition and genomic research.
Conclusion
With emerging evidence on novel and population-specific genetic variants associated with various direct and indirect measurements of vitamin B12 status, there is increased scope for exploring the physiological and molecular underpinnings of the ADME pathways associated with vitamin B12 in humans. These findings carry important consequences for understanding the genetic contributors to B12 deficiency and tailoring treatment strategies. This resource is poised to become a vital hub for researchers and nutritionists, offering free and quick access to the latest developments in B12-related genetic variant research.
Acknowledgements
We thank Vaishnavi Chevvu for her assistance in literature review and data curation for the database.
Author contributions
A.S. and N.H.S. carried out this project and wrote the manuscript. A.S. was responsible for the conception and design of the project. Data curation was primarily carried out by C.A.K., A.A.K., and N.H.S. N.H.S. took primary responsibility for database and web interface development. A.A. and T.T. reviewed the database and implemented the public server setup. C.A.K., A.A.K., and N.H.S. reviewed and tested the content and functionality of the online web database. N.H.S. also performed the data analysis for the manuscript. A.V.K. and T.T. critically reviewed the manuscript, provided feedback, and approved the manuscript.
Conflict of interest
None declared.
Funding
This research was supported by the Department of Biotechnology/Wellcome Trust India Alliance Clinical/Public Health Research Centre Grant # IA/CRC/19/1/610006 to A.V.K.
Data availability
CobVar database is publicly available at https://datatools.sjri.res.in/VBG/.