Klebsiella pneumoniae O antigen genetics, structural diversity and nomenclature
We recently collaborated with Prof Chris Whitfield and colleagues at the University of Guelph to summarise the current state of knowledge on K. pneumoniae O polysaccharide (antigen) structures and genetics, and propose a new harmonised nomenclature that has been implemented in the latest version of Kaptive.
Prof Whitfield and team have been at the forefront of this research domain for many years, and have a very deep knowledge about these polysaccharide structures, the biochemical processes that result in their production and the associated genes. It was a pleasure to learn from Chris and his team, and contribute our expertise in K. pneumoniae population genomics to determine the O antigen variants that are represented among K. pneumoniae species complex genomes in the public domain.
This work is published in Whitfield et al. 2025. O-antigen polysaccharides in Klebsiella pneumoniae: structures and molecular basis for antigenic diversity. Microbiology and Molecular Biology Reviews 21:e0009023. The full text of the accepted version is available at the bottom of this page.
Why did we need a new nomenclature system?
The old system was built piecemeal and did not allow accurate discrimination of the full range of structures that are now known. There has also been some confusion in the literature, particularly for O1 and O2, which have been referred to by many different names. This arose partly because our understanding of the underlying biochemical processes and structural variation has evolved over time (e.g. until 2023 we did not know that there are two glycoforms of O1), and partly because the way in which we detect and report the associated loci and predicted types in Kaptive has also changed (e.g. early versions did not attempt to make an explicit ‘type’ prediction). We hope that the new harmonised nomenclature will help to alleviate this confusion, provide a stable system that allows precise discrimination of each structure, and be readily adaptable as we continue to learn more about these polysaccharides.
You can easily distinguish the new nomenclature from the old because only the new one uses Greek letters to mark different serotypes within a serogroup. The new nomenclature also includes both shorthand and extended antigen names, with the extended names reported in Kaptive.
Understanding the new nomenclature
The review article provides detailed information about the nomenclature scheme, and you’ll find an explainer for the O1 and O2 serogroup names below.
When you are looking at a Kaptive output, it helps to keep in mind the distinction between the O locus (i.e. the set of genes that encode the synthesis and assembly machinery to make the saccharide repeat unit) and the O type (the structural phenotype), which is determined by the products of the genes in the O locus plus the products of genes in other parts of the genome that act to extend the repeat units and/or add side chains and/or other modifications (acetylation, pyruvylation). The relevant genes are included and reported in Kaptive as ‘Extra genes.’ The O type reported by Kaptive reflects the combination of the O locus and any ‘Extra genes’ that are detected in a genome.
O1 and O2 serogroup nomenclatures
All O2 and O1 polysaccharides are built on the same backbone. This is an N-acetylglucosamine (GlcNAc) attached to a galactofuranose (Galf) + galactopyranose (Galp) repeating unit, together called O2⍺.

The O2β polysaccharide consists of an O2⍺ backbone with a galactopyranose side-chain on the repeat unit.

The O2𝛾 polysaccharide consists of an O2⍺ backbone with an acetylated galactofuranose within the repeat unit. Only ~40% of the galactofuranose residues are acetylated, resulting in a mixture of polysaccharides presented on the cell (O2⍺ and O2𝛾). Strains presenting this mixture of polysaccharides are therefore reported as O2⍺𝛾 in the extended nomenclature.

Any of the three O2 polysaccharides can be capped by additional polysaccharide repeat units, converting them to a different serogroup.
The O1⍺ cap is a repeating galactopyranose disaccharide unit.

The O1⍺ antigen can be converted to O1β through the addition of pyruvate. Just like the acetylation that converts O2⍺ to O2𝛾, the O1β pyruvylation is incomplete because it occurs on only ~50% of the individual O1 chains. As a result, strains with capacity to make O1β actually produce a mixture of O1⍺ and O1β polysaccharides and are therefore reported as O1⍺β.

In the extended nomenclature, the O1 label is followed by the relevant O2 repeat unit name e.g. O1⍺β,2⍺ = O1⍺β cap on an O2⍺ polysaccharide, O1⍺β,2β = O1⍺β cap on an O2β polysaccharide. This allows clear identification of each of the distinct polysaccharide structures.

The polysaccharides formerly known as O2ac have been assigned to a new serogroup, O11, for consistency with the definition of O1 as a separate serogroup. O11 polysaccharides consist of an O2 polysaccharide (any of O2⍺, O2β or O2⍺𝛾) with a galactopyranose + N-acetylglucosamine repeating cap. There are two variants of O11, without and with the addition of pyruvate, known as O11⍺ and O11β, respectively. Only a subset of the O11⍺ chains are pyruvylated, such that strains with the capacity to make O11β produce a mixture of O11⍺ and O11β, and are reported as O11⍺β. In the extended nomenclature this is followed by the relevant O2 repeat unit name, i.e. O11⍺β,2⍺ = O11⍺β cap on an O2⍺ polysaccharide, O11⍺β,2β = O11⍺β cap on an O2β polysaccharide.
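If you are post-processing type names in a script, the "cap,backbone" convention described above can be unpacked with a few lines of Python. This is only a sketch: the function and class names are hypothetical, the Greek characters are copied from this post (the exact characters in Kaptive output may differ), and it is only meant for O1, O2 and O11 serogroup names.

```python
from typing import NamedTuple, Optional


class OType(NamedTuple):
    """Hypothetical container for the two parts of an extended name."""
    cap: Optional[str]   # e.g. "O1⍺β" or "O11⍺β"; None when there is no cap
    backbone: str        # e.g. "O2⍺", "O2β" or "O2⍺𝛾"


def parse_extended_name(name: str) -> OType:
    """Split an extended O1/O2/O11 name into its cap and O2 backbone.

    Assumes the format described in this post: the cap label, a comma,
    then the O2 repeat unit name without its leading "O".
    """
    cap, sep, rest = name.partition(",")
    if not sep:
        # No comma: an uncapped O2 polysaccharide such as "O2β".
        return OType(cap=None, backbone=name)
    return OType(cap=cap, backbone="O" + rest)


print(parse_extended_name("O1⍺β,2β"))
print(parse_extended_name("O2⍺"))
```

Keeping the cap and backbone as separate fields makes it easy to group genomes by serogroup (the cap) while still tracking the underlying O2 glycoform.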

How do the new names match to those in the literature?
This table summarises the commonly used names of the most prevalent O1 and O2 polysaccharides.
| New name (extended nomenclature) | Old names |
| --- | --- |
| O1⍺β,2⍺ | O1ab, O1, O1v1 |
| O1⍺β,2β | O1ab, O1, O1v2 |
| O2⍺ | O2a, O2v1 |
| O2β | O2afg, O2v2 |
For a full breakdown of all of the nomenclature changes and matches to earlier names, see the review article and the Kaptive documentation pages.
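When reconciling older datasets with new Kaptive output, the table above can be encoded as a simple lookup. The sketch below contains only the four rows shown in this post (it is not the full mapping from the review or the Kaptive documentation), and the variable and function names are hypothetical.

```python
from typing import Dict, List

# New extended name -> old names, copied from the table in this post.
NEW_TO_OLD: Dict[str, List[str]] = {
    "O1⍺β,2⍺": ["O1ab", "O1", "O1v1"],
    "O1⍺β,2β": ["O1ab", "O1", "O1v2"],
    "O2⍺": ["O2a", "O2v1"],
    "O2β": ["O2afg", "O2v2"],
}


def candidates_for_old_name(old_name: str) -> List[str]:
    """Return every new name that an old name could correspond to."""
    return [new for new, olds in NEW_TO_OLD.items() if old_name in olds]


print(candidates_for_old_name("O1v2"))  # one candidate
print(candidates_for_old_name("O1"))    # two candidates: the old name is ambiguous
```

Returning a list rather than a single value is deliberate: old names such as "O1" and "O1ab" appear in more than one row, so they cannot be translated without additional information about the O2 backbone.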
How does this correspond to genetic loci?
The genes encoding the machinery for export and synthesis of O2⍺ are located at the O locus, OL2⍺. However, there are three different variants of OL2⍺, which differ from one another by ~20% nucleotide divergence. All three locus variants are included in the Kaptive database and labelled as OL2⍺.1, OL2⍺.2 and OL2⍺.3, respectively. OL2⍺.1 was called O1/O2v1 in earlier versions of the database.
The genes encoding the machinery for addition of the galactopyranose sidechain that converts O2⍺ to O2β are gmlA, gmlB and gmlC, together called gml2β. These genes are no longer considered as part of the O locus, and are included in the Kaptive database as ‘Extra genes,’ although when present they are generally located adjacent to OL2⍺ (in particular they are associated with OL2⍺.2, and in previous versions of the Kaptive database OL2⍺.2 was extended to include gml2β and was called O1/O2v2).
A single gene known as orf8 is required to convert O2⍺ to O2𝛾 via galactofuranose acetylation. Like gml2β, orf8 is no longer considered part of the O locus, although when present it is generally located adjacent to OL2⍺ (in particular associated with OL2⍺.3, and in previous versions of the Kaptive database OL2⍺.3 was extended to include orf8 and was called O1/O2v3).
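The relationship between the earlier locus names and the current locus plus ‘Extra genes’ representation can be summarised as a small lookup. This is a sketch based only on the three paragraphs above; the class and variable names are hypothetical and are not part of Kaptive.

```python
from typing import Dict, NamedTuple, Tuple


class LocusCall(NamedTuple):
    """Hypothetical record: a current locus name plus associated extra genes."""
    locus: str
    extra_genes: Tuple[str, ...]


# Earlier database locus name -> current locus and the genes that the
# earlier locus definition additionally included (now 'Extra genes').
OLD_LOCUS_TO_NEW: Dict[str, LocusCall] = {
    "O1/O2v1": LocusCall("OL2⍺.1", ()),
    "O1/O2v2": LocusCall("OL2⍺.2", ("gmlA", "gmlB", "gmlC")),  # gml2β
    "O1/O2v3": LocusCall("OL2⍺.3", ("orf8",)),
}

for old_name, call in OLD_LOCUS_TO_NEW.items():
    print(old_name, "->", call.locus, call.extra_genes)
```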
Just one gene is required to add the O1⍺ cap to an O2 antigen: wbbY, which is located well away from the O locus in K. pneumoniae genomes. wbbY is usually co-located with wbbZ, which encodes the pyruvyl transferase that converts O1⍺ to O1β. The vast majority of K. pneumoniae expressing O1 polysaccharides carry intact versions of both genes and are reported as O1⍺β (they produce a mixture of O1⍺ and O1β polysaccharides). However, a very small number of strains carry wbbY without an intact copy of wbbZ and therefore produce only the O1⍺ polysaccharide.
The wbmV and wbmW genes are required to add the O11⍺ cap to an O2 polysaccharide. The gene required for the addition of pyruvate and conversion to O11β is wbmX, generally co-located with wbmV and wbmW outside of the O locus. Strains carrying all three genes produce a mixture of O11⍺ and O11β polysaccharides and are reported as O11⍺β. Strains carrying wbmV and wbmW without an intact copy of wbmX are reported as O11⍺.
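The gene-to-type rules described in this section can be written out as a short decision function. This is an illustrative sketch of the logic as explained in this post, not Kaptive's implementation: the function name is hypothetical, it assumes the genome carries an OL2⍺ locus, it takes gene intactness as given, and it raises an error for gene combinations that this post does not cover. The Greek characters are copied from this post.

```python
from typing import Set

GML_2BETA = {"gmlA", "gmlB", "gmlC"}  # together called gml2β


def predict_o_type(intact_genes: Set[str]) -> str:
    """Derive an extended O type name from intact 'Extra genes'.

    Assumption: the genome has an OL2⍺ locus, so the default backbone is O2⍺.
    Assumption: all three gml genes are needed for the O2β side chain.
    """
    has_gml = GML_2BETA <= intact_genes
    has_orf8 = "orf8" in intact_genes
    if has_gml and has_orf8:
        raise ValueError("gml2β together with orf8 is not covered by this sketch")
    if has_gml:
        backbone = "O2β"
    elif has_orf8:
        backbone = "O2⍺𝛾"  # mixture of O2⍺ and O2𝛾 chains
    else:
        backbone = "O2⍺"

    has_o1_cap = "wbbY" in intact_genes
    has_o11_cap = {"wbmV", "wbmW"} <= intact_genes
    if has_o1_cap and has_o11_cap:
        raise ValueError("O1 and O11 cap genes together are not covered by this sketch")
    if has_o1_cap:
        cap = "O1⍺β" if "wbbZ" in intact_genes else "O1⍺"
    elif has_o11_cap:
        cap = "O11⍺β" if "wbmX" in intact_genes else "O11⍺"
    else:
        return backbone  # uncapped O2 polysaccharide

    # Extended name: cap label, comma, O2 repeat unit name without the "O".
    return f"{cap},{backbone[1:]}"


print(predict_o_type({"wbbY", "wbbZ"}))
print(predict_o_type({"gmlA", "gmlB", "gmlC", "wbbY", "wbbZ"}))
print(predict_o_type({"orf8", "wbmV", "wbmW"}))
```

In practice you should rely on the O type reported by Kaptive itself; a function like this is mainly useful for checking that you have understood how the locus and ‘Extra genes’ combine into the reported name.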
Accepted full text of the review article
Post by Kelly Wyres. Image credit, Tom Stanton.
