A deep learning approach to the genetic prediction of blood group antigens

20 Jun 2023

This session took place on June 20, 2023, during the 33rd Regional ISBT Congress, held in Gothenburg, Sweden, from June 17 to 21.

The "Diving deep into blood groups" session included the following presentations:

1. Vanja Karamatic Crew: The new Er blood group system: Piezo and its role in RBC physiology
2. Gloria Wu: Elucidating the Blood Group Regulome
3. Camous Moslemi: A deep learning approach to the genetic prediction of blood group antigens
4. Gian-Andri Thun: Intriguing outcomes from Nanopore sequencing of two cryptic A3 samples: a case of blood group mosaicism and a novel regulatory variant in the ABO system

MODERATORS: Catherine Hyland and Frederik Banch Clausen

The presentations were followed by about 5 minutes of questions and answers, which are also included in the recordings.

Abstract

A deep learning approach to the genetic prediction of blood group antigens

C Moslemi¹,², O Pedersen¹,³, K Hyvarinen⁴, J Ritari⁵, J Partanen⁴, M Olsson⁶,⁷, S Ostrowski⁸,⁹

¹Department of Clinical Immunology, Zealand University Hospital, Køge, ²Department of Clinical Immunology, Aarhus University Hospital, Aarhus, ³Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark, ⁴Finnish Red Cross Blood Service, Helsinki, ⁵Finnish Red Cross Blood Service, Helsinki, Finland, ⁶Department of Laboratory Medicine, Lund University, ⁷Department of Clinical Immunology and Transfusion Medicine, Office for Medical Services, Lund, Sweden, ⁸Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, ⁹Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark

Background: Deep Learning (DL) techniques have made great advances in recent years by utilising very large training sets. Next generation sequencing techniques and digitalization of biological data have revolutionized the field of biology with access to large datasets. The next logical step is the marriage of the latest DL techniques and large biological datasets.

Since the discovery of ABO over a century ago, 43 blood groups have been discovered, covering 349 antigens. Due to their clinical significance in blood transfusion, testing for certain blood types, such as ABO and RhD, is ubiquitous. However, the remaining blood groups are either not consistently tested everywhere or not tested at all unless special needs arise.

The cost and time associated with performing serological or custom-made PCR testing for every phenotype in each blood group hinder widespread testing. However, a comprehensive dataset of blood phenotypes is invaluable in circumstances that require blood from donors with rare blood types or rare combinations. Finding such donors can be particularly challenging if the blood group of interest is not routinely tested.

Comprehensive phenotypes are also useful for general improvement of matching donors to recipients to avoid immunization and delayed transfusion reactions, especially for patients with chronic transfusion needs.

Aims: In this study, we aim to apply DL techniques to develop blood type prediction models based on cheap-to-analyse and easily scalable screening array genotyping platforms.

Methods: Combining existing serological or genotyped blood types from blood banks with imputed screening array genotypes for ~111,000 Danish and 1,168 Finnish blood donors, we used Deep Learning techniques to train and validate blood type prediction models for 32 antigens in 13 blood group systems, HPA-1a/b and secretor status. To account for missing genotypes, a Denoising Autoencoder was used as an initial step, followed by a Convolutional Neural Network blood type classifier.
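The abstract does not describe how the Denoising Autoencoder was trained, so the following is only a minimal sketch of one common way to build training pairs for such a model: observed genotypes are randomly hidden in the input while the target keeps the original values. The dosage encoding (0, 1, 2), the `MISSING` sentinel, the fourth "missing" channel, and the `mask_rate` parameter are all assumptions made for illustration, and the networks themselves are omitted.

```python
import random

MISSING = -1  # assumed sentinel for a genotype that was not called


def one_hot(dosage):
    """Encode an allele dosage (0, 1, 2) or MISSING as a 4-channel vector."""
    channels = [0.0, 0.0, 0.0, 0.0]
    index = 3 if dosage == MISSING else dosage
    channels[index] = 1.0
    return channels


def make_denoising_pair(dosages, mask_rate, rng):
    """Return (corrupted_input, clean_target) for one donor.

    Each observed variant is replaced by MISSING with probability
    mask_rate in the input only. A model trained on such pairs is
    pushed to reconstruct hidden genotypes from the remaining ones.
    Positions that were already missing stay missing in the target;
    a real training loop would typically exclude them from the loss.
    """
    corrupted = [
        MISSING if d != MISSING and rng.random() < mask_rate else d
        for d in dosages
    ]
    noisy_input = [one_hot(d) for d in corrupted]
    clean_target = [one_hot(d) for d in dosages]
    return noisy_input, clean_target


# Hypothetical donor with eight variants, one of them uncalled.
rng = random.Random(0)
donor = [0, 1, 2, 1, 0, MISSING, 2, 0]
noisy, clean = make_denoising_pair(donor, mask_rate=0.25, rng=rng)
```

In a pipeline of the kind outlined in the Methods, the reconstructed genotype vector would then be passed to the classifier; that hand-off is not shown here.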

Results: All but seven prediction models demonstrated an overall prediction accuracy above 99%; those exceeding this threshold were A, B, A1, Coa, Doa/Dob, Fya, HPA-1a, HPA-1b, K/k, Jka/Jkb, Lua/Lub, M/N, S/s, C/c, D/D-weak, E/e, Se, Vel, Yta and Ytb. Two models came close, with accuracies of 98.8% (P1) and 98.7% (Fyb). The Lewis blood group antigens came next, with accuracies of 97.8% (Lea) and 97.9% (Leb), and the three lowest-scoring antigens (Kpa, Cw, Cob) achieved 92.9%, 92.8% and 92.6%, respectively. Antigens with low or high frequencies (for example, Cw), small training cohorts (for example, Cob), or very complicated genetic underpinnings (for example, RhD) proved more challenging for high-accuracy (>99%) DL modelling. However, in the Danish cohort only 3 of 36 models (Cob, Cw, Kpa) failed to achieve a balanced overall prediction accuracy above 97%. The high predictive performance in the Danish training cohort was replicated in the Finnish cohort.
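The results report both overall and balanced accuracy. The abstract does not define the balanced metric, so the sketch below assumes the standard definition (the mean of sensitivity and specificity) and uses made-up labels to show why the two can diverge for an antigen with a very low frequency. The function names and the example data are illustrative only.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def overall_accuracy(y_true, y_pred):
    """Fraction of all donors whose antigen status is predicted correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (assumed definition)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical low-frequency antigen: 20 positive donors out of 1000,
# scored by a trivial model that calls every donor antigen-negative.
y_true = [1] * 20 + [0] * 980
y_pred = [0] * 1000

acc = overall_accuracy(y_true, y_pred)       # high despite finding no positives
bal_acc = balanced_accuracy(y_true, y_pred)  # exposes the missed positives
```

Under these invented numbers the trivial model reaches an overall accuracy of 0.98 but a balanced accuracy of only 0.5, which is why a balanced figure is the more informative one for antigens with extreme frequencies.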

Summary/Conclusions: High accuracy in a variety of blood groups proves the viability of applying Deep Learning to genetic blood type prediction using routine phenotypic and array chip genotypic training sets, even in blood groups with nontrivial genetic underpinnings. These techniques are suitable for aiding in the identification of blood donors with rare blood types by greatly narrowing down the pool of candidate donors before clinical-grade confirmation. Replication of the high predictive performance in the Finnish cohort, which is external to the Danish training cohort, proved the viability of DL models in external genetic datasets.
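To make the idea of narrowing the candidate pool concrete, here is a small worked calculation. It is not taken from the study: the donor count, phenotype prevalence, sensitivity and specificity are all hypothetical inputs, and `screening_summary` is an illustrative helper rather than part of any published pipeline.

```python
def screening_summary(n_donors, prevalence, sensitivity, specificity):
    """Estimate how many donors a prediction model flags for confirmation.

    All arguments are hypothetical inputs chosen by the caller:
    prevalence is the fraction of donors with the rare phenotype,
    sensitivity and specificity describe the prediction model.
    """
    carriers = n_donors * prevalence
    non_carriers = n_donors - carriers
    true_hits = carriers * sensitivity
    false_hits = non_carriers * (1.0 - specificity)
    flagged = true_hits + false_hits
    # Positive predictive value: share of flagged donors who truly match.
    ppv = true_hits / flagged if flagged else 0.0
    return {"flagged": flagged, "true_hits": true_hits, "ppv": ppv}


# Invented scenario: 100,000 donors, a phenotype carried by 1 in 1,000,
# and a model with 99% sensitivity and 99% specificity.
summary = screening_summary(
    n_donors=100_000, prevalence=0.001, sensitivity=0.99, specificity=0.99
)
```

With these made-up inputs, roughly 1,100 of the 100,000 donors would be flagged for confirmatory typing, and about 99 of the 100 expected rare donors would be among them. Most flagged donors would still be false positives, which is why clinical-grade confirmation remains necessary.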