
bioRxiv preprint first posted online Feb. 15, 2018; doi: http://dx.doi.org/10.1101/266064.

The copyright holder for this preprint


(which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
All rights reserved. No reuse allowed without permission.

Primate MHC class I from Genomes


D.N. Olivieri 1 , F. Gambón-Deza 2
1 Department of Computer Science, University of Vigo, Ourense 32004, Spain
2 Department of Immunology, Hospital of Meixoiero, Vigo, SPAIN
olivieri@uvigo.es; fgambon@gmail.com

Abstract
The major histocompatibility complex (MHC) molecule plays a central role in the adaptive immunity of jawed vertebrates. Allelic variations have been studied extensively in some primate species; however, a comprehensive description of the number of genes remains incomplete. Here, a bioinformatics program was developed to identify three MHC Class I exons (EX2, EX3 and EX4) from Whole Genome Sequencing (WGS) datasets. With this algorithm, MHC Class I exon sequences were extracted from 30 WGS datasets of primates, representatives of Apes, Old World and New World monkeys and prosimians. There is a high variability in the number of genes between species. From human WGS, six viable genes (HLA-A, -B, -C, -E, -F, and -G) and four pseudogene sequences (HLA-H, -J, -L, -V) are obtained. These genes serve to identify the phylogenetic clades of MHC-I in primates. The results indicate that the human clades of HLA-A, -B and -C were generated shortly after the separation of Old World monkeys. The clades pertaining to HLA-E, -H and -F are found in all primate families, except in Prosimians. In the clades defined by HLA-G, -L and -J, there are sequences from Old World monkeys. Specific clades are found in the four primate families. The evolution of these genes is consistent with birth-and-death processes having high turnover rates.
Keywords: Major Histocompatibility Complex, MHC Evolution, Immunologic Repertoire,
Mammalian Evolution, Gene Discovery

1. Introduction

The Major Histocompatibility Complex (MHC) molecules are amongst the most intensely studied of the adaptive immune system, especially in primates and the mouse, since they are responsible for transplant rejection and antigen presentation. These molecules were first discovered in humans on the surface of leukocyte membranes and were given the name HLA (Human Leukocyte Antigen). Although the molecules of MHC were originally studied as the cause of transplant rejection, their primary function is in the defense of the organism against pathogens. The molecule of MHC class I presents endogenous antigen to the CD8+ T lymphocytes and the MHC class II presents exogenous antigens to the CD4+ T lymphocytes. Other molecules related to these pathways are present in the same chromosome MHC regions, sometimes referred to collectively as the MHC class III genes (i.e., complement proteins, cytokines,
Preprint submitted to Elsevier February 15, 2018

etc.), although this is a nomenclature that is simply used to make a distinction from the MHC-I and MHC-II genes.

MHC class I (MHC-I) molecules are encoded by genes with at least 6 exons. The exons of particular interest are exons 2 and 3 (EX2 and EX3, respectively), encoding the protein domains α1 and α2 that are responsible for presenting peptides to the TCR. These domains also represent regions of the highest allelic variation. These genes are essential for the innate and adaptive immune responses and are subject to environmental and evolutionary pressures, together with likely coevolution effects through their interaction with TCR and KIR (de Groot et al., 2015; Garcia et al., 2009).

In humans, the MHC-I region consists of six genes. The HLA-A, -B and -C genes are highly polymorphic and are expressed by all cells. These molecules are considered the classical MHC-I molecules and their role is to present peptides to cytotoxic T lymphocytes. Unlike the classical MHC genes, the non-classical genes, HLA-E, -F and -G, exhibit limited polymorphism. In particular, HLA-E is a CD94/NKG2A ligand, HLA-G is found only in the trophoblast (Castro et al., 1996; Djurisic & Hviid, 2014; Lynge-Nilsson et al., 2014) and HLA-F is involved in NK cell signaling (Lee et al., 2010). Both HLA-F and HLA-G are expressed by a restricted set of cell populations.

Apart from these MHC-I genes in humans, complete genes have been identified that are not expressed for different reasons: some are pseudogenes (e.g., the known pseudogenes HLA-H, -K, -J and -L) (Moscoso et al., 2006; Heinrichs & Orr, 1990), and there are also individual MHC-like exons that are not found in tandem with other valid structural exons (e.g., those possessing stop codons) needed to form a functional MHC-I molecule (Horton et al., 2004).

All humans possess the haplotype with six MHC-I expressed genes. Orthologs to these haplotypes have been sought in Apes, Old World, and New World monkeys. Previous studies have described MHC-I genes orthologous to those in humans, and the human pseudogene, HLA-H, is actually a viable MHC-I gene in chimpanzees and gorillas (Wilming et al., 2013). At present, there are detailed descriptions of the MHC loci in Apes, but few such descriptions in evolutionarily more distant primates. For these more distant primate species, several RNA studies have attempted to characterize allelic variability (de Groot et al., 2012). Nonetheless, these sequences have not been studied within the context of the germline MHC genes from all these species.

In recent years, genome sequences of primates have become publicly available in the form of assembled WGS datasets. These assemblies consist of relatively large contigs containing most of the genomic sequences. In this paper, we describe the analysis of MHC-I exon sequences EX2, EX3 and EX4 from primates that were identified from WGS data using a new bioinformatics tool, called MHCfinder, freely available at http://vgenerepertoire.org/. The primate datasets we utilized in this study are indicated in the phylogenetic ordering of Figure 1, obtained from molecular studies (Perelman et al., 2011; Rogers & Gibbs, 2014), with divergence times further confirmed with sets of published works summarized by TimeTree (Hedges et al., 2006). From the sequences found by MHCfinder, this wealth of detailed genomic information may help to clarify evolutionary processes that have shaped the MHC-I genes in primates.

[Figure 1 image: phylogenetic tree on a time axis from 60 to 0 Mya (Eocene, Oligocene, Miocene, Holocene), with node labels 43.1 Mya and 27.3-35 Mya. Outer group labels: Prosimians (Strepsirrhini); Anthropoids (Haplorhines), monkeys and apes; Platyrrhines; Catarrhines. Species at the tips, with WGS accession prefixes:
Lemuriformes, Lemuroidea: Eulemur flavifrons (LGHW01), Eulemur macaco (LGHX01), Propithecus coquereli (JZKE01), Microcebus murinus (ABDC02)
Lorisiformes, Lorisoidea: Otolemur garnettii (AAQR03)
Tarsiformes, Tarsiidae: Tarsius syrichta (ABRT02)
New World primates, Ceboidea: Cebus capucinus (LVWQ01), Aotus nancymaae (JYKP01), Callithrix jacchus (ACFV01, BBXK01), Saimiri boliviensis (AGCE01)
Old World primates, Cercopithecoidea: Nasalis larvatus (JMHX01), Colobus angolensis (JYKR01), Rhinopithecus roxellana (JABR01), Mandrillus leucophaeus (JYKQ01), Chlorocebus sabaeus (AQIB01), Macaca fascicularis (AQIA01), Macaca mulatta (JSUE03, AANU01), Macaca nemestrina (JZLF01), Cercocebus atys (JZLG01), Papio anubis (AHZZ01)
Old World primates, Hominoidea: Homo sapiens (ABBA01), Pan troglodytes (AADA01, AACZ04), Pan paniscus (AJFE02), Gorilla gorilla (CABD03, CYUI03), Pongo abelii (ABGA01), Nomascus leucogenys (ADFV01)]

Figure 1: The phylogenetic tree of the primates indicating the species studied in this work. The tree is based upon divergence times obtained from Hedges et al. (2006) and previous molecular phylogenetic studies (Perelman et al., 2011; Rogers & Gibbs, 2014).

2. Methods

2.1. Datasets
The WGS assembly datasets of 30 primate species were obtained from the NCBI in the form of FASTA files consisting of assembled contigs or, for more mature projects, scaffolds and/or fully constructed chromosomes. The average genome coverage in these datasets is > 15-20× with contig assembly N50 > 15 kbp. A detailed summary of the accession numbers and relevant assembly parameters can be found in Table 1.

2.2. Software
While there are six viable MHC-I genes in Homo sapiens, the number of MHC-I genes varies considerably across other mammal species. Nonetheless, the MHC-I exon architecture is nearly universal across all mammal species (Birch et al., 2006). A diagram of the Homo sapiens MHC-I (HLA) exon structure and corresponding protein domains was described by Lefranc et al. (2005) (see also the sequence repositories of the IMGT and the large collection of MHC-I alleles at the IPD-MHC database (Robinson et al., 2011)). The relevant genomic signals (i.e., the exons, introns, 5′-UTR and 3′-UTR) can be identified to a high degree of accuracy without resorting to general gene finding software tools. Therefore, viable exons (i.e., those exons that could form a functionally expressed MHC-I molecule) can be accurately determined using

homology criteria with a supervised machine learning classifier.

The MHCfinder program extracts exons EX2 and EX3, which encode the α1 and α2 domains, respectively, together with exon EX4, which encodes the constant domain. Although other exons constitute the full MHC-I gene, these three exons are of particular interest for characterizing and comparing MHC-I genes within and amongst species. Thus, MHCfinder ignores the leader peptide (L, given by EX1), as well as all peptides corresponding to the transmembrane and cytoplasmic regions (Tm, given by EX5 through EX8). While the exon/intron structure of MHC Class I is thought to be universal across jawed vertebrates, the specific intron spacing between the exon sequences EX2, EX3, and EX4 varies considerably. As such, the algorithm only imposes a simple structural requirement that EX2, EX3 and EX4 are found in a tandem arrangement along the DNA sequence, but places no hard restrictions on the intron separation.

Figure 2 summarizes the principal steps of the MHCfinder algorithm. This program was implemented as a multi-threaded application in the Python programming language, with the Biopython library (Cock et al., 2009) for low-level sequence analysis, and the scikit-learn library (Pedregosa et al., 2011) for machine learning tasks. First, a tblastn query with consensus protein sequences from known MHC-I exons (EX2, EX3, and EX4 from humans) is made against all available primate WGS datasets. The search result is a listing of candidate WGS contigs likely to contain valid exons, together with the position of the matching nucleotide sequence and similarity scores; this listing is referred to as a hit table. The algorithm processes each line of the hit table, analyzing a nucleotide region larger than the nucleotide positions in the hit, so as to determine the precise start/stop positions of the exon reading frame (defined by AG and GT motifs, respectively). Once the exons are identified, they are translated into amino acid sequences by checking all valid reading frames. Those sequences containing stop codons in the reading frame are discarded, while valid exons are saved and converted into numerical feature vectors (i.e., an array of numbers that characterizes the string of amino acids). A simple transformation of AA to feature vector was used (based upon the frequency of each AA and pairs of AA), because it was found to discriminate sequences better than other more sophisticated transformation procedures (e.g., those based upon positional physicochemical properties of each AA within the sequence).

From the feature vector representation, a machine learning procedure with a Random Forest (Breiman, 2001) was used to classify the sequence into one of the exon types: EX2, EX3, and EX4. Supervised training is performed by defining these classes with sets of annotated exons from H. sapiens and defining a null set from random background sequences. Binary classification is carried out for each exon type with a background/signal ratio of 3:1, determined empirically. The prediction precision is improved by multiple training/prediction iterations; positively identified sequences are included in the training set for subsequent training/predictions. This process is referred to as iterative supervised learning, and is a common machine learning technique whereby new information is continually accrued to the knowledge base for improving prediction accuracy (e.g., modern speech recognition has benefited from such techniques).

In our gene finding algorithm, MHCfinder, a probable functionally expressed MHC-I gene must contain a tandem arrangement of the three viable exons (i.e. those that do not

[Figure 2 image: flowchart of the MHCfinder algorithm. Recoverable steps:
1. An HLA class I query sequence is prepared.
2. tblastn is run against the WGS datasets, with the search restricted to Primates, producing a hit table whose columns include contig, frame, region, e-value and score (example rows come from contigs such as AANU01104903.1).
3. Loop over the tblastn hit table results:
   3.A Find the exon, defined between AG and GT, in a larger contig region specified by the hit coordinates.
   3.B (a) Translate the nucleotide sequence to amino acids; (b) translate the amino acid sequence to a feature vector.
   3.C Use a Random Forest to classify valid exons EX2, EX3 and EX4; valid exons are added to a list.
4. From the positive exons, determine the structure: valid MHC-I gene (tandem EX2-EX3-EX4 arrangement), pseudogene (e.g., EX2-EX4 with EX3 missing), or indeterminate (exons at the start or end of a contig).]

Figure 2: The steps in the MHC-I prediction algorithm. The selection of valid MHC-I exons is based upon a Tblastn
pre-selection, an exon reading-frame identification procedure, and classification with a random forest method.
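
The exon reading-frame identification and the feature-vector step can be illustrated with a minimal Python sketch. It is not the MHCfinder source: the function names, the `min_aa` parameter, the exhaustive AG/GT pairing and the use of overlapping adjacent pairs are assumptions made for illustration. It does follow the convention stated in the Results, namely that the ORF starts two nucleotides after the AG and ends one nucleotide before the GT, and it uses amino acid counts plus pair counts as features.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
AA_ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def translate(orf):
    """Translate a nucleotide string whose length is a multiple of 3."""
    return "".join(CODON_TABLE.get(orf[i:i + 3], "X") for i in range(0, len(orf), 3))


def candidate_exons(region, min_aa=1):
    """Yield (exon_start, exon_end, protein) for AG...GT spans with a stop-free ORF.

    exon_start is the first nucleotide after the AG; exon_end is the index of
    the GT (exclusive end of the exon). min_aa is a hypothetical length filter.
    """
    region = region.upper()
    ag_sites = [i for i in range(len(region) - 1) if region[i:i + 2] == "AG"]
    gt_sites = [i for i in range(len(region) - 1) if region[i:i + 2] == "GT"]
    for ag in ag_sites:
        for gt in gt_sites:
            # Skip the AG itself plus two nucleotides; stop one before the GT.
            orf = region[ag + 4:gt - 1]
            if len(orf) < 3 * min_aa or len(orf) % 3 != 0:
                continue
            protein = translate(orf)
            if "*" not in protein and "X" not in protein:
                yield ag + 2, gt, protein


def feature_vector(protein):
    """20 amino acid counts followed by 400 counts of adjacent pairs."""
    singles = [protein.count(a) for a in AA_ALPHABET]
    pair_counts = Counter(protein[i:i + 2] for i in range(len(protein) - 1))
    pairs = [pair_counts[a + b] for a in AA_ALPHABET for b in AA_ALPHABET]
    return singles + pairs


# Toy region: AG, two nucleotides, a 4-codon ORF, one nucleotide, GT.
toy_region = "AAAG" + "CT" + "TCCCACTCCCTG" + "C" + "GTAGA"
print(list(candidate_exons(toy_region)))
```

On real contig regions the exhaustive pairing produces many spurious candidates; in the workflow described here, it is the classifier that decides which candidates are MHC-I exons.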

contain stop codons in the reading frame), EX2-EX3-EX4 along the germline sequence. Nonetheless, viable exons, which are homologous to MHC-I constituent exons, exist in the genome that do not form tandem arrangements (i.e., they are isolated or an exon is missing), and thus, do not express MHC-I molecules. In humans, we found 31 exons, 18 of which are constituents of six functional MHC-I genes; the other exons, while viable, must be pseudogenes. Similar results are seen in all the other species studied.
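
The classification step of this section (binary Random Forest per exon type, a 3:1 background/signal ratio, and retraining on positively identified sequences) can be sketched as follows. This is an illustrative sketch with scikit-learn, not the authors' code: `featurize`, `random_background`, `iterative_fit`, the number of rounds, the acceptance threshold and the number of trees are all assumed values, and the sequences in the usage lines are made-up strings rather than real exons.

```python
import random
from collections import Counter

from sklearn.ensemble import RandomForestClassifier

AA = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = [a + b for a in AA for b in AA]


def featurize(seq):
    """Counts of each amino acid followed by counts of each adjacent pair."""
    pair_counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    return [seq.count(a) for a in AA] + [pair_counts[p] for p in PAIRS]


def random_background(n, length, rng):
    """n random amino acid strings of the given length (the null set)."""
    return ["".join(rng.choice(AA) for _ in range(length)) for _ in range(n)]


def iterative_fit(seeds, candidates, rounds=3, ratio=3, threshold=0.8, seed=0):
    """Train signal-vs-background, then fold accepted candidates back in."""
    rng = random.Random(seed)
    signal, pool = list(seeds), list(candidates)
    clf = None
    for _ in range(rounds):
        length = max(len(s) for s in signal)
        # Keep the background/signal ratio fixed (3:1 by default).
        background = random_background(ratio * len(signal), length, rng)
        X = [featurize(s) for s in signal + background]
        y = [1] * len(signal) + [0] * len(background)
        clf = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X, y)
        if not pool:
            break
        scores = clf.predict_proba([featurize(s) for s in pool])[:, 1]
        accepted = [s for s, p in zip(pool, scores) if p >= threshold]
        if not accepted:
            break
        # Positively identified sequences join the training set.
        signal.extend(accepted)
        pool = [s for s in pool if s not in accepted]
    return clf, signal


# Made-up strings standing in for annotated exons and unlabeled candidates.
toy_seeds = ["GSHSMRYFYTGSHSMRYFYT", "GSHSLRYFHTGSHSLRYFHT"]
toy_candidates = ["GSHSMRYFHTGSHSLRYFYT", "WWWWWWWWWWWWWWWWWWWW"]
model, training_set = iterative_fit(toy_seeds, toy_candidates)
```

One such model would be trained per exon type (EX2, EX3, EX4). The threshold controls how aggressively new sequences enter the training set, so a conservative value limits the risk of reinforcing false positives.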

Tree construction. To study the phylogenetic relationships from the MHC-I exons, we constructed a large phylogenetic tree by aligning sequences with ClustalO and then used PhyML with the WAG matrix (part of the FastTree software (Price et al., 2010)). In all cases, 500 bootstrapped samples were made. While trees were studied using the three exon sequences (EX2-EX3-EX4), the final analysis was made only with EX2, since this exon provides the most discriminatory information.

3. Results

Exon sequences EX2, EX3 and EX4 of MHC-I were obtained from 30 WGS primate datasets using our software tool, MHCfinder, described in the Methods section. These sequences are homologous to the human MHC-I and were found by an iterative supervised learning procedure (Methods). As described, these sequences are flanked by splicing signals AG/GT and have an ORF starting two nucleotides after the AG and terminating one nucleotide before the last GT. Exons are considered correct if these conditions are met, while exons possessing stop codons within the reading frame are discarded. Those exons found in a tandem arrangement (i.e. with EX2-EX3-EX4 and with nominal intron spacing) are considered candidate MHC-I genes (referred to as probably viable genes throughout the rest of the paper, since they have the necessary conditions to be expressed). Valid exons that do not participate in a tandem arrangement are considered probable pseudogenes (referred to as pseudogenes throughout) or indeterminate if they are found at the extreme ends of the contigs (referred to throughout the text as indeterminate). Figure 3 shows graphical maps of the MHC-I exons found in contigs of G. gorilla. The exons found in tandem configurations, EX2-EX3-EX4, are indicated as probably viable MHC-I genes. Table 1 lists the number of exons and candidate viable genes per species found from different WGS datasets. Differences in the number of exons between WGS datasets from the same species are indicative of the variability between individuals of the same species, as well as of the maturity/completeness and sequencing methods used in constructing the assemblies.

At present, there are 42 WGS of Homo sapiens in the NCBI repository. From these datasets, MHCfinder was used to find all MHC-I exon sequences and the phylogenetic tree of Figure 4 was constructed using the deduced amino acid sequences of EX2. The program tags the EX2 sequence as belonging to one of the following categories: (1) a probable expressed MHC-I gene (since it is part of a tandem arrangement EX2-EX3-EX4), (2) a pseudogene (because it lacks either EX3, EX4 or both), or (3) an indeterminate gene (because it is found at the extreme edge of the contig, but could form a viable gene; see graphic of Figure 2). In the resulting tree, six viable genes (HLA-A, -B, -C, -E, -F and -G) and four pseudogenes (HLA-H, -J, -L and -V) can be discerned. Also, the tree demonstrates that considerable variability exists amongst the classic genes (A, B, and C), forming several lineages. However, with respect to the nonclassical genes (E, F, and G), the sequences are invariable. Also, the known pseudogene sequences L, J and V are conserved, while the H pseudogene forms separate lineages; this may be related to its proximity to the HLA-A locus (Grimsley et al., 1998).

Hominidae diversified approximately 20 million years ago (Mya). Eight WGS datasets were used to study the MHC-I exons from the Ape family; the number of exons identified from the WGS of Ape species is provided in Table 1. The number of probable viable genes per species is between two and eight. The

Table 1: MHC class I exons and genes of primates. All values for the contig N50 were > 15 kbp, except N. larvatus (for which no exons were found). For some assemblies, a reliable number for the coverage could not be ascertained; these are indicated by N/A (not available).

Species          WGS      N50 (kbp)  Cov.     EX2  EX3  EX4  Genes
Lemuriformes
E. flavifrons    LGHW01   27.3       52x      1    1    1    1
E. macaco        LGHX01   20.0       21x      1    0    1    0
P. coquereli     JZKE01   28.1       104.7x   6    6    7    6
M. murinus       ABDC02   182.9      221.6x   9    10   9    6
Lorisiformes
O. garnettii     AAQR03   27.1       137x     13   11   13   8
Tarsiformes
T. syrichta      ABRT02   38.2       48x      7    8    10   5
Platyrrhini
C. capucinus     LVWQ01   41.2       81x      23   20   24   15
A. nancymaae     JYKP01   28.5       113.4x   28   32   28   12
C. jacchus       ACFV01   29.3       6.6x     22   20   20   13
C. jacchus       BBXK01   61         N/A      5    6    8    3
S. boliviensis   AGCE01   38.8       80x      15   17   22   8
Cercopithecoidea
N. larvatus      JMHX01   13.3       290      -    -    -    -
C. angolensis    JYKR01   38.4       86.8x    8    8    10   4
R. roxellana     JABR01   77.2       53.7x    20   14   22   8
M. leucophaeus   JYKQ01   31.3       117.2x   6    7    16   3
C. sabaeus       AQIB01   90.4       95x      9    10   10   6
M. fascicularis  AQIA01   86         68x      17   19   21   12
M. mulatta       JSUE03   107        47.4x    31   28   32   17
M. mulatta       AANU01   25.7       N/A      26   34   32   17
M. nemestrina    JZLF01   107        113.1x   24   20   28   12
C. atys          JZLG01   113        192.0x   24   19   22   12
P. anubis        AHZZ01   40.3       92x      30   21   20   16
Apes
H. sapiens       ABBA01   100                 13   8    10   6
P. troglodytes   AADA01   108.4      N/A      9    8    6    3
P. troglodytes   AACZ04   384.8      70x      12   10   10   8
P. paniscus      AJFE02   67         26x      10   8    8    6
G. gorilla       CABD03   53         N/A      10   9    8    5
G. gorilla       CYUI03   >1000      N/A      7    11   8    5
P. abelii        ABGA01   15.6       6x       13   10   9    3
N. leucogenys    ADFV01   35         5.6x     6    6    5    2
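
The exon and gene counts in Table 1 are linked by simple arithmetic: each probable viable gene uses one EX2, one EX3 and one EX4, so any exons beyond three per gene are pseudogene or indeterminate exons. The short Python sketch below works this out for a few rows copied from the table; the function name `exon_budget` and the choice of rows are illustrative only.

```python
# (species, WGS, EX2, EX3, EX4, genes), copied from Table 1.
ROWS = [
    ("H. sapiens", "ABBA01", 13, 8, 10, 6),
    ("P. troglodytes", "AACZ04", 12, 10, 10, 8),
    ("G. gorilla", "CABD03", 10, 9, 8, 5),
    ("M. mulatta", "JSUE03", 31, 28, 32, 17),
]


def exon_budget(ex2, ex3, ex4, genes):
    """Return (total exons, exons used by tandem genes, surplus exons)."""
    total = ex2 + ex3 + ex4
    in_genes = 3 * genes  # one EX2, one EX3 and one EX4 per gene
    return total, in_genes, total - in_genes


for species, wgs, ex2, ex3, ex4, genes in ROWS:
    total, in_genes, surplus = exon_budget(ex2, ex3, ex4, genes)
    # A gene count can never exceed the scarcest exon type.
    assert genes <= min(ex2, ex3, ex4)
    print(f"{species} ({wgs}): {total} exons, {in_genes} in genes, {surplus} surplus")
```

For the human row this gives 31 exons, 18 of them in six genes and 13 left over, which matches the counts quoted in the Methods.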


[Figure 3 image: exon maps of G. gorilla contigs, one row per contig. Row labels legible in the extraction: HLA-V, HLA-J, HLA-H, HLA-F, HLA-G, HLA-E, HLA-C and HLA-B; each row carries boxes labeled EX2, EX3 and/or EX4.]

Figure 3: Graphical representation of the MHC-I exons obtained in G. gorilla (CABD03) with the MHCfinder program. Each contig is represented by a line and exons by boxes. Tandem exon arrangements (EX2-EX3-EX4) are colored blue, indicating the high possibility of being a viable and functionally expressed MHC-I gene (explained in text). Exons not part of tandem arrangements, thus indicative of pseudogenes, are marked in red. Exons that are found at the extreme ends of the contigs may be considered indeterminate if they could form tandem arrangements with exons in other contigs (colored in light blue).
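
The three categories used in this figure (probable viable gene, pseudogene, indeterminate) can be expressed as a small rule over the ordered exons of one contig. The Python sketch below is an assumed simplification rather than the MHCfinder logic: the `Exon` type, the `edge_margin` parameter and its default value are hypothetical, strand orientation is ignored, and an exon outside a tandem is called indeterminate purely because it lies close to a contig end.

```python
from typing import List, NamedTuple, Tuple


class Exon(NamedTuple):
    kind: str   # "EX2", "EX3" or "EX4"
    start: int  # position on the contig
    end: int


def classify_contig(exons: List[Exon], contig_length: int,
                    edge_margin: int = 1000) -> List[Tuple[str, str]]:
    """Label each exon as 'gene', 'pseudogene' or 'indeterminate'."""
    exons = sorted(exons, key=lambda e: e.start)
    labels = [None] * len(exons)
    i = 0
    while i + 2 < len(exons):
        # A tandem EX2-EX3-EX4 run marks a probable viable gene.
        if [e.kind for e in exons[i:i + 3]] == ["EX2", "EX3", "EX4"]:
            labels[i:i + 3] = ["gene"] * 3
            i += 3
        else:
            i += 1
    for j, exon in enumerate(exons):
        if labels[j] is None:
            near_edge = (exon.start < edge_margin
                         or contig_length - exon.end < edge_margin)
            labels[j] = "indeterminate" if near_edge else "pseudogene"
    return [(e.kind, label) for e, label in zip(exons, labels)]


# Toy contig of 50,000 nt with one tandem gene, two isolated exons in the
# interior and one exon close to the contig end.
toy_exons = [
    Exon("EX2", 5000, 5270), Exon("EX3", 5500, 5776), Exon("EX4", 6400, 6676),
    Exon("EX2", 20000, 20270), Exon("EX4", 21000, 21276),
    Exon("EX2", 49500, 49770),
]
print(classify_contig(toy_exons, contig_length=50000))
```

Because the text places no hard restriction on intron separation, the sketch checks only the order of exon types and not the distance between them.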

presence of pseudogenes is demonstrated by the existence of more exons than needed for the viable genes.

The Cercopithecidae family consists of monkeys closest to hominids, also referred to as Old World monkeys. The divergence time between Cercopithecidae and Hominidae is approximately 29 Mya. The number of MHC-I genes in these animals varies considerably: between four in C. angolensis and 18 in M. mulatta (22 were found in a detailed study (Daza-Vamenta et al., 2004)) (Table 1). In general, the Cercopithecidae species possess more MHC-I genes than found in Hominidae species.

The Platyrrhines are monkeys of South and Central America. These species diverged from hominids approximately 43 Mya. Five WGS datasets of Platyrrhines were analyzed to identify MHC-I genes and pseudogenes using MHCfinder. The number of viable MHC-I genes found is between nine (in S. bolivien-

[Figure 4 image: phylogenetic tree of EX2 amino acid sequences (positions 1-89) from Homo sapiens WGS datasets; the tree itself is not recoverable from the extraction. Tip labels follow the patterns Homo_sapiens-<WGS accession>-EX2-<id>|Pseu, Homo_sapiens-<WGS accession>-EX2-<id>|Indet and Homo_sapiens-<WGS accession>-<id>|MHC, alongside reference tips labeled Consensus10-MHC-I. Clade labels legible in the extraction include HLA-B, HLA-F, HLA-H, HLA-J and HLA-L.]
-89
Homo_sapi ens-LOQO0 1-14|MHC/1 A-F/1-9 1
0,25 Homo_sa piens-AA DC01-EX
2-311|Ps eu/1-89
Consen sus10- ex2-HL |Pseu /1-89
0, 3 Conse nsus1 0-MHC
-I-cer co3/1 -89
Con sens
LOQN 01-EX 2-2893|M HC/ 1-89
0,2 3
Homo _sap iens- 0,1 3 6 Hom o_s us10 -ex2 -HLA -V/1 -89
s-LD OC0 3-17 C/1 -89 0,1 8
A 3
Ho mo _saapie ns-L OQ O01 -EX 2-12
0, 37

HLA-
Hom o_sa pien -LB LQ0 1-1 66| MH 0, 19 0, 2 0,
api ens 3|P seu /1- 89 0,0 Ho
1 mo pie ns- LO
QN |Pse u/1- 89
Hom o_s -15 _sa 01-
IL0 1-E X2 HC /1- 89 0, 2 Ho mo pie ns- LIQ EX 2-2 63|
Pse u/1 -89
pie ns- LR 1-1 41 |M /1- 89 0 ,2 Ho mo _s ap ien s-A ADK0 2-E X2 -26 9|P
Ho mo _sa ap ien s-L DN V0 1- 13 3|M HC /1 -8 9 6
Ho m _s ap ien B0 2-E X2 seu /1- 89
0,34

_s P0 HC 2
H om o_ sa pi en s- AB SL 01
BL 9 6
Ho mo |M -31 0|I nd
HLA-G

0 ,2
eu /1 -8
HLA-
3

ap ien s-L 02 -1 21 5 0 ,2
0,

Ho mo _s pi en s- LB HZ-E X2 -1 0| Ps /1 -8 9 0 ,1 H om o_ sa pi en s- AB BA 01 -E X2 -3 14 |In et/ 1-8 9


LA -A -8 9
sa 01
Ho m o_ s- LO Q O 10 -e x2 -H |M H C /11- 89
sa pi en on se ns usK P 01 -1 592| P se u/t/ 1 -8 9
0 ,2
1

0 ,2
1 V H om o_ sa pi en s- LC YE
o_
0, 01H o m o _ sa pi en
H o m sa p
s- LM
-E X2 -3
01 -E X2 15
s- LM B D 01 -E X -2 3| Ps eu t/1 -8 9
de t/1 -8
|In de 9
-C

Ho m o_ C s- A E -E X 2- 180 3 |I n d ee t/ 1 -8 99 H o o _ sa ie n s- B C 01 2- 33 /1
pi en 2 H o m o _ s p ie n L JI I0 1 -E X 2- |P se u/ -8 2
o_ sa LI Q K 022 -E X 2 -3 0 2 |I n dH C /1 -8 0,
HLA

-8 9 H o mm o _ s a p ie n ss- L K H W -E X 2 -5 43 |P se u/ 1- 82
0,28

H om
HL

s-
pi en A A D B 0 2 -E X 2 -3-2 5 5 |M H C /1 -8 9 H o _ a p ie -L K 0 1 -E 3 |P se 1-
o_ sa s- 0 1 |M /1 9 H o m o _ s a p ie n s -L H Y 0 1 X 2 -6 u /1 -8 82
A-

H om sa p ie n -A A D B B S L 0 1 -2 5 4 |M H CC /1 -8 9
o_ ns s -A A0 25 H -8 9 H oo m o _ s a p ie n s -L MK H X 0 1 -E X 2 -7 3 |P se u 2
E

H o m _ s a p ie s a p ie n s -A B B K 0 2 -2 0 0 |M H C /1 8 0 ,0 H o m o _ s a p ie n s -L B B 0 -E X 2 3 |P /1 -8
21
2

/1 - - 8 9 -8 3 s e u /1 2
0,

o mo sa
19

_ n -2 |M C H ns MB 1 -E
0,

m o IQ 1 6 1
H oo m o __ s a pp ie n s -L B H A 0 1 -E X 2 -9 |P s e u -8 2
0,

Ho ie
H o m o _ s a p ie n s -L A D D 0 1 -1 8 7 |M HH C /1 - 8 99
1
0,

19
0 ,2
2

14
0, ,17

p 0 5 /1 H m sa ie -L Z X 3 |P /1
0,

H o m o _ s a ie n s -A A D C 0 2 - 2 2 0 8 |M H CC /1 - 8- 8 9 0, B 0
H o m o _ s p ie n s - L L P 2 -E 2 -1 0 s e u /1 -8 2
-8
17

14
0

m p -A H - |M 0, H o m o _ s a p ie n s - D N 0 1 -E X 2 -1 3 |P s
0 ,2,2

H o o _ s a ie n s A M Y F 0 2 - 1 0 0 |M H C / 11 - 8 99 2
H oo m oo _ s aa p ie n s - LL R IL V 0 1 - X 2 -1 2 5 |P s e u /1 -8
0

m a p s - S A 1 0 H / 8
H o o _ s p ie n n s - J B A 0 0 1 - 98 0 |M H CC / 1 - - 8 9 H m _ p n B 01 EX 37 e 2
m
H o o _ s a s a p ie s - L M M B BX 0 1 - 7 0 |M
- H /1 89 H o m o _ s a p ie n s - L L Q - E X 2 - 1 |P s u /1 -8
|M HC /1- 89 H o m o _ s a ie n s - L D O 0 1 - 2 - 1 5 4 5 |P e u /1 -8 2
7

m n L 1 H o m o s a p ie s - R U C 0 E X
H o o m o _ a p ie ie n s - - L K H Y 0 1 - 6 0 0 | M H CC / 1 - - 8 99
0 ,1

7 seu 2
H o m o _ s a p ie n s J S A L 0 3 - E 2 - 1 |P s
0 ,1

H o_s ap ns LKH W0 1-5 0|M H C/1 -8 9 /1


H o m o _ s p ie n s - A B F 1 - E X 2 7 0 |P e u /1 - 8 2
4

m _s ie - H 0 4 |M /1 8 H o o m o __ s a a p i e n s - A B B A 02-E X 2 - - 1 7 6 s e u - 8 2
H o o m o _ s a p ie n s - L K L J I I 0 1 - - 3 0 | M HH C / 1 - - 8 9 9 C o s p n - S 0 X 1 |P s /1 -
H mo sap ns ns- BC 01 -20 |M HC1/1 1-8 89 H o m _ a ie s- AA L0 1-E 2- 93 e 8
H o m o _ s a p iea p ie - L M B D E 0 1 1 9 5 2 | M r c o C / / 1 - - 8 9 9 H o n s o _ s a p i e n s A M D B 1 - X 2 1 |P s u /1 2
- H o m e s a p i n - L Y 0 E X 2 2 - 2 3 1 |P s e u / 1 - 8 2
H o m o _ o _ s ie n s s - L M C Y 0 1 - 2 2 - c e | M HH C / 1 1 - 8 8 9 H o m o_ ns p en s- RU H 2- 1 e
o o m a p en s-L UM 02 -I 03 |M HC / 1- 89 H o mm o o _ s a u s 1i e n s - A A D D M 0 0 2 - E X 2 - 2 3 4 | P s u / 1 - - 8 2
H H _s pi n R B C 2 7 M C / - 89 om o _ sa pi 0 s- D F 1 EX -2 |P eu 82
H - 4 | H C /1 -
e
mo _sa api s-L AD M 01 -1 29 |M H C /1- o_ _s sap pie ens - e x AAD DF0 02- EX 2-2 37|P seu /1-8
H o mm o o _ s a p i ei e n s s - L B L N V 0 1 - 1 5 6 1 | | M H C / 1 8 9
_ s a p ie e n s - L M A 0 0 2 2 5 8 | A C / 1 - 8 8 9

om o _ sa pi n - B P0 1 4 M C/
H o m o_ sa p en -L D 0 1-1 7 MH /1-
n -L B 1 -21 9| MH - C /1 -8 9

H o m o o _ s i e n s - A 0 — D D 0 1 - 1 2 7 7 | MM H C sa ap ien ns -L 2 - H D 2- EX 2-2 65| se /1-8 2


mo _s pi ns MB AF 2- 14 H L MH eu /1 1-

H o m o_ sa pi ns -L IL 0 -16 8|
H o m o _ o _ s ie n s s - L KK H XB 0 1 - 1 0 5 3 | MM H C / / 1 - - 8 9 9

p i i e n s - - L OR U L 0 1 - E X 2 - 3 7 7 | P s e u / 1 2
o_ _s sap pie ens s-L DO Z0 1-1 142 |MH HC/ 1-89 9

Ho om sap pien us1 AA NV P01 1-1 11 5| |MH


H o
Ho omo _sa pie s-L -JS H0 01- x 2 - 24| |Ps HC C/

H o mm o _ s a p i i e n s - L L O E Q 0 10 1 - 1 9 | M H C / 1 / 1 - 8 9
sa ap -L H 0 -9 | H C 1 89

en s L Q M A - E 2- 18 Ps u -82
H mo _sa ien ns MY IL - e -2 13 |M H

sa ap i e n s - LMM B C 0 2 - 1 3 4 | M C / 1 1 - 8

H o_ sa ns s- LD BL Q0 2- 11 0
H
m o H o m o _ s a p ie n ie n s K H WY 0 1 1 - 8 5 | MM H C / / 1 - 8- 8 9
H o m o s a p p i e s - A - L Rs 1 0 0 2 2 - 3 1 - 3 8 | M

s - - L MC Y O 0 1 E / 1 X 2 - 3 1 | I n e u / 1 - 8
H o o m o _ _ s a a p i e n s - L B Q M 0 1 - 1 0 7 |M|M H CC /1 - 8 9
H C 1 9

pi ien ns -LK B A0 3- 22 |MH HC -89 9

m o_ se ien s- -L L Z0 3- 11
H o a p ie s a p n s L M L J I I 00 1 - 6 7 5 | M| M H C / 1/ 1 - 8- 8 9

H o m o _ s p ie n s - L O U L 2 - 2 2 1

LM B E0 01 -2 - 8 32 9|I de /1- 2
en s -L H B 1 11 |M C /1

o
H m on ap ien ns -LB H C0 1-
H o m o _ s a p ie n s L R F 0 2 - 2 8 |M C /1 - 9
- A ie n s L M C 0 1 - 5 5 | M H C / 1 - 8 9 9
Ho _ sa en n u DB X 0 10

|
H o m o s a p ie n s - S A B 0 - 2 3 |M H /1 - 8

B D0 1-1 -11 MH 9 0|I nde t/1- 82


s- -LK KH X0 01- -99 3| HC /1- -89

Ho C _s ap pie ns -LB O A0
H o m o _ s a ie s - J A D 0 1 3 9
m o H o m s a p ie D D F - L C B D 0 1 - 4 5 5 | M H C / 1 - - 8 9
m o_ pi pie n s A -E UM 1-

LJ H Y 1 89 |M MH /1 89
_ a e s R K
o s p n -
H o m o _ s a p ie n s - A D D 1 - 2 0 |M H C /1 -8 9
|M H C / 1 - 8 9

o s D C 1 9 |M C nd t/ 8
Ho om _sa sa n s e s-A 01 LR L0

m o_ sa pie ns -L BB
H m _ p n AA P0 24
C - 89

01 -2 |M et 1-8 2
II0 W 01 -79 |M H C -8

/
H oo m oo _ s aa p ie n s - A E K L 0 1 - 4 1 |M H C /1 -8 9
o _ s s -A E n s - L Q N 0 2 - 3 02 5 |M H C / 1 - 88 9

-3 9| H HC 1-8
H o o_ o en DC s- S

Ho om o_ sa pie ns s-A
H o m o _ s a p ie n s - B S 0 1 -2 4 2 |M H C /1
K P O Q 1 - 2 9 |P H C / 1 - 9

/1
-8 2
1- 01 -69 |M HC C/1 /1-8 9
0 1 -E O 0 7 0 s e /1 8 9

H o m o _ s p ie n s - A B B A 0 1 -2 3 |M C /1 -8 9
C
m m C pi A n B

H o m p ie n s D D F E K P X 2 -3 1- 6 |M H C /1/1 - 8 9

H o m _ s a p ie s -A Q P 2 -2 4 |M H -8 9

9| M C /1 9
9

H om o_ sa pie en
a -A pie s-A

49 -5 |M H /1 -8 9
H o m o _ s a p ie n s -L O D F 0 -2 4 4
1 |M u - 8

M HC /1 -8 2
H o m _ s a p ie A Z Q 0 2 -E X 21 -2 2 7 |I n d eC /1 - 8- 8 9

H o m o _ s a ie n -A D H 0 2 5 |M H C C /1 -8 9
/

H mo sap ns MY

H m o_ a pi
-2 8 |M H t/ 1 -8 9

|M 9|M H C/1 -8 9

H / -8 9
C /1 9

H oo m o _ s a p ie n s -A IQ K 0 2 1 -2 4 7 |M H C /1 -8
o_ ien sa ien

o s
H o m o _ sa p ie n s -L IQ M 0 1 -2 8 0 |P n d e t/ 1 -8 9

H
0 2 -2 6 |M s e u /1 -8 9

L H

C 1- 9
H o m o _ ss a p ie nn s -L O Q L 0 1 -2 2 |M H C /1 -8 9

H om o_ sa
- 5

H H C/1 -8 9
6 8 |M H C /1 -8 9
H

H o m o _ a p ie s -L O Q 0 1 -2 5 |M H C /1
s- LO Y H 0 2 -2 6 6 |M H C /1 -8 9

/1 8
om ap o_ sap

H o m o _ s a p ie n L P X P

C C - 9
C /1 -8 9

H o m o _ s p ie n s- LA Z Q 01 /1 -8 9

H m o_
9
-8

H o m o _ sa pi en s-

-8 9
01 -2 |M H C -8 9

/1 /1 89
H o o_ sa 10 -e x2 O 01 -8 |M
- L O0 2 - E E 0 1 - 3 5 |M

/1 -8 9

m
9

H om se ns us en s- LO Q 01 -1 7| M HC /1 -8 9

o
C on o_ sa pi s- LC YE 1- 27 |M HC
9
ap ien s-A DB 02 -2 35 |M HC /1 -8 9
s om _

H om o_ sa pi en -L MB D0

-8 -8

H om 9
32 |M HC /1 -8 9

Ho m _s ap ie ns MB C0 1- 37 |M /1- 89
Ho mo _s pi en s- AD KP 01 -2 60 |M H C /1 -8

/1- 89

Ho mo
s

o _ p ie -L
2 H

C/1 -89
9
o_ H mo

HC /1- 89

Ho mo _s
-8

HC/1 -89

Ho mo _s
C/1-89
Homo_sapiens-LDNV0 1-139|MHC/1-89
01-164 |MHC/ 1-89

Ho mo _sa
s

Hom o_s apie

9 9
Hom o_sa pien
Homo _sapi ens-LLMBA01 -97|MHC /1-89
A A D DN 01 -2 62 |M H C /1

Homo_s apiens-
Homo_sapien s-LBHZ02-11 9|MHC/1-89
Homo_sapiens-LBLP01-131|MHC/1-89
Hom o_sa piens 01-E X2-1 90|P seu /1-8

H
o

/1
H

-
1
s

L
DF 02 -2 |M H C
8 |I

HC

-
B

2-2 05| MH
-

Homo_sap iens-LRIL 01-151|MH


Y

|M
7

ap
om

64

Hom o_s api ns- AB BA 01 -22 9|M

ap ien s-LLK HW 01- 57| MH C/1-89


H o m o _ s a p ien s -L R U -E X 2 -2
H

-LDO C03- 172|M


61
Ho Ho

56

s
-

pie ns-
o_ sa ie n s- A MP X P 0 1 -2
0
H

ien s-L
-

7
Ho om ap
o

8
BS L0 1-2

ns- LKH Y01


ie

s-LK HX0 1-77


1
p

o _ s ie n s ie n s -A
0
_sa o_s ns

ens -JS AF0


ie

N0

HC -8
-H LA
JII
Q

MBB0 1-87| MHC/


Homo_ sapien s-LBLQ
p
ns

s- L

-2 4
_
H

Ho m o_ sa pi en s- AE
a

01
o

AA

H
m

-8
-A
ap

-L

-2 78
-47
H om o_ sa pi en s-

-G
n

ns-L RUL
_

H
C
ap ien s-

9
pi en
ie
mo

9 |M
_s

-67 |MH C/1

|M HC
p

8
pie
Hom sap

9
Hom om

H
o _ sa

H C /1 9
|MH C/1- 89

/1
o

-8
a

sa

9
Ho

HC /1- 89
Ho mo _sa
H

Ho mo _s
_

o_

Hom o_s apie

/1 -8
o

-8 9
H om
Ho

-8 9
H om

9
1-89

-89

Figure 4: The phylogenetic tree of 333 EX2 sequences of MHC-I found with our MHCfinder algorithm from the
42 WGS datasets of Homo sapiens. Probable viable genes are colored blue, pseudogenes red, and
indeterminate genes light blue. The sequences in brown are consensus sequences
used to identify clades. The nodes of the principal clades are labeled. The tree was constructed after
aligning sequences with ClustalO and using PhyML (part of the FastTree software (Price et al., 2010)) with the
gamma parameter, WGS matrix, and 500 bootstrapped samples.
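
The tip labels in the tree follow a pattern such as Homo_sapiens-LBLP01-131|MHC/1-89, where the suffix after the vertical bar gives the status of the sequence (MHC, Pseu or Indet). As a minimal sketch, and not the MHCfinder implementation, the following Python snippet shows how such labels could be parsed and mapped to the colors described in the caption. The regular expression, the function names classify and tally, and the color strings are assumptions introduced here for illustration only.

```python
import re
from collections import Counter

# Assumed label layout, inferred from tip names such as
# "Homo_sapiens-LBLP01-131|MHC/1-89" and
# "Homo_sapiens-AMYH02-EX2-287|Pseu/1-89".
LABEL_RE = re.compile(
    r"^(?P<species>[A-Za-z]+_[a-z]+)-(?P<assembly>[A-Z0-9]+)-"
    r"(?:EX2-)?(?P<index>\d+)\|(?P<status>MHC|Pseu|Indet)"
    r"/(?P<start>\d+)-(?P<end>\d+)$"
)

# Colors as stated in the caption; the exact color names are illustrative.
COLOR_BY_STATUS = {"MHC": "blue", "Pseu": "red", "Indet": "lightblue"}
CONSENSUS_COLOR = "brown"


def classify(label):
    """Return (status, color) for one tip label (hypothetical helper)."""
    if label.startswith("Consensus"):
        return ("consensus", CONSENSUS_COLOR)
    match = LABEL_RE.match(label)
    if match is None:
        # Labels that do not fit the assumed layout are left unclassified.
        return ("unparsed", "black")
    status = match.group("status")
    return (status, COLOR_BY_STATUS[status])


def tally(labels):
    """Count tip labels per status (hypothetical helper)."""
    return Counter(classify(label)[0] for label in labels)


if __name__ == "__main__":
    tips = [
        "Homo_sapiens-LBLP01-131|MHC/1-89",
        "Homo_sapiens-AMYH02-EX2-287|Pseu/1-89",
        "Consensus10-MHC-I-p1/1-89",
    ]
    for tip in tips:
        print(tip, classify(tip))
    print(tally(tips))
```

A tally of this kind gives per-status counts for any list of tip labels, which can be checked against the totals reported in the text.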

sis) and 15 (in C. capucinus). Lemuriformes and Tarsiiformes represent the oldest extant
primates that diverged from other primates approximately 76 Mya. Compared to other
taxa of this study, few exons were found from the datasets of Eulemur species, perhaps
because these sequenced assemblies are still in a nascent stage.

Using MHCfinder, 20 WGS datasets from the Simians family (not including humans)
were studied, from which 210 viable genes were found. Of the 3 MHC-I exons stud-

[Figure: phylogenetic tree of primate MHC-I EX2 sequences, including Simians and prosimians such as Otolemur garnettii, Microcebus murinus and Tarsius syrichta; clade labels legible in the extraction include HLA-E, HLA-F, HLA-J and MHC-I-platy-1.]

H / -8 9

-A B 2 A C A EX X - 6 2- 10 Ind et -8

i n ac b u _ f a a_ m u st ty s - A H K 0 1-
1 |

A o u s a _ u s c h u 2 - E E X X 2 - 8 |M
M ic r o c e b u s _ m u r e t t ii - - A A Q R 00 3 - E- E X 22 - 62 | M H CC / 11 - 8 99

E
i - A 0 1 -E X - E 2 9 | I n / 1 - / 1 - - 8
ac em u la -L G U 0 01 EX 2

A o t r i l l s c _ ja c c - E X S U 1 - 4 - E X 2

B GA - H Z 0 1 2- 2 - 9 | M 18 |Ps et /1-8 9

R h M o l o c a c a c a _ e m e s _ a e u s - A A E HQ I B G 0
ac n _m ic us -A N E A - X

A o a r ix _ ja 1 0 a - J A 0 Z 0 - E 8 9
HC /1 -8 9
/

A - E X2 2 - 2 X 2 - - 4 | de 89 89 9
G 0 L 0 - 7 | H |M e /1 9
M a_ a c in is AA SU I 01 -E

G o m it h r ix u s la t t B G A C 0 1 /1 - n d e t/ /1 - 8
6
|M HC /1 -8
d

N a ll it h n s u i- A - A D A - G - 6 |I H C 9
_ m r in s - A D 0 3 E X - 0 | M C / 1 - 8 9

C ca a ac n u a bi - A L t/1
A 1 A - 4- E X |M M H C / H u/ -8

Q X -4 | 1 M
s
C a ll s e a _ m e li t e s - A A L A X 2 - 2 |M /1 -8 -8 9 -8 9
ac ac fas uc ns a- -J AQ Z 01

M a u s _ m u r in u s - A B D CC 0 2 -- E X 22 - 9 | | P s eH C / 1/ 1 - 88 9

a M a c a_ eb a b u t t a s- JZ
01 -E C E 2 H C 1 C 1- 9

R 2 - |M M | M H
C n c
-8 M u - 9

-8
-

C oa c ag o _ ag lo dd y te- e x 2 A 0 1 3 - E X -2 |M H |M H n d e t/ 1 -8 9
M a a c a c a c a _ m ri n u s - A B D C 0 22 - E XX 2 - 6 |I n dH C / / 1 - 88 9

- 03 1 H H C
ac ac _ p ie tt ta s- Z Z

M n
M cac coc s _ s _a u l a aeu tys 9
M n r o lo 0
1 9

-E X / 1 X 2 - 6 C/ / 1 - 8 9 /1- 89

- E 1 | I HC C / 1 C / / 1 - 8
M M aca _ca liv ula lat ari -AH HZ

P oa n __t tr o gs u s 1li i- A BC A B D0 2 -E 4 -E X X 2 -1 |I n d e 1 -8 9
a _ fa m u la la tt a -J B D CC 0 2 -- E X 22 - 0 |M H Ce t / 1 - - 8 9
ta J U A - X

P a n s e n a b e la - J F E C Z 0 0 1 -E X 2 -2 d e t/ 9
s c ic tt a -A S U 0 2 -EE X 2 - 5 |I n H C /1 - 8 9

X 2- - 8 -1 1 | I n 1-8 - 8

r u io
P o n o _ o r il s -A -A A Z L F 0 1 -E 4 |I n t/ 1 -8
A N E 0 3 X - 4 |M d e /1 - 8 9

a a
M Ce e b ap a _ m sab s_ X n d /1 - 8 1 - 9
u ul is A

C o n g la _ g is c u te s a -J Q IA X 2 -2 n d e
1 89

2- 8| 9 | d 9 9
M ic r o c e b u s g a r n n e t t iiii - A AA Q RR 0 33 - E XX 2 - - 3 | M

P o r il p a n lo d y e s tr in s -A 0 1 -E -2 3 |I d e t/ 1 -8
e s tr n s e IA 0 X 2 -2 |I H C /1 --88 9
-

G a n _ tr o g e m u la ri N U -E X 2 -7 |I n
9

2 - e -8 9 8 9
3| In
M b ri_ a_ _m cic b is

P _ _ n c ic -A A 0 3
2 -6 H C 1 -8 9

o c P a c s_ bu
2

P aanc a c aa _ fa su la tt a -J S U EU 0 1 -E 2 -5 |I n d /1 -8 9
/1 9

4| t/1 9
Ce imi cac aca as anu ub

In de
il
M a c a c a _ m la tt a -A A N 0 3 -E X |P se u 1 -8 9
X 2 -1H C - I H C /1 -8 9
-E X 5 |M d e t/ -8
M ic r o c e u s _ mm u r inin u s - A A Q R 0 3 - E X 2 - - 1 | M

or
M a c a c a _ m u la tt a S U E X 2 -8 |I n d e t/
5 |M -p2/1 -8 9

u e
M a c a c a _ m u tt a -J Z 0 1 -E 2 -2 3

a c eb oc P -8
-89

de t/

hl
M a c a c _ m u la -A H Z 0 1 -E X X 2- 10 |P u/ 1- 89
5 |M e t/ 1 -8 9
n

C o ri s -A QU 0 1 -E-E X 22 -1 |M H C t/

M a c a c a n u b is -A H Z Z U 01 -E 1| P se
H C /1 9
/1

se 9
0

M
-8 9
Sa Ma ac a_ io_ _a

L G 0 -E X 2 -1 8 |I n dH C /1 -8

P a pp io _ a nm
el la na H ZZ 01 X 2- 15 H C /1 -8 9

b y s -H -E

t/1 1-8
P a ac a_ ul at ta -Jph
-E X 2- |M H C -8 9
7

C M c rc
M ac a_ m eu co
M H C /1 -8 9
E

M ac ac

u/
/1 -8 9
|M

M an
/1 -8 99

o
M ac p io

Pa pi
Ca llitith
/1 -8 99

or Ce
Pr op iri_ bo liv ien sis-LV WQ 01 -EX et/1 -89

-8 9
Sa im ca pu cin us 01- EX 2-5 |Ind
1-8 9

1-
Ce bu s__ab
-8

89

Po ngo _ne mes trin a-J ZLF -EX 2-15 |Ind et/1 -89
X2- 0|M C/1 -94
ac P ap

Mac aca ncym aae -JYK P01


eu/1 -899
0

Aotu s_na
2

/1

Aotu s_na ncym


HC /1 -8

t/1-8 9

Aotus_ nancy maae-


1-89

Propithe cus_coq uereli-JZ KE01-EX

hl
KE01-EX 2-3|MHC /1-89

Propithecus_c oquereli-JZKE 01-EX2-0|MH C/1-89


Otolemur_garnettii-AAQR03-EX2-11|MH C/1-89
C/1-89
Propithecus_coquereli-JZKE01-EX2-5|M HC/1-89

9
io

s C

89
/1-8
X

M et
de t/1
P

C
f

JYK P0 1-E X2 -0| Ps eu /1-

C
a

_a
dr ill usub is -A HZ
|M
ac us b m

H /1
o_ an ja cc hu s- AC
|M H
u

RT 02 -E -1 5| M HC

C/

C -8
hr ix_us _c oq ue re li-J
2- 4|M HC
-
us BD 0 E

BRT02 -EX2-1 |Pseu/


Aotu s_na ncym s-LV WQ 01-E X2-2 1|PsHC

/1 9
3

us _r ox nu bi s- AA B R 01 -E-E X 2 -11|M
-1
o

01-E X2-3 |Inde


Ao tus _na ma cac o-L GH X0 1-E X2 -0| MH

ec
|MH
X2 -09||In

-8
u
Propithecus_c oquereli-JZKE 01-EX2-2|MH
X2 -5 |M

G
pi th ec u s_ sa ty s- JZ Z Z 0 1 X 2 -1
M

b
C e rca p io _ a n ll a n a -J-J Z L F 0u s 1 0 M

9
ul at ta S U E 03-J YK Q 01 se u/ 1- 89 -8 9
P ap ioox el la na -Js- A Q IB 01 -E X 2 -2

_l
R
Q

is
elii -AB GA
-25
t

1
1 -E
O t t o l e mm u r r _ g a

0
R hi no h lo ro ce b ce b u s_ ab is -A H B R 0 1 -E

Ta rs ius sy ric ht a- ABLG 01 -E X2

X |M
T0 2- EX
A

X2
Ta rs iu us _a ty s- -J AB R 01 -E

-A
n

aae- JYKP 01-E X2-7

89
_

1-E
-JY KP0 1-E
s
O tole mu

X
AN
-A

n
u

2
Z0
ae us 2- 12 |P

2
-A GC E0
O ole

Eu lem ur_ vif ron s-L GH W0

H H

d /1
JYKP0 1-EX2 -6|MH C/1-89
A

1- EX
Tarsiu s_syri chta-AJYKP

FV 01 -E 1- EX 2- 4|M 1-8 9
ta -A BR
Ot

C/
_

a
u

u la

-2
-E
u

JZ
_

in

ZK E0 X2 -9| MH C/

1-8
u
o

Propithe cus_coq uereli-JZ

X2

C C /1
baeu

X 2-

1-8 9

-8
/1 -
01- EX2 -20
aae-

9
aae

1-E
ro c b u s

ncy ma ae-
u

9
7 |I
o

xe

_s yr ich
hec a_nem
b

8
X2

9
_
ca
eb

_a

cym
cinu
ro
O

2-1 9|M HC

-1 3| M HCHC /1- 89

e t/ 1
-E
us_
cac

us _r

se u/
eb
s_

X2

t/
fla

Aot us_ nan


Ceb us_c apu

|MHC /1-89
c

1
-8 9
2-1|MHC /1-89
M

R h in M a c a

Ce rc oc
P

-1 |In de
Eu lem ur_

|MH C/1 -89


pi th ec

1- 89

9
o p it

/1
/1- 89
C

t/1 -8 9
R hi no

Figure 5: The phylogenetic tree of 417 EX2 sequences of MHC-I found in primate (Prosimians, Platyrrhini,
Cercopithecus and Hominidae) WGS datasets with the software MHCfinder. Primate orders are distinguished by color
codes, indicated in the legend. Clades that are suggestive of representing a gene or group of genes have been
collapsed. Each collapsed clade is labeled with its human isoform ortholog; clades not containing human genes are
assigned new names (e.g., MHC-I-p1, etc.). The nodes of the principal clades have been labeled. The tree was
constructed after aligning sequences with ClustalO and using the phyML (part of the Fasttree software (Price et al.,
2010)) with the gamma parameter, WGS matrix, and 500 bootstrapped samples. The consensus sequences for identifying
clades are marked in black.


ied (i.e., EX2, EX3 and EX4), EX2 contains the most information to discriminate genes for classification. A phylogenetic tree (Figure 5) was constructed from 410 EX2 exons found from the primate WGS datasets, consisting of both viable (those having exons in tandem) and nonviable genes. In the tree, clades were collapsed to improve the visualization of results. Human sequences from the HLA isotypes HLA-A, -B, -C, -E, -F, -G, -H, -J, -V and -L genes were aligned with the sequences from non-human primates in order to associate and identify clades homologous to those in humans. Apart from these homologous human clades, other clades were also identified. Three clades containing only sequences from Cercopithecus species are designated MHC-I-cerco-1, MHC-I-cerco-2 and MHC-I-cerco-3. Another three clades with sequences exclusive to Platyrrhini are also identified, indicated by MHC-I-platy-1, MHC-I-platy-2 and MHC-I-platy-3. Also, there are three other clades that have sequences from at least two primate orders and have been labeled MHC-I-p1, MHC-I-p2 and MHC-I-p3. The MHC-I-p1 clade contains the largest number of sequences, with representatives from all primate species except Hominids, perhaps suggesting a concrete function that was made superfluous in the differentiation of Hominid species.

Most of the sequences from the HLA-ABC clade (i.e., consisting of HLA-A, -B and -C sequences) are from Hominid species. Also, this clade contains a single sequence from Cercopithecus and seven sequences from Platyrrhini. HLA-A, -B and -C have a common origin and must have been generated after the separation of the Hominids from the Cercopithecus, as previously described by other authors (Piontkivska & Nei, 2003). The sequences from these clades constitute the classical HLA-Class I genes. Therefore, it is likely that Cercopithecus and Prosimian species developed other genes that took over the basic functions of these classical HLA-Class I genes found in Hominids.

From the tree of Figure 5, the HLA-E and -F clades contain sequences from all the primate orders except Prosimians. HLA-G, -H, -J and -L clades have sequences from Apes and Cercopithecus. With the analysis of MHCfinder, most of the EX2 sequences of Cercopithecus in the HLA-G clade are identified as pseudogenes, consistent with experimental results (Castro et al., 1996). All the sequences of the HLA-V clade are pseudogenes and are only found in Hominids.

Three clades (indicated by the name cerco) are found that contain sequences only from Cercopithecus species. These sequences were generated after the separation of Old World monkeys from the Hominidae family. In a similar way, three clades contain sequences only from Platyrrhine species (indicated by the name platy).

Another three clades exist that are composed of sequences from at least two primate families, but lacking sequences from Hominids. Of these, the MHC-I-p1 clade (where p is used to indicate primate) is the largest. In particular, the MHC-I-p1 and MHC-I-p2 clades contain sequences from Prosimians, Platyrrhines and Cercopithecus, while the third clade of this group lacks sequences from Cercopithecus.

3.1. Allele assignment to clades

Several studies concerning the gene alleles of MHC-I have been described in non-human primates and are available at the IPD-MHC database (www.ebi.ac.uk/ipd/mhc) (Robinson et al., 2013; de Groot et al., 2012). These alleles were aligned with the germline sequences to classify alleles as classical MHC-I (with allelic variation) or nonclassical MHC-I (without allelic variation).
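
The idea of relating database alleles to germline genes can be illustrated with a deliberately simplified sketch. It is not the procedure used in the paper, which relies on multiple alignment and a phylogenetic tree; here each allele is simply assigned to the germline sequence with the highest fraction of identical positions, and alleles are then counted per gene. All names (`percent_identity`, `assign_alleles`, `alleles_per_gene`, `gene_1`, `allele_a`, and the short toy sequences) are hypothetical and chosen only for illustration; the sequences are assumed to be pre-aligned to the same length.

```python
from collections import Counter
from typing import Dict


def percent_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned sequences.

    Positions where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return sum(x == y for x, y in compared) / len(compared)


def assign_alleles(alleles: Dict[str, str], germline: Dict[str, str]) -> Dict[str, str]:
    """Map each allele name to the germline gene with the highest identity."""
    return {
        name: max(germline, key=lambda gene: percent_identity(seq, germline[gene]))
        for name, seq in alleles.items()
    }


def alleles_per_gene(assignment: Dict[str, str]) -> Counter:
    """Count how many alleles were assigned to each germline gene."""
    return Counter(assignment.values())


# Toy data (hypothetical, not taken from IPD-MHC or from the paper).
germline = {"gene_1": "GSHSMRYFYT", "gene_2": "GSHSLKYFHT"}
alleles = {
    "allele_a": "GSHSMRYFYT",
    "allele_b": "GSHSMRYFST",
    "allele_c": "GSHSLKYFHT",
}

assignment = assign_alleles(alleles, germline)
counts = alleles_per_gene(assignment)
print(assignment)
print(counts)
```

Under this toy scheme, a gene that collects many distinct alleles would be a candidate classical gene, and a gene with one or none a candidate nonclassical gene; any threshold for that call would be an additional assumption.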

M. mulatta has been extensively studied to reveal such allelic variation of MHC-I. We obtained allele sequences of M. mulatta from the IPD-MHC database and aligned them to the germline sequences found by MHCfinder. Figure 6 shows the resulting phylogenetic tree. To reduce sequence redundancy, only one representative allele from each allele lineage was included in the tree construction. The Mamu-A alleles belong to the genes that define the MHC-I-p2 and MHC-I-cerco-3 clades. In previous publications, difficulties have been described in assigning specific orthology to the Mamu-B alleles (Liu et al., 2013). The tree of Figure 6 resolves this difficulty by showing that Mamu-B alleles correspond to genes belonging to the MHC-I-p1, MHC-I-cerco-1, MHC-I-cerco-2 and HLA-H clades.

C. jacchus is considered the reference organism for studies of allelic variation of MHC-I genes in New World monkeys. Many published studies (Cao et al., 2015; van der Wiel et al., 2013; Kono et al., 2014) have concluded that the MHC-I alleles in C. jacchus are orthologous to the HLA-G gene. Figure 6 shows the phylogenetic tree of germline EX2 sequences obtained from MHCfinder together with allele sequences from the IPD-MHC database. The tree confirms the results of these studies, since the majority of alleles are from gene C. jacchus-ACFV01-13|MHC belonging to the HLA-G clade. However, the alleles of the lineage Caja-G*18 belong to the MHC-I-platy-3 clade.

4. Discussion

A bioinformatics program, MHCfinder (freely available at http://vgenerepertoire.org/), was developed that identifies the MHC-I exons (EX2, EX3 and EX4) from WGS datasets. The algorithm determines the in-frame exon nucleotide sequences, deduces the amino acid sequence, and transforms it to a 400-element feature vector. Each element of the feature vector represents the frequency of occurrence of each AA and pairs of AA (i.e., one of the 20 × 20 possibilities) within the protein sequence. A supervised learning procedure is used to train a machine learning classifier to recognize exons homologous to those known in humans. We found that feature vectors based upon the simple AA frequency-of-occurrence transform have a higher classification accuracy (attaining a predictive precision of 98%) than other more sophisticated transforms based upon positional physicochemical properties. If the exon architecture is known, small modifications could render the algorithm a more general gene finding method, thereby facilitating rapid identification of gene-specific exons from any species whose genome has been sequenced.

With MHCfinder, exon sequences of MHC-I were obtained from 30 primate WGS datasets. The sequences have been made available in the repository vgenerepertoire.org. The program identifies individual exons, referred to here as viable, meaning that they have a valid reading frame. If the exons are found with the canonical MHC-I exon/intron structure (i.e., they are arranged in tandem, EX2-EX3-EX4, within the same WGS contig and have a valid intron spacing), then MHCfinder considers this a viable gene (i.e., one that is likely functionally expressed). Nonetheless, problems associated with WGS datasets (e.g., incorrect sequence assembly or low coverage of certain regions) may result in an underestimate of the total number of viable MHC-I genes found by MHCfinder. In WGS datasets with relatively high coverage and N50 > 15k (which is the case for most datasets we used), the results of MHCfinder agreed with annotated gene results, when available. While the use of WGS

[Figure 6: phylogenetic tree image. Legible collapsed-clade labels: HLA-ABC, HLA-E, HLA-F, HLA-G, HLA-H, MHC-I-cerco-1, MHC-I-p1 + MHC-I-cerco-2, MHC-I-p2 + MHC-I-cerco-3, MHC-I-platy-2, MHC-I-platy-3. Tips include Mamu-A and Mamu-B alleles, Caja-B and Caja-G alleles, consensus sequences, and germline EX2 sequences from Macaca mulatta and Callithrix jacchus; individual tip labels are not recoverable from the extracted text.]

Figure 6: The phylogenetic tree using germline EX2 sequences from five WGS assemblies: two from C. jacchus and
three from M. mulatta. The allelic sequences of M. mulatta (blue) and C. jacchus (green) obtained from the
IPD-MHC database were aligned with the EX2 WGS sequences. Clades are collapsed to improve the visualization of
the results. Also, consensus sequences (brown) were aligned with the EX2 sequences to identify clades.

datasets by MHCfinder cannot guarantee the exact number of MHC-I genes within a particular species, it can support claims in studies spanning several species or taxonomic orders.

For the first time, a large number of germline MHC-I gene sequences have been obtained across a wide range of primate species. Until now, sequences were obtained from the genomes or cDNA sequences of only a few specific primates (Heimbruch et al., 2015; Uda et al., 2005; Yan et al., 2013). Moreover, this work can shed more light on

previous studies of nonhuman primates that have identified expressed MHC-I alleles, attempting to classify these sequences without exact knowledge of the genes present in these species.

Given the ability to construct phylogenetic trees from the MHC-I exons of various species, it is possible to group these genes by evolutionary origin and to infer orthologs and paralogs more precisely. Here, sequences of exon EX2 were used to construct such trees, EX2 being the most discriminative constituent of the MHC-I gene, while EX3 and EX4 alone cannot resolve clades. The results indicate that the diversification of these genes has been driven by birth and death processes, thereby explaining the large number of pseudogenes.

The results presented here demonstrate that the classical HLA-A, -B and -C genes found in Hominids were generated recently, coinciding with the separation from Old World monkeys. In non-Hominid primates, orthologs to these genes can no longer be found; instead, there are paralogs that descend from a common ancestor. The sequences from Cercopithecus are practically absent, except for one sequence from M. mulatta, which is of interest since it corresponds to a gene having a large amount of allelic variability. Like these genes, which are quasi-specific to Hominids, there are other genes that are specific to Cercopithecus and Platyrrhini. Taken together, the data from this study provide evidence of rapid birth and death processes. The absence of orthologs to HLA-A, -B and -C in Cercopithecus, and to a lesser extent in Platyrrhini, may explain the generation and/or expansion of other clades consisting of genes that have the functions of the classical MHC-I. This hypothesis is supported in several cases, as seen in M. mulatta, for which an allelic variation is seen that is not present in Hominids, while in C. jacchus a gene was identified belonging to the HLA-G clade that is responsible for most of the allelic variation.

The presence of orthologs to the nonclassical genes (HLA-E, HLA-F, HLA-G and HLA-H) in non-Hominid primates indicates an origin prior to the separation from New World monkeys. In Prosimians, no orthologs were found, which raises the question of whether the separation between classical and nonclassical MHC-I is general to all mammals or specific to Platyrrhini and Catarrhini. Similar results have been reported by other authors (Piontkivska & Nei, 2003; Fukami-Kobayashi et al., 2005).

The birth and death processes observed in the data are similar to those that occur in the variable (V) regions of immunoglobulin (IG) and T-cell receptor (TCR) genes. In the case of MHC-I, the situation is more complex because a viable replication process involves at least six separate exons. The birth and death process must be related to the adaptability and survival of the species. This evolutionary mechanism creates greater genetic variability in genes that must present antigen, as well as in those that must recognize antigen. Nonetheless, it is still unknown why some species possess more V genes and/or MHC-I genes than others, or whether the absolute number of such genes provides the species with immunological advantages. More studies are needed to clarify whether correlations exist between the numbers of these genes and to establish whether coevolution processes are at play.
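
The idea of grouping EX2 sequences with their closest human counterparts can be illustrated with a minimal sketch. This is not the pipeline used in this study, which relies on phylogenetic trees; it substitutes a much simpler nearest-reference assignment based on the proportion of differing sites (p-distance). The sequence fragments, the names `ref_A`, `ref_B`, `ref_E`, `query_1`, `query_2`, and the helper functions are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def nearest_reference(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Return the name of the closest reference and its p-distance."""
    name = min(references, key=lambda r: p_distance(query, references[r]))
    return name, p_distance(query, references[name])


# Invented, pre-aligned toy fragments (assumption: NOT real HLA sequences).
references = {
    "ref_A": "ATGGCTCACTCCATG",
    "ref_B": "ATGGCTCTCTCGATG",
    "ref_E": "ATGCCTCACTGGTTG",
}
queries = {
    "query_1": "ATGGCTCACTCCTTG",
    "query_2": "ATGCCTCACTGGATG",
}

# Assign each query to the reference with the smallest p-distance.
assignments = {q: nearest_reference(s, references) for q, s in queries.items()}
for query_name, (ref_name, dist) in assignments.items():
    print(f"{query_name} -> {ref_name} (p-distance {dist:.3f})")
```

A nearest-reference assignment of this kind cannot distinguish orthologs from paralogs on its own; that inference requires the tree topology, which is why trees are built from EX2 in the analysis discussed here.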

5. References

Birch, J., Murphy, L., MacHugh, N. D., & Ellis, S. A. (2006). Generation and maintenance of diversity in the cattle MHC class I region. Immunogenetics, 58, 670–679.
Breiman, L. (2001). Random forests. Mach. Learn., 45, 5–32. doi:10.1023/A:1010933404324.
Cao, Y., Fan, J., Li, A., Liu, H., Li, L., Zhang, C., Zeng, L., & Sun, Z. (2015). Identification of MHC I class genes in two Platyrrhini species. Am J Primatol., 77, 527–34. doi:10.1002/ajp.22372.
Castro, M., Morales, P., Fernández-Soria, V., Suarez, B., Recio, M., Alvarez, M., Martín-Villa, M., & Arnaiz-Villena, A. (1996). Allelic diversity at the primate MHC-G locus: exon 3 bears stop codons in all Cercopithecinae sequences. Immunogenetics, 43, 327–36.
Cock, P., Antao, T., Chang, J., Chapman, B., Cox, C., Dalke, A., Friedberg, I., Hamelryck, T., Kauff, F., Wilczynski, B., & de Hoon, M. (2009). Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics, 25, 1422–3. doi:10.1093/bioinformatics/btp163.
Daza-Vamenta, R., Glusman, G., Rowen, L., Guthrie, B., & Geraghty, D. (2004). Genetic divergence of the rhesus macaque major histocompatibility complex. Genome Res., 14, 1501–15.
Djurisic, S., & Hviid, T. (2014). HLA class Ib molecules and immune cells in pregnancy and preeclampsia. Front Immunol., 5, 652. doi:10.3389/fimmu.2014.00652.
Fukami-Kobayashi, K., Shiina, T., Anzai, T., Sano, K., Yamazaki, M., Inoko, H., & Tateno, Y. (2005). Genomic evolution of MHC class I region in primates. PNAS, 102, 9230–4. doi:10.1073/pnas.0500770102.
Garcia, K., Adams, J., Feng, D., & Ely, L. (2009). The molecular basis of TCR germline bias for MHC is surprisingly simple. Nat Immunol., 10, 143–7. doi:10.1038/ni.f.219.
Grimsley, C., Mather, K. A., & Ober, C. (1998). HLA-H: a pseudogene with increased variation due to balancing selection at neighboring loci. Molecular Biology and Evolution, 15, 1581–1588.
de Groot, N., Blokhuis, J., Otting, N., Doxiadis, G., & Bontrop, R. (2015). Co-evolution of the MHC class I and KIR gene families in rhesus macaques: ancestry and plasticity. Immunol Rev., 267, 228–45. doi:10.1111/imr.12313.
de Groot, N., Otting, N., Robinson, J., Blancher, A., Lafont, B., Marsh, S., O’Connor, D., Shiina, T., Walter, L., Watkins, D., & Bontrop, R. (2012). Nomenclature report on the major histocompatibility complex genes and alleles of great ape, old and new world monkey species. Immunogenetics, 64, 615–31. doi:10.1007/s00251-012-0617-1.
Hedges, S., Dudley, J., & Kumar, S. (2006). TimeTree: A public knowledge-base of divergence times among organisms. Bioinformatics, 22, 2971–2972.
Heimbruch, K. E., Karl, J. A., Wiseman, R. W., Dudley, D. M., Johnson, Z., Kaur, A., & O’Connor, D. H. (2015). Novel MHC class I full-length allele and haplotype characterization in sooty mangabeys. Immunogenetics, 67, 437–445.
Heinrichs, H., & Orr, H. (1990). HLA non-A,B,C class I genes: their structure and expression. Immunol Res., 9, 265–74.
Horton, R., Wilming, L., Rand, V., Lovering, R., Bruford, E., Khodiyar, V., Lush, M., Povey, S., Talbot, C. J., Wright, M., Wain, H., Trowsdale, J., Ziegler, A., & Beck, S. (2004). Gene map of the extended human MHC. Nat Rev Genet., 5, 889–99. doi:10.1038/nrg1489.
Kono, A., Brameier, M., Roos, C., Suzuki, S., Shigenari, A., Kametani, Y., Kitaura, K., Matsutani, T., Suzuki, R., Inoko, H., Walter, L., & Shiina, T. (2014). Genomic sequence analysis of the MHC class I G/F segment in common marmoset (Callithrix jacchus). J Immunol., 192, 3239–46. doi:10.4049/jimmunol.1302745.
Lee, N., Ishitani, A., & Geraghty, D. (2010). HLA-F is a surface marker on activated lymphocytes. Eur J Immunol., 40, 2308–18.
Lefranc, M., Duprat, E., Kaas, Q., Tranne, M., Thiriot, A., & Lefranc, G. (2005). IMGT unique numbering for MHC groove G-domain and MHC superfamily (MhcSF) G-like-domain. Dev Comp Immunol., 29, 917–38. doi:10.1016/j.dci.2005.03.003.
Liu, Y., Li, A., Wang, X., Sui, L., Li, M., Zhao, Y., Liu, B., Zeng, L., & Sun, Z. (2013). Mamu-B genes and their allelic repertoires in different populations of Chinese-origin rhesus macaques. Immunogenetics, 65, 273–80. doi:10.1007/s00251-012-0673-6.
Lynge-Nilsson, L., Djurisic, S., & Hviid, T. (2014). Controlling the immunological crosstalk during conception and pregnancy: HLA-G in reproduction. Front Immunol., 5, 198. doi:10.3389/fimmu.2014.00198.
Moscoso, J., Serrano-Vela, J., Pacheco, R., & Arnaiz-Villena, A. (2006). HLA-G, -E and -F: allelism, function and evolution. Transpl Immunol., 17, 61–4. doi:10.1016/j.trim.2006.09.010.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
Perelman, P., Johnson, W., Roos, C., Seuánez, H., Horvath, J., Moreira, M., Kessing, B., Pontius, J., Roelke, M., Rumpler, Y., Schneider, M., Silva, A., O’Brien, S., & Pecon-Slattery, J. (2011). A molecular phylogeny of living primates. PLoS Genet., 7, e1001342. doi:10.1371/journal.pgen.1001342.
Piontkivska, H., & Nei, M. (2003). Birth-and-death evolution in primate MHC class I genes: divergence time estimates. Molecular Biology and Evolution, 20, 601–609.
Price, M. N., Dehal, P. S., & Arkin, A. P. (2010). FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS ONE, 5, e9490.
Robinson, J., Halliwell, J., McWilliam, H., Lopez, R., & Marsh, S. (2013). IPD - the Immuno Polymorphism Database. Nucleic Acids Research, 41, D1234–40.
Robinson, J., Mistry, K., McWilliam, H., Lopez, R., Parham, P., & Marsh, S. (2011). The IMGT/HLA database. Nucleic Acids Res., 32, D1171–6. doi:10.1093/nar/gkq998.
Rogers, J., & Gibbs, R. (2014). Comparative primate genomics: emerging patterns of genome content and dynamics. Nat Rev Genet., 15, 347–59. doi:10.1038/nrg3707.
Uda, A., Tanabayashi, K., Fujita, O., Hotta, A., Terao, K., & Yamada, A. (2005). Identification of the MHC class Ib locus in cynomolgus monkeys. Immunogenetics, 57, 189–197.
van der Wiel, M., Otting, N., de Groot, N., Doxiadis, G., & Bontrop, R. (2013). The repertoire of MHC class I genes in the common marmoset: evidence for functional plasticity. Immunogenetics, 65, 841–9. doi:10.1007/s00251-013-0732-7.
Wilming, L. G., Hart, E. A., Coggill, P. C., Horton, R., Gilbert, J. G., Clee, C., Jones, M., Lloyd, C., Palmer, S., Sims, S. et al. (2013). Sequencing and comparative analysis of the gorilla MHC genomic sequence. Database, 2013, bat011.
Yan, X., Li, A., Zeng, L., Cao, Y., He, J., Lv, L., Sui, L., Ye, H., Fan, J., Cui, X. et al. (2013). Identification of MHC class I sequences in four species of Macaca of China. Immunogenetics, 65, 851–859.
