
XI QTLMAS 2007
Papers and abstracts from the Workshop on QTL and Marker Assisted Selection
22-23 March 2007, Toulouse, France
Edited by Andrés Legarra
INRA
Station d'Amélioration Génétique des Animaux
BP52627
31326 Castanet Tolosan Cedex
France
2007 INRA
Contents
1 Program
2 Advances in QTL detection theory 1
3 QTL in practice
4 A bit on plants
5 Genomic selection
6 Advances in QTL detection theory 2
1 Program
Communication Index
Thursday 22nd March
Advances in QTL detection theory
Invited speech: Software for QTL mapping and fine mapping. Miguel Perez-Enciso, Universidad Autonoma de Barcelona, Spain
On changing models in QTL mapping based on the Haley-Knott regression. Jose Alvarez-Castro*, Arnaud Le Rouzic, Örjan Carlborg,
and
Unifying functional and statistical epistasis. Jose Alvarez-Castro*, Örjan Carlborg. Linnaeus Centre for Bioinformatics, Uppsala, Sweden
Modelling Epistasis in Variance Component Models. L. Ronnegard*, R. Pong-Wong & O. Carlborg,
and
A general and efficient method for IBD matrix estimation. Francois Besnier* and Örjan Carlborg. Linnaeus Centre for Bioinformatics, Uppsala, Sweden
QTL in practice
Genetic interactions and response to selection: the example of body weight in chicken. Arnaud Le Rouzic* and Örjan Carlborg,
and
Epistatic QTL analysis of metabolic traits in an intercross between two divergently selected chicken lines. Weronica Ek*, Örjan Carlborg. Linnaeus Centre for Bioinformatics, Uppsala, Sweden
QTL Mapping in a Brazilian broiler x layer cross. Ana S.A.M.T. Moura*, Ledur M.C., Boschiero C., Campos R.L.R., Ambo M., Nones K., Ruy D.C., Baron E.E., Coutinho L.L. UNESP, Brazil
Multistage QTL mapping strategy in an advanced backcross cattle population. Stela Masle*, Medjugorac I., Foerster M. Institute for Animal Breeding, Munich, Germany
Fine mapping of SSC4 for meat and carcass quality traits in a commercial crossbred pig population. Anna Slawinska*, M. Siwek, E.F. Knol, D. Roelofs-Prins, H.J. van Wijk, B. Dibbits, M. Bednarczyk. University of Technology and Life Sciences in Bydgoszcz, Poland
Analyses of Candidate Genes on the Iberian-of-Origin Porcine Chromosome 4: a Status Survey of the Project. Jordi Estelle*, A. Ojeda, J.M. Folch, M. Perez-Enciso. Universitat Autonoma de Barcelona, Spain
Milk yield QTLs on Oar 6 in the Latxa sheep breed: a comparison between Selective DNA Pooling and Selective Individual Genotyping. Fernando Rendo*, E Ugarte, E Lipkin and A Estonba. University of the Basque Country, Leioa, Spain
Pedigree-based QTL mapping for fruit firmness in apple using Markov Chain Monte Carlo methods and Bayesian inferences. Abou Kouassi*, CE Durel, F Mathis, F Laurens, L Gianfranceschi, M Komjanc, D Mott, A Patocchi, D Gobbin, F Fernandez, F Dunemann, A Boudichevskaia, M Stankiewicz, E Van De Weg, M Bink. INRA, CR d'Angers - UMR GenHort, France
A bit on plants
Invited speech: Current trends in MAS in plants. Frederic Hospital, INRA, Gif-sur-Yvette, France
What does a plant breeding company like Limagrain expect from QTL and MAS? Sebastien Crepieux. Limagrain Advanta BV, Rilland, The Netherlands
Genomic selection
Marker-assisted selection for commercial crossbred performance. Jack Dekkers*, Hong-hua Zhao, and Rohan Fernando. Iowa State University, USA
Does genomic selection work in an outbred mice population? Andres Legarra*, E Manfredi, C Robert-Granie, JM Elsen. INRA, Toulouse, France
!"#$%%"&'(&)*&+,&"*+"-&+./*,"0&%&,1*.+2
Genome wide selection in dairy cattle based on high-density genome-wide SNP analysis: from discovery to application. H.W. Raadsma*, K.R. Zenger, M.S. Khatkar, R. Crump, G. Moser, J. Solkner, J.A.L. Cavanagh, R.J. Hawken, M. Hobbs, W. Barris, F.W. Nicholas, B. Tier,
and
Genome-wide selection in dairy cattle: use of genetic algorithms in the estimation of molecular breeding values. Ron Crump*, B. Tier, G. Moser, J. Sölkner, K.R. Zenger, M.S. Khatkar, J.A.L. Cavanagh and H.W. Raadsma,
and
Principal components regression of SNP data to predict genetic merit. A.F. Woolaston*, B. Tier and R.D. Murison,
and
Estimation of molecular breeding values in genome wide selection using supervised dimension reduction based on partial least squares. Gerhard Moser*, B. Tier, R.E. Crump, J. Soelkner, K.R. Zenger, M.S. Khatkar, J.A.L. Cavanagh and H.W. Raadsma,
and
A formal comparison of different methods of utilizing SNP information of molecular breeding values in whole genome selection. J. Sölkner*, B. Tier, R. Crump, G. Moser, H. Raadsma. Co-operative Research Centre for Innovative Dairy Products-CRC IDP, ReproGen Centre for Advanced Technologies in Animal Genetics and Reproduction, Faculty of Veterinary Science, The University of Sydney, Camden, Australia; BOKU, Vienna, Austria
Friday 23rd March
Advances in QTL detection theory
Invited speech: Pathway inference using genetical genomics. Dirk-Jan de Koning, Roslin Institute, UK
On the mapping of 2 QTLs in the same marker interval using multiple interval mapping and a moment method. Manfred Mayer*. Research Unit Genetics and Biometry, Research Institute for the Biology of Farm Animals (FBN), Dummerstorf, Germany
Three-locus haplotype probabilities for multiple-strain RIL. Friedrich Teuscher*. Research Unit Genetics and Biometry, Research Institute for the Biology of Farm Animals (FBN), Dummerstorf, Germany
Haplotype inference in crossbred populations. Albart Coster*. Wageningen Universiteit, The Netherlands
Modelling and optimizing a dynamic selection breeding scheme. Anne Devalle*, CR Moreno, ZG Vitezica, JM Elsen. INRA, Toulouse, France
Rejection thresholds for interval mapping in low-density maps. Charles Elie Rabier*, C Delmas, JM Elsen. INRA, Toulouse, France
Properties of different phenotypic measures for estimating QTL variance components and MA-Blup EBV. Stephan Neuner*, Emmerling R., Thaller G., Götz K.-U. Bavarian State Research Center for Agriculture, Germany
Detecting Dominance QTL: power of variance component analysis in different pedigree structures. Suzanne Rowe*, Ricardo Pong-Wong, Sara Knott, Chris Haley, DJ de Koning. Roslin Institute, Roslin, UK
IBD probabilities: their discrimination ability and their co-evolution with linkage disequilibrium. Florence Ytournel*, D Boichard, H Gilbert. INRA, Jouy-en-Josas, France
Bovine genomic structure as revealed by construction haplotype block map based on a high density 15k SNP scan in dairy cattle. Mehar S. Khatkar*, KR Zenger, M Hobbs, RJ Hawken, JAL Cavanagh, W Barris, B Tier, FW Nicholas and HW Raadsma. Reprogen, University of Sydney, Australia
2 Advances in QTL detection theory 1
QTL SOFTWARE AND BEYOND
Miguel Pérez-Enciso
miguel.perez @ uab.es
U. Autònoma Barcelona & ICREA (SPAIN)
Software
- QTL
- Genetical genomics
- Association & fine mapping
- Coalescence
QTL (linkage) software
- Experimental design
- Crosses
- Outbred: within family & IBD
- Modelling flexibility
- Population complexity
Comparative Table
Table 1: QTL software (among publicly and easily available software)
Software | Populations analyzed | Stat. method | Platforms | Advantages | Main limitations
Merlin | Outbred | NP / LS / ML | Linux/Unix | Useful options like genotyping error detection, haplotyping | Basically for human genetics; simple output
Solar | Outbred | ML | Linux/Unix | Nice output | Notions of Tcl recommended
R/qtl | Crosses | LS / ML | R (all platforms) | Nice output | R notions required
Qxpak | Crosses, outbred | ML | Linux | Multitrait; powerful modeling | Simple output
QTL Cartographer | F2, BC | ML | Windows | Multitrait; simulates crosses | No covariates
QTL Express | F2, BC, outbred (sib pair, within family) | LS | Web | A good start; web based | Simple methods
QTL software performance: simulated data
[Figure: simulated data set (QTL in positions 20 and 50 cM). LOD score against position (0 to 100 cM) for Rqtl/EM, Rqtl/HK, Rqtl/MI, Rqtl/NP, WinQTL, WinQTL CIM, QxPak and QTLexpress.]
QTL software performance: SSC4's IBMAP cross
[Figure: real data set (SSC4 in porcine F2 cross). LOD score against position (0 to 150 cM) for QxPak, QxPak a0 and QTLexpress.]
Marker Pos. (cM) IC
SW2404 0.0 0.80
S0301 37.2 0.85
S0001 57.0 0.88
SW317 67.5 0.73
SW35 69.0 1.00
AFABP 69.8 0.82
SW839 74.2 1.00
DECR2 79.3 0.18
S0073 91.6 0.92
S0214 95.9 1.00
SW524 114.0 0.86
SW445 128.0 1.00
SW58 130.5 0.91
S0097 145.9 0.84

Some conclusions of the QTL era
- There exists a reasonably general theory for QTL analysis.
- QTL are ubiquitous, even microbes do it.
- Additivity, polygenic action and pleiotropy are widespread.
- ... although this does not rule out more complex inheritance instances.
Some remaining questions
- Analyzing complex pedigrees with incomplete marker information.
- Modeling LD, especially for sparse and microsatellite markers and under complex scenarios (e.g., selection).
Genetical genomics software
- No specific software yet (is it needed?)
- Webqtl
- Batch option in Qxpak
Association & fine mapping software
- Fine mapping
- Whole genome association (WGA)
Fine mapping comparative table
Software | Population / Purpose | Stat. method | Platforms | Comments
Plink | Outbred, WGA | LS | Linux | Epistasis allowed; permutation test; no doc on methods
TreeLD | Outbred, fine map | ML (MCMC) | Windows, Linux | Case / control; computationally very intensive; no unknown data or phase
Generecon | Outbred, fine map | MCMC | Linux | Case / control; a bit messy
Hapminer | Outbred, WGA | NP | Windows, Linux | No epistasis
Blossoc | Outbred, WGA | ML | Linux | Case / control; very fast; no epistasis
Happy | AIL | LS | R | Epistasis allowed; permutation test
Some conclusions
- Large variety of situations
- No standard
- High activity research area
Some remaining questions
- Model choice in an efficient and rigorous manner.
- Computational efficiency.
- Combining coalescence and mixed model theory.
Coalescence software
- Coalescence simulation: SelSim, ms, ...
- Sequence analysis: DnaSP, Arlequin, ...
Why do we need the coalescence?
- Each demographic process, including selection, leaves a trace on DNA variability.
- Coalescence is a very efficient and realistic tool to simulate DNA polymorphism under many different demographic scenarios.
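To make the idea of coalescent simulation concrete, here is a minimal sketch of a standard neutral coalescent for a sample of lineages. It is not the algorithm of ms, SelSim or any other package named in these slides; the function names, the time scale (a pair of lineages coalesces at rate 1) and the parameter `theta` are assumptions made for illustration only.

```python
import random


def simulate_coalescent(n_samples, rng):
    """Simulate one neutral genealogy for n_samples lineages.

    Returns (time to the most recent common ancestor, total branch length),
    both in units where a pair of lineages coalesces at rate 1.
    """
    tmrca = 0.0
    total_length = 0.0
    k = n_samples
    while k > 1:
        rate = k * (k - 1) / 2.0        # number of lineage pairs that may coalesce
        wait = rng.expovariate(rate)    # exponential waiting time to next coalescence
        tmrca += wait
        total_length += k * wait        # all k branches grow by `wait`
        k -= 1
    return tmrca, total_length


def expected_segregating_sites(theta, total_length):
    """Expected number of mutations on a genealogy of the given total length,
    assuming mutations fall on branches at rate theta / 2 (illustrative scaling)."""
    return theta / 2.0 * total_length


def mean_tmrca(n_samples, n_reps, seed=1):
    """Average time to the most recent common ancestor over n_reps genealogies."""
    rng = random.Random(seed)
    total = sum(simulate_coalescent(n_samples, rng)[0] for _ in range(n_reps))
    return total / n_reps
```

Under these assumptions the mean time to the common ancestor of n lineages should approach 2(1 - 1/n), which gives a quick sanity check on the simulation.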
[Figure: Signatures of selection (Bamshad & Wooding, 2003)]
[Figure: Inferring selection from disequilibrium (Sabeti et al., 2002)]
Coalescence software comparative table
Software | Platforms | Comments
coasim | source C code provided | Nice GUI interface
selsim | Win / Linux | No complex demographic events
ms | source C code provided | The classical, by the boss
Arlequin | Windows | A bit messy
snpsim | source C code provided | Generalizes ms to allow for recombination hotspots
Web Animator | Web based | Only for playing
DnaSP | Windows | The easiest option for real analysis
Caveats
- Signatures of natural selection are confounded by population history and variation in local recombination rates.
- Evidence of selection does not mean that the causal mutation is automatically found nor that the trait selected for is known.
- Modeling all processes that have affected livestock is very difficult.
- NOT EVERYTHING IS IN THE SEQUENCE!
Final considerations
- Data are becoming very complex.
- More and more information is external to the experiment itself.
- We need platforms rather than (or in addition to) individual software packages.
[Figure: Available info: my data, sequence databases, comp. mapping, literature]
Unifying functional and statistical epistasis
Jose Alvarez-Castro*, Örjan Carlborg
On changing models in QTL mapping based on the Haley-Knott regressions
Jose Alvarez-Castro*, Arnaud Le Rouzic, Örjan Carlborg
Linnaeus Centre for Bioinformatics, Uppsala University
Husargatan 3, BMC Box 598
SE-75124 Uppsala
Sweden
In this joint communication we present the NOIA (Natural and Orthogonal InterActions) model, a new general framework for modeling genetic effects. The model rests on two different formulations of general genetic effects (including all possible orders of gene interaction): the functional and the statistical formulations. We founded these two formulations on parallel notations, which enables us to transform statistical and functional genetic effects into each other. Therefore, the NOIA model unifies, for the first time, statistical and functional models of genetic effects with arbitrary epistasis. This means in particular that functional models of epistasis, which were so far used to study long-term evolution and effects of drift in adaptation with simulated data, can now be fed with real data from QTL (Quantitative Trait Loci) analysis. Unlike previous models of statistical epistasis, the statistical formulation of the NOIA model is generally orthogonal regarding the number of loci and the frequencies of the genotypes at the different loci, accounting in particular for segregation distortion. Therefore, NOIA is the most convenient model to be directly used to estimate genetic effects in QTL analysis. Consequently, we implement the NOIA model for Interval Mapping (IM) with HKR (Haley-Knott regressions). In other words, we implement HKR with NOIA. This generalization enables us to use HKR to obtain orthogonal estimates of genetic effects and, thus, an orthogonal decomposition of the variance of the population under study. By applying the model to simulated data, we show how to translate and compare genetic effects obtained by QTL analysis in different populations, how to obtain functional genetic effects from statistical ones, and how the model performs under various degrees of segregation distortion. We also provide an example on real data in which we show how to use the NOIA model for IM-QTL analysis.
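The orthogonality property claimed for the statistical formulation can be illustrated for a single biallelic locus. The abstract gives no formulas, so the additive and dominance scales below are an assumption: they follow the one-locus form commonly associated with orthogonal genetic models and should be checked against the authors' full paper. The function names are hypothetical.

```python
def orthogonal_scales(p11, p12, p22):
    """Return (additive, dominance) index scales for genotypes 11, 12 and 22,
    given their frequencies. Assumed one-locus form; not taken from the abstract."""
    mean_n = p12 + 2.0 * p22                      # mean count of allele 2
    additive = [0.0 - mean_n, 1.0 - mean_n, 2.0 - mean_n]
    v = p11 + p22 - (p11 - p22) ** 2              # scaling term of the dominance column
    dominance = [
        -2.0 * p12 * p22 / v,
        4.0 * p11 * p22 / v,
        -2.0 * p11 * p12 / v,
    ]
    return additive, dominance


def weighted_dot(x, y, weights):
    """Inner product of two genotype-indexed vectors, weighted by genotype frequency."""
    return sum(w * a * b for w, a, b in zip(weights, x, y))


# Genotype frequencies with segregation distortion (hypothetical values).
freqs = (0.40, 0.45, 0.15)
additive, dominance = orthogonal_scales(*freqs)
ones = [1.0, 1.0, 1.0]
# All three frequency-weighted cross products are zero up to rounding,
# i.e. mean, additive and dominance columns are mutually orthogonal.
cross_products = (
    weighted_dot(ones, additive, freqs),
    weighted_dot(ones, dominance, freqs),
    weighted_dot(additive, dominance, freqs),
)
```

With ideal F2 frequencies (1/4, 1/2, 1/4) the same function returns the familiar scales (-1, 0, 1) and (-1/2, 1/2, -1/2), so the sketch reduces to the usual F2 parameterization when there is no distortion.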
Modelling Epistasis in Variance Component Models
L. Ronnegard(1,*), R. Pong-Wong(2) & O. Carlborg(1)
(1) Linnaeus Centre for Bioinformatics, Uppsala University, Sweden.
(2) Roslin Institute, Edinburgh, UK.
* Correspondence: lars.ronnegard@lcb.uu.se
Abstract
Variance component (VC) models are commonly used to detect quantitative trait loci (QTL) in general pedigrees. The variance-covariance structure of the random QTL effect is given by the identity-by-descent (IBD) between genotypes. Epistatic effects have previously been modelled, both for unlinked and linked loci, as a random effect with a variance-covariance structure given by the direct Hadamard product between the IBD matrices of the direct QTL effects. In the original papers where the model was proposed, the assumptions of the model were not presented. In my presentation, I will clarify the underlying assumptions of this previously proposed model. We have developed an algorithm to obtain the correct estimates of the epistatic IBD matrix when these assumptions do not hold. To illustrate the effects of violating these assumptions, we estimate the deviations in likelihood and VC estimates. The previously proposed model assumes either unlinked QTL or that a fully informative marker (i.e. all marker alleles are unique in the base generation) is located between the loci. Simulations of an F2 pedigree including 400 F2 individuals having a phenotype with moderate epistatic effects (epistatic variance = 0.2, residual variance = 1.0) showed large deviations between the IBD matrices of epistatic effects for the simplified and correct models when the loci were closely linked. The variance components were overestimated (up to 22%) with the previously proposed model when the loci were closely linked, but only small differences in the likelihood could be detected between the two models. The adverse effects of violating the underlying assumptions are thus most prominent in the variance component estimates of epistasis rather than in the significance testing for linked loci. It is therefore important that the estimated epistatic variances for linked loci are based on our general model and not the Hadamard-based approximate model.
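The Hadamard-based approximation discussed in the abstract is simply an element-wise product of the two single-locus IBD matrices. The sketch below shows that operation only; it does not implement the authors' corrected algorithm for linked loci, and the matrices are hypothetical values for three individuals.

```python
def hadamard(a, b):
    """Element-wise (Hadamard) product of two matrices of identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical IBD matrices at two QTL positions for three individuals.
ibd_locus1 = [
    [1.0, 0.5, 0.25],
    [0.5, 1.0, 0.5],
    [0.25, 0.5, 1.0],
]
ibd_locus2 = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.25],
    [0.5, 0.25, 1.0],
]

# Approximate covariance structure of the epistatic random effect; according to
# the abstract this is adequate only for unlinked loci or when a fully
# informative marker lies between them.
epistatic_ibd = hadamard(ibd_locus1, ibd_locus2)
```

The diagonal stays at 1 and the product remains symmetric, which is why the approximation is convenient even though it can bias variance component estimates for closely linked loci.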
A general and efficient method for IBD matrix estimation
Francois Besnier and Örjan Carlborg
Linnaeus Centre for Bioinformatics, Uppsala University, BMC Box 598, SE-75124 Uppsala, Sweden
We propose a new approach for calculating IBD matrices, which exploits the fact that, within a given marker interval, the IBD probabilities are a continuous function of the recombination probability. By reprogramming an existing deterministic algorithm, we show that a single run can produce a set of continuous functions that represent the IBD relationship between two markers. A second, general method is also proposed for estimating the IBD function by curve fitting, based on single-point IBD estimates produced by any method of choice (e.g. LOKI, Merlin). The genome-wide IBD can then be stored as a set of polynomial IBD functions for marker intervals rather than as a large set of complete IBD matrices. This leads to more efficient storage of the information, which is highly useful in e.g. multi-dimensional genome scans for interacting QTL, and opens opportunities to develop new single- and multidimensional search strategies for locating QTLs.
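The curve-fitting variant can be sketched as an ordinary least-squares polynomial fit to a few single-point IBD estimates within one marker interval. This is a generic illustration under stated assumptions, not the authors' implementation: the polynomial degree, the relative position scale (0 to 1 within the interval) and the IBD values are hypothetical.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + c[2]*x**2 ..."""
    m = degree + 1
    # Normal equations (X'X) c = X'y for the Vandermonde design matrix X.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, m):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, m):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * m
    for row in range(m - 1, -1, -1):
        tail = sum(ata[row][k] * coeffs[k] for k in range(row + 1, m))
        coeffs[row] = (aty[row] - tail) / ata[row][row]
    return coeffs


def polyval(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Hypothetical single-point IBD estimates for one pair of individuals at five
# relative positions inside a marker interval.
positions = [0.0, 0.25, 0.5, 0.75, 1.0]
ibd_estimates = [0.9 - 0.6 * x + 0.4 * x * x for x in positions]
ibd_function = polyfit(positions, ibd_estimates, degree=2)
```

Storing only `ibd_function` (three coefficients here) per pair and interval, and calling `polyval` on demand, is the storage saving the abstract refers to: any position in the interval can be evaluated without keeping a full IBD matrix for each scan position.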
3 QTL in practice
Genetic interactions and response to selection: the example of body weight in chicken
Arnaud Le Rouzic and Örjan Carlborg
Epistatic QTL analysis of metabolic traits in an intercross between two divergently selected chicken lines
Weronica Ek and Örjan Carlborg
Linnaeus Centre for Bioinformatics, Uppsala University
Husargatan 3, BMC Box 598
SE-75124 Uppsala
Sweden
Artificial selection, apart from providing a way to improve agricultural varieties, is a powerful tool in applied and fundamental research on the genetic factors that underlie complex characters. We have studied an intercross between two chicken lines resulting from more than 40 generations of bi-directional selection for body weight. A major component of the genetic architecture, which explains nearly half of the 8-fold phenotypic difference between the high- and low-body weight lines, is a radial network of four interacting loci. We have used individual-based simulations to explore the dynamic properties of the response to directional and stabilizing selection under this radial network architecture. The results show that epistasis modifies the selection response, leading to a progressive release of genetic variation, and might also lead to different final outcomes of selection depending on the initial allelic frequencies in the population. We also show how strong genetic interactions may mislead QTL detection experiments based on crosses between selected lines. The network was initially detected in a genome scan for epistatic QTL due to its large effects on body weight. We have now explored how the network affects other traits for which the lines differ and have been able to show that it has pleiotropic effects on several other morphological and physiological traits.
DETECTION OF QTL FOR PERFORMANCE, FATNESS AND CARCASS TRAITS ON CHICKEN CHROMOSOMES 3 AND 5
D.C. Ruy(1,5), A.S.A.M.T. Moura(2), K. Nones(1), E.E. Baron(1), M.C. Ledur(3), R.L.R. Campos(3), M. Ambo(3), C.M.R. Melo(4), L.L. Coutinho(1)
(1) USP-ESALQ, Av. Pádua Dias, 11, Piracicaba, SP, 13418-900, Brazil
(2) UNESP-FMVZ, Botucatu, SP, 18618-000, Brazil
(3) Embrapa Suínos e Aves, BR 153, Km 110, Concórdia, SC, 89700-000, Brazil
(4) UFSC, Rodovia Edmar Gonzaga, 1346, Florianópolis, SC, 88040-900, Brazil
(5) Present address: FAV/UnB Campus Darcy Ribeiro, Brasília, DF, 70910-900, Brazil
INTRODUCTION
In previous studies we identified QTL affecting performance, carcass and fatness traits, and organ weights on chromosomes 1 (Nones et al., 2006), 6 to 8, 11 and 13 (Moura et al., 2006) in a Brazilian F2 chicken resource population. In this report, we focus on chromosomes 3 and 5, for which QTL for growth-related traits have already been mapped in other populations (reviewed by Abasht et al., 2006a). The objective of this study was to describe QTL for performance, carcass, fatness and organ weights on these chromosomes in the Brazilian population.
MATERIAL AND METHODS
Experimental population and data recording
An F2 chicken resource population specially designed for QTL mapping was developed by crossing seven males from a broiler line with seven females from a layer line at Embrapa Suínos e Aves, Concórdia, Brazil. From a total of 2,063 F2 chickens incubated over a period of 8 months, 544 belonging to six full-sib families were used in this study. F2 chickens were reared as broilers up to 42 d of age. They were individually caged from 35 to 41 d, when weight gain and feed intake were recorded, allowing the computation of feed conversion. Body weight was recorded at 1, 35, 41, and 42 d. At 42 d, recording was performed after 6 h of fasting and transportation to the slaughterhouse. Carcasses were eviscerated, stored at 4 °C for six hours and dissected. Weights of heart, lungs, gizzard, liver, head and feet, as well as the length of the intestine, were recorded before chilling. Weights of carcass, breast, drums and thighs, wings, residual carcass and abdominal fat were recorded after chilling. Abdominal fat, breast and carcass percentages were computed relative to body weight at 42 d. Blood samples were collected at slaughter for DNA analyses.
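The derived traits mentioned above are simple ratios. The paper does not state the exact formulas, so the sketch below assumes the usual definitions: feed conversion as feed intake divided by weight gain over the 35 to 41 d period, and percentages as part weight relative to body weight at 42 d. Function and argument names are hypothetical.

```python
def feed_conversion(feed_intake_g, weight_gain_g):
    """Feed conversion ratio: grams of feed consumed per gram of weight gained."""
    if weight_gain_g <= 0:
        raise ValueError("weight gain must be positive")
    return feed_intake_g / weight_gain_g


def percent_of_body_weight(part_weight_g, body_weight_42d_g):
    """Weight of a carcass part expressed as a percentage of body weight at 42 d."""
    if body_weight_42d_g <= 0:
        raise ValueError("body weight must be positive")
    return 100.0 * part_weight_g / body_weight_42d_g


# Hypothetical bird: 900 g of feed and 450 g of gain between 35 and 41 d,
# 50 g of abdominal fat and 2000 g of body weight at 42 d.
fcr = feed_conversion(900.0, 450.0)
fat_pct = percent_of_body_weight(50.0, 2000.0)
```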
Genotyping
Thirteen microsatellite markers covering 84.1% of the consensus map of chromosome 3, and 7 markers covering 75.5% of chromosome 5 (www.thearkdb.org), were used to genotype 12 parental (6 males, 6 females), 9 F1 (3 males, 6 females), and 544 F2 chickens from 4 to 6 informative full-sib families. The first and last markers were LEI0043 and LEI0166 on chromosome 3 and LEI0082 and ADL0298 on chromosome 5. Individual PCR reactions using fluorescent primers were conducted for each marker. PCR products from three to four markers were mixed for allele size determination in a MegaBACE genotyper (GE Healthcare). Linkage maps were constructed for each chromosome using multipoint linkage analysis (Green et al., 1990).
QTL mapping analyses
Phenotypic data were submitted to a preliminary analysis of variance including effects of hatch, sex, family and their two-way interactions. Adjustments for hatch and significant interactions were then performed, and the residuals were used in the QTL interval mapping analyses using the regression method (Haley et al., 1994) and the line-cross genetic model of the QTL Express software (Seaton et al., 2002). Sex and family effects were included in the model for QTL mapping. Body weight at 35 d was used as a covariate in the model for weight gain, feed intake and feed efficiency from 35 to 41 d, whereas body weight at 42 d was used for carcass weight, carcass parts and organ weights. Significance thresholds were computed using a permutation test (Churchill and Doerge, 1994), and probability levels for significant (1 and 5%) and suggestive genome-wise linkage were used (Lander and Kruglyak, 1995). If the test statistic for a QTL exceeded the suggestive threshold level, a model including a parent of origin effect (Knott et al., 1998), as well as models including QTL x sex and QTL x family interactions, were tested based on conventional F-tests.
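The permutation approach to significance thresholds can be sketched as follows. This is a simplified illustration, not the QTL Express procedure: it regresses the phenotype on a single additive coefficient per position (no dominance, sex, family or covariate terms), and the function names, number of permutations and data layout are assumptions.

```python
import random


def f_statistic(x, y):
    """F statistic (1 and n-2 d.f.) for the simple linear regression of y on x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    ss_regression = sxy * sxy / sxx
    ss_residual = syy - ss_regression
    if ss_residual <= 0.0:
        return float("inf")
    return ss_regression / (ss_residual / (n - 2))


def max_f(coefficients_by_position, y):
    """Largest F statistic over all scanned positions."""
    return max(f_statistic(x, y) for x in coefficients_by_position)


def permutation_threshold(coefficients_by_position, y, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the maximum F statistic when phenotypes
    are shuffled relative to the genotype-derived coefficients."""
    rng = random.Random(seed)
    shuffled = list(y)
    null_maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)               # breaks any genotype-phenotype link
        null_maxima.append(max_f(coefficients_by_position, shuffled))
    null_maxima.sort()
    index = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return null_maxima[index]
```

Here `coefficients_by_position` would hold, for each scanned position, one expected additive coefficient per individual computed from flanking markers. Taking the maximum over positions within each permutation is what makes the threshold chromosome-wide or genome-wide rather than pointwise.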
RESULTS AND DISCUSSION
A total of nine QTL surpassed the genome-wide suggestive threshold (Table 1). No interactions with sex were found for any of them (P > 0.05). Four QTL mapped to chromosome 3 had the greatest effects, exceeding the 1% genome-wide threshold. All four showed significant (P < 0.05) QTL x family interactions, suggesting that the QTL alleles for those traits were not fixed in the parental lines. The first two, for closely related traits (i.e. body weight at 35 and 41 d), were likely the same QTL. It acted predominantly in an additive fashion and explained over 4.5% of the phenotypic variance of body weight at the earlier age (Table 2). These results agree with those from two other F2 populations. In a cross of two lines divergently selected for body weight at 56 d, Jacobsson et al. (2005) found a suggestive QTL for body weight at 28 d in the interval flanked by marker MCW0222. In a completely different F2 population derived from a red junglefowl x White Leghorn cross, Kerje et al. (2003) mapped a suggestive QTL for body weight at 46 d to an interval flanked by ADL0161.
The third QTL, for abdominal fat percentage (Table 1), explained almost 4% of the phenotypic variance. The QTL allele that conferred higher abdominal fat percentage originated from the broiler line (Table 2). Interestingly, a significant (P < 0.01) parent of origin effect was detected for this QTL. The effect was positive, indicating that the broiler allele coming from the male parent increased the trait value. Genomic imprinting may be an explanation for parent of origin effects. De Koning et al. (2002) recommended caution to avoid spurious detection of parent of origin effects when the QTL is segregating in the parental lines, especially for designs in which the number of F1 sires is small. Only three F1 sires were involved in the present study and there was evidence of QTL allele segregation in the founder lines; therefore, the parent of origin effect detected in this study may not be genuine. No other fatness QTL with parent of origin effects has been reported in chicken (reviewed by Abasht et al., 2006a), but McElroy et al. (2006) found a paternally expressed QTL for white meat percentage, whereas Park et al. (2006) detected a Mendelian QTL for abdominal fat weight at 70 d, both close to the position of the QTL reported in this study. Other QTL with parent of origin effects were previously reported for body and carcass weights (McElroy et al., 2006), egg weight (Tuiskula-Haavisto et al., 2004), and disease resistance (Siwek et al., 2003) in other regions of chromosome 3.
The last 1% genome-wise significant QTL, for wings weight, was mapped to the intermediate region of chromosome 3 (Table 1). This QTL showed negative additive effects, indicating that the allele for higher weight, in this case, came from the layer line. A suggestive QTL for lung weight (Park et al., 2006) and a significant QTL for skin fatness (Ikeobi et al., 2002) were reported in this region.
Five suggestive QTL were identified: for liver weight on chromosome 3, and for heart and gizzard weights and abdominal fat and carcass percentages on chromosome 5 (Table 1). The abdominal fat percentage QTL, which acted both in an additive and dominant fashion (Table 2), was detected after fitting a cofactor to account for the background effect of the QTL for the same trait on chromosome 3. Several studies identified QTL related to fatness on chromosome 5, some of them in positions close to the QTL mapped in this study (Lagarrigue et al., 2006; Abasht et al., 2006b). The heart and gizzard QTL acted mainly in a dominant fashion, whereas the liver and carcass percentage QTL showed positive additive effects and positive or negative dominance effects.
Identifying chromosome regions associated with fat deposition through QTL mapping studies may lead to the identification of the actual genes controlling the trait, helping to enhance selection for lean meat yield. Therefore, the two QTL mapped for abdominal fat percentage in this study should be further investigated. Together they explained over 6% of the phenotypic variance of the trait and were mapped to chromosome regions where other independent studies have already detected fatness QTL. Moreover, the abdominal fat percentage QTL on chromosome 3 was located close to a QTL for market age weight, a trait under intense selection in broiler lines. This finding may help to explain the correlated response in fatness to selection for growth rate in broiler lines. As pointed out by Abasht et al. (2006a), due to the large confidence intervals of QTL, higher resolution analysis will be necessary to distinguish a pleiotropic QTL from closely linked QTL. Other QTL mapped in this study (e.g. for body and heart weights and for carcass percentage) point to candidate regions for genes affecting traits of great economic relevance to the poultry industry.
Table 1. QTL that exceeded suggestive linkage
Chromosome | Trait | Position (cM)(A) | Flanking markers | F
3 | Body weight at 35 d | 102 | MCW0222 - LEI0161 | 14.02**
3 | Body weight at 41 d | 102 | MCW0222 - LEI0161 | 11.42**
3 | Abdominal fat percentage | 112 | LEI0029 - ADL0371 | 8.16**
3 | Wings weight | 157 | ADL0371 - LEI0118 | 12.87**
3 | Liver weight | 176 | ADL0127 - MCW0224 | 5.37†
5 | Heart weight | 25 | MCW0193 - MCW0090 | 6.80†
5 | Carcass percentage | 97 | LEI0149 - ADL0233 | 5.30†
5 | Abdominal fat percentage | 133 | ADL0233 - ADL0298 | 7.16†
5 | Gizzard weight | 150 | ADL0233 - ADL0298 | 5.55†
(A) Position from the first marker (LEI0043 for chromosome 3 and LEI0082 for chromosome 5) in the chromosome set. LEI0043 is at 9 cM and LEI0082 is at 32 cM in the consensus map of chromosomes 3 and 5, respectively.
† Significance at the genome-wide suggestive level
** Significance at the 1% genome-wide level
Table 2. Additive and dominance effects (standard errors) and the proportion of the phenotypic variance explained by the QTL
Chromosome | Trait | Additive effect | Dominance effect | Phenotypic variance (%)
3 | Body weight at 35 d (g) | 40.64 (7.93) | -17.32 (12.66) | 4.74
3 | Body weight at 41 d (g) | 47.21 (10.24) | -21.00 (16.35) | 3.83
3 | Abdominal fat percentage(A) (%) | 0.140 (0.035) | -0.053 (0.051) | 3.95
3 | Wings weight (g) | -1.32 (0.26) | -0.18 (0.42) | 4.35
3 | Liver weight (g) | 0.53 (0.21) | -0.74 (0.36) | 1.65
5 | Heart weight (g) | 0.024 (0.074) | 0.413 (0.112) | 2.17
5 | Carcass percentage (%) | 0.328 (0.126) | 0.346 (0.180) | 1.62
5 | Abdominal fat percentage (%) | 0.153 (0.049) | -0.194 (0.103) | 2.32
5 | Gizzard weight (g) | 0.087 (0.229) | 1.254 (0.376) | 1.71
(A) This QTL showed a significant (P < 0.01) parent of origin effect = 0.100 (0.035)
CONCLUSION AND FUTURE WORK
The QTL mapped for body weight and abdominal fat percentage on chromosome 3 give support to
the results of recently published QTL mapping studies, and also provide strong evidence for
candidate regions for genes affecting traits of great economic relevance to the poultry industry. A
half-sib analysis should be carried out to investigate the QTL that showed interaction with family.
Expression studies should be conducted to search for evidence of imprinting at the molecular level.
Our group is concluding the genotyping of over 400 F2 chickens from five full-sib families with markers from the microchromosomes (9, 10, 12, 14, 15, 18, 19, 23, 24, 26 to 28 and Z) in 2007, to complete the genome scan for growth-related traits and some traits related to metabolic parameters.
ACKNOWLEDGEMENTS
Financial support was provided by EMBRAPA/PRODETAB. D.C.Ruy received a PCDT scholarship
from CAPES. A.S.A.M.T. Moura, C.M.R. Melo and L.L. Coutinho received scholarships from CNPq.
K.Nones, E.E. Baron, R. L.R. Campos, M. Ambo received scholarships from FAPESP.
REFERENCES
Abasht B., Dekkers J.C.M., Lamont S.J. (2006a) Poultry Science 85:2079-2096.
Abasht B., Pitel F., Lagarrigue S., Le Bihan-Duval E., Le Roy P., Demeure O., Vignoles F., Simon
J., Cogburn L., Aggrey S., Vignal A., Douaire M. (2006b) Genet. Sel. Evol. 38:297-311.
Churchill G.A. and Doerge R.W. (1994) Genetics 138: 963-971.
De Koning D.J., Bovenhuis H., Van Arendonk J.A.M. (2002) Genetics 161:931-938.
Green P., Falls K., Crooks S. (1990). CR-MAP Program VERSON 2.4. Washington University
School of Medicine, St. Louis.
Haley C.S., Knott S.A., Elsen J.M. (1994) Genetics 136: 1195-1207.
keobi C.O.N., Woolliams, J.A., Morrice, D.R., Law, A., Windsor, D., Burt, D.W., Hocking, P.M.
(2002) Anim. Genet. 33: 428-435.
Jacobsson L., Park H. B., Wahlberg P., Fredriksson R., Perez-Enciso M., Siegel P.B., Andersson
L. (2005) Genet. Res. 86:112-125.
Kerje S., Carlborg ., Jacobsson L., Schtz K., Hartmann C., Jensen P., Andersson L. (2003).
Animal Genetics 34:264-274.
Knott S., Knott S.A., Marklund L., Haley C.A.S., Andersson K., Davies W., Ellegren H., Fredholm
M., Hansson ., Hoyheim B., Lundstrom K., Moller M., Andersson L. (1998) Genetics 149:1069-
1080.
Lagarrigue S., Pitel F., Carr W., Abasht B., Le Roy P., Neau A., Amigues Y., Sourdioux M., Simon
J., Cogburn L., Aggrey S., Lecrercq B., Vignal A., Douaire M. (2006) Genet. Sel. Evol. 38:85-
97.
Lander E. and Kruglyak, L. (1995) Nature Genetics, 11: 241-247.
McElroy J.P., Kim J.J., Harry D.E., Brown S.R., Dekkers J.C.M., Lamont S.J. (2006) Poultry
Science 85:593-605.
Moura A.S.A.M.T., Boschiero C., Campos R.L.R., Ambo M., Nones K., Ledur M.C., Rosario M.F.,
Melo C.M.R., Burt D.W., Coutinho L.L. (2006) Proceedings of the 8th World Congress on
Genetics Applied to Livestock Production. Belo Horizonte. Communication 22-50.
Nones K., Ledur M.C., Ruy D.C., Baron E.E., Melo C.M.R., Moura A.S.A.M.T., Zanella E.L., Burt
D.W., Coutinho L.L. (2006) Anim. Genet. 37:95-188.
Park H.B., Jacobsson L., Wahlberg P., Siegel P.B., Andersson L. (2006) Physiol Genomics 25:216-
223.
Seaton G., Haley C.S., Knott S.A., Kearsey M., Visscher P.M. (2002) Bioinformatics 18: 339-340.
Siwek M., Cornelissen S.J.B., Nieuwland M.G.B., Buitenhuis A.J., Crooijmans R.P.M.A., Groenen
M.A.M., Vries-Reilingh G. de, Parmentier H.K., van der Poel J.J. (2003). Animal Genetics
34:422-428.
Tuiskula-Haavisto M., de Koning D.J., Honkatukia M., Schulman N.F., Mäki-Tanila A., Vilkki J.
(2004). Genet. Res. 84:57-66.
Multistage QTL mapping strategy in an advanced backcross cattle population

Stela Masle, Ivica Medugorac and Martin Förster


Institute for Animal Breeding, Faculty of Veterinary Medicine, Ludwig-Maximilians-
University Munich, D-80539 Munich, Germany


INTRODUCTION

One of the objectives of the European Union research project BovMAS (QLK5-CT-2001-02379)
was the identification of quantitative trait loci (QTL) affecting milk production traits in an
advanced backcross population Fleckvieh x Red Holstein that are identical by descent (IBD),
according to origin and effect. To achieve this objective we used a multistage QTL
mapping strategy combining different mapping designs and methods.
The mapping designs included a daughter design (DD) and a granddaughter design (GDD) as well as
a complex pedigree created by collecting all available ancestors of the (grand)sires from both
designs up to important founders. The mapping methods included mapping by means of
selective DNA pooling, approximate interval mapping (AIM), interval mapping, identity by
descent (IBD) mapping and combined linkage and linkage disequilibrium (LDL) mapping.
Genome-wide haplotyping was performed in the complex pedigree for all 29 autosomes,
based on genome-wide genotyping results, and used in IBD mapping. Family-wise
haplotyping analysis was also performed in eleven GDD families connected to the complex
pedigree to meet the needs of the LDL analysis. Combining the results of these
analyses led to the selection of family sires segregating for a milk protein percent QTL (PP-QTL)
on chromosome 19 (BTA19) for further intensive study and subsequent fine mapping of the
PP-QTL.

STAGE ONE
DD18: A total of eighteen families comprised the daughter design (DD18), divided into two groups.
The first group of ten DD families comes from the purebred Fleckvieh (FV) population sampled in
Bavaria and Austria. The family sires are among the most influential sires in the Fleckvieh
population; among them, bull F1 (born 1966; Fig. 1) proved to be the most important
founder. The second group consists of eight DD families representing a unique
population derived from an advanced backcross Fleckvieh x Red Holstein (ABFV). This
population is conditionally termed an advanced backcross population since a parallel can be
drawn between the advanced backcross QTL analysis (AB-QTL) method, proposed by
TANKSLEY and NELSON (1996), and the backcross between different breeds used in cattle. In the
ABFV population bull F2 (born 1973; Fig. 1) is the most important founder. Repeated
backcrossing of his direct and indirect progeny to Fleckvieh is still in progress. The chosen
half-sib daughter families come from backcross generations three and four.
In DD18, milk samples from a total of 48,190 daughters were collected. The number of
daughters varied from 1470 to 6057, with an average of 2677 daughters per family. Milk
samples were pooled into two tail pools, as proposed by DARVASI and SOLLER (1994).
Daughters for each tail were selected according to corrected breeding values (cBV) as
follows: cBV = daughter's breeding value - half of dam's breeding value.
From each of the selected daughters we pooled 10,000 somatic cells. For each trait and family
there were eight pools: two tail pools (high and low tail) and two replicates, both in two
duplicates. Pools were made for two main traits, milk yield (MY) and milk protein percent
(PP), as well as for seven associated traits: milk protein yield (PY), milk fat yield (FY), milk
fat percent (FP), milk somatic cell count (SCC), maternal non-return rate (NR), maternal
calving ease (CE) and maternal stillbirth (SB). The number of animals in each of the eight pools
was on average 101.5 (98-102) for the main traits and 108 animals (41-152) for the associated
traits. In total, 582 pools were constructed in two duplicates, of which 144 were for the two main
traits.
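
The selection of tail daughters described above can be sketched as follows. This is only a minimal illustration, assuming that cBV is the daughter's breeding value minus half of the dam's breeding value and that each tail is split into two replicate pools by alternating assignment; the function names, the record layout and the replicate-splitting rule are hypothetical and are not taken from the study.

```python
def corrected_bv(daughter_bv, dam_bv):
    """Corrected breeding value: daughter BV minus half of the dam BV."""
    return daughter_bv - 0.5 * dam_bv


def select_tails(records, tail_size):
    """Return (high_tail_ids, low_tail_ids) ranked on cBV.

    records: list of (daughter_id, daughter_bv, dam_bv) tuples (hypothetical layout).
    """
    if 2 * tail_size > len(records):
        raise ValueError("tails would overlap")
    ranked = sorted(records, key=lambda r: corrected_bv(r[1], r[2]))
    low = [r[0] for r in ranked[:tail_size]]
    high = [r[0] for r in ranked[-tail_size:]]
    return high, low


def split_replicates(ids):
    """Assumed rule: alternate animals so both replicate pools span the tail."""
    return ids[0::2], ids[1::2]


if __name__ == "__main__":
    daughters = [("d1", 10, 4), ("d2", 3, 2), ("d3", 8, 0),
                 ("d4", 1, 6), ("d5", 5, 2), ("d6", 7, 8)]
    high, low = select_tails(daughters, tail_size=2)
    print("high tail:", split_replicates(high))
    print("low tail:", split_replicates(low))
```

With two tails and two replicate pools per tail, each constructed in two duplicates, this gives the eight pools per trait and family mentioned in the text.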
Mapping by means of selective DNA pooling in DD18 was applied in a genome-wide
scan. According to DARVASI and SOLLER (1994), selective DNA pooling has proven
statistical power for detecting marker-QTL linkage while simultaneously reducing genotyping
costs and time, which makes it very suitable for fast screening of the genome. Since the
selective DNA pooling mapping method can only be applied to the family of a
heterozygous sire, all family sires had to be genotyped beforehand. Sires were genotyped for
237 markers covering all 29 autosomes and chosen from the public database
(http://www.marc.usda.gov/genome/genome). During the genotyping process we discarded 18
markers because of technical problems or null alleles. A total of 219 markers were considered
for the genome wide scan (GWS) for MY and PP in the four pools of the first duplicate.
Altogether 17,569 pool genotypes of the first duplicate were produced. The second duplicate was
genotyped only for confirmation of the results. In total, 4331 confirmation pool genotypes
for 83 markers were produced. The determination of linkage is based on the distribution of
parental alleles among pooled DNA samples of the extreme phenotypic groups of offspring
(DARVASI and SOLLER, 1994). Estimation of allele frequencies in pooled DNA samples by
shadow band correction and the test for marker-QTL linkage were done as proposed by LIPKIN et
al. (1998) and MOSIG et al. (2001). A total of 21,898 pool genotypes were combined into 4695
single marker tests.
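
The principle of a single marker test on tail pools can be illustrated with a simplified sketch. It assumes a heterozygous sire, n daughters per tail, and a null sampling variance of 1/(8n) for the difference in sire-allele frequency between the two tails (paternal allele drawn with probability 0.5, maternal allele sampling ignored), plus an optional technical error variance. These variance terms and all names are assumptions made for illustration; they are not necessarily the exact expressions used by the cited authors.

```python
import math


def pool_linkage_test(freq_high, freq_low, n_per_tail, technical_var=0.0):
    """Z test on the sire-allele frequency difference between tail pools.

    freq_high, freq_low: estimated frequency of one sire allele in each tail.
    n_per_tail: number of daughters contributing to each tail.
    technical_var: assumed variance of the pooling/genotyping error of the difference.
    """
    d = freq_high - freq_low
    # Paternal allele count ~ Binomial(n, 0.5); as a share of the 2n alleles in a
    # pool its variance is 1/(16n); two independent tails give 1/(8n).
    sampling_var = 1.0 / (8.0 * n_per_tail)
    z = d / math.sqrt(sampling_var + technical_var)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return d, z, p_two_sided


if __name__ == "__main__":
    d, z, p = pool_linkage_test(0.30, 0.25, n_per_tail=200)
    print(f"D = {d:.3f}, Z = {z:.2f}, P = {p:.4f}")
```

A larger technical error variance widens the null distribution, which is one reason why replicate pools with inconsistent frequency estimates are worth re-examining before the test is interpreted.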
The pool genotypes that showed inconsistent patterns between the two replicates, i.e. a large
difference in estimated allele frequencies, were reanalyzed and if necessary retyped. If the
cause of the large variance between replicates could not be resolved, single marker tests were
excluded if they exceeded the arbitrary value of 0.012. On average 10% of the tests were
excluded due to a large variance. In the late phase of data evaluation it was noticed that two
ABFV families had an unusually high number of significant sire by marker combinations but
with a high variance between the two pool replicates (>0.0012). As these two families were
sampled in Austria with different sampling logistics, we presumed an error in the sampling
process and omitted these families from the further analyses. In total, 3701 tests had an acceptable
variance between the two replicates. Out of these 3701 tests, 531 were significant with a probability
value of P<0.05 and 235 with a probability value of P<0.01. The significant results were
distributed over all 29 autosomes. The results of the single marker tests for QTL linkage
on a given chromosome were combined by approximate interval mapping (AIM)
analysis, performed as described by DOLEZAL et al. (2005). Finally, we detected 31 QTL
regions distributed across 26 chromosomes.

GDD20: The first granddaughter design comprised twenty families (GDD20) with a total of
1332 sons. The number of sons varied from 39 to 145, with an average of 67 sons per family.
Some GDD families were strongly related since some family sons were also grandsires in the
design. At the same time eleven DD18 family sires were present as sons in the granddaughter
design, thereby making a connection between these two designs and assuring an independent
sample for confirmation of the mapping results.
Initial interval mapping was performed in GDD20 with the QTLexpress program
(SEATON et al., 2002). We used the model for a half-sib design with corrected breeding values
(cBV) as the phenotype, weighted by their respective reliabilities. Interval mapping was performed
using a one-QTL model at every cM along the chromosome. The chromosome-wise significance
threshold was calculated based on 10,000 permutations. The confidence interval was calculated
by 10,000 iterations of the bootstrapping option. The analysis for one chromosome was done first
with all families together. For significant or indicative results we also performed family-wise
analyses in order to determine the QTL status of each family (similar to SCHNABEL et al., 2005).
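
The idea behind a permutation-based chromosome-wise threshold can be sketched as follows. This is a generic, unweighted single-family illustration and not the implementation of QTLexpress: phenotypes are regressed on the probability of having inherited one sire haplotype at each tested position, the maximum F statistic along the chromosome is recorded, and the procedure is repeated on shuffled phenotypes. All function names and the input layout are hypothetical, and the weighting by reliabilities used in the study is omitted.

```python
import random


def regression_f(x, y):
    """F statistic (1 and n-2 df) for simple linear regression of y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    ss_reg = sxy ** 2 / sxx
    ss_res = syy - ss_reg
    if ss_res <= 0:
        return float("inf")
    return ss_reg / (ss_res / (n - 2))


def max_f(prob_by_position, y):
    """Largest F statistic over all tested positions of one chromosome."""
    return max(regression_f(x, y) for x in prob_by_position)


def permutation_threshold(prob_by_position, y, n_perm=1000, alpha=0.05, seed=1):
    """Empirical (1 - alpha) quantile of the maximum F under shuffled phenotypes."""
    rng = random.Random(seed)
    y_perm = list(y)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        null_max.append(max_f(prob_by_position, y_perm))
    null_max.sort()
    return null_max[min(n_perm - 1, int((1 - alpha) * n_perm))]
```

Taking the maximum over positions within each permutation is what makes the threshold chromosome-wise rather than point-wise.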

FV-ROOT: For seventeen DD18 and seventeen GDD20 families, respectively, we were able
to sample the sire of the sire and all available male ancestors up to important founders. In this
way we built up a complex five-generation pedigree (FV-ROOT) comprising 75 animals
(Fig. 1). FV-ROOT was genotyped genome-wide for 381 markers.



FIGURE 1.- Complex pedigree (FV-ROOT) showing the 18 daughter design sires (DD18, arrows), 20
granddaughter design sires (GDD20; gray circles) of the initial GDD and 11 granddaughter design
sires (GDD11; black circles) of the GDD for intensive study, with all available ancestors back to
important founders (F1, F2). Only one granddaughter design sire comes from an independent
family that connects neither to F1 or F2 nor to the remaining family sires (*). In the pedigree,
squares represent male and circles female animals. Symbols for non-genotyped animals are
crossed with a diagonal line. In order to reduce the complexity of the picture, founder F1 is shown
twice.

STAGE TWO
Haplotyping analysis and identity by descent (IBD) mapping: Haplotyping analysis was
performed with the SimWALK2 program for all 29 autosomes in FV-ROOT and was able to assign
the origin of a chromosome region to the Fleckvieh or to the Red Holstein (RH) ancestor.
The obtained haplotypes were presented graphically for all chromosomes. In the graphics, the
haplotype flow from a founder to the current generation was much easier to track and
recombination sites were easier to observe. As the family sires are part of FV-ROOT, they were
haplotyped genome-wide in the context of the complex pedigree and scanned for QTL by selective
DNA pooling. Following the principles of IBD mapping, all markers for QTL affecting PP
and/or MY in ABFV families were checked for their origin, as determined by the haplotyping
analysis. If the haplotyping analysis pointed to an RH origin we investigated these
regions closely. In total, eight QTL regions were closely inspected for
possible RH introgression: BTA01 proximal, BTA05 distal, BTA09 distal, BTA10 proximal,
BTA19 central, BTA23 central to distal, BTA25 central and BTA28 distal. As the QTL
mapping by selective DNA pooling was performed in different stages and there was good
concordance between the single marker test and AIM results on the one hand and the haplotype
analysis for two ABFV families, segregating for the QTL affecting PP, on the other hand,
BTA19 was selected as the most promising candidate for further intensive study.
On BTA19 we used nine markers in the genome wide scan. AIM results showed the
presence of a highly significant QTL affecting PP. The most probable position of the PP-QTL
was estimated between 20 and 70 cM, with the highest peak at marker URB44 at 39.01
cM (Fig. 2). The evaluation of the family-wise AIM-statistic curves showed that two ABFV
families contribute to the highest effect at marker URB44, with a negative effect on
PP (-0.017). Family-wise AIM also indicates the possibility of one more segregating family
of purely FV origin at the adjacent marker (BM17132).
According to the single marker test and AIM results, all the ABFV sires were divided into
two groups. In the first group were two sires showing a significant effect on PP and in
the second group four non-significant ABFV sires. The sires' haplotypes were compared between these
two groups in order to localize the PP-QTL more precisely (similar to RIQUET et al., 1999; Fig.
2). The two significant families shared the same Red Holstein haplotype in the vicinity of
URB44. The four non-significant sires, sorted by the size of the received RH haplotype, received:
- The proximal and distal part of the RH haplotype block but not the central part (URB44
and BM17132)
- Only the distal part, including BM17132
- A short chromosomal fragment in the vicinity of marker URB44, possibly also the next
proximal marker
- No RH haplotype at all
If one sire received the Red Holstein haplotype at marker BM17132 and another one at
URB44, but neither of them is significant for the QTL affecting PP, while the significant
sires received the Red Holstein haplotype at both markers, then it could mean that the QTL lies
between these two markers, in a region of approximately 20 cM. Of course, this inference is
correct only under the assumption that all performed analyses are accurate and that there are neither
false positives nor false negatives. Both PP-QTL segregating families were selected for the
intensive study.
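
The haplotype comparison above amounts to a simple exclusion rule, which can be sketched as follows. The sketch assumes that an interval between two adjacent markers remains a candidate only if every significant sire carries the RH haplotype at both flanking markers and no non-significant sire does. The marker names PROX and DIST are hypothetical placeholders for the markers flanking URB44 and BM17132, and the function name and input layout are illustrative only.

```python
def candidate_intervals(marker_order, significant, non_significant):
    """Adjacent-marker intervals compatible with the haplotype sharing pattern.

    significant / non_significant: dict mapping sire -> set of markers at which
    the sire carries the founder (RH) haplotype.
    """
    intervals = []
    for left, right in zip(marker_order, marker_order[1:]):
        in_all_sig = all(left in m and right in m for m in significant.values())
        in_any_nonsig = any(left in m and right in m for m in non_significant.values())
        if in_all_sig and not in_any_nonsig:
            intervals.append((left, right))
    return intervals


if __name__ == "__main__":
    order = ["PROX", "URB44", "BM17132", "DIST"]  # PROX and DIST are placeholders
    sig = {"sire1": set(order), "sire2": set(order)}
    nonsig = {
        "sire3": {"PROX", "DIST"},      # proximal and distal part, not the centre
        "sire4": {"BM17132", "DIST"},   # only the distal part
        "sire5": {"PROX", "URB44"},     # short fragment around URB44
        "sire6": set(),                 # no RH haplotype
    }
    print(candidate_intervals(order, sig, nonsig))
```

As in the text, the result is only as reliable as the QTL status assigned to each sire: a single false negative among the non-significant sires would wrongly exclude an interval.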




FIGURE 2.- Identity by descent (IBD) mapping for chromosome 19 (BTA19). A) Results from
approximate interval mapping (AIM) for protein percentage (PP) and B) for milk yield (MY) are
shown. C) The haplotyping analysis for BTA19 is shown only for the advanced backcross population
(ABFV). The important founder F2 is marked red and the eight family sires are marked A-H. F2's
haplotype coming from Red Holstein is marked red, F2's haplotype coming from Fleckvieh is marked
blue and non-F2 haplotypes are marked grey. Each square in the haplotype represents one marker used
in the analysis. Paternal haplotypes are placed left and maternal right. In the pedigree, squares
represent male and circles female animals. Symbols for non-genotyped animals have a diagonal line
through them. Three ABFV families, segregating for the quantitative trait locus (QTL) affecting PP
and the QTL affecting MY, are marked with a yellow arrow. Families excluded from analysis are marked
with an asterisk (*).

STAGE THREE
Initial interval mapping: Initial interval mapping on BTA19 was conducted with QTLexpress
in GDD20 with data coming from previous mapping studies in the Fleckvieh population.
A total of 16 markers were used for the analysis. Because the marker genotypes did not come
from one project but were collected from different projects, not all families were genotyped
for all the markers (Fig. 3). The interval mapping procedure of QTLexpress and similar linear
regression based programs is able to combine different families genotyped for different
marker sets into one across-family analysis. Initial interval mapping gave us indications that
there might be two QTL on BTA19 (both with an F-statistic between 2 and 3): one QTL
affecting PP at approximately 55 cM and a second one affecting MY and PY at approximately
102 cM (Fig. 3). Family-wise analyses were conducted in order to determine the QTL status of
each family (similar to SCHNABEL et al., 2005). Altogether six families were heterozygous for the
QTL affecting PP (PP-QTL), of which three are ABFV families and three are pure
FV families. Surprisingly, two of the three ABFV sires did not receive the Red Holstein haplotype at
all. The results from initial interval mapping, together with the family-wise AIM
results, indicate the possibility that this PP-QTL is already present in FV. To test this possibility
further, all six families segregating for the PP-QTL were included in the set for the intensive study.



FIGURE 3.- Initial interval mapping in
granddaughter design (GDD20). Interval
mapping results for milk yield (MY), fat yield
(FY), protein yield (PY), fat percent (FP) and
protein percent (PP) are shown. Positions of
markers used in analysis are denoted on the X-
axis and the number of families, genotyped for
each marker, is denoted. F-statistic values are
presented on Y-axis and chromosome length in
centiMorgans (cM) on X-axis.

STAGE FOUR
GDD11: A total of eleven granddaughter design families (GDD11) were chosen for the
intensive study: two families were chosen according to the results of selective DNA
pooling (DD18) and six families according to the results of initial interval mapping
(GDD20). Three additional families, two closely related and one unrelated, which were
non-significant in the selective DNA pooling but are very important for the genetically active
Fleckvieh population, were added to the set to contribute to the fine mapping through maternal
haplotypes. GDD11 consisted of a total of 681 animals. The number of sons varied from 22 to 96,
with an average of 63 sons per family.
Markers were selected mostly in the region of the possible location of the PP-QTL
(Fig. 4). Out of 22 markers in the intensive study, three were excluded from further analysis
due to inconsistent results (UW33 and DIK4688) or a null allele (DIK5098). On the other
hand, two markers from the GWS, ILSTS73 and RM388, were included in the final analyses
because they had been genotyped for some of the chosen families in previous projects. Thus, the
final analyses were performed with a total of 21 markers (Fig. 4).
All produced genotypes were submitted to two systems of quality control. The first system
consists of repeated genotyping of already genotyped animals. The second system includes a
paternity check. Additional controls included mistyping analysis with the SimWALK2 program
(SOBEL and LANGE, 1996), which was performed in the FV-ROOT pedigree, and the chrompic
option of the CRI-MAP program (LANDER and GREEN, 1987), which was performed in the
GDD11 pedigree.
In order to confirm the marker order given by the published linkage map (USDA
linkage map; ITOH et al., 2005; IHARA et al., 2004) and to separate the markers that were at
the same position on that map for BTA19, we used the build option of the CRI-MAP program.
The fixed orders were ascertained by comparing results from the published linkage map, the
high-resolution radiation hybrid map (ITOH et al., 2005; EVERTS-VAN DER WIND et al., 2005) and
the whole genome shotgun sequence results for the corresponding chromosome. The best order,
which integrates all available information on BTA19, was chosen (Fig. 4).



FIGURE 4.- Markers used for analyses on
chromosome 19 in genome wide scan (GWS)
and intensive study (set 1 and set 2). Marker
positions, in centiMorgans, are taken from the
publicly available linkage map (USDA map)
except for the underlined markers, whose
position is the result of our own linkage analysis.
Markers in parentheses were left out of the
final analyses. Two GWS markers (*) were
included into final analyses because some of
GDD11 families were genotyped for them in
other projects (number of genotyped families
given in parentheses).

STAGE FIVE
Final interval mapping: The results of interval mapping across the families confirmed the
presence of the QTL affecting PP at the approximate position of 55 cM with P=0.025 (Fig. 5).
Even though we used a closely spaced marker map, the PP-QTL position could not be refined by
interval mapping. The reason is that there are only a few informative
recombinations between closely spaced markers (OLSEN et al., 2004). The 95% confidence
interval of the QTL position covered a broad range from 0 to 95 cM, with the best
results between 54 and 62 cM. However, we observed that the bootstrapping procedure
implemented in the QTLexpress program often produces an extra peak at the beginning and/or
the end of the chromosome, resulting in a very broad confidence interval.



FIGURE 5.- Final interval mapping in
granddaughter design (GDD11). Interval
mapping results for milk yield (MY), fat yield
(FY), protein yield (PY), fat percent (FP),
protein percent (PP) and the results of
bootstrapping procedure for all families are
shown. The number of bootstrap samples has
been rescaled. Positions of markers used in
analysis are denoted on the X-axis. F-statistic
values are presented on Y-axis and
chromosome length in centiMorgans (cM) on
X-axis.

STAGE SIX
Combined linkage disequilibrium and linkage analyses: Combined LDL mapping based
on a variance component approach was performed as described by LEE and VAN DER WERF
(2004; 2005; 2006). For the analyses, the mutation age and the past effective population size were
held at 100. The initial homozygosity at each locus was 0.25. Two programs were used:
- combined linkage disequilibrium (LD) and linkage (L) analysis with the random walk
approach (ra) and the meiosis Gibbs sampling (ms), LDL_rams, which makes use of
unordered genotypes, and
- combined linkage disequilibrium (LD) and linkage (L) analysis, LDL, which makes
use of reconstructed haplotypes.
For the final haplotyping analysis (by SimWALK2) and the combined LDL analyses, GDD11
animals were, together with their sires and dams, connected through ancestors to the
FV-ROOT, building a complex pedigree based on GDD11. This pedigree was then filtered for
those animals genotyped for 12 to 21 markers. The applied filter left 593 genotyped animals,
i.e. a total of 1460 animals in the pedigree. The threshold of 12 markers for rating the success of
the genotyping process (>50%) was established empirically.
LDL_rams was started with 1100 iterations and an initial burn-in of 100. Parameter
estimates were collected every 10th round. The LDL_rams analysis locates the PP-QTL at
position 53.69 cM with a log-likelihood ratio test value, $\mathrm{LRT} = -2\,(\log L_0 - \log L_p)$, of 8.46
(Fig. 6). According to OLSEN et al. (2004), the LRT value is
chi-squared distributed with 1 degree of freedom. Assuming this probability distribution, the
PP-QTL is highly significant (P=0.0036). To calculate the confidence interval of the QTL
position we used the 1-LOD drop-off criterion (LANDER AND BOTSTEIN, 1989; Fig. 6).
Reconstructed haplotypes, made by SimWALK2, were used for the LDL analysis. The
same pedigree, consisting of 1460 animals, was used. The LDL analysis located the PP-QTL at
the same position as the LDL_rams analysis (53.69 cM). The LRT value is 11.63 and the
PP-QTL is highly significant at this position (P=0.0006). The second peak at 55.89 cM, which
also appears in the LDL_rams analysis, is now more prominent and significant as well.
The calculated LOD score is 2.52 at position 53.69 cM and 2.25 at 55.89 cM. Because of the
second peak, the 1-LOD drop-off confidence interval now lies between approximately 50 and
59 cM.
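
The relation between the reported LRT values, P values and LOD scores can be checked with a short sketch. It assumes a chi-squared distribution with 1 degree of freedom for the LRT and the conversion LOD = LRT / (2 ln 10); the LRT profile used in the example is hypothetical and serves only to show how a 1-LOD drop-off interval is read from a curve with two neighbouring peaks.

```python
import math


def lrt_p_value(lrt):
    """Upper-tail probability of a chi-squared variable with 1 df."""
    return math.erfc(math.sqrt(lrt / 2.0))


def lrt_to_lod(lrt):
    """Convert a likelihood ratio test value to a LOD score."""
    return lrt / (2.0 * math.log(10.0))


def lod_drop_interval(positions, lrt_values, drop=1.0):
    """Outermost tested positions whose LOD is within `drop` of the maximum."""
    lods = [lrt_to_lod(v) for v in lrt_values]
    threshold = max(lods) - drop
    inside = [pos for pos, lod in zip(positions, lods) if lod >= threshold]
    return min(inside), max(inside)


if __name__ == "__main__":
    for lrt in (8.46, 11.63):
        print(f"LRT {lrt}: P = {lrt_p_value(lrt):.4f}, LOD = {lrt_to_lod(lrt):.2f}")
    # Hypothetical LRT profile, only the two peak values are taken from the text.
    positions = [45.0, 50.0, 53.69, 55.89, 59.0, 65.0]
    profile = [3.0, 7.5, 11.63, 10.36, 7.2, 2.0]
    print(lod_drop_interval(positions, profile))
```

Under these assumptions the LRT values of 8.46 and 11.63 correspond to P values of about 0.0036 and 0.0006, and the latter to a LOD score of about 2.5, in agreement with the figures given in the text.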


FIGURE 6.- Results of combined linkage disequilibrium and linkage analysis (LDL) by LDL_rams
(left) and LDL (right) program for quantitative trait locus affecting the protein percent (PP-QTL).
Positions of markers used in analysis are denoted on the X-axis. The 1-LOD drop-off confidence
interval for PP-QTL is marked grey. The log-likelihood ratio test values (LRT) are presented on Y-
axis and the chromosome length in cM on the X-axis.

The peak at approximately 67 cM, which appears in the LDL_rams analysis, disappears when
the most probable haplotypes are used. This leads us to conclude that the peak most
probably arose from the small number of samplings used (1100 samplings).

Checking for possible associated effects: To estimate possible associated effects of the QTL
mapped here, we analyzed our data for all available traits. Analysis was performed for the
following traits: milk yield (MY), fat yield (FY), protein yield (PY), fat percent (FP), milk somatic
cell count (SCC), milkability (MA), persistency (PE), productive life (PL), maternal
non-return rate (mNR), paternal non-return rate (pNR), maternal calving ease (mCE), paternal
calving ease (pCE), maternal stillbirth (mSB) and paternal stillbirth (pSB). Out of all analysed
traits we obtained significant results for QTL affecting three traits (Fig. 7):
- QTL for milkability at 59.21 cM with an LRT value of 14.51 (P=0.0001)
- QTL for productive life with two distinct peaks, at 53.69 cM with an LRT value of
9.72 and at 57.86 cM with an LRT value of 8.38 (P=0.002 and P=0.004, respectively)
- QTL for milk somatic cell count at 25.94 cM with an LRT value of 6.73
(P=0.009).
The QTL with an effect on milkability (MA-QTL) shows a very high LRT value and is highly
significant. In order to test whether the PP-QTL and the MA-QTL are one QTL with an effect on
both traits or two separate QTL, we analysed the data with the QTLexpress program.
The interval mapping results across families for both traits showed that the estimated sire
effects go in the same direction. The F-statistic curves in the family-wise analysis showed
that there are:
- Three families significant for both QTL,
- One family with significant results for the MA-QTL and indicative results for the PP-QTL,
- One family indicative for both QTL and
- One family with significant results for the MA-QTL, but without any effect for the PP-QTL.
According to these results we could conclude that it is most probably one QTL with an effect
on both PP and MA, with a stronger effect on MA. However, a more formal test of one pleiotropic
mutation versus two distinct QTL can be performed by applying a multi-trait multi-QTL
model (MEUWISSEN AND GODDARD, 2004).



FIGURE 7.- Results of combined linkage
disequilibrium and linkage analysis (LDL) by
LDL program for the quantitative trait locus
affecting milkability (MA), productive life
(PL) and somatic cell count (SCC). Positions
of markers used in analysis are denoted on the
X-axis. The log-likelihood ratio test values
(LRT) are presented on Y-axis and the
chromosome length in cM on the X-axis.
Productive life is a complex trait that depends on many productivity, fertility and
conformation traits and is also difficult to estimate. Because it is based on the productivity
traits, it is to be expected that it maps together with some production trait as well. There was also an
indication of a QTL for somatic cell count proximally on BTA19 (BENNEWITZ et al., 2003),
which is in good concordance with the QTL for SCC detected here. The marker map used for
the analyses here was not dense enough in the proximal part of the chromosome, so the QTL
position cannot be resolved better.

LD map: The combined LDL method can use only the amount of LD existing in the mapping population. If
there is no substantial LD between the markers used, the mapping result will be based mostly on
linkage information. To test this possibility we made a linkage disequilibrium (LD) map
for the studied region. The LD map was constructed by first calculating the amount of LD
(measured as D'; HEDRICK 1987) for all possible marker pairs and then, for each marker,
finding the average D' between that marker and all markers residing within 5 cM in each
direction from that marker (OLSEN et al., 2005; Fig. 8). D' for markers whose distance to the
nearest marker exceeds 5 cM was set to 0. Figure 8 shows that the D' values for the
genotyped markers on BTA19 usually do not exceed 0.15. Only four markers have D'
values higher than 0.15 (IDVGA46, ILSTS014, DIK4051 and URB32). Markers IDVGA46
and ILSTS014 show very high D' values compared to the other markers. The explanation is
that both markers are poorly informative. Both have three alleles, of which one allele has a
frequency over 95%.
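
The construction of such an LD map can be sketched as follows. The sketch assumes the multi-allelic D' of HEDRICK (1987), i.e. the sum over allele pairs of p_i q_j |D_ij| / D_max,ij, computed from phased haplotypes, followed by averaging within a 5 cM window. Function names and input layouts are hypothetical, and estimation of haplotype frequencies from pedigree data is not covered.

```python
from collections import Counter


def multiallelic_d_prime(haplotypes):
    """Multi-allelic D' from a list of (allele_at_marker_A, allele_at_marker_B)."""
    n = len(haplotypes)
    freq_a = Counter(a for a, _ in haplotypes)
    freq_b = Counter(b for _, b in haplotypes)
    freq_ab = Counter(haplotypes)
    total = 0.0
    for a, count_a in freq_a.items():
        p = count_a / n
        for b, count_b in freq_b.items():
            q = count_b / n
            d = freq_ab[(a, b)] / n - p * q
            if d < 0:
                d_max = min(p * q, (1 - p) * (1 - q))
            else:
                d_max = min(p * (1 - q), (1 - p) * q)
            if d_max > 0:  # skip monomorphic markers
                total += p * q * abs(d) / d_max
    return total


def ld_map(positions, pair_d_prime, window_cm=5.0):
    """Average D' of each marker with all markers within window_cm (0 if none).

    positions: dict marker -> position in cM.
    pair_d_prime: dict frozenset({marker1, marker2}) -> D'.
    """
    result = {}
    for marker, pos in positions.items():
        values = [pair_d_prime[frozenset((marker, other))]
                  for other, other_pos in positions.items()
                  if other != marker and abs(other_pos - pos) <= window_cm]
        result[marker] = sum(values) / len(values) if values else 0.0
    return result
```

Because D' is scaled by its maximum given the allele frequencies, markers with one very common allele can yield high values from few haplotypes, which is in line with the explanation given above for IDVGA46 and ILSTS014.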



FIGURE 8.- Linkage disequilibrium map for the 15 markers on chromosome 19 (BTA19). The
average D' values (Y-axis) between the named markers (X-axis) and the markers within a distance of
5 cM.

DISCUSSION
The IBD QTL mapping method has been successfully applied in humans (DE VRIES
et al., 1996; FALLIN et al., 2001), cattle (RIQUET et al., 1999; LI et al., 2004) and pigs (NEZER
et al., 2003). However, it was not always applied with the same success in cattle. The fine
mapping of a QTL with an effect on milk production on BTA14 (RIQUET et al., 1999) was
hampered by the selected mapping population, which consisted of the Dutch Holstein-Friesian
population and the New Zealand Holstein-Friesian population. As was later shown, the haplotype of
one of the New Zealand sires was coincidentally identical by state with the haplotypes of the Dutch
sires, leading to an erroneous QTL localization (FARNIR et al., 2002). On the other hand, LI et al. (2004)
reported successful application of the IBD method in mapping QTL for backfat on
chromosomes 2, 5, 6, 19, 21, and 23 in a commercial cattle population. This population was
developed from an Angus base and is expected to derive from one or a limited number of
founders. It was also under selection for over 30 years, which should be an extra factor
contributing positively to the IBD mapping (LI et al., 2004). The IBD mapping presented here
differs from the one proposed by RIQUET et al. (1999) in that we compared the
haplotypes of highly related sires, so we were able to include the haplotypes of the
non-segregating sires in the comparison in order to refine the QTL region as much as possible. Also, the
chosen mapping population, an advanced backcross population Fleckvieh x Red Holstein
(ABFV), represents a unique opportunity for IBD mapping, as the influence of
the founder in such a population is substantial. In addition, the ABFV is under selection for the
milk production traits examined here.
The feasibility of the IBD mapping method depends on the extent of linkage
disequilibrium (LI et al., 2004). Analysis in the Dutch Holstein-Friesian population revealed
surprisingly high levels of LD extending over several tens of centiMorgans (FARNIR et al.,
2000). These findings were confirmed in the North American Holstein population (VALLEJO et
al., 2003), two Japanese beef breeds (ODANI et al., 2006), the U.K. Holstein population (TENESA
et al., 2003) and the Australian Holstein population (KHATKAR et al., 2006). Our results on
BTA19 do not agree with the prediction of FARNIR et al. (2000), who expect that a situation
similar to the one described in the Dutch Holstein-Friesian population will be encountered in
most other dairy cattle populations. Even though they found substantial LD on BTA19, the
results in our population support the conclusion that the degree of LD differs
between populations. The lower LD can arise either from lower marker density or from a
higher effective population size. Bavarian and Austrian Fleckvieh breeders use large numbers
of tested parents to produce and select successive cattle generations. Around 400 bulls
in Bavaria and 140 bulls in Austria, coming from a broad population of dams, are tested
every year. The large number of parents used leads to a high effective size (Ne > 250; PIRCHNER,
2002), in contrast to the estimated effective population size of 50 in the Dutch black-and-white
population (BOICHARD, 1996), and consequently to a low degree of LD in the population. In order to
use LD we would need a far denser marker map in Fleckvieh. Given a denser marker map,
the lower LD will allow for finer QTL mapping in Fleckvieh. In any case, our observations
demonstrate the importance of checking the degree of LD in the mapping population beforehand in
order to decide about mapping methods and the required marker density.
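
The dependence of LD on effective population size can be made concrete with a rough sketch. It uses the standard approximation E[r^2] = 1 / (1 + 4 Ne c) for the expected squared correlation between loci at recombination distance c (in Morgans) in a population at equilibrium. This formula is not taken from the present study, it refers to r^2 rather than D', and the distance of 5 cM is chosen only for illustration.

```python
def expected_r2(ne, c_morgan):
    """Approximate expected r^2 at equilibrium: 1 / (1 + 4 * Ne * c)."""
    return 1.0 / (1.0 + 4.0 * ne * c_morgan)


if __name__ == "__main__":
    for ne in (50, 250):
        print(f"Ne = {ne}: E[r^2] at 5 cM = {expected_r2(ne, 0.05):.3f}")
```

Under this approximation, a fivefold larger effective population size gives a several-fold lower expected r^2 at the same marker distance, so markers must be placed considerably closer together to capture comparable LD.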
The method that combines linkage disequilibrium with linkage analysis was chosen
for refinement of the QTL position. Compared with the method using linkage
only, this method was able to refine the QTL position substantially. As already mentioned, the
degree of LD in our mapping population is low, so most of the information was, once more, extracted
from the data by linkage analysis. The included pedigree information, as discussed by LEE AND
VAN DER WERF (2004), has a big impact on the final mapping result when only linkage is used
but is not critical when the LD information is used. Our results suggest that pedigree
information should be used whenever available, especially when the amount
and distribution of LD over the genome in the mapping population are not known.
Some important questions should be resolved in the future:
- QTL origin
- Refinement of the QTL position
- Existence of one or two QTL affecting PP and MA
The PP-QTL was most probably introduced into ABFV through their RH founder, but it also
exists in the FV population, affecting PP and MA in both populations.

The mapping population presented here offers different opportunities for fine QTL mapping by (a)
combining a dense marker map with low LD in the effectively large Fleckvieh population, (b)
combining the different breed origins of most probably the same causal mutation, and (c) combining
the effects of a single causal mutation on two different traits. These opportunities can be exploited
only with more sophisticated mapping models.

LITERATURE CITED

BENNEWITZ et al. (2003) Genet Sel Evol. 35(3):319-38.
BIOCHARD (1996) Prod. Anim., 9: 323-335.
DARVASI AND SOLLER (1994) Genetics 138(4): 1365-73.
DE VRIES et al. (1996) Hum Genet. 98(3):304-9.
DOLEZAL et al. (2005) EAAP-Book of Abstracts No.11 p118.
EVERTS-VAN DER WIND et al. (2005) Proc Natl Acad Sci U S A 102(51): 18526-31.
FALLIN et al. (2001) Genome Res. 11(1):143-51.
FARNIR et al. (2000) Genome Res 10(2): 220-7.
FARNIR et al. (2002) Genetics 161(1): 275-87.
HEDRICK (1987) Genetics 117(2):331-41.
IHARA et al. (2004) Genome Res 14(10A): 1987-98.
ITOH et al. (2005) Genomics 85(4): 413-24.
KHATKAR et al. (2006) Genet Sel Evol 38:463-477.
LANDER AND BOTSTEIN (1989) Genetics 121(1):185-99.
LANDER AND GREEN (1987) Proc Natl Acad Sci U S A 84(8): 2363-7.
LEE AND VAN DER WERF (2004) Genet Sel Evol 36(2): 145-61.
LEE AND VAN DER WERF (2005) Genetics 169(1): 455-66.
LEE AND VAN DER WERF (2006) Genet Sel Evol 38(1): 25-43.
LI et al. (2004) J Anim Sci 82(4): 967-72.
LIPKIN et al. (1998) Genetics 149(3): 1557-67.
MEUWISSEN AND GODDARD (2004) Genet Sel Evol., 36(3):261-79.
MOSIG et al. (2001) Genetics 157(4): 1683-98.
NEZER et al. (2003) Genetics 165(1):277-85.
ODANI et al. (2006) Anim Genet. 37(2):139-44.
OLSEN et al. (2004) J Dairy Sci. 87(3):690-8.
OLSEN et al. (2005) Genetics 169(1): 275-83.
PIRCHNER (2002) Arch. Tierz., Dummerstorf 45, 4:331-339.
RIQUET et al. (1999) Proc Natl Acad Sci U S A 96(16): 9252-7.
SCHNABEL et al. (2005) Proc Natl Acad Sci U S A 102(19): 6896-901.
SEATON et al. (2002) Bioinformatics 18: 339-340.
SOBEL AND LANGE (1996) Am J Hum Genet 58(6): 1323-37.
TANKSLEY AND NELSON (1996) Theor Appl Genet 92: 191-203.
TENESA et al. (2003) J Anim Sci. 81(3):617-23.
VALLEJO et al. (2003) J Dairy Sci. 86(12):4137-47.
FINE MAPPING OF SSC4 FOR MEAT AND CARCASS QUALITY TRAITS IN A COMMERCIAL CROSSBRED PIG POPULATION
A. Slawinska (1), M. Siwek (1), E.F. Knol (2), D.T. Roelofs-Prins (2), H.J. van Wijk (3), B. Dibbits (3), M. Bednarczyk (1)
(1) University of Technology and Life Sciences, Poland
(2) Institute for Pig Genetics, the Netherlands
(3) Wageningen University, the Netherlands
Pork Quality
Focus on meat quality is influenced by market demands.
Pork quality is related to (Sellier, 1998):
- Eating quality
- Nutritional quality
- Technological quality
- Hygienic quality
- Ethical quality
Pork quality traits: pH, colour, marbling, anatomy, and many more.
QTLs on SSC4
SSC4 harbours many QTLs affecting meat and carcass quality. Among others:
- Backfat & Intramuscular Fat
- Colour (Minolta, JCS)
- Fatty Acid Composition
- Carcass Weight & Length
- Lean Weight & Percentage
- Diameter & Proportion of Muscle Fibers
- And many more.

The Aim of the Study
Validation and fine mapping of QTLs for meat and carcass quality detected on SSC4q in the commercial HMG population.
Materials & Methods
The HMG Population:
- 14 sires (Pietrain/Large White synthetic sire line)
- 89 crossbred sows (not genotyped)
- 303 offspring (HMG commercial crossbred)

Materials & Methods
The Traits:
- Meat Quality: meat colour, pH, drip loss, marbling, firmness, conductivity
- Carcass: ham weight, loin weight, belly weight
- Fatness: backfat, loin depth, meat percentage
Materials & Methods
Markers on SSC4q: S0073, S0817, Sw270, S0809, Sj551, S0813
Marker bracket under study: S0073 - S0813
Materials & Methods
Location of Markers:
Marker   Position (cM)
S0073    74.4
S0817    76
Sw270    82.4
S0809    92
Sj551    93
S0813    93.9

Materials & Methods
Analysis method:
- Regression analysis of half-sib families (Knott et al., 1996) implemented in the QTL Express package (Seaton et al., 2002)
- Confidence thresholds obtained for each trait/family by bootstrapping, experiment-wide
Results - PIC & He
Loci     PIC    He
S0073    0.41   0.45
S0817    0.74   0.77
S0809    0.49   0.54
S0813    0.31   0.33
Sj551    0.37   0.41
Sw270    0.58   0.64
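The slides do not state how He and PIC were computed. A minimal sketch, assuming the commonly used definitions (expected heterozygosity He = 1 - sum of p_i^2, and the polymorphism information content of Botstein et al., 1980), is given below; the allele frequencies in the example are hypothetical and are not taken from the study.

```python
def heterozygosity(freqs):
    """Expected heterozygosity: He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content (Botstein et al., 1980):
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    cross = 0.0
    for i in range(len(freqs) - 1):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return heterozygosity(freqs) - cross


if __name__ == "__main__":
    # Hypothetical allele frequencies for a microsatellite with three alleles.
    example = [0.5, 0.3, 0.2]
    print(f"He  = {heterozygosity(example):.3f}")
    print(f"PIC = {pic(example):.3f}")
```

With these definitions PIC can never exceed He, which matches the pattern in the table above.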
Regression analysis results
Meat Quality Traits (colour)
1. JCS cutting surface loin
2. JCS outer ham
3. JCS inside ham
4. Minolta L value Loin
5. Minolta a value Loin
6. Minolta b value Loin
7. JCS Loin, day 6 post mortem
8. Loss in JCS Loin
Significance labels shown on the slide: 0.01 level (one trait), 0.05 level (two traits).
[Figure: QTLs for meat colour. F-statistics plotted against relative position (74-94 cM) for JCS inside ham, Minolta a value Loin and Minolta b value Loin, with significance thresholds at the 0.05 and 0.01 levels.]
Regression analysis results
Meat Quality Traits (others)
1. pH of the ham 24 h post mortem
2. pH of the loin 24 h post mortem
3. dripscore (filter paper)
4. drip weight (mg) of filter paper
5. conductivity (loin)
6. fiber optic probe (loin)
7. driploss 6 days p.m.
8. ham marbling score
9. loin marbling score
10. calculated driploss (dripscore)
Significance labels shown on the slide: 0.01 level (two traits).
[Figure: QTL for loin pH value. F-statistics plotted against relative position (74-94 cM) for pH of the loin at 24 hours, with significance thresholds at the 0.05 and 0.01 levels.]

[Figure: QTL for conductivity. F-statistics plotted against relative position (74-94 cM) for conductivity (loin), with significance thresholds at the 0.05 and 0.01 levels.]
Regression analysis results
Carcass Traits
1. Weight (kg) ham trimmed
2. Weight (kg) ham deboned
3. Weight (kg) loin trimmed
4. Weight (kg) loin deboned
5. Weight (kg) belly
Significance label shown on the slide: 0.01 level (one trait).
[Figure: QTLs for ham weight. F-statistics plotted against relative position (74-94 cM) for weight (kg) ham trimmed and weight (kg) ham deboned, with significance thresholds at the 0.05 and 0.01 levels.]
Regression analysis results
Fatness Traits
1. backfat (HGP)
2. loin depth (HGP)
3. lean meat percentage (HGP)
Discussion of the results (colour)
[Figure: positions of colour QTLs on SSC4, relative position 20-120 cM. This study: Minolta a, Minolta b, JCS. Previously reported QTLs shown for comparison:]
- Minolta a: OVILO et al., 2002
- JCS: EDWARDS et al., 2006
- CS: WANG et al., 1998
- Minolta L: DE KONING et al., 2001
- Minolta L: MALEK et al., 2001
- JCS: VAN WIJK et al., 2006
- Minolta a: VAN WIJK et al., 2006
Discussion of the results (others)
[Figure: positions of pH and conductivity QTLs on SSC4, relative position 20-120 cM. This study: pH, conductivity. Previously reported QTLs shown for comparison:]
- pH: VAN WIJK et al., 2006
- pH: DE KONING et al., 2001
- conductivity: CEPICA et al., 2003

Discussion of the results (carcass)
[Figure: positions of ham weight QTLs on SSC4, relative position 20-120 cM. This study: ham weight trimmed, ham weight deboned. Previously reported QTLs shown for comparison:]
- Ham weight: GELDERMANN et al., 2003
- Ham weight: CEPICA et al., 2003
- Ham weight: BEECKMANN et al., 2003
- Ham weight: VAN WIJK et al., 2006
Conclusions
- In this study, QTLs affecting meat and carcass quality on SSC4q were refined and validated in the commercial HMG crossbred.
- The region harbouring QTLs for meat quality traits (colour, pH value) was flanked by the markers S0073 and Sw270 (74-82 cM).
- QTLs for ham weight were situated in the marker bracket S0809 - S0813 (92-94 cM).
- The same position of the QTLs for colour score (Minolta a value loin) and pH value may suggest a similar genetic background influencing those two traits.
XI QTL-MAS Workshop, 22-23/03/2007, INRA, Toulouse, France
ANALYSES OF CANDIDATE GENES ON PORCINE CHROMOSOME 4: A STATUS SURVEY OF THE PROJECT.
J. Estellé (a), A. Ojeda (a), J.M. Folch (a) & M. Pérez-Enciso (a,b)
(a) Dept. Ciencia Animal i dels Aliments, Facultat de Veterinaria, Universitat Autonoma de Barcelona.
(b) Institut Catala de Recerca i Estudis Avancats, Barcelona.
Despite intense research efforts, the causative factor of the FAT1 QTL on porcine
chromosome 4 (Andersson et al., 1994) remains unknown. We have shown, in an
Iberian x Landrace (IBMAP) cross, statistical evidence that at least two loci
segregate on chromosome 4: one QTL has a large effect on fatness, while the
second, located about 30 cM away, predominantly affects the shape of the
animal (length and foreleg weight) and growth (Mercade et al., 2005, 2006). A
cluster of three fatty acid binding protein genes (FABP4, PMP2 and FABP5) resides within
the confidence interval of the first QTL. In the IBMAP cross, none of the
polymorphisms genotyped in the FABP4 and FABP5 genes could individually explain all the
variability in fatness, but a two-SNP FABP4-FABP5 haplotype did (Mercade et al.,
2006; Estelle et al., 2006). Next, we studied the effect of this haplotype in four sire
families from three outbred commercial populations, but no association was found. This
could be explained by the fact that the Iberian haplotype was not found in these
populations (Estelle et al., unpublished). To characterize the linkage disequilibrium
structure in more detail and to look for any footprint of selection in this
region, we resequenced the whole FABP4 and FABP5 genes in 23 and 14 pigs,
respectively, from 10 breeds, including wild boar, and babirusa as outgroup (Ojeda et
al., 2006, 2007). Intriguingly, the polymorphism patterns of the two genes were
completely different despite the fact that they are closely located (~200 kb apart in humans).
FABP5 had a nucleotide diversity of 0.2%, in accordance with published results in
other species, while it was 1.17% in FABP4, five times more and much higher than
usually reported in domestic species. Furthermore, two highly divergent FABP4
haplotypes were found to be segregating even in highly inbred animals like Tamworth
or Iberian. No relationship was found between haplotype and geographic origin for
FABP4. The estimated substitution rates also differed (about double in FABP4
with respect to FABP5). Finally, the Hudson-Kreitman-Aguade (HKA) test was
significant (P < 0.01). All this suggests that different evolutionary forces have shaped the
variability observed in these physically close and functionally related genes.
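Nucleotide diversity, as compared above between FABP4 and FABP5, is usually defined as the average number of pairwise differences per site among aligned sequences. The sketch below assumes that standard definition on gap-free alignments of equal length; the three short sequences are hypothetical and are not data from the study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences.

    seqs: list of equal-length strings (gap-free alignment assumed).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    # Hypothetical toy alignment of three sequences, 10 sites each.
    toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
    print(f"pi = {nucleotide_diversity(toy):.4f}")
```

A value reported as a percentage, such as those quoted above, corresponds to this quantity multiplied by 100.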
References
Andersson et al. (1994). Science 263:1771-1774.
Estelle et al. (2006). Animal Genetics 37:589-591.
Mercade et al. (2005). Mammalian Genome 16:374-382.
Mercade et al. (2006). Journal of Animal Science 84:2907-2913.
Ojeda et al. (2006). Genetics 174:2119-2127.
Ojeda et al. (2007). AIDA meeting, Zaragoza, Spain.
Perez-Enciso et al. (2000). Journal of Animal Science 78:2525-2531.
Acknowledgements
Research funded by projects CPE03-010-C3 (INIA, Spain), AGF99-0284-CO2 and AGF2004-00103/GAN
(MEC, Spain). JE and AO are funded by FPU and FPI fellowships from MEC.
MILK YIELD QTLs ON OAR 6 IN LATXA SHEEP BREED: A COMPARISON BETWEEN
'SELECTIVE DNA POOLING' AND 'INDIVIDUAL GENOTYPING'

F. Rendo (1), E. Ugarte (2), E. Lipkin (3) and A. Estonba (1)

(1) Department of Genetics, Physical Anthropology & Animal Physiology; University of the Basque Country/Euskal
Herriko Unibertsitatea (UPV/EHU); POB 644; 48080 Bilbao; Spain
(2) Department of Animal Production; NEIKER A. B.; POB 46; 01080 Vitoria-Gasteiz; Spain
(3) Department of Genetics; Hebrew University of Jerusalem; 91904 Jerusalem; Israel

INTRODUCTION
Identification of QTL has been reported for several traits in sheep, including parasite resistance, wool
production traits, milk production and dagginess (Cockett, 1999). In addition, genes have been reported
for traits such as ovulation rate and sterility (Hanrahan et al., 2003). However, data
concerning milk production QTLs in ovine breeds are scarce. Moreover, data on milk production QTLs
have mainly been obtained by individual genotyping of the selected families (or populations). This is the
case for the most recent comprehensive paper on this issue, published by Barillet et al. in 2005, in which
individual genotyping is applied to perform a whole genome scan for QTL detection in several breeds of
sheep.
An alternative QTL detection method has been proposed, 'Selective DNA Pooling' (Lipkin et al., 1998;
Mosig et al., 2001), which significantly reduces genotyping costs and effort. Moreover, Fisher et al. (2004)
describe a 'Selective DNA Pooling' design applied to verifying the identification and estimation of the
DGAT1 effect. However, the use of 'Selective DNA Pooling' has mainly been limited to bovine breeds.
The aim of the present study is to compare the efficiency of 'Selective DNA Pooling' and 'Individual
Genotyping' in detecting QTLs in sheep. To that end, both methods were applied to a daughter design in
Latxa sheep to evaluate the presence of milk yield QTLs on OAR6.

MATERIALS AND METHODS
Animals & phenotyping. Eight half-sib families with an average of 260 daughters per sire, ranging from 79
to 479 (Table 1), were chosen from herds included in the Latxa breed improvement program of the
Autonomous Community of the Basque Country, where the animals are routinely genetically evaluated for
milk production by the Best Linear Unbiased Predictor (BLUP) animal model. According to their
Estimated Breeding Values (EBV), daughters with the most extreme EBV values were selected (20.4% -
27.2%), defining the so-called 'phenotypic tails' (Table 1). With respect to 'Individual Genotyping',
approximately 50% of the total number of evaluated daughters for the two largest families (5 and 8) were
selected. In both cases, the number of daughters is estimated to be representative of their own families.
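The selection of phenotypic tails described above can be sketched as ranking the daughters of one sire by EBV and taking an equal fraction from each end. The function name, the per-tail fraction and the EBV values below are hypothetical and serve only to illustrate the idea; the actual tail sizes used in the study are those given in Table 1.

```python
def select_tails(ebv_by_daughter, fraction):
    """Return (high_tail, low_tail) lists of daughter IDs ranked by EBV.

    ebv_by_daughter: dict mapping daughter ID -> estimated breeding value.
    fraction: share of the family placed in EACH tail (e.g. 0.12).
    """
    ranked = sorted(ebv_by_daughter.items(), key=lambda item: item[1])
    k = max(1, round(len(ranked) * fraction))
    if 2 * k > len(ranked):
        raise ValueError("tails overlap: fraction too large for this family")
    low_tail = [daughter for daughter, _ in ranked[:k]]
    high_tail = [daughter for daughter, _ in ranked[-k:]]
    return high_tail, low_tail


if __name__ == "__main__":
    # Hypothetical family of 10 daughters with made-up EBVs.
    family = {f"d{i}": float(i) for i in range(10)}
    high, low = select_tails(family, 0.2)
    print("high pool:", high)
    print("low pool:", low)
```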
















Table 1. Number of animals in each family. ED: total number of evaluated daughters per family; High and Low: number of daughters in each pool
selected for the phenotypic tails, with their percentage of ED in brackets; Individual Genotyping: number of daughters genotyped individually.

FAMILY   ED    SELECTIVE DNA POOLING      INDIVIDUAL
               High         Low           GENOTYPING
1        79    10 (12.6)    12 (15.2)     -
2        205   24 (11.7)    29 (14.1)     -
3        194   26 (13.4)    23 (11.8)     -
4        118   16 (13.6)    16 (13.6)     -
5        479   50 (10.4)    48 (10.0)     233
6        310   39 (12.6)    39 (12.6)     -
7        307   38 (12.4)    38 (12.4)     -
8        391   42 (10.8)    55 (14.1)     209

DNA extraction and pools preparation. Genomic DNA was extracted from blood samples collected from
all animals with standard protocols: sires' DNA following a salt-based DNA extraction procedure (Smith et
al., 1990), while DNA from all daughters was extracted using the 'Ultraclean DNA BloodSpin Kit'
(MOBIO Laboratories, Inc). DNA was quantified using spectrophotometry.
All individual DNA samples were stored in a DNA bank. Three independent replicates of each pool were
prepared by pooling equal amounts of DNA from the corresponding daughters.
Markers selection and genotyping. Based on bovine results (Mosig et al., 2001 and Khatkar et al., 2004)
and the resemblance between the bovine and the ovine genomes, in this study we evaluated the presence of
QTLs on OAR 6. Primers for 15 microsatellite markers (Figure 1) were synthesised based on published
sequences. All markers were successfully amplified from ovine DNA. Amplified fluorescent fragments were
detected in automatic genetic analyser equipment (ABI PRISM 3100 Avant, Applied Biosystems).
Genotypes were obtained for the eight sires used in this project. On average, sires were heterozygous for 10
markers. At each marker, daughters' pools from heterozygous sires were genotyped.
In addition, a representative portion of the two largest families (5 and 8) was individually genotyped
for all markers heterozygous in their respective sires.
Analysis. For the data obtained with the 'Selective DNA Pooling' design, densitometry, shadow correction
and statistical analysis of the sire alleles were carried out as described (Lipkin et al., 1998; Mosig et al.,
2001). Dams' alleles were analysed following Lipkin et al. (2002).
In the 'Individual Genotyping' approach, the marker and milk yield data were used to carry out a linkage
analysis using the web-based QTL Express software (Seaton et al., 2002).
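The core idea of the pooling analysis is to compare the frequency of a sire allele between the high and low pools. The sketch below is a deliberately simplified two-pool Z-test and is not the shadow-corrected procedure of Lipkin et al. (1998): it assumes that, under the null hypothesis, each daughter inherits a given sire allele with probability 0.5, and it ignores technical error in the pool frequency estimates. The pool sizes in the example are those of family 5 in Table 1; the allele frequencies are hypothetical.

```python
import math


def pool_z_test(freq_high, freq_low, n_high, n_low):
    """Simplified Z-test for a sire-allele frequency difference between pools.

    freq_high, freq_low: frequency of one sire allele among the paternally
    inherited alleles in each pool; n_high, n_low: daughters per pool.
    Returns (z, two-sided p-value under a normal approximation).
    """
    diff = freq_high - freq_low
    # Binomial sampling variance under H0 (p = 0.5): 0.25 / n for each pool.
    se = math.sqrt(0.25 / n_high + 0.25 / n_low)
    z = diff / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Pool sizes of family 5 (Table 1); allele frequencies are hypothetical.
    z, p = pool_z_test(0.62, 0.41, 50, 48)
    print(f"Z = {z:.2f}, p = {p:.4f}")
```

A full analysis would add the technical variance of the densitometric estimates to the standard error, which makes the test more conservative than this sketch.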
[Figure 1: linkage map of OAR 6 showing the 15 selected microsatellite markers (OarCP125, McM53, OarAE101, BM143, BMS360, BM4621, BM4311, CSRD293, OarJMP8, McM214, INRA133, BM1329, McMA9, ILSTS087, OarJMP12), with map positions ranging from 2.6 to 155.7 cM.]

Figure 1. Selected microsatellites spanning OAR 6 (Maddox et al., 2001; v4.7 of the Australian Sheep Gene Mapping Web Site, ASGMWS).

RESULTS

Map construction.
Microsatellite markers were chosen from existing linkage maps of the sheep genome (Maddox et al., 2001)
and selected to maximise OAR 6 coverage, with marker spacing between 10 and 20 centiMorgans (cM).
The Australian Sheep Gene Mapping Web Site (ASGMWS) map is commonly accepted, and it has been
shown that differences in estimated recombination frequency do not bias the test for QTL or the estimates of
QTL effects (Chen et al., 2006), so the ASGMWS map v. 4.7 was used in this study.
'Selective DNA Pooling'. Pools were genotyped for all eight families, pooling DNA from the daughters of
each phenotypic extreme tail. Following the methodology proposed by Lipkin et al. (1998), shadow-corrected
estimates of sire and dam allele frequencies were compared between the high and low pools.
Allele frequency estimates obtained from pools and from individual genotyping proved to be in good agreement
for all these families: correlation analyses performed for each marker showed values greater than 0.90.
Marker P values obtained for all eight families are represented in Figure 2.
[Figure 2: MARKER-QTL ASSOCIATION ON OAR 6. -log P (0 to 20) plotted against map position (0 to 160 cM) for OAR 6 and BTA 6, with the significance level. Labelled markers: INRA133, BM1329, BM2508, BM143, BMS360, BM415, BM4311, OarJMP8, BM2320.]

Figure 2. Marker-QTL association on OAR6 using information on all 8 families, based on sires' and dams' alleles. Bovine marker information
obtained from Tchourzyna et al., 2002 (Lipkin, personal communication).

Using information on all eight families, four putative ovine regions were detected close to markers INRA133
at 16.0 cM (pmin = 9.63E-11), BM143 at 59.0 cM (pmin = 4.71E-14), BMS360 at 80.9 cM (pmin = 8.06E-11) and
OarJMP8 at 134.7 cM (pmin = 2.87E-10). Two of them (BM143 and OarJMP8) appeared to harbour putative milk
yield QTL, as described by Rendo et al. (2003) in a preliminary analysis of family 5. A fairly good
match between the bovine and ovine data was obtained, with approximately the same regions (or markers)
containing putative QTL for milk yield.
Concerning the two largest families, and with respect to family 5 (Figure 3), two putative regions on OAR
6 were observed, explained only by dams' alleles: the first between markers OarAE101 (49.9 cM; p = 4.47E-06)
and BM143 (59.0 cM; p = 3.62E-02), and the second at the marker CSRD293 (122.8 cM; p = 1.43E-02).
By contrast, we detected no effects of the sire's alleles in this family, that is, none of the markers was
significant for the sire's alleles; thus, sire 5 appeared to be homozygous for the QTL in these regions. The effect of
dams' alleles could be the result of population-wide marker-QTL linkage disequilibrium.
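The association figures in this paper plot -log P against map position, so the p-values quoted in the text can be placed on that scale with a one-line conversion. The sketch below assumes a base-10 logarithm, which the figures do not state explicitly, and uses the four all-family pmin values reported above.

```python
import math

# pmin values as reported in the text for the all-family analysis.
P_MIN = {
    "INRA133": 9.63e-11,
    "BM143": 4.71e-14,
    "BMS360": 8.06e-11,
    "OarJMP8": 2.87e-10,
}


def neg_log10(p):
    """Convert a p-value to the -log10 scale (base 10 is an assumption)."""
    return -math.log10(p)


if __name__ == "__main__":
    for marker, p in P_MIN.items():
        print(f"{marker}: -log10(P) = {neg_log10(p):.2f}")
```

On this scale the four regions fall between roughly 9.5 and 13.3, with BM143 the strongest signal.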
[Figure 3: MARKER-QTL ASSOCIATION ON OAR6. -log P (0 to 20) plotted against map position (0 to 160 cM) for Family 5 dams' alleles and BTA 6, with the significance level. Labelled markers: INRA133, BM1329, OarAE101, BM2508, BM143, BM415, BM4311, CSRD293, BM2320.]

Figure 3. Marker-QTL association in Family 5 based on dams' alleles. Two putative QTL: the first between markers OarAE101 (49.9 cM)
and BM143 (59.0 cM), and the second at the marker CSRD293 (122.8 cM).

With respect to family 8 (Figure 4), the results, based on sire's and dams' alleles, suggest four putative regions
on OAR6, which fit quite well with the bovine results (Lipkin, personal communication). The putative
regions are: INRA133 (16.0 cM; p = 9.63E-11), BM143 (59.0 cM; p = 4.71E-14), BMS360 (88.1 cM;
p = 6.00E-17) and OarJMP8 (134.7 cM; p = 1.24E-08). These four regions are explained by dams' and sire's
alleles, indicating that both population-wide marker-QTL linkage disequilibrium and sire QTL segregation
are affecting the trait.
[Figure 4: MARKER-QTL ASSOCIATION ON OAR 6. -log P (0 to 20) plotted against map position (0 to 160 cM) for Family 8 and BTA 6, with the significance level. Labelled markers: INRA133, BM1329, BM2508, BM143, BMS360, BM415, BM4311, OarJMP8, BM2320.]

Figure 4. Marker-QTL association in Family 8 based on sire's and dams' alleles. The putative regions are: INRA133 (16.0 cM), BM143 (59.0
cM), BMS360 (80.8 cM) and OarJMP8 (134.7 cM).

'Individual Genotyping' approach. The QTL Express based 'Individual Genotyping' approach takes into
account only sire allele effects, in contrast to the 'Selective DNA Pooling' approach, where the effects of both
sires' and dams' alleles are considered. Results from the QTL Express linkage analysis of family 5 are represented in
Figure 5. The test statistic (F-value) for one QTL at the given location vs. no QTL is shown across the
genomic area examined. The highest peak appears at 109 cM (F = 2.13), but this is not statistically significant.
The absence of a significant QTL in family 5 was expected, since the two QTLs detected using 'Selective
DNA Pooling' are explained by the effects of dams' alleles only, while the sire was found to be homozygous for
the QTLs.


















Figure 5. Test statistic profiles (F-values) for milk yield in Family 5.
Figure 6. Test statistic profiles (F-values) for milk yield in Family 8.

Figure 6 shows the results for family 8. The F-value for one QTL at the given location vs. no QTL is
shown across the genomic area examined. A statistically significant peak appears at 132 cM (F = 4.09),
suggesting a QTL located between the markers CSRD293 (122.8 cM) and OarJMP8 (134.7 cM). This peak
confirms the last QTL position observed with the 'Selective DNA Pooling' approach (Figure 4). Testing 2
QTL vs. 1 QTL gave a smaller F-value, so the two-QTL model is not considered. Another high peak, close to the
significance level, can be observed around 57 cM, which corresponds to the marker-QTL association in the region flanked by
OarAE101 (49.9 cM) and BM143 (59.0 cM) detected by 'Selective DNA Pooling' in family 8 (Figure 4).

DISCUSSION
'Selective DNA Pooling' results show a complex pattern of putative regions harbouring QTL on OAR6
when all 8 families are analysed (Figure 2). Using this methodology, four putative marker-QTL
associations are detected: INRA133 (16.0 cM), BM143 (59.0 cM), BMS360 (80.1 cM) and OarJMP8 (134.7
cM). All of them are observed in family 8, while only the BM143 and OarJMP8 markers were considered to
harbour QTL in family 5. In this last case, the sire is assumed to be homozygous for the QTLs since only dams'
alleles contribute to QTL detection.
QTL Express based 'Individual Genotyping' results suggest the absence of QTL in family 5, while
a putative QTL at 132 cM (close to the OarJMP8 marker) is detected for family 8. This supports the results
explained above: first, no QTL is detected for family 5 because the sire is homozygous for the QTL;
second, at least one QTL is observed in family 8 (close to OarJMP8), in agreement with one of the
QTLs described above.
Comparing 'Selective DNA Pooling' and 'Individual Genotyping' results, several putative QTLs are
detected by the 'Selective DNA Pooling' approach, including all those detected using QTL Express based
'Individual Genotyping'. QTL Express takes into account only the effects of sire alleles and is capable
of testing only one or two putative QTLs. These features make QTL Express based analysis a restrictive
method compared to 'Selective DNA Pooling', reducing the likelihood of finding false positives. At the
same time, QTL Express could lose valuable information, failing to detect real
QTLs (false negatives). This is not the case with 'Selective DNA Pooling'. Therefore, we consider 'Selective
DNA Pooling' an appropriate methodology for a first QTL screening, allowing a more specific analysis to be
conducted later on previously detected markers or regions (fine mapping), and thus an appropriate method
for performing QTL mapping in sheep breeds using a daughter design.

Regarding previously reported marker-QTL associations for milk yield in sheep, Barillet et al. (2005)
suggest three QTL for milk production traits on OAR6. Despite the different source of the chromosomal map,
these results are in quite good agreement with our results described above. First, a QTL affecting protein yield
lies around the OarAE101 marker (34 cM on their own map), which could be comparable to our
OarAE101-BM143 (49.9-59.0 cM) interval. Second and third, QTL in the telomeric region around the marker
BM4311 (126 and 128 cM on their own map) affect protein content and milk yield, respectively, which could be
comparable to our telomeric CSRD293-OarJMP8 (122.8-134.7 cM) interval.

On the other hand, the present study shows a good resemblance between the ovine and bovine genomes, as reported
before (Rendo et al., 2003; Barillet et al., 2005). Orthologous regions of the bovine genome influencing milk
production traits have been described previously in several studies (Mosig et al., 2001 and Tchourzyna et al.,
2002), and are observed here in Figures 2 and 4, where some regions harbouring QTL for milk production traits
are the same on both chromosomes, OAR6 and BTA6: INRA133, the interval between OarAE101 and BM143,
and OarJMP8.

CONCLUSIONS
This study shows that 'Selective DNA Pooling' applied to a daughter design is an efficient method for QTL
mapping in sheep breeds.
As noted above, the 'Selective DNA Pooling' design detects more putative QTLs than
QTL Express based 'Individual Genotyping'. In addition, 'Selective DNA Pooling' significantly
reduces genotyping effort. These features make 'Selective DNA Pooling' an efficient methodology for a
first QTL screening, allowing a more specific analysis to be carried out later on previously detected markers or
regions (fine mapping). Thus, for QTL searches in ovine populations we consider
'Selective DNA Pooling' an adequate alternative to the widely used 'Individual Genotyping'.
On the other hand, we wish to underline that comparative mapping between the ovine and bovine genomes is
a useful tool for QTL searching.

ACKNOWLEDGEMENTS
This project has received funding from the Basque Government and the Artificial Insemination & Selection
Centre 'ARDIEKIN, S.L'. The authors acknowledge Feli Arrese and the CONFELAC breeders association, who
provided blood samples, and the staff of the Genetics Lab (Univ. of the Basque Country) for excellent
technical assistance. We are also very grateful to Moshe Soller from the Alexander Silberman Institute of the
Hebrew Univ. of Jerusalem (Israel) for his interesting comments.

REFERENCES
Barillet F., Arranz J.J., Carta A. (2005). Mapping quantitative trait loci for milk production and genetic
polymorphisms of milk proteins in dairy sheep. Genet. Sel. Evol. 37 (Suppl. 1) S109-S123.
Chen H.Y., Zhang Q., Yin C.C., Wang C.K., Gong W.J. and Mei G. (2006). Detection of Quantitative Trait
Loci affecting milk production traits on bovine chromosome 6 in a Chinese Holstein population by the
daughter design. J. Dairy Sci. 89: 782-790.
Cockett N. E. (1999). Genomics of sheep. AgBiotechNet Vol 1 April, ABN013.
Fisher P.J., Spelman R.J. (2004). Verification of selective DNA pooling methodology through identification
and estimation of the DGAT1 effect. Animal Genetics 35(3): 201-205.
Khatkar M.S., Thompson P.C., Tammen I., Raadsma H.W. (2004). Quantitative trait loci mapping in dairy
cattle: review and meta-analysis. Genet. Sel. Evol. 36: 163-190.
Lipkin E., Gruzman G., Friedmann A. and Soller M. (2002). Using information on segregation of dam
marker alleles within a daughter design for mapping QTL affecting milk production traits in Israel
Holstein dairy cattle. Proceedings of the 28th Conference of the International Society of Animal
Genetics (ISAG): E032, pp. 172-173.
Lipkin E., Mosig M.O., Darvasi A., Ezra E., Shalom A., Friedmann A. and Soller M. (1998). Quantitative
Trait Locus mapping in dairy cattle by means of selective milk DNA pooling using dinucleotide
microsatellite markers: analysis of milk protein percentage. Genetics 149: 1557-1567.
Maddox et al. (2001). An enhanced linkage map of the sheep genome comprising more than 1000 loci. Genome
Research 11: 1275-1289.
Mosig M.O., Lipkin E., Khutoreskaya G., Tchourzyna E., Soller M. and Friedmann A. (2001). A whole
genome scan for Quantitative Trait Loci affecting milk protein percentage in Israeli-Holstein cattle, by
means of selective milk DNA pooling in a daughter design, using an adjusted false discovery rate
criterion. Genetics 157: 1683-1698.
Rendo F., Ugarte E., Lipkin E. and Estonba A. (2003). Detection of QTLs influencing milk production in
OAR6 of the Latxa breed. Proceedings of the international workshop on major genes and QTL in
sheep and goat, Toulouse (France), 8-11 Dec. 2003, communication no. 2-22.
Seaton G., Haley C.S., Knott S.A., Kearsey M., Visscher P.M. (2002). QTL Express: mapping quantitative
trait loci in simple and complex pedigrees. Bioinformatics 18: 339-340.
Smith J.C., Anwar R., Riley J., Jenner D. and Marham A.F. (1990). Highly polymorphic minisatellite
sequences: allele frequencies and mutation rates for five locus-specific probes in a Caucasian
population. J. Forensic Sci. Soc. 30: 19-32.
Tchourzyna E., Grosman G., Friedmann A., Soller M. and Lipkin E. (2002). Detection of multiple QTL on a
single chromosome by haplotype analysis with selective DNA pooling. Proceedings of the 7th World
Congress on Genetics Applied to Livestock Production, CD-ROM communication no. 21-46.
Abou B. KOUASSI (1), Charles-Eric DUREL (1), Francois LAURENS (1), Fabienne MATHIS (1),
Luca GIANFRANCESCHI (2), Matteo KOMJANC (3), Daniela MOTT (3), Valentina COVA (3),
Andrea PATOCCHI (4), Davide GOBBIN (4), Fabio REZZONICO (4), Kate EVANS (5), Felicidad
FERNANDEZ-FERNANDEZ (5), Frank DUNEMANN (6), Anastasia BOUDICHEVSKAJA (6),
Marta STANKIEWICZ-KOSYL (7), Adriana ANTOFIE (8), Eric VAN DE WEG (9), Marco BINK (10)
(1) Institut National de la Recherche Agronomique (INRA), Centre de Recherche d'Angers, Unité Mixte de
Recherche Génétique et Horticulture (UMR GenHort), 42 rue Georges Morel,
49071 Beaucouzé Cedex, France
(2) Università degli Studi di Milano (UNIMI), Dipartimento di Scienze Biomolecolari e Biotecnologie,
via Celoria 26, 20133 Milano, Italy
(3) Istituto Agrario San Michele all'Adige (IASMA), via E. Mach 1, 38010 S. Michele all'Adige (TN), Italy
(4) Plant Pathology, IBZ, ETH Zürich, 8092 Zürich, Switzerland
    Current address: Phytopathology, Agroscope-Changins-Wädenswil (ACW) Research Station,
    8820 Wädenswil, Switzerland
(5) East Malling Research (EMR), Plant Breeding and Genetics, East Malling, Kent ME19 6BJ, United Kingdom
(6) Federal Centre for Breeding Research on Cultivated Plants (BAZ), Institute of Fruit Breeding,
Pillnitzer Platz 3a, 01326 Dresden, Germany
(7) Warsaw Agricultural University (SGGW), Department of Pomology and Basic Natural Sciences in
Horticulture, ul. Nowoursynowska 166, 02-787 Warsaw, Poland
(8) Centre Wallon de Recherches Agronomiques (CRA-W), Département Lutte Biologique & Ressources
Phytogénétiques, Rue de Liroux 4, 5030 Gembloux, Belgium
(9) Plant Research International (PRI) B.V., Genetics and Breeding, Droevendaalsesteeg 1,
6700 AA Wageningen, The Netherlands
(10) Plant Research International Biometris B.V., P.O. Box 16, 6700 AA Wageningen, The Netherlands
Pedigree-based QTL mapping for fruit firmness in apple using Markov Chain
Monte Carlo methods and Bayesian inferences.
Classical quantitative trait loci (QTL) mapping experiments exploit three sources of
information: mapping populations, phenotypic scores and marker data (genotypes). In plants,
a high level of success has been obtained with commonly used single mapping populations
(F2, BC, HD, RIL, ...) derived from initial crosses between two parents, or with a few closely
related populations (e.g. crosses from a diallel design). However, such material explores only a
small part of the genetic variability that is actually available. As an alternative, the
proposed protocols are based on the use of connected populations having common founders
(Crepieux et al., Genetics 168: 1737-1749). For more unbalanced designs and fragmented
populations, we advocated the concept of Pedigree Genotyping (Van de Weg et al., Acta Hort.
663: 45-50); this is the use of an extensive series of populations that are connected by their
pedigrees and that represent different founders as well as many generations. To prove the
concept, the European project HiDRAS (High-quality Disease Resistant Apples for a
Sustainable agriculture; http://www.hidras.unimi.it/) was initiated. The present study presents
the first results on fruit firmness.
Twenty-seven apple F1 families of almost 50 individuals each were phenotyped for
fruit firmness after different storage periods and genotyped with a genome-covering set of SSR
markers. The families are interconnected by their complex pedigrees and are part of ongoing
breeding programs in several European countries. QTL analyses were performed with the
FlexQTL™ software (Bink et al., TAG, 2002; Bink, EUCARPIA-Biometry, Euphytica,
2007; http://www.flexqtl.nl), which was developed in association with the HiDRAS project.
Bayesian inferences were made on the QTL chromosomal locations, allele frequencies,
effects, parental genotypes and contributions to phenotypic variance by employing Markov
chain Monte Carlo sampling.
4 A bit on plants
5 Genomic selection
Marker-assisted selection for commercial crossbred performance
Jack Dekkers*, Hong-hua Zhao, and Rohan Fernando
Department of Animal Science and Center for Integrated Animal Genomics
Iowa State University, Ames, 50011-3150
Several studies have shown that selection of pure breeds for increased performance of
their crossbred descendants under field conditions is hampered by low genetic
correlations between purebred and commercial crossbred (CC) performance. Although
this can be addressed by including phenotypic data from CC relatives for selection of
purebreds through combined crossbred and purebred selection (CCPS), this also increases
rates of inbreeding and requires comprehensive systems for collection of phenotypic data
and pedigrees at the CC level. This study shows that both these limitations can be
overcome with marker-assisted selection (MAS) using estimates of the effects of markers
on CC performance. To evaluate the potential benefits of CC-MAS, a model to
incorporate marker information in selection strategies was developed based on selection
index theory, which allows prediction of responses and rates of inbreeding using standard
deterministic selection theory. Assuming a genetic correlation between purebred and CC
performance of 0.7 for a breeding program representing a terminal sire line in pigs,
CC-MAS was shown to substantially increase rates of response and reduce rates of
inbreeding compared to purebred selection and CCPS with 60 CC half-sibs available for
each purebred selection candidate. When the accuracy of marker-based EBV was 0.6,
CC-MAS resulted in 34% and 10% greater responses in CC performance than purebred
selection and CCPS, respectively. Corresponding rates of inbreeding were 1.4% per generation
for CC-MAS, compared to 2.1% for purebred selection and 3.0% for CCPS. For an accuracy of
marker-based EBV of 0.9, CC-MAS resulted in 75% and 43% greater response than
purebred selection and CCPS and further reduced rates of inbreeding to 1.0% per
generation. Selection on marker-based EBV derived from purebred phenotypes resulted
in substantially less response in CC performance than CC-MAS. In conclusion, effective
use of MAS requires estimates of effects on CC performance, and MAS based on such
estimates enables more effective selection for CC performance without the need for
extensive pedigree recording and while reducing rates of inbreeding.
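To see why a low purebred-crossbred genetic correlation penalises purebred selection, the textbook response formula R = i * r * sigma_A can be applied to both strategies. The sketch below is not the authors' selection-index model: it ignores inbreeding, CCPS and differences in selection intensity, the function names are illustrative, and the purebred EBV accuracy of 0.7 is a hypothetical value. Only the genetic correlation (0.7) and the marker-based accuracy (0.6) come from the abstract.

```python
def response(intensity, accuracy, sigma_a):
    """Expected response per generation, R = i * r * sigma_A."""
    return intensity * accuracy * sigma_a


def cc_response_from_purebred_ebv(intensity, accuracy_purebred, r_pc, sigma_a_cc):
    """Correlated response in CC performance when selecting on purebred EBV.

    The accuracy for the CC breeding value is the purebred accuracy
    multiplied by the purebred-crossbred genetic correlation r_pc.
    """
    return response(intensity, accuracy_purebred * r_pc, sigma_a_cc)


def relative_advantage(accuracy_marker, accuracy_purebred, r_pc):
    """Extra CC response of marker-based EBV over purebred EBV, as a proportion.

    Intensity and sigma_A cancel in the ratio, so both are set to 1.
    """
    direct = response(1.0, accuracy_marker, 1.0)
    indirect = cc_response_from_purebred_ebv(1.0, accuracy_purebred, r_pc, 1.0)
    return direct / indirect - 1.0


# r_pc = 0.7 and marker-EBV accuracy 0.6 are from the abstract;
# the purebred EBV accuracy of 0.7 is a hypothetical value.
advantage = relative_advantage(0.6, 0.7, 0.7)
```

Under these simplifying assumptions the advantage depends only on the ratio of accuracies, so it should not be expected to reproduce the percentages reported in the abstract.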
Does genomic selection work in a mice population?
A Legarra, C Robert-Granie, E Manfredi, JM Elsen
22 March 2007, QTLMAS
Accuracy of genomic selection
In genomic selection à la Meuwissen et al. we have a SNP model:
$$y_i = \mu + \sum_{j=1}^{n} w_{ij} a_j + e_i$$
and $G_i = \sum_{j=1}^{n} w_{ij} a_j = w_i' a$, where
1 $a_j$: effect of the $j$-th SNP
2 $w_{ij}$: indicator variable depending on the SNP carried by the $i$-th individual
Testing genomic selection
Let $\hat{G}$ be an estimator of the true breeding value, with a given accuracy.
Genomic selection works better than classical BLUP if the accuracy of
$\hat{G}_i = w_i' \hat{a}$ (SNP model) is better than the accuracy of $\hat{G}_i = \hat{u}_i$ (classical
infinitesimal model).
How to validate a model?
How to compare accuracies?
We use a cross-validation criterion with genetic and statistical
interpretations.
1 Split the data into two at random: $y = [y_1, y_2]$
2 Estimate $(\hat{G} \mid y_1, \text{genealogy})$ by BLUP, or $(\hat{G} \mid y_1, \text{SNPs})$ as $w_i' \hat{a}$,
conditional on $y_1$ only.
3 For every individual in $y_2$, compute $\hat{G}_i$.
4 And compute $r(y_2, \hat{G}_{\text{genomic}})$ and $r(y_2, \hat{G}_{\text{infinitesimal}})$.
($y_2$ is corrected by fixed effects.)
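The four steps of this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: `cross_validated_r` and `pearson_r` are illustrative names, and the single-covariate regression (`fit_line`, `predict_line`) is only a toy stand-in for BLUP or the SNP model, run on simulated data.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_validated_r(records, fit, predict, rng):
    """One replicate: split at random, estimate on y1, correlate in y2.

    records: list of (phenotype, covariates) pairs, with phenotypes
    already corrected for fixed effects.
    """
    shuffled = list(records)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    y1, y2 = shuffled[:half], shuffled[half:]
    model = fit(y1)                                  # estimation uses y1 only
    g_hat = [predict(model, cov) for _, cov in y2]   # predictions for y2
    return pearson_r([y for y, _ in y2], g_hat)


def fit_line(train):
    """Toy estimator: least-squares regression on a single covariate."""
    ys = [y for y, _ in train]
    xs = [x for _, x in train]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def predict_line(model, x):
    intercept, slope = model
    return intercept + slope * x


# Simulated toy data, 100 random partitions as in the slides.
rng = random.Random(1)
toy = [(2.0 * x + rng.gauss(0.0, 1.0), x) for x in range(40)]
replicates = [cross_validated_r(toy, fit_line, predict_line, rng) for _ in range(100)]
mean_r = sum(replicates) / len(replicates)
```

Passing the estimator as a pair of functions keeps the split and the correlation identical for every model being compared.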
How to compare accuracies?
Note that $r(y_2, \hat{G}) = r(y_2, \hat{y}_2)$. What is that?
1 $r(y_2, \hat{y}_2)$ is proportional to the expected genetic gain in $y_2$ when selecting
by $\hat{G}$ estimated from $y_1$. It mimics a selection process.
2 $r(y_2, \hat{G})$ is a measure of model fitting from a genetic point of view.
3 $r(y_2, \hat{y}_2)$ is a robust, general measure of model fitting by
cross-validation.
The data
Nature Genetics - 38, 879 - 887 (2006) Genome-wide genetic association
of complex traits in heterogeneous stock mice William Valdar, Leah C
Solberg, Dominique Gauguier, Stephanie Burnett, Paul Klenerman,
William O Cookson, Martin S Taylor, J Nicholas P Rawlins, Richard Mott
& Jonathan Flint
Our data set, freely available at http://gscan.well.ox.ac.uk . . . We
obtained genotypes for 13,459 SNPs on 1,904 fully
phenotyped mice and 298 parents . . .
The pedigree
In fact the pedigree is composed of many nuclear families...
and no parent is phenotyped.
The data
id weight sex genotype
1 A048005080 20.3 1 12121212221222121212221222121222121211221212121222121212111211121212
2 A048006063 26.7 2 12122212211221221212221221112221121212212212121222121212111212221112
3 A048006555 19.5 2 22112222222222222211221122112222112211222222222222222222112211221111
4 A048007096 22.2 2 12121212211221221212221221112221121212212212121222121212111212221112
5 A048010273 17.3 1 22112222222222222211221122112222112211222222222222222222112211221111
6 A048010371 18.1 2 12121212221222121212221222121222121211221212121222121212111211121212
8 A048011287 25.6 2 12122212211221221212221221112221121212212212121222121212111212221112
9 A048011567 20.6 2 12122212211221221212221221112221121212212212121222121212111212221112
10 A048013559 17.3 1 2211222222222222221122112211222211221122222222222222222211221122111
11 A048015047 16.3 2 2211222222222222221122112211222211221122222222222222222211221122111
12 A048017615 21.8 2 1212121222122212121222122212122212121122121212122212121211121112121
13 A048019267 18.5 2 2211222222222222221122112211222211221122222222222222222211221122111
14 A048021023 22.3 2 1212121222122212121222122212122212121122121212122212121211121112121
15 A048022858 17.4 1 1212121222122212121222122212122212121122121212122212121211121112121
16 A048023355 18.7 1 1212121222122212121222122212122212121122121212122212121211121112121
17 A048023581 21.2 2 1212221221122122121222122111222112121221221212122212121211121222111
18 A048028854 20.6 1 2211222222222222221122112211222211221122222222222222222211221122111
19 A048028871 19.7 1 1122111122112112112221122112122112111122121112122112121212121111221
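As a rough illustration of how such records feed the SNP model, the sketch below turns a genotype string into the covariates $w_{ij}$ and sums SNP effects into a genomic value. It assumes, without confirmation from the slides, that each consecutive pair of characters holds the two alleles (coded 1 or 2) at one SNP; the function names and the SNP effects are hypothetical.

```python
def allele_counts(genotype):
    """Convert a string such as '1222' into per-SNP counts of allele '2'.

    Assumes two allele characters per SNP: '11' -> 0, '12' or '21' -> 1, '22' -> 2.
    """
    if len(genotype) % 2:
        raise ValueError("expected two allele characters per SNP")
    return [genotype[k:k + 2].count("2") for k in range(0, len(genotype), 2)]


def genomic_value(w_i, a):
    """G_i = sum_j w_ij * a_j for one individual."""
    return sum(w * effect for w, effect in zip(w_i, a))


w = allele_counts("12122211")          # four SNPs
effects = [0.5, -1.0, 0.25, 2.0]       # hypothetical SNP effects a_j
g = genomic_value(w, effects)
```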
Assumptions and estimation
We use a mixed model.
$$y = X\beta + Zu + e$$
Works very well, $h^2 = 0.96 \pm 0.03$. All variation is additive.
$$y = X\beta + Wa + e$$
This model, although not optimal, gave good results in the simulations by
Meuwissen et al. (accuracy 0.71).
$$y = X\beta + Wa + Zu + e$$
$a \sim N(0, I\sigma^2_a)$, $e \sim N(0, I\sigma^2_e)$, $u \sim N(0, A\sigma^2_g)$
We use MCMC to estimate variance components once in $y_1$. Otherwise
we use BLUP (full MCMC gave similar results, not shown).
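For the SNP-only model $y = X\beta + Wa + e$ with known variance components, BLUP of the SNP effects comes from the mixed model equations, which add $\lambda = \sigma^2_e / \sigma^2_a$ to the diagonal of the SNP block. The sketch below is a minimal pure-Python version that assumes the only fixed effect is an overall mean; `solve` and `snp_blup` are illustrative names, and the toy data are made up.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def snp_blup(W, y, lam):
    """Return (mean, SNP effects) from the mixed model equations.

    W: genotype covariates, one row per animal; lam = sigma2_e / sigma2_a.
    """
    n, p = len(W), len(W[0])
    Z = [[1.0] + [float(v) for v in row] for row in W]   # mean column + SNPs
    C = [[sum(Z[i][j] * Z[i][k] for i in range(n)) for k in range(p + 1)]
         for j in range(p + 1)]
    for j in range(1, p + 1):
        C[j][j] += lam                                   # shrink SNP effects, not the mean
    rhs = [sum(Z[i][j] * y[i] for i in range(n)) for j in range(p + 1)]
    solution = solve(C, rhs)
    return solution[0], solution[1:]


# Toy data generated as y = 10 + 1.0 * w1 - 0.5 * w2, without noise.
W = [[0, 1], [1, 0], [2, 1], [1, 2]]
y = [9.5, 11.0, 11.5, 10.0]
mu_hat, a_hat = snp_blup(W, y, lam=1.0)
```

With thousands of SNPs a dense solver like this one is far too slow; it is only meant to make the role of $\lambda$ visible.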
Cross-validation
The question is: how to split $y$ into $[y_1, y_2]$? Options:
By families: most LD is only at the population level, so this is less powerful.
BLUP does not give information in this case (without any relative in
$y_1$, $\hat{y}_i = 0$ for all $y_i \in y_2$).
Splitting families in two: high LD, because there is a family structure
and we use full sibs to predict full sibs. Comparable to a
two-generation design.
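The two splitting options can be sketched as follows. This is an illustration under assumptions: the pedigree is reduced to a mapping from animal to full-sib family, the function names are invented, and the 50/50 proportions are a guess at what "in two" means in practice.

```python
import random
from collections import defaultdict


def split_by_family(family_of, rng):
    """Assign whole families to y1 or y2 (about half of the families each)."""
    families = sorted(set(family_of.values()))
    rng.shuffle(families)
    in_y1 = set(families[: len(families) // 2])
    y1 = [a for a, f in family_of.items() if f in in_y1]
    y2 = [a for a, f in family_of.items() if f not in in_y1]
    return y1, y2


def split_within_family(family_of, rng):
    """Put about half of the members of every family in y1, the rest in y2."""
    members = defaultdict(list)
    for animal, family in family_of.items():
        members[family].append(animal)
    y1, y2 = [], []
    for family in sorted(members):
        sibs = members[family]
        rng.shuffle(sibs)
        half = len(sibs) // 2
        y1.extend(sibs[:half])
        y2.extend(sibs[half:])
    return y1, y2


# Hypothetical pedigree: 4 full-sib families of 6 animals each.
family_of = {f"animal{k}": f"fam{k // 6}" for k in range(24)}
rng = random.Random(7)
between = split_by_family(family_of, rng)
within = split_within_family(family_of, rng)
```

In the first split no animal in $y_2$ has a relative in $y_1$; in the second, every animal in $y_2$ has full sibs in $y_1$.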
Sampling families
[Figure: nuclear families with numbered offspring; whole families are assigned to $y_1$ or to $y_2$.]
Splitting families
[Figure: the same nuclear families; the offspring of each family are divided between $y_1$ and $y_2$.]
Splitting the data
For the two ways of sampling (animals or families), in 100 different
partitions $y = [y_1, y_2]$, we have computed:
1 $r(y_2, \hat{G}_{\text{genomic}})$, with $\hat{G}_i = \sum_j w_{ij} \hat{a}_j$
2 $r(y_2, \hat{G}_{\text{genomic\&BLUP}})$, with $\hat{G}_i = \hat{u}_i + \sum_j w_{ij} \hat{a}_j$
3 $r(y_2, \hat{G}_{\text{BLUP}})$, with $\hat{G}_i = \hat{u}_i$.
Results: sampling families at random; 100 replicates
Table: Correlations $r(y_2, \hat{G})$
Method                                     Mean   95% quantiles
$r(y_2, \hat{G}_{\text{genomic}})$         0.21   0.14-0.29
$r(y_2, \hat{G}_{\text{genomic\&BLUP}})$   0.19   0.12-0.27
$r(y_2, \hat{G}_{\text{BLUP}})$            0      NA
Results: splitting families in half at random; 100 replicates
Table: Correlations $r(y_2, \hat{G})$
Method                                     Mean   95% quantiles
$r(y_2, \hat{G}_{\text{genomic}})$         0.48   0.45-0.51
$r(y_2, \hat{G}_{\text{genomic\&BLUP}})$   0.60   0.58-0.62
$r(y_2, \hat{G}_{\text{BLUP}})$            0.59   0.56-0.61
The end
Conclusions:
1 The genomic model performs better than classical BLUP when there
is no information from relatives. Apparently, it recovers either family
information or population LD.
2 The genomic model alone performs worse than classical BLUP when
there is information from close relatives. It is able nevertheless to
recover a good part of the family information.
3 The genomic model together with BLUP performs slightly better than
classical BLUP.
The end
This might indicate either a truly wrong model or overfitting (too many
variables).
However, the simulations of Meuwissen et al. (2001) did not show problems
of overfitting. The BLUP approach we've used here worked pretty well,
with accuracies of 0.73 for $h^2 = 0.5$. Was their model or their simulations
too unrealistic?
Questions:
1 Can we recover population LD + family LD and do better than
BLUP?
2 How to choose the SNPs?
Genome wide selection in dairy cattle based on high-density genome-wide
SNP analysis: from discovery to application.
H.W. Raadsma (1,2), K.R. Zenger (1,2), M.S. Khatkar (1,2), R. Crump (1,4), G. Moser (1),
J. Sölkner (1,2), J.A.L. Cavanagh (1,2), R.J. Hawken (1,3), M. Hobbs (1,2), W. Barris (1,3),
F.W. Nicholas (1,2), B. Tier (1,4)
(1) Co-operative Research Centre for Innovative Dairy Products (CRC IDP)
(2) ReproGen, Centre for Advanced Technologies in Animal Genetics and Reproduction, Faculty of
Veterinary Science, The University of Sydney, Camden, Australia
(3) CSIRO Livestock Industries, Brisbane, Australia
(4) AGBU, University of New England, Armidale, Australia
Development of high-density, large-scale single nucleotide polymorphism (SNP) genotyping
platforms has opened the possibility of Genome Wide Selection (GWS) in cattle. 1546
Australian progeny-tested dairy bulls were tested for 15,036 SNP markers. This led to the
following GWS platform for use in dairy cattle.
SNP discovery: The platform is built on a commercial SNP genotyping platform
(Parallele-Affymetrix) incorporating 10,410 public domain SNP markers and 4,626 proprietary SNP
markers. The proprietary markers were selected to cover regions in the genome predicted to
be marker-sparse, known QTL regions, and candidate genes from the CRC-IDP candidate
gene database, using both in-silico discovery and re-sequencing strategies.
SNP performance: The 22.5 million data points resulted in the following summary
performance statistics: 99.4% conversion rate to genotype assays; 88.1% informative SNP
markers; 91.1% placed with predicted position based on Btau3; 97.1% on an integrated
bovine map; 74.6% with minor allele frequency >0.05; and a reproducibility of 99.2% for
repeat informative assayable SNPs. After editing and correction for discordant SNPs, 10,715
high utility SNPs were used in GWS.
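The minor allele frequency criterion mentioned above can be made concrete with a short sketch. It assumes genotypes coded as 0, 1 or 2 copies of one allele, with `None` for a failed call; the function names and the example genotypes are hypothetical, and this is not a description of the editing pipeline actually used.

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency for one SNP.

    genotypes: allele counts (0, 1 or 2); None marks a missing call.
    Returns None when no animal has a call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def filter_by_maf(genotype_columns, threshold=0.05):
    """Indices of SNPs whose minor allele frequency exceeds the threshold."""
    keep = []
    for index, column in enumerate(genotype_columns):
        maf = minor_allele_frequency(column)
        if maf is not None and maf > threshold:
            keep.append(index)
    return keep


# Hypothetical genotypes for three SNPs scored on five animals.
columns = [
    [0, 1, 2, 1, None],   # common polymorphism, one missing call
    [0, 0, 0, 0, 0],      # monomorphic
    [2, 2, 2, 2, 1],      # rare allele
]
kept = filter_by_maf(columns)
```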
SNP complexity reduction: The challenge of dealing with over-parameterized data sets, where
the number of SNP variables greatly exceeds the number of observations, is dealt with at this
meeting by Crump et al., Moser et al. and Woolaston et al. (these proceedings). Powerful
approaches for analyzing high-dimensional whole-genome SNP data, such as supervised
dimension reduction through partial least squares (PLS), principal component analysis, and
use of optimal search algorithms for exploring the parameter space, were used for prediction
of genetic merit based on molecular breeding values (MBV). Additional non-statistical SNP
reduction methods will exploit use of tag SNPs in defined haplotypes. Furthermore, no loss of
efficiency was observed when 6000 of the available SNPs were used in GWS development.
Prediction and validation of MBV: A remarkable feature of model selection and cross-validation
methods has been the accurate prediction of true breeding value (TBV) via EBV.
Accuracies of prediction within the range of 0.7-0.85 in the absence of pedigree and
QTL/gene information have been obtained. Typically only a fraction of the available SNP
(1%) are used to predict MBV for all major traits used in dairy cattle selection. Realization
of GWS may therefore well represent the first true promise of DNA-based technologies for
livestock improvement.
Utility and application of GWS: Deriving MBV from a population in which future
predictions have to be made offers immediate use in young sire and elite dam selection.
Features of GWS can be readily incorporated with advanced reproductive technologies,
leading to greatly increased rates of genetic gain and potentially significant cost reduction as
breeding programmes move from progeny testing in sire selection to progeny validation. Use
of MBV allows for screening of suitable germplasm from global sources, and may possibly
extend to incorporate GxE and GxG and an NRM based on shared genome content in genetic
evaluation. Molecular keys for GWS can be readily updated as new sires enter the industry.
Additional applications: In addition to GWS, the SNP information is being used in the
assessment of genome-wide and population diversity, mate selection, management of
inbreeding, study of inherited disorders, pedigree validation, assembly of the bovine HapMap,
and high-density integrated maps.
CONFIDENTIAL
GENOME-WIDE SELECTION IN DAIRY CATTLE: USE OF GENETIC
ALGORITHMS IN THE ESTIMATION OF MOLECULAR BREEDING VALUES
R.E. Crump, B. Tier, G. Moser, J. Sölkner, K.R. Zenger, M.S. Khatkar,
J.A.L. Cavanagh and H.W. Raadsma
CRC for Innovative Dairy Products, Melbourne, Australia.
Marker genotype information may be utilised both in the detection of genes and in the prediction
of genetic merit. Genome-wide selection can be performed by the incorporation of genotype
information on relatively small numbers of markers alongside pedigree information in a best
linear unbiased prediction analysis. However, given sufficient markers it should be unnecessary
to use pedigree information at all, as the summation of marker effects (molecular breeding
value, MBV) will be a good predictor of genetic merit. For prediction it is desirable that the
markers are well spread across the genome, but further characterization is not required.
With on-going development of genotyping technology, in particular single nucleotide polymorphisms
(SNP), it is now possible to generate genotypes for many markers on any individual.
Consequently there are now many more genotypes than observations. In this project 10715
biallelic SNP genotypes were available after editing on 1546 progeny tested dairy bulls. To estimate
the SNP effects jointly from these over-parameterised data sets, it is necessary to either
explore the parameter space (as we will discuss further here), or to use dimension reduction
techniques such as partial least squares (see Moser et al., Woolaston et al., this workshop).
Estimated breeding values (EBV) with high reliability were used as a proxy for true genetic
merit for a variety of traits, using progeny tested bulls from the Australian Dairy Herd Improvement
Scheme. A simple regression model was used: $y_i = \mu + \sum_{j=1}^{s} \beta_j g_{ij}$, where $y_i$ is the EBV
for bull $i$, $\mu$ is the intercept, $g_{ij}$ is the genotype of bull $i$ for SNP $j$ (0, 1 or 2 copies of one of the
alleles), and $\beta_j$ is the additive effect of SNP $j$. Given there are only 1546 degrees of freedom
available in our data, it is necessary to explore the humungous set of possible models.
A genetic algorithm (GA) was used to both explore and optimize the set of models. A population
of random models was defined and the fitness of each model assessed. The fitness criterion
was originally envisaged as being the residual sum of squares; however, alternative criteria in
conjunction with internal cross-validation are now preferred in order to optimise the predictive
power of the model and avoid over-fitting. In n-fold cross-validation each record is assigned
to one of n sets. For each model, each of these sets is then predicted using the results of
regression analysis on the other n-1 sets combined. Parent models are selected and an offspring
model is generated by a single crossing over event followed by mutation (removal or addition
of SNP from the offspring model). The offspring model joins the population if it is better than
the worst model already in the population. The GA generates results both for the best model
and as weighted averages (both SNP effects and MBV) across models. The results of models
with higher fitness receive a higher weighting than less fit models. The weighted estimates are
typically better predictors of EBV than those from the best model alone. Accuracies across
replicated analyses of 4 traits ranged from 0.84 to 0.92. Accuracies of predictions from these
models ranged from 0.64 to 0.82 for the same range of traits.
The process allows for the selection of subsets of SNP that may be further utilised to genotype
young bulls and cows more cheaply than a whole genome scan. In addition, SNP that may be
of interest in gene detection can also be found with much less tendency to overestimate the size
and significance of effects than is associated with single-marker approaches such as the t or
F-tests.
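The GA loop described in this abstract (random initial models, single crossing over, add-or-remove mutation, replacement of the worst model) can be sketched as below. Several choices are assumptions rather than the authors' settings: a model is represented as a set of SNP indices, parents are drawn at random, and the fitness function is a toy placeholder where the abstract uses a cross-validated regression criterion.

```python
import random


def crossover(parent_a, parent_b, n_snp, rng):
    """Single crossing over: SNPs below the cut come from parent_a, the rest from parent_b."""
    cut = rng.randrange(1, n_snp)
    return {s for s in parent_a if s < cut} | {s for s in parent_b if s >= cut}


def mutate(model, n_snp, rng):
    """Remove one SNP if it is in the model, otherwise add it."""
    child = set(model)
    snp = rng.randrange(n_snp)
    if snp in child:
        child.remove(snp)
    else:
        child.add(snp)
    return child


def evolve(fitness, n_snp, pop_size, n_iter, seed=0):
    """Steady-state GA; returns (best fitness, best model)."""
    rng = random.Random(seed)
    population = [set(rng.sample(range(n_snp), rng.randint(1, n_snp // 2)))
                  for _ in range(pop_size)]
    scored = [(fitness(m), m) for m in population]
    for _ in range(n_iter):
        (_, parent_a), (_, parent_b) = rng.sample(scored, 2)   # random parents (assumption)
        child = mutate(crossover(parent_a, parent_b, n_snp, rng), n_snp, rng)
        child_fit = fitness(child)
        worst = min(range(pop_size), key=lambda k: scored[k][0])
        if child_fit > scored[worst][0]:                        # better than the worst model
            scored[worst] = (child_fit, child)
    return max(scored, key=lambda item: item[0])


# Toy fitness: closeness to a hypothetical set of causal SNPs.
TARGET = {2, 5, 7}


def toy_fitness(model):
    return -len(model ^ TARGET)


best_fit, best_model = evolve(toy_fitness, n_snp=10, pop_size=20, n_iter=500, seed=5)
```

Because an offspring only ever replaces the worst model, the best fitness in the population cannot decrease from one iteration to the next.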
ESTIMATION OF MOLECULAR BREEDING VALUES IN GENOME WIDE SELECTION USING SUPERVISED
DIMENSION REDUCTION BASED ON PARTIAL LEAST SQUARES
G. Moser, B. Tier, R.E. Crump, J. Sölkner, K.R. Zenger, M.S. Khatkar,
J.A.L. Cavanagh and H.W. Raadsma
CRC for Innovative Dairy Products, Melbourne, Australia
ABSTRACT
The advent of high-density SNP typing platforms in cattle has opened the possibility to perform
Genome Wide Selection (GWS) without the need for QTL and pedigree analysis in predicting
Molecular Breeding Values (MBV). A challenging problem connected with whole-genome SNP data is
that it typically contains many more variables (p, SNP) than observations (n, breeding values). It is not
uncommon to collect genotype information on several thousand SNP markers using only a few hundred
individuals. Since most traditional multivariate techniques are not applicable to such high-dimensional
genomic data, special techniques such as variable selection or dimension reduction are
required.
A powerful approach for analyzing high-dimensional whole-genome SNP data is supervised dimension
reduction based on partial least squares (PLS). As a supervised approach, it uses the response variable
of interest in the dimension reduction step, which often makes it more efficient in prediction problems
than the unsupervised principal component analysis. In PLS, dimension reduction and regression are
performed simultaneously.
PLS was used to predict MBV based on SNP information. The data comprised 10715 SNP typed in
1546 dairy bulls born between 1955 and 2001. Breeding values (EBV) for several traits were supplied
by the Australian Dairy Herd Improvement Scheme. Internal validation of data using cross-validation
was performed to determine a model's predictive capacity and to determine the optimal model complexity
(i.e. number of latent components).
PLS models calculated from all SNP described the relationship between SNP and EBV very well
(r = 0.85-0.97). The use of a large number of SNP with small numbers of animals gives rise to a significant
risk of overfitting. To further validate the models, the EBV variables were permuted randomly and
none of the randomized sets gave a high predictive score.
An issue which is tightly connected with the prediction of breeding values is the identification of genes
underlying genetic variation. Using reduced SNP sets provides faster and more cost-effective genotyping
and allows the application of statistical methods which cannot handle p >> n. A ranking list of the
most important markers was constructed by summing the absolute values for all loading vectors in a
PLS model, taking into account the variance explained by each component. A simple forward selection
strategy to minimize the cross-validated prediction error was then applied to develop sets of non-redundant
SNP that are useful in predicting breeding values.
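How PLS performs dimension reduction and regression at the same time can be seen in a compact single-response (PLS1) fit. The sketch below uses the standard NIPALS-style recursion in pure Python; it is not the authors' software, `pls1_fit` and `pls1_predict` are illustrative names, and the four-record data set is made up. Each component takes its direction from the covariance between the current predictors and the response, which is what makes the reduction supervised.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1; returns (x means, y mean, list of (weights, loadings, q))."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]   # centred predictors
    f = [v - y_mean for v in y]                                 # centred response
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]   # direction X'y
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:                                        # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]   # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt             # regression of y on the score
        for i in range(n):                                      # deflate X and y
            for j in range(p):
                E[i][j] -= t[i] * load[j]
            f[i] -= q * t[i]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one record by repeating the deflation steps on its predictors."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


# Made-up data generated as y = 1 + 2 * x1 - x2, without noise.
X = [[0, 0], [1, 0], [1, 1], [2, 3]]
y = [1.0, 3.0, 2.0, 2.0]
model = pls1_fit(X, y, n_components=2)
fitted = [pls1_predict(model, row) for row in X]
```

In practice the number of components would be chosen by cross-validation, as the abstract describes, rather than fixed in advance.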
PRINCIPAL COMPONENTS REGRESSION OF SNP DATA TO PREDICT
GENETIC MERIT

A.F. Woolaston (1,2), B. Tier (1,2) and R.D. Murison (3)

(1) CRC for Innovative Dairy Products, Melbourne, Australia
(2) Animal Genetics and Breeding Unit, University of New England, Armidale, Australia
(3) School of Mathematics, Statistics and Computer Science, University of New England,
Armidale, Australia

Developments in genetic technology allow large numbers of single nucleotide polymorphism
(SNP) values to be scored for individuals. Within animal breeding, it is hoped that these SNPs
will be used to predict the genetic merit of animals at an early stage so that superior animals can
be identified for further testing or breeding. The large number of SNPs evaluated means
that the predictor variables lie in a high-dimensional space, with a limited number of
free parameters available. This can be overcome by adding more animals to the experiment or by
reducing the dimension of the predictor space. It may not be practical to increase the number of
animals in many cases because the required increase in the number of animals is approximately
$3^n$, where $n$ is the number of SNPs, which can be in the thousands. Thus, it is sensible to reduce
the dimension of the predictor space.

Principal component analysis (PCA) is a multivariate analysis technique where the aim is to
reduce the dimension of a dataset comprised of many correlated variables, while still accounting
for a large proportion of the variance. That is, a linear transformation is applied to the data,
resulting in uncorrelated principal components (PCs). The technique is well suited to SNP marker
data, where markers are highly correlated due to linkage disequilibrium. Principal component
regression (PCR) involves using the PCs as explanatory variables in a multiple linear regression.

SNP data are simulated to assess PCR as a method of molecular breeding value (MBV)
prediction. The true MBV of each animal is simulated as $\text{MBV} = \sum_{i=1}^{n} a_i g_i$, where $a_i$ is the
additive effect of the $i$th SNP and $g_i$ is the genotype at the $i$th SNP (0, 1 or 2). The phenotypic
value is simulated as $T = \text{MBV} + \varepsilon$, where $\varepsilon \sim N(0, \sigma^2)$. The variance of $\varepsilon$ and the additive genetic
variance are simulated so that the trait has a heritability of (a) 0.1, (b) 0.4 and (c) 0.7.

PCR is applied to the older animals in these simulated data, with the phenotypes as the response
variables and the PCs of the SNPs as the explanatory variables:
$T_i = \beta_1 PC_{i1} + \beta_2 PC_{i2} + \beta_3 PC_{i3} + \dots + \beta_p PC_{ip}$,
where $\beta_k$ is the regression coefficient for the $k$th PC. These regression coefficients are
used to predict the MBVs for the younger simulated animals, after their genotypes are
transformed into the principal subspace. The correlation between the simulated and estimated
MBVs is used as a measure of the accuracy of the method. The accuracy ranges from 0.98 when
the heritability is 0.7 to 0.78 when the heritability is 0.1.
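The simulation of true MBV and phenotypes for a target heritability can be sketched as follows. This is an illustration under assumptions that the abstract does not state: SNP genotypes are drawn independently (so without linkage disequilibrium), effects are standard normal, and the residual variance is set from the sample variance of the simulated MBV. All names are hypothetical.

```python
import random


def residual_variance(var_g, h2):
    """Residual variance giving heritability h2 = var_g / (var_g + var_e)."""
    return var_g * (1.0 - h2) / h2


def simulate_phenotypes(genotypes, effects, h2, rng):
    """Return (true MBV, phenotypes) for animals with 0/1/2 genotype codes."""
    mbv = [sum(a * g for a, g in zip(effects, row)) for row in genotypes]
    mean = sum(mbv) / len(mbv)
    var_g = sum((v - mean) ** 2 for v in mbv) / (len(mbv) - 1)
    sd_e = residual_variance(var_g, h2) ** 0.5
    phenotypes = [v + rng.gauss(0.0, sd_e) for v in mbv]   # T = MBV + residual
    return mbv, phenotypes


rng = random.Random(42)
n_animals, n_snp = 200, 50
effects = [rng.gauss(0.0, 1.0) for _ in range(n_snp)]
genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_snp)] for _ in range(n_animals)]
mbv, phenotypes = simulate_phenotypes(genotypes, effects, h2=0.4, rng=rng)
```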
A comparison of different regression methods for genomic-assisted prediction
of genetic values in dairy cattle
J. Sölkner (1,2,3), B. Tier (1,4), R. Crump (1,4), G. Moser (1), P. Thomson (1,3) and H.W. Raadsma (1,3)
(1) Co-operative Research Centre for Innovative Dairy Products (CRC IDP)
(2) University of Natural Resources and Applied Life Sciences, Gregor Mendel Str. 33, A-1180 Vienna, Austria
(3) ReproGen, Centre for Advanced Technologies in Animal Genetics and Reproduction, Faculty
of Veterinary Science, The University of Sydney, Camden, NSW 2560, Australia
(4) University of New England, AGBU, Armidale, NSW 2351, Australia
The availability of large arrays of single nucleotide polymorphisms (SNP) is changing the
approach of predicting breeding values from molecular information (MBV) through genome wide
selection (GWS) (Raadsma et al., these proceedings). A pool of 1546 Holstein Friesian bulls,
mostly of Australian origin, with highly accurate estimates of breeding values (EBV) was
genotyped for 15380 SNP. Methods of regressing EBV, considered as proxies for true breeding
values, on SNP genotypes coded as 0, 1 (heterozygous) and 2 were compared. The traits
considered were total merit, protein yield, overall type, fertility and somatic cell count. The
methods applied were ordinary least squares regression (OLSR) with LAR variable selection,
OLSR using a genetic algorithm for variable selection and modified prediction (OLSR-GA,
Crump et al., these proceedings) and partial least squares regression (PLSR, Moser et al., these
proceedings). To avoid over-fitting due to the large number of regressors, cross-validation
techniques were applied and the predictive capacity was evaluated from 5 repeated runs
separating 200 bulls as test data not involved in the estimation. Correlations of true and predicted
values for these test data sets were in the range of 0.65-0.8 for most traits, including fertility, a
trait with low heritability. OLSR-GA and PLSR performed significantly better than OLSR.
Critical issues that remain are the selection of optimal models and of the optimal number of SNP,
multi-trait MBV estimation, and validation of GWS in applied breeding programmes.
6 Advances in QTL detection theory 2
E-mail: DJ.deKoning@BBSRC.AC.UK
Towards Genetical genomics in Livestock
D. J. de Koning, C. P. Cabrera and C. S. Haley
The Roslin Institute, Roslin Biocentre, Roslin, EH25 9PS, United Kingdom
This document is modified from a paper presented at PSA 2006 and EAAP 2006 and currently accepted
for publication in Poultry Science. The contents do not entirely reflect the workshop presentation
but provide a useful overview.
The QTL MAS presentation had additional input from Dirk Husmeier, BIOSS, Edinburgh
ABSTRACT
Microarrays have been widely implemented across the life sciences, although there is still
debate on the most effective uses of such transcriptomics approaches. In genetical
genomics, gene expression measurements are treated as quantitative traits and genome
regions affecting expression levels are denoted as expression quantitative trait loci or
eQTL. The detected eQTL can either represent a locus that lies close to the gene
that is being controlled (cis-acting) or one or more loci that are unlinked to the gene that is
being controlled (trans-acting). One powerful outcome of genetical genomics is the
reconstruction of genetic pathways underlying complex trait variation. Because of the
modest size of experiments to date, genetical genomics may fall short of its promise to
unravel genetic networks. We propose to combine expression studies with fine mapping
of functional trait loci. This synergistic approach facilitates the implementation of
genetical genomics for species without inbred resources but is equally applicable to model
species. Among livestock species, poultry is well placed to embrace this technology with
the availability of the chicken genome sequence, microarrays for various platforms,
as well as experimental populations in which QTL have been mapped.
Other species are catching up, with genome sequences becoming available for cattle and
advanced plans for the pig genome. In the build-up towards full-blown eQTL studies,
we can study the effects of known candidate genes or marked QTL at the gene expression
level in more focussed studies. To demonstrate the potential of genetical
genomics, we have identified the cis and trans effects for a functional body weight
QTL on chicken chromosome 4 in breast tissue samples from chickens with contrasting
QTL genotypes.
Key words: Experimental Design, Fine Mapping, Gene Expression, Quantitative Trait Locus
INTRODUCTION
Dissecting the genetic control of variation in complex traits and identifying
underlying loci controlling such variation has proved to be very challenging.
While quantitative trait locus (QTL) detection has been successful in
identifying chromosomal regions associated with a wide range of complex
traits in many different species (e.g. experimental crosses [1], livestock
[2-4], humans [5]), these regions are sufficiently large to contain hundreds
if not thousands of potential candidate genes. Further fine mapping of these
QTL to reduce the size of these regions, and hence refine the list of
potential candidate genes, can be achieved by creating additional
recombination events through selective breeding [6] or by exploiting
historical recombinations [7].
An approach that has great promise to make a major contribution to the
dissection of complex traits is genetical genomics: the combined study of
gene expression and marker genotypes in a segregating population [8,9].
Genetical genomics is aimed at detecting genomic loci that control variation
in gene expression, so-called expression QTL (eQTL) (to distinguish them from
functional QTL that affect traits at the whole-organism level). The detected
eQTL can either represent a locus that lies close to the gene that is being
controlled (cis-acting) or one or more loci that are unlinked to the gene
that is being controlled (trans-acting) [8]. A major promise of genetical
genomics is that, by examining the relationship between transcript location,
location of eQTL and pleiotropic effects of eQTL, it might be possible to
reconstruct genetic pathways that underlie phenotypic variation [8].
Additional information to reconstruct pathways comes from the correlations
between and among gene expression measurements and functional traits [10] and
the epistatic interactions between eQTL and functional QTL [11]. Therefore,
genetical genomics can be exploited as an additional tool to dissect
phenotypic variation into its underlying components and elucidate how these
components interact. If successful, genetical genomics will enhance and
accelerate the characterization of functional QTL, which remains an arduous
task, even in model species. At present, several studies have demonstrated
the feasibility of eQTL studies and some of these have successfully
integrated eQTL and gene expression with data on traditional phenotypes. What
all these studies have in common is that, in comparison to 'traditional' QTL
studies of functional traits, the sizes of the experiments are modest to
small [12]. Consequently, the power of the studies is low, many important QTL
will not have been detected and interactions between QTL will have been
missed. Thus the results to date have not been very successful at
reconstructing genetic pathways or identifying genes underlying functional
trait variation. Therefore, more powerful experiments addressing these issues
are necessary to realise the full potential of eQTL mapping.
Following a case study of how a QTL experiment has been integrated with
microarray analyses in poultry, we outline an experimental strategy to
improve the efficiency of future eQTL studies. We subsequently introduce a
targeted approach to study the gene expression effects of a marked QTL.
Integrating QTL and Gene Expression studies
In a number of cases, traditional QTL studies have been supplemented with
microarray data in an attempt to move from a functional QTL to the underlying
gene(s) [13]. Below, we outline a case study where detection of functional
QTL was followed up by a gene expression analysis. In this example,
microarray experiments were carried out on the founder lines of the study.
The underlying idea was that genes that were differentially expressed between
the founder lines AND were located in the areas of the QTL that were found in
the cross resulting from these lines would be prime positional candidates for
the functional genes underlying the QTL. In genetical genomics terms, this
type of analysis explores whether the functional QTL is also a cis-acting
eQTL. It would be much more difficult for such a study to determine the
genetic basis of a QTL that had its functional effect through trans-acting
regulation of expression of genes located outside the QTL region. This is
because there are likely to be many differences in expression between lines
for genes across the genome. This study provides no information on where in
the genome the control of those expression differences lies and hence which
of these genes are associated with the QTL region.
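The positional-candidate filter described above (differentially expressed
between the founder lines AND located within a mapped QTL interval) can be
sketched in a few lines of Python. The gene names, positions and interval
below are hypothetical and serve only to illustrate the logic; this is not the
analysis pipeline of the studies cited.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chromosome: str
    position_mb: float              # physical position in megabases
    differentially_expressed: bool  # between the two founder lines


class QtlInterval(NamedTuple):
    chromosome: str
    start_mb: float
    end_mb: float


def positional_candidates(genes, qtl_intervals):
    """Return names of genes that are differentially expressed AND inside a QTL."""
    hits = []
    for gene in genes:
        if not gene.differentially_expressed:
            continue
        for qtl in qtl_intervals:
            same_chrom = gene.chromosome == qtl.chromosome
            if same_chrom and qtl.start_mb <= gene.position_mb <= qtl.end_mb:
                hits.append(gene.name)
                break
    return hits


# Hypothetical example data (not taken from the paper).
genes = [
    Gene("geneA", "1", 12.0, True),
    Gene("geneB", "1", 80.0, True),
    Gene("geneC", "1", 15.0, False),
    Gene("geneD", "4", 60.0, True),
]
qtl = [QtlInterval("1", 10.0, 20.0)]
candidates = positional_candidates(genes, qtl)
```

In this toy example geneB and geneD are differentially expressed but map
outside the QTL interval. As the text notes, a founder-line comparison alone
cannot tell whether such genes are regulated in trans by the QTL.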
Resistance to Marek's disease in chicken.
Marek's disease (MD) is an infectious viral disease caused by a member of the
herpes virus family. MD costs the poultry industry about 1 billion USD per
annum. To study the genetic control of MD susceptibility, an experimental
cross was established between a resistant and a susceptible inbred line of
chicken [14]. F2 offspring from this cross were experimentally challenged and
genotyped, providing the data for a QTL analysis that resulted in seven QTL
for susceptibility to MD [14,15]. Subsequently, the founder lines of the F2
cross were used for a microarray study to identify genes that were
differentially expressed between the two lines following artificial
infection. Fifteen of these genes were mapped onto the chicken genome and two
of them mapped to a QTL region for Marek's resistance [16]. At the same time,
protein interaction studies between a viral protein (SORF2) and a chicken
splenic cDNA library revealed an interaction with the chicken growth hormone
(GH) [17]. This led to the detection of a polymorphism in the GH gene that
was associated with differences in the number of tumours between the
susceptible and the resistant line [17]. The GH gene coincided with a QTL for
resistance and also showed up as differentially expressed between the founder
lines in the expression study [16]. More recently, the same group described
detection of lymphocyte antigen 6 complex, locus E (LY6E) as a putative
Marek's disease resistance gene, again using the virus-host protein
interaction screen [18]. LY6E had been demonstrated earlier to be
differentially expressed between resistant and susceptible chickens, but its
location was not near an MD QTL [16]. Hence one could speculate that one of
the MD QTL could act through trans-acting control of the expression of this
locus.
This research has demonstrated nicely how integrating across research
disciplines can be very profitable. A limiting factor in the further
exploitation of the QTL is the lack of precision and power to detect QTL with
only 272 chickens in the F2. The comparison of gene expression levels on the
founder lines showed several potential candidate genes, but the link to the
QTL regions is indirect. Scoring the gene expression levels on the F2 would
have provided a more direct link between MD QTL and eQTL and may well have
flagged LY6E and GH as targets for eQTL. The GH effect coincided with a
functional QTL, pointing towards a cis effect, while the LY6E effect would
appear to be trans-regulated, and therefore only traceable to its eQTL in a
genetical genomics setting.
Status of eQTL studies. To date, actual eQTL studies have been published for
mice [19-21], rats [22], maize [21], yeast [23-25], eucalyptus [26] and human
[27,28]. Most of these studies are 'proof of principle' or focus on the
regulation of gene expression in itself.
The eQTL studies in yeast started out as a fairly straightforward proof of
principle [23], which was followed up by exploring whether trans-regulating
elements coincided with known transcription factors [24]. More recently, this
work was extended to more general questions about the genetic regulation of
gene expression in yeast [25] and the relevance of epistasis [29]. Two
studies used the same recombinant inbred (RI) lines of mice to study eQTL in
forebrain [20] and haematopoietic stem cells [19], respectively, and could
relate their findings to a whole range of phenotypes that have been measured
on these mice as part of other studies. These phenotypes, as well as the
expression phenotypes, have been made available online at
www.genenetwork.org, providing a very valuable resource for the research
community. However, the phenotypes, including the expression phenotypes, are
only provided for the roughly 33 RI lines available so far, resulting in
relatively low power to detect functional QTL and eQTL. With low power to
detect QTL, only the largest QTL effects are detected and most moderate and
small QTL will be missed. As a result, integration of eQTL results and
functional trait QTL will only identify the largest effects.
A general conclusion from the published eQTL studies is that the most
convincing evidence for eQTL is for the cis-acting effects [12], while the
reconstruction of genetic networks would require the identification of a
larger proportion of trans-acting eQTL, including those with moderate
effects. In short, current eQTL studies miss many important loci and fail to
reconstruct genetic pathways underlying functional variation. At the same
time, efforts to find the gene(s) underlying functional QTL via fine mapping
and/or gene expression studies would be more effective if they were better
integrated.
TOWARDS TARGETED AND
INTEGRATED MAPPING
With the continuous improvements in data extraction and normalization, a
further increase in the precision of gene expression measurements can be
anticipated. Such a reduction in technical variation in gene expression
measurements will increase the power to detect eQTL. Nonetheless, to improve
the power and repeatability of eQTL studies it is necessary to increase their
size towards those used in QTL studies of other traits.
In addition, combining larger studies with a more focussed approach further
improves the power of future eQTL studies. In Figure 1 we outline 'Targeted
and Integrated Mapping' of marked phenotypic QTL for functional traits
(functional QTL). The central idea is to focus the studies on a relevant
functional trait for which QTL have been identified previously. Targeted and
Integrated Mapping has three components (Figure 1): 1) From a large resource
population, individuals that are non-recombinant for markers flanking the QTL
region(s) are selected for the eQTL experiment. 2) Individuals that are
recombinant for the QTL region(s) are utilised for further fine mapping of
the QTL. 3) Additional expression studies are carried out for some of the
recombinant individuals to confirm or evaluate positional candidate genes
underlying the QTL.
Targeted and Integrated Mapping is applicable to any species for which large
segregating populations are either naturally occurring or can be created
experimentally, as in many livestock (including poultry), crop and
experimental organisms. The approach is particularly appropriate where inbred
resources, such as RI lines, are not available or cannot realistically be
produced, as for most poultry species. In the following sections, we outline
this approach in the context of an F2 study.
The underlying assumption of Targeted and Integrated Mapping is that QTL with
major effects on the phenotype for a functional trait will often have major
effects on expression of one or more genes. In some cases, the eQTL
underlying a functional QTL may act in cis to control the expression that
causes the phenotypic effect, as recently demonstrated for the IGF2 locus in
pigs [30]. Alternatively, the phenotypic effect of a QTL and effects on
expression of one or more genes may be the downstream consequence of genetic
variation acting within a pathway or complex (network) of pathways. In this
case we might expect to map one or more trans-acting eQTL to the region of
the functional QTL. Compared to an unspecific genome scan for eQTL, Targeted
and Integrated Mapping will have increased power to detect eQTL underlying
functional QTL and to identify genetic networks and gene interactions for
target QTL.
Step 1: The eQTL Study
Let us assume that we know from prior information (e.g. a QTL mapping study)
that a selected genomic region affects a complex trait, usually because a
functional QTL has been mapped there. Markers spaced through the putative
functional QTL region (target region) are genotyped prior to phenotyping for
fine mapping or tissue collection for expression studies. This allows
contrasting genotypes (e.g. alternative homozygotes in an F2 population) for
one or more functional QTL to be selected for the expression study, whilst
individuals that are recombinant in the QTL regions are diverted into the
fine-mapping study. Selecting individuals that are homozygous for the target
regions increases the power to detect eQTL for these regions and decreases
genetic complexity.
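The allocation of individuals described above can be sketched as a small
classification step. The genotype coding ("AA", "AB", "BB" for the two
founder-line alleles), the function name and the arm labels are assumptions
made for this illustration only.

```python
def classify_individual(left_marker: str, right_marker: str) -> str:
    """Assign an F2 individual to a study arm from its flanking-marker genotypes.

    Genotypes are coded "AA", "AB" or "BB", where A and B denote the alleles
    of the two founder lines (a coding assumed for this sketch).
    """
    if left_marker == right_marker and left_marker in ("AA", "BB"):
        return "expression_study"  # homozygous, no visible recombination
    if left_marker != right_marker:
        return "fine_mapping"      # at least one recombination in the interval
    return "not_selected"          # heterozygous at both flanking markers


# Hypothetical genotypes: individual -> (left flanking, right flanking).
f2 = {
    "id1": ("AA", "AA"),
    "id2": ("AA", "AB"),
    "id3": ("AB", "AB"),
    "id4": ("BB", "BB"),
    "id5": ("BB", "AA"),
}
arms = {ind: classify_individual(*geno) for ind, geno in f2.items()}
```

With only two flanking markers, double recombinants inside the interval go
undetected, so a few individuals classed as non-recombinant may in fact carry
recombinant haplotypes; denser markers in the target region reduce this risk.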
This approach improves the power to detect eQTL in three ways: 1) When
selecting n homozygous individuals from an F2, the power to detect the
additive effect of an eQTL for the target regions equals that of an F2 of
size 2n. 2) Because the contrast to estimate the putative eQTL effect is only
between classes of homozygous individuals, the genetic test is simpler and
uses fewer degrees of freedom. 3) A targeted study of one or several
predetermined QTL regions involves substantially less multiple testing than
does a complete genome scan, so the significance threshold for the identified
regions could be less stringent than that for the remainder of the genome,
increasing the power to detect eQTL even more.
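The first point can be checked with a short calculation. In a regression of
expression on an additive genotype code x, the sampling variance of the
estimated additive effect is proportional to 1 / (n * var(x)). The sketch
below assumes a fully informative marker at the eQTL, Mendelian 1:2:1
proportions in the unselected F2, and equal numbers of the two homozygous
classes in the selected sample; the function and variable names are ours.

```python
from fractions import Fraction


def genotype_variance(freqs):
    """Variance of the additive genotype code x, given {code: frequency}."""
    mean = sum(x * f for x, f in freqs.items())
    return sum(f * (x - mean) ** 2 for x, f in freqs.items())


# Additive coding: qq = -1, Qq = 0, QQ = +1.
unselected_f2 = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}
homozygotes_only = {-1: Fraction(1, 2), 1: Fraction(1, 2)}

var_f2 = genotype_variance(unselected_f2)
var_hom = genotype_variance(homozygotes_only)

# Equal information on the additive effect requires
# n_f2 * var_f2 == n_hom * var_hom.
n_hom = 200
equivalent_f2_size = n_hom * var_hom / var_f2
```

Under these assumptions the genotype variance doubles when heterozygotes are
excluded, so 200 selected homozygotes carry the same information on the
additive effect as 400 unselected F2 individuals.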
The rest of the genome can also be studied for eQTL, albeit with lower power
than for the target regions. (For regions unlinked to selected regions the
power to detect eQTL should be equivalent to that of an unselected sample of
the same size, so selection is not disadvantageous for eQTL mapping in these
regions.) Furthermore, interactions can be studied between target regions as
well as between the target regions and the remainder of the genome [31].
Step 2: The Fine Mapping Study
Fine mapping strategies include those in which recombination in the QTL
region is increased by targeted breeding (e.g. advanced intercross lines [6])
and those that exploit historical recombination events. Alternatively, a
large pedigreed population should provide sufficient recombination to
fine-map a QTL without the need for additional generations [32,33]. By typing
all individuals of the population for markers flanking the QTL, all
recombinant individuals are identified. These recombinant individuals are
available for further focused study to fine-map the functional QTL. To
decrease the genotyping load of the fine mapping, a subset of these
recombinant individuals could be chosen for further study based on their
phenotypic values for the functional trait in question [33]. Rather than
typing the selected individuals for all available markers in the QTL region,
a further decrease of the genotyping load could be obtained by applying
genotyping strategies like the half-section algorithm or the golden section
algorithm [33].
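One way to read the half-section idea is as a bisection search: for a
haplotype known to be recombinant between the flanking markers, type the
middle marker, keep the half that still contains the crossover, and repeat.
The sketch below is a generic bisection written under that reading; it is not
taken from the cited reference (Ronin et al., reference 33), whose algorithm
may differ in detail. It assumes a single crossover in the interval, and the
`genotype_marker` callback is a hypothetical stand-in for a laboratory assay.

```python
def locate_breakpoint(n_markers, genotype_marker):
    """Bracket a single crossover between two adjacent markers by bisection.

    n_markers       -- number of ordered markers, indexed 0 .. n_markers - 1;
                       markers 0 and n_markers - 1 are the flanking markers
                       and are assumed to differ in parental origin.
    genotype_marker -- callable(index) -> "A" or "B", the parental origin of
                       the haplotype at that marker.
    Returns (left, right, assays): the crossover lies between markers left
    and right; assays counts the interior markers that had to be typed.
    """
    left, right = 0, n_markers - 1
    # The flanking genotype is already known from the screening step,
    # so it is not counted as a new assay.
    left_origin = genotype_marker(left)
    assays = 0
    while right - left > 1:
        mid = (left + right) // 2
        assays += 1
        if genotype_marker(mid) == left_origin:
            left = mid   # crossover lies to the right of mid
        else:
            right = mid  # crossover lies to the left of mid
    return left, right, assays


# Hypothetical haplotype over 17 markers, crossover between markers 10 and 11.
haplotype = ["A"] * 11 + ["B"] * 6
left, right, assays = locate_breakpoint(len(haplotype), haplotype.__getitem__)
```

In this example the crossover is bracketed after typing 4 of the 15 interior
markers, which illustrates why such strategies reduce the genotyping load
compared with typing every marker in the interval.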
Step 3: Combining eQTL and Fine Mapping
Any eQTL that are identified in the target regions are potential candidates
underlying the functional QTL effect. Given the increased power for the
target regions, it is possible that eQTL that are detected in the target
region have no direct relation with the functional QTL. The fine mapping
study will reduce the confidence interval of the functional QTL, facilitating
a more limited selection of positional candidate genes underlying the
functional QTL effect.
For cis-acting eQTL, the position of the gene with the associated cis effect
will be accurately known for species with good physical mapping or sequence
data. Thus, it can be evaluated whether the gene with an associated cis
effect still maps to the refined confidence interval of the functional QTL.
For trans effects that map to the candidate region, a simple comparison of
location of eQTL and fine-mapped functional QTL is unlikely to be conclusive.
In this case, additional expression studies using selected individuals from
the fine mapping study may be required to resolve which eQTL are most likely
correlated with the functional QTL and which are more likely to be linked
effects. If the number of positional candidate genes is limited, such a study
could evaluate a much smaller number of genes using methods like RT-PCR. The
selection of genes that merit additional expression studies can be further
limited by selecting those genes that give strong correlations with the
phenotypic trait or have a known biological function related to the trait of
interest.
Power, Precision and Population size
The successful implementation of the proposed strategy depends on the power
to detect eQTL and the resolution of the fine mapping experiment. Selecting a
required number of individuals that are homozygous for a functional QTL
region determines the minimum size of the resource population. This in turn
determines the expected precision that can be achieved for fine mapping.
Figure 2A shows the predicted statistical power to detect eQTL with different
relative effects (on gene expression) and different numbers of F2 selected
for eQTL mapping. As stated above, for the regions where all selected
individuals are homozygous the increase in power is equivalent to doubling
the number of individuals. For instance, with 200 F2 the predicted power to
detect an effect of 0.3 phenotypic SD is 0.34 for most of the genome, while
for the target region it is 0.84 (Fig. 2A).
When selecting for homozygosity based on markers flanking the confidence
interval of the functional QTL, the minimum size of the resource population
should take into account the number of QTL that are targeted, the size of the
interval between the flanking markers, and random fluctuations in Mendelian
proportions. To obtain the number of homozygous individuals for eQTL analysis
shown in Figure 2A, we have calculated the required size of the resource
population when 1, 2, or 3 functional QTL are targeted with an initial
confidence interval of 20 cM (Figure 2B). These population sizes give a 95%
probability of yielding the stated number of individuals homozygous for one
or the other gamete through each of the selected regions [34]. When focussing
on a single QTL the required population size is 700 when aiming at 200 F2 for
the expression study and 1,650 when aiming at 500 F2 for expression studies.
When targeting 2 (3) functional QTL, a population of 2,200 (6,850) is
required to provide 200 homozygous F2 and 5,250 (16,400) to provide 500
homozygous F2 (Figure 2B). Selecting individuals that are homozygous for
multiple functional QTL improves the ability to map interactions at the
expression level between these QTL, but the required population size becomes
prohibitive for most species when three or more QTL are considered. Assuming
infinite map density, the expected confidence interval of a QTL study can be
predicted based on the size of the experiment and the QTL effect [6]. The
predicted confidence interval for the functional QTL following the fine
mapping exercise with 5,000 - 25,000 individuals in the resource population
is shown in Figure 2C. For functional QTL of larger effect, sub-cM confidence
intervals can be obtained when using a population exceeding 5,000 individuals
(Fig. 2C). Such an experiment could for instance accommodate eQTL mapping
with 400 F2 that are selected to be homozygous for two QTL, each flanked by a
20 cM marker bracket (Fig. 2B). Based on the numbers presented in Figure 2,
targeting (multiple) functional QTL with more modest effects will prove very
challenging.
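The order of magnitude of these population sizes can be recovered with a
simple binomial argument. The sketch below is our own reconstruction, not the
method of reference 34, and rests on several assumptions: the targeted
intervals are unlinked, a 20 cM interval is treated as a recombination
fraction of 0.2, an individual qualifies when both gametes are
non-recombinant and of the same parental origin, and the binomial count is
approximated by a normal distribution. All function names are hypothetical.

```python
from statistics import NormalDist


def prob_homozygous(n_qtl, recomb_fraction=0.2):
    """Probability that an F2 individual is non-recombinant homozygous
    for every one of n_qtl unlinked target intervals.

    Per interval: both gametes non-recombinant, (1 - r)^2, and both of the
    same parental origin, 1/2.
    """
    return (0.5 * (1.0 - recomb_fraction) ** 2) ** n_qtl


def required_population(n_selected, n_qtl, confidence=0.95,
                        recomb_fraction=0.2):
    """Smallest N with P(at least n_selected qualify) >= confidence,
    using a normal approximation to the binomial count."""
    p = prob_homozygous(n_qtl, recomb_fraction)
    size = n_selected
    while True:
        mean = size * p
        sd = (size * p * (1.0 - p)) ** 0.5
        # P(X >= n_selected) with a continuity correction.
        if 1.0 - NormalDist(mean, sd).cdf(n_selected - 0.5) >= confidence:
            return size
        size += 1


one_qtl = required_population(200, 1)
two_qtl = required_population(200, 2)
```

Under these assumptions the results for 200 selected individuals fall within
a few percent of the 700 and 2,200 quoted in the text, and the steep growth
with the number of targeted QTL follows directly from multiplying the
per-interval probabilities.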
Even though the size of the resource population may seem prohibitive, it is
important to realise that not all individuals are fully genotyped. From a
resource population of size N, all individuals will be typed for the 2
markers flanking each of the m targeted functional QTL. The n selected F2
will be used for genome-wide marker analysis in the eQTL study. Linkage
mapping does not require high-density markers and for most genomes anywhere
between 200 and 400 markers should be sufficient for a medium-density linkage
analysis. The amount of genotyping required for the selective recombinant
genotyping depends on the selected fraction, the genotyping strategy and the
size of the targeted interval [33]. Figure 2D summarizes the genotyping
requirements for resource populations of 5,000 - 20,000 individuals,
targeting 1, 2, or 3 functional QTL with an initial confidence interval of 20
cM, using the combined golden section / half section algorithm [33] and
selecting the top and bottom 25% for the trait of interest.
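The bookkeeping for the first two genotyping components can be written down
directly. The sketch below counts only the flanking-marker screen and the
genome-wide panel for the selected individuals; it deliberately leaves out
the selective recombinant genotyping, whose size depends on the strategy
used. The function name and the default of 300 markers (an arbitrary value
inside the 200 to 400 range mentioned above) are assumptions.

```python
def genotyping_load(pop_size, n_qtl, n_expression, genome_markers=300):
    """Count marker genotypes for the screening and eQTL steps only.

    pop_size       -- N, individuals in the resource population
    n_qtl          -- m, targeted functional QTL (2 flanking markers each)
    n_expression   -- n, homozygous individuals chosen for the eQTL study
    genome_markers -- markers in the genome-wide linkage panel
    """
    screening = pop_size * 2 * n_qtl
    genome_scan = n_expression * genome_markers
    return {
        "screening": screening,
        "genome_scan": genome_scan,
        "total": screening + genome_scan,
        # For comparison: typing everyone for the full panel.
        "full_genotyping": pop_size * genome_markers,
    }


load = genotyping_load(pop_size=5000, n_qtl=2, n_expression=400)
```

For this hypothetical design the screen and the genome scan together need
140,000 genotypes, against 1,500,000 if all 5,000 individuals were typed for
the full panel.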
TOWARDS eQTL IN LIVESTOCK
Genetical genomics requires genotypes, gene expression measurements and a
pedigreed population. However, to fully interpret the results, we need to
know the location of the genes as well as their function.
Among livestock species, chickens are very well placed to be used in
full-blown genetical genomics studies. There is a large number of chicken QTL
regions in the public domain [4] and the species has the benefit of a full
genome sequence [35] and a SNP database [36]. In terms of the gene expression
tools, there are a number of tissue-specific as well as general two-colour
arrays (both spotted cDNA and long oligonucleotide arrays;
http://www.ark-genomics.org/resources/chickens.php) as well as an Affymetrix
chicken genome array
(http://www.affymetrix.com/products/arrays/specific/chicken.affx). Large
resource populations of chicken can be bred in a timely fashion or obtained
from commercial lines. Populations for fine mapping, like advanced intercross
lines (AIL) [6], are available in several labs (e.g. Wageningen University,
Netherlands; Iowa State, USA; and Roslin Institute, UK). Microarrays are also
available for other livestock and, while the draft sequence of cattle has
been released and the plans for sequencing the pig genome are advancing, the
required annotation is still some way off. An area for further development in
the immediate future is the ongoing annotation of the genome and other
bioinformatics tools, like pathway databases that incorporate
livestock-specific information rather than pathways that are derived from
model organisms or humans. However, the most limiting factor in the uptake of
genetical genomics in livestock species is the budget required to run
microarray studies on large numbers of animals. The recently proposed design
of distant pairing [37] for genetical genomics looks promising in that it
offers the possibility to array 2n individuals using n microarrays. In
contrast to reference designs or one-colour arrays, this design is based on
the contrast in gene expression between individuals that have been selected a
priori on their divergent genotypes. However, this method has been
implemented only for RI lines and its efficiency for outbred populations has
not been quantified.
Limited resources
With limited resources and a more focused objective, the principle of
targeted eQTL mapping can still be applied. In the context of an F2 study or
similar, the increased power can be exploited in a smaller study if the main
focus is the identification of cis- and trans-acting eQTL that underlie the
QTL peak and a whole-genome eQTL scan is not of interest. The increased power
from selection of homozygous individuals and the less stringent significance
threshold required in a focused study, as opposed to a genome scan, mean that
a more modest sized resource population suffices and correspondingly fewer
individuals need to be assayed for gene expression. For an experiment with
200 F2 that are homozygous for the selected region(s), the power to detect
eQTL is ~95% for any effect larger than 0.3 phenotypic SD (using a less
stringent threshold of P < 0.01). Recombinant individuals can be used to
increase the mapping accuracy of the QTL, but the improved resolution will be
more modest and correspond to the smaller overall size of the resource
population. The smaller sized study may mean that a genome scan is less
worthwhile (although if genotyping costs are modest, a genome scan for the
largest effect eQTL can be undertaken with little additional input as the
expression data are already recorded).
Using Genetical genomics for a marked QTL
To illustrate the potential of genetical genomics we describe a pilot study
in chickens [38]. The crucial part is the focused study of a particular
putative QTL, in this case one affecting body weight segregating in an
inter-cross of broilers and layers. Our objective was to identify candidate
genes through the effect of the QTL at the gene expression level: what genes
are affected, where do they map and in what kind of pathways are they
involved?
We identified individuals that were homozygous for markers flanking a QTL
region on chromosome 4 (GGA4) from the seventh generation of an advanced
inter-cross between a single broiler and a single layer chicken. These were
inferred to be either QQ (broiler allele) or qq (layer allele) for the QTL
and matings were set up to provide birds with 'known' QTL genotypes. From the
resulting offspring, QQ males and qq males were slaughtered at 21 days of age
and a sample of the breast muscle was taken for RNA isolation and microarray
studies. The microarray design was a direct comparison of QQ versus qq for
eight independent samples with a dye-swap (16 arrays used in total). The
microarray was a chicken cDNA array with 12,877 functional features, spotted
in duplicate (Ark-Genomics, 2004). Using five alternative normalization
procedures, we defined a consensus set of results consisting of 45 (895)
differentially expressed genes when applying a false discovery rate (FDR) of
5% (20%) [38]. This implies that out of 45 (895) results we expect fewer than
3 (180) false positive results. The genes that are differentially expressed
seem evenly distributed over the genome and there appears to be no enrichment
for affected genes in the QTL area on GGA4. However, there are 12
differentially expressed genes (FDR 20%) that map to the QTL region and
should be considered positional candidate genes for the QTL. Among these,
AADAT (FDR 5%) is involved in lysine degradation, lysine biosynthesis and
tryptophan metabolism, making it a promising candidate gene. At present, we
are performing pathway analyses to see what pathways are enriched for
differentially expressed genes, thus providing further clues on the way in
which the QTL affects body mass. Further annotation of the microarray and
dedicated pathway databases for chicken will further improve the
characterization of this QTL. This demonstrates how a focussed study can aid
the dissection of a QTL using limited resources.
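The false-positive bounds quoted above follow from the definition of the
false discovery rate as the expected proportion of false positives among the
results declared significant. A minimal check, with a function name of our
own choosing:

```python
def expected_false_positives(n_significant, fdr):
    """Expected number of false discoveries among n_significant results."""
    return n_significant * fdr


strict = expected_false_positives(45, 0.05)
relaxed = expected_false_positives(895, 0.20)
```

This gives about 2.25 expected false positives among the 45 genes and about
179 among the 895, in line with the bounds of 3 and 180 given in the text.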
CONCLUDING REMARKS
Although the existing eQTL studies demonstrate the utility of genetical
genomics, they do not show its full potential because they miss many moderate
effects and provide little opportunity to unravel genetic pathways due to a
lack of trans-acting effects that would provide tangible links between eQTL
and genes. Targeted and Integrated Mapping is applicable to any species for
which populations with a few thousand or more pedigreed individuals can be
accessed and has distinct advantages over an untargeted genome scan for eQTL.
If the targeted eQTL study identifies cis-acting eQTL underlying the
functional QTL, this provides a direct route to the candidate loci
controlling the functional QTL [13,16]. Targeted and Integrated Mapping is
specifically aimed at unravelling genetic pathways underlying a functional
QTL; by contrast, non-targeted studies of similar size would identify eQTL
relating to many pathways, but with too few interconnected QTL to reconstruct
a pathway. For example, the studies on BXD mice consider a very wide range of
phenotypes and gene expression measures, but limited statistical power
reduces the number of meaningful inferences that can be drawn [19,20]. With a
Targeted and Integrated Mapping approach the fine mapping will reduce the
size of the region containing the functional QTL, which in turn can be used
to re-evaluate the eQTL that map to the functional QTL, further refining the
list of potential candidate genes and the possible gene networks underlying
the functional QTL.
While the utility of inbred resources like RI lines for fine mapping and
(e)QTL mapping has been demonstrated elsewhere [19,20,22], we want to
emphasize that genetical genomics should not be restricted to model species
and we make the case that poultry is very well placed among livestock species
to pioneer these approaches.
Acknowledgements. The authors acknowledge financial support from the BBSRC.
ReIerence List
1 Doerge.R.W. (2002) Mapping and analysis
oI quantitative trait loci in experimental
populations. Nat. Rev. Genet. 3. 43-52
2 Andersson.L. and Georges.M. (2004)
Domestic-animal genomics: deciphering
the genetics oI complex traits. Nat. Rev.
Genet. 5. 202-212
3 Andersson.L. (2001) Genetic dissection oI
phenotypic diversity in Iarm animals.
Nat. Rev. Genet. 2. 130-138
4 Hocking.P.M. (2005) Review oI QTL
results in chicken. Worlds Poultrv
Science Journal 61. 215-226
5 Flint.J. and Mott.R. (2001) Finding the
molecular basis oI quantitative traits:
successes and pitIalls. Nat. Rev. Genet.
2. 437-445
6 Darvasi.A. (1998) Experimental strategies
Ior the genetic dissection oI complex
traits in animal models. Nat. Genet. 18.
19-24
7 Cardon.L.R. and Bell.J.I. (2001)
Association study designs Ior complex
diseases. Nat. Rev. Genet. 2. 91-99
8 Jansen.R.C. and Nap.J.P. (2001) Genetical
genomics: the added value Irom
segregation. Trends Genet. 17. 388-391
9 Jansen.R.C. (2003) Studying complex
biological systems using multiIactorial
perturbation. Nat. Rev. Genet. 4. 145-
151
10 Hitzemann.R. et al. (2003) A strategy Ior
the integration oI QTL. gene
expression. and sequence analyses.
Mamm. Genome 14. 733-747
11 Carlborg.O. et al. (2004) Simultaneous
mapping oI epistatic QTL in chickens
reveals clusters oI QTL pairs with
similar genetic eIIects on growth.
Genet. Res. 83. 197-209
12 de Koning.D.J. and Haley.C.S. (2005)
Genetical genomics in humans and
model organisms. Trends Genet. 21.
377-381
13 Wayne.M.L. and McIntyre.L.M. (2002)
Combining mapping and arraying: An
approach to candidate gene
identiIication. Proc. Natl. Acad. Sci. U.
S. A 99. 14903-14906
14 Valleio.R.L. et al. (1998) Genetic mapping
oI quantitative trait loci aIIecting
susceptibility to Marek's disease virus
induced tumors in F2 intercross
chickens. Genetics 148. 349-360
15 Yonash.N. et al. (1999) High resolution
mapping and identiIication oI new
8
90
quantitative trait loci (QTL) aIIecting
susceptibility to Marek's disease. Anim
Genet. 30. 126-135
16 Liu.H.C. et al. (2001) A strategy to
identiIy positional candidate genes
conIerring Marek's disease resistance by
integrating DNA microarrays and
genetic mapping. Anim Genet. 32. 351-
359
17 Liu.H.C. et al. (2001) Growth hormone
interacts with the Marek's disease virus
SORF2 protein and is associated with
disease resistance in chicken. Proc.
Natl. Acad. Sci. U. S. A 98. 9203-9208
18 Liu.H.C. et al. (2003) IdentiIication oI
chicken lymphocyte antigen 6 complex.
locus E (LY6E. alias SCA2) as a
putative Marek's disease resistance gene
via a virus-host protein interaction
screen. Cvtogenet. Genome Res. 102.
304-308
19 Bystrykh.L. et al. (2005) Uncovering
regulatory pathways that aIIect
hematopoietic stem cell Iunction using
'genetical genomics'. Nat. Genet. 37.
225-232
20 Chesler.E.J. et al. (2005) Complex trait
analysis oI gene expression uncovers
polygenic and pleiotropic networks that
modulate nervous system Iunction. Nat.
Genet. 37. 233-242
21 Schadt, E.E. et al. (2003) Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297-302
22 Hubner, N. et al. (2005) Integrated transcriptional profiling and linkage analysis for identification of genes underlying disease. Nat. Genet. 37, 243-253
23 Brem, R.B. et al. (2002) Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752-755
24 Yvert, G. et al. (2003) Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat. Genet. 35, 57-64
25 Brem, R.B. and Kruglyak, L. (2005) The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc. Natl. Acad. Sci. U. S. A. 102, 1572-1577
26 Kirst, M. et al. (2005) Genetic architecture of transcript-level variation in differentiating xylem of a eucalyptus hybrid. Genetics 169, 2295-2303
27 Monks, S.A. et al. (2004) Genetic inheritance of gene expression in human cell lines. Am. J. Hum. Genet. 75, 1094-1105
28 Morley, M. et al. (2004) Genetic analysis of genome-wide variation in human gene expression. Nature
29 Brem, R.B. et al. (2005) Genetic interactions between polymorphisms that affect gene expression in yeast. Nature 436, 701-703
30 Van Laere, A.S. et al. (2003) A regulatory mutation in IGF2 causes a major QTL effect on muscle growth in the pig. Nature 425, 832-836
31 Carlborg, O. and Haley, C.S. (2004) Epistasis: too often neglected in complex trait studies? Nat. Rev. Genet. 5, 618-625
32 Thaller, G. and Hoeschele, I. (2000) Fine-mapping of quantitative trait loci in half-sib families using current recombinations. Genet. Res. 76, 87-104
33 Ronin, Y. et al. (2003) High-resolution mapping of quantitative trait loci by selective recombinant genotyping. Genetics 164, 1657-1666
34 Jansen, R.C. (1992) On the selection for specific genes in doubled haploids. Heredity 69, 92-95
35 Hillier, L.W. et al. (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695-716
36 Wong, G.K. et al. (2004) A genetic variation map for chicken with 2.8 million single-nucleotide polymorphisms. Nature 432, 717-722
37 Fu, J. and Jansen, R.C. (2006) Optimal design and analysis of genetic studies on gene expression. Genetics 172, 1993-1999
38 Cabrera, C.P., Dunn, I.C., Fell, M., Wilson, P.W., Burt, D.W., Waddington, D., Talbot, R.T., Hocking, P.M., Law, A., Haley, C.S., Knott, S.A. and de Koning, D.J. (2006) Application of genetical genomics to a marked QTL in poultry. Proceedings of the 8th WCGALP, Brazil, in press
39 Lynch, M. and Walsh, J.B. (1998) Genetics and analysis of complex traits. Sinauer Associates, Inc.
[Figure 1: flow diagram. (a) Large segregating population. (b) Molecular typing of population for known QTL. (c) Individuals that are homozygous for target region(s): experimental challenge, tissue sampling, gene expression analysis and genome-wide genotyping, followed by eQTL mapping (plot of test statistic against position in cM). (d) Individuals that are recombinant for target region(s): additional phenotyping and (selective) recombinant genotyping, followed by fine mapping (plot of test statistic against position in cM). (e) Positional candidate genes in target regions, effects of QTL on gene expression in cis and trans; eQTL fine mapping.]
Figure 1. Targeted and Integrated Mapping. The design requires a resource population of a few thousand individuals or more (a) that is segregating for phenotypic traits of interest and in which QTL affecting these traits have been discovered or confirmed. (b) The entire population is genotyped for markers flanking the previously identified QTL. Individuals that are homozygous for the region(s) of interest (c) will be used for tissue collection and gene expression analyses, possibly following an experimental challenge. They will also be genotyped for markers spanning the entire genome. This will identify whether a marked QTL region affects expression of genes in the same area as the QTL (in cis), affects the expression of genes elsewhere in the genome (in trans), or interacts with other marked QTL regions or other regions of the genome. Individuals that are recombinant for the QTL region(s) (d) can be further phenotyped for the trait of interest. Combined with a genotyping strategy that is aimed at identifying all recombinants in the QTL region(s), the QTL region can be narrowed down. If experimentally feasible, a subset of the individuals that were used for fine mapping could be used for limited gene expression analysis in order to validate the eQTL results via eQTL fine mapping.
[Figure 2: four panels.
A) Power to detect eQTL for different experimental sizes. x-axis: eQTL effect (phenotypic SD), 0.1 to 0.6; y-axis: power to detect eQTL, 0 to 1; series: 200 F2 (100), 300 F2 (150), 400 F2 (200), 500 F2 (250).
B) Required size of resource population. x-axis: number of QTL targeted for genetical genomics (1 QTL, 2 QTL, 3 QTL); y-axis: F2 required, 0 to 18000; series: 200, 300, 400 and 500 F2 for eQTL.
C) Predicted confidence intervals for fine mapping. x-axis: total F2 size, 0 to 25000; y-axis: confidence interval of QTL (cM), logarithmic scale from 0.10 to 100.00; series: QTL effect (SD) of 0.3, 0.4, 0.5, 0.6 and 0.7.
D) Total genotypings for targeted design. x-axis: resource population (F2), 5,000 to 20,000; y-axis: total genotypings, 0 to 300,000; series: 1 QTL, 2 QTL, 3 QTL.]
Figure 2. Statistical power and precision for Targeted and Integrated Mapping. A) Power to detect eQTL at P = 0.001 (LOD 3.0) for different eQTL effects and F2 population sizes [39]. Between brackets is the equivalent number of selected F2 that are homozygous for the QTL, or of RI lines, for the same statistical power. B) The size of the resource population that is required to obtain a given number of F2 individuals that are homozygous (with the same line origin) for 1, 2 or 3 functional QTL with a confidence interval of 20 cM, taking into account potential deviations from Mendelian ratios. The numbers are based on a 95% probability of having the required number of animals homozygous for the QTL [34]. C) The expected resolution (confidence interval) from fine mapping given the size of the resource population and the original complex trait QTL effect [6]. D) The total number of genotyping experiments for the combined strategy targeting 1, 2 or 3 QTL with an initial confidence interval of 20 cM, using a golden section / half section selective genotyping strategy on the 25% top and tails of the resource population [33]. The genotyping for the eQTL, assuming a genome scan, is fixed at 120,000 (i.e. 200 individuals for 600 markers, 300 for 400 markers etc.). The concept is illustrated for an F2 experimental cross, based on the outline that is presented in Figure 1.
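The counting argument behind panel B can be sketched in a few lines. This is a minimal sketch and not the calculation used for the figure: it assumes that each targeted QTL is homozygous with probability 0.5 in an F2 (either line origin), that the QTL segregate independently, and it ignores recombination inside the 20 cM interval and deviations from Mendelian ratios, both of which the caption says were taken into account. The function names are hypothetical.

```python
import math

def binom_tail_at_least(n_trials, p, k):
    """P(X >= k) for X ~ Binomial(n_trials, p), with the pmf evaluated in log space."""
    if k <= 0:
        return 1.0
    if k > n_trials:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    below = 0.0
    for x in range(k):
        log_pmf = (math.lgamma(n_trials + 1) - math.lgamma(x + 1)
                   - math.lgamma(n_trials - x + 1)
                   + x * log_p + (n_trials - x) * log_q)
        below += math.exp(log_pmf)
    return max(0.0, 1.0 - below)

def required_f2(n_needed, n_qtl, p_per_qtl=0.5, prob=0.95):
    """Smallest F2 size giving at least n_needed individuals homozygous at all
    n_qtl targeted QTL with probability >= prob (assumptions in the lead-in)."""
    p = p_per_qtl ** n_qtl
    if p >= 1.0:
        return n_needed
    lo, hi = n_needed, n_needed
    # Bracket the answer, then bisect: the tail probability increases with N.
    while binom_tail_at_least(hi, p, n_needed) < prob:
        hi *= 2
    while lo < hi:
        mid = (lo + hi) // 2
        if binom_tail_at_least(mid, p, n_needed) >= prob:
            hi = mid
        else:
            lo = mid + 1
    return lo

if __name__ == "__main__":
    for n_qtl in (1, 2, 3):
        print(n_qtl, "QTL:", required_f2(100, n_qtl), "F2 individuals")
```

Because the probability of being homozygous at every targeted QTL shrinks geometrically with the number of QTL, the required population grows quickly, which is the qualitative pattern the panel describes.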
On the mapping of 2 QTLs in the same marker interval
using multiple interval mapping and a moment method
Mayer, M.
Research Unit Genetics and Biometry
Research Institute for the Biology of Farm Animals (FBN)
Dummerstorf, Germany
Theoretical studies have shown that, when using least squares, it is impossible to map
multiple QTL within the same marker bracket. Maximum likelihood can separate the
locations and effects, although it is known that the amount of information contained in the
distribution of the data is small relative to the amount of information contained in the mean
marker contrasts. The joint conditional probabilities of QTL genotypes for two putative QTL
within a marker interval were derived. A simulation study on the mapping of 2 closely linked
QTL within the same marker interval, using multiple interval mapping and the joint
conditional probabilities, was performed. As a two-step moment method is supposed to offer
simplicity and computational efficiency in detecting closely linked QTL,
this moment method was also included in the study and compared with multiple interval
mapping.
Three-locus haplotype probabilities for multiple-strain RIL
Friedrich Teuscher
Research Unit Genetics and Biometry, Research Institute for the Biology of Farm Animals
FBN, Dummerstorf, Germany
Recombinant inbred lines (RIL) derived from multiple inbred strains can serve as a powerful
resource for the genetic dissection of complex traits. The use of such multiple-strain RIL
requires a detailed knowledge of the haplotype structure in such lines. Broman (2005) derived
the two- and three-point haplotype probabilities for $2^n$-way RIL; the former required hefty
computation to infer the symbolic results, and the latter were strictly numerical. Teuscher and
Broman (2007) describe an approach for the calculation of these probabilities, which allowed
them to derive the symbolic form of the three-point haplotype probabilities. The technique is
explained here. It is also shown that the legendary two-strain results of Haldane and
Waddington (1931) can be derived in a much simpler way.
BROMAN, KW; 2005 The genomes of recombinant inbred lines. Genetics 169: 1133-1146.
HALDANE, JBS and CH WADDINGTON; 1931 Inbreeding and linkage. Genetics 16: 357-374.
TEUSCHER, F and KW BROMAN; 2007 Haplotype probabilities for multiple-strain RIL. Genetics 175: 1-8.
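For orientation, the two-strain results mentioned in the abstract have a compact closed form. The sketch below encodes the standard Haldane and Waddington (1931) relations between the per-meiosis recombination fraction r and the probability R that two loci carry alleles from different founder strains in a fully inbred two-way RIL: R = 2r/(1+2r) under selfing and R = 4r/(1+6r) under sibling mating. The function name and its interface are illustrative and not taken from the abstract.

```python
def ril_recombinant_fraction(r, mating="sib"):
    """Two-locus recombinant fraction R at fixation in a two-way RIL.

    r      : recombination fraction per meiosis, 0 <= r <= 0.5
    mating : "self" for selfing, "sib" for brother-sister mating
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie in [0, 0.5]")
    if mating == "self":
        return 2.0 * r / (1.0 + 2.0 * r)
    if mating == "sib":
        return 4.0 * r / (1.0 + 6.0 * r)
    raise ValueError("mating must be 'self' or 'sib'")

if __name__ == "__main__":
    for r in (0.01, 0.1, 0.5):
        print(r, ril_recombinant_fraction(r, "self"), ril_recombinant_fraction(r, "sib"))
```

For tightly linked loci the map is expanded roughly twofold by selfing and fourfold by sibling mating, while unlinked loci (r = 0.5) stay at R = 0.5 in both cases.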
Haplotype sampling in crossbred populations
Albart Coster, Albart.Coster@wur.nl
March 6, 2007
Abstract
Current methods for haplotype inference without pedigree information assume random-mating
populations. Crossbred populations, however, are non-random-mating. Haplotype
frequencies in both parental populations provide information on haplotype frequencies in
the crossbred population. We introduce a new method for haplotype inference without pedigree
that does not need to rely on the assumption of random mating and that can use genotype
data of the parental lines and of the crossbred line in the inference. The method uses a
Bayesian framework with a Dirichlet Process as prior for the haplotypes in the population. The
basic idea is that only a subset of the whole set of possible haplotypes is present in the
population. Both haplotypes corresponding to an individual are sampled jointly. Running the
algorithm on simulated data showed that it is comparable to other methods in random-mating
populations and that it outperforms them in non-random-mating populations. The method
is robust against incomplete data, i.e. only a sample of parental genotypes is needed in the
inference of crossbred haplotype frequencies.
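The role of the Dirichlet Process prior, namely that only a subset of all possible haplotypes is expected to be present, can be illustrated with its Chinese restaurant representation. The following is a minimal sketch of draws from such a prior only; it is not the author's sampler, which samples both haplotypes of an individual jointly given the genotype data. The concentration parameter `alpha`, the uniform base distribution over binary haplotypes and all function names are assumptions made for the illustration.

```python
import random
from collections import Counter

def sample_haplotype(counts, alpha, new_haplotype, rng):
    """Draw one haplotype: a new one with probability alpha / (n + alpha),
    otherwise an already observed one with probability proportional to its count."""
    n = sum(counts.values())
    if rng.random() < alpha / (n + alpha):
        return new_haplotype(rng)
    target = rng.random() * n
    cumulative = 0
    for hap, count in counts.items():
        cumulative += count
        if target < cumulative:
            return hap
    return hap  # guard against floating-point round-off

def simulate_haplotypes(n_draws=200, n_loci=6, alpha=1.0, seed=3):
    """Sequentially draw haplotypes and return how often each one was seen."""
    rng = random.Random(seed)
    # Hypothetical base distribution: uniform over binary haplotypes of n_loci SNPs.
    base = lambda r: tuple(r.randint(0, 1) for _ in range(n_loci))
    counts = Counter()
    for _ in range(n_draws):
        counts[sample_haplotype(counts, alpha, base, rng)] += 1
    return counts

if __name__ == "__main__":
    counts = simulate_haplotypes()
    print(len(counts), "distinct haplotypes out of", 2 ** 6, "possible")
```

Small values of `alpha` concentrate the draws on few haplotypes; larger values let more of the possible haplotypes appear.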
Modelling and optimizing a
dynamic selection breeding
scheme
Anne Devalle, Carole Moreno, Zulma Vitezica, Jean Michel Elsen
QTL MAS
22-23 March 2007
Introduction
Context
An additional selection criterion based on the effect of a gene on
a new trait is introduced within an existing selection scheme.
The genetic progress on the basic polygenic traits is reduced by
this new selection objective
Aim: to create a mathematical model allowing a global optimization
in order to maximize the frequency of the desired genotype while
minimizing the loss of genetic progress
To optimise genotypic selection while
minimizing the loss of genetic progress
The population is structured in groups defined by:
Sex (females and males)
Age (1 to 6 years)
Category (elite or not)
genotype (1, 2 or 3)
Each group is characterised by its relative size and
average breeding value
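Selection scheme
[Diagram: from generation t to generation t+1. Boxes: Young males, Young females; Progeny test; Not elite males, Elite males; Not elite females, Elite females; Assortative mating; Young males and Young females of generation t+1. Selection rates on the arrows: P, R, s and 1-s for males; Q, v and 1-v for females.]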
The concept
The variables and the parameters of year t influence those of the year
t+1, those of t+1 have an impact on those of t+2 and finally those of
the year T (last year of selection) have an impact on the objective
function
It is necessary to consider all these variables and parameters
globally and not step by step
Males selection rates for one selection year
[Diagram: males aged 1 year, 2 years, 3 years, and 4 years and more. Genotypic and genetic value selection with rates $p_{t1}$, $p_{t2}$, $p_{t3}$ (boxes $P$ and $1-P$), followed by genotypic and genetic value selection with rates $r_{t1}(1-s_{t1})$, $r_{t2}(1-s_{t2})$, $r_{t3}(1-s_{t3})$ (box $R(1-s)$) and $r_{t1}s_{t1}$, $r_{t2}s_{t2}$, $r_{t3}s_{t3}$ (box $Rs$), with box $1-R$.]
9 parameters ($p_{t1}, p_{t2}, p_{t3}, r_{t1}, r_{t2}, r_{t3}, s_{t1}, s_{t2}, s_{t3}$) to estimate
$P$, $R$ and $s$ constant for each year of selection
Example:
$$p_{t3} = \frac{P - p_{t1}\, f_{110(t-1)1} - p_{t2}\, f_{110(t-1)2}}{f_{110(t-1)3}}$$
6 parameters to optimise and 3 to calculate
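The example on this slide obtains the third genotype-specific rate from the constraint that fixes the overall male selection rate. A minimal sketch, assuming that the overall rate P is the frequency-weighted sum of the genotype-specific rates, P = p1*f1 + p2*f2 + p3*f3, where f1, f2, f3 are the genotype frequencies among the candidates. The function name is hypothetical, and the numbers in the usage lines are only illustrative inputs.

```python
def third_selection_rate(P, p1, p2, freqs):
    """Solve P = p1*f1 + p2*f2 + p3*f3 for p3.

    P      : overall selection rate imposed on the group
    p1, p2 : selection rates chosen for genotypes 1 and 2 (free parameters)
    freqs  : (f1, f2, f3), genotype frequencies among the candidates
    """
    f1, f2, f3 = freqs
    if f3 <= 0.0:
        raise ValueError("genotype 3 must be present to solve for its rate")
    p3 = (P - p1 * f1 - p2 * f2) / f3
    if not 0.0 <= p3 <= 1.0:
        raise ValueError("no feasible rate for genotype 3 with these inputs")
    return p3

if __name__ == "__main__":
    freqs = (0.297, 0.496, 0.207)
    print(third_selection_rate(0.225, 0.5, 0.1, freqs))
```

Applying the same constraint to the other rates leaves six parameters free and three determined, which matches the count given on the slide.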
Females selection rates for one selection year
[Diagram: females aged 1 year to 6 years. Genetic value selection with a single selection threshold: boxes $Q$ and $1-Q$, then $Q(1-v)$ and $Qv$ at each subsequent age, with genotype-specific rates $q_{t1}(1-v_{t1})$, $q_{t2}(1-v_{t2})$, $q_{t3}(1-v_{t3})$ and $q_{t1}v_{t1}$, $q_{t2}v_{t2}$, $q_{t3}v_{t3}$.]
6 parameters to calculate according to genetic value
Which type of mating?
In the selection scheme, we can consider different mating types
Random Mating
Mating according to genetic value
Assortative mating according to genotype
Mating
Young males will be born from mating between elite males and elite females
Young females will be born from mating between :
Elite males/ elite females
Not elite males / not elite females
Elite males / not elite females
Males in progeny test / not elite females
Assortative mating
Matrix of the parameters $q_{ijt}$ (columns: first index; rows: second index):
        3         2         1
3    q_{33t}   q_{23t}   q_{13t}
2    q_{32t}   q_{22t}   q_{12t}
1    q_{31t}   q_{21t}   q_{11t}
Constrained by the dams' proportions of genotype 1, genotype 2 and genotype 3
Constrained by the sires' proportions
9 parameters to optimise, constrained by the parents' genotypic frequencies
4 parameters to optimise and 5 to calculate
Genetic algorithm and objective function
Resolution of a complex problem:
Multiple and complex constraints
Classical resolution methods are not used (no derivative of the objective function)
Many parameters to optimise (10*T parameters)
Maximisation of the frequencies of resistant females weighted by age,
constrained by a loss of ΔG for the quantitative trait lower than a threshold
Threshold: percentage of the achievable genetic progress that one wishes to preserve
[Equation: objective function $F_{obj}$. The formula cannot be recovered from the extraction. Surviving fragments: sums over the age classes $a$ = 1 to 6, female frequencies $f$ at the final year $T$ by genotype $g$, the loss of genetic progress $\Delta G$, and two cases separated by "if ... otherwise".]
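The slide names a genetic algorithm as the optimiser because the derivative of the objective function is not used. The sketch below is a generic real-coded genetic algorithm applied to a toy penalised objective, meant only to illustrate the idea. The population size, tournament size, mutation scale, penalty form and the toy objective itself are all assumptions for the illustration; none of them is taken from the authors' model.

```python
import random

def toy_fitness(x, max_loss=0.04):
    """Toy stand-in for the objective: reward the mean 'selection pressure' m,
    penalise it when a made-up loss 0.1*m^2 exceeds the accepted loss."""
    m = sum(x) / len(x)
    loss = 0.1 * m * m
    if loss <= max_loss:
        return m
    return m - 10.0 * (loss - max_loss)

def genetic_algorithm(fitness, n_params, pop_size=40, n_gen=60, mut_sd=0.1, seed=0):
    """Maximise fitness over parameters in [0, 1] without using derivatives."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(n_gen):
        new_pop = [best[:]]  # elitism: the best individual always survives
        while len(new_pop) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            # Uniform crossover followed by Gaussian mutation, clipped to [0, 1].
            child = [min(1.0, max(0.0, (u if rng.random() < 0.5 else v)
                                  + rng.gauss(0.0, mut_sd)))
                     for u, v in zip(a, b)]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=fitness)
    return best, fitness(best)

if __name__ == "__main__":
    solution, value = genetic_algorithm(toy_fitness, n_params=6)
    print([round(v, 3) for v in solution], round(value, 4))
```

In the toy problem the constrained optimum lies where the loss equals the accepted loss, so the best fitness cannot exceed the square root of 0.4; the algorithm approaches that bound using only fitness evaluations.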
Application: selection for resistance to scrapie
Initial frequencies: genotype 1: 0.297; genotype 2: 0.496; genotype 3: 0.207
genotype 1 = resistant, genotype 2 = intermediate, genotype 3 = susceptible
Other table entries, in extraction order (the pairing of labels and values is not recoverable):
0.89, 0.77, 0.56 (Males, Females); 30% c; 20% R; 10% v; P, Q: 22.5% (Males, Females); 0.73; Estimation error of selection index; 36 σ genetic standard deviation; 80% Selection rate
Selection scheme during 13 years
Results
Without constraint on the genetic progress
total disappearance of susceptible genotype after 13 years
87% of the young females are of resistant genotype after 13 years
88% of the young males
All breeding animals carry the homozygous resistant genotype
Genetic progress loss= 6.4%
With constraint on the genetic progress
accepted loss of genetic progress=4%
1% of susceptible genotype after 13 years
77% of the young females are of resistant genotype after 13 years
78% of the young males
All breeding animals have the resistant genotype
Simplification of the selection scheme
Simplifications are possible without loss in the objective function: constant
selection rate for elite males and random mating
[Bar chart: vertical axis from 0.6 to 0.76; bars for five schemes: initial; constant rate before progeny test; constant rate relating to elite males; random mating; constant rate for elite and random mating.]
Conclusion
Main features of the model
This model takes into account 2 traits (production trait and secondary
trait)
The objective function co-optimises genotypic frequencies and genetic
progress
The derivative is not used
Main results
Rather long computing time
To be used as a gold standard when exploring simpler strategies
Projects
Comparison with Dekkers et al. modelling
Extension to other parameters
Thanks for your attention
Dense map Sparse map Conclusion
Rejection thresholds in QTL detection
Charles-Elie Rabier
Céline Delmas, Jean-Michel Elsen
Station d'Amélioration Génétique des Animaux
INRA Toulouse
3/23/07
Rabier, Delmas, Elsen Rejection thresholds in QTL detection
The notion of threshold
This palm tree seems to be very tall
Is it significant?
Outline
1 The dense map
Interval Mapping
Result of Lander & Botstein (1989) / Cierco (1996)
Population with family structure
Is it useful to consider the dense map?
2 The sparse map
Test performed only on markers
Genome scan using only the closest marker
3 Conclusion
Interval Mapping
$H_0$: $q = 0$ (no QTL) vs $H_1$: $q \neq 0$
Likelihood ratio test at a position $x$:
$$T_x = 2 \ln \frac{L_x(\hat{\mu}, \hat{q}, \hat{\sigma}^2)}{L(\hat{\mu}, 0, \hat{\sigma}^2)}$$
$T_{x_1}, T_{x_2}, \ldots, T_{x_K}$ define a process $T(\cdot)$
Result of Lander & Botstein (1989) / Cierco (1996)
Hypotheses:
all markers informative
dense map
$N \to \infty$ ($N$: number of individuals)
$T(\cdot)$ converges to the square of an Ornstein-Uhlenbeck process under $H_0$
Definition (Ornstein-Uhlenbeck process)
An O.U. process is a stationary Gaussian process with mean 0, variance 1 and covariance function
$$r(t) = \exp(-2|t|)$$
Population with family structure
$I$ sires, each with $J$ progeny
$s_i$: polygenic effect of sire $i$
$q_i$: QTL effect of the allele present on the first chromosome of sire $i$
$H_0$: $q_1 = \ldots = q_I = 0$ vs $H_1$: $\exists\, i,\ q_i \neq 0$
Likelihood ratio test at a position $x$:
$$T_x = 2 \ln \frac{L_x(\hat{s}_1, \ldots, \hat{s}_I, \hat{q}_1, \ldots, \hat{q}_I, \hat{\sigma}^2)}{L(\hat{s}_1, \ldots, \hat{s}_I, 0, \ldots, 0, \hat{\sigma}^2)}$$
Population with family structure
Hypotheses:
all markers informative
dense map
$J \to \infty$ ($J$: number of progeny per sire)
$T(\cdot)$ converges to an Ornstein-Uhlenbeck Chi-Square process under $H_0$
Definition (Ornstein-Uhlenbeck Chi-Square process)
Let $Z_1(\cdot), \ldots, Z_I(\cdot)$ be $I$ independent O.U. processes.
$Y(t) = \sum_{i=1}^{I} (Z_i(t))^2$ is named an O.U.C.S process
Ornstein Uhlenbeck
FIG.: 2 trajectories of the square
of an O.U process
FIG.: 2 trajectories of an O.U.C.S
process, I=5
Is it useful to consider the dense map?
Interval Mapping statistic: $T_{IM} = \sup_x T_x$
Thresholds at the 5% level for $T_{IM}$ (100 000 trajectories, L = 1 M):
d          O.U.C.S(1)   O.U.C.S(5)
10^-5      9.1696       18.8718
10^-4      9.0830       18.7576
10^-3      8.9230       18.5648
d: distance between markers (in Morgan)
the map will never be dense
Test performed only on markers
K markers (equally spaced for the illustration)
d: distance between 2 adjacent markers
$x_1, \ldots, x_K$: marker positions
only 1 family
sparse map ⇒ square of a Discrete O.U. process
$$\begin{pmatrix} T_{x_1} \\ \vdots \\ T_{x_K} \end{pmatrix} \xrightarrow[J \to \infty]{H_0} W^2 \quad \text{with} \quad W \sim N\left( \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \cdots & \rho^{K-1} \\ \vdots & \ddots & \vdots \\ \rho^{K-1} & \cdots & 1 \end{pmatrix} \right) \quad \text{and} \quad \rho = e^{-2d}$$
Test performed only on markers
I families ⇒ Discrete O.U.C.S(I) process
Application for I = 5
d = 0.1 cM
L = 0.50 M
100 000 trajectories of the Discrete O.U.C.S(5) simulated
threshold = 16.8391
Thresholds obtained with classical methods (40 000 populations simulated under $H_0$):
J      classical methods
100    16.94
200    16.841
the Discrete O.U.C.S(5) is 1680 times faster!
4 minutes vs almost 4 days
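The threshold computation described on this slide can be sketched as a direct simulation. A minimal sketch, assuming equally spaced markers so that each family's Gaussian vector W has correlation ρ^|i-j| with ρ = exp(-2d), which an AR(1) recursion reproduces; the test statistic at each marker is the sum over families of the squared values, and the threshold is the empirical quantile of its maximum over markers. The function name, the default number of trajectories and the seed are assumptions of the sketch.

```python
import math
import random

def simulate_threshold(n_families, n_markers, d_morgan,
                       n_traj=2000, level=0.05, seed=1):
    """Empirical (1 - level) quantile of the maximum over markers of a
    discrete O.U.C.S(n_families) process with marker spacing d_morgan."""
    rng = random.Random(seed)
    rho = math.exp(-2.0 * d_morgan)
    innov_sd = math.sqrt(1.0 - rho * rho)
    maxima = []
    for _ in range(n_traj):
        # One stationary AR(1) chain per family, started from N(0, 1).
        z = [rng.gauss(0.0, 1.0) for _ in range(n_families)]
        sup = sum(v * v for v in z)
        for _ in range(n_markers - 1):
            z = [rho * v + innov_sd * rng.gauss(0.0, 1.0) for v in z]
            sup = max(sup, sum(v * v for v in z))
        maxima.append(sup)
    maxima.sort()
    return maxima[int(math.ceil((1.0 - level) * n_traj)) - 1]

if __name__ == "__main__":
    # Small illustrative run; the slide's setting (I = 5, d = 0.001 M, L = 0.5 M,
    # i.e. 501 markers, 100 000 trajectories) takes much longer in pure Python.
    print(simulate_threshold(n_families=5, n_markers=51, d_morgan=0.01))
```

With a single marker the statistic reduces to a chi-square variable with I degrees of freedom, which gives a quick sanity check of the simulation.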
Genome scan using only the closest marker
Let $M_x$ be the marker closest to $x$
Here: $M_{x_1} = M1$ and $M_{x_2} = M2$
$r_{x_1} = r(M_{x_1}, x_1) = r(M1, x_1)$
$r_{x_2} = r(M_{x_2}, x_2) = r(M2, x_2)$
Genome scan using only the closest marker
$x_1, \ldots, x_K$: positions tested
$T_{x_1}, T_{x_2}, \ldots, T_{x_K}$ define a process $T(\cdot)$
1 family ⇒ $T$ converges to the square of a Z process under $H_0$
Definition (Z process)
A Z process is a Gaussian process with mean 0, variance 1 and covariance function
$$\mathrm{cov}(Z_{x_i}, Z_{x_j}) = \exp(-2\, d(M_{x_i}, M_{x_j}))$$
I families ⇒ $T$ converges to a Y process under $H_0$
Definition (Y process)
Let $H_1(\cdot), \ldots, H_I(\cdot)$ be $I$ independent Z processes.
$Y(t) = \sum_{i=1}^{I} (H_i(t))^2$ is named a Y process
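The covariance of the Z process depends on the tested positions only through their closest markers. The sketch below builds that covariance matrix for a list of tested positions, assuming positions and marker locations are given in Morgan and that ties between two equally close markers go to the first one listed; the function names are hypothetical.

```python
import math

def closest_marker(x, markers):
    """Position of the marker closest to x (first one wins on ties)."""
    return min(markers, key=lambda m: abs(m - x))

def z_covariance(positions, markers):
    """Covariance matrix of the Z process at the tested positions:
    cov(Z_xi, Z_xj) = exp(-2 * distance between the closest markers)."""
    nearest = [closest_marker(x, markers) for x in positions]
    return [[math.exp(-2.0 * abs(mi - mj)) for mj in nearest] for mi in nearest]

if __name__ == "__main__":
    markers = [0.0, 0.4, 0.8, 1.2]   # L = 1.2 M, d = 0.4 M as in the slide's figure
    positions = [0.10, 0.15, 0.50]   # tested positions (illustrative)
    for row in z_covariance(positions, markers):
        print([round(v, 4) for v in row])
```

Positions that share the same closest marker are perfectly correlated, so trajectories of the resulting process are piecewise constant between the midpoints of adjacent markers.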
Genome scan using only the closest marker
FIG.: A trajectory of the Y process (L=1.2 M, d=0.4 M, I=5)
Conclusion
How can rejection thresholds be obtained?
the dense map has to be given up
we must consider the sparse map to obtain thresholds
Sparse map:
test on markers ⇒ D.O.U.C.S(I) process
genome scan using the closest marker ⇒ Y process
classical IM: a non-stationary process, not presented here
Properties of different phenotypic measures for estimating QTL
variance components and MA-BLUP EBV
S. Neuner¹, R. Emmerling¹, G. Thaller² and K.-U. Götz¹
¹ Bavarian State Research Centre for Agriculture, Institute of Animal Breeding, D-85580 Grub
² Christian-Albrechts-University, Institute of Animal Breeding and Husbandry, D-24098 Kiel
Introduction
Reliable estimates for variance components in QTL models are important for fine mapping
experiments and MA-BLUP evaluations in breeding programs using marker assisted selection
(MAS). In cattle populations, in many cases only a small fraction of the population will be
genotyped at genetic markers. As only these genotyped animals provide information for
QTL-specific evaluations, a 'two step approach' was used to estimate QTL variance components
and MA-BLUP EBV, e.g. by Liu et al. (2004) or Druet et al. (2006). First, a polygenic
animal model evaluation is conducted for the entire population to estimate pre-corrected
phenotypes (step 1). These estimates are then used as observations in an MA-BLUP model
for the genotyped animals (step 2). Since the pre-corrected phenotypes may represent
different amounts of information, the problem of weighting the phenotypes in the second step
arises.
This study examines the use of daughter yield deviations (DYD) of bulls and yield deviations
(YD) of cows as observations in MA-BLUP models. Various models were calculated to detect
the best combination of phenotypic measures (DYD, YD) and weighting factors for the
estimation of QTL variance components and MA-BLUP EBV.
Material and Methods
A stochastic simulation model was applied to evaluate different alternatives for estimating
QTL variance components and MA-BLUP EBV in a two step approach. Each simulation
cycle is divided into two phases: data generation and analysis of the simulated data sets.
Data Generation
A dairy cattle breeding scheme with progeny testing and use of bulls for the
generation of second crop daughters was simulated to generate the data. The time horizon for
the simulation was 16 years, which approximately equals the time span for the collection of
genotypic data in real research projects.
In the current simulation study a single trait model for 305-day milk yield with a heritability
of 0.36 was assumed. The overall breeding value of each animal is the sum of a 'residual
polygenic breeding value' and a 'QTL breeding value'. One biallelic QTL with a
frequency of 0.5 for the favourable allele is linked to a genetic marker. The recombination rate
between QTL and marker locus was fixed at 0.01, and the other assumptions about the marker
locus were also very optimistic: 100 different alleles following a uniform distribution (PIC = 0.9899
and PFIM = 0.9799). Calculations were done for different proportions of the QTL variance
relative to the overall genetic variance (QVR) in the trait investigated: 0, 5, 10, 20, 30, 40,
50 and 90%.
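The variance bookkeeping implied by these settings can be sketched as follows. The additive genetic variance of 260100 is the simulated value listed in Table 1 of this paper; the allele substitution effect is derived under the assumption of a purely additive biallelic QTL, for which the QTL variance is 2p(1-p)a². That assumption and the function name are not stated in the text and are made here only for illustration.

```python
import math

def simulation_variances(sigma2_a, h2, qvr, p=0.5):
    """Split the additive genetic variance into QTL and residual polygenic parts
    and derive the allele substitution effect of an additive biallelic QTL."""
    sigma2_qtl = qvr * sigma2_a
    return {
        "phenotypic": sigma2_a / h2,
        "qtl": sigma2_qtl,
        "polygenic": sigma2_a - sigma2_qtl,
        # Additive QTL, no dominance: var = 2 p (1 - p) a^2
        "allele_effect": math.sqrt(sigma2_qtl / (2.0 * p * (1.0 - p))),
    }

if __name__ == "__main__":
    for name, value in simulation_variances(260100.0, 0.36, 0.30).items():
        print(name, round(value, 1))
```

For a QVR of 30% this reproduces the simulated QTL variance (78030) and residual polygenic variance (182070) reported in Table 1.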
Analysis of simulated data sets
QTL variance components and MA-BLUP EBV were estimated in a two step approach. In the
first step a classical animal model (AM) evaluation was done for the entire population to
estimate DYD for progeny tested bulls and YD for cows in milk. Observations in this step
were phenotypic records of cows. Usually all animals are included in genetic evaluations of
dairy cattle; however, only a small fraction of animals might be genotyped at genetic markers.
These animals are most likely proven bulls, bull dams and selection candidates for progeny
testing. As only genotyped animals can provide information for the estimation of QTL
variance components and MA-BLUP EBV, the second evaluation step was only applied to
this genotyped subset of the population. Observations in step 2 are the DYD and YD calculated in
step 1.
The MA-BLUP model used for the evaluations equals that of Fernando and Grossman (1989):
$$y_i = u_i + v_i^p + v_i^m + e_i$$
where $y_i$ is the record (YD for dams and DYD for sires) of individual $i$, $u_i$ is the residual
polygenic effect of individual $i$, $v_i^p$ and $v_i^m$ are the paternal and maternal gametic effects of
individual $i$, and $e_i$ is the residual.
Additionally, a classical animal model was calculated for this reduced data set. In this model
there was only one predictor for the overall animal effect, and it is henceforth denoted as
'animal model on MA-BLUP records' (AM-MA). For the AM-MA and the MA-BLUP model
various combinations of phenotypic measures (DYD, YD) and weighting factors were
calculated. The evaluations are divided into blocks A and B, which are characterised by the
phenotypic measures used. Block A is similar to the German MA-BLUP system (Liu et al.,
2004), where only DYD are used (DYD models), whereas in block B DYD and YD are used
together (DYD-YD models), as in the French MA-BLUP system (Boichard et al., 2002).
Within the blocks, different weighting factors as described in the literature were applied to
DYD: no weighting, variance of DYD (Bennewitz et al., 2004), effective daughter
contributions (EDC) (Fikse and Banos, 2001; Liu et al., 2004; Szyda et al., 2005) and daughter
equivalents (DE) (Van Raden and Wiggans, 1991; Druet et al., 2006). YD were not weighted
since each cow had only one record in the current study (following Druet, 2006, personal
communication).
The results for the different models are presented in terms of deviations and correlations
between estimated and simulated parameters.
Results
Variance Component Estimation
Criteria to evaluate the variance component estimation are the absolute values of the
estimated components and the estimated QVR. The results are consistent for all simulated
QVR. As an example, the results of a variance component estimation for a simulated QTL
ratio of 30% are shown in table 1. Presented results are the averages of 25 replicates. Results
clearly show that the use of DYD models is superior to that of DYD-YD models. In DYD
models, the best results were obtained by using 'DE' or 'EDC' as weighting factors. In DYD-YD MA-BLUP
models estimated variance components were less accurate; there, the best results were obtained when
the weighting factor 'variance of DYD' was applied. Weighting of information in blocks A and
B was essential. No weighting of daughter information led to a gross overestimation of
variance components.
Table 1: Simulated and estimated (MA-BLUP model) values of variance components for a QVR of 30%
                          Additive genetic      Residual polygenic    QTL variance          QTL variance ratio
                          variance              variance                                    (QVR)
Block       Weighting     simulated  MA-BLUP    simulated  MA-BLUP    simulated  MA-BLUP    simulated  MA-BLUP
A (DYD)     no            260100     310276     182070     226130     78030      84146      0.30       0.271
A (DYD)     DE            260100     267523     182070     189122     78030      78401      0.30       0.293
A (DYD)     var(DYD)      260100     285015     182070     205123     78030      79891      0.30       0.280
A (DYD)     EDC           260100     266468     182070     187311     78030      79157      0.30       0.297
B (DYD-YD)  no            260100     332366     182070     240387     78030      91979      0.30       0.277
B (DYD-YD)  DE            260100     296924     182070     221844     78030      75080      0.30       0.253
B (DYD-YD)  var(DYD)      260100     251005     182070     169885     78030      81120      0.30       0.323
B (DYD-YD)  EDC           260100     295643     182070     220385     78030      75258      0.30       0.255
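The QVR column of Table 1 can be checked against the variance columns, since the QVR is the ratio of the QTL variance to the additive genetic variance. The short sketch below recomputes it from the MA-BLUP estimates in the table; the data structure and function name are only illustrative.

```python
# (block, weighting): (additive genetic variance, QTL variance, reported QVR),
# all taken from the MA-BLUP columns of Table 1.
TABLE_1 = {
    ("A", "no"):       (310276, 84146, 0.271),
    ("A", "DE"):       (267523, 78401, 0.293),
    ("A", "var(DYD)"): (285015, 79891, 0.280),
    ("A", "EDC"):      (266468, 79157, 0.297),
    ("B", "no"):       (332366, 91979, 0.277),
    ("B", "DE"):       (296924, 75080, 0.253),
    ("B", "var(DYD)"): (251005, 81120, 0.323),
    ("B", "EDC"):      (295643, 75258, 0.255),
}

def estimated_qvr(additive_variance, qtl_variance):
    """QTL variance ratio: share of the additive genetic variance due to the QTL."""
    return qtl_variance / additive_variance

if __name__ == "__main__":
    for (block, weight), (add_var, qtl_var, reported) in TABLE_1.items():
        print(block, weight, round(estimated_qvr(add_var, qtl_var), 3), reported)
```

The recomputed ratios agree with the reported QVR to three decimals, which also shows that an overestimated total variance (as without weighting) can still give a QVR near the simulated 0.30.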
Accuracy of MA-BLUP EBV
For each group of animals (bulls, cows and young bulls) correlations between simulated and
estimated breeding values were calculated for all combinations of block x weight x model. As
for the estimation of variance components, the findings are consistent for all simulated QVR.
Correlations between EBV and simulated BV for a QTL ratio of 30% are shown in
table 2. Presented results are the averages of 25 replicates.
Table 2: Correlations between simulated and estimated breeding values for various models using
different phenotypic measures and weighting factors for a QVR of 30%
                          Progeny tested bulls       Cows                       Young bulls
Block       Weighting     AM     AM-MA  MA-BLUP      AM     AM-MA  MA-BLUP      AM     AM-MA  MA-BLUP
A (DYD)     no            0.919  0.907  0.910        0.744  0.467  0.491        0.551  0.488  0.547
A (DYD)     DE            0.919  0.908  0.911        0.744  0.468  0.491        0.551  0.488  0.547
A (DYD)     var(DYD)      0.919  0.907  0.910        0.744  0.467  0.491        0.551  0.488  0.547
A (DYD)     EDC           0.919  0.908  0.911        0.744  0.468  0.491        0.551  0.488  0.547
B (DYD-YD)  no            0.919  0.838  0.843        0.744  0.682  0.69         0.551  0.507  0.567
B (DYD-YD)  DE            0.919  0.910  0.914        0.744  0.708  0.718        0.551  0.543  0.605
B (DYD-YD)  var(DYD)      0.919  0.911  0.915        0.744  0.708  0.718        0.551  0.546  0.608
B (DYD-YD)  EDC           0.919  0.910  0.914        0.744  0.708  0.718        0.551  0.543  0.605
First of all, our results show that weighting is also essential for the estimation of MA-BLUP
EBV. Furthermore, the differences between the correlations are small when a weighting is
applied. Secondly, correlations for proven bulls are nearly unaffected by whether YD are included
in the evaluation models or not. In contrast to proven bulls, correlations for bull dams and
young bulls in the AM-MA and the MA-BLUP model depend strongly on whether YD are included
in the phenotypic measures (block B) or not (block A). If YD are not included, there are no
observations for bull dams and their EBV are based only on pedigree information in the
MA-BLUP data. Since EBV of young bulls are calculated from their parents' EBV, the
advantage of using YD is obvious. The difference in correlations between DYD-YD models
and DYD models is about 0.06 for young bulls and 0.25 for the cows. Thirdly, if only DYD
are used in MA-BLUP evaluations (block A), a QTL of at least 30% QVR is needed to get
accuracies for young bulls in MA-BLUP models that are close to the values found from the
classical AM. In DYD-YD MA-BLUP models (block B), QTL ratios >10% (results not
shown) are required to obtain higher accuracies for young bulls than in ordinary genetic
evaluations.
Lower correlations for all animal groups in AM-MA than for AM in all scenarios (table 2)
indicate that there is a loss of information if two step approaches are applied.
Discussion
Our results show the importance of weighting the daughter information in DYD and DYD-YD
models. If there is no weighting, different numbers of daughters per bull are not accounted
for. This leads to inaccurate estimates of variance components and lower accuracies of
MA-BLUP EBV. Following the results of the current study, the correct choice of the weighting
factor is more important for the estimation of variance components than for the estimation of
MA-BLUP EBV. For the latter, the ratio of the applied variances is more important than their
absolute values if only accuracies are considered. The fact that the highest accuracies are
always obtained if the estimated variance components and ratios are as close as possible to the
simulated parameters shows the importance of correctly chosen weights. Weighting is
especially difficult in DYD-YD models because the scales of the two types of information are
not identical.
Applying two step approaches for MA-BLUP models always causes a loss of information. As
only a small fraction of the population is included in the model, several relationships among
animals are lost in MA-BLUP data sets. The information content for proven bulls decreases
only marginally because DYD are estimated very accurately from many daughters. For cows,
only relationship information and, in the case of DYD-YD models, their own phenotypic records
are available as sources of information. As a consequence, missing relationships have a higher
impact for cows than for bulls. Therefore, even in AM-MA DYD-YD
models accuracies of cows are about 0.04 lower than in AM. In consequence, the loss of
information due to the two step approach and to missing phenotypes of bull dams in DYD
models has to be overcome by an additional source of information: QTL information.
Analyses of MAS applied to practical breeding programs describe the increase in accuracies
of young bulls (Liu et al., 2004; Druet et al., 2005). Liu et al. (2004) described the increase in
accuracy of young German Holstein bulls if two QTL are included as random effects and
DGAT1 as a fixed effect in MA-BLUP evaluation. Correlations increased from 0.45 in the
AM-MA to 0.65 in the MA-BLUP model, but the main effect was due to DGAT1. More
important than comparing results of AM-MA and MA-BLUP models is the superiority of
accuracies from MA-BLUP models over traditional AM. Druet et al. (2005) investigated this
for the French MAS program. In the French MAS system, between 40 and 50% of the variance
for all milk traits is explained by 3 to 5 QTL. Accuracies for milk yield EBV of young bulls
increased from 0.47 to 0.55. Results of our analysis for a 40% QTL show an increase from
0.58 to 0.68. Differences in the level can be explained by different heritabilities and different
designs for the French breeding program and the one assumed in the simulation. A higher
benefit through MAS in the simulation could be expected because only one QTL with
advantageous properties was simulated.
The choice of whether DYD or DYD-YD models are used depends on the intention of the
research: fine mapping or estimation of MA-EBV for MAS. While fine mapping is especially
interested in correct estimates for variance components, MAS requires an increase in
accuracies for MA-EBV of animals without own phenotypic or progeny information.
Therefore, MA-BLUP models for MAS should include both DYD and YD to ensure the
highest possible efficiency of selection.
Conclusions
To estimate QTL variance components in MA-BLUP models, the use of DYD models is the
most appropriate strategy. Weighting is essential and the best results are obtained by using
'DE' or 'EDC' as weighting factors. In DYD-YD MA-BLUP models, estimated variance
components are less accurate.
MA-BLUP evaluations that do not make use of phenotypic data for bull dams will only give
benefits for QTL explaining more than 30% of the additive genetic variance. In DYD-YD
models the data is still incomplete compared to conventional animal model evaluation. To
outweigh the loss of information caused by the two-step approach, a 10% QTL is necessary.
As a consequence of the results of this study, MA-BLUP models used to estimate EBV for
MAS should include DYD and YD to ensure that MAS improves selection efficiency even for
moderate QTL effects.
References
Bennewitz, J.; Reinsch, N.; Paul, S.; Looft, C.; Kaupe, B.; Weimann, C.; Erhardt, G.; Thaller, G.; Kuhn, Ch.;
Schwerin, M.; Thomsen, H.; Reinhardt, F.; Reents, R. and Kalm, E. (2004): The DGAT1 K232A Mutation Is
Not Solely Responsible for the Milk Production Quantitative Trait Locus on the Bovine Chromosome 14. J.
Dairy Sci., 87: 431-442.
Boichard, D.; Fritz, S.; Rossignol, M. N.; Boscher, M. Y.; Malafosse, A. and Colleau, J. J. (2002):
Implementation of Marker Assisted Selection in French Dairy Cattle Breeding. Proc. 7th World Congr. Genet.
Appl. Livest. Prod., Montpellier, France, no. 22.03.
Druet, T.; Fritz, S.; Colleau, J. J.; Gautier, M.; Eggen, A.; Rossignol, M. N.; Boscher, M. Y.; Malafosse, A. and
Boichard, D. (2005): Genetic markers in breeding programs. 26th European Holstein and Red Holstein
Conference, Prague 2005.
Druet, T.; Fritz, S.; Boichard, D. and Colleau, J. J. (2006): Estimation of Genetic Parameters for Quantitative
Trait Loci for Dairy Traits in the French Holstein Population. J. Dairy Sci., 89: 4070-4076.
Fernando, R. L. and Grossman, M. (1989): Marker assisted selection using best linear unbiased prediction.
Genet. Sel. Evol., 21: 467-477.
Fikse, W. F. and Banos, G. (2001): Weighting Factors of Sire Daughter Information in International Genetic
Evaluations. J. Dairy Sci., 84: 1759-1767.
Liu, Z.; Reinhardt, F.; Szyda, J.; Thomsen, H. and Reents, R. (2004): A Marker Assisted Genetic Evaluation
System for Dairy Cattle Using a Random QTL Model. Int. Bull Evaluation Service, Uppsala, Bulletin 32: 170-174.
Szyda, J.; Liu, Z.; Reinhardt, F. and Reents, R. (2005): Estimation of Quantitative Trait Loci Parameters for Milk
Production Traits in German Holstein Dairy Cattle Population. J. Dairy Sci., 88: 356-367.
Van Raden, P. M. and Wiggans, G. R. (1991): Derivation, calculation, and use of national animal model
information. J. Dairy Sci., 74: 2737-2746.
Detecting Dominance QTL: power of variance component analysis in different pedigree
structures
*S. Rowe(1,2), R. Pong-Wong(1), C. Haley(1), S. Knott(2) and D.J. De Koning(1)
(1) Roslin Institute, Midlothian, Edinburgh, EH25 9PS, UK
(2) University of Edinburgh, Kings Buildings, West Mains Road, Edinburgh.
Abstract
Using variance component models, additive and dominant QTL effects accounting for 6-8% of
within-family variance for conformation score were found on chromosomes 4 and 5 in a commercial
poultry population. Extensive simulations were carried out to validate these results in varying
population structures comprising full/half sib pedigrees. Power to detect QTL was high in pig and
poultry scenarios for dominance effects accounting for more than 4% of phenotypic variance but low
in human type pedigrees. Maternal or common environment effects were confounded with dominance
if not fitted within the model. Including dominance in the model did not affect the ability to detect
additive effects, nor was spurious dominance detected other than when maternal effects were
unaccounted for. Variance component analysis can be used to detect additive and dominant QTL
effects alongside traditional breeding values and offers an opportunity to exploit marker assisted
selection in commercial populations.
Introduction. Use of the linear model to model QTL variance as multivariate random effects is
increasingly recognised as a powerful and flexible approach in both human and livestock sectors [1-7].
Variance component based methods have been developed to incorporate interaction within and
between loci and have the advantage of simultaneously locating and estimating genetic effects within
any population structure. Advantages are that many alleles or allelic effects can be modelled and all
relationships in a pedigree can be used, thus increasing power to detect QTL. The method can be used
in any population structure, making it cost effective and commercially viable, and particularly useful
where test crosses are unethical, untenable or impractical. It is often more practical to explore QTL
segregating within a population, particularly if it is to facilitate selection. Despite intense selection,
there is evidence to suggest that there is still much variation that might be exploited within commercial
populations [8].
Dominance has been increasingly found to be an important source of variation, in particular for traits
with low heritability such as those associated with reproduction. Many livestock breeding schemes
involve the development and subsequent crossing of pure bred lines to exploit non-additive variation,
e.g. heterosis. If these schemes are to successfully combine attributes from different lines, for example
production and maternal traits, it is imperative that the mode of inheritance is understood.
Method
Variance component analysis. Following the two-step approach described by George et al. [9], IBD
coefficients were estimated at 1 cM intervals with novel recursive software [10]. Variance/covariance
matrices were constructed as described by Liu et al. [11]. Variance components were estimated using
ASReml to maximise likelihoods for nested mixed models including fixed effects of sex, age of dam
and hatch within flock and random polygenic, additive and dominance QTL effects. Models used
were:
$y = X\beta + Zu + e$ (0)
$y = X\beta + Zu + Za + e$ (1)
$y = X\beta + Zu + Za + Zd + e$ (2)
$y = X\beta + Zu + Zm + Za + Zd + e$ (3)
where $y$ is a vector of phenotypic observations; $\beta$ is a vector of fixed effects of sex, age of dam
and hatch within flock; $u$, $a$, $d$, $m$ and $e$ are vectors of additive polygenic effects, additive and
dominance QTL effects, maternal effects and random residuals, respectively; and $X$ and $Z$ are incidence
matrices relating to fixed and random genetic effects, respectively.
Variances for polygenic and QTL effects are distributed as $\mathrm{Var}(a) \sim A_Q\sigma^2_a$,
$\mathrm{Var}(d) \sim D_Q\sigma^2_d$, $\mathrm{Var}(e) \sim \sigma^2_e$ and $\mathrm{Var}(u) \sim A\sigma^2_u$, where
$A$ is the standard additive genetic matrix based on pedigree data only, $A_Q$ is the QTL additive genetic
relationship matrix based on marker information, and $D_Q$ is the QTL dominance genetic relationship matrix
representing the probability that two individuals have the same pair of alleles in common based on
marker information.
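Taken together, these definitions imply a covariance structure for the observations. As a sketch for
model (2), assuming the random effects are mutually uncorrelated and the residuals are independent with
covariance $I\sigma^2_e$ (neither assumption is stated explicitly in the text), and writing $\sigma^2_u$
for the polygenic variance:

```latex
\mathrm{Var}(y) = Z A Z' \sigma^2_u + Z A_Q Z' \sigma^2_a + Z D_Q Z' \sigma^2_d + I \sigma^2_e
```

Dropping the $D_Q$ term gives model (1), and dropping both QTL terms gives model (0), which is why the
models can be compared as nested models.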
Test statistic. A likelihood ratio test statistic for a given location was obtained by comparing the
likelihoods of nested models under the assumption that the test statistic is chi-squared distributed with
degrees of freedom equal to the number of extra parameters estimated. This is conservative as there is
a complex mixture of distributions [12]. Three tests were carried out: model 1 versus 0 to estimate
variance components under a purely additive model using 1 df (add); model 2 versus 0 to estimate
variance components under a model including additive and dominance effects using 2 df (add+dom);
and model 2 versus 1 (dom) to try to estimate how much of the variation could be attributed purely to
the dominance parameter using 1 df. Under the maternal scenarios model 2 was substituted with model
3.
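The three tests can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the log-likelihood values are hypothetical, and the function names `chi2_sf` and `lrt` are introduced here only for the example. The chi-squared tail probability is written in closed form for 1 and 2 df so that no external library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable, for df = 1 or 2 only."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are handled in this sketch")

def lrt(loglik_reduced, loglik_full, df):
    """Return (LR statistic, P value, -log10 P) for two nested models."""
    lr = 2.0 * (loglik_full - loglik_reduced)
    p = chi2_sf(lr, df)
    return lr, p, -math.log10(p)

# Hypothetical maximised log-likelihoods for models 0, 1 and 2 at one position
ll0, ll1, ll2 = -1520.4, -1517.9, -1515.1
tests = {
    "add": lrt(ll0, ll1, df=1),      # model 1 versus model 0
    "add+dom": lrt(ll0, ll2, df=2),  # model 2 versus model 0
    "dom": lrt(ll1, ll2, df=1),      # model 2 versus model 1
}
```

Because the variance components are tested on the boundary of the parameter space, P values obtained this way are expected to be conservative, as noted above.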
Simulations Population structure
For three population structures, random mating of parents was simulated to obtain a second generation
of 1900 progeny. Population 1 (poultry) involved 19 males each mated to 5 females with 20 offspring
per female. Population 2 (pig) involved 10 males each mated to 19 females with 10 offspring per
female. Population 3 (human) involved 633 sires, each mated to one of 633 females, with 3 offspring
per female.
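The family sizes implied by these three structures can be checked with simple arithmetic. The sketch below assumes that population 3 consists of 633 sire-dam pairs (one female per sire), which is our reading of the description; the dictionary and function names are hypothetical.

```python
# (sires, dams per sire, offspring per dam) for each simulated structure
structures = {
    "poultry": (19, 5, 20),
    "pig": (10, 19, 10),
    "human": (633, 1, 3),  # assumption: one female per sire
}

def family_summary(sires, dams_per_sire, offspring_per_dam):
    """Summarise family counts and sizes for a two-generation design."""
    return {
        "full_sib_families": sires * dams_per_sire,
        "full_sibs_per_family": offspring_per_dam,
        "progeny_per_sire": dams_per_sire * offspring_per_dam,
        "progeny": sires * dams_per_sire * offspring_per_dam,
    }

summaries = {name: family_summary(*design) for name, design in structures.items()}
```

Under this reading the poultry and pig designs give exactly 1900 progeny and the human design gives 1899, but the human design has only 3 full sibs per family and no half sibs.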
Figure 1. M denotes markers along the 20 cM chromosome. Arrows below the line denote test positions.
A 20 cM chromosome was simulated with 5 markers spaced at 5 cM intervals and a bi-allelic QTL
between the second and third marker at 7.5 cM (figure 1). Dominance effects were simulated ranging
from partial to over dominance over a range of additive effects. For some scenarios, a maternal effect
of 0.1 was simulated. Heritability ranged from 0.1 to 0.6 with dominance QTL effects ranging from 2
to 10% of phenotypic variance.
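For a bi-allelic QTL, the additive and dominance variances follow from the genotypic effects and the allele frequency through the standard single-locus formulas. The sketch below uses those textbook formulas; the text does not say how the simulated effects were parameterised, so the values of `a`, `d` and `p` are purely illustrative assumptions, with phenotypic variance taken as 1.

```python
def qtl_variances(a, d, p):
    """Additive and dominance variance of a bi-allelic locus.

    a: half the difference between the two homozygotes
    d: deviation of the heterozygote from the homozygote midpoint
    p: frequency of the allele that increases the trait
    """
    q = 1.0 - p
    alpha = a + d * (q - p)             # average effect of allele substitution
    var_add = 2.0 * p * q * alpha ** 2
    var_dom = (2.0 * p * q * d) ** 2
    return var_add, var_dom

# Illustrative values only: partial dominance at intermediate allele frequency
var_add, var_dom = qtl_variances(a=0.8, d=0.4, p=0.5)
phenotypic_variance = 1.0               # assumption for the example
pct_dom = 100.0 * var_dom / phenotypic_variance
```

With equal allele frequencies the dominance variance reduces to d squared over 4, so a given dominance deviation contributes much less variance than an additive effect of the same size.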
Analysis
Real Data. Phenotypes on conformation score, measured at 40 days, were available for a commercial
broiler dam line from Cobb-Vantress Breeding Company Ltd. A two-generation pedigree of 100 dam
families nested within 46 sire families gave a total of 2708 records with genotypes available for
markers for candidate regions on chicken chromosomes 1, 4 and 5.
Simulations
Using population 1 to obtain a point-wise test statistic, 1000 replicates of a single test were carried out
at the QTL position. In all populations, chromosome-wise test statistics were generated using tests for
additive and dominant QTL effects at 2, 7, 12 and 17 cM. 100 replicates were used to determine 1, 5 and
10% empirical thresholds.
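One common way to obtain chromosome-wise empirical thresholds from null replicates is to take the largest test statistic across test positions in each replicate and read off the upper percentiles of those maxima. The sketch below assumes that procedure; the text does not spell out the exact rule used, and the function name is hypothetical.

```python
import math

def empirical_thresholds(null_stats, percents=(1, 5, 10)):
    """Chromosome-wise thresholds from replicates simulated without a QTL.

    null_stats: one list of test statistics (one per test position) per replicate.
    Returns a dict mapping each percentage level to the threshold value.
    """
    maxima = sorted(max(replicate) for replicate in null_stats)
    n = len(maxima)
    thresholds = {}
    for pct in percents:
        # index of the order statistic exceeded by at most pct% of the maxima
        k = math.ceil(n * (100 - pct) / 100) - 1
        thresholds[pct] = maxima[min(max(k, 0), n - 1)]
    return thresholds
```

With 100 replicates, the 1% threshold rests on the two largest maxima only, so it is much less stable than the 5 and 10% thresholds.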
Results and discussion
Poultry data
[Figure: two panels, chromosome 4 (0 to 80 cM) and chromosome 5 (0 to 40 cM); x-axis: position (cM); y-axis: -log10 P value (0 to 2.5); series: conf add, conf dom; threshold lines at P<0.05 and P<0.01]
Figure 1. -Log10 P value for models including additive (add), and additive and dominance (dom) QTL effects versus no
QTL effects for 40-day conformation score on chicken chromosomes 4 and 5.
Based on LR of additive versus null being chi-squared distributed with 1 df and LR of additive and dominance QTL effects
versus null chi-squared distributed with 2 df.
For conformation on chromosome 4, the additive variance is reduced to virtually nil with a dominance
effect explaining 8% of total variance. Similar effects are seen for weight and conformation on
chromosome 5 with dominance effects explaining 6% of total variance. Fitting a direct maternal effect
did not affect the results.
Simulations
Null distribution. For the point-wise test all models were conservative at all thresholds. The additive
model, however, was much closer to the expectation at all thresholds. This implies that when a
dominance component is included, the test becomes more conservative. A similar pattern is observed
in the chromosome-wide case, although all tests are less conservative in the case of multiple testing.
Figure 1A gives false positive rates at the 5% threshold for the three populations and population 2
when a maternal effect was simulated but not fitted. It is quite clear that in this case the maternal
variance simulated is expressed as spurious dominance. The remaining scenarios all appear
conservative, again in particular in the case of testing for dominance. Figure 1B shows that when
dominance variance is at 7% of the phenotypic variance, power for population 1 is 95%, for population
2 is ~75% and for population 3 ~25%.
In the event that no dominance is simulated, there is little spurious dominance detected. Furthermore,
power to detect an additive effect is similar whether or not the extra dominance component is included.
This suggests that a routine scan including dominance would not result in too great a loss of power
even in the absence of any dominant effects. In the scenario where the only variance is due to
dominance effects, when testing for an additive QTL some dominance variance would be detected.
[Figure: panel A, y-axis: power % (0 to 70), categories: poultry, pig, human, c2, c2 fitted; panel B, y-axis 0 to 100, x-axis 0.4 to 0.8, series: poultry, pig, human]
Figure 1. A: False positive rates for comparison of the three models under the assumption that the test statistic is chi-squared
distributed, at the 5% threshold, with df equal to the number of extra parameters modelled. Chromosome-wise, 4 tests, 100 iterations.
B: Power to detect QTL effects at the 5% threshold with additive effect fixed at 0.8. Partial to full dominance in pig, poultry and
human type populations.
Power above 90% was achieved for dominance effects exceeding 4% of the total variance for
population 1. This is in line with the results from real data. Power was similar for pig and poultry
populations but much lower for the human type population. This was unsurprising as it involved many
small families with low numbers of full sibs, making it difficult to detect dominance. This indicates that
we have power to detect modest dominant QTL effects in livestock populations but power in human
type populations remains low. Greater power might be achieved in human studies from a pedigree with
more generations and this requires further exploration.
Simulation results showed that extending the linear model to include a dominance component resulted
in a conservative test when imposing a chi-squared distribution for the likelihood ratio test statistic.
The loss of power to detect an additive QTL when using the full dominance model was limited,
suggesting no detriment to the routine incorporation of a dominance component. Furthermore,
detection of spurious dominance was rare. There is strong evidence to suggest that a common
environment effect should be incorporated into all models, as its inclusion has little effect on false
negative rates but a potentially huge impact on false positive rates. This is because most variation due
to common environment masquerades as dominance if unaccounted for.
Conclusions. Using variance component analysis, dominant QTL effects for conformation score were
detected on chicken chromosomes 4 and 5, accounting for 8% of residual variance. We have shown
that dominant QTL can be detected accurately and are segregating within commercial livestock
populations. We have also shown that dominance may be detected as additive genetic variance if
unaccounted for. This has important implications for predicting response to selection, as the success of
any selection program is dependent on correctly identifying the mode of inheritance.
Reference List
1 Amos, C.I. (1994) Robust variance-components approach for assessing genetic linkage in pedigrees.
Am. J. Hum. Genet. 54, 535-543
2 Allison, D.B. et al. (1999) Testing the robustness of the likelihood-ratio test in a variance-component
quantitative-trait loci-mapping procedure. Am. J. Hum. Genet. 65, 531-544
3 Heuven, H.C. et al. (2005) Efficiency of population structures for mapping of Mendelian and
imprinted quantitative trait loci in outbred pigs using variance component methods. Genet. Sel. Evol.
37, 635-655
4 Mitchell, B.D. et al. (1997) Power of variance component linkage analysis to detect epistasis. Genet.
Epidemiol. 14, 1017-1022
5 Visscher, P.M. et al. (1999) Detecting QTLs for uni- and bipolar disorder using a variance component
method. Psychiatr. Genet. 9, 75-84
6 Kolbehdari, D. et al. (2006) Transmission disequilibrium test for quantitative trait loci detection in
livestock populations. J. Anim. Breed. Genet. 123, 191-197
7 Diao, G. and Lin, D.Y. (2005) A powerful and robust method for mapping quantitative trait loci in
general pedigrees. Am. J. Hum. Genet. 77, 97-111
8 De Koning, D.J. et al. (2004) Segregation of QTL for production traits in commercial meat-type
chickens. Genet. Res. 83, 211-220
9 George, A.W. et al. (2000) Mapping quantitative trait loci in complex pedigrees: a two-step variance
component approach. Genetics 156, 2081-2092
10 Pong-Wong, R. et al. (2001) A simple and rapid method for calculating identity-by-descent matrices
using multiple markers. Genet. Sel. Evol. 33, 453-471
11 Liu, Y. et al. (2002) The covariance between relatives conditional on genetic markers. Genet. Sel.
Evol. 34, 657-678
12 Self, S.G. and Liang, K.Y. (1987) Asymptotic properties of maximum likelihood estimators and likelihood ratio
tests under nonstandard conditions. J. Am. Stat. Assoc. 82, 605-610
IBD probabilities: their discrimination ability and their co-evolution with
linkage disequilibrium
YTOURNEL F., BOICHARD D., GILBERT H.
INRA, UR337 Station de Génétique Quantitative et Appliquée, F-78352 Jouy en Josas
I. Introduction
Linkage Disequilibrium (LD) is now commonly used for fine-mapping purposes.
Methods have been developed to take LD into account, for example through the use of the
excess of association of a marker allele with a gene allele or the probabilities of Identity By
Descent (IBD). An analytic way of computing these IBD probabilities according to the
Identity By State (IBS) of the markers of a haplotype supposedly centred on the QTL
location has been proposed by Meuwissen and Goddard (2001).
The usefulness of these probabilities for fine mapping depends on their capacity
to discriminate between IBD and non-IBD QTLs. Under different combinations of
molecular information, we studied how the distribution of the IBD probabilities, and
their correlations, relate to the real IBD status of the QTL.
II. Methods
1. Simulated populations
The populations were composed of 100 individuals and resulted from 100 generations.
Matings were random, possibly with selfing.
2. Genetic maps
Three maps were considered, whose general features are presented in Fig. 1. Maps A and B
were composed of 8 markers and one QTL. They differed in the distance between the QTL
and its two closest markers: on map A, this distance was 0.1 cM versus 0.05 cM on map B,
therefore implying a regular marker spacing in the 3 cM-QTL region on map B. The last map (C)
has the same features as map B except that it has been densified in the 0.9 cM central
region with a regular marker spacing of 0.1 cM (it was then composed of 12 markers instead
of 8).
[Figure: maps A, B and C with the QTL position marked; labelled distances: 2 cM, 1.9 cM, 0.5 cM, 0.3 cM, 0.1 cM]
Figure 1. Different maps considered in the study. Thick lines: QTL locus; thin lines:
marker loci; dashed lines: position tested in the middle of the marker bracket
before the QTL.
We considered different markers: the maps were composed either of 5-allele microsatellite
markers or of SNPs.
3. Computation of IBD probabilities
The IBD probabilities were computed according to the analytic formula of Meuwissen and
Goddard (2001) for two map locations: the simulated QTL position and the middle of the
marker-marker interval before the interval containing the QTL (dashed lines in Fig. 1). The
IBD probabilities were computed with haplotypes composed of 4 or 6 markers on the two
sparser maps (A and B in Fig. 1) and with 6- and 10-marker haplotypes on the denser one (C). The
corresponding haplotype lengths and the abbreviations used in the following figures are
reported in Table 1.
                 Haplotype length (cM) / Abbreviation
Haplotypes of:   Haplotype centred on the QTL location    Haplotype centred on the middle of the
                                                          marker bracket preceding the QTL
                 Map A      Map B      Map C              Map A       Map B
4 markers        0.4 / 4    0.3 / 4c   -                  0.6 / 4-2   0.5 / 4c-2
6 markers        1.0 / 6    0.9 / 6c   0.5 / 6d           1.2 / 6-2   1.1 / 6c-2
10 markers       -          -          0.9 / 10d          -           -
Table 1. Length of the haplotypes used for IBD computations on the different genetic
maps.
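The haplotype lengths in Table 1 can be reproduced from marker positions. The positions below are not given in the text: they are inferred here from Table 1 and the distances labelled in Fig. 1, with the QTL placed at 0 cM, and should be read as an assumption. The function names are hypothetical.

```python
# Marker positions (cM) relative to the QTL at 0; inferred, not stated in the text
MAPS = {
    "A": [-1.0, -0.5, -0.2, -0.1, 0.1, 0.2, 0.5, 1.0],
    "B": [-0.95, -0.45, -0.15, -0.05, 0.05, 0.15, 0.45, 0.95],
    "C": [-0.95, -0.45, -0.35, -0.25, -0.15, -0.05,
          0.05, 0.15, 0.25, 0.35, 0.45, 0.95],
}

def haplotype_length(positions, n_markers, centre):
    """Length (cM) of the haplotype with n_markers / 2 markers on each side of centre."""
    half = n_markers // 2
    left = sorted(p for p in positions if p < centre)[-half:]
    right = sorted(p for p in positions if p > centre)[:half]
    return round(right[-1] - left[0], 2)

def preceding_bracket_midpoint(positions, qtl=0.0):
    """Middle of the marker bracket just before the one containing the QTL."""
    left = sorted(p for p in positions if p < qtl)
    return (left[-2] + left[-1]) / 2.0

length_4 = haplotype_length(MAPS["A"], 4, 0.0)
length_4_2 = haplotype_length(MAPS["A"], 4, preceding_bracket_midpoint(MAPS["A"]))
```

Under these inferred positions, maps A and B span 2 cM and 1.9 cM in total, which matches the two largest distances labelled in Fig. 1.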
4. Studied parameters
We focused on two main aspects, computed from 1,000 replicates for each design:
- the distribution of the IBD probabilities compared to the real IBD status at the QTL, to
evaluate the ability of IBD probabilities to discriminate IBD from non-IBD QTLs;
- the evolution of the correlations between the real IBD status at the QTL and the IBD
probabilities, depending on the value of the linkage disequilibrium (LD, evaluated with the
measure of Yamazaki, 1977) computed between the QTL and its closest marker. The correlations
were first computed for each simulated population and then averaged over the 1,000
replicates.
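The second criterion, a correlation computed within each simulated population and then averaged over replicates, can be sketched as follows. This is an illustration under assumptions: the real IBD status is coded 0/1 per pair of haplotypes, the correlation is taken to be the Pearson correlation, and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_correlation(replicates):
    """Average over replicates of the within-replicate correlation.

    replicates: list of (ibd_status, ibd_prob) pairs, where ibd_status holds
    the real 0/1 IBD status and ibd_prob the computed IBD probabilities.
    """
    r = [pearson(status, prob) for status, prob in replicates]
    return sum(r) / len(r)
```

Averaging per-replicate correlations, rather than pooling all haplotype pairs, gives each simulated population the same weight regardless of how many pairs are IBD in it.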
III. Results and discussion
1. Distribution of the IBD probabilities
It appeared that non-IBD QTLs were much better discriminated than IBD QTLs: in more than
75% of the situations where the QTLs were not IBD between two haplotypes, the IBD
probability was lower than or equal to 0.1 (Fig. 2). When the two QTLs were actually IBD, the
proportion of IBD probabilities over 0.9 ranged from 27% (with a haplotype
composed of 4 SNPs on map A) to 60% (haplotypes composed of 10 microsatellite markers
on map C). The discrimination ability was better with a denser map in the QTL neighbourhood,
with haplotypes containing microsatellites (curves marked 'M' in Fig. 2) and with a longer
physical length covered by the haplotype (Fig. 2). It seemed that the map density around the
QTL had more influence on the IBD probabilities than the haplotype length (comparing for
instance the curves 4c and 4).
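The two summaries used here, the frequency of each probability class and the share of clear-cut probabilities given the real IBD status, can be computed as in the sketch below. The class boundaries follow the Fig. 2 legend (class 1 under 0.1, class 10 from 0.9 to 1.0); the function names and the 0/1 coding of IBD status are assumptions for the example.

```python
def ibd_class(prob, n_classes=10):
    """Class 1 holds probabilities under 0.1, class 10 those from 0.9 to 1.0."""
    return min(int(prob * n_classes) + 1, n_classes)

def class_frequencies(probs, n_classes=10):
    """Percentage of IBD probabilities falling in each class."""
    counts = [0] * n_classes
    for p in probs:
        counts[ibd_class(p, n_classes) - 1] += 1
    return [100.0 * c / len(probs) for c in counts]

def discrimination(status, probs):
    """Share (%) of clear-cut probabilities given the real IBD status (0 or 1)."""
    non_ibd = [p for s, p in zip(status, probs) if s == 0]
    ibd = [p for s, p in zip(status, probs) if s == 1]
    return {
        "non_ibd_at_most_0.1": 100.0 * sum(p <= 0.1 for p in non_ibd) / len(non_ibd),
        "ibd_above_0.9": 100.0 * sum(p > 0.9 for p in ibd) / len(ibd),
    }
```

Applied separately to haplotype pairs that are and are not IBD at the QTL, `class_frequencies` gives the two panels of a plot such as Fig. 2.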
[Figure: two panels, "Haplotypes non-IBD at the QTL locus" and "Haplotypes IBD at the QTL locus"; x-axis: classes of IBD probabilities (1 to 10); y-axis: frequency of the IBD classes (0 to 100); series: 4, 6, 4c, 6c, 6-M, 6c-M, 10d-M]
Figure 2. Distribution over 1,000 replicates of the IBD probabilities (class 1
corresponding to IBD probabilities under 0.1 and class 10 to probabilities between 0.9
and 1.0) according to the real IBD status at the QTL location, depending on the
haplotype length, the molecular information and the density around the QTL. M:
haplotypes composed of microsatellite markers, else haplotypes composed of SNPs.
The ability of discrimination was slightly reduced when the IBD probability was
computed at the middle of the interval bracket preceding the QTL (data not shown), with
similar improvement as for the QTL position depending on the other parameters. It still
remained greater for non-IBD QTLs, with over 75% of the IBD probabilities under 0.1. The
low magnitude of the reduction due to the different position tested is probably due to the short
distance between the tested point and the QTL locus, and this tendency needs to be tested for
positions located further from the QTL.
2. Co-evolution of Linkage Disequilibrium and IBD probabilities
The average correlations between the real IBD status of the QTL and the IBD probabilities
were over 0.70 for all designs and for all intensities of LD between the QTL and its closest
marker (Fig. 3). These correlations slightly increased with the increase of the LD. They were
higher when including microsatellites in the haplotype and when the haplotype was longer.
However, it seemed that adding further microsatellite markers to the 6-marker haplotype
(with the same haplotype length) did not increase the concordance between IBD probabilities
and the real IBD statuses.
When computing the IBD probabilities at the middle of the marker bracket preceding the QTL
(curves marked '-2' in Fig. 3), the correlations were lower, particularly when there was little
LD. 6-locus haplotypes seemed to be more sensitive than 4-marker haplotypes.
[Figure: x-axis: linkage disequilibrium between the QTL and its closest marker (classes 0 <= LD < 0.1 to 0.9 <= LD < 1.0); y-axis: average correlation between the IBD probabilities and the real IBD (0.70 to 0.95); series: 6-M, 6-2-M, 10d-M, 4c, 4c-2, 6c, 6c-M]
Figure 3. Evolution of the average correlations between the IBD probabilities and the
real IBD status at the QTL according to the linkage disequilibrium between the QTL
and its closest marker. M: haplotypes composed of microsatellite markers, else
haplotypes composed of SNPs. -2: IBD probabilities computed at the middle of the
interval preceding the QTL, else computed at the QTL locus.
IV. Conclusion
This study shows that the IBD probabilities are well correlated with the real IBD
statuses, although it seemed that their power was greater to discriminate non-IBD QTLs than
IBD ones. They are influenced by factors such as the composition of the haplotype, its length,
the map density around the tested position and the linkage disequilibrium existing between
the closest marker and the QTL (both being included in the haplotype used to compute the
IBD probabilities). Shorter haplotypes seemed to be more influenced by the LD intensity;
6-marker haplotypes seemed to be a compromise between the haplotype length and the accuracy
of the estimates of the IBD statuses. As it is impossible to have haplotypes composed
of microsatellites as dense as those simulated in this study, it would be of interest to check if the
inclusion of one microsatellite marker in the haplotypes provides usable information for the
computation of IBD probabilities. Further investigations should also be conducted on the
influence of selection on the distribution of the IBD probabilities among populations.
References
Meuwissen T.H.E., Goddard M.E., 2001. Prediction of identity by descent probabilities
from marker-haplotypes. Genet. Sel. Evol., 33(6):605-34.
Yamazaki T., 1977. The effects of overdominance on linkage in a multilocus system.
Genetics, 86:227-236.
Bovine genomic structure as revealed by construction of a haplotype
block map based on a high-density 15k SNP scan in dairy cattle.
Mehar S. Khatkar*, Kyall R. Zenger*, Matthew Hobbs*, Rachel J. Hawken†,
Julie A. L. Cavanagh*, Wes Barris†, Bruce Tier‡, Frank W. Nicholas* and
Herman W. Raadsma*
* Centre for Advanced Technologies in Animal Genetics and Reproduction
(ReproGen), University of Sydney, Camden NSW 2570, Australia
† CSIRO Livestock Industries, St Lucia QLD 4067, Australia
‡ Animal Genetics and Breeding Unit, University of New England, NSW 2351,
Australia
CRC for Innovative Dairy Products, William Street, Melbourne, Vic. 3000,
Australia
We performed high-throughput single nucleotide polymorphism (SNP) discovery
and genotyping on 1546 Australian progeny-tested dairy bulls using 15,036 SNP
markers for association studies and bull selection as part of an Australian dairy
research program (Cooperative Research Centre for Innovative Dairy Products,
CRC-IDP). These SNPs included 10410 public-domain SNP markers (Affymetrix) and
4626 in-house SNPs for gene coding regions. These SNPs were mapped on bovine
sequence assembly 3.1 (Btau3.1) and also to an in-house annotated bovine integrated
map. The average intermarker spacing was 229 kb and the average minor allele frequency
(MAF) was 0.29. The data have been analysed for the assessment of linkage
disequilibrium and construction of a haplotype block map of the whole bovine genome.
Following the definition of Gabriel et al. 2002 (Science 296:2225), a total of 727
haplotype blocks with three or more SNPs were identified, with an average block size
of 68 kb. A set of tag SNPs useful for further fine-mapping studies could be identified
within each block. Though the extent of linkage disequilibrium, as measured by a
commonly used measure, is quite extensive, only a small proportion of the genome
could be assigned as haplotype blocks with the current SNP density. Based on SNP
usage for hapmap definition, it is suggested that about 250,000 SNPs would be
required to construct a saturated haplotype block map of the bovine genome with
75,000-100,000 tag SNPs. As an example of the detailed analysis, the results of one
chromosome will be presented.