Figure S2:
Two measures of overdispersal of block numbers across individuals (i.e., substructure):
Suppose we have $n$ individuals from population $x$, and $N_i$ is the number of IBD blocks of length at least 1 cM that individual $i$ shares with anyone from population $y$.
Our statistic of substructure within $x$ with respect to $y$ is the variance of these numbers across the $n$ individuals.
We obtained a “null” distribution for this statistic by randomly reassigning all blocks shared between $x$ and $y$ to an individual from $x$, and used this to evaluate the strength and the statistical significance of this substructure.
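The reassignment procedure can be sketched roughly as follows. This is a minimal illustration rather than the authors' code: it assumes each block is reassigned independently and uniformly at random among the individuals of the first population, and it uses the sample variance (`ddof=1`). Neither choice is stated in the caption, and the function names are hypothetical.

```python
import numpy as np

def substructure_statistic(counts):
    """Variance of the per-individual block counts.

    Assumption: sample variance (ddof=1); the caption does not say
    which normalization was used.
    """
    return float(np.var(np.asarray(counts, dtype=float), ddof=1))

def null_variances(counts, n_replicates=1000, rng=None):
    """Simulate the statistic under random reassignment of blocks.

    Assumption: every block is reassigned independently and uniformly
    among the individuals, which is a multinomial draw with equal
    probabilities and the observed total number of blocks.
    """
    rng = np.random.default_rng(rng)
    counts = np.asarray(counts)
    n = counts.size
    total = int(counts.sum())
    # One row per replicate, one column per individual.
    sims = rng.multinomial(total, np.full(n, 1.0 / n), size=n_replicates)
    return sims.var(axis=1, ddof=1)
```

For example, `null_variances([5, 0, 3, 12], n_replicates=1000, rng=0)` returns 1000 simulated variances that can be compared with `substructure_statistic([5, 0, 3, 12])`.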
(A) Histogram of the “$p$-value”, i.e., the proportion of 1000 replicates that showed a variance greater than or equal to the observed variance, for all pairs of populations $x$ and $y$ with at least 10 individuals in population $x$.
(B) The “$z$ score”, which is the observed value minus the mean value, divided by the standard deviation, with the mean and standard deviation estimated using 1000 replicates.
The population $x$ is shown on the vertical axis, with text labels giving the population $y$, so for instance, Italians show much more substructure with most other populations than do Irish.
Note that sample size still has a large effect – it is easier to see substructure with respect to the Swiss French ($y =$ CHf) because the large number of Swiss French samples allows greater resolution.
A vertical line is shown at
.
Only pairs of populations with at least 3 samples in country $x$ and 10 samples in country $y$ are shown.
Because of the log scale, only pairs with a positive $z$ score are shown, but no comparisons had
,
and only three had
.
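The two summaries in panels (A) and (B) might be computed from a set of null replicates along these lines. This is a sketch under stated assumptions: the function name is hypothetical, the null values are taken to be the variances from the random reassignments, and the standard deviation uses `ddof=1`, which the caption does not specify.

```python
import numpy as np

def p_value_and_z(observed, null_values):
    """Return (p, z) for an observed statistic against null replicates.

    p: proportion of replicates greater than or equal to the observed value.
    z: (observed - mean of replicates) / standard deviation of replicates.
    Assumption: sample standard deviation (ddof=1).
    """
    null_values = np.asarray(null_values, dtype=float)
    p = float(np.mean(null_values >= observed))
    z = float((observed - null_values.mean()) / null_values.std(ddof=1))
    return p, z
```

With 1000 replicates the smallest nonzero value of this $p$ is 0.001, so strongly overdispersed pairs are easier to tell apart by their $z$ scores than by their $p$-values.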