The Random Match Probability is generated from the frequency with which the observed “alleles” appear in the population.
Analysts work primarily from the allelic frequencies, applying a method known as the “product rule” to arrive at a final Random Match Probability. The process is complicated, but it can be broken down into three steps.
First, the analyst looks at each allele observed at a locus and determines its frequency in the population. Next, the analyst multiplies the two allele frequencies at that locus by each other, with some modifications based on whether the locus is homozygous or heterozygous.
Before moving to the third step, note the difference between the two: a homozygous locus displays two copies of the same allele, while a heterozygous locus displays two different alleles.
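The second step can be sketched in a few lines of Python. This is a minimal illustration, assuming the textbook Hardy-Weinberg proportions, since the "modifications" are not spelled out above: a homozygous locus with allele frequency p has genotype frequency p squared, and a heterozygous locus with allele frequencies p and q has genotype frequency 2pq. The function name and the example frequencies are hypothetical.

```python
def locus_genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency at a single locus.

    p: population frequency of the first observed allele
    q: frequency of the second allele, or None if the locus is homozygous
    """
    if q is None:
        # Homozygous: two copies of the same allele
        return p * p
    # Heterozygous: two different alleles, which can be inherited in either order
    return 2 * p * q


# Hypothetical allele frequencies, for illustration only
print(locus_genotype_frequency(0.10))        # homozygous locus
print(locus_genotype_frequency(0.10, 0.20))  # heterozygous locus
```

The factor of 2 in the heterozygous case reflects that either allele could have come from either parent.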
Moving to the third step, the analyst multiplies the genotype frequencies of the individual loci by one another to determine the frequency of the entire genotype, applying a 0.01 correction to account for population subdivision. In the case of small, isolated subpopulations, this “theta correction” is raised in accordance with the recommendations of the National Academy of Sciences.
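Putting the three steps together, the calculation might look like the sketch below. It assumes one commonly cited form of the theta correction, in which only homozygous loci are adjusted, to p squared plus p(1 - p) times theta, while heterozygous loci remain 2pq. That choice of formula, the function names, and the three-locus profile are all assumptions made for illustration, not a description of any particular laboratory's procedure.

```python
from math import prod

THETA = 0.01  # correction for population subdivision, as described in the text


def corrected_locus_frequency(p, q=None, theta=THETA):
    """Genotype frequency at one locus with a theta correction.

    Assumption: the correction is applied to homozygous loci only,
    as p^2 + p(1 - p) * theta; heterozygous loci use 2pq.
    """
    if q is None:
        return p * p + p * (1 - p) * theta
    return 2 * p * q


def random_match_probability(profile, theta=THETA):
    """Product rule: multiply the per-locus genotype frequencies together."""
    return prod(corrected_locus_frequency(*alleles, theta=theta) for alleles in profile)


# Hypothetical three-locus profile. Each tuple holds the allele frequencies
# observed at one locus: one value means homozygous, two mean heterozygous.
profile = [(0.10, 0.20), (0.05,), (0.15, 0.30)]
rmp = random_match_probability(profile)
print(f"RMP = {rmp:.3e}, or about 1 in {1 / rmp:,.0f}")
```

For a small, isolated subpopulation, a larger value can be passed through the `theta` parameter. Real profiles use many more loci, which is why reported Random Match Probabilities are often extremely small.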
Some of the most heated controversies of the past decade have emerged over the generation and use of the process outlined above.
Critics within the criminal justice system have charged that the sampled populations were too small to generate reliable frequencies and that the racial and ethnic classifications are unscientific because they rest on self-identification.
Other critics allege that the final calculations fail to account accurately for the existence of “substructure”, or the commonness of certain alleles within particular subpopulations.
This is particularly the case in smaller communities, where there is a greater likelihood that members will have more similar DNA. Because relatives are likely to match a given profile with a frequency higher than that of a random person, if a relative is a possible source of the sample, then frequencies should also be computed for those relatives.
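To illustrate how much relatedness can change the numbers, the sketch below compares single-locus match probabilities for an unrelated person, a parent or child, and a full sibling. The formulas are the standard population-genetics expressions for these relationships, with no theta correction. The function names and allele frequencies are hypothetical, and the formulas an analyst actually applies in a given case are an assumption here.

```python
def unrelated_match(p, q=None):
    """Chance that an unrelated person has the same genotype at this locus."""
    return p * p if q is None else 2 * p * q


def parent_child_match(p, q=None):
    """Chance that a parent or child of the source has the same genotype."""
    if q is None:
        return p              # homozygous locus
    return (p + q) / 2        # heterozygous locus


def full_sibling_match(p, q=None):
    """Chance that a full sibling of the source has the same genotype."""
    if q is None:
        return (1 + 2 * p + p * p) / 4
    return (1 + p + q + 2 * p * q) / 4


# Hypothetical heterozygous locus with allele frequencies 0.10 and 0.20
p, q = 0.10, 0.20
for label, fn in [("unrelated", unrelated_match),
                  ("parent/child", parent_child_match),
                  ("full sibling", full_sibling_match)]:
    print(f"{label:>12}: {fn(p, q):.4f}")
```

As with the product rule, the per-locus values would be multiplied across loci, so the gap between a relative and a random person widens quickly as more loci are compared.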
In addition, if an individual is identified through a search of a DNA database, sometimes referred to as a “cold hit”, then a different calculation may be appropriate.
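One calculation that has been proposed for cold hits multiplies the Random Match Probability by the number of profiles searched, giving an approximate chance of at least one coincidental match somewhere in the database. The sketch below assumes that approach. Whether it is the right adjustment is itself disputed, and the function name and figures are hypothetical.

```python
def database_match_probability(rmp, database_size):
    """Approximate chance of a coincidental match in a database search.

    Assumption: the simple N * p approximation, capped at 1.
    """
    return min(1.0, rmp * database_size)


# Hypothetical figures: an RMP of 1 in a billion, searched against
# a database of one million profiles
print(database_match_probability(1e-9, 1_000_000))
```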
Likewise, in the case of a mixture, where the sample contains DNA from more than one person, a different calculation should be used.
As if the process and its problems were not complicated enough, many critics advocate the use of alternative methods of statistical analysis, including likelihood ratios, Bayesian calculations, or the combined probability of exclusion or inclusion.
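As one example of these alternatives, the combined probability of inclusion is often applied to mixtures. In the sketch below, the frequencies of every allele seen in the mixture at a locus are summed and squared, and the per-locus results are multiplied together. The combined probability of exclusion is one minus that product. The function name and the allele frequencies are hypothetical, and the sketch ignores complications such as missing alleles.

```python
from math import prod


def combined_probability_of_inclusion(mixture):
    """Fraction of the population that could not be excluded from the mixture.

    mixture: list of loci, each a list holding the population frequency of
             every allele observed in the mixture at that locus.
    """
    return prod(sum(freqs) ** 2 for freqs in mixture)


# Hypothetical three-locus mixture
mixture = [
    [0.10, 0.20, 0.15],
    [0.05, 0.25],
    [0.30, 0.10, 0.20, 0.05],
]
cpi = combined_probability_of_inclusion(mixture)
cpe = 1 - cpi
print(f"CPI = {cpi:.6f}, CPE = {cpe:.6f}")
```

A likelihood ratio instead compares the probability of the evidence under two competing hypotheses. For a single-source profile with an unambiguous match, the simplest version reduces to 1 divided by the Random Match Probability.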