Parametrizations
In this chapter we go through the derivation and assumptions of the two most common parametrizations used to model additivity and dominance in both diploid and autotetraploid species. Please refer to Vitezica et al., 2013 and Endelman et al., 2018 for further explanations.
Diploid Theory
Let’s begin by defining the following model
\[ y = \mu + Ta + Xd + \varepsilon \]
The frequencies and genetic values of each possible genotype for a single locus under Hardy-Weinberg equilibrium can be defined as
| Genotype | Frequency | Value |
|---|---|---|
| $A_1A_1$ | $q^2$ | $-a$ |
| $A_1A_2$ | $2pq$ | $d$ |
| $A_2A_2$ | $p^2$ | $a$ |
Then, the genetic mean of the population under both parametrizations equals
\[ \begin{align*} \mu_G &= \sum (\text{value} \times \text{frequency}) \\ &= P_{A_1A_1}G_{A_1A_1} + P_{A_1A_2}G_{A_1A_2} + P_{A_2A_2}G_{A_2A_2} \\ &= q^2(-a) + 2pq(d) + p^2(a) \\ &= a(p^2 - q^2) + 2pqd \\ &= a(p - q)(p + q) + 2pqd \\ &= a(p - q) + 2pqd \end{align*} \]
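The closed form of the genetic mean can be checked numerically. The following is a minimal sketch, assuming that \(p\) is the frequency of \(A_2\) (the allele whose homozygote has value \(a\)); the function name `genetic_mean` and the numeric values of \(p\), \(a\) and \(d\) are illustrative assumptions, not part of the original derivation.

```python
from fractions import Fraction

def genetic_mean(p, a, d):
    """Frequency-weighted mean of the genotypic values under Hardy-Weinberg."""
    q = 1 - p
    # (frequency, value) for A1A1, A1A2 and A2A2; p is the frequency of A2
    table = [(q**2, -a), (2 * p * q, d), (p**2, a)]
    return sum(freq * value for freq, value in table)

# Illustrative (hypothetical) parameter values
p, a, d = Fraction(3, 10), Fraction(2), Fraction(1, 2)
q = 1 - p
mu_G = genetic_mean(p, a, d)
closed_form = a * (p - q) + 2 * p * q * d
print(mu_G, closed_form)
```

Exact fractions are used so that the two expressions can be compared without rounding error.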
1) Breeding Parametrization
This approach separates the genotypic value (\(G\)) of an individual into the breeding value (\(u\)) and the dominance deviation (\(v\))
\[ G = \mu_G + u + v \]
Breeding Values (\(u\))
This value is defined as a function of the additive substitution effect \(\alpha\). This parameter is computed as the difference between the substitution effects of the two alleles, each of which represents the mean deviation from the population mean of the individuals that receive that particular allele from one parent and the second allele at random from the population.
\[ \begin{align*} \alpha &= a + d(q - p) \end{align*} \]
Thus, for a single locus with two alleles
\[ u = \begin{cases} (0 - 2p)\alpha = -2p\alpha\\ (1 - 2p)\alpha = (q - p)\alpha\\ (2 - 2p)\alpha = 2q\alpha\\ \end{cases} \]
So, the breeding values of a set of individuals are \(\mathbf{u} = \mathbf{Z}\alpha = \mathbf{Z}a + \mathbf{Z}(q - p)d\), where \(\mathbf{Z} = (z_1 \dots z_n)\) is the centered marker matrix, i.e. it contains the same markers as \(\mathbf{T}\) but centered. Then
\[ z_i = \begin{cases} -2p, & \text{for } A_1A_1\\ q - p, & \text{for } A_1A_2\\ 2q, & \text{for } A_2A_2 \end{cases} \]
Dominance Deviations (\(v\))
The dominance deviations can be computed as the difference between the actual genotypic value and the genotypic value predicted from the sum of the average effects
\[ \begin{align*} G_{ij} &= \mu_G + \alpha_i + \alpha_j + v_{ij} \\ \hat{G}_{ij} &= \mu_G + \alpha_i + \alpha_j \\ v_{ij} &= G_{ij} - \hat{G}_{ij} \\ &= G_{ij} - (\mu_G + \alpha_i + \alpha_j) \\ &= G_{ij} - \mu_G - u_{ij} \end{align*} \]
Then
\[ v = \begin{cases} - 2p^2d \\ 2pqd\\ - 2q^2d \end{cases} \]
In summary, \(\mathbf{v} = \mathbf{W}d\), where \(\mathbf{W} = (w_1 \dots w_n)\) is not the matrix \(\mathbf{X}\) of the model and
\[ w_i = \begin{cases} -2p^2, & \text{for } A_1A_1\\ 2pq, & \text{for } A_1A_2\\ -2q^2, & \text{for } A_2A_2 \end{cases} \]
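The two codings can be written as small functions of the dosage. This is only a sketch: it assumes genotypes are stored as the number of copies of \(A_2\) (0, 1 or 2) and that \(p\) is the frequency of \(A_2\); the names `z_code` and `w_code` are hypothetical.

```python
from fractions import Fraction

def z_code(x, p):
    """Centered additive covariate for a dosage x of allele A2 (0, 1 or 2)."""
    return x - 2 * p

def w_code(x, p):
    """Dominance-deviation covariate of the breeding parametrization."""
    q = 1 - p
    return (-2 * p**2, 2 * p * q, -2 * q**2)[x]

p = Fraction(1, 4)  # illustrative frequency of A2
z = [z_code(x, p) for x in (0, 1, 2)]
w = [w_code(x, p) for x in (0, 1, 2)]
print(z)
print(w)
```

With this storage convention, `x - 2p` reproduces the three cases \(-2p\), \(q - p\) and \(2q\) because \(1 - 2p = q - p\) and \(2 - 2p = 2q\).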
It is also important to note that the expected values of both \(u\) and \(v\) equal 0. Under this parametrization, we will assume that there is no covariance between additive and dominance effects, and therefore we can also express the genetic variance for a single locus simply as
\[ \operatorname{var}(G) = \operatorname{var}(A) + \operatorname{var}(D) \]
Where
\[ \operatorname{var}(A) = 2pq[a + d(q-p)]^2 = 2pq\alpha^2 \]
\[ \operatorname{var}(D) = (2pqd)^2 \]
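As a quick numerical check of this decomposition, the sketch below compares \(2pq\alpha^2 + (2pqd)^2\) with the variance of the genotypic values computed directly from the genotype table. The function names and the parameter values are illustrative assumptions; \(p\) is taken as the frequency of \(A_2\).

```python
from fractions import Fraction

def variance_components(p, a, d):
    """Additive and dominance variance of the breeding parametrization."""
    q = 1 - p
    alpha = a + d * (q - p)
    return 2 * p * q * alpha**2, (2 * p * q * d)**2

def total_genetic_variance(p, a, d):
    """Variance of the genotypic values computed from the genotype table."""
    q = 1 - p
    table = [(q**2, -a), (2 * p * q, d), (p**2, a)]
    mu = sum(f * g for f, g in table)
    return sum(f * (g - mu)**2 for f, g in table)

p, a, d = Fraction(3, 10), Fraction(2), Fraction(1, 2)  # hypothetical values
var_A, var_D = variance_components(p, a, d)
var_G = total_genetic_variance(p, a, d)
print(var_A, var_D, var_G)
```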
Moreover, if we assume that \(a \sim N(0, \sigma^2_a)\) and \(d \sim N(0, \sigma^2_d)\) are independent, so that \(\sigma_A^2 = 2pq\big(\sigma_a^2 + (q - p)^2\sigma_d^2\big)\), then
\[ \begin{align*} \operatorname{cov}(\mathbf{u}) &=\operatorname{cov}(\mathbf{Z}\alpha) \\ &= \operatorname{cov}(\mathbf{Z}(\mathbf{a} + \mathbf{d}(q-p))) \\ &= \mathbf{ZZ'}\operatorname{var}(\mathbf{a}) + \mathbf{ZZ'}(q-p)^2\operatorname{var}(\mathbf{d}) \\ &= \mathbf{ZZ'}\sigma^2_a + \mathbf{ZZ'}(q-p)^2\sigma^2_d \\ &= \mathbf{ZZ'} \big(\sigma_a^2 + (q-p)^2 \sigma_d^2\big) \\ &= \mathbf{ZZ'} \frac{\sigma_A^2}{2pq} \\ &= \frac{\mathbf{ZZ'}}{2pq} \, \sigma_A^2 \\ &= \mathbf{G} \, \sigma_A^2 \end{align*} \]
This equals the VanRaden method to compute the additive genomic relationship matrix. A similar derivation can be made for the dominance deviations, considering that \(\sigma^2_D = (2pq)^2\sigma^2_d\)
\[ \begin{align*} \operatorname{cov}(\mathbf{v}) &= \operatorname{var}(\mathbf{W}\mathbf{d}) \\ &= \mathbf{WW'} \, \sigma_d^2 \\ &= \frac{\mathbf{WW'}}{(2pq)^2} \, \sigma_D^2 \\ &= \mathbf{D} \, \sigma_D^2 \end{align*} \]
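The two relationship matrices can be sketched in a few lines of plain Python. This is an illustration under several assumptions: genotypes are stored as dosages of \(A_2\) (0, 1, 2), allele frequencies are estimated from the sample, and with several markers the single-locus scalings \(2pq\) and \((2pq)^2\) are replaced by their sums over markers, which is the usual multi-marker form but is not derived in the text above. The function names are hypothetical.

```python
def cross_product(M, scale):
    """Return M M' / scale for a matrix stored as a list of rows."""
    return [[sum(x * y for x, y in zip(r1, r2)) / scale for r2 in M] for r1 in M]

def relationship_matrices(dosages):
    """dosages[i][k]: copies of A2 (0, 1, 2) of individual i at marker k."""
    n, m = len(dosages), len(dosages[0])
    p = [sum(ind[k] for ind in dosages) / (2 * n) for k in range(m)]
    q = [1 - pk for pk in p]
    # Centered additive coding Z and dominance-deviation coding W
    Z = [[ind[k] - 2 * p[k] for k in range(m)] for ind in dosages]
    w = [(-2 * p[k]**2, 2 * p[k] * q[k], -2 * q[k]**2) for k in range(m)]
    W = [[w[k][ind[k]] for k in range(m)] for ind in dosages]
    # Assumed multi-marker scaling; monomorphic markers would make it zero
    scale_A = sum(2 * p[k] * q[k] for k in range(m))
    scale_D = sum((2 * p[k] * q[k])**2 for k in range(m))
    return cross_product(Z, scale_A), cross_product(W, scale_D)

# Four hypothetical individuals genotyped at two markers
dosages = [[0, 2], [1, 1], [1, 1], [2, 0]]
G, D = relationship_matrices(dosages)
print(G[0])
print(D[0])
```

In this toy sample the genotype counts match Hardy-Weinberg proportions at both markers, so the diagonals of both matrices average 1.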
- Allele substitution effects
The allele substitution effects are by definition the difference between the mean of the individuals carrying the allele and the overall population mean. Thus
\[ \alpha_{A_1} = \mu_{A_1} - \mu \] \[ \alpha_{A_2} = \mu_{A_2} - \mu \]
So we need to compute \(\mu_{A_1}\) and \(\mu_{A_2}\)
Let’s consider first the average effect of allele \(A_2\), recalling that \(p\) is the frequency of \(A_2\) and \(q\) that of \(A_1\)
| Allele from other parent | Probability | Genotype | Value |
|---|---|---|---|
| $A_2$ | $p$ | $A_2A_2$ | $a$ |
| $A_1$ | $q$ | $A_1A_2$ | $d$ |
And the average effect of allele \(A_1\)
| Allele from other parent | Probability | Genotype | Value |
|---|---|---|---|
| $A_2$ | $p$ | $A_1A_2$ | $d$ |
| $A_1$ | $q$ | $A_1A_1$ | $-a$ |
We can simply compute the subpopulation means as
\[ \mu_{A_2} = ap + dq \] \[ \mu_{A_1} = dp - aq \]
We can now solve for the allele substitution effects, taking into account that the population mean is \(\mu = a(p - q) + 2pqd\)
\[ \begin{align*} \alpha_{A_2} &= ap + dq - (a(p - q) + 2pqd) \\ &= ap + dq - ap + aq - 2pqd \\ &= q(a - 2pd + d) \\ &= q(a + d(-2p + 1)) \\ &= q(a + d(-2p + p + q)) \\ &= q(a + d(q - p)) \end{align*} \]
\[ \begin{align*} \alpha_{A_1} &= dp - aq - (a(p - q) + 2pqd) \\ &= dp - aq - ap + aq - 2pqd \\ &= p(-a + d(-2q + 1)) \\ &= -p(a + d(2q - (p + q))) \\ &= -p(a + d(q - p)) \end{align*} \]
Now, if \(\alpha = \alpha_{A_2} - \alpha_{A_1}\), then
\[ \begin{align*} \alpha &= q(a + d(q - p)) - (-p(a + d(q - p)))\\ &= (q + p)(a + d(q - p)) \\ &= a + d(q - p) \end{align*} \]
Finally, we can express the substitution effect of each allele in terms of \(\alpha\)
\[ \alpha_{A_2} = q(a + d(q - p)) = q\alpha \]
\[ \alpha_{A_1} = -p(a + d(q - p)) = -p\alpha \]
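The same result can be checked numerically from the subpopulation means. The sketch below avoids allele labels on purpose: `plus` denotes the allele whose homozygote has value \(a\) (frequency \(p\)) and `minus` the other allele. The function name and the numeric values are illustrative assumptions.

```python
from fractions import Fraction

def allele_effects(p, a, d):
    """Average effects of the two alleles as deviations from the population mean."""
    q = 1 - p
    mu = a * (p - q) + 2 * p * q * d
    mean_plus = p * a + q * d    # carriers of the allele with homozygote value +a
    mean_minus = p * d - q * a   # carriers of the other allele
    return mean_plus - mu, mean_minus - mu

p, a, d = Fraction(3, 10), Fraction(2), Fraction(1, 2)  # hypothetical values
q = 1 - p
alpha = a + d * (q - p)
alpha_plus, alpha_minus = allele_effects(p, a, d)
print(alpha_plus, alpha_minus, alpha_plus - alpha_minus)
```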
- Dominance Deviations
Consider first that \(v = G - (\mu + u)\). Then, following the information shown below
| Genotypes | Frequency | Value | u |
|---|---|---|---|
| $A_1A_1$ | $q^2$ | $-a$ | $-2p\alpha$ |
| $A_1A_2$ | $2pq$ | $d$ | $(q - p)\alpha$ |
| $A_2A_2$ | $p^2$ | $a$ | $2q\alpha$ |
We can solve that
\[ \begin{align*} v_{A_1, A_1} &= -a - (a(p - q) + 2pqd) - (-2p\alpha) \\ &= -a -ap + aq - 2pqd + 2p(a + d(q - p )) \\ &= -a -ap + aq - 2pqd + 2pa + 2pdq - 2p^2d \\ &= -a + aq + pa - 2p^2d \\ &= -a + a(p + q) - 2p^2d \\ &= - 2p^2d \\ \end{align*} \]
\[ \begin{align*} v_{A_1, A_2} &= d - \big(a(p-q) + 2pq d\big) - (q-p)\big(a + d(q-p)\big) \\ &= d - a(p-q) - 2pq d - (q-p)a - d(q-p)^2 \\ &= d - 2pq d - d(q-p)^2 \\ &= d(1 - 2pq - (q-p)^2) \\ &= d\big(1 - p^2 - q^2\big) \\ &= d(2pq) \\ &= 2pq\,d \end{align*} \]
\[ \begin{align*} v_{A_2, A_2} &= a - \big(a(p-q) + 2pq d\big) -2q\alpha \\ &= a -ap + aq -2pqd -2q(a+ d(q-p))\\ &= a -ap + aq -2pqd -2aq -2q^2d + 2pqd \\ &= a - ap -aq -2q^2d \\ &= -2q^2d \end{align*} \]
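These three results can be verified by computing \(v = G - \mu - u\) directly for each genotype. A minimal sketch, assuming genotypes are indexed by the dosage of \(A_2\) and using hypothetical parameter values:

```python
from fractions import Fraction

def dominance_deviations(p, a, d):
    """v = G - mu - u for the genotypes A1A1, A1A2 and A2A2 (dosage 0, 1, 2)."""
    q = 1 - p
    mu = a * (p - q) + 2 * p * q * d
    alpha = a + d * (q - p)
    values = (-a, d, a)  # genotypic values indexed by dosage of A2
    return [values[x] - mu - (x - 2 * p) * alpha for x in (0, 1, 2)]

p, a, d = Fraction(3, 10), Fraction(2), Fraction(1, 2)  # hypothetical values
q = 1 - p
v = dominance_deviations(p, a, d)
print(v)
```

The additive effect \(a\) cancels out of every entry, which is why the result depends only on \(p\), \(q\) and \(d\).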
- Expectations
Let’s prove why the expectations of the breeding values and the dominance deviations equal 0.
\[ \begin{align*} E(u) &= p^2(2q\alpha) + 2pq(q-p)\alpha + q^2(-2p\alpha) \\ &= 2p^2q\alpha + 2pq^2\alpha - 2p^2q\alpha -2pq^2\alpha \\ &= 0 \end{align*} \]
\[ \begin{align*} E(v) &= p^2(-2q^2d) + 2pq(2pqd) + q^2(-2p^2d) \\ &= -2p^2q^2d + 4p^2q^2d -2p^2q^2d\\ &= 0 \end{align*} \]
- Variances
Following the properties of the variance, we can define the additive and dominance variance as follows
\[ \begin{align*} \operatorname{var}(A) &= \operatorname{var}(u) = E[(u - E(u))^2] = E(u^2) - \cancel{[E(u)]^2} \\ &= p^2(2q\alpha)^2 + 2pq((q-p)\alpha)^2 + q^2(-2p\alpha)^2\\ &= 4p^2q^2\alpha^2 + 2pq(q-p)^2\alpha^2 + 4p^2q^2\alpha^2 \\ &= 2pq\alpha^2(2pq + (q-p)^2 + 2pq) \\ &= 2pq\alpha^2(2pq + p^2 + q^2 -2pq + 2pq) \\ &= 2pq\alpha^2(p^2 + q^2 + 2pq) \\ &= 2pq\alpha^2 = 2pq[a + d(q-p)]^2 \end{align*} \]
\[ \begin{align*} \operatorname{var}(D) &= \operatorname{var}(v) = E[(v - E(v))^2] = E(v^2) - \cancel{[E(v)]^2} \\ &= p^2(-2q^2d)^2 + 2pq(2pqd)^2 + q^2(-2p^2d)^2 \\ &= 4p^2q^4d^2 + 8p^3q^3d^2 + 4p^4q^2d^2 \\ &= 4p^2q^2d^2(q^2 + 2pq + p^2) \\ &= 4p^2q^2d^2 = (2pqd)^2 \end{align*} \]
2) Genotypic Parametrization
Pure additive values
A different approach is to define the genotypic value as
\[ G = E(G) + u^* + v^* \]
Where \(u^*\) and \(v^*\) are no longer the breeding values and dominance deviations but pure additive and pure dominance values. Given that we aim to partition the effects into pure additive and pure dominance effects, \(u^*\) no longer contains any dominance. An easy way to summarize this is to remove \(d\) from \(\alpha\), so that \(\alpha = a\). Then, we can substitute to obtain
\[ u^* = \begin{cases} -2pa\\ (q-p)a\\ 2qa \end{cases} \]
Note that the covariate is still the \(\mathbf{Z}\) matrix of the breeding parametrization, so we can express \(\mathbf{u^*} = \mathbf{Z}a\).
Pure dominance values
The pure dominance values are derived by subtracting the population mean and the pure additive values from the genotypic values. Therefore
\[ v^* = \begin{cases} -2pqd\\ d(1 -2pq)\\ -2pqd \end{cases} \]
So that \(\mathbf{v^*} = \mathbf{H}d\), where \(\mathbf{H} = (h_1 \dots h_n)\) and
\[ h_i = \begin{cases} -2pq, & \text{for } A_1A_1\\ 1 -2pq, & \text{for } A_1A_2\\ -2pq, & \text{for } A_2A_2 \end{cases} \]
It is important to notice that \(\mathbf{H} \neq \mathbf{W}\). However, it can be proved that the expected values of \(u^*\) and \(v^*\) are still \(0\). Now, the genetic variance equals
\[ \operatorname{var}(G) = 2pqa^2 + 2pq[ 1 - 2pq ]d^2 \]
And the covariances between additive values are computed as
\[ \begin{align*} \operatorname{cov}(\mathbf{u^*}) &= \mathbf{ZZ'}\sigma^2_{a} \\ &= \frac{\mathbf{ZZ'}}{2pq}\sigma^2_{A^*} \\ &= \mathbf{G}\sigma^2_{A^*} \end{align*} \]
Notice that the Additive GRM follows the same definition for both parametrizations. Let’s show how this is not the case for the Dominance GRM
\[ \begin{align*} \operatorname{cov}(\mathbf{v^*}) &= \mathbf{HH'}\sigma^2_{d} \\ &= \frac{\mathbf{HH'}}{2pq(1 - 2pq)}\sigma^2_{D^*} \\ &= \mathbf{D^*}\sigma^2_{D^*} \end{align*} \]
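A minimal sketch of the dominance relationship matrix of the genotypic parametrization follows. It assumes genotypes stored as dosages (0, 1, 2), allele frequencies estimated from the sample, and a denominator summed over markers, \(\sum_k 2p_kq_k(1 - 2p_kq_k)\), as the multi-marker counterpart of the single-locus scaling above. The function name is hypothetical.

```python
def genotypic_dominance_matrix(dosages):
    """dosages[i][k]: copies of one allele (0, 1, 2) of individual i at marker k."""
    n, m = len(dosages), len(dosages[0])
    p = [sum(ind[k] for ind in dosages) / (2 * n) for k in range(m)]
    two_pq = [2 * pk * (1 - pk) for pk in p]
    # h = 1 - 2pq for heterozygotes and -2pq for both homozygotes
    H = [[(1 - two_pq[k]) if ind[k] == 1 else -two_pq[k] for k in range(m)]
         for ind in dosages]
    # Assumed multi-marker scaling; it is zero if every marker is monomorphic
    scale = sum(t * (1 - t) for t in two_pq)
    return [[sum(x * y for x, y in zip(r1, r2)) / scale for r2 in H] for r1 in H]

# Four hypothetical individuals at one marker, in Hardy-Weinberg proportions
dosages = [[0], [1], [1], [2]]
D_star = genotypic_dominance_matrix(dosages)
print(D_star[0])
```

Because the coding for the two homozygotes is identical, this matrix does not depend on which allele is counted.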
- Pure Dominance Values
Under the genotypic parametrization, \(v^*\) is computed in the same way as \(v\) in the breeding parametrization, although now, by definition, \(v^* = G - (\mu + u^*)\). Although the population mean remains the same, now \(u^* \neq u\), so
\[ \begin{align*} v^*_{A_1A_1} &= -a - \big(a(p-q) + 2pq d\big) + 2pa \\ &= -a -ap + aq -2pqd + 2pa \\ &= -a + a(p + q) -2pqd \\ &= -2pqd \end{align*} \]
\[ \begin{align*} v^*_{A_1A_2} &= d - \big(a(p-q) + 2pq d\big) - (q-p)a \\ &= d -ap + aq -2pqd -aq + ap \\ &= d(1 -2pq) \end{align*} \]
\[ \begin{align*} v^*_{A_2A_2} &= a - \big(a(p-q) + 2pq d\big) -2qa \\ &= a - ap + aq -2pqd -2qa \\ &= a -ap -2pqd -qa \\ &= a -a(p + q) -2pqd \\ &= -2pqd \end{align*} \]
- Expectations
The expectations of \(u^*\) and \(v^*\) still equal 0
\[ \begin{align*} E(u^*) &= p^2(2qa) + 2pq(q-p)a + q^2(-2pa) \\ &= 2p^2qa + 2pq^2a - 2p^2qa -2pq^2a\\ &= 0 \end{align*} \]
\[ \begin{align*} E(v^*) &= p^2(-2pqd) + 2pq((1-2pq)d) + q^2(-2pqd) \\ &= -2p^3qd + 2pqd - 4p^2q^2d -2pq^3d\\ &= 2pqd(-p^2 + 1 - 2pq - q^2)\\ &= 2pqd(-1 + 1) = 0 \end{align*} \]
- Variances
By definition, the variances under the genotypic parametrization can be defined as
\[ \begin{align*} Var(A^*) = \sigma^2_{A^*} &= E(u^{*2}) - \cancel{E(u^*)^2} \\ &= p^2(2qa)^2 + 2pq((q-p)a)^2 + q^2(-2pa)^2 \\ &= 4p^2q^2a^2 + 2pq(q-p)^2a^2 + 4p^2q^2a^2 \\ &= 2pqa^2(2pq + (q-p)^2 + 2pq) \\ &= 2pqa^2(q^2 + p^2 +2pq)\\ &= 2pqa^2 \end{align*} \]
\[ \begin{align*} Var(D^*) = \sigma^2_{D^*} &= E(v^{*2}) - \cancel{E(v^*)^2} \\ &= p^2(-2pq\,d)^2 + 2pq((1-2pq)\,d)^2 + q^2(-2pq\,d)^2 \\ &= d^2[\,p^2(-2pq)^2 + 2pq(1-2pq)^2 + q^2(-2pq)^2\,] \\ &= d^2[\,4p^4q^2 + 2pq(1 - 4pq + 4p^2q^2) + 4p^2q^4\,] \\ &= d^2[\,4p^2q^2(p^2+q^2) + 2pq(1 - 4pq + 4p^2q^2)\,] \\ &= 2pq[\,2pq(p^2+q^2) + (1 - 4pq + 4p^2q^2)\,]d^2 \\ &= 2pq[\,2pq(1-2pq) + 1 - 4pq + 4p^2q^2\,]d^2 \\ &= 2pq[\,(2pq - 4p^2q^2) + (1 - 4pq + 4p^2q^2)\,]d^2 \\ &= 2pq[1 - 2pq]d^2 \end{align*} \]
3) Summary
Briefly, the additive relationship matrix \(\mathbf{G}\) is the same under both parametrizations, so \(\operatorname{cov}(\mathbf{u^*})\) and \(\operatorname{cov}(\mathbf{u})\) differ only in the variance component, but \(\operatorname{cov}(\mathbf{v^*}) \neq \operatorname{cov}(\mathbf{v})\), given that \(\mathbf{D^*} \neq \mathbf{D}\). Moreover, \(\sigma^2_A + \sigma^2_D = \sigma^2_{A^*} + \sigma^2_{D^*}\). Thus, both approaches explain the data, but they must be interpreted differently.
Finally, we want to highlight that, from a practical point of view, the conversion from genotypic to breeding values is easy: we can use \(a\) and \(d\) estimated from the model and the allelic frequencies \(p\) and \(q\) from the original marker matrix to build \(\alpha = a + d(q - p)\). Then we simply multiply \(\mathbf{Z}\alpha\) to obtain \(\mathbf{u}\).
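The conversion can be sketched as follows. This is an illustration only: it assumes that marker effects `a` and `d` have already been estimated by some genotypic model, that genotypes are stored as dosages of the allele with frequency \(p\), and that frequencies are estimated from the same marker matrix. The function name and all numeric values are hypothetical.

```python
from fractions import Fraction

def breeding_values(dosages, a, d):
    """u = Z alpha, with alpha_k = a_k + d_k (q_k - p_k) for every marker k."""
    n, m = len(dosages), len(dosages[0])
    p = [Fraction(sum(ind[k] for ind in dosages), 2 * n) for k in range(m)]
    alpha = [a[k] + d[k] * ((1 - p[k]) - p[k]) for k in range(m)]
    # Each row of Z is the centered dosage x - 2p
    return [sum((ind[k] - 2 * p[k]) * alpha[k] for k in range(m)) for ind in dosages]

# Hypothetical dosages (4 individuals x 2 markers) and estimated marker effects
dosages = [[0, 2], [1, 2], [1, 1], [2, 1]]
a = [Fraction(1), Fraction(2)]
d = [Fraction(1, 2), Fraction(1)]
u = breeding_values(dosages, a, d)
print(u)
```

Since the columns of \(\mathbf{Z}\) are centered with the sample frequencies, the resulting breeding values sum to zero over the individuals used to estimate \(p\).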
- Variance correspondence
Taking into account that
\[ \begin{align*} \sigma^2_A &= 2pq \, \sigma_a^2 + 2pq (q - p)^2 \, \sigma_d^2 \\ \sigma^2_{A^*} &= 2pq \, \sigma_a^2 \\ \sigma^2_D &= (2pq)^2 \, \sigma_d^2 \\ \sigma^2_{D^*} &= 2pq (1 - 2pq) \, \sigma_d^2 \end{align*} \]
We can then demonstrate that
\[ \begin{align*} \sigma^2_A + \sigma^2_D &= 2pq\sigma^2_a + 2pq(q-p)^2\sigma^2_d + (2pq)^2\sigma^2_d \\ &= 2pq\sigma^2_a + \sigma^2_d(2pq(q-p)^2 + (2pq)^2)\\ &= 2pq\sigma^2_a + \sigma^2_d(2pq(q^2 + p^2 -2pq) + 4p^2q^2) \\ &= 2pq\sigma^2_a + \sigma^2_d(2pq^3 + 2p^3q -4p^2q^2 + 4p^2q^2) \\ &= 2pq\sigma^2_a + \sigma^2_d(2pq(q^2 + p^2))\\ &= 2pq\sigma^2_a + \sigma^2_d(2pq(1-2pq)) \\ &= 2pq(\sigma^2_a + \sigma^2_d(1-2pq)) \end{align*} \]
\[ \begin{align*} \sigma^2_{A^*} + \sigma^2_{D^*} &= 2pq \, \sigma_a^2 + 2pq (1 - 2pq) \, \sigma_d^2 \\ &= 2pq(\sigma^2_a + \sigma^2_d(1-2pq)) \end{align*} \]
And therefore
\[ \sigma^2_A + \sigma^2_D = \sigma^2_{A^*} + \sigma^2_{D^*} = 2pq(\sigma^2_a + \sigma^2_d(1-2pq)) \]
This means that both parametrizations are equivalent.
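The equality of the two sums can also be checked over a grid of allele frequencies. A short sketch, where the function name and the values chosen for \(\sigma^2_a\) and \(\sigma^2_d\) are arbitrary assumptions:

```python
from fractions import Fraction

def variance_sums(p, var_a, var_d):
    """Total genetic variance under the breeding and genotypic parametrizations."""
    q = 1 - p
    breeding = 2*p*q*var_a + 2*p*q*(q - p)**2*var_d + (2*p*q)**2*var_d
    genotypic = 2*p*q*var_a + 2*p*q*(1 - 2*p*q)*var_d
    return breeding, genotypic

var_a, var_d = Fraction(3), Fraction(5, 4)       # arbitrary variance components
grid = [Fraction(k, 20) for k in range(1, 20)]   # p = 0.05, 0.10, ..., 0.95
results = [variance_sums(p, var_a, var_d) for p in grid]
print(all(b == g for b, g in results))
```

The individual components differ between parametrizations unless \(p = q\); only their sum is invariant.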
Autotetraploid Theory
The theoretical derivation that allows the variance to be partitioned into additive and dominance components has been made under the breeding framework. No derivation has been published to partition genetic values into pure additive and pure dominance effects in polyploids.
The increased level of complexity caused by the larger number of alleles per locus is illustrated in the definition of the genotypic value. Not only does the number of additive effects increase, but also the number of interactions between alleles. Disregarding trigenic and quadrigenic effects, it can be defined as
\[ G_{ijkl} = \mu + \alpha_i + \alpha_j + \alpha_k + \alpha_l + \beta_{ij} + \beta_{ik} + \beta_{il} + \beta_{jk} + \beta_{jl} + \beta_{kl} \]
Again, the residuals from the regression equation for additive effects, \(y_{ijkl} = G_{ijkl} - (\mu + \alpha_i + \alpha_j + \alpha_k + \alpha_l)\), are what we call the dominance deviations.
As stated in Endelman et al, 2018, the breeding value (BV) of an individual is defined as twice the mean genotypic value of its progeny relative to the population mean. Under the model assumptions, all six possible gene pairs for tetraploid genotype ijkl have equal frequency in its gametes which leads to the following expression
\[ BV_{ijkl} = 2[\frac{1}{6}(G_{ij..} + G_{ik..} + G_{il..} + G_{jk..} + G_{jl..} + G_{kl..}) - \mu] \]
Taking into account that, as previously explained
\[ \beta_{ij} = G_{ij..} - \mu - \alpha_i - \alpha_j \]
We can solve for \(G_{ij..} = \beta_{ij} + \alpha_i + \alpha_j + \mu\). Therefore
\[ \begin{align*} BV_{ijkl} &= 2\left[\frac{1}{6}\Big( (\beta_{ij} + \alpha_i + \alpha_j + \mu) + (\beta_{ik} + \alpha_i + \alpha_k + \mu) + (\beta_{il} + \alpha_i + \alpha_l + \mu) \right.\\ &\qquad\qquad\left.+ (\beta_{jk} + \alpha_j + \alpha_k + \mu) + (\beta_{jl} + \alpha_j + \alpha_l + \mu) + (\beta_{kl} + \alpha_k + \alpha_l + \mu) \Big) - \mu \right] \\[6pt] &= 2\left[\frac{1}{6}\Big( 6\mu + 3(\alpha_i + \alpha_j + \alpha_k + \alpha_l) + \beta_{ij} + \beta_{ik} + \beta_{il} + \beta_{jk} + \beta_{jl} + \beta_{kl} \Big) - \mu \right] \\[6pt] &= (\alpha_i + \alpha_j + \alpha_k + \alpha_l) + \frac{1}{3}\left( \beta_{ij} + \beta_{ik} + \beta_{il} + \beta_{jk} + \beta_{jl} + \beta_{kl} \right) \\[6pt] &= u + \frac{1}{3}v . \end{align*} \]
Therefore, the breeding value equals the total additive value \(u\) plus 1/3 of the total digenic dominance \(v\).
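The algebra above can be checked by averaging over the six gene pairs of a genotype explicitly. The sketch below is illustrative: the function name, the allele labels `B`/`b` and all numeric effects are assumptions, and digenic effects are stored in a dictionary keyed by the sorted allele pair.

```python
from fractions import Fraction
from itertools import combinations

def breeding_value(genotype, mu, alpha, beta):
    """BV = 2 * (mean over the six gene pairs of G_ij.. - mu)."""
    pairs = [tuple(sorted(pair)) for pair in combinations(genotype, 2)]
    # G_ij.. = mu + alpha_i + alpha_j + beta_ij for every possible gamete
    G = [mu + alpha[i] + alpha[j] + beta[(i, j)] for i, j in pairs]
    return 2 * (sum(G) / len(pairs) - mu), pairs

# Hypothetical effects for a biallelic locus
mu = Fraction(10)
alpha = {"B": Fraction(3, 2), "b": Fraction(-1, 2)}
beta = {("B", "B"): Fraction(-9, 8), ("B", "b"): Fraction(3, 8), ("b", "b"): Fraction(-1, 8)}

genotype = "BBbb"
bv, pairs = breeding_value(genotype, mu, alpha, beta)
u = sum(alpha[allele] for allele in genotype)   # total additive value
v = sum(beta[pair] for pair in pairs)           # total digenic dominance
print(bv, u + v / 3)
```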
Based on a series of equations developed in the cited article, the constraint \(0 = \sum_k p_k\beta_{ik}\) yields the following two equalities
\[ p\beta_{BB} + q\beta_{Bb} = 0 \]
\[ p\beta_{Bb} + q\beta_{bb} = 0 \]
To solve this system of equations, the parameter \(\beta = \beta_{Bb} - \frac{1}{2}(\beta_{bb} + \beta_{BB})\) is introduced, producing the following set of equalities
\[ \beta_{bb} = -2p^2\beta \]
\[ \beta_{Bb} = 2pq\beta \]
\[ \beta_{BB} = -2q^2\beta \]
Thus, we can calculate the dominance deviation for the five possible genotypes of an autotetraploid (aaaa, Aaaa, AAaa, AAAa, AAAA, where A and a denote the alleles B and b) by counting all digenic interactions and substituting the previous equalities
\[ v = \begin{cases} v_{aaaa} = 6\beta_{bb} = -12p^2\beta \\[6pt] v_{Aaaa} = 3\beta_{bb} + 3\beta_{Bb} = -6p^2\beta + 6pq\beta \\[6pt] v_{AAaa} = \beta_{bb} + 4\beta_{Bb} + \beta_{BB} = -2p^2\beta + 8pq\beta - 2q^2\beta \\[6pt] v_{AAAa} = 3\beta_{BB} + 3\beta_{Bb} = -6q^2\beta + 6pq\beta \\[6pt] v_{AAAA} = 6\beta_{BB} = -12q^2\beta \end{cases} \]
If we substitute \(q\) by \((1 - p)\) we obtain the following simplification
\[ v = \begin{cases} v_{aaaa} = \beta(-12p^2)\\ v_{Aaaa} = \beta(-12p^2 + 6p)\\ v_{AAaa} = \beta(-12p^2 + 12p -2)\\ v_{AAAa} = \beta(-12p^2 + 18p -6)\\ v_{AAAA} = \beta(-12p^2 + 24p - 12)\\ \end{cases} \]
This expression can be generalized to any ploidy \(\phi\) and dosage \(X\) of allele \(A\), such that
\[ v = [-2\binom{\phi}{2}p^2 + 2p(\phi - 1)X -X(X-1)]\beta = Q\beta \]
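The general coefficient can be compared with a direct count of digenic interactions. In this sketch, `q_closed_form` implements the bracketed expression above and `q_by_counting` sums the \(\binom{X}{2}\) pairs of type \(BB\), the \(X(\phi - X)\) pairs of type \(Bb\) and the \(\binom{\phi - X}{2}\) pairs of type \(bb\), using \(\beta_{BB} = -2q^2\beta\), \(\beta_{Bb} = 2pq\beta\) and \(\beta_{bb} = -2p^2\beta\) with \(\beta = 1\). Both function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def q_closed_form(X, ploidy, p):
    """Coefficient Q of beta for dosage X, from the general expression."""
    return -2 * comb(ploidy, 2) * p**2 + 2 * p * (ploidy - 1) * X - X * (X - 1)

def q_by_counting(X, ploidy, p):
    """Same coefficient obtained by counting BB, Bb and bb allele pairs."""
    q = 1 - p
    return (comb(X, 2) * (-2 * q**2)
            + X * (ploidy - X) * (2 * p * q)
            + comb(ploidy - X, 2) * (-2 * p**2))

p = Fraction(1, 2)  # illustrative allele frequency
tetraploid = [q_closed_form(X, 4, p) for X in range(5)]
diploid = [q_closed_form(X, 2, p) for X in range(3)]
print(tetraploid)
print(diploid)
```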
Notice that, for a diploid species (\(\phi = 2\)), this expression reduces to the coefficients \(w_i\) of the diploid breeding parametrization
\[ v_{\phi = 2} = \begin{cases} v_{aa} = -2p^2\beta\\ v_{Aa} = 2pq\beta\\ v_{AA} = -2q^2\beta \end{cases} \]
The digenic dominance matrix \(D_{ij}\) is defined similarly, as the covariance between dominance values relative to the dominance genetic variance, based on the expectation with respect to \(\beta \sim N(0, \sigma^2_{\beta})\)
\[ D_{ij} = \sigma_D^{-2}\operatorname{cov}[v_i, v_j] = \sigma_D^{-2}\operatorname{cov}[Q_i\beta, Q_j\beta] = \sigma_D^{-2}\sigma_{\beta}^{2}Q_iQ_j \] \[ \sigma^2_D = E[v^2] - E[v]^2 = E[v^2] = \binom{\phi}{2} 4\, p^2 q^2 \, \beta^2 \]
Therefore, the correct scaling of the digenic dominance GRM is the following
\[ \mathbf{D} = \frac{\mathbf{QQ^T}}{\binom{\phi}{2} \sum_k 4\, p_k^2 q_k^2} \]
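A minimal sketch of this matrix for an arbitrary ploidy is given below. It assumes genotypes stored as dosages of the allele with frequency \(p\), frequencies estimated from the sample, and no missing data; the function name and the toy data are hypothetical.

```python
from math import comb

def digenic_dominance_matrix(dosages, ploidy=4):
    """dosages[i][k]: dosage (0..ploidy) of individual i at marker k."""
    n, m = len(dosages), len(dosages[0])
    p = [sum(ind[k] for ind in dosages) / (ploidy * n) for k in range(m)]
    c = comb(ploidy, 2)
    # Q coefficient of every individual at every marker
    Q = [[-2 * c * p[k]**2 + 2 * p[k] * (ploidy - 1) * ind[k] - ind[k] * (ind[k] - 1)
          for k in range(m)] for ind in dosages]
    # Denominator: binom(ploidy, 2) * sum_k 4 p_k^2 q_k^2 (zero if all markers are fixed)
    scale = c * sum(4 * pk**2 * (1 - pk)**2 for pk in p)
    return [[sum(x * y for x, y in zip(r1, r2)) / scale for r2 in Q] for r1 in Q]

# Sixteen hypothetical tetraploids at one marker, in binomial proportions for p = 0.5
dosages = [[0]] + [[1]] * 4 + [[2]] * 6 + [[3]] * 4 + [[4]]
D = digenic_dominance_matrix(dosages)
mean_diagonal = sum(D[i][i] for i in range(len(D))) / len(D)
print(mean_diagonal)
```

In this toy sample the genotype counts follow the binomial expectation, so the diagonal of the matrix averages 1, which is the purpose of the scaling.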