The QR decomposition (also called the QR factorization) of a matrix is a decomposition of the matrix into an orthogonal matrix and a triangular matrix. A QR decomposition of a real square matrix \(A\) is a decomposition of \(A\) as \(A = QR\), where \(Q\) is an orthogonal matrix (i.e. \(Q^\top Q = I\)) and \(R\) is an upper triangular matrix. If \(A\) is nonsingular, this factorization is unique once we require the diagonal entries of \(R\) to be positive. (Definition after Yanovsky, UCLA.)
We start by factoring the \(n \times p\) design matrix \(X\) (assumed to have full column rank) as \[X = QR,\] where \(Q\) is \(n \times p\) with orthonormal columns (the thin QR) and \(R\) is \(p \times p\) upper triangular.
Substituting this factorization into the normal equations, \[X^\top X \beta = X^\top y,\] we obtain \[(R^\top Q^\top)(Q R)\beta = R^\top Q^\top y.\] Because \(Q^\top Q = I\), this simplifies to \[R^\top R \beta = R^\top Q^\top y.\] Pre-multiplying both sides by \((R^\top)^{-1}\) — valid because \(R\) is invertible when \(X\) has full column rank — gives \[R \beta = Q^\top y.\] Finally, since \(R\) is upper triangular, we can solve for \(\beta\) efficiently using back substitution.
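The derivation above translates directly into code. Here is a minimal sketch using NumPy's `np.linalg.qr` (which returns the thin factorization by default), with the back-substitution step written out explicitly to mirror the final equation \(R\beta = Q^\top y\); the function name and the random test data are illustrative, not from the text:

```python
import numpy as np

def qr_least_squares(X, y):
    """Solve min ||X b - y||_2 via the thin QR factorization.

    Assumes X has full column rank, so R is invertible.
    """
    Q, R = np.linalg.qr(X)          # thin QR: X = Q R, Q is n x p, R is p x p
    z = Q.T @ y                     # right-hand side of R beta = Q^T y
    p = R.shape[0]
    beta = np.zeros(p)
    # Back substitution on the upper-triangular system R beta = z,
    # working from the last row upward.
    for i in range(p - 1, -1, -1):
        beta[i] = (z[i] - R[i, i + 1:] @ beta[i + 1:]) / R[i, i]
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = rng.standard_normal(50)
beta = qr_least_squares(X, y)
```

In practice you would call a library solver (e.g. `np.linalg.lstsq` or `scipy.linalg.solve_triangular`) rather than hand-rolling the loop; the loop is shown only to make the back-substitution step concrete.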
The normal-equations method uses an explicit \((X^\top X)^{-1}\); it’s concise, but forming \(X^\top X\) squares the condition number of \(X\), so it can be numerically unstable when \(X\) is ill-conditioned.
The QR method avoids forming \((X^\top X)^{-1}\) directly and is typically more stable; most linear model software uses QR (or related decompositions) under the hood.