Proof. Geometrically, we start with the line $y = Ax + B$. The vertical distance $d_k$ from the data point $(x_k, y_k)$ to the point $(x_k, Ax_k + B)$ on the line is $d_k = Ax_k + B - y_k$. We must minimize the sum of the squares of the vertical distances $d_k$:

$$\sum_k d_k^2 = \sum_k (Ax_k + B - y_k)^2.$$
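
The quantity being minimized can be checked numerically. The following is a minimal Python sketch, not part of the original proof: the function names `sum_sq_vertical` and `fit_line` and the sample data are hypothetical. It assumes the standard closed-form solution obtained by setting the partial derivatives of the sum with respect to A and B to zero (the normal equations).

```python
def sum_sq_vertical(A, B, xs, ys):
    """Sum of the squared vertical distances d_k = A*x_k + B - y_k."""
    return sum((A * x + B - y) ** 2 for x, y in zip(xs, ys))


def fit_line(xs, ys):
    """Return (A, B) minimizing sum_sq_vertical via the 2x2 normal equations."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # The determinant vanishes only when all x_k are equal (vertical data).
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values coincide; the line is not determined")
    A = (n * sxy - sx * sy) / det
    B = (sxx * sy - sx * sxy) / det
    return A, B


# Illustrative data only (made up for this sketch).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
A, B = fit_line(xs, ys)
best = sum_sq_vertical(A, B, xs, ys)
```

Perturbing either coefficient away from the fitted values, for example `sum_sq_vertical(A + 0.1, B, xs, ys)`, gives a larger sum than `best`, which is what the minimization claims.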