
Nick,
I used to interpolate gradients along the edge myself, but I never got rid of the overflow/underflow that resulted from roundoff errors(?).
Then I changed the way I calculate subtexel "corrected" gradients: I don't interpolate them at all. I calculate constant dudx and dudy ( for the u coordinate, for example ) for the whole triangle. Ok, so far this is oldskool, and errors can still result.
If you know the constant gradients along the X and Y axes, and the pixel's coordinates ( as integers!!! ), you can do this:
u_at_origin + ix * dudx + iy * dudy;
And you get a perfect solution. I'm not sure if this is the best possible, but I haven't had under/overflows ever since.
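To make that concrete, here's a rough sketch in C. The names ( Gradient, gradient_from_vertex, gradient_at ) are made up for illustration, and deriving u_at_origin by extrapolating back from one vertex is my assumption about how you'd set it up, not something spelled out above:

```c
#include <assert.h>

/* Hypothetical container for one interpolated attribute (u here). */
typedef struct {
    float at_origin; /* u extrapolated to pixel (0, 0) */
    float ddx;       /* constant du/dx for the whole triangle */
    float ddy;       /* constant du/dy for the whole triangle */
} Gradient;

/* Assumption: u_at_origin is found by walking back from a vertex at
   subpixel position (vx, vy) with value vu along the constant gradients. */
Gradient gradient_from_vertex(float vx, float vy, float vu,
                              float dudx, float dudy)
{
    Gradient g;
    g.at_origin = vu - vx * dudx - vy * dudy;
    g.ddx = dudx;
    g.ddy = dudy;
    return g;
}

/* u_at_origin + ix * dudx + iy * dudy, evaluated from integer pixel
   coordinates. Nothing is carried over from a previous pixel. */
float gradient_at(const Gradient *g, int ix, int iy)
{
    assert(g != 0);
    return g->at_origin + (float)ix * g->ddx + (float)iy * g->ddy;
}
```

The vertex position can be fractional while ix and iy stay integers, so the subpixel offset ends up folded into at_origin.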
Now we see an obvious optimization:
iy * dudy is just a running sum, so for each scanline you can replace it with:
foo += dudy;
If you initialize foo with:
foo = dudy * iy_at_top;
You get the interpolation. You still need to do the ix*dudx calculation and the add, but those can be done in any precision you like: float, fixedpoint.. ( float if you want to do subaffine perspective correction, but that's another story ;)
So we're left with a multiply+add per scanline per gradient. This isn't so bad, as you'd be doing it anyway if you did the "subtexel" adjustment "manually", heh. Now we're correcting automatically to the center of the pixel ( or anywhere you like ) without even thinking about "subpixels"; it just happens when it's done right from square one.
.. hope my explanation wasn't too vague, it should be fairly clear if you read it more than once.
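The per-scanline version might look roughly like this in C. The function name scanline_start_values and the ix_left / u_start arrays are hypothetical, and I'm assuming the caller already knows the integer x of the first pixel on each scanline:

```c
#include <assert.h>

/* Compute u at the first pixel of each scanline.
   ix_left[r] is the integer x of the leftmost pixel on row r (assumed
   to be supplied by the edge walker); u_start[r] receives the result. */
void scanline_start_values(float u_at_origin, float dudx, float dudy,
                           int iy_at_top, int rows,
                           const int *ix_left, float *u_start)
{
    assert(ix_left != 0 && u_start != 0 && rows >= 0);

    float foo = dudy * (float)iy_at_top; /* replaces iy * dudy */

    for (int r = 0; r < rows; ++r) {
        /* one multiply + add per scanline for this gradient */
        u_start[r] = u_at_origin + (float)ix_left[r] * dudx + foo;
        foo += dudy; /* step to the next scanline */
    }
}
```

Stepping u across the span is left out; that part can run in whatever precision you prefer. Note that foo is still a running sum, so in float it may drift slightly from iy * dudy over many scanlines when dudy isn't exactly representable.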
