Initial performance testing was interesting. I set up three extreme cases to exercise different characteristics:
A test of the non-textured stencil shadow speed showed a GF3 about 20% faster than the 8500. I believe that Nvidia has a slightly higher-performance memory architecture.
A test of light interaction speed initially had the 8500 significantly slower than the GF3, which was shocking given the difference in pass count. ATI identified some driver issues, and the speed came around so that the 8500 was faster in all combinations of texture attributes, in some cases by 30% or more. This was about what I expected, given the large savings in memory traffic from doing everything in a single pass.
A high polygon count scene that was more representative of real game graphics under heavy load gave a surprising result. I was expecting ATI to clobber Nvidia here due to the much lower triangle count and MUCH lower state change functional overhead from the single pass interaction rendering, but they came out slower. ATI has identified an issue that is likely causing the unexpected performance, but it may not be something that can be worked around on current hardware.
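As a rough illustration of the memory traffic argument in the light interaction case, the sketch below models framebuffer traffic per pixel as a function of pass count. Everything in it is an assumption made for illustration: the function name, the 4-byte color and depth/stencil sizes, and the three-pass comparison are hypothetical and are not measurements of either card. It also ignores texture fetches, caching, and any hardware compression, so it only shows why fewer passes should mean less framebuffer traffic, not how much faster a given card will be.

```python
# Rough per-pixel framebuffer traffic model for one light interaction.
# All byte sizes and pass counts are illustrative assumptions,
# not measurements of any real card.

def framebuffer_bytes_per_pixel(passes, color_bytes=4, depth_stencil_bytes=4):
    """Bytes of framebuffer traffic per pixel for one light interaction.

    Each pass is assumed to read depth/stencil once for the test and to
    read and write the color buffer once for additive blending.
    """
    per_pass = depth_stencil_bytes + 2 * color_bytes
    return passes * per_pass

single_pass = framebuffer_bytes_per_pixel(passes=1)
three_pass = framebuffer_bytes_per_pixel(passes=3)  # hypothetical multipass path
print(single_pass, three_pass, three_pass / single_pass)
```

Under these assumptions the framebuffer traffic scales linearly with pass count, which is consistent with the single pass path coming out ahead once the driver issues were fixed, and also with the high polygon count result being a surprise.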
I can set up scenes and parameters where either card can win, but I think that current Nvidia cards are still a somewhat safer bet for consistent performance and quality.