Render State Change Costs?
Sybixsus
10-01-2006, 06:45 PM
Is there any information available on the relative cost of render state changes? I'm currently working on the assumption that texture is the costliest and color is the next costliest, and not really bothering with anything beyond that, but if I had some kind of way to compare the state changes, perhaps that would change and optimize my rendering a bit.
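The ordering described above (texture as the primary key, color as the secondary key) can be sketched roughly as follows. This is only an illustration, not Direct3D code: `DrawItem`, `sortByState` and `countChanges` are made-up names, and the integer IDs stand in for real texture and color state.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical draw record: the IDs stand in for real texture and color state.
struct DrawItem {
    std::uint32_t texture;  // assumed to be the costliest state to change
    std::uint32_t color;    // assumed to be the next costliest
};

// Order draws so the state assumed costliest changes least often:
// texture is the primary sort key, color the secondary one.
inline void sortByState(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
        [](const DrawItem& a, const DrawItem& b) {
            return std::tie(a.texture, a.color) < std::tie(b.texture, b.color);
        });
}

struct ChangeCount {
    int texture = 0;
    int color = 0;
};

// Count how often each state would have to be set when drawing in order.
// The first item always counts as one change for each state.
inline ChangeCount countChanges(const std::vector<DrawItem>& items) {
    ChangeCount n;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].texture != items[i - 1].texture) ++n.texture;
        if (i == 0 || items[i].color != items[i - 1].color) ++n.color;
    }
    return n;
}
```

Sorting on the primary key can increase the number of changes of the secondary key, so the order of the keys only pays off if the assumed cost ranking is right for the target hardware.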
impossible
10-01-2006, 10:32 PM
Quote: "Is there any information available on the relative cost of render state changes?"
From the DX9 SDK (Accurately Profiling Direct3D API calls (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/directx9_c/Accurately_Profiling_Direct3D_API_Calls.asp))...
API call (all-caps entries are individual render, sampler or texture stage states) - average number of cycles (low - high)
SetVertexDeclaration 6500 - 11250
SetFVF 6400 - 11200
SetVertexShader 3000 - 12100
SetPixelShader 6300 - 7000
SPECULARENABLE 1900 - 11200
SetRenderTarget 6000 - 6250
SetPixelShaderConstant (1 Constant) 1500 - 9000
NORMALIZENORMALS 2200 - 8100
LightEnable 1300 - 9000
SetStreamSource 3700 - 5800
LIGHTING 1700 - 7500
DIFFUSEMATERIALSOURCE 900 - 8300
AMBIENTMATERIALSOURCE 900 - 8200
COLORVERTEX 800 - 7800
SetLight 2200 - 5100
SetTransform 3200 - 3750
SetIndices 900 - 5600
AMBIENT 1150 - 4800
SetTexture 2500 - 3100
SPECULARMATERIALSOURCE 900 - 4600
EMISSIVEMATERIALSOURCE 900 - 4500
SetMaterial 1000 - 3700
ZENABLE 700 - 3900
WRAP0 1600 - 2700
MINFILTER 1700 - 2500
MAGFILTER 1700 - 2400
SetVertexShaderConstant (1 Constant) 1000 - 2700
COLOROP 1500 - 2100
COLORARG2 1300 - 2000
COLORARG1 1300 - 1980
CULLMODE 500 - 2570
CLIPPING 500 - 2550
DrawIndexedPrimitive 1200 - 1400
ADDRESSV 1090 - 1500
ADDRESSU 1070 - 1500
DrawPrimitive 1050 - 1150
SRGBTEXTURE 150 - 1500
STENCILMASK 570 - 700
STENCILZFAIL 500 - 800
STENCILREF 550 - 700
ALPHABLENDENABLE 550 - 700
STENCILFUNC 560 - 680
STENCILWRITEMASK 520 - 700
STENCILFAIL 500 - 750
ZFUNC 510 - 700
ZWRITEENABLE 520 - 680
STENCILENABLE 540 - 650
STENCILPASS 560 - 630
SRCBLEND 500 - 685
TWOSIDEDSTENCILMODE 450 - 590
ALPHATESTENABLE 470 - 525
ALPHAREF 460 - 530
ALPHAFUNC 450 - 540
DESTBLEND 475 - 510
COLORWRITEENABLE 465 - 515
CCW_STENCILFAIL 340 - 560
CCW_STENCILPASS 340 - 545
CCW_STENCILZFAIL 330 - 495
SCISSORTESTENABLE 375 - 440
CCW_STENCILFUNC 250 - 480
SetScissorRect 150 - 340
These numbers vary with the hardware and drivers. In theory you could read the article and profile your own target hardware. The best advice is to have as few batches as possible. Simply setting a texture or a render state might not have much of an effect on its own, because drivers can ignore redundant state changes, defer state changes, and so on.
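To illustrate what "ignoring redundant state changes" means, here is a minimal sketch of an application-side filter. It is an assumption that this resembles what a driver does internally; `StateCache` and its integer state IDs are hypothetical and not part of the Direct3D API.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical application-side filter. The IDs and values are plain
// integers standing in for real state enums; nothing here is a D3D type.
class StateCache {
public:
    // Returns true if the change was forwarded, false if it was redundant.
    bool set(std::uint32_t stateId, std::uint32_t value) {
        auto it = current_.find(stateId);
        if (it != current_.end() && it->second == value) {
            return false;  // same value already set: skip the call
        }
        current_[stateId] = value;
        ++submitted_;  // a real renderer would call the device here
        return true;
    }

    // Number of changes that were actually forwarded.
    int submitted() const { return submitted_; }

private:
    std::unordered_map<std::uint32_t, std::uint32_t> current_;
    int submitted_ = 0;
};
```

Whether a filter like this helps at all depends on how much filtering the runtime and driver already do, so it is worth profiling before and after.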
ManuTOO
10-07-2006, 07:19 AM
I did a lot of optimization work 3 years ago on DX8.1 3D cards...
Basically, the render state change itself costs nothing, or almost nothing.
The real cost is that every time a render state changes, you can't pack polys/meshes together: i.e., it's a lot faster to tell the 3D card once to draw 1000 polys than to tell it 10 times to draw 100 polys.
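A rough sketch of that packing idea: consecutive draw requests that share the same state are merged into one larger batch, and any state change in between forces a new batch. `DrawRequest`, `stateKey` and `mergeBatches` are hypothetical names, and the sketch assumes the merged geometry can really be drawn in one call (for example, because it sits contiguously in the same buffer).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical draw request: stateKey identifies a complete set of render
// states, polyCount is the number of polygons to draw with it.
struct DrawRequest {
    std::uint32_t stateKey;
    std::uint32_t polyCount;
};

// Merge consecutive requests that share a state key into a single batch.
// A different key in between breaks the run and starts a new batch.
inline std::vector<DrawRequest> mergeBatches(const std::vector<DrawRequest>& in) {
    std::vector<DrawRequest> out;
    for (const DrawRequest& r : in) {
        if (!out.empty() && out.back().stateKey == r.stateKey) {
            out.back().polyCount += r.polyCount;  // extend the current batch
        } else {
            out.push_back(r);  // state changed: begin a new batch
        }
    }
    return out;
}
```

With ten requests of 100 polys under one state key this yields a single batch of 1000 polys; a single differing request in the middle of a run splits it into three batches.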
In D3D, there's a way to pack render state changes together instead of calling a function for each of them. I tried that and didn't notice any real performance difference.