
Thread: SDL general speed up tips?

  1. #1
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default SDL general speed up tips?

    Hiya,

    I'm looking for ways to work with SDL to speed it up other than dirty rects. I am also looking to use Additive Blending in software SDL - is this possible? I can't find it in the SDL docs.

    I currently convert each image I load into the same pixel format as the display but I don't do much more than that.

    Any suggestions to speed stuff up?
    Last edited by Robert Cummings; 05-19-2006 at 03:48 AM.

  2. #2
    Moderator
    Join Date
    Jul 2004
    Location
    Zürich, Switzerland
    Posts
    1,966

    Default

    I'll leave this thread alone because it isn't off-topic, but you should definitely join the SDL mailing list or at least browse the archives. Unlike here, everyone on that list uses SDL.

    As for your questions, 1) convert to display format, 2) use SDL_RLEACCEL when possible, 3) there's no additive blending but you can do it yourself.
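    A minimal sketch of what "do it yourself" could look like for additive blending, assuming 32-bit ARGB pixels in a plain memory buffer. The function names are made up for illustration and are not part of SDL; each colour channel is added and clamped at 255, and the destination alpha byte is left untouched (an assumption of this sketch).

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating add of one 8-bit channel: clamps at 255 instead of wrapping. */
static uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Additively blend one ARGB8888 source pixel onto a destination pixel.
   The destination alpha byte is kept as-is (an assumption of this sketch). */
static uint32_t blend_add_argb8888(uint32_t dst, uint32_t src)
{
    uint32_t r = add_sat_u8((uint8_t)(dst >> 16), (uint8_t)(src >> 16));
    uint32_t g = add_sat_u8((uint8_t)(dst >> 8), (uint8_t)(src >> 8));
    uint32_t b = add_sat_u8((uint8_t)dst, (uint8_t)src);
    return (dst & 0xFF000000u) | (r << 16) | (g << 8) | b;
}

/* Blend a w-by-h block of pixels. Pitches are counted in pixels, not bytes. */
static void blit_add_argb8888(uint32_t *dst, size_t dst_pitch,
                              const uint32_t *src, size_t src_pitch,
                              size_t w, size_t h)
{
    for (size_t y = 0; y < h; ++y) {
        for (size_t x = 0; x < w; ++x) {
            uint32_t *d = &dst[y * dst_pitch + x];
            *d = blend_add_argb8888(*d, src[y * src_pitch + x]);
        }
    }
}
```

    With SDL surfaces you would presumably lock the surface first and pass its pixels pointer, remembering that a surface's pitch is given in bytes, so it has to be divided by 4 for this sketch.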
    Gabriel Gambetta
    Google Zürich - Formerly Mystery Studio

  3. #3
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    Thanks, Gabriel. I'll join that list and have a poke about with manual drawing.

  4. #4
    Senior Member
    Join Date
    Feb 2005
    Location
    Fenton, MO
    Posts
    736

    Default

    I use SDL, but mainly to set up video modes. I do all of my blitting myself using hand-rolled routines. That gives me the ability to add any functionality I need, and to optimize hot-spots.

    Also, it's easiest to pick a single pixel format (32-bit ARGB) and write all your routines for that. For the handful of PCs not running in 32-bit color, or not capable of it, you can have a side path just before you blit to the screen or flip the page that converts down to the appropriate bit depth. I think SDL even does this automatically.

    I've used 16 bit in the past (565 or 555) and that works too, but these days, it's easiest to just go 32 bit.
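    As a rough illustration of that side path, here is a sketch that converts a row of 32-bit ARGB pixels down to 16-bit 565. The function names are hypothetical, and the sketch simply drops the low bits of each channel with no dithering.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack one ARGB8888 pixel into RGB565 by keeping the top 5/6/5 bits
   of red/green/blue. Alpha is discarded. */
static uint16_t argb8888_to_rgb565(uint32_t p)
{
    uint32_t r = (p >> 16) & 0xFFu;
    uint32_t g = (p >> 8) & 0xFFu;
    uint32_t b = p & 0xFFu;
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Convert one row of pixels; call once per scanline just before the
   final copy to a 16-bit screen. */
static void convert_row_to_rgb565(uint16_t *dst, const uint32_t *src, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        dst[i] = argb8888_to_rgb565(src[i]);
}
```

    A 555 variant would keep 5 bits of green instead of 6 and shift red by 10 rather than 11.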
    Bonnie's Bookstore - Casual Game Blogs (Multiple blogs by different developers) - My Game Dev Blog

  5. #5
    Senior Member
    Join Date
    Jul 2004
    Location
    California
    Posts
    584

    Default

    I think additive blending is supported by SDL_gfx. At least it seems to be in this demo that I wrote. See the screenshot at the bottom of the page.
    Last edited by HairyTroll; 05-19-2006 at 09:21 AM.

  6. #6
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    Hi,

    I'm using SDL_gfx myself with rotations. Can you elaborate on how you got it to use additive blending? Btw - that shot looks like normal alpha blending to me.

  7. #7

    Default

    Pygame, which uses SDL, has additive blending now.

    Have a look at alphablit.c in CVS.

    Or I've uploaded it here: http://rene.f0o.com/~rene/stuff/alphablit.c
    The blit_blend_THEM function.


    Tips for speed? Profile first. Test on old hardware. If you can, use hardware surfaces. Convert your surfaces. Don't update more often than you need to: a jitter-free 25 or 30 fps is better than a frame rate that fluctuates between 67, 56, 56, and 70. Plus, laptop users will like you better.
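    One way to hold a steady frame rate is to work out how long to wait at the end of each frame. A minimal sketch, assuming millisecond tick counts as input; the helper name is hypothetical.

```c
#include <stdint.h>

/* Milliseconds still to wait so that the frame lasts frame_ms in total.
   Inputs are millisecond tick counts; unsigned subtraction keeps the
   result correct even if the counter wraps around between the two. */
static uint32_t frame_wait_ms(uint32_t frame_start, uint32_t now, uint32_t frame_ms)
{
    uint32_t elapsed = now - frame_start;
    return elapsed >= frame_ms ? 0u : frame_ms - elapsed;
}
```

    In an SDL main loop the tick counts would typically come from SDL_GetTicks() and the result would be passed to SDL_Delay(); a frame_ms of 33 gives roughly 30 fps.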

    Note that the latest SDL no longer uses DirectX by default on Windows, for better compatibility with newer machines and the unreleased Windows Vista.

  8. #8
    Senior Member
    Join Date
    Jul 2004
    Location
    California
    Posts
    584

    Default

    Quote Originally Posted by Robert Cummings
    Hi,

    I'm using SDL_gfx myself with rotations. Can you elaborate on how you got it to use additive blending? Btw - that shot looks like normal alpha blending to me.
    Oops. You are correct, that's alpha not additive blending.

    -Luke

  9. #9
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    Thanks a lot! Will take a look at that .c file (I'm not coding it in C but might be able to convert with minimal hassle)

  10. #10
    Moderator
    Join Date
    Jul 2004
    Location
    Zürich, Switzerland
    Posts
    1,966

    Default

    About additive blending, I believe I sent my blit code to the SDL mailing list.
    Gabriel Gambetta
    Google Zürich - Formerly Mystery Studio

  11. #11
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    Hi Gabriel - I searched the list but was unable to find your blit code. Could you tell me what terms to search for? Thanks

  12. #12

    Default

    Another thing to remember is that, as long as you run in windowed mode, SDL's backbuffers will ALWAYS be in system memory.

    With DirectX itself this is not true: you can have backbuffers in video memory even in windowed mode. The only difference is that you don't flip but blit from the backbuffer to the primary surface, so it's still aided by hardware.

    SDL, for whatever reason, decided to always use system memory for the backbuffer when running in windowed mode.

    So to achieve performance equal to DirectX/DirectDraw applications, you must run SDL in fullscreen in order to make use of video memory backbuffers.

  13. #13
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    Great tip - thanks! I have the latest SDL, though, which I believe defaults to GDI or some such, thanks to Vista compatibility?

  14. #14

    Default

    Try using fullscreen and check the surface settings to see if it's using video or system memory surfaces.

    As long as you're running in a window, everything is in software mode even if you forced a video_memory flag while creating a surface.

    Only in fullscreen will SDL use any hardware acceleration.

    I have a thread on this from earlier this year.

    http://forums.indiegamer.com/showthread.php?t=5723

  15. #15
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    Ah yes - sorry for the confusion - I actually want software mode to be as fast as possible, as I have my own hardware renderer. This is for PCs without acceleration.

    Any more tips on software SDL speed ups?

  16. #16

    Default

    Aligned blits? Especially for the destination buffer.

    Back in the old days when I coded in asm, I made sure EDI was dword-aligned during a rep movsd.
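    The same idea can be sketched in portable C: copy single bytes until the destination pointer sits on a 4-byte boundary, then move 32 bits at a time, then finish the tail. The function name is hypothetical, and whether this beats the compiler's own memcpy is something only a profiler can tell you.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes so that the bulk of the writes land on 4-byte-aligned
   destination addresses. */
static void copy_dst_aligned(uint8_t *dst, const uint8_t *src, size_t n)
{
    /* Head: single bytes until dst is on a 4-byte boundary. */
    while (n > 0 && ((uintptr_t)dst & 3u) != 0) {
        *dst++ = *src++;
        --n;
    }
    /* Body: 4 bytes per step. A fixed-size memcpy lets the compiler emit
       a plain 32-bit load and store without breaking aliasing rules. */
    while (n >= 4) {
        uint32_t word;
        memcpy(&word, src, sizeof word);
        memcpy(dst, &word, sizeof word);
        dst += 4;
        src += 4;
        n -= 4;
    }
    /* Tail: the remaining 0 to 3 bytes. */
    while (n > 0) {
        *dst++ = *src++;
        --n;
    }
}
```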

    Not sure if VC++ or MinGW will optimize the memcpy function this way, but for RLE sprites, alignment boosted performance by tons.

    Check the SDL source to see how it implements RLE sprite blitting, then disassemble it to see whether it uses memcpy-style functions for copying the bytes and whether it ensures EDI is dword-aligned.

    If you have a large amount of data to blit, consider using FPU, MMX, or SSE instructions to move large chunks of data.

    Check out some samples here at Paul Hsieh's site
    http://www.azillionmonkeys.com/qed/asmexample.html

  17. #17

    Default

    SDL uses MMX and 3DNow! for blitting and mixing when they are available. The latest 1.2.10 release has been optimized even further with MMX/3DNow! compared with previous releases. SDL also uses AltiVec instructions on PPC machines.

  18. #18

    Unhappy

    Quote Originally Posted by illume
    SDL uses MMX and 3DNow! for blitting and mixing when they are available. The latest 1.2.10 release has been optimized even further with MMX/3DNow! compared with previous releases. SDL also uses AltiVec instructions on PPC machines.

    Pointless. I just tested 1.2.10

    It is slower than the previous version because it no longer allows the SDL_HWSURFACE flag to be set.

    The same sample that got SDL_HWSURFACE in fullscreen before now always falls back to SDL_SWSURFACE with the same parameters. That is after upgrading to the latest sdl.dll for 1.2.10, and even after recompiling with the latest libs and headers.

    testblitspeed --srchwsurface --dsthwsurface


    Is there anyone I can report this bug to? What's the point of using libSDL if it does not accelerate at all, even in fullscreen?

    EDIT: Never mind, I did a putenv to change the default windib driver to directx. Back to good speed.
    Last edited by Jason Chong; 05-23-2006 at 03:26 AM.

  19. #19
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    Isn't windib there for a reason? For example, Vista compatibility?

  20. #20

    Default

    Quote Originally Posted by Robert Cummings
    Isn't windib there for a reason? For example, Vista compatibility?
    From the latest documentation for 1.2.10:

    The "windib" video driver is the default now, to prevent problems with certain laptops, 64-bit Windows, and Windows Vista. The DirectX driver is still available, and can be selected by setting the environment variable SDL_VIDEODRIVER to "directx".
    I don't know what problems will occur, but here's the code I used to switch it back to directx.

    putenv("SDL_VIDEODRIVER=directx");


    It has to be called before SDL_Init() or else it won't work.

    Previous versions of SDL defaulted to directx.

  21. #21
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    Aha thanks! I'll try that. The thing about laptops disturbs me though, so I think I'll ship it with windib to reach the most people.

  22. #22

    Default

    I think it is mainly with a series of damned Intel integrated chipsets. Those are the only reports I've had from the 1.2.8 and 1.2.9 series of SDL.

    A driver update fixes it, thankfully. So hopefully it won't be long until lots of these Windows XP machines get auto-updated.

    Again, damn Intel. I think SDL was too hasty in switching to windib, although it is mostly more compatible, which is SDL's aim.

  23. #23
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    Also, I had some pretty disturbing findings last night with the new 1.2.10.

    * I noticed that converting my surfaces to the display format resulted in a HUGE performance drop. Why would that be?
    * Also, using SetAlpha before drawing (a tip from Gabriel Gambetta) resulted in a further speed drop...

    Why on earth is this occurring?

  24. #24
    Senior Member
    Join Date
    Aug 2004
    Location
    Edinburgh, UK
    Posts
    342

    Default

    Have you tried palettised surfaces?

    (I'll get my coat...)

  25. #25
    Moderator
    Join Date
    Jul 2004
    Posts
    365

    Default

    Quote Originally Posted by Robert Cummings
    Also, I had some pretty disturbing findings last night with the new 1.2.10.

    * I noticed that converting my surfaces to the display format resulted in a HUGE performance drop. Why would that be?
    * Also, using SetAlpha before drawing (a tip from Gabriel Gambetta) resulted in a further speed drop...

    Why on earth is this occurring?
    Consider that every time you blit the surface, a new temporary destination-format-compatible surface is created, each pixel is translated to this new surface, and only then does the blit take place from the temp surface to the destination surface. That's inefficient in both memory and time.

  26. #26
    Senior Member
    Join Date
    Sep 2004
    Location
    Netherlands
    Posts
    849

    Default

    Quote Originally Posted by Gilzu
    Consider that every time you blit the surface, a new temporary destination-format-compatible surface is created,
    Obviously you do the conversion once after loading, not for every blit... Not sure why the new version would slow this down, though. Of course, you can always revert to an older version: they still work fine.

  27. #27
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    I do the conversion when I load the files. I found that it dramatically slowed things down. I don't want to use an earlier version, but I am hoping to find out the reason why.

    I posted to the SDL mailing list, but sometimes you'll go several days without a reply...

  28. #28
    Moderator
    Join Date
    Jul 2004
    Posts
    365

    Default

    Quote Originally Posted by mahlzeit
    Obviously you do the conversion once after loading, not for every blit... Not sure why the new version would slow this down, though. Of course, you can always revert to an older version: they still work fine.
    I was describing the process that takes place when you blit to a surface with a different format. Obviously it's best to get EVERY calculation you can outside the main loop.

    @Robert Cummings:
    I use this code. I think I've posted it before, but since it might help you sort out your problem, I thought I'd post it again:
    Code:
    #include <stdio.h>
    #include "SDL.h"

    /* Load a BMP, set its colour key, and convert it once to the display format. */
    SDL_Surface *LoadSpriteDF(const char *FileName)
    {
        SDL_Surface *tempSurface;
        SDL_Surface *AnotherSurface;

        tempSurface = SDL_LoadBMP(FileName);
        if (tempSurface == NULL) {
            fprintf(stderr, "|%s,%d| Error Loading %s: %s\n", __FILE__, __LINE__, FileName, SDL_GetError());
            return NULL;
        }
        /* Pixels of colour (255,128,0) are transparent; RLE speeds up colour-keyed blits. */
        SDL_SetColorKey(tempSurface, SDL_SRCCOLORKEY | SDL_RLEACCEL, SDL_MapRGB(tempSurface->format, 255, 128, 0));
        /* Convert to the display's pixel format here, once, rather than on every blit. */
        AnotherSurface = SDL_DisplayFormat(tempSurface);
        SDL_FreeSurface(tempSurface);
        if (AnotherSurface == NULL) {
            fprintf(stderr, "|%s,%d| Error Converting %s: %s\n", __FILE__, __LINE__, FileName, SDL_GetError());
            return NULL;
        }
        return AnotherSurface;
    }

  29. #29
    Senior Member
    Join Date
    Apr 2005
    Location
    Texas
    Posts
    137

    Default

    Robert,
    A couple more items to ponder:

    Alpha blending to a surface - If you are doing any significant amount of blending per frame, make sure the surface is in system memory, not video memory. Reading pixels from video memory surfaces can be vastly slower than reading from a system memory surface. It doesn't take many reads before copying/blitting the finished surface to video memory becomes cheaper than the reads from video memory would be.

    Also, remember it's not just about Alpha blending -- any reads by the CPU from a video memory surface apply.

    Also, if you have rolled your own routines, you can get a big speedup by using cache prefetch instructions. They are a bit tricky (got to use them in the right amount, and the right temporal distance), and only available on later processors (Pentium 3 & Up, Athlon & up, and the opcodes may not have been unified until the Athlon XP - gotta check that). When you get them right though... WOW..

    And they are not just for reading pixels from the surface. If all you are doing is drawing/writing, the way the CPU works, it still pulls in the whole cache line on the first write to an address in that line, because the CPU has no way to know if you are going to fill the whole line or not. (Well, on the PPC and Xbox360 you can specify that and skip the cache line read...).
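    A rough sketch of what a prefetching pixel loop might look like, assuming GCC or Clang, where __builtin_prefetch is available; on other compilers the macro below compiles to nothing. The look-ahead distance is a guess that would need tuning with a profiler, and the function and macro names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH_READ(p) __builtin_prefetch((p), 0, 0) /* read, low temporal locality */
#else
#define PREFETCH_READ(p) ((void)0)
#endif

/* Look-ahead in pixels (64 pixels = 256 bytes). A guess, not a measured value. */
#define PREFETCH_AHEAD ((size_t)64)

/* Example per-pixel operation: halve every 8-bit channel of each pixel.
   A prefetch is issued once per 16 pixels, i.e. once per 64 bytes,
   which is a common cache line size. */
static void darken_row(uint32_t *dst, const uint32_t *src, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if ((i & 15u) == 0 && i + PREFETCH_AHEAD < count)
            PREFETCH_READ(&src[i + PREFETCH_AHEAD]);
        dst[i] = (src[i] >> 1) & 0x7F7F7F7Fu;
    }
}
```

    A prefetch is only a hint, so the output is the same with or without it; only the timing changes.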

    Finally, if you are drawing a bunch of irregularly shaped sprites/whatever, you can see a big improvement by switching from dirty rectangles to dirty line strips. It's more complicated to manage, but can make a big difference. Either system is also very vulnerable to cache issues - making your dirty XYZ manager cache line friendly is very important.
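    The post doesn't spell out what a dirty line strip is, so the following is only one plausible reading: instead of one bounding rectangle per sprite, keep one dirty horizontal span per scanline and redraw only those spans. All names and the fixed screen height are assumptions for illustration.

```c
#include <stddef.h>

#define SCREEN_H 480 /* hypothetical screen height */

/* Half-open span [x0, x1) of dirty pixels on one scanline; empty when x0 >= x1. */
typedef struct {
    int x0;
    int x1;
} DirtySpan;

/* One span per scanline, stored contiguously so a full sweep is cache friendly. */
typedef struct {
    DirtySpan rows[SCREEN_H];
} DirtyStrips;

/* Mark every scanline as clean. */
static void strips_clear(DirtyStrips *d)
{
    for (int y = 0; y < SCREEN_H; ++y) {
        d->rows[y].x0 = 0;
        d->rows[y].x1 = 0;
    }
}

/* Mark pixels [x0, x1) on scanline y as dirty, growing the existing span. */
static void strips_mark(DirtyStrips *d, int y, int x0, int x1)
{
    if (y < 0 || y >= SCREEN_H || x0 >= x1)
        return;
    DirtySpan *s = &d->rows[y];
    if (s->x0 >= s->x1) {
        s->x0 = x0;
        s->x1 = x1;
    } else {
        if (x0 < s->x0) s->x0 = x0;
        if (x1 > s->x1) s->x1 = x1;
    }
}

/* Total number of pixels that would have to be redrawn. */
static long strips_dirty_pixels(const DirtyStrips *d)
{
    long total = 0;
    for (int y = 0; y < SCREEN_H; ++y) {
        if (d->rows[y].x1 > d->rows[y].x0)
            total += d->rows[y].x1 - d->rows[y].x0;
    }
    return total;
}
```

    For a round or diagonal sprite, marking each of its scanlines with that line's actual extent covers far fewer pixels than the sprite's bounding rectangle, which is presumably where the gain for irregular shapes comes from.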

    For what it's worth, I implemented all these things, and more, in my 2d commercial games, and their graphic performance was the best. (Wish the same could be said for AI perf...)
    -Spaceman Spiff
    Making games for the 6-year old in all of us

  30. #30
    Senior Member
    Join Date
    Apr 2005
    Posts
    1,133

    Default

    Wow, thanks guys! Will check out your suggestions. Gilzu - are you certain you've tried that with the new 1.2.10 release? It is chopping my framerate in half when I do...

    Spaceman - I am not sure what you mean by dirty line strips? It sounds intriguing...
