Got myself a few months ago into the optimization rabbit hole as I had a slow quant finance library to take care of, and for now my most successful optimizations are using local memory allocators (see my C++ post, I also played with mimalloc which helped but custom local memory allocators are even better) and rethinking class layouts in a more “data-oriented” way (mostly going from array-of-structs to struct-of-arrays layouts whenever it’s more advantageous to do so, see for example this talk).

What are some of your preferred optimizations that yielded sizeable gains in speed and/or memory usage? I realize that many optimizations aren’t necessarily specific to any given language so I’m asking in [email protected].

  • @dragontamer
    link
    English
    5
    edit-2
    1 year ago

    I had a pretty standard linear-list scan initially. Each time the program started, I’d check the list for some values. The list of course grew each time the program started. I maximized the list size to like 2MB or something (I forget), but it was in the millions and therefore MBs range. I figured it was too small for me to care about optimization.

    I was somewhat correct, even when I simulated a full-sized list, the program booted faster than I could react, so I didn’t care.


    Later, I wrote some test code that exhaustively tested startup conditions. Instead of just running the startup once, I was running it millions of times. Suddenly I cared about startup speed, so I replaced it with a Hash Table so that my test-code would finish within 10 minutes (instead of taking a projected 3 days to exhaustively test all startup conditions).


    Honestly, I’m more impressed at the opposite. This is perhaps one of the few times I’ve actually taken the linear-list and optimized it into a hash table. Almost all other linear-lists I’ve used in the last 10 years of my professional coding life remain just that: a linear scan, with no one caring about performance. I’ve got linear-lists doing some crazy things, even with MBs of data, that no one has ever came back to me and said it needs optimization.

    Do not underestimate the power of std::vector. Its probably faster than you expect, even with O(n^2) algorithms all over the place. std::map and std::unordered_map certainly have their uses, but there’s a lot of situations where the std::vector is far, far, far easier to think about, so its my preferred solution rather than preoptimizing to std::map ahead of time.