Rust continues to top the charts as the most admired and desired language by developers, and in this post, we dive a little deeper into how (and why) Rust is stealing the hearts of developers around the world.
I’ve been writing C++ for years and I have yet to be burned by undefined behavior. And because it exists, the compiler doesn’t have to insert slow if-checks in places where my code could do different things on different systems.
I run undefined behavior sanitizer on everything. The only time it has ever complained was a case where my platform does define the behavior and I was intentionally relying on that.
The existence of undefined behaviour does not help performance at all. Those unnecessary if-checks are mostly a myth, and even where they are introduced (e.g. bounds checks when indexing arrays), they are usually outweighed by the advantages of disallowing mutable aliasing: references can be used much more “carelessly” without runtime checks, because those checks happen at compile time by default, and compilers can generally optimize code better because they know more about the aliasing of specific data, or the lack thereof. In larger, modern C++ projects a lot of smart pointers are used to enforce similar aliasing rules, but there they are enforced at runtime. Generally, the lack of undefined behaviour enables both programmers and compilers to design, architect and optimize code better. There are enough examples out there; Cloudflare’s in-house proxy solution comes to mind, which is written in Rust and easily beats nginx (!!), serving billions of requests per day.
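To make the aliasing point concrete, here is a minimal sketch (my own illustration, not from the article): because Rust’s borrow rules forbid a `&mut` reference from aliasing any other live reference, the compiler can cache loads in registers without re-reading memory, and the aliasing case is rejected at compile time rather than checked at runtime.

```rust
// Minimal sketch: the borrow checker guarantees `dst` and `src` never alias,
// so the compiler may keep `*src` in a register across the writes to `*dst`.
fn accumulate_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src; // no need to reload `*src` from memory
}

fn main() {
    let mut total = 0;
    let bonus = 21;
    accumulate_twice(&mut total, &bonus);
    println!("{total}"); // prints 42

    // The aliasing case does not compile -- no runtime check needed:
    // accumulate_twice(&mut total, &total); // error: cannot borrow `total`
}
```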
https://lists.isocpp.org/std-proposals/2023/08/7587.php gives one example where it does. Rust defines what happens in a case that is clearly nonsense, so Rust needs a check for that case (on processors where the CPU does something different), even though if you ever hit it you have a bug in your code.
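One well-known instance of this pattern (my example, not necessarily the case discussed in the linked proposal) is integer division: Rust defines division by zero and `i32::MIN / -1` to panic, so on targets where the hardware would trap or return an arbitrary result, the compiler has to emit a check before the divide instruction.

```rust
// Sketch: Rust's defined semantics for the "nonsense" cases of division mean
// a check is emitted where the hardware behaviour would otherwise differ.
fn divide(a: i32, b: i32) -> i32 {
    a / b
}

fn main() {
    println!("{}", divide(10, 3)); // 3
    // divide(i32::MIN, -1); // panics at runtime instead of being UB
    // divide(1, 0);         // likewise panics instead of being UB
}
```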
I don’t doubt that you can easily craft micro-benchmarks out of very specific cases. My point was that in real-world applications the advantages easily outweigh the disadvantages! And in a very tight loop of performance-critical code where that might not be the case, you can still use unsafe and disable the checks, very carefully, where you control the invariants yourself.
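As a sketch of what that opt-out can look like (my illustration, not the commenter’s code): `slice::get_unchecked` skips the per-access bounds check, with the invariant established once up front.

```rust
// Sketch: opting out of bounds checks in a hot loop with `unsafe`, after the
// caller-visible invariant has been checked a single time.
fn sum_first_n(data: &[f32], n: usize) -> f32 {
    assert!(n <= data.len()); // establish the invariant once, outside the loop
    let mut total = 0.0;
    for i in 0..n {
        // SAFETY: i < n <= data.len(), so the index is always in bounds.
        total += unsafe { *data.get_unchecked(i) };
    }
    total
}

fn main() {
    let v = vec![1.0_f32, 2.0, 3.0, 4.0];
    println!("{}", sum_first_n(&v, 3)); // 6
}
```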
And, even more importantly: depending on the use case, that work is not wasted! “You have a bug in your code” is very possible (less likely in Rust due to its design, but still). If that bug triggers UB, chances are high you have an exploitable security problem there. If it instead triggers a panic due to Rust’s checks, the app stops in a clean way, with a decent message and without a security vulnerability.
The only problem with that is that LLVM, which the Rust compiler uses, was primarily designed for C++. Since pointers in that language may always alias, the compiler wasn’t optimizing well for the no-aliasing situation. I think it’s fixed now, but for the first few years rustc didn’t even supply the noalias attribute to the optimizer, because the LLVM handling of it was completely broken.
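For anyone curious, this is easy to check yourself (a sketch using standard rustc flags; the exact IR naming varies by toolchain version): compile a function taking `&mut` down to LLVM IR and look for the `noalias` attribute on its parameter.

```rust
// noalias_demo.rs
// Build with, for example:
//   rustc --crate-type=lib -O --emit=llvm-ir noalias_demo.rs
// then search noalias_demo.ll for `noalias` on the parameter of `bump`.
pub fn bump(x: &mut u64) {
    *x += 1;
}
```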
Yes, that optimization is finally enabled now. But even without it, programmers are less defensive when writing Rust because of the freedom from UB, so they write more optimal code and choose better architectures before the compiler even comes into play. It doesn’t show in micro-benchmarks, but in more complex software that has been written in Rust from the start it’s pretty obvious.
The only time it has ever complained was a case where my platform does define the behavior and I was intentionally relying on that.
If by platform you mean the target CPU, you should be aware that it’s still undefined behaviour and could break optimizations, unless your compiler also makes a commitment to define that behavior that is stronger than what the standard requires.
I broke the one-definition rule by having a symbol defined in two different .so files. The optimizer can’t optimize around this, and on Linux the load order decides which definition wins. On Windows there are different rules, but I forget which.
Of course, if the optimizer could make an optimization based on it I would be in trouble, but my build system ensures that no optimizer ever sees both definitions at once.