Taming the UB Monsters in C++

87 points by ingve 2 days ago

roca 2 days ago

There isn't anything new here to defend against lifetime-related UB. For that it simply references https://arxiv.org/pdf/2503.21145, which is just a summary of existing dynamic mitigations --- which don't fix UB at the language level, impose performance tradeoffs, and in the case of pointer integrity, require hardware support that excludes e.g. x86.

Look at it this way --- mature products like Chrome are already doing all of that wherever they can. If it was enough, they wouldn't worry about C++ UB anymore. But they do.

Voultapher 2 days ago

At this point I'm wondering if C++ leadership is either willfully ignorant or genuinely in denial.
I know several people on various C++ committees and by and large their opinion is, we evolve the language and library to give existing projects incremental improvements without asking them to rewrite them, but if you are starting with a new project C++ is often a subpar option. From that perspective I get why they'd be hesitant about efforts like Circle. Circle and co. ask developers to rewrite their code, in something that looks very different to normal C++ - whatever normal C++ even is, given the multitude of dialects out there - can't seamlessly interop with existing code, needs a new incompatible standard library, that as of now doesn't even exist. At which point, honestly just rewrite it in Rust instead of going through the painful exercise to use something that's 10+ years behind where Rust is today in terms of DX, tooling and ecosystem.
But all that doesn't explain why at the very top, even mentioning Rust as an alternative seems taboo, idk.
- pjmlp 2 days ago
  
  There is also a strange dynamic going, and this has worked against C++.
  In the early ISO days, the people sent to ISO were employees from compiler vendors, and existing practice was the key factor into adding stuff to the standard.
  Eventually, comitee dynamics took place, and nowadays most of the contributors to WG21, and to lesser extent WG14 (which still keeps more close to the existing practice spirit), you have hundreds of contributors wanting to leave their historical mark on the ISO standard, withough having written a single line of compiler code, validating their proposal, which they are able to fight trough the whole voting process, and then leave the compiler vendors sorting out the mess how to implement their beloved feature.
  Those of us that really like C++, are also kind of lost on how things turned out this way.
  
  jcranmer 2 days ago
  
  WG21 is well down the path of actively being hostile to the implementers of C++. There was a recent proposal where all 4 implementers [1] stood up and said "no", and the committee still voted it in, ignoring their feedback.
  [1] C++ has only 4 implementations these days, Clang, EDG, GCC, and MSVC; everything keeping up with the standard is a fork of one of these projects.
  
  pjmlp 2 days ago
  
  This is why I see C++26 as the last great standard, it would likely be C++23 if it wasn't for reflection.
  Not that WG21 won't produce further ones, just like has happened with other ecosystems of similar age, who cares about Fortran 2023, or COBOL 2023 revisions, despite their critical use in many research projects, or companies infrastructure.
  It is already good enough (minus the security issues), for the existing infrastructure that relies on C++, and most of the new stuff isn't helping.
  
  quuxplusone a day ago
  
  Which proposal?
  
  gpderetta 2 days ago
  
  Proof of implementation should be a requirement for every proposal (allegedly it is, but in practice...).
  Which would limit most "outsider" proposals mostly to library features, which would be a good thing I guess.
  
  stingraycharles 2 days ago
  
  Well, you kind of have your answer right there: it’s a language designed by committee, not by “the industry”.
  This has been my biggest problem, and I say this as someone who has been on and off developing C++ for over 2 decades.
  At the same time, it’s a safe bet to say that C++ will still be around in another 2 decades.
  
  pjmlp 2 days ago
  
  Don't forget that same applies to C.
  
  roca a day ago
  
  And COBOL.
  
  pjmlp 19 hours ago
  
  Yeah, however at least COBOL wasn't designed without bounds checking, or pointer based strings, in 1959 hardware.
  
  zombot 18 hours ago
  
  In 1959 people still made mistakes. By the time C came around, human error was all but eliminated. /s
- zombot 2 days ago
  
  > even mentioning Rust as an alternative seems taboo, idk.
  Rust is even framed as an "attack on C++" by Stroustrup himself [1]. No wonder it's taboo.
  [1] https://www.theregister.com/2025/03/02/c_creator_calls_for_a...
  
  unboundedjiure 2 days ago
  
  Seems like a bit of a sensationalist deduction from what looks like a pretty levelheaded response. It's not a call to war, but call to improve the C++ standard
  
  steveklabnik 2 days ago
  
  The Register article was written based on a leaked paper, since then, Bjarne published the article itself, which you can read here: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p36...
  Not trying to take a position about if it's sensational or not, just wanting to add a primary source here.
pron 2 days ago

C++ developers worry about all UB, but we don't worry about all of them to the same extent. MITRE's CWE Top 25 [1] lists out-of-bounds write as the second most dangerous weakness, out-of-bounds read as the sixth, and use after free as number eight (null-pointer dereference and integer overflow are at nos. 21 and 23 respectively). All UBs are worrisome, but that's not to say they are equally so. Taking care of out-of-bounds is easier than UAF and at the same time more important. Priorities matter.
[1]: https://cwe.mitre.org/top25/archive/2024/2024_cwe_top25.html
- Wumpnot 19 hours ago
  
  Interesting, for all the winging about C or C++ this shows most of these apply to all languages, and the ones that relate to C or C++ are actually pretty easy to prevent in C++(less so in C) by enabling hardening modes and using smart pointers.
- roca a day ago
  
  I don't know where that ranking comes from. It also matters that attackers adapt: UAF exploitation is harder than out of bounds, but it is well understood, and attackers can switch to it, so shutting off one source of UB isn't as effective in practice as you might expect.
  
  pron a day ago
  
  > I don't know where that ranking comes from.
  It comes from MITRE (https://en.wikipedia.org/wiki/Mitre_Corporation), and the methodology is explained on the website (roughly, the score is relative prevalence times relative average vulnerability severity).
  > and attackers can switch to it, so shutting off one source of UB isn't as effective in practice as you might expect.
  If that's how things work, you could say the same about all the other weaknesses that have nothing to do with UB.

pjmlp 2 days ago

While it is great that this is happening, it seems a bit getting too late to the party.

There is also the whole issue that standard working group, and the folks that actually work on C and C++ compilers, nowadays it is a very thin Venn diagram for the intersection of both groups.

So it remains to be seen how much of this will actually land on compilers, and in what form.

Nevertheless there are many C++ written tools that most likely will never be rewritten into something else, so any improvement is welcomed.

Kelteseth 2 days ago

> too late to the party
And even if this would all be available today, you would need to wait 10 years for all third party libraries to catch up. Just look at the current rate of c++ version adoption. Many projects just migrated to c++17 this year, and that version is 8 years old by now...
Edit: Just used PCL that got updated from c++14 to 17, or look at https://vfxplatform.com/ where everything is c++17, and the list goes on....
dgellow 2 days ago

According to Herb’s article most of this is already available in compilers
- steveklabnik 2 days ago
  
  Depending on how you look at it, that raises the question of "if this is so effective, and it's already available, why hasn't the situation improved?"
  
  pjmlp 19 hours ago
  
  As many of us point out, even those that actually apreciate C++ despite its warts, safety culture or lack thereof.
  Which I think it was much better back when it was C++ vs C in the C++ARM days, with great compiler provided frameworks, eventually past C++98 there seems to have been an inversion, as C++ graduality took over domains where C ruled.
  The same mideset that will reach out for unsafe language constructs, regardless of the programming language, without any kind of profiler information, because of course it is faster and every μs counts.
  
  Wumpnot 19 hours ago
  
  1. who says it hasn't? 2. most of the vul code is C, which is obviously much harder to harden, and the Rust Evangelism Strike Force loves to pretend that C++ is the same as C, so no matter the improvements to C++, they will just point at C. 3. I think many simply didn't know about these hardening modes, MSVC has had this for 10-15 years, but I still encounter people who don't know about it..somehow.
  
  steveklabnik 15 hours ago
  
  It hasn’t enough to satisfy industry or regulators.
  As long as that C code is valid C++ code, it’s still a problem for C++. Backwards compatibility with C is a strength, but also a weakness. The Go and Java folks invested in rewriting dependencies in their own language to prevent problems, if C++ is truly that much safer than C, the C++ community could do the same, and demonstrate that it’s safer.
  This is the power of opt out vs opt in. You can’t forget to run the borrow checker in Rust. That’s a practical, real-world advantage.
- AlotOfReading 2 days ago
  
  In mainstream compilers. Most of these options aren't available in compilers e.g. for safety critical code.
- pjmlp 2 days ago
  
  Partially.

uecker 2 days ago

In C, we made clear in C23 that there is no true time-travel UB and have eliminated about 30% of UB in the core language for C2Y as part of an ongoing effort that will eliminate most of the simple cases of UB. We also have plans to add opt-in memory safety mode proving full spatial and temporal memory safety.

Having said this, I think the sudden and almost exclusive focus on memory safety is weird. As a long-term Linux user, this is not my main problem. This is what people building app stores and non-free content distribution systems need and they now re-engineer the world according to their needs. There are lot of things compromising my online safety and freedom, and certainly memory safety issues are not very high on this list.

Finally, what Rust achieves, and what a memory-safe mode in C will hopefully also achieve in the future, is also just an incremental improvement. As long as there in unsafe code, and in practice there will be a lot of unsafe code, there is no perfect memory safety.

ultimaweapon 2 days ago

Most C/C++ users don't understand how Rust achieve memory safety because they don't know Rust enough. They always underestimate Rust memory safety. The truth is Rust can give you nearly 100% memory safety. The point of unsafe code in Rust is to isolate unsafe operations and provide a safe interface to it. As long as you wrote that unsafe code correctly the rest of your safe code will never have memory safety problems.
- pjmlp 2 days ago
  
  They also conflate unsafe keyword with Rust, when its use in systems programming languages predates C by a decade.
  Here is the most recent version of NEWP manual,
  https://public.support.unisys.com/framework/publicterms.aspx...
  Which started as ESPOL in early 1960's,
  https://en.wikipedia.org/wiki/Executive_Systems_Problem_Orie...
  Binaries with unsafe code blocks are tainted, and must be white listed by admins to allow execution in first place.
  This was then followed by several languages, using unsafe code blocks, pseudo packages like SYSTEM, unsafe, unchecked,..., until finally Rust came to be.
  But since most C and C++ users aren't language nerds, not even reading their own ISO specification, they are unaware of the whole safety history since JOVIAL, and naturally the whole unsafe code blocks is all about Rust.
- uecker 2 days ago
  
  I understand this perfectly. The point is that 1) memory safety is a small part of the overall picture. 2) in practice people will not build perfectly safe abstractions that are then used by 100% memory-safe code, but they will create a mess.
  
  Georgelemental 2 days ago
  
  > In practice people will not build perfectly safe abstractions that are then used by 100% memory-safe code
  Yes, in practice they quite commonly will. `unsafe` is rare, so it’s feasible to spend lots of extra efforts to validate it.
  
  uecker 2 days ago
  
  Is it rare? I see it a lot, especially in scenarios where speed matters, or where you need to interface with another system.
  
  steveklabnik 2 days ago
  
  I mean, it depends on what you mean by "rare."
  Some projects will have more than others, for example, as you mention, interfacing with other systems or hardware. (Performance is not as straightforward.)
  Even then, generally speaking it's usually pretty small: the sorta-kinda-RTOS we have at work for embedded systems is about 3% unsafe in the kernel, for example.
  Surveying all of crates.io [1] almost a year ago found that 20% have 'unsafe' somewhere in them; this is expected to be higher on crates.io than in all Rust code, because crates.io hosts mostly libraries, which are going to use unsafe more than application code.
  However, they also found that most of those usages of unsafe are for FFI, which is not able to be done in a safe way, and is overall easier to ensure the safety of than other forms of Rust's unsafe.
  1: https://rustfoundation.org/media/unsafe-rust-in-the-wild-not...
- bluGill 2 days ago
  
  I've seen Rust code where everything was in unsafe because they thought it was needed (in 1% is probably was).
  
  unboundedjiure 2 days ago
  
  I've also seen Java code that recursively calls 1 row from database at a time, a million times per request with 5ms latency per row. It's possible to willingly abuse almost any system that a reasonable person would deem perfectly performant or safe for the purpose under default conditions.
  The question should less be about whether it's possible to try to abuse the system and more what it looks like in a very reasonable everyday scenario.
galangalalgol 2 days ago

Can you clarify why memory safety isn't your main problem? Do you just mean that UB isn't your biggest problem, or that memory safety isn't your biggest source of UB? The latter sounds unlikely, and the former is interesting as all the code I give away for free gets used by people who very much care if it has UB.
- uecker 2 days ago
  
  When we talk to companies that are in the business of helping other companies that were hacked, the stories we hear are never that they were hacked by some 0-day in the Linux kernel or a server software caused by some memory-safety issue. Instead, they are hacked because some did not install updates of some component on the webserver, password authentication was used, the same passwords are used on different servers, etc. or some bug in some overly complex Microsoft infrastructure part.
  
  adrianN 2 days ago
  
  > because some did not install updates of some component on the webserver
  But how many of those updates fix memory issues?
  
  galangalalgol 2 days ago
  
  I guess his point is that these bugs get found and patches are available before most people have to worry about it, because people don't burn zero day exploits unless it is very likely to net them more than the price of a zero day. But the people who handle enough value that they are worth a zero day still often use the same kernel, so they have an interest in paying to secure it. And separately, the people willing to burn a zero day are likely to be more careful in maintaining their access undetected, sometimes inserting evidence ahead of an investigation to indicate a different compromise route to preserve their zero day and potentially their access. And even if we are finding all these UB vulnerabilities before they get used, that is a giant cost we don't have to pay in new projects of we start them with something like rust. The infamous learning curve isn't a thing of you are coming from modern c++. I have never come across someone coming from c++ that didn't take to it immediately. Modern c might be a different story but I doubt it. Rewriting things should be reserved for codebases that are small, high risk, producing lots of vulnerabilities, or some combination of those. Unless someone just wants to do it for fun.
  
  lelanthran a day ago
  
  > Instead, they are hacked because some did not install updates of some component on the webserver, password authentication was used, the same passwords are used on different servers, etc. or some bug in some overly complex Microsoft infrastructure part.
  Last I checked (a few months ago) 8 out of 10 breaches were due to human error.
  As far as reducing breaches go, you'll get more bang for your buck by ensuring employees are up to date on their routine security awareness training.
  Your employees are much much easier to hack than your computers. "Choice of language" is a blip in the stats.
- bluGill 2 days ago
  
  Most security breeches find vulnerabilities in humans not their computers. Yes if you find a software (or hardware) vulnerability that you can exploit you can get many many computers are once, but those are much harder to find. However the vast majority are not from that source.
  There is a lot we can do in UX to make human vulnerabilities less common, but no language change will help.
  
  unboundedjiure 2 days ago
  
  Are you sure about that? Throughout auditing a lot of codebases in my lifetime I've found loads of ways to bypass authentication, spoof identity, cause denial of service in every one. These are very big and widely used applications with a lot of userbase.
  While unauthorized people waltzing on in to company premises hasn't not happened, it's been way rarer than the amount of serious bugs or security flaws I find. Traditional phone and email scams happen more often, but their impact has materialized much less severe thanks to very limited user privileges
  
  bluGill 2 days ago
  
  Phone and email scams happen far more often than you think and the people doing them have gotten much better at looking real over time.
  
  unboundedjiure 2 days ago
  
  For better or for worse we've received substantially less international scam attempts due to the seemingly intractable problem of writing them in our language and the relatively small pool of viable targets, but the ones we do get are usually well-crafted and targeted. We run loads of internal email scams ourselves trying to see why people trip, and try to improve our practices based on those findings.
  Currently there has been negligible impact on any of the products I've looked over because of traditional human scams. Conversely there have been significant troubles from real exploits being found and abused in the wild with dire consequences for legal and financial, but maybe my experience just happens to heavily skew the opposite of norm?. I expect things to change in the medium-term future as LLMs and such improve so that they can generate coherent text above a single sentence
Sharlin 2 days ago

I’m not sure how you can even try to juxtapose technical issues like memory safety with sociopolitical problems like corpo interests being in conflict with the interests of the common people. The formeer can be alleviated with technical solutions. There’s nothing that a programming language can do about the latter.
Also, I question your claim that memory unsafety is not of great importance to regular computer users. Perhaps not if your computer is airgapped from the internet and never gets any unvetted software installed. Otherwise, have you missed the primary cause of the majority of CVEs issued in the past decades? Do you not think that the main technical problem behind countless security vulnerabilities, that have very concretely affected tens and hundreds of millions of people, does not deserve the attention that it’s finally starting to get?
Google has reported that the mere act of stopping writing new code in memory-unsafe languages has made the fraction of mem safety vulnerabilities drop from >80% to ~20% in a few years. This is because bugs have a half-life, and once you stop introducing new bugs, the total count starts going down asymptotically as existing bugs get fixed.
Finally, since you inevitably mentioned Rust, memory safety is indeed a necessary but not sufficient condition in software reliability. Luckily, Rust also happens to greatly decrease the odds of logic bugs getting in, thanks to its modern (i.e. features first introduced in ML in the 70s) type system that actually tries to help you get things right.
C is never going to have those parts, the “if it compiles, it is correct by construction” assurance. C++ has janky, half-assed, non-orthogonal, poorly-composing, inconsistently designed versions of a lot of the stuff, but it also has all of the cruft, and that cruft is still what is taught to people before the less-bad parts. And because C++ is larger than the most people’s brain capacity, most people can’t even get to the more less-bad parts, never mind keeping up with new standard versions.
- uecker 2 days ago
  
  Technology and society are never separate things. The question if why something is seen as important or not, and what is funded, very much depends on societal questions. I am using Linux for almost 30 years. I have never been hacked, or know anybody personally, who was hacked because of a memory safety issue in any open-source component. I know such people and companies exist, I just know many more which are affected by other issues. I know many people affected by bugs in Microsoft software, including myself. I am also affected by websites spying on me, from email not being encrypted by default, etc. A lot of things could be done do make my safety and security better. That you cite Google actually demonstrates the problem: Memory safety is much more their concern, not mine.
  And C definitely will have memory safety. Stay tuned. (And I also like to have memory safety.) I do not care about C++ nor do I care about Rust. Both languages are far too complex for my taste.
- bluGill 2 days ago
  
  > Otherwise, have you missed the primary cause of the majority of CVEs issued in the past decades
  Only because CVEs are never issued when humans are compromised. While that is probably the correct action on their part it means your argument is flawed as you don't account for human vulnerabilities which are much more common. Yes memory safety is a big problem and we should do something - but as an industry we need to not ignore the largest problem. There is a lot we can do in UX to prevent most security vulnerabilities, and putting too much emphasis on memory can take away from potentially more productive paths.
IshKebab 2 days ago

> I think the sudden and almost exclusive focus on memory safety is weird.
They're clearly panicking about people switching to Rust. I don't think it's surprising. Too little too late though; you can't just ignore people's concerns for decades and then as soon as a viable competitor comes along say "oh wait, actually we will listen to you, come baaack!".
> There are lot of things compromising my online safety and freedom, and certainly memory safety issues are not very high on this list.
Out of the things that programming languages can solve, memory safety should be very high on your list. This has been proven repeatedly.
- uecker 2 days ago
  
  I don't see people panicking in my vicinity. I see some parts of the industry pouring money into Rust and pushing for memory safety. I agree that memory safety is nice, but some of the most reliable and safe software I use on daily basis is written in C. I am personally much more scared of supply chain issues introduced by Rust, or other issues introduced by increased complexity (which I still think is the main root cause of security issues).
  
  galangalalgol 2 days ago
  
  If you only need what is in the c std library the number of rust crates you use will be tiny and all from authors within the foundation. The hash gets stored in your repo, so any rebuilds where a dependency repo got compromised and tried to modify an existing version will fail. If you are using a c library that isn't std, then you will probably pull in 10x to 20x the number of dependencies in rust, but not substantially more authors. The risk is real, but if you treat it like c and minimize your dependencies it isn't any riskier, probably less. If you get tempted and grab everything and the kitchen sink you can still be reasonably safe by using the oldest version of crates you can compile with that don't have any cve. That is made easier with crates like cargo deny and cargo audit.
  All that said, I would love a language that had the same guarantees and performance without the complexity, but I don't see how that could work. There is definitely extra stuff in rust but the core capabilities come from the type system. Getting the same safety any other way would probably require a purely functional language which has performance costs in any implementation I am aware of along with a runtime being necessary. If you can afford that, then we don't need a new c, we have those languages.
  
  steveklabnik 2 days ago
  
  To put an even finer point on it, crates.io releases are immutable, and so the only scenario in which you are even in a place where someone could try and modify a release is if you're depending on a git repo directly.
  
  uecker 2 days ago
  
  I believe the practical difference to 100% memory safety that Rust offers in theory, or the 99% memory safety you get in practice when using unsafe, compared to the 98% memory safety you can get today when writing decent modern C using abstractions such as string type and modern tooling is pretty irrelevant compared to all the other issues. Yes, the numbers are totally made up, but nothing I have seen so far convinces me that the reality is a lot different. Of course, there is a lot of old crappy legacy C code, and there will even be a lot of new crappy C code, but there will also be pretty bad Rust code. Now, if you like Rust, more power to you, I have no issue with people prefering Rust. But I think arguments for having to write in Rust for security reasons are mostly based on a series of fallacious arguments.
  
  IshKebab 2 days ago
  
  > Yes, the numbers are totally made up
  Indeed.
  > but nothing I have seen so far convinces me that the reality is a lot different
  https://www.memorysafety.org/docs/memory-safety/#how-common-...
  This is very well studied.
  > Of course, there is a lot of old crappy legacy C code
  Ahh... you don't make mistakes. I see.
  
  uecker 2 days ago
  
  I don't think anything "studied" there addresses my point.
  
  steveklabnik 2 days ago
  
  > Yes, the numbers are totally made up, but nothing I have seen so far convinces me that the reality is a lot different.
  This is really what a lot of all of this comes down to, Herb and the committee feel like you do, others feel that the numbers won't be that good. We'll see in time!
  
  IshKebab 2 days ago
  
  > We'll see in time!
  If by "in time" you mean last year then yes...
  https://security.googleblog.com/2024/09/eliminating-memory-s...
  
  steveklabnik 2 days ago
  
  I meant the opposite, the claim by sutter et al is that new C++ code that uses these new techniques will show the same results as this study shows for MSLs.
  
  IshKebab 2 days ago
  
  Oh I see. Yeah I agree. That's going to be a long time!

agwa 2 days ago

> It’s working: The price of zero-day exploits has already increased.

The only thing they've shipped is no UB in constexpr code - i.e. code that wouldn't have been reachable by attackers in the first place. How could that possibly be the reason for the price of zero-day exploits increasing?

mafuy 2 days ago

I think I'm misunderstanding something. The post sounds like UB has already been mostly eliminated from recent versions of C++. But to my knowledge even something simple as `INT_MAX + 1` is still UB. Is that false?

nmeofthestate 2 days ago

I think you are misunderstanding the post - it specifically says that there's "a metric ton of work" to be done to address UB, not that it's mostly a solved problem.
saagarjha 2 days ago

No, it’s true. None of these efforts significantly change the prevalence of UB in C++.
bluGill 2 days ago

Stop using `INT_MAX + 1` as an example! It is the worst possible example you can give (though easy to understand). Such code is essentially never a memory safety issue and not to what most people worry about UB.
In the vast majority of cases it doesn't matter what `INT_MAX + 1` does you code it wrong. Sure there are a few encryption cases where it is fine, but the vast majority of cases your code as a bug no matter what the result it. If the variable netWorth is at INT_MAX there is no value of adding 1 that is correct. If the variable employeeId is at INT_MAX all values of adding 1 are going to collide with an existing employee.
Meanwhile if you define INT_MAX+1 you force the compiler to add checks for overflow INT_MAX in addition operations even though most of the time you won't overflow and thus have needlessly slowed down the code.
UB causes real problems in the real world, but INT_MAX+1 is not one of those places where it causes problems.
- Maxatar 2 days ago
  
  This is a very worrisome perspective about undefined behavior. It suggests that the issue with undefined behavior is that there is a bug in your code and it's the bug that is the problem. But that's not (entirely) the case, the issue with undefined behavior is that compilers exploit it in ways that propagate this behavior in an entirely unbounded fashion that can result in bugs not only at the very moment that the undefined behavior happens, but even before it, ie. the infamous time travelling undefined behavior [1].
  Getting rid of undefined behavior will not get rid of bugs, and no one thinks that memory safe languages somehow are bug free and certainly C++ code will not be bug free even if undefined behavior is replaced with runtime checks. What eliminating undefined behavior does is it places predictable boundaries on both the region of memory and the region of time that the bug can affect.
  [1] https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=63...
  
  bluGill 2 days ago
  
  The article already says we need to get rid of (or at least greatly restrict) time travel optimizations.
  Your code has a bug if addition overflows and there is no point in defining how it works.
  
  pjmlp 2 days ago
  
  The security problem is how "addition overflows", or "this will never overflow", have a side effect on security related checks that get removed.
  Naturally whatever happens to a CPU register in isolation doesn't turn right away into a security issue.
  
  112233 2 days ago
  
  I assume from your comment that you know how to write code that does not have such bugs. Could you please share?
  In particular: how do you write a library that guaranteed will not have integer overflow on any existing or future architectures and compilers?
  
  steveklabnik 2 days ago
  
  > how do you write a library that guaranteed will not have integer overflow on any existing or future architectures and compilers?
  By following the behavior in the standard, and not trying to imagine how the standard is implemented.
  For example, C23 includes https://en.cppreference.com/w/c/header/stdckdint, which you can safely use to check for overflows. It's also in C++23.
  If you're not on the latest standards, there are techniques to do so as well, these macros codify existing practice: https://stackoverflow.com/questions/199333/how-do-i-detect-u...
  
  112233 a day ago
  
  Thank you for these references. It is how you say, and, for c++ would involve handling 16 bit "int", char larger than 8 bits and in the case of pre-20 compilers you cannot even assume two's compliment.
  I am sure such correct code can be written, but expecting everyone to play along with standard and ignore that 100% of existing compilers have much saner behaviour, might be very difficult.
  
  steveklabnik 15 hours ago
  
  I’m not sure what “100% of compilers have much saner behavior” means. I do agree that avoiding UB is difficult.
  
  bluGill 2 days ago
  
  Same way any other language. That Rust defines what happens doesn't change the fact that it is a bug when it happens (in most cases).
  Which is to say not nearly as well as I would like.
  
  112233 a day ago
  
  Agree. However, it's one thing to have bug in my code and sitting with a cold coffee staring at debug prints trying to find the incorrect operation.
  It is quite another thing to have a big chunk of code simply missing from the compiled binary, because complier found a path that reaches integer overflow and thus decided that the code is unreachable.
  
  Maxatar 2 days ago
  
  I don't see where the article says that it will get rid of that. The article simple says:
  >UB optimizations also just create mysterious ordinary bugs, such as ... “time travel” optimizations that change code that precedes the point where the UB can happen.
  >Less of all those things, please.
  I don't think someone saying "less of those things please" is the same as saying "we need to get rid of this or greatly restrict it".
  Also once again, I really find problematic the idea that it's the existence of a bug itself in code that is the main issue. That is a standard that absolutely no language or technology can ever eliminate. The goal of any safety related proposal is not to make it impossible to write bugs, but that the consequences of those bugs are constrained in both space and time.
- AlotOfReading 2 days ago
  
  INT_MAX+1 is actually a great example of UB because it demonstrates how UB is a problem even if there's a completely reasonable runtime behavior. One of the big reasons UB is problematic is that it invalidates the semantic guarantees the standard makes about all your other code. That signed integer overflow completely invalidates any sort of formal proofs or error handling you might otherwise have.

virtualritz 2 days ago

There is no end to what a(n old) white man thinks he can do. ;)

As an old white man who switched from C++, for a over a quarter century, to Rust, about seven years ago, the fallacy at the root of Herb's piece is all too well understood by me.

gw2 2 days ago

A question to security experts reading this thread:

What is your opinion on deploying C++ codebases with mitigations like CFI and bounds checking? Let's say I have a large C++ codebase which I am unwilling to rewrite in Rust. But I:

* Enable STL bounds checking using appropriate flags (like `-DGLIBCXX_ASSERTIONS`).

* Enable mitigations like CFI and shadow stacks.

How much less safe is "C++ w/ mitigations" than Rust? How much of the "70% CVE" statistic is relevant to such a C++ codebase?

(I've asked this in an earlier thread and also in other forums, but I never really got a response that does not boil down to "only Rust is safe, suck it up!". It also doesn't help that every other thread about C++ is about its memory unsafety...)

pjmlp 2 days ago

It helps, however there is also a culture mindset that is required.
Back in the old Usenet flamewars, C developers would to say coding in languages like Object Pascal, Modula-2, Ada,... was like programming with straightjacket, and we used to call them cowboy programming.
When C++ came into the scene with its improved type system, it seemed a way we could have the best of both worlds, better safety and UNIX/C like ecosystem.
However this eventually changed as more and more people started to adopt C++, and thanks to its C subset, many C++ projects are actually mostly C code compiled with a C++ compiler.
So hardned runtimes help a lot, as does using static analysers like clang tidy, VC++ analyse, Sonar, PVS Studio, Clion analysers, ....
However many of them exist for the last 30 years, I was using Parasoft in 1999.
The biggest problem is culture, thinking that such tools are only required by those that aren't good enough to program C or C++, naturally those issues only happen to others, we are good drivers.
rwmj 2 days ago

STL bounds checking isn't bounds checking. Your code (or other libraries you use) can still have simple pointer arithmetic that goes outside bounds.
But the larger problem is that bounds checking (even ASAN) isn't as good as statically checking code. ie. Your code with bounds checking still crashes at run time, which can be a denial of service attack, whereas with static checking your code would never have compiled in the first place.
Nevertheless if you don't want to rewrite the world, then using these mitigations is much better than not using them. I would also add fuzzing to the mix.
- gpderetta 2 days ago
  
  DoS is vastly better than an RCE. And safe code can still panic.
  But as you mention, unfortunately enabling bound checking in the STL wouldn't catch a lot of pointer manipulation.
  It would still be better than the the status-quo.
UncleMeat 2 days ago

For the first one, a lot of this depends on how modern your codebase is. STL bounds checks work great (and have remarkably low overhead) if the vast majority of your code is working with standard library types. Maybe all of the code that might have been a c-style array in the past is now using std::vector, std::span, or std::array and so you've got built in lengths. Not perfect, of course, since you can still have all sorts of spatial safety issues with custom iterator implementations or whatever, but great.
But my hunch is that the vast majority of C++ codebases aren't using std::span or std::array everywhere because there is just a lot of much older code. And there's no comparable option for handling lifetime bugs.
Tools like CFI or hardware memory tagging or pointer authentication help, but skilled exploit creators have been defeating techniques like these for a while so they don't have the "at least I know this entire class of issue is prevented" confidence as bounds checks inserted into library types.
The general industry recommendation is "if you are starting something new that has security implications, please seriously explore Rust" and "if you have a legacy C++ codebase that is too expensive to port please seriously explore these mitigation techniques and understand their limitations."
IshKebab 2 days ago

Good question. If I had to bet I'd say something like half of the 70% would be prevented. Yeah it wouldn't really help with lifetime issues or type confusion but a huge proportion of that 70% is simple out-of-bounds memory accesses.
But don't forget lots of open source code is written in C and this barely helps there.
- gw2 2 days ago
  
  > something like half of the 70% would be prevented
  Sure, but the other half are use-after-frees and those would not be exploitable anyway because of CFI and shadow stacks.
  
  IshKebab 2 days ago
  
  That is a very bold claim!
  
  gw2 a day ago
  
  Have a look at this repo: https://github.com/trailofbits/clang-cfi-showcase
rfoo 2 days ago

My two cents, I'm wearing my exploit writer's hat, but my current day job is SWE on legacy/"modern-ish" C++ codebases.
> Enable STL bounds checking using appropriate flags
This rarely helps. Most of the nice-to-exploit bugs were in older codes, which weren't using STL containers. Or they are even just write in C. However, if enabling these flags do not hurt you, please still do as it does make non-zero contribution.
> Enable mitigations like CFI and shadow stacks.
Shadow stack is meh. CFI helps a bit more, however there's some caveats depending on which CFI implementation you are talking about, i.e. how strong is it, for example, is it typed or not? But in best case it still just makes the bug chain one bug longer and maybe completely kills some bugs, which isn't enough to make your exploits impossible. It just raises the bar (that's important too though). It also depends on what the specific scenario. For example, for browser renderer without sandbox / site-isolation etc, CFI alone makes almost no impact, as in this case achieving arbitrary R/W is usually easier than taking over $rip, and it's obvious you can do data-only attack to have UXSS, which is a serious enough threat. On the other hand, if it's a server and you are mainly dealing with remote attackers and there's inherently no good leak primitive etc, various mitigations soup could make real difference.
So, all in all, it's hard to tell without your project details.
> How much of the "70% CVE" statistic is relevant to such a C++ codebase?
Uh, I'd guess, half or more of that. But still, it just raises the bar.
- gw2 a day ago
  
  First of all, thanks for your response.
  > This rarely helps. Most of the nice-to-exploit bugs were in older codes, which weren't using STL containers.
  While I agree with this, is not modifying those code to use STL containers much cheaper than rewriting into an entirely new language?
  > Shadow stack is meh.
  Are you referring to the idea of shadow stacks in general or a particular implementation of them?
  > For example, for browser renderer without sandbox / site-isolation etc
  I may be wrong, but I think you are referring to JIT bugs leading to arbitrary script execution in JS engines. I don't think memory safety can do anything about it because those bugs happen in the binding layer between the C++ code and JS scripts. Binding code would have to use unsafe code anyway. (In general, script injection has nothing to do with memory safety, see Log4j)
  > Uh, I'd guess, half or more of that.
  I mean, if you are after RCEs, don't CFI and shadow stacks halt the program instead of letting the CPU jumping to the injected code?
  Now, let me get more specific - can you name one widespread C++ exploit that:
  * would have happened even if the above mentioned mitigations were employed.
  * would not have happened in a memory safe language?
  
  rfoo a day ago
  
  All good questions.
  > is not modifying those code to use STL containers much cheaper
  That's right. However, I'd add that most exploited bugs these days (in high-profile targets) are temporal memory safety (i.e. lifetime) bugs. The remaining spatial (out of bound) bugs are mostly in long forgotten dependencies.
  > Are you referring to the idea of shadow stacks in general or a particular implementation of them?
  The idea. Shadow stack (assuming perfect hardware assisted implementation) is a good backward-edge control flow integrity idea, and ruins one of the common ways to take over $rip (write a ROP chain to stack), but that's it. Besides making exploitation harder, both forward-edge and backward-edge CFI also kill some bugs. However, IMO we are long past non-linear stack buffer overflow days, once in a while there may still be news about one, but it could be news because it is an outlier. Hence, compared to CFI, the bugs shadow stack kills are pretty irrelevant now.
  > JIT bugs leading to arbitrary script execution in JS engines
  Not necessarily JIT bugs. Could also be an UAF and people went a bloody path to convert it to an `ArrayBuffer` with base address = 0 and size = 0x7FFFFFFFFFFFFFFF accessible from JavaScript. Chrome killed this specific primitive. But there's more, I'm not going to talk about them here.
  You may have a slight confusion here. In case of browser renderer, people starts with arbitrary JavaScript execution, the goal here is to do what JavaScript (on this page!) can't do, via memory corruption - including, but not limited to executing arbitrary native code. For example, for a few years, being able to access Chrome-specific JS APIs to send arbitrary IPC message to browser process (out of renderer sandbox), is one `bool` flag on .bss away from JavaScript. If we managed to get arbitrary R/W (that is, can read / write all memory within the renderer process, within JavaScript, see my ArrayBuffer example above), we just change it and run our sandbox escape against browser process in JavaScript, who needs that native code execution?
  Or, if you do want native code execution. For a few years in V8 the native code WASM gets compiled to, is RWX in memory, so you just use your arb R/W to write that. You can kill that too, but then people starts coming up with bizarre tricks like overwriting your WASM code cache when you load it from disk and before making it R/X, and there're enough fishes in the pool that you likely can't patch em'all.
  > I mean, if you are after RCEs, don't CFI and shadow stacks halt the program instead of letting the CPU jumping to the injected code?
  Yeah. But as I said, nowadays people usually use temporal memory safety bugs, and they want arbitrary R/W before they attempt to take over $rip. Don't get me wrong, this is because of the success of CFI and similar mitigations! So they did work, they just can't stop people from popping your phones.
  > can you name one widespread C++ exploit that:
  I just google'd "Chrome in the wild UAF" and casually found this in the first page: https://securelist.com/the-zero-day-exploits-of-operation-wi...
  I assume "in the wild exploited" fits your "widespread" requirement.
  Granted, it's five years old, but if you are okay with non-ITW bugs I can come up with a lot of more recent ones (in my mind).
  This is an UAF. So it would not have happened in a memory safe language. While back then the exploited chrome.exe may not have enabled CFG (it was enabled late 2021 IIRC), I don't see how the exploit path could be blocked by CFI.

dist-epoch 2 days ago

The C++ people were kind of ignoring the safety problems and Rust, but when Microsoft suddenly announced that all new code will try using Rust first, it's like suddenly they woke up and realized this is not a fad and that the gun is pointing at their heads.

like_any_other 2 days ago

As amusing as it is to imagine language developers getting executed when their language falls out of favor, I think most C++ people are happy about Rust. The problem are large codebases already in C++ that won't get rewritten, so we have to do the best we can, within the constraints of C++.
- IshKebab 2 days ago
  
  Depends what you mean by "C++ people". I think most of those C++ people who are happy about Rust would say they are now Rust people who may be forced to use C++ sometimes.
  
  bluGill 2 days ago
  
  No, I'm still a C++ person because while Rust is intriguing I have so much existing C++ it would be billions of dollars to rewrite to rust and it will take years to get more than a trivial amount of Rust. For the vast majority of new features the cost to implementing it in Rust is far higher than the cost of doing it in C++.
gpderetta 2 days ago

MS is a large contributor to the C++ standardization effort.
- pjmlp 2 days ago
  
  It is, while at the same time they have changed their point of view on allowing C++ for new projects at Microsoft.
  I also think Herb Sutter leaving his role at Microsoft might have been related with this.
  From "Microsoft Azure security evolution: Embrace secure multitenancy, Confidential Compute, and Rust"
  https://azure.microsoft.com/en-us/blog/microsoft-azure-secur...
  "Decades of vulnerabilities have proven how difficult it is to prevent memory-corrupting bugs when using C/C++. While garbage-collected languages like C# or Java have proven more resilient to these issues, there are scenarios where they cannot be used. For such cases, we’re betting on Rust as the alternative to C/C++. Rust is a modern language designed to compete with the performance C/C++, but with memory safety and thread safety guarantees built into the language. While we are not able to rewrite everything in Rust overnight, we’ve already adopted Rust in some of the most critical components of Azure’s infrastructure. We expect our adoption of Rust to expand substantially over time."
  From "Windows security and resiliency: Protecting your business"
  https://blogs.windows.com/windowsexperience/2024/11/19/windo...
  "And, in alignment with the Secure Future Initiative, we are adopting safer programming languages, gradually moving functionality from C++ implementation to Rust."
  Finally,
  "Microsoft is Getting Rusty: A Review of Successes and Challenges - Mark Russinovich"
  https://www.youtube.com/watch?v=1VgptLwP588
otabdeveloper4 2 days ago

Pretty much everything Microsoft does is guaranteed to be a fad. They announce a new language paradigm shift every 10 years. (And then abandon it 10 years later.)

saagarjha 2 days ago

The problem here is that these don’t actually solve the problems that attackers use to exploit software written in C++. Nobody cares what constexpr code can’t UB (I thought it already couldn’t…?). Attackers will take your object and double free it and there’s nothing in here that will stop them from doing that. Fixing this in C++ is actually very difficult and unfortunately the committee doesn’t want to do this because it would change the language too much. So we’re only getting the minor improvements, which are nice, but nowhere near what is necessary to “tame UB”.

bluGill 2 days ago

Attackers cannot double free memory. They can force you into a state where you double free memory, but that is a different situation.
- saagarjha 21 hours ago
  
  Well, they can force you into a state where you let them double free memory.

qalmakka 2 days ago

This is nice and all, but the main issue with UB and C++ IMHO has never been that "nice", modern codebases are problematic - modern C++ in the hands of competent people is very nice, really. The problem is that 90% of all C++ development ATM is done either on legacy codebases full of ancient crap or with old frameworks and libraries that spam non-standard containers, raw pointers, allocate manually, ...

In my experience, introducing modern C++ in a legacy codebase is not that much easier compared to adding Rust to it. It's probably safe to argue that C++03 stands to C++26 almost like K&R C stood to the original C++

techbrovanguard 2 days ago

> Tech pundits still seem to commonly assume that UB is so fundamentally entangled in C++’s specification and programs that C++ will never be able to address enough UB to really matter.

- denial ← you are here

- anger

- bargaining

- depression

- acceptance

Cope, seethe, mald, etc.

kookamamie 2 days ago

At depression they'll figure out the codebase is full of const-casts and null-dereferences.
I completely agree this is trying to polish a turd, essentially. The train has left the station some decades ago.
- steve_gh 2 days ago
  
  But you are never going to rewrite the gazillion or so lines of C++ out there, and currently being used in all sorts of production systems.
  But if you have a beter compiler that points out more of the problem UB areas in your codebase, then you have somewhere you can make a start towards reducing the issues and attack surface.
  The perfect is often the enemy of the good.
  (edit - typo)
  
  lambdaone 2 days ago
  
  I don't doubt that most of the gazillion of so lines of legacy C++ will never be rewritten. But critical infrastructure - and there's a lot of it - most certainly needs to be either rewritten in safer languages, or somehow proved correct, and starting new projects in C++ just seems to me to be an unwise move when there are mature safer alternatives like Rust.
  Human civilization is now so totally dependent on fragile, buggy software, and active threats against that software increasing so rapidly, that we will look back on this era as we do on the eras of exploding steam engines, collapsing medieval cathedrals, cities that were built out of flammable materials, or earthquake-unsafe buildings in fault zones.
  This doesn't mean that safer C++ isn't a good idea; but it's also clear that C++ is unlikely ever to become a safe language; it's too riddled with holes, and the codebase built on those holes too vast, for all the problems to be fixed.
  
  steve_gh 2 days ago
  
  I'm very much in agreement - in principle. But we are where we are, and that gazillion lines is out there. We don't necessarily know which bits of it are running critical infrastructure - I'm not sure that we are even sure which bits of our IT infrastructure are critical, and we don't always know what problems are lurking in the code.
  So yes, moving to safer alternatives is a very good thing. But that's going to take a long time and cost a lot of money, which we don't necessarily have. So if we can mitigate a bunch of the problems with improved C++, it is a definite win.
  Let's face it, most of central Italy is still beautiful little stone towns, despite being in an earthquake zone. People still live there in stone houses because demolishing and rebuilding half the country is just not feasible. Our IT infrastructure is possibly in the same state.
  
  usrnm 2 days ago
  
  A lot of very critical infrastructure is still not even rewritten into C, rewriting it all in Rust or whatever is a pipe dream. And before you say that the financial system is not critical, I'd like to see you stop relying on it.
  
  wffurr 2 days ago
  
  Does COBOL have undefined behavior and lifetime or aliasing issues like C? I have never heard that it does, but don’t know enough to say for sure it doesn’t.
  Rewriting in C seems like a dodged bullet. Better for it to stay on older safer languages.
  Most COBOL rewrites I have heard of went to Java, a safe language.
  
  Philpax 2 days ago
  
  At this rate, we're more likely to see major advancements in AI enabling verifiable rewrites than we are to see the C++ committee make any substantive efforts towards improving safety or ergonomics. And I'm only half-joking.
  
  lambdaone a day ago
  
  There is some really promising-looking work on using a mixture of LLMs and formal proof techniques and/or unit testing to perform reliable and idiomatic translation from unsafe to safe languages. See, for example, https://arxiv.org/abs/2503.12511v2 and https://arxiv.org/abs/2409.10506, and https://arxiv.org/abs/2503.17741v1
  The nice thing about this approach is that the LLMs don't need to be flawless for it to work, as the formal analysis / unit testing will keep their errors at bay - they just need to be good enough to eventually output something that passes the tests.
  
  pjmlp 2 days ago
  
  It starts by rewritting LLVM, GCC, CUDA, Vulkan, OpenGL, DirectX, Metal, POSIX,..... candidates?
  That is the problem, and why we need to fix C and C++, somehow.
  
  matu3ba 2 days ago
  
  A rewrite of such technologies would not fix their semantic problems and related architecture decisions around tractability, debugging etc. For example, rewriting LLVM and GCC does not fix the underlying problem of missing code semantics leading to miscompilations around provenance optimizations in the backend. Likewise, Vulkan is not an ideal driver API (to argue on GPU performance) and let's not even start with OpenGL. POSIX is neither optimal, nor has it a formal model for process creation (security). So far there is no good one for non-micro Kernels.
  From my experience with C++ I do expect 1. "verschlimmbessern"/aggravate-improving due to missing edge cases, 2. only spatial problems aka bound-checks to be usable (because temporal ones are not even theoretically discussed) and 3. even higher language complexity with slower compile times by front-end (unless C++ v2 like what Herb is doing becomes available).
  
  pjmlp 2 days ago
  
  Most likely, however then the question is what technoglies get to replace those with safer approaches.
  Which as proven by the failure to push safer whole OS stacks, tends to fail on the political front, even if the technologies are capable to achieve the same.
  I would have loved to Managed DirectX and XNA to stay around and not be replaced by DirectXTK, that Singularity, Midory, Inferno, Oberon, Midori,.... would gotten a place in the market, and so forth.
  
  lambdaone a day ago
  
  This is why Rust is the leading alternative to C/C++; it was designed from the start to both call and be called from other languages to enable progressive migration, rather than requiring an incompatible and impractical big bang change that would never happen.
  The mitigations in the cited article are good too, but they don't replace the need for safer languages.
  
  roca 2 days ago
  
  Writing new graphics drivers in Rust will definitely be helpful, and is starting to happen.
  The safety of LLVM and GCC need not be a priority... they're not normally exposed to untrusted input. Also, it's a particularly hard area because the safety of generated code matters just as much as the safety of the compiler itself. However Cranelift is an interesting option.
  No silver bullet here unfortunately... but writing new infrastructure in C or C++ should mostly be illegal.
  
  pjmlp 2 days ago
  
  Yeah, but API surface also needs to change for it to actually work.
  
  inglor_cz 2 days ago
  
  If we want everything sensitive to be written in Rust, we need many more Rust programmers...
  ... and we also need to be more paranoid about what makes its way into globally significant crates, otherwise we just trade one class of vulnerabilities for another.
  
  lmm 2 days ago
  
  > But you are never going to rewrite the gazillion or so lines of C++ out there, and currently being used in all sorts of production systems.
  We are, because we will have to, and the momentum is already gathering. Foundational tools and libraries are already being rewritten. More will follow.
  > But if you have a beter compiler that points out more of the problem UB areas in your codebase, then you have somewhere you can make a start towards reducing the issues and attack surface.
  Sure. But fixing those is going to be harder and less effective than rewriting.
- raverbashing 2 days ago
  
  Seriously
  wtf someone comes up with "X is UB" and even worse, "Since it's UB this gives a license to do whatever the f we want, including something that's clearly not at all what the dev intended"
  No wonder the languages being developed to solve real problems by people with real jobs are moving forward
  
  masklinn 2 days ago
  
  > Since it's UB this gives a license to do whatever the f we want, including something that's clearly not at all what the dev intended
  That’s really not how it works.
  Compilers rather works in terms of UBs being constraints (on the program), which they can then leverage for optimisations. All the misbehaviour is emergent behaviour from the compiler assuming UBs don’t happen (because that’s what an UB is).
  Of note, Rust very much has UBs, and hitting them is as bad as in C++, but the “safe” subset of the langage is defined such that you should not be able to hit UBs from there at all (such feasibility is what “soundness” is about, and why “unsoundness” is one of the few things justifying breaking BC: it undermines the entire point of the langage).
  
  Measter 2 days ago
  
  > Compilers rather works in terms of UBs being constraints (on the program), which they can then leverage for optimisations. All the misbehaviour is emergent behaviour from the compiler assuming UBs don’t happen (because that’s what an UB is).
  I think a good way to view this would be that optimization passes have invariants. The passes transform code from one shape to another while ensuring that the output from running the code remains the same. But in order for the transformation to be valid certain invariants must be upheld, and if they are not then the result of the pass will have different output (UB).
  
  masklinn 2 days ago
  
  That’s part of it, but compilers also use the information more directly especially in languages with inexpressive type systems e.g. dereferencing a null pointer is UB so the compiler will tag a dereferenced pointer as “non-null”, then will propagate this constraint and remove unnecessary checks (e.g. any check downstream from the dereference, or unconditionally leading to it).