The nested function execution interleaving blew my mind (imagine having to debug that), so I had to look it up. Apparently, interleaving is prohibited in C++17 onward. So this:
> The second one is there since this is C++, priority() might raise an exception, meaning that new Widget will be called, but never passed to std::shared_ptr, and thus never deleted!
Is now impossible (thank God!). See a full SO discussion here[1]. Stuff like this makes me so happy I don't write C++ anymore, but the gist of it is (from the standard):
> For each function invocation F, for every evaluation A that occurs within F and every evaluation B that does not occur within F but is evaluated on the same thread and as part of the same signal handler (if any), either A is sequenced before B or B is sequenced before A.
In three decades' use of C++, of all generations going back to original cfront, I have never encountered any difficulty, confusion, or actual problem arising from this phenomenon. It is a favorite of language lawyers and people borrowing trouble.
I really doubt this sort of hypothetical memory leak was ever significant, or that a real leak's root cause could be pinned on this. As the scenario depends on an exception being thrown, your code would need to throw exceptions repeatedly for the leak to be a problem. If your code throws so many exceptions that not freeing the memory allocated at that point becomes a problem, then you have far more serious problems to fix than language-lawyer corner cases.
Yes it does. Of course you can detect different evaluation orders using side effects, but the fact that evaluation order is unspecified isn't generally a problem: if you write code like `a() + b()` and a() and b() have side-effects that depend on the order they're evaluated in, that is just a bad way of writing code. In any language.
The point here is that the compiler is allowed to interleave subexpressions of the two different arguments, which causes a memory leak in seemingly safe code. That is a very unreasonable thing for the compiler to do, which is why it is a footgun. But note that it can ONLY happen if the second function throws in between the construction of the object with "new" and the construction of the shared_ptr: if the second function doesn't throw an exception, the construction of shared_ptr is guaranteed to happen.
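To make the scenario concrete, here is a minimal sketch of two ways to write the call so that a throwing second argument cannot leak, whatever order the arguments are evaluated in. `Widget`, `priority()` and `process()` are hypothetical stand-ins for the article's names; the live-instance counter is only there so a leak would be observable.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical Widget that counts live instances so a leak would show up.
struct Widget {
    static int live;
    Widget() { ++live; }
    ~Widget() { --live; }
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;
};
int Widget::live = 0;

// Stand-in for the article's priority(); in this sketch it always throws.
int priority() { throw std::runtime_error("priority failed"); }

// Stand-in for the function that receives both arguments.
void process(std::shared_ptr<Widget>, int) {}

// Option 1: make_shared allocates and takes ownership in a single call,
// so there is never a raw pointer waiting to be wrapped.
int live_after_make_shared() {
    try {
        process(std::make_shared<Widget>(), priority());
    } catch (const std::runtime_error&) {
        // expected in this sketch
    }
    return Widget::live;
}

// Option 2: take ownership in a separate statement before the call.
int live_after_named_pointer() {
    try {
        std::shared_ptr<Widget> w(new Widget);
        process(std::move(w), priority());
    } catch (const std::runtime_error&) {
        // expected in this sketch
    }
    return Widget::live;
}
```

Both functions return 0: the Widget is either never created (if `priority()` runs first) or is already owned by a `shared_ptr` that gets destroyed during unwinding.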
Incidentally: those "exercise for the reader" comments are incredibly obnoxious.
The number of stars evaluated before 'g()' (0, 1 or 2) varies between compilers, and the program I was working on would break if only one * was evaluated before g.
The order of parameter evaluation is not specified by the standard, so if your program behavior is relying on it it may be incorrect. But relying on it doesn't instantly make your whole program completely meaningless, nasal demons and all.
> Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
> Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [ Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. —end note ]
"Unspecified behavior" means something correct will happen, but you don't know what (in that case you can't rely on the execution order). "Undefined behavior" means something entirely incorrect may happen (crash or totally unexpected and bad behavior, so the term "nasal daemons" which means dragons coming out of your nose).
If it was undefined, then anything could happen (nasal demons), but that would be silly: you wouldn't be able to safely call functions with multiple side-effecting arguments at all.
Unspecified means that you can't rely on the order of evaluation. "Anything" may not happen; the compiler controls the sequencing.
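A small sketch of what "unspecified" looks like in practice, using the `a() + b()` shape mentioned earlier in the thread. The function names and the log string are made up for illustration. The sum is always the same, both functions always run exactly once, but which one runs first is up to the compiler.

```cpp
#include <cassert>
#include <string>

// Records which operand was evaluated first.
std::string order_log;

int a() { order_log += 'a'; return 1; }
int b() { order_log += 'b'; return 2; }

// The two calls may run in either order, but they cannot overlap,
// and the result of the addition does not depend on the order.
int sum_and_record() {
    order_log.clear();
    return a() + b();
}
```

After calling `sum_and_record()`, `order_log` is either "ab" or "ba". A program that only needs the sum is fine; a program that needs one specific log order is relying on unspecified behavior.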
Agreed. This code just looks weird to me. Throwing from a constructor? Holding the result of new() in a temporary? This is just asking for trouble.
Another favorite “criticism” I have of C++ goes along the lines of “but someone could overload operator+ to be division!” I have only seen that as a contrived critic’s example, or a joke.
That's an option, but now you're using a factory method instead of a constructor. It's a valid choice, particularly if you want to go for exception-free style - but GP's main point was that throwing from a constructor is not only not a footgun, it's actually the correct practice if you're writing RAII style.
Your program will still have as many possible memory states as if you had a bool m_isValid; in your type - as now you don't store a Foo but an optional<Foo>.
It's an improvement in terms of code reuse but not in semantic simplification basically.
Not if it's part of the state of your app ? E.g. a struct member. It also replaces an automated and enforced action by the compiler (exception being thrown) by a need for a manual check which feels very 1975
>Not if it's part of the state of your app ? E.g. a struct member.
I'm not sure what you mean. Here's an example with the output from std::optional being unwrapped and put in a struct member: https://godbolt.org/z/Ms4Kar18j
Also there's opt.value_or() which can use a default value in the case opt is nullopt.
>It also replaces an automated and enforced action by the compiler (exception being thrown) by a need for a manual check which feels very 1975
I think remram's goal was to avoid exceptions. So this is a criticism of remram's goal rather than the method to achieve that goal.
Also, std::optional can integrate with exception-producing code because opt.value() throws an exception if opt is nullopt.
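Since the godbolt links may not survive, here is a rough sketch of the pattern being discussed, not the linked code itself. `Foo`, `Config`, `make_config` and `foo_or` are hypothetical names: a factory returns `std::optional<Foo>`, the caller unwraps it before storing a plain `Foo` in a struct member, and `value_or()` / `value()` cover the fallback and exception cases.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical type with a precondition: the name must not be empty.
class Foo {
public:
    // Factory: reports failure as std::nullopt instead of throwing.
    static std::optional<Foo> create(std::string name) {
        if (name.empty()) return std::nullopt;
        return Foo(std::move(name));
    }
    const std::string& name() const { return name_; }

private:
    explicit Foo(std::string name) : name_(std::move(name)) {}
    std::string name_;
};

// The struct member is a plain Foo, never an optional<Foo>.
struct Config {
    Foo foo;
};

// Unwrap once at the boundary; failure propagates as nullopt.
std::optional<Config> make_config(const std::string& name) {
    std::optional<Foo> foo = Foo::create(name);
    if (!foo) return std::nullopt;
    return Config{*std::move(foo)};
}

// value_or() substitutes a fallback when creation fails.
Foo foo_or(const std::string& name, Foo fallback) {
    return Foo::create(name).value_or(std::move(fallback));
}

// value() throws std::bad_optional_access on an empty optional.
bool value_throws_on_empty() {
    try {
        (void)Foo::create("").value();
    } catch (const std::bad_optional_access&) {
        return true;
    }
    return false;
}
```

The private constructor is what keeps an invalid `Foo` unrepresentable; the cost, as the replies point out, is that every owner of a `Foo` needs its own unwrapping step.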
> I'm not sure what you mean. Here's an example with the output from std::optional being unwrapped and put in a struct member: https://godbolt.org/z/Ms4Kar18j
And now you have to do this for every type that has some amount of preconditions. I don't understand in which universe this is sane - more code == more bugs.
e.g. here's what one would write with exceptions: https://godbolt.org/z/evMsh6383 (and, to be honest, if one is writing e.g. a command-line tool and not a reactive gui app, most likely https://godbolt.org/z/eW8YM5Gz4 as your OS will catch the thing anyways and then it's just a `coredumpctl gdb` / "open in debugger" away) ; you get the same guarantee of never having an invalid Foo but at much less mental cost.
It also prevents you from having aggregates, as every "parent" owner class will need its own wrapping in optional + private ctor, in case a sub-sub-sub-sub field would fail.
now you can't just do DomainObject{initForMainFoo, initForSecondaryFoo, 123}; or DomainObject{.mainFoo = "", .secondaryFoo = "", .whatever = 123}; anymore, even less GameState{{DomainObject{...}, DomainObject{...}}}; which is how modern C++ is meant to be used
Exception handling may indeed end up with less code, but often it is because the developers forgot to handle most error situations properly. And that's my biggest gripe with exceptions, particularly the unchecked ones, which can just pop up from 100 layers below. I've just seen it too many times when developers just throw exceptions and let them bubble up to a place where there is so little context that the final error message presented to the user is near useless.
On the other hand return values (using something like a try/either monad) have the property of forcing the developer to think about the unhappy path in every layer. Maybe a bit more code, but the quality of diagnostics - priceless.
IMHO exceptions/panics should be reserved for bugs only (eg assertions), and return values for all the other error handling caused by user input or environment.
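For readers who haven't seen the return-value style in C++, a minimal sketch using `std::variant` as a poor man's either type. `Error`, `Result`, `parse_port` and `parse_endpoint` are all invented for this example; the point is only that each layer has to look at the failure and can add its own context before passing it up.

```cpp
#include <cassert>
#include <string>
#include <variant>

struct Error {
    std::string message;
};

// Either a value of type T or an Error.
template <class T>
using Result = std::variant<T, Error>;

// Lowest layer: parse a port number and say why it failed.
Result<int> parse_port(const std::string& text) {
    if (text.empty()) return Error{"port is empty"};
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return Error{"port '" + text + "' is not a number"};
        value = value * 10 + (c - '0');
        if (value > 65535) return Error{"port '" + text + "' is out of range"};
    }
    return value;
}

struct Endpoint {
    std::string host;
    int port;
};

// Next layer up: forced to handle the error, and adds context to it.
Result<Endpoint> parse_endpoint(const std::string& host, const std::string& port_text) {
    Result<int> port = parse_port(port_text);
    if (const Error* e = std::get_if<Error>(&port)) {
        return Error{"invalid endpoint for host '" + host + "': " + e->message};
    }
    return Endpoint{host, std::get<int>(port)};
}
```

Calling `parse_endpoint("example.org", "99999")` yields an `Error` whose message names both the host and the bad port, which is the kind of diagnostic that tends to get lost when an exception bubbles up unannotated.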
> I've just seen it too many times when developers just throw exceptions and let them bubble up
but it's the whole point ! that's exactly why exceptions are good !
> to a place where there is so little context that the final error message presented to the user is near useless.
there are only two valid cases: there's an error that you know you are able to recover from, you did recover, and the user does not see a message and never knows that an error happened, OR the user sees the message "The program has encountered an error. A backup of your data has been saved in c:/foo/. The program will now exit.".
Anything else makes software terrible to use. If you have logging and recovery to do, use your logging architecture and core dumps, don't misuse exceptions for that.
> On the other hand return values (using something like a try/either monad) have the property of forcing the developer to think about the unhappy path in every layer. Maybe a bit more code, but the quality of diagnostics - priceless.
Except what happens in practice is that people just put panics everywhere because most of the time there is no meaningful action one can take at the layer where the implementation happens. Actual examples in servo: https://pastebin.com/0DAFU9vS
I also invite you to run the following command:
$ cd ~/.cargo
$ rg panic! | wc -l
I get 2987 as a result. Choosing to program like this instead of using exceptions which at least can be recovered by the person who will use your lib, is literally hateful towards users.
I've not seen Go code to be meaningfully different either.
> IMHO exceptions/panics should be reserved for bugs only (eg assertions), and return values for all the other error handling caused by user input or environment.
Sure, this is the community consensus in C++. It's the Python people who use exceptions to get out of for loops :)
But the most important rule is that - invalid state must not be representable in the program. There must be no way to get an "invalid" object (unless the object being invalid is part of its possible domain, but this is exceedingly rare when one isn't interfacing with a 1985 C API)
> but it's the whole point ! that's exactly why exceptions are good !
Good for what? For crashing the program with a stacktrace? Probably. But for regular error handling - that's a terrible thing. They break normal control flow by adding a parallel / alternative control flow. Now with them you have to think about what happens if any of the code you call suddenly throws an exception, and the number of potential control flow paths increases drastically. They are not referentially transparent and it is nearly impossible to do functional programming with them (they introduce side effects everywhere). You can also end up in a really bad situation when a destructor throws (typically a crash).
> I get 2987 as a result. Choosing to program like this instead of using exceptions which at least can be recovered by the person who will use your lib, is literally hateful towards users.
But panics are Rust equivalent of exceptions.
That's the whole point of them. They are used for exceptional cases, which are impossible to recover from (which means the only way to deal with is to dump core and terminate) and most of the time preventable (you can always avoid dividing by zero or going out of array bounds).
It looks like we agree they should not be used for regular errors, like e.g. file not found. I'd not like my word processor to terminate with a core dump when it couldn't load the file from the disk. In this case return values are way better error handling mechanism.
> invalid state must not be representable in the program
Returning an option/try/either strikes that goal just as well as exceptions.
Yeah, I know I can recover, that's why I said they are equivalent of C++ exceptions (which can be caught as well). But this is not a recommended idiomatic way of handling expected errors, e.g. input errors, improper configuration, I/O errors etc.
Not always, e.g. not if your class includes a member that is a reference (and, say, aggregate initialization is not allowed for one reason or another).
That's a good point... You might be able to work around that using a template function factory e.g. `make_foobar<ChildrenFooBar>()`. I'm not sure what's a good pattern here, I don't write much C++.
In practice I have not found that running the destructor is problematic. In many cases the destructor is the default one that just destructs the fields. When the destructor should be non-default, the constructor can leave the object in the moved-from state either explicitly or via hacks like using T tmp(std::move(*this)) on the error return in the constructor.
Throwing from a constructor is a fundamental way to make it impossible to construct invalid instances of a class. It's great, especially for immutable classes. Throwing from constructor + immutability = no invalid states ever.
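A tiny sketch of that idea, with a made-up `Percentage` type: the constructor validates, the only member is const, so any `Percentage` you can get your hands on is in range. `rejects()` is just a helper for trying it out.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical immutable value type: a percentage in [0, 100].
class Percentage {
public:
    explicit Percentage(int value) : value_(value) {
        // Throwing here means no Percentage object ever exists out of range.
        if (value < 0 || value > 100) {
            throw std::invalid_argument("percentage out of range");
        }
    }
    int value() const { return value_; }

private:
    const int value_;  // never changes after construction
};

// Helper: true if constructing a Percentage from `value` throws.
bool rejects(int value) {
    try {
        Percentage p(value);
        (void)p;
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}
```

Code that receives a `Percentage` never needs an `isValid()` check, which is the whole appeal.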
> Another favorite “criticism” I have of C++ goes along the lines of “but someone could overload operator+ to be division!” I have only seen that as a contrived critic’s example, or a joke.
Yeah there's definitely criticisms you can throw at C++, but this always struck me as stupid. You could also write an `add` method that divides in Java.
I've never liked the idea of 'operators' in the first place, tbh. Scheme opened my eyes in that they're all just procedures. And then with Smalltalk and Scala I saw that they can all just be methods as well.
I don't see how that is so. C++ 17 allows new Widget to complete and then the priority() call to execute and throw before both are passed to shared_ptr(), thus creating a leak. Your cited example [1] doesn't leak because both arguments are shared_ptr. Seems to me that C++ 17 does indeed solve this latter case but not the former.
> evaluations of A and B are indeterminately sequenced: they may be performed in any order but may not overlap: either A will be complete before B, or B will be complete before A. The order may be the opposite the next time the same expression is evaluated.
And:
> 21) Every expression in a comma-separated list of expressions in a parenthesized initializer is evaluated as if for a function call (indeterminately-sequenced)
Emphasis mine. What the above basically says is that in the case of some function `f(A, B)`, the arguments `A`, and `B`, are what's known as "indeterminately-sequenced" -- this mean that their execution cannot be interleaved (overlap) -- but they still individually execute in a non-deterministic order (A before B, and sometimes B before A)!
With that said, the good news is that B can now never throw in the middle of A, which is precisely what we have in OP's example.
Whew thanks for looking this up! I was afraid I'd have to add yet another entry to my gigantic C++ footguns.org document. It's getting so big now that Emacs struggles a bit to load it due to inline code examples!
Your footguns.org file legitimately sounds like something that other C++ programmers would find useful. I for one, being a rank novice at the language (despite attempting to use it for the last ten years!), would certainly appreciate a reference like that.
Most of it is basically from cppcon talks and Scott Meyers's books, but the advice I basically give to incoming C++ programmers today is this:
* Start with C++20 unless you have a very good reason not to. Not only does it obviate crufty old things like SFINAE (concepts!), but it includes a ton of usability fixes (e.g. reflexive operator== -> I only have to write the code for MyType == MyOtherType in order to get both that version and MyOtherType == MyType). A lot of lambda behavior has really been cleaned up too (e.g. capturing `this` has fewer corner cases).
* start with Google's coding guidelines [0], as they've been developed to avoid many footguns (the bottom line depends on it). Once you understand things better, keep removing these rules (e.g. about exceptions, const& parameters; many of these rules are good for big teams, but annoying to follow for individuals)
* The CppCon "back to basics" talks are absolute goldmines, and the speakers usually tell you if some technique has been outdated as of the recording. Some things in certain books, although valuable, may be outdated based on later revisions to the language.
* The best notes to take are usually some structured list followed by a bunch of godbolt links with examples, e.g. this one I made to demonstrate the reflexive operator== behavior [1]
IIRC, this may also leak when `priority()` throws due to evaluation order. (Not exactly sure now, as I now always use `make_shared` whenever possible.)
However in the Effective C++ example, the `shared_ptr` constructor gave a false sense of security as it seemed the `new`-ed `Widget` was always managed by the smart pointer from its allocation.
I always wonder if compilers actually do weird optimizations taking advantage of semantics like those.
Another example is when undefined behavior occurs, like (iirc) signed integer overflow. Compilers can technically do anything they want, including weird things. But how often does that actually happen?
Then compilers will use the fact that integer overflow is undefined behavior. This allows the compiler to assume the `if` statement will always be true. And hence it removes this check.
This isn't just the compiler removing a check the programmer wanted. This could occur because of constants, where other constants make the check meaningful. It could also occur after in-lining a specific function call.
Really, the only case where this optimization would be bad is if the author was trying to detect overflow. Sadly, detecting if something would cause undefined behavior in C code is hard. Because you cannot try the thing.
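Since you cannot perform the addition and then look at the result, the usual workaround is to check the operands first. A minimal sketch (the function name is made up), assuming plain `int` addition is what needs guarding; a check written the other way round, like `x + 1 > x`, is exactly the kind of thing a compiler may fold to true.

```cpp
#include <cassert>
#include <limits>

// True when a + b would overflow int. Only the operands are compared,
// so no overflowing addition is ever evaluated.
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > std::numeric_limits<int>::max() - b;
    if (b < 0) return a < std::numeric_limits<int>::min() - b;
    return false;
}
```

Usage would look like `if (!add_would_overflow(x, y)) total = x + y;`. Some compilers also offer builtins for this, but the version above only uses standard C++.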
Others have mentioned that compilers take advantage of these things for optimization, but haven't mentioned that compilers will still often do unexpected things when encountering UB with optimizations off.
Turning optimizations on doesn't suddenly undefine any behavior, it just changes what the compiler does when encountering some of those behaviors (among other, non-UB based optimizations). A non-optimizing compiler and an optimizing compiler both have to conform to the standard.
Different architectures have their own function passing convention. It’s convenient to evaluate left-to-right, pushing as you go, if that matches the arch. And vice versa.
That would explain why you want it implementation-defined, but leaving it indeterminate means that the same piece of code is allowed to have different evaluation orders each time it is encountered. This is much stranger, and very specific to C and C++, I don't think there is any other language that doesn't specify the order of execution.
If you think about how it interacts with inlining, it's easy to see how optimizers like this freedom. Suppose there's just one call that may throw and the others are pure functions. The compiler could bunch the pure ones together to optimize register and stack allocation.
I was claiming it could be left implementation defined instead of unspecified. That way, code for particular platforms could be evaluated in different order, but it would always be deterministic for a particular compiler.
The GP that my post was replying to was claiming it was arch dependent, and I was pointing out that, if it had been, they would have likely not left it unspecified. So, I think you and I are in agreement.
During the standardization of C++17, a proposal specifying the left-to-right order of evaluation of parameters (and other operations) came up and the committee got very close to accepting it, but in the end there were concrete examples in actual code that was worse off because of the proposal, so it was unfortunately weakened at the last minute.
Looking at the proposal in question [0], it seems very sad that the alternate, not recommended, choice was made instead [1] , especially since the reasoning seems to have not been put down in writing anywhere. The paper does mention that the difference in performance was <4%, with both improvements and worsening being seen by the VC++ implementers.
Why shocked? Certainly a micro-optimization, but order of things often matters if the compiler cannot prove it's safe to speculate about the execution order. Maybe the surprising thing here is the theme of TFA, that the order of argument evaluation is not defined.
Given the out of order nature of processors, and the relatively limited scope for complex expressions given as function arguments, I would not expect this to matter in real-world programs. It seems this was also the opinion of the people drafting the new evaluation order changes [0] in C++17:
> We do not believe that such a nondeterminism brings any substantial added optimization benefit, but it does perpetuate the confusion and hazards around order of evaluations in function calls. It perpetuates unnecessary confusion around brace-initialization vs. direct initialization using parenthesis.
> We found that some entries in the benchmark suite ran slower, others ran faster compared to the scenario where the evaluation of the argument list is left unspecified. The variation is between -4% and +4%. It is worth noting that these results are for the worst case scenario where the optimizers have not yet been updated to be aware of, and take advantage of the new evaluation rules and they are blindly forced to evaluate function calls from left to right. It is clear that the left-to-right evaluation strategy is triggering new optimization paths (different inlining decisions and different register allocation) affecting the variations in the benchmark performance. It appears those opportunities have not traditionally been exploited, even though permitted under the unspecified order regime.
Unfortunately, they seem to have been overruled by the Core Working Group [1], for reasons I have not been able to dig up.
Haskell? Expressions have to be evaluated in order to evaluate expressions that depend on them, but otherwise the compiler is free to decide the order of evaluation.
> In this case, it's pretty easy to see that 3 will have to be printed after both 1 and 2, but it's unclear whether 1 or 2 will be printed first. The reason for this is evaluation order: in order to evaluate one + two, we'll need to evaluate both one and two. But there is nothing telling GHC which of those thunks should be evaluated first, and therefore GHC is at full liberty to choose whichever thunk to evaluate first.
Well, Haskell's lazy execution model and implicit purity make this far different from C++. The examples there all rely on the guarantee-breaking unsafePerformIO, while in C++ you can break your program with perfectly safe, normal C++ code, such as creating a shared pointer and throwing an exception.
Note that in Haskell evaluation order is not guaranteed anywhere, even between separate statements, whereas in C++ it is basically guaranteed everywhere except function calls.
unsafePerformIO was there to make the indeterminate evaluation order visible, but isn't what causes that indeterminate evaluation order. You are right, though, that for 'safe' Haskell code the evaluation order shouldn't matter, although it is sort of a matter of convention; 'head' is unsafe, but is presumably now permanently stuck in the language, while on the other hand perhaps one could write C++ without exceptions, but that wouldn't be 'normal' C++ code.
Sure, my point was that, as far as I understand, even Haskell code that performed IO would have a well-determined order, if it was doing normal IO; whereas C++ code that does something as simple and safe looking as foo(i++, i) may produce different results in different runs (it was even UB before C++17, and still is in C; and even other common idioms, such as function chaining, had undefined execution order).
Yes and no. The order IO is performed is well-defined (that's the purpose of the IO monad), but the evaluation order is not (apart from ordering arising from data dependency), and this is the case with or without unsafePerformIO. You just won't get any different behaviour with safe Haskell code, and Haskell is very much on the safe side of things, so even though the order of evaluation may be indeterminate, you would never notice it (maybe a timing attack is possible?). But this is admittedly hair-splitting.
Most optimization uses of UB that I've seen take full liberty to remove branches and instructions. For example, if the result of UB is used in a comparison, the compiler is free to pick whichever side of the branch (even inconsistently) and elide the other. It is actually quite common with, for example, loop unrolling and vectorization (ignoring signed overflow of the indvar is a huge boon).
The answer is yes. Compiler authors absolutely take advantage of these things for optimizations.
As for paranoia about undefined behavior? Compilers won't emit a program to call you a pizza or format your disk. There are risks, but those aren't real ones.
If you already have a program that contains instructions that are manipulating your disk in these sorts of ways then this is within some reason. But the usual hyper-concern about UB is that it'll take a program that otherwise does completely different things (imagine some CLI utility that just reads from stdin) and instead produce some maximally adversarial program that ruins your life. This isn't a real concern.
There was a post here not too long ago about how C/C++ compiler writers have consistently taken the wiggle room of "undefined behavior" to do terrible, terrible, damage to otherwise fine code. https://news.ycombinator.com/item?id=27221552
Undefined behavior is unbounded in both space and time in terms of program semantics, that is, a program that will engender undefined behavior may do so at any point during the course of program execution (unbounded in time) as well as may be observed within any memory region (unbounded in space). A colloquial way of saying this is that a compiler is free to do anything it wants anytime it wants in the presence of undefined behavior.
Unspecified behavior is bounded both in space and time. Unspecified behavior may only affect program semantics within the statement in which it is engendered and may only be observed within the region of memory that is written to or read from.
It is often said that undefined behavior is invalid, and one can sympathize with that statement as it's commonly repeated, but it's not true. All the standard says about undefined behavior is that the standard imposes no requirements about program behavior, it does not state that the program is invalid or that the program must behave in some kind of invalid manner. Almost all non-trivial C++ programs engender undefined behavior in a manner documented by the compiler and do so in a safe and principled way.
In English, that's a reasonable point but within the C++ standard behaviour is declared to be specifically undefined in the text - that is to say, we are telling you, you may write this construct however said construct is not C++
The author writes "this might be why there is std::make_shared", but that's not really why it exists. When you create a std::shared_ptr it needs to allocate memory for a control block structure (which tracks the reference count) as well as the memory for the object itself. If you write std::shared_ptr<Foo>(new Foo) this will result in two memory allocations, one for the Foo object and the other for the std::shared_ptr control block. If you write std::make_shared<Foo>() then a single allocation will happen which has enough space for both the control block and the Foo object, and then placement new is used to initialize the memory for both structures. So std::make_shared exists to reduce the number of memory allocations that are being made, not for the reason suggested here.
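The allocation counts can be observed with a counting allocator. This sketch uses `std::allocate_shared` as a stand-in for `std::make_shared` (same idea, but it lets us pass an allocator), and builds the two-step version by hand. `Foo` and `CountingAllocator` are invented for the example, and the single combined allocation is what mainstream standard libraries do rather than a hard requirement, as far as I know.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

struct Foo {
    int payload[4] = {};
};

// Minimal allocator that counts allocate() calls through a shared counter.
template <class T>
struct CountingAllocator {
    using value_type = T;
    int* count;

    explicit CountingAllocator(int* c) : count(c) {}
    template <class U>
    CountingAllocator(const CountingAllocator<U>& other) : count(other.count) {}

    T* allocate(std::size_t n) {
        ++*count;
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) { std::allocator<T>{}.deallocate(p, n); }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>& a, const CountingAllocator<U>& b) {
    return a.count == b.count;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>& a, const CountingAllocator<U>& b) {
    return !(a == b);
}

// Like shared_ptr<Foo>(new Foo): the object is allocated first, then the
// shared_ptr constructor allocates a separate control block.
int allocations_separate() {
    int count = 0;
    CountingAllocator<Foo> alloc(&count);
    {
        Foo* raw = alloc.allocate(1);
        new (raw) Foo{};
        auto deleter = [alloc](Foo* p) mutable {
            p->~Foo();
            alloc.deallocate(p, 1);
        };
        std::shared_ptr<Foo> sp(raw, deleter, alloc);
    }
    return count;
}

// Like make_shared<Foo>(): object and control block share one allocation.
int allocations_combined() {
    int count = 0;
    {
        auto sp = std::allocate_shared<Foo>(CountingAllocator<Foo>(&count));
    }
    return count;
}
```

On the implementations I'm aware of, `allocations_separate()` returns 2 and `allocations_combined()` returns 1.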
There's a crazy downside to make_shared that I learned recently because of this: if you have a weak pointer to a shared thing, and the refcount for the shared thing drops to zero, the weak pointers will keep the allocation for the object "alive", because they still need access to the remnant and the remnant was created in the same allocation as the object so they can't be freed separately. So now I only use make_shared if I know for sure there won't be a weak_ptr pointing at it (or if the base object has a relatively small memory footprint after it's been destructed).
Won't the object be destructed though? So while the memory for the object is kept, the memory for the object's heap-allocated members is not. I.e. if you have a 100mb string, after only weak references are left you won't have 100mb memory taken, but only sizeof(string) + sizeof(control block).
Depends on the object. As OP notes, `make_shared` with weak pointers is fine "if the base object has a relatively small memory footprint after it's been destructed".
There's lots of cases where the object itself is big, though. Think of objects with big fixed arrays, "god objects" with a bajillion pointers, or objects which themselves allocate data in-line.
Yeah, that's definitely something to be aware of. It's usually not an issue as most objects have small footprint (and any allocations they in turn hold would be released when the strong refcount goes to 0).
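To confirm the "destructed but not deallocated" part, a small sketch with a made-up `Tracked` type: the destructor runs as soon as the last `shared_ptr` goes away, even while a `weak_ptr` is still around. What the `weak_ptr` keeps alive is only the raw storage, which this example cannot observe directly.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts destructor calls.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// True if the object was destroyed while a weak_ptr still referred to it.
bool destroyed_while_weak_alive() {
    Tracked::destroyed = 0;
    auto strong = std::make_shared<Tracked>();
    std::weak_ptr<Tracked> weak = strong;
    strong.reset();  // last strong reference gone: ~Tracked runs here
    return Tracked::destroyed == 1 && weak.expired() && weak.lock() == nullptr;
}
```

So heap memory owned by the object's members is released at that point; only the object's own footprint plus the control block lingers until the last `weak_ptr` is gone.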
I believe this is correct. Here's [0] some old Boost documentation on Boost's make_shared which inspired the C++ standard's make_shared. It mentions both reasons.
I think the idea that you are also relying on a presumably well tested library to get the exception corner cases right is also noteworthy. Memory leaks on allocation failure are pretty common in naive code, and a good thing to handle in a library where it can get well thought out.
My (least) favorite footgun is "auto" when it comes to references.
If you want to get a pointer from something, you would write this:
auto fooPtr = myWidget.getPtr();
fooPtr->doCoolStuff();
But we all know references are better than pointers right? So we should just write...
auto fooRef = myWidget.getRef();
fooRef.doCoolStuff();
The problem is... this is valid and will probably not do what you want. It will make a copy. What you want is actually
auto& fooRef = myWidget.getRef();
I once spent an entire day debugging a bizarre crash because of that. I was asking for a reference to a scene graph, and what was actually happening is I was getting a clone of the scene graph (which was an object that couldn't be safely copied), and then destroying a bunch of shared pointers when the function returned. Fun times.
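A compact way to see the difference, with made-up `Scene` and `Widget` types and a copy counter. `auto` deduces the value type and copies; `auto&` binds to the original.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copied.
struct Scene {
    static int copies;
    int nodes = 0;
    Scene() = default;
    Scene(const Scene& other) : nodes(other.nodes) { ++copies; }
};
int Scene::copies = 0;

struct Widget {
    Scene scene;
    Scene& getRef() { return scene; }
};

struct Outcome {
    int copies;          // copy constructions that happened
    int original_nodes;  // value seen in the Widget's own Scene afterwards
};

// Plain auto: deduces Scene, so the edit lands on a copy.
Outcome edit_via_auto() {
    Scene::copies = 0;
    Widget w;
    auto scene = w.getRef();
    scene.nodes = 5;
    return {Scene::copies, w.scene.nodes};
}

// auto&: deduces Scene&, so the edit lands on the original.
Outcome edit_via_auto_ref() {
    Scene::copies = 0;
    Widget w;
    auto& scene = w.getRef();
    scene.nodes = 5;
    return {Scene::copies, w.scene.nodes};
}
```

`edit_via_auto()` reports one copy and an untouched original; `edit_via_auto_ref()` reports no copies and the original updated to 5.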
As others have pointed out, the idiomatic solution here is to delete the copy constructor if an instance is unsafe to copy — however, I suspect the reason why auto behaves differently from other inference in requiring this to be explicit is probably something along the lines of making it harder to have a dangling reference. It’s shockingly easy to wind up with a dangling reference with fairly innocuous code, something like QString().toUtf8().data() so maybe this makes sense. (Doesn’t help for that case since it’s a pointer to raw data, but you get the picture.)
I think, all things considered, the way it works is probably the best way, since you might want to use "auto" to actually make a copy. It's just very surprising, because when I have explicit types I'm used to looking for that sort of error, but with an auto I wasn't (at the time) used to looking for that.
Worse yet is an object which is safe to copy in some cases, but you intended to not copy it at the moment, and you create a dangling reference to a field within. I wish C++ copy constructors were explicit by default. (But declaring the copy constructor explicit turns off aggregate initialization, so I can't do that.)
Wait, you had an object that was unsafe to copy, but you told the computer it was safe by not deleting the copy constructor? I guess it should have done what you wanted, not what you told it.
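For what it's worth, this is roughly what "telling the computer" looks like. `SceneGraph` and `Widget` are hypothetical; with the copy operations deleted, the accidental `auto g = w.getRef();` becomes a compile error instead of a silent clone.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type that must never be copied.
class SceneGraph {
public:
    SceneGraph() = default;
    SceneGraph(const SceneGraph&) = delete;
    SceneGraph& operator=(const SceneGraph&) = delete;

    void add_node() { ++nodes_; }
    int node_count() const { return nodes_; }

private:
    int nodes_ = 0;
};

class Widget {
public:
    SceneGraph& getRef() { return graph_; }

private:
    SceneGraph graph_;
};

// The compiler now knows copying is not allowed.
static_assert(!std::is_copy_constructible<SceneGraph>::value,
              "SceneGraph must not be copyable");

int add_node_through_reference() {
    Widget w;
    // auto g = w.getRef();  // would not compile: copy constructor is deleted
    auto& g = w.getRef();    // binds to the Widget's own SceneGraph
    g.add_node();
    return w.getRef().node_count();
}
```

This only helps for types that are never safe to copy; it does nothing for the "safe to copy sometimes" case raised above.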
A C++ object can be perfectly safe to copy in one situation and conjure up nasal demons in another situation. I don't think rule of 5 would help in that case.
Probably one of the only common language-lawyer pieces of criticism of C++ I fully agree with. I'm a pretty firm believer that a language should be as constrained as possible with the ability to make it explicitly less-constrained.
Rule of three is such a common C++ concept though, I would be surprised if anyone with more than a few weeks' experience with the language is still running into such issues.
I've been coding C++ off and on since 1998. But not all the code I use is written by me... not even most of it. I'd love it if everyone followed best practices... but they don't.
There is a problem indeed, which is OP's failure to implement exception-safe code.
If someone opts to use exceptions, they have the responsibility of writing their code in a way that complies with scenarios involving throwing exceptions.
Failing to support a scenario by not following the most basic principles is something that's on the person writing the code, not the language.
It would make as much sense to blame Java for the problems you create by not initializing objects in some code paths, because your code started to throw null pointer exceptions.
Are you talking about the same thing as everyone else?
auto fooRef = myWidget.getRef(); vs. auto& fooRef = myWidget.getRef();?
That's not a problem caused by exceptions.
If you're talking about the article, the standards committee seems to think it's a confusing interpretation that never should have existed, and it's pretty ridiculous to say it's "basic principles".
Yeah I learned that the hard way when I found out that std::vector apparently feels free to move your objects around in memory for you, no matter if you point to them from somewhere else.
I mean in hindsight it's obvious, but still not exactly what I wanted to happen.
For reference, what I did does actually work, temporarily.
> Yeah I learned that the hard way when I found out that std::vector apparently feels free to move your objects around in memory for you
In standard terminology, this is described as "invalidating iterators". There are a bunch of member functions in std::vector that either do or don't invalidate iterators, e.g. push_back(...) does (whenever it has to reallocate) but size() doesn't. And as the name implies, if you call a function that invalidates iterators, all your existing iterators/pointers/references become invalid.
Iterator invalidation is different from this. Iterator invalidation, as the name suggests, only applies to iterators. The problem OP was having had to do with dangling references.
Some containers guarantee references won't dangle on mutation such as unordered_map. An unordered_map may invalidate iterators if objects are added or removed, but will never result in dangling references (unless the object is removed). That is, it is safe to have a pointer to an object owned by an unordered_map and continue using that pointer even after an iterator to that same object is invalidated.
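A quick way to see that guarantee in action. This is only a sketch: the function name is mine, and the element count is just chosen large enough that the table has to rehash:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Returns true if a pointer to a mapped value still refers to the same
// element after enough insertions to force the table to rehash.
bool pointer_survives_rehash() {
    std::unordered_map<int, std::string> m;
    m.emplace(0, "first");
    std::string* p = &m.at(0);
    const auto buckets_before = m.bucket_count();

    // Grow the map well past its initial bucket count.
    for (int i = 1; i < 10000; ++i) m.emplace(i, "filler");

    // The bucket array was rebuilt (iterators may be invalid now)...
    assert(m.bucket_count() != buckets_before);
    // ...but the element itself never moved.
    return p == &m.at(0) && *p == "first";
}
```

Iterators obtained before the loop could not be relied on afterwards, but the pointer `p` can.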
This has its own problems. Pointer stability limits the internal implementation so much that unordered_map is embarrassingly slow given what we know about modern hashmap design. There is a good reason why the swisstables that Google built dropped this behavior.
This is a poor understanding of what an iterator is. For one, the iterator of a std::vector<T> is not a T*, it's a std::vector<T>::iterator and you are welcome to verify that the following assertion fails for all T:
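The assertion itself isn't reproduced in this thread. A sketch of the kind of check the comment seems to have in mind follows; the variable template name is my own, and the result is a property of the standard library implementation, not something the standard mandates:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper: true only if the library implements
// std::vector<T>::iterator as a plain T*.
template <typename T>
constexpr bool vector_iterator_is_raw_pointer =
    std::is_same_v<typename std::vector<T>::iterator, T*>;

// As far as I know this holds on libstdc++, libc++ and the MSVC STL,
// all of which wrap the pointer in a class type.
void check_vector_iterators() {
    assert(!vector_iterator_is_raw_pointer<int>);
    assert(!vector_iterator_is_raw_pointer<double>);
}
```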
Furthermore just like a cat is animal does not imply an animal is a cat, a pointer being an iterator does not imply that an iterator is a pointer and holding such a confusing thought is the source of many bugs and poor C++ code. An iterator not only represents a form of indirection to an object, it also represents traversal through a collection of values.
unordered_map guarantees that the objects it stores will remain alive until the unordered_map is destroyed or the object is removed from the map. However, unordered_map, as the name once again suggests, does not provide any guarantee about the ordering of objects. Hence, if you have an iterator to an object owned by such a map, and you add or remove objects to that map, the iterator becomes invalid not because the object the iterator references gets destroyed, but because the iterator loses information about where it's located within the unordered collection.
For further information, please review the following link [2], specifically the quote "References and pointers to either key or data stored in the container are only invalidated by erasing that element, even when the corresponding iterator is invalidated.".
You're being far more patronizing than is necessary here, especially because the parent poster was correct that iterators originated as a generalization of pointers, although not in their latter statement. But, while implementation defined, a vector iterator must behave identically to a pointer, and your assertion fails in both Clang and GCC not for any interesting reason but more because they wrap the pointer into a class for what I understand to be better encapsulation and earlier detection of nonportable code, not because pointers are somehow unfit to do the job of std::vector<T>::iterator. It so happens that iterators have become a useful way to generically iterate over noncontiguous containers as well, but it is very clear that they are "logically" meant to mirror the usage of a pointer (down to the operators you use to work with them) and suggesting that holding this opinion is a sign of inadequacy is condescending.
>because the parent poster was correct that iterators originated as a generalization of pointers
And an animal is a generalization of a cat, but that doesn't mean an animal is a cat. What is true of a particular doesn't have to be true of the general.
> But, while implementation defined, a vector iterator must behave identically to a pointer
No it does not. MSVC implements a vector iterator with additional runtime safety guarantees in DEBUG mode and while those guarantees are not mandatory as per the standard, they are fully in compliance, so saying they have to behave identically is patently false as MSVC's behavior is a superset of the behavior provided by a pointer.
> suggesting that holding this opinion is a sign of inadequacy is condescending.
If you care about writing correct code, then do not mix the very concept of pointers, arrays, references, and iterators with one another. They are all related to one another but have very important differences to the point that calling them literally identical to one another is a sign of inadequate understanding. It happens far too often and inevitably leads to poor code, bugs, and a fundamentally poor understanding of what these things represent conceptually.
First of all, calm down. Consider that the people you are talking to may not be idiots.
Second of all, nobody is saying that pointers, arrays, references, and iterators are identical in general. They are clearly related concepts but differ in fundamental ways. However, my point was that a vector iterator needs to behave like a pointer would and can in fact be implemented as a naked pointer. Pointing to MSVC's implementation having safety checks does not change that fact, because pointing to what an implementation does on undefined behavior is clearly out of scope when discussing specified behavior. I'm not going to bring address sanitizer into this discussion to show how pointers can actually be a superset of "pointer behavior" because that would just be irrelevant. The fact remains that iterators were originally meant to act like pointers, and in many cases still do. Believing this does not mean I have a poor understanding of the language.
Not knowing something does not make you an idiot, there are tons of things I don't know about C++ and I make tons of mistakes. However, not knowing something and doubling down on that lack of knowledge by defending something that is wrong because you don't like the tone of the person correcting you about it is highly problematic and suggests you are prioritizing your own sensitivity on the issue and protecting your own ego instead of taking the opportunity to learn something new.
>Second of all, nobody is saying that pointers, arrays, references, and iterators are identical in general.
The comment I replied to said, and I quote:
"I mean, literally. The iterator of an std::vector<T> is a T*."
That is what I originally replied to before you felt it necessary to interject yourself into the conversation to police my tone.
>However, my point was that a vector iterator needs to behave like a pointer would and can in fact be implemented as a naked pointer.
No, you said it is required that they behave identically, which is false, once again, I quote you:
"a vector iterator must behave identically to a pointer"
There is no such requirement that they behave identically, and there is a demonstrable example of a compiler where they do not behave identically. Furthermore your talk about it being undefined behavior is false on the basis that in general, two pointers of type T* may compare with one another, for example the following is valid:
auto v1 = new int();
auto v2 = new int();
v1 == v2; // This is valid.
However, this is undefined behavior.
auto v1 = vector_a.begin();
auto v2 = vector_b.begin();
v1 == v2; // This is undefined behavior.
Only iterators to the same vector may be compared, which is not true of pointers.
>Believing this does not mean I have a poor understanding of the language.
I did not say you have a poor understanding of the language as a whole, I said if you believe that an iterator to a vector is identical to pointer as you have claimed, then you have a poor understanding of iterators.
I stand by that statement and I would suggest that you not be so sensitive about being wrong about a corner case of the language and instead simply accept this to be a fact, learn something new, and move on with your day. I learn new things about C++ all the time, sometimes from polite people, sometimes from jerks, it's all the same because what matters is the learning. Something doesn't become wrong just because the person who told it to you said it in a way you disapprove of.
Having said that, all the best to you Sir and thank you for engaging in this discussion. I have said all I think is appropriate given the topic.
> However, not knowing something and doubling down on that lack of knowledge by defending something that is wrong because you don't like the tone of the person correcting you about it is highly problematic and suggests you are prioritizing your own sensitivity on the issue and protecting your own ego instead of taking the opportunity to learn something new.
Reminder that I was not the one who passed up the opportunity to reply to 'einpoklum with a simple "iterators can behave like pointers, but vector's iterators don't have to be pointers. You can see this here: https://godbolt.org/z/Y5f4d74Kb". I'm just annoyed that you went after this person for coming to an incorrect conclusion from what is clearly a correct position, which is that iterators were intended to "morally" be pointers, especially in cases where they iterate over contiguous containers. You are correct that I claimed that vector iterators must behave identically to pointers, but I really meant to say that this is only in cases where the iterator has behavior defined on it: I alluded to it previously when I mentioned that GCC and Clang wrap pointers in their own custom iterator classes to prevent people from accidentally using operations on them that are not legal. There's a bunch of other things that the iterator doesn't guarantee, like being directly cast to an integer.
But, stepping back a bit: do you really disagree with a claim that a vector iterator is meant to be a little pointer into the internal buffer it has, with a bit of additional restrictions on top that are reasonable to add? This seems to be a very strange thing to disagree with, and I would like to hear more about why you hold that position.
FWIW I believe that the parent was pedantic but correct.
This being C++, being pedantic is a Good Thing.
The distinction between reference and pointer invalidation, and the requirements and axioms of the various iterator concepts are all valid things to point out, especially in a thread about C++ footguns.
I am always down for C++ pedanticness! But in this case my issue was not with factual accuracy, but putting down the opinion of "iterators act like pointers" as coming from ignorance rather than simplification. It was mostly a complaint about tone, not content.
Hacker News is a place for curious conversation, rather than a place to sneer at people's ineptness for coming to reasonable (if incorrect) conclusions. You can be completely correct, but if you're going to be a petty jerk about it then you're not helping the conversation. I provided an example response that conveys exactly the same information, except you'll note mine doesn't start with "this is a poor understanding of [general concept]". If anything, "link provided below of convenience" and "please review the following link" is far more sealioning than anything I wrote, because it hides the message of "you're an idiot who needs to read up on this topic" behind faux politeness. It's the very thing that I found disconcerting about that comment, not its factual content.
Ah, interesting. Even within that dichotomy, I'd think that if references aren't invalidated on mutation, then dereferencing iterators would be valid, but incrementing them would not be.
I think if you come from the JVM / CLR world, you are so protected by the runtime’s patching up of references that it might not even occur to you that a (raw) pointer to a data structure’s internals can dangle after the data is moved around. The runtimes mentioned pause your code, move things around and even compact the heap and your references magically still point to what they did before!
Except, it's not magic. It costs processing cycles, and latency, and invalidations of cache rows you were using, and memory bus traffic. Lots of all of them.
Most of the costs are not counted as runtime for your process, so benchmarks invariably appear to show GC as costing less overhead than it does. Among the costs, as with all the myriad varieties of caching done in modern systems, is that GC makes it hard to know the costs of design choices you make. Many of the costs are in making the caches less effective.
Well it's not so much being protected by patching up references, but more that C++ is (as far as I know) the only language that doesn't complain when you create a reference to an object that can move away.
Most languages either don't have the concept of classes or ensure that invalidating references is an explicit operation (such as calling the destructor).
> Yeah I learned that the hard way when I found out that std::vector apparently feels free to move your objects around in memory for you, no matter if you point to them from somewhere else.
It's not so much "feeling free" as having to reallocate the array because you added elements beyond its capacity and they had to be stored somewhere.
This is why things like std::list still have value: iterator stability. You even get this in most map/sets too.
But vector is usually what one should reach for unless one has a good reason not to. The only danger really comes when the lifetime of iterators and the thing they point to become a bit decoupled, even due to insertion!
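For what it's worth, the std::list guarantee is easy to demonstrate. A minimal sketch (the function name and the numbers are arbitrary):

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Returns true if an iterator into a std::list is still usable after
// many insertions at both ends and one insertion right in front of it.
bool list_iterator_survives_insertions() {
    std::list<int> xs{1, 2, 3};
    auto it = std::next(xs.begin());  // points at the element 2

    for (int i = 0; i < 1000; ++i) {
        xs.push_back(i);
        xs.push_front(i);
    }
    xs.insert(it, 42);                // insert immediately before 2

    assert(xs.size() == 2004);
    // The iterator still points at 2, and its new predecessor is 42.
    return *it == 2 && *std::prev(it) == 42;
}
```

The same sequence of operations on a std::vector would leave `it` invalid as soon as the first reallocation happened.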
> If the placeholder-type-specifier is of the form type-constraint auto, the deduced type T' replacing T is determined using the rules for template argument deduction.
You need to continue reading that paragraph to the very end and note the exception for when an initializer list is used. The following are not equivalent nor do they apply the same rules:
template<typename T>
void f(T) {}
f({1, 2, 3});        // error: T cannot be deduced from a braced-init-list
auto x = {1, 2, 3};  // OK: x is a std::initializer_list<int>
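A compilable version of the same point, as a sketch. The wrapper function and the name `g` are mine; the call that doesn't compile is left commented out:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <type_traits>

template <typename T>
void g(T) {}  // same shape as f above

// Returns the number of elements in the deduced initializer_list.
std::size_t braced_auto_size() {
    auto x = {1, 2, 3};  // the special case: std::initializer_list<int>
    static_assert(std::is_same_v<decltype(x), std::initializer_list<int>>);

    // g({1, 2, 3});     // does not compile: T cannot be deduced
    g(x);                // fine: T is std::initializer_list<int>

    assert(*x.begin() == 1);
    return x.size();
}
```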
>Now if you program in C++ regularly and don't know that in...
I program in C++ daily and no, I didn't know that a copy is made. What I do know, as an actual professional C++ programmer, is that whether a copy is performed is incredibly complex and depends on numerous factors such as whether a move constructor exists and "something" is an lvalue but not an rvalue reference, or if "something" is an rvalue (which is not the same as an rvalue reference) and T defines a move constructor.
And finally if something is the same type as T and T has a copy constructor, then the final question is whether copy elision will be performed, which the standard specifies is a valid optimization even if said optimization would change the observable behavior of the program.
That's what I... as a professional C++ programmer know but I fully admit that C++ is such a complex beast of a language that I am almost definitely missing a few corner cases.
> You need to continue reading that paragraph to the very end and note the exception for when an initializer list is used.
Exceptions don't mean that the general rule does not apply? I don't know how it is possible to be more explicit than "the deduced type T' replacing T is determined using the rules for template argument deduction.". That'd be like saying that "priority to the right" when driving isn't a rule because there is sometimes a "stop".
> I program in C++ daily and no, I didn't know that a copy is made.
you're kidding
> whether a copy is performed is incredibly complex and depends on numerous factors such as whether a move constructor exists and "something" is an lvalue but not an rvalue reference, or if "something" is an rvalue (which is not the same as an rvalue reference) and T defines a move constructor.
But `something` cannot be an rvalue in f(something);
maybe in f(something()); or f(some + thing); but just a variable named "something", as is, passed to a template argument or auto will necessarily lead to the creation of a new value of the same type. Whether it is copied, moved, materializes three different types because people forgot to mark their constructors explicit or whatever else frankly does not matter much in my experience (after all people survived for 30 years without move semantics) - what matters is whether a new value is created (which I personally and improperly call a "copy" no matter what - I should have been more precise but it's really the most useful distinction to make in practice imho as it means that some function will be called) or just a reference to existing value which is always a no-op no matter what.
> And finally if something is the same type as T and T has a copy constructor, then the final question is whether copy elision will be performed,
I've been at it for 15 years now and I don't remember one time in non-toy-example code where copy elision taking place or not did matter. Thinking about it is a self-inflicted problem; expect that a copy happens, if it's slow, profile and if it's not, be happy.
Yep, except now we're supposed to call them forwarding references because the motivation Scott Meyers had for naming them universal references turned out to be incorrect in subtle ways.
Even the notable experts get things wrong when it comes to how complex and bloated C++ is.
Well, the scene graph code wasn't code I wrote, and this was back in 2014. But sure, obviously it was an error on my part... we are talking about footguns after all, I'm the one that pulled the trigger.
If you develop on Windows and are using Visual Studio 2019, "auto fooRef = myWidget.getRef()" will get a little squiggle under it to warn you that you are making a copy.
I am very happy tooling is helping to make these sorts of mixups easier to find, although sadly C++ is such a hard language to write tooling for that those kinds of nice features are pretty rare. I remember back when I had to work in Xcode a few years ago I could rarely even get basic intellisense to work.
It's not that these features are rare for C++, it's that Xcode is generally a pretty bad C++ IDE (imo, of course).
In both Visual Studio and CLion, these types of features work well, in addition to plenty of more advanced features like address/thread sanitizers, fuzz testing, etc.
You're too nice, Xcode is an atrocity. Still though, trying to unambiguously parse C++ without compiling miles of headers is not a simple thing... I find my TypeScript and C# tooling is so much nicer when C++ is the language I need that magic most for. VS is nice but compared to what I get with, say, ReSharper or IntelliJ it's a bit disappointing.
However, they were too resource hungry for the hardware of those days, then Java and .NET took over the spotlight, and until clang/LLVM came onto the scene, the industry forgot what was already possible.
Nowadays, Rule of Zero. Let the compiler generate all the special members implicitly, when the members can each do their own cleanup. This turns out to be quite often.
Probably! I don't remember the code exactly, this was back in 2014 and I think most of the code I had to consume was primarily written by a 24 year old out of college (smart of course, but C++ takes a long time to get good at).
If it returns a temporary, you'll get a compile-time error, as the temporary cannot be bound to a non-const ref. I would prefer "const auto&" as you shouldn't be modifying temporaries, or "auto x{std::move(myWidget.getRef())}" if I want a mutable value and to avoid copying the return value of getRef.
auto by default removes top-level const and reference qualification, and decays arrays and function types to pointers, same as for template argument deduction [1]; this is also true when auto is used for return type deduction.
You can use decltype(auto) to preserve the actual type of the rhs.
[1] as discussed elsewhere initializer lists behave differently, but initializer_list itself was a mistake to start with.
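A small sketch of the three spellings side by side. `get_ref` and `global_value` are invented for the example:

```cpp
#include <cassert>
#include <type_traits>

int global_value = 7;

// Hypothetical accessor returning a const reference.
const int& get_ref() { return global_value; }

// Returns true if each variable refers to (or copies) what we expect.
bool deduction_rules_hold() {
    auto a = get_ref();            // int: const and reference are dropped
    auto& b = get_ref();           // const int&: const kept because it is not top-level
    decltype(auto) c = get_ref();  // const int&: the exact return type

    static_assert(std::is_same_v<decltype(a), int>);
    static_assert(std::is_same_v<decltype(b), const int&>);
    static_assert(std::is_same_v<decltype(c), const int&>);

    a = 8;  // modifies only the copy
    assert(global_value == 7);
    return &b == &global_value && &c == &global_value;
}
```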
I too have fallen victim to this and similar traps. It's a good idea to use a deleted copy constructor when you know it's not safe to copy the object. You can't completely stop others (or yourself) from doing bad things, but you can at least give them a moment to reflect on what they are doing.
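Roughly what that looks like, as a sketch with made-up types (`NoCopyGraph` and `Owner` are not from any real codebase):

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type that is unsafe to copy, so copying is deleted.
struct NoCopyGraph {
    NoCopyGraph() = default;
    NoCopyGraph(const NoCopyGraph&) = delete;
    NoCopyGraph& operator=(const NoCopyGraph&) = delete;
};

// Hypothetical owner handing out a reference.
struct Owner {
    NoCopyGraph graph;
    NoCopyGraph& getRef() { return graph; }
};

static_assert(!std::is_copy_constructible_v<NoCopyGraph>);

// Returns true if the reference obtained really aliases the member.
bool reference_still_works() {
    Owner o;
    // auto copy = o.getRef();  // now a compile error instead of a silent copy
    auto& ref = o.getRef();     // the intended form still compiles
    assert(&ref == &o.graph);
    return &ref == &o.graph;
}
```

The accidental `auto copy = ...` spelling then fails at compile time, which is the "moment to reflect".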
The problem is that structures are implicitly (and trivially) copyable in C, and requiring an explicit copy constructor would break interoperability. It would also break trivial copyability, which is a different issue.
So, yes, backward compatibility and historical baggage.
Note that if you use explicit wrappers for all object-owning pointers you would get either the right behavior or a compilation error.
It's not about safety, it's about efficiency. It's usually an unintended copy. For example, if your function `foo` returns a `const std::string&` the code `auto x = foo()` creates a copy.
well I don't really see a way of solving that. I may or may not want to copy the object, depending on the object's lifetime and what I plan to do with it. warning on copies would be a step too far imo. I would immediately suppress that warning.
My point was simply that auto to assign pointers doesn't cause them to lose their pointiness, but using auto to assign a reference causes them to lose their referenceness. Yes I understand why it works the way it does. I'm not saying it's wrong just that it's been an occasional footgun.
I tend to think of pointer-ness as a general aspect of the type, whereas reference-ness is specific to the variable it's on and doesn't transfer. Sort of like how auto doesn't copy the static specifier.
Also the rules for auto are the same as for template argument deduction.
#include <vector>

template<typename T>
std::vector<T> MakeVec(T val) {  // T is deduced by value, so references are dropped
    std::vector<T> vec;
    vec.push_back(val);
    return vec;
}
int main()
{
    int x = 1;
    int& y = x;
    std::vector<int> vec = MakeVec(y);  // T = int, not int&
}
Notice how in this case MakeVec(y) makes a std::vector<int> not a std::vector<int&>. If you want your template argument function to take a reference, use T& not T (or use T&& which is a whole new complication). Similarly use auto& not auto.
Ok, that's a fair point. It's easy to make the distinction when you think about how in C++, things become references or lose their reference'iness pretty easily, but nothing ever implicitly becomes a pointer.
My favorite C++ footgun is creating a Vector with some items, taking a reference to, say, &vec[3], then adding another item to the vec, then trying to use the reference from the previous step.
If you write C++ it might be obvious what the problem is.
If you don't, this will absolutely ruin your entire day.
The worst part is, 95% of the time, it will probably work without issue.
But eventually, pushing a new item to the Vector will trigger a relocation of the whole vector, which will invalidate your reference and bring down production. Have fun debugging that.
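Here is a sketch that shows the relocation without actually touching the dangling reference (which would be undefined behavior). The old address is stored as an integer so nothing invalid is ever dereferenced; the function name is mine:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true if growing the vector past its capacity moved element 3
// to a different address.
bool push_back_moved_storage() {
    std::vector<int> v{0, 1, 2, 3};

    // Remember where v[3] lives, as a plain integer.
    const auto before = reinterpret_cast<std::uintptr_t>(&v[3]);

    // Use up any spare capacity; these pushes cannot reallocate.
    while (v.size() < v.capacity()) v.push_back(0);
    assert(reinterpret_cast<std::uintptr_t>(&v[3]) == before);

    // This push exceeds the capacity, so the buffer is reallocated.
    v.push_back(0);
    const auto after = reinterpret_cast<std::uintptr_t>(&v[3]);

    return before != after;
}
```

Any `int&` taken to `v[3]` before the last push would now point into freed memory, which is why the bug only shows up "eventually".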
Yep. I learnt about this when I was learning Rust (which makes this a compile-time error). I was very glad I didn't have to learn this, and the 100 other things like it that seem to exist in C++, the hard way!
Footguns might keep you prepared for the zombie apocalypse, but they'll inevitably make it difficult to walk. I'm glad I don't currently use C++ in production, but that may change soon. ):
I kid you not, I spent a full working day debugging this exact same issue (taking a pointer to a vector element, before adding more elements).
Very obvious if you understand how C++ and vectors work, yet it took me forever to realize, and it was miserable…
Heh, you can run into this even if you understand how iterator invalidation works. Once you see the bug it's easy to understand, but finding it might not be…
Rust takes the approach of flat-out not letting you mutate any collection while you have any references to its contents. It eliminates all dangling pointer bugs... I don't know if it rules out any useful use cases of collections with iterator stability. I think any C++ collection holding unique_ptr is stable (pushing to the collection doesn't invalidate the target of the unique_ptr), and Rust doesn't have a safe, ergonomic way to achieve that (perhaps Pin<Box<MagicCell<T>>>, but we don't yet have a MagicCell that makes &mut MagicCell<T> not noalias).
> Rust takes the approach of flat-out not letting you mutate any collection while you have any references to its contents. It eliminates all dangling pointer bugs... I don't know if it rules out any useful use cases of collections with iterator stability.
It does. Some collections have a "collection.retain" method.
collection.retain(|v| true_if_we_want_to_keep(v))
which goes through the collection in linear time and deletes any items for which the test is false. This is O(N).
Some collections ("multi_map", which isn't used much, comes to mind) don't have that. You have to make a read pass over the collection, construct a list of things to delete, then go through the delete list and delete those items. Can't read and change with the same iterator. That's slower. But sound.
The deque holds the string contents, and the set holds views of those contents. This is safe because pushing to either end of a deque never invalidates references to its existing elements, and it's pleasant because the `vals` set can just be a regular set.
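A sketch of that deque-plus-set arrangement, written from the description above and not copied from anyone's code. The class name `Interner` and the member `storage_` are my own; only `vals` comes from the comment:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <string>
#include <string_view>

// Hypothetical string interner: the deque owns the characters,
// the set indexes them through non-owning views.
class Interner {
public:
    // Returns a view of the stored copy of `s`, adding it if needed.
    std::string_view intern(std::string_view s) {
        if (auto it = vals.find(s); it != vals.end()) return *it;
        storage_.emplace_back(s);  // existing strings stay where they are
        auto inserted = vals.insert(std::string_view(storage_.back()));
        assert(inserted.second);   // it was not in the set before
        return *inserted.first;
    }

    std::size_t size() const { return vals.size(); }

private:
    std::deque<std::string> storage_;
    std::set<std::string_view> vals;
};
```

Swapping the deque for a std::vector<std::string> would break this, since a reallocation would move the strings and leave the views in `vals` dangling.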
As you say Rust does not allow this; its answer is IndexSet, and it is implemented differently, like:
RawTable is a hash table that doesn't know how to compute a hash or determine equality. Instead, you provide it with closures to do those tasks, but at the call site (not construction time).
Basically Rust achieves the same thing through a layer of indirection involving indexes, which is a common pattern.
The underlying pointer to a unique_ptr won't be invalidated but the iterator might. Consider if you had a vector of unique_ptrs and inserted into it within a for loop. Depending on the implementation of the iterator this may not be sound (if it's an index, you're probably ok, if it's a pointer, you're screwed).
If you wanted to do the same in Rust it would be Vec<Box<T>>. Mutating the collection won't invalidate the pointers.
> Mutating the collection won't invalidate the pointers.
Rust still won't let you mutate a Vec<Box<T>> while you hold a &T or &mut T borrowed from the Vec.
Perhaps it would be sound to do so unsafely; Rust would let you mutate a Vec<Rc<T>> while you hold a Rc<T> cloned from the Vec. But I'm not clear on whether Stacked Borrows allows moving the Box while you have a &mut T pointing to the same memory as the Box points to. It definitely does not allow dereferencing the Box. (I heard Stacked Borrows will be updated to make self-referential types sound, and I don't know if it will affect this situation.)
In the case of a Vec<Box<T>> the Vec owns the box, so the only thing you could link the reference lifetime to is the Vec itself because if the lifetime isn't bound to the Vec, mutating the Vec might drop that item which in the case of Box<T> will also deallocate the memory. If any references point directly to that memory, then it would be dangling.
In the case of Rc<T>, cloning the Rc creates a second owner of the heap allocation, so the Vec dropping its copy won't deallocate. Though I believe one complication to this would be that the lifetime of any references created through the Vec's copy of the Rc will be linked specifically to that copy, so would result in a borrow check error when it's dropped.
If the Vec is storing references, you can borrow the thing behind the reference and still mutate the Vec because that reference stays valid even if mutating the Vec drops its reference to the item.
let (a, b, c, d) = (1, 2, 3, 4);
let mut v = vec![&a, &b, &c, &d];
// If you change the type here to &&u32, or let the compiler infer the type
// then you'll get a borrow check error because the outer reference
// borrows the Vec.
let borrowed_b: &u32 = &v[1];
v.remove(1);
println!("{}", borrowed_b);
println!("{:?}", v);
If you use a proper C++ compiler, the debug builds throw a runtime exception or abort on invalidation misuse, and it can be enabled for release build as well.
Interestingly, AFAIK also Golang suffers from something similar: creating a slice from an array, then performing an operation on the array, that causes resizing - the slice will keep pointing to the old array data.
You can run into the same problem in C, using malloc/realloc.
realloc, in fact, makes for a nasty footgun, too (and remains, of course, available in C++).
An important difference is that in C it is usually obvious that a reallocation is happening while in the C++ stdlib, memory management is usually hidden (important to note that this isn't a design fault of the C++ language, but of the C++ stdlib, unfortunately the two are more and more entangled in newer C++ versions). Because of the fact that memory management is hidden in C++, I find claims that C++ is more memory-safe than C quite hilarious. If it works, it works fine, but if anything goes wrong (which is quite easy to achieve) it's much harder to find the actual problem in C++ than in C.
Yes, but unlike with realloc and a custom dynamic array, C++ references and smart pointers and containers are half-smart and do all sorts of things on their own. Knowing when they will and when they will not be "smart" makes writing C++ really difficult sometimes.
The concept behind this is reference stability, and if you need a collection that has stable references, you must introduce a level of indirection, that is, instead of a vector<T>, you use a vector<unique_ptr<T>> and then you can take references as follows:
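The snippet that followed isn't reproduced here. A minimal sketch of the idea, with a made-up `Node` type and function name, might be:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical element type.
struct Node {
    int value;
};

// Returns true if a reference to the first element survives the vector
// growing (and reallocating) many times.
bool reference_survives_growth() {
    std::vector<std::unique_ptr<Node>> nodes;
    nodes.push_back(std::make_unique<Node>(Node{1}));

    Node& first = *nodes[0];  // refers to the heap object, not the vector slot

    for (int i = 2; i <= 1000; ++i) {
        nodes.push_back(std::make_unique<Node>(Node{i}));
    }

    // The unique_ptr slots were moved around, but the Node was not.
    assert(nodes.size() == 1000);
    return &first == nodes[0].get() && first.value == 1;
}
```

Note that a reference to the slot itself (`nodes[0]`, the unique_ptr) would still be invalidated; only the pointee is stable.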
std::deque is as good as useless. The defaults for "batch size" in different compilers are at extreme opposites of the tradeoff spectrum. So unless you really don't care about performance, memory or portability, it's not a data structure you can rely on.
From memory:
In MSVC, deque will allocate memory for every single element if your elements are > 8 bytes. This will never be changed, due to ABI compatibility.
Clang and gcc have batching sizes of 1K and 4K (i.e. you throw out a whole page of memory even if your deque contains only 1 element).
I had a vague feeling that std::deque is a "heavy" thing which you shouldn't have a million of, but iterating through a couple big ones is pretty fast. 1–4K batches wouldn't hurt my feelings. Looked up GCC, it's actually slightly less heavy at 512 bytes per node[1]. But the MSVC part — that caught me completely off guard.
In general it's better to use the same standard library everywhere - discrepancies like that occur for almost every type so if you care about having the same performance on every platform... Either use libc++ or boost
I would generally suggest avoiding reference stability here (extra heap allocations) and going with the offset-based approach mentioned in the other responses.
I would generally suggest going for correctness over performance and the solution I provided is correct in the general case. Using an offset is only correct in the special case where objects will not be inserted or removed at an index less than the offset, otherwise you will end up with bugs as the offset becomes invalid upon such operations.
Furthermore, depending on the size of T, the performance penalty of the extra heap allocations is amortized over the cost of resizing the vector. That is, vector reallocation is significantly faster for a unique_ptr<T> than it is for T when T is large. Almost all memory allocators are also tuned to place objects close together in space when they are allocated close together in time, so you don't lose cache locality or need to worry about memory fragmentation.
In addition to other answers, sometimes you do know the final/max length of the vector when you construct it. In that case reserve() can reserve the necessary space, and as long as you stay under the limit all the addresses will remain valid.
(Though it's still pretty brittle, so you may want to add a ton of comments to warn yourself in the future...)
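A small sketch of the reserve() approach, with a made-up function name and assuming `max_len` is at least 1. As long as the size never exceeds the reserved capacity, the standard guarantees that no reallocation takes place, so addresses of existing elements stay valid:

```cpp
#include <cstddef>
#include <vector>

// Returns true if the address of element 0 is unchanged after filling the
// vector up to (but not beyond) the reserved capacity.
// Assumes max_len >= 1.
bool address_stable_within_reserve(std::size_t max_len) {
    std::vector<int> v;
    v.reserve(max_len);          // one allocation up front
    v.push_back(0);
    const int* first = &v[0];

    for (std::size_t i = 1; i < max_len; ++i) {
        v.push_back(static_cast<int>(i));   // size never exceeds capacity
    }
    return first == &v[0] && v.size() == max_len;
}
```

One push_back past `max_len` and all bets are off again, which is the brittle part.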
I'm not going to pretend that I understand everything on that site (particularly anything about complex templates and things like SFINAE) but often there's comprehensible stuff in there. It can be really helpful.
Why does it seem like C++ is constantly replacing its subtle hazards with even more subtle hazards? It's like they never go away, they just turn into something more obscure.
I think the simple reason is that developer ergonomics are the last priority. Zero cost abstractions and expressive features win the day even if you need to hold 19 obscure things in your head to use it right. (I like C++ btw, but it's not a friendly language). I guess those values kind of work though.. I mean, we keep using it.
Because the people involved with C++ have a toxic aversion to learning from other languages. Anytime issues are brought up the people involved with C++ standardization keep yapping the line that "But C++ is not like <other language>." and then proceed to implement a half-assed version of what other languages have and then 2-3 years later people find all kinds of footguns that could have simply been avoided had proper research been performed.
The big issue is that becoming involved in the C++ standardization process is purposely arcane, requiring people to physically travel to remote locations, miss time off work, and spend a lot of money. There are literally substantial language features added to C++ that are nothing more than the work of maybe 3-4 people. These features could have benefited enormously from having 100s, if not 1000s of potential developers exercising use cases and contributing suggestions, but the standardization process is heavily gated.
There is a large culture among C++ users of distributing pre-compiled binaries of dependencies. See every Linux distribution, or the MSVC ecosystem where sharing precompiled .dll's is the norm.
In this context ABI breaking changes are not viable due to the proliferation of pre-existing compiled binaries. What the standard says is only half the story...
> There is a large culture among C++ users of distributing pre-compiled binaries of dependencies. See every Linux distribution, or the MSVC ecosystem where sharing precompiled .dll's is the norm.
That is why I can code on the go in C++ on my travel netbook, while I mostly use Rust when plugged, or have to do a full build before leaving.
The real problem is not specific to C++, memory or shared pointers, but as the author mentions later, the fact that "function parameters evaluation order is unspecified".
The problem is similar in C as well.
`printf("%d, %d", i++, i++);` will give you different results depending on the compiler.
How is it "the same"? If the evaluations of the individual arguments can't overlap, then the C example's problem still exists while the C++ example's problem doesn't. And if the order of evaluation were guaranteed but the evaluations could overlap, the C example wouldn't have a problem but the C++ example still would. To me the two problems' causes seem quite different.
One of my favourites too. Here's a good StackOverflow answer on the topic. [0] Looking here [1] though, did something change in C++17 to ameliorate things? edit Turns out yes, see dvt's comment. Second edit Apparently [2] I knew this 3 years ago and forgot.
    struct Dummy {
        Dummy(int) {}
    };

    struct World {
        World(const Dummy &) {}
        void hello() {}
    };

    int main() {
        int value = 42;
        World world(Dummy(value)); // fails to compile
        // World world(Dummy(42)); // works
        world.hello();
        return 0;
    }
In 99.9% of cases, it's just a surprising compiler error message that you learn to avoid. However, there's a related version of this problem which can lead to accidentally declaring a function prototype rather than acquiring a lock. You'll get compiler warnings from -Wshadow, but the seriousness of the problem might not be obvious.
That said, it's a very rare problem these days, due to the introduction of uniform initialization in C++11.
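For what it's worth, here is a sketch of the same example written with braces. `greet_with_braces` is a made-up name, and `hello()` returns an int here (unlike the original) only so that the call is observable:

```cpp
struct Dummy {
    Dummy(int) {}
};

struct World {
    World(const Dummy&) {}
    int hello() { return 1; }   // returns a value so the call can be checked
};

int greet_with_braces() {
    int value = 42;
    // Braces cannot be parsed as a function declaration, so `world` is an
    // object of type World rather than a function taking a Dummy.
    World world{Dummy{value}};
    return world.hello();
}
```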
Jumping on the footgun bandwagon, one I had always known about, but never realized how bad a design decision it was, is "the c++ rvalue lifetime disaster" [0]. The talk linked in [0] has a different conclusion, but describes the problem well.
Basically, the lifetime aspects of r/l-values are improperly coupled to the "can I scavenge the internals?" aspect (via const). R-values should not have been promoted to const-ref (I think it was a legacy behavior?) because then this const-ref can be passed around (e.g. returned) as if it were going to outlive the original reference's frame!
This was the first time I realized what a mess C++ can be (although I still use it a lot). This is a subtle error that I have seen seasoned experts miss in code review (especially in templated code). You can't work around it like you can the underperforming parts of the stdlib (Abseil and Folly help out a lot!). This promotion choice of auto&& -> auto const& is really baked into the language and I don't see a path to change it.
The other "biggest foot cannon" has gotta be the incredibly subtle ways one can violate the ABI, e.g. executable A linking against libs B and C, each of which bundle an ABI-incompatible implementation of the same parent dep D (versions 1.1 and 2.1). You never know which type of object you're really passing around with D::SomeType!
Binding temporaries to const was a very very very old decision and it is relied on pervasively, so it would have been very painful to break. For more existential horrors, VC6 as an extension used to allow binding temporaries to non-const references!
Ha. When looking at the example I did not realize what is really wrong with it for quite a while. But for what it's worth, I would never write code like this ( new XXX ) in the parameters section of the function. It just inherently feels wrong to me. I am also paranoid to the degree that I always put brackets in expressions even when they are not really needed, and do all kinds of other paranoid stuff.
Slightly offtopic, but the word "meaningful", present in the first sentence of the article, has seen a massive growth in use in last months. Suddenly I see it everywhere. Politics, economics, personal anecdotes, programming.
Not just clothes can be fashionable; vocabulary, too.
It's a powerful word. Do those last months coincide with you using HN more? I ask because that happened to me when I started using this site (~8 years ago). Other words and terms that I noticed more were "useful", "confirmation bias", and "correlation isn't causation"; all linked back to me using HN more.
That’s not a valid evaluation order, afaik.
You cannot start evaluating one argument, stop and switch to the other, then come back to the earlier one.
Prior to C++17 such an evaluation order would have violated the standard. ISO/IEC 14882:2011 1.9/13 [Note: Indeterminately sequenced evaluations cannot overlap, but either could be executed first. — end note]. The arguments in a function call are a comma-separated indeterminate sequence (5.2.2/4 [Note: Such initializations are indeterminately sequenced with respect to each other (1.9) — end note]).
> When a function is called, each parameter ([dcl.fct]) shall be initialized ([dcl.init], [class.copy], [class.ctor]) with its corresponding argument. [ Note: Such initializations are indeterminately sequenced with respect to each other ([intro.execution]) — end note ]
So it's only about the initialisation of parameters from arguments, not about the evaluation of arguments. (For example, if foo() takes a std::string parameter, and bar() returns a const char*, then in the expression foo(bar()) the above quote refers to the call to the string constructor, not to the call to bar().)
The more relevant part is 5.2.2/8:
> [ Note: The evaluations of the postfix expression and of the argument expressions are all unsequenced relative to one another. All side effects of argument expression evaluations are sequenced before the function is entered (see [intro.execution]). — end note ]
It may be valid, but I don't see any value in doing it, and it would take extra code in the compiler (and possibly the resulting binary). So did any compiler actually do that?
That one blew my mind when I found out. You can rip off a chunk of an object by accident... and nothing happens whatsoever. There's no warning from the compiler.
You can stuff a Derived into a std::list<Base>, it'll work just fine, and if Derived happened to say, hold a pointer, it'll happily vanish into the ether.
Besides the amount of confusion that can ensue from this, it means you can have problems if you want to introduce inheritance into a place that didn't have it before.
Especially if it's some sort of external component. You think you're clever inheriting from a framework class and adding some data on top? Nope, the internal structure you don't control doesn't want to cooperate with that plan.
> You can rip off a chunk of an object by accident... and nothing happens whatsoever. There's no warning from the compiler.
Slicing is a bad name for this phenomenon, because what you've described is what the name implies, but that is not what is happening. The original object is completely unchanged. Nothing was ripped off of it. Constructors create new objects.
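Both points can be seen in a small sketch (the types and the function name are made up for illustration). Storing a Derived in a container of Base copy-constructs a new Base from the Base part, and the original Derived is left untouched:

```cpp
#include <string>
#include <utility>
#include <vector>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    int extra = 7;   // this member has no place in a stored Base
    std::string name() const override { return "Derived"; }
};

// first: what the stored copy reports; second: what the original reports.
std::pair<std::string, std::string> sliced_and_original_names() {
    Derived d;
    std::vector<Base> items;
    items.push_back(d);   // compiles fine: copies only the Base subobject
    return {items[0].name(), d.name()};
}
```

The stored element reports "Base" and has no `extra`, while `d` still reports "Derived".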
"OpenMW is a free, open source, and modern engine which re-implements and extends the 2002 Gamebryo engine for the open-world role-playing game The Elder Scrolls III: Morrowind."
Maybe, just maybe, that shouldn't be a FAQ but right on the main page.
Pretty sure that raw new has been a smell since C++11, 10 years ago, since the introduction of smart pointers that aren't terrible. Maybe longer than a few years since last time you looked at it!
Yes, one time I looked and people were saying new was bad and to use auto_ptr, and then the next time I looked everyone was saying that had never happened.
I'm a big fan of Scott Meyers' Effective C++. It's a little dated now, but it was the most valuable book I'd ever read on the language.
Louis Brandy's Curiously Recurring Bugs at Facebook presentation from CppCon17 has somewhat of the same feeling, though it only covers a few bugs: https://youtu.be/lkgszkPnV8g
Speaking about shared_ptr: the reference count is implemented as an atomic counter, because a shared_ptr instance may be used from multiple threads, so it has to be as general as possible. That atomic counter was not needed in my case, as the objects were only accessed from a single thread; however, it had some surprisingly adverse effects on performance, in my projects at least...
I think they should add a template argument to determine whether an atomic counter is used, exactly for the cases where the shared_ptr object is used exclusively from a single thread; but they didn't do that in the standard.
In gcc, shared_ptr is derived from __shared_ptr, where you can pass a lock policy trait that goes without locking. Wow. But I am afraid that is implementation specific.
A capturing lambda is another way to leak a std::shared_ptr<> - i feel like it's an even easier way to run into this kind of footgun but i suppose it depends whether you typically use lots of lambdas.
A reasonable solution in my case could be to use std::weak_ptr<> instead but that wouldn't be useful here.
This question was genuine curiosity so I'm a bit disappointed there was never a reply.
On reflection, I suspect the parent comment was referring to creating circular dependencies using lambdas and std::function.
For example, a class Inner wants to allow notifying its owner when events happen, so it has a std::function member variable that it uses for the callback function, and a class Outer has an instance of Inner as a member variable. Outer then passes as a callback:
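The original snippet isn't shown here, so this is only a guess at the shape of the problem, with made-up names and members. If the callback captures a shared_ptr to the Outer by value, then Outer owns Inner, Inner owns the lambda, and the lambda owns Outer, so the count never reaches zero. Capturing a weak_ptr instead avoids the cycle:

```cpp
#include <functional>
#include <memory>

struct Inner {
    std::function<void()> on_event;   // callback used to notify the owner
};

struct Outer {
    Inner inner;
    int events = 0;
};

// Returns true if the Outer is destroyed once its last external owner is gone.
bool outer_destroyed(bool capture_weak) {
    auto outer = std::make_shared<Outer>();
    std::weak_ptr<Outer> observer = outer;

    if (capture_weak) {
        std::weak_ptr<Outer> weak = outer;
        outer->inner.on_event = [weak] {
            if (auto self = weak.lock()) ++self->events;
        };
    } else {
        // Strong capture: the lambda stored inside Outer keeps Outer alive.
        outer->inner.on_event = [outer] { ++outer->events; };
    }

    outer.reset();                        // drop the only external owner
    bool destroyed = observer.expired();

    // Break the cycle by hand so this demo itself does not leak.
    if (auto alive = observer.lock()) alive->inner.on_event = nullptr;
    return destroyed;
}
```

With the strong capture, `outer_destroyed(false)` reports that the object is still alive after the reset; with the weak capture it is gone.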
My guess is that the parent poster meant something more like: it's hard to tell whether A=B will deep copy, shallow copy or move the right hand side and which operator=() will actually be called.
First you need to figure out the types of A and B and their "modifiers" (like const or references) to start figuring out which operator=() could potentially be called. That might still be very hard to do if for example there are implicit conversions from B to some other type involved. Such conversions can depend on code which might be nowhere close to where the operator=() is defined.
Well, what happens when you call func(A, B) if both are pointers in C?
Or if we take an Algol derived language, like Pascal or Modula-2, func might have been defined as,
type
my_complex_type = ....
procedure func(var a, b:my_complex_type);
begin
end;
Raising the same kind of questions.
Yes, there are issues with the C++ approach as you describe; however, sometimes it seems C++ hate blinds people to the fact that similar issues occur in other languages to some extent.
Due to function overloading, figuring out which function is called is not exclusively a problem of operators. An IDE can help here (and so does consistency: similarly named functions/operators should behave similarly).
Slicing requires you to treat a reference type as a value type, nobody who knows C++ does this, so slicing is pretty much restricted to people coming in from Java land or similar.
I've not encountered a single instance of slicing in 10+ years of C++.
You clearly only use either GCC or clang C, without much knowledge of WG14 documents.
Also, some of the C++ horrors are consequences of staying copy-paste compatible with said documents and tooling across all platforms where a C compiler exists.
Could you please make some basic effort to engage in a discussion with something other than outright hostility? To actually consider what people say to you?
You are an absolutely exhausting person to talk to. You stand out in every single conversation on this site that you take part in as being really pointlessly hostile and unpleasant.
This is really adding nothing whatsoever to the conversation other than making people dislike you.
If you do, here is a tip for next time: be more constructive than "horrific mess", and understand that a great deal of that mess is a consequence of bundling C as part of the package.
This way to call a function smells wrong anyway. A shared pointer to a new value is nonsense as a parameter, because at the point of the call the pointer is not shared, it is unique! Since a shared pointer is implicitly constructible from a unique pointer, this makes more sense, assuming there are other call sites that really intend to share ownership:
If you know that the pointer is going to be shared, it's more efficient to use `make_shared` as it will use a single allocation for both the object and the control block.
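A rough sketch of both suggestions, using stand-in names (`Widget`, `priority`, `process_widget`) that are assumptions for illustration only. In either form the allocation happens inside a single function call, so there is no raw `new` sitting unowned while another argument is evaluated:

```cpp
#include <memory>

// Stand-in types and functions, for illustration only.
struct Widget {
    int id = 1;
};

int priority() { return 3; }

// Takes shared ownership of the widget.
int process_widget(std::shared_ptr<Widget> w, int prio) {
    return w->id + prio;
}

int call_with_make_unique() {
    // The unique_ptr<Widget> temporary converts implicitly to shared_ptr<Widget>.
    return process_widget(std::make_unique<Widget>(), priority());
}

int call_with_make_shared() {
    // One allocation holds both the Widget and the control block.
    return process_widget(std::make_shared<Widget>(), priority());
}
```

Which one reads better probably depends on whether other call sites really share ownership.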
There is a mental optimization to be had by using only a subset of the language. If you can get by using fewer constructs (and work on code with like-minded authors) you can better learn to use them correctly and save on thought as to which is the most optimal approach in every case.
Great, so stick to the subset of providing a function that asks for a shared_ptr with a shared_ptr instead of introducing unique_ptr as an additional construct. This way you get the benefits of the optimization and don't need to add the overhead of thinking about how shared_ptr will take a unique_ptr and perform a conversion on it.
[1] https://stackoverflow.com/questions/38501587/what-are-the-ev...