Most vexing parse
Posted by Michał ‘mina86’ Nazarewicz on 25th of April 2021
Here’s a puzzle: What does the following C++ code output:
    #include <cstdio>
    #include <string>

    struct Foo {
        Foo(unsigned n = 1) {
            std::printf("Hell%s,", std::string(n, 'o').c_str());
        }
        ~Foo() {
            std::printf("%s", " world");
        }
    };

    static constexpr double pi = 3.141592653589793238;

    int main(void) {
        Foo foo();
        Foo bar(unsigned(pi));
    }
If your reaction is ‘What do you mean? It’s obvious!’ then you’ve answered incorrectly. But if you’ve exclaimed ‘This is implementation-defined because there’s no terminating new-line character in the text stream!’ then congratulations on knowing this fact; that’s still the wrong answer though. The correct reaction is a sigh followed by a lament at C++ syntax.
The Foo foo(); line is a function declaration and has no effect on the behaviour of the code. Perhaps more confusingly, Foo bar(unsigned(pi)); is also just a function declaration (this time of a function with a single unsigned argument named pi) and doesn’t affect the output of the program. In the end, the program outputs nothing and returns with a zero exit status.
This is an instance of the infamous most vexing parse which is a syntax ambiguity where a piece of code could be interpreted as a function declaration or creation of an object. In those situations C++ mandates the syntax to be parsed as a function declaration.
Removing the ambiguity
There are a few ways to force the compiler to treat the lines as object definitions. In the case of foo, the simplest (and arguably best) is to remove the parentheses altogether. In the case of bar, an explicit cast, that is using static_cast<unsigned>(pi), would resolve the issue.
C++11 offers another option in the form of list initialisation (also known as brace or uniform initialisation). Simply replace the parentheses around the supposed constructor arguments with braces and the code will behave as expected:
    static constexpr double pi = 3.141592653589793238;

    int main(void) {
        Foo foo{};
        Foo bar{unsigned(pi)};
    }
But before you go on a rampage and start replacing all parentheses with braces…
Here’s another puzzle
What does the following C++ code output:
    #include <iostream>
    #include <vector>

    int main(void) {
        typedef std::vector<int> Vector;

        Vector a(6);
        Vector b{6};
        Vector c({6});
        Vector d(6, 9);
        Vector e{6, 9};
        Vector f({6, 9});

        std::cout << a.size() << ' ' << a[0] << ", "
                  << b.size() << ' ' << b[0] << ", "
                  << c.size() << ' ' << c[0] << ", "
                  << d.size() << ' ' << d[0] << ", "
                  << e.size() << ' ' << e[0] << ", "
                  << f.size() << ' ' << f[0] << '\n';
    }
In a constructor call the parentheses can be replaced by braces, so in the code above a is the same as b and d is the same as e, right? Not so fast.
- The definition of variable a is equivalent to Vector a(6, 0);. Nothing surprising here. This constructs a six-element vector with all elements initialised to zero.
- However, the definition of variable b is different. Because std::vector has a constructor which takes std::initializer_list as an argument, that constructor is chosen when braces are used. As such, the b variable is in fact equivalent to c. They are both one-element vectors holding the value six.
- The definition of variable d is analogous to a and results in a six-element vector with all elements set to nine. Again, no surprises there.
- Finally, the definitions of variables e and f are equivalent and, just like with b and c, call the constructor with an initialiser list, resulting in a two-element vector holding values six and nine.
The output of the program is therefore 6 0, 1 6, 1 6, 6 9, 2 6, 2 6.
Conclusion
The most vexing parse is not a new issue (in fact it’s now two decades since Scott Meyers coined the term in his Effective STL book) and I certainly won’t make any new revelations about it. It is unfortunate that in an attempt to solve it the committee decided it was a good idea to allow braces themselves to be a shorthand for calling a constructor with an initialiser list. This decision means that adding a constructor which takes std::initializer_list is now an API-breaking change. How? Let’s consider the following code:
    #include <algorithm>         // std::copy (used by the commented-out constructor)
    #include <cstddef>           // size_t
    #include <initializer_list>
    #include <iostream>
    #include <memory>

    struct DynArr {
        DynArr(size_t size)
            : ptr(std::make_unique<int[]>(size)), sz(size) {}

        /*
        DynArr(std::initializer_list<int> lst)
            : DynArr(lst.size()) {
            std::copy(lst.begin(), lst.end(), ptr.get());
        }
        */

        size_t size() const { return sz; }
        int *data() { return ptr.get(); }
        int &operator[](size_t offset) { return ptr[offset]; }
        int &last() { return ptr[sz - 1]; }

    private:
        const std::unique_ptr<int[]> ptr;
        const size_t sz;
    };

    int main(void) {
        DynArr arr{5};
        arr[0] = arr[1] = 1;
        for (size_t i = 2; i < arr.size(); ++i) {
            arr[i] = arr[i - 2] + arr[i - 1];
        }
        std::cout << arr.last() << '\n';
    }
With only the DynArr(size_t) constructor present, the DynArr arr{5} definition allocates a five-element array and the code calculates the fifth number in the Fibonacci sequence. But if a later revision of the class introduces the DynArr(std::initializer_list<int>) constructor, the same line will invoke this new constructor and create a one-element array holding the value five. The subsequent write to arr[1] then overflows the buffer.
Unfortunately (as is often the case when talking about C++) there are no definite answers to the most vexing parse or the potential confusion with list initialisation. It remains one of those things a C++ programmer has to be aware of.
Epilogue: A bonus puzzle
I’ll leave you, Dear Reader, with this puzzle. What does the following C++ code output:
    #include <cstdio>
    #include <math.h>
    #include <string>

    struct Foo {
        Foo(unsigned n = 1) {
            std::printf("Hell%s, world\n", std::string(n, 'o').c_str());
        }
    };

    int main(void) {
        Foo bar(unsigned(M_PI));
    }
C and C++ are so much fun!