Toward correct-by-default, efficient-by-default, and pitfall-free-by-default variable declarations, using “AAA style”… where “triple-A” is both a mnemonic and an evaluation of its value.
Problem
JG Questions
1. What does this code do? What would be a good name for some_function?
template<class Container, class Value>
void some_function( Container& c, const Value& v ) {
    if( find(begin(c), end(c), v) == end(c) )
        c.emplace_back(v);
    assert( !c.empty() );
}
2. What does “write code against interfaces, not implementations” mean, and why is it generally beneficial?
Guru Questions
3. What are some popular concerns about using auto to declare variables? Are they valid? Discuss.
4. When declaring a new local variable x, what advantages are there to declaring it using auto and one of the two following syntaxes:
(a) auto x = init; when you don’t need to commit to a specific type? (Note: The expression init might include calling a helper that performs partial type adjustment, such as as_signed, while still not committing to a specific type.)
(b) auto x = type{ init }; when you do want to commit to a specific type by naming a type?
List as many as you can. (Hint: Look back to GotW #93.)
5. Explain how using the style suggested in #4 is consistent with, or actively leverages, the following other C++ features:
(a) Heap allocation syntax.
(b) Literal suffixes, including user-defined literal operators.
(c) Named lambda syntax.
(d) Function declarations.
(e) Template alias declarations.
6. Are there any cases where it is not possible to use the style in #4 to declare all local variables?
Solution
1. What does this code do? What would be a good name for some_function?
template<class Container, class Value>
void append_unique( Container& c, const Value& v ) {
    if( find(begin(c), end(c), v) == end(c) )
        c.emplace_back(v);
    assert( !c.empty() );
}
Let’s call this function append_unique. First, it checks to see whether the value v is already in the container. If not, it appends it at the end. Finally, it asserts that c is not empty, since by now it must contain one copy of the value v.
You probably thought this question was fairly easy.
Maybe too easy.
If so, good. That’s the point of the example. Hold the thought, and we’ll come back to this in Question 3.
2. What does “write code against interfaces, not implementations” mean, and why is it generally beneficial?
It means we should care principally about “what,” not “how.” This separation of concerns applies at all levels in high-quality modern software—hiding code, hiding data, and hiding type. Each increases encapsulation and reduces coupling, which are essential for large-scale and robust software.
Please indulge a little repetition in the following paragraphs. It’s there to make a point about similarity.
Hiding code. With the invention of separately compiled functions and structured programming, we gained “encapsulation to hide code.” The caller knows the signature only—the function’s internal code is not his concern and not accessible programmatically, even if the function is inline and the body happens to be visible in source code. We try hard not to inadvertently leak implementation details, such as internal data structure types. The point is that the caller does not, and should not, commit to knowledge of the current internal code; if he did, it would create interdependencies and make separately compiled libraries impossible.
Hiding data (and code). With object oriented styles (OO), we gained two new manifestations of this separation. First, we got “more encapsulation to hide both code and data.” The caller knows the class name, bases, and member function signatures only—the class’s internal data and internal code are hidden and not accessible programmatically, even though the private class members are lexically visible in the class definition and inline function bodies may also be visible. (In turn, dynamic libraries and the potential future-C++ modules work aim to accomplish the same thing at a still larger scale.) Again we try hard not to inadvertently leak implementation details, and again the point is that the caller does not, and should not, commit to knowledge of the current internal data or code, which would make the class difficult to ever change or to ship on its own as a library.
Hiding type (run-time polymorphism). Second, OO also gave us “separation of interfaces to hide type.” A base class or interface can delegate work to a concrete derived implementation via virtual functions. Now the interface the caller sees and the implementation are actually different types, and the caller knows the base type only—he doesn’t know or care about the concrete type, including even its size. The point, once again, is that the caller does not, and should not, commit to a single concrete type, which would make the caller’s code less general and less able to be reused with new types.
Hiding type (compile-time polymorphism). With templates, we gained a new compile-time form of this separation, and it’s still “separation of interfaces to hide type.” The caller knows an ad-hoc “duck typed” set of operations he wants to perform using a type, and any type that supports those operations will do just fine. The contemplated future C++ concepts feature will allow making this stricter and less ad-hoc, but still avoids committing to a concrete type at all. The whole point is still that the caller does not, and should not, commit to a single concrete type, which would make the caller’s code less generic and less able to be reused with new types.
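To make the last two kinds of type hiding concrete, here is a minimal sketch. All names in it (shape, square, rectangle, area_of, total_area, hiding_type_demo) are hypothetical and chosen only for illustration. The first half hides the concrete type behind a base class; the second half hides it behind a template parameter, with no common base required.

```cpp
#include <cassert>
#include <vector>

// Run-time polymorphism: callers see only the base type.
struct shape {
    virtual ~shape() = default;
    virtual double area() const = 0;
};

struct square : shape {
    explicit square(double side) : side_{side} {}
    double area() const override { return side_ * side_; }
private:
    double side_;
};

// The caller commits to the interface `shape`, never to `square`.
inline double area_of(const shape& s) { return s.area(); }

// Compile-time polymorphism: any element type with a suitable area()
// will do, even one that is deliberately unrelated to `shape`.
struct rectangle {
    double w, h;
    double area() const { return w * h; }
};

template<class Shapes>
double total_area(const Shapes& shapes) {
    auto sum = 0.0;
    for (const auto& s : shapes) sum += s.area();
    return sum;
}

inline void hiding_type_demo() {
    assert(area_of(square{2.0}) == 4.0);

    auto rects = std::vector<rectangle>{ {1.0, 2.0}, {3.0, 4.0} };
    assert(total_area(rects) == 14.0);
}
```

In both halves the calling code names an interface, explicit or implied, rather than a concrete implementation type.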
3. What are some popular concerns about using auto to declare variables? Are they valid? Discuss.
In many languages, not just C++, there are several reasons people commonly give for why they are reluctant to use auto to declare variables (or the equivalent in another language, such as var or let). We could summarize them as: laziness, commitment, and readability. Let’s take them in order.
Laziness and commitment
First, laziness: One common concern is that “writing auto to declare a variable is primarily about saving typing.” However, this is just a misunderstanding of auto. As we saw in GotW #92 and #93 and will see again below, the main reasons to declare variables using auto are for correctness, performance, maintainability, and robustness—and, yes, convenience, but that’s in last place on the list.
Guideline: Remember that preferring auto variables is motivated primarily by correctness, performance, maintainability, and robustness—and only lastly about typing convenience.
Second, commitment: “But in some cases I do want to commit to a specific type, not automatically deduce it, so I can’t use auto.” It’s true that sometimes you do want to commit to a specific type, but you can still use auto. As demonstrated in GotW #92 and #93, not only can you still write declarations of the form auto x = type{ init }; (instead of type x{init};) to commit to a specific type, but there are good reasons for doing so, such as that saying auto means you can’t possibly forget to initialize the variable.
Guideline: Consider declaring local variables auto x = type{ expr }; when you do want to explicitly commit to a type. It is self-documenting to show that the code is explicitly requesting a conversion, it guarantees the variable will be initialized, and it won’t allow an accidental implicit narrowing conversion. Only when you do want explicit narrowing, use ( ) instead of { }.
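As a small sketch of what this guideline buys, the following shows the narrowing check in action. The function name typed_auto_demo and the variable names are made up for this example. The commented-out line is the one a conforming compiler should reject.

```cpp
#include <cassert>

inline void typed_auto_demo() {
    const double ratio = 3.7;

    // auto a = int{ ratio };   // error: narrowing conversion from double to int
    auto b = int( ratio );      // ok: ( ) explicitly requests the narrowing
    auto c = long{ 42 };        // ok: the constant 42 fits in long
    auto d = double{ 2 };       // ok: the constant 2 is exactly representable

    assert(b == 3);             // truncated toward zero
    assert(c == 42L);
    assert(d == 2.0);
}
```

Every declaration here is necessarily initialized, and the only lossy conversion is the one spelled with ( ), so it stands out in review.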
(Un)readability?
The third and most common argument concerns readability: “My code gets unreadable quickly when I don’t know what exact type my variable is without hunting around to see what that function or expression returns, so I can’t just use auto all the time.” There is truth to this, including losing the ability to search for occurrences of specific types when using the non-typed syntax auto x = expr; in 4(a) below, so this appears at first to be a strong argument. And it’s true that any feature can be overused. However, I think this argument is actually weaker than it first seems for four reasons, two minor and two major.
The two minor counterarguments are:
- The “can’t use auto” part isn’t actually true, because as we just saw above you can be explicit about your type and still use auto, with good benefit.
- The argument doesn’t apply when you’re using an IDE, because you can always tell the exact type, for example by hovering over the variable. Granted, this mitigation goes away when you leave the IDE, such as if you print the code.
But we should focus on the two major counterarguments:
- It reflects a bias to code against implementations, not interfaces. Overcommitting to explicit types makes code less generic and more interdependent, and therefore more brittle and limited. It runs counter to the excellent reasons to “write code against interfaces, not implementations” we saw in Question 2.
- We (meaning you) already ignore actual types all the time…
“… Wait, what? I do not ignore types all the time,” someone might say. Actually, not only do you do it, but you’re so comfortable and cavalier about it that you may not even realize you’re doing it. Let’s go back to that code in Question 1:
template<class Container, class Value>
void append_unique( Container& c, const Value& v ) {
    if( find(begin(c), end(c), v) == end(c) )
        c.emplace_back(v);
    assert( !c.empty() );
}
Quick quiz: How many specific types are mentioned in that function? Name as many as you can.
Take a moment to consider that before reading on…
… We can see pretty quickly that the answer is a nice round number: Zero. Zilch. (Pedantic mode: Yes, there’s void, but I’m going to declare that void doesn’t count because it’s to denote “no type,” it’s not a meaningful type.)
Not a single specific type appears anywhere in this code, and the lack of exact types makes it much more powerful and doesn’t significantly harm its readability. Like most people, you probably thought Question 1 felt “easy” when we did it in isolation. Granted, this is generic code, and not all your code will be templates—but the point is that the code isn’t unreadable even though it doesn’t mention specific types, and in fact auto gives you the ability to write generic code even when not writing a template.
So starting with the cases illustrated in this short example, let’s consider some places where we routinely ignore exact types. First, function template parameters:
- What exact type is Container? We have no idea, and that’s great… anything we can call begin, end, emplace_back and empty on and otherwise use as needed by this code will do just fine. In fact, we’re glad we don’t know anything about the exact type, because it means we’re following the Open/Closed Principle and staying open for extension— this append_unique will work fine with a type that won’t be written until years from now. Interestingly, the concepts feature currently being proposed for ISO C++ to express template parameter constraints doesn’t change how this works at all, it only makes it more convenient to express and check the requirements. Note how much more powerful this is compared to OO style frameworks: In OO frameworks where containers have to inherit from a base class or interface, that’s already inducing coupling and limiting the ability to just plug in and use arbitrary suitable types. It is important that we can know nothing at all about the type here besides its necessary interface, not even restricting it by as much as limiting it to types in a particular inheritance hierarchy. We should strongly resist compromising this wonderful and powerful “strictly typed but loosely coupled” genericity.
- What exact type is Value? Again, we don’t know, and we don’t want to know… anything we can pass to find and emplace_back is just dandy. At this point some of you may be thinking: “Oh yes we know what type it is, it’s the container’s value type!” No, it doesn’t have to be that, it just has to be convertible, and that’s important. For example, we want vector<string> vec; append_unique(vec, "xyzzy"); to work, and "xyzzy" is a const char[6], not a string.
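Here is a self-contained sketch of those two points. It repeats append_unique with the std:: qualifications and headers a standalone file needs; the demo function and the std::list case are additions for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <vector>

template<class Container, class Value>
void append_unique( Container& c, const Value& v ) {
    using std::begin;
    using std::end;
    if( std::find(begin(c), end(c), v) == end(c) )
        c.emplace_back(v);
    assert( !c.empty() );
}

inline void append_unique_demo() {
    auto vec = std::vector<std::string>{};
    append_unique(vec, "xyzzy");    // Value is const char[6], not std::string
    append_unique(vec, "xyzzy");    // already present, so nothing is appended
    assert(vec.size() == 1);

    auto lst = std::list<double>{ 1.5 };
    append_unique(lst, 2);          // Value is int; converted when emplaced
    assert(lst.size() == 2);
}
```

Neither call names the container's value type, and the same function body serves both containers.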
Second, function return values:
- What type does find return? Some iterator type, the same as begin(c) coughed up, but we don’t know specifically what type it is just from reading this code, and it doesn’t matter. We can look up the signature if we’re feeling really curious, but nobody bothers doing that because anything that’s comparable to end(c) will do.
- What type does empty return? We don’t even think twice about it. Something testable like a bool… we don’t care much what exactly as long as we can “not” it.
Third, many function parameters:
- What specific type does emplace_back take? Don’t know; might be the same as v, might not. Really don’t care. Can we pass v to it? Yes? Groovy.
And that’s just in this example. We routinely and desirably ignore types in many other places, such as:
- Fourth, any temporary object: We never get to name the object, much less name its type, and we may know what the type is but we don’t care about actually spelling out either name in our code.
- Fifth, any use of a base class: We don’t know the dynamic concrete type we’re actually using, and that’s a benefit, not a bug.
- Sixth, any call to a virtual function: Ditto. On top of that, the virtual function’s return type could itself be covariant, which adds another layer of “we don’t know the dynamic concrete type,” since in the presence of covariance we don’t know what type we’re actually getting back.
- Seventh, any use of function<>, bind, or other type erasure: Just think about how little we actually know, and how happy it makes us. For example, given a function<int(string)>, not only don’t we know what specific function or object it’s bound to, we don’t even know that thing’s signature—it might not actually even take a string or return an int, because conversions are allowed in both directions, so it only has to take something a string can be converted to, and return something that can be converted to an int. All we know is that it’s something that we can invoke with a string and that gives us back something we can use as an int. Ignorance is bliss.
- Eighth, any use of a C++14 generic lambda function: A generic lambda just means the function call operator is a template, after all, and like any function template it gets stamped out individually for whatever actual argument types you pass each time you use it.
There are probably more.
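The function<> case is easy to try out. In this sketch, label and label_length are hypothetical names invented for the example. The target takes a label rather than a string and returns a short rather than an int, yet it fits a function<int(string)> because conversions are allowed in both directions.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical type, used only to show a conversion on the way in.
struct label {
    label(const std::string& s) : text(s) {}   // implicit: string -> label
    std::string text;
};

// Takes a label (not a string) and returns a short (not an int).
inline short label_length(label l) {
    return static_cast<short>(l.text.size());
}

inline void type_erasure_demo() {
    // The target's signature differs from int(std::string) in both directions.
    auto f = std::function<int(std::string)>{ label_length };
    assert(f("hello") == 5);

    // A lambda with yet another signature (returns long) fits the same slot.
    f = [](const std::string& s) { return s.empty() ? 0L : 1L; };
    assert(f("x") == 1);
}
```

The code that calls f knows only that it can pass a string and use the result as an int.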
Although lack of commitment may be a bad thing in other areas of life, not committing to a specific type is often desirable by default in reusable code.
4. When declaring a new local variable x, what advantages are there to declaring it using auto and one of the two following syntaxes:
Let’s consider the base case first, which has by far the strongest arguments in its favor and is gaining quite a bit of traction in the C++ community.
(a) auto x = init; when you don’t need to commit to a specific type?
GotW #93 offered many concrete examples to support habitually declaring local variables using auto x = expr; when you don’t need to explicitly commit to a type. The advantages include:
- It guarantees the variable will be initialized. Uninitialized variables are impossible because once you start by saying auto the = is required and cannot be forgotten.
- It is efficient by default and guarantees that no implicit conversions (including narrowing conversions), temporary objects, or wrapper indirections will occur. In particular, prefer using auto instead of function<> to name lambdas unless you need the type erasure and indirection.
- It guarantees that you will use the correct exact type now.
- It guarantees that you will continue to use the correct exact type under maintenance as the code changes, and the variable’s type automatically tracks other functions’ and expressions’ types unless you explicitly said otherwise.
- It is the simplest way to portably spell the implementation-specific type of arithmetic operations on built-in types, which vary by platform, and ensure that you cannot accidentally get lossy narrowing conversions when storing the result.
- It is the only good option for hard-to-spell and impossible-to-spell types such as lambdas, binders, detail:: helpers, and template helpers (including expression templates when they should stay unevaluated for performance), short of resorting to repetitive decltype expressions or more-expensive indirections like function<>.
- It is more symmetric and consistent with other parts of modern C++ (see Question 5).
- And yes, it is just generally simpler and less typing.
See GotW #93 for concrete examples of these cases, where using auto helps eliminate correctness bugs, performance bugs, and silently nonportable code.
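As one illustration of the “no implicit conversions or temporary objects” point, consider a reference to a map element. This is a sketch with hypothetical names (word_counts and the three functions), not an example taken from GotW #93. The explicitly typed version looks right but names the wrong pair type, so it silently binds to a converted temporary.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

using word_counts = std::map<std::string, int>;   // hypothetical alias

// Explicit, plausible-looking, but not the exact element type: the map
// stores std::pair<const std::string, int>, so this converts the element
// into a temporary pair and binds the reference to that copy.
// Precondition: m is not empty.
inline bool explicit_type_refers_to_element(const word_counts& m) {
    const std::pair<std::string, int>& kv = *m.begin();
    return static_cast<const void*>(&kv)
        == static_cast<const void*>(&*m.begin());
}

// auto deduces the exact element type, so no conversion or copy occurs.
// Precondition: m is not empty.
inline bool auto_refers_to_element(const word_counts& m) {
    const auto& kv = *m.begin();
    return &kv == &*m.begin();
}

inline void exact_type_demo() {
    auto m = word_counts{ {"auto", 1} };
    assert(!explicit_type_refers_to_element(m));   // refers to a temporary copy
    assert(auto_refers_to_element(m));             // refers to the element itself
}
```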
As noted in the questions, the expression init might include calling a helper that performs partial type adjustment, such as as_signed, while still not committing to a specific type. As shown in GotW #93, prefer to use auto x = as_signed(integer_expr); or auto x = as_unsigned(integer_expr); to store the result of an integer computation that should be signed or unsigned—these should be viewed as “casts that preserve width,” so we are not casting to a specific type but rather casting an attribute of the type while correctly preserving the other basic characteristics of the type, notably by not forcing it to commit to a particular size.
Using auto together with as_signed or as_unsigned makes code more portable: the variable will both be large enough (thanks to auto) and preserve the required signedness on all platforms. Note that signed/unsigned conversions within integer_expr may still occur and so you may need additional finer-grained as_signed/as_unsigned casts within the expression for full portability.
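For readers who don't have GotW #93 at hand, here is one possible way to write such helpers, built on std::make_signed and std::make_unsigned. Treat the definitions as an assumption of this sketch rather than as the canonical ones; the demo function name is also made up.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// One possible definition (an assumption of this sketch): change only the
// signedness of an integer type, keeping its width.
template<class T>
typename std::make_signed<T>::type as_signed(T t) {
    return static_cast<typename std::make_signed<T>::type>(t);
}

template<class T>
typename std::make_unsigned<T>::type as_unsigned(T t) {
    return static_cast<typename std::make_unsigned<T>::type>(t);
}

inline void width_preserving_cast_demo() {
    auto v = std::vector<int>{ 1, 2, 3 };

    // size() is unsigned; the result is signed but exactly as wide.
    auto n = as_signed(v.size());
    static_assert(std::is_signed<decltype(n)>::value, "n should be signed");
    static_assert(sizeof(n) == sizeof(v.size()), "width should be preserved");
    assert(n - 5 == -2);    // no unsigned wraparound

    auto u = as_unsigned(-1);
    static_assert(std::is_unsigned<decltype(u)>::value, "u should be unsigned");
    static_assert(sizeof(u) == sizeof(int), "width should be preserved");
    assert(u > 0u);
}
```

Note that nothing in the calling code names a specific integer type, so the same lines stay correct on platforms where size_t has a different width.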
(b) auto x = type{ init }; when you do want to commit to a specific type by naming a type?
This is the explicitly typed form. It still has advantages, but they are not as clearly strong as those of the implicitly typed form. The jury is still out on whether to recommend this one wholesale, as we’re still trying it out, but it does offer some advantages and I suggest you try it out for a while and see if it works well for you.
So here’s the recommendation to consider trying out for yourself: Consider declaring local variables auto x = type{ expr }; when you do want to explicitly commit to a type. (Only when you do want to allow explicit narrowing, use ( ) instead of { }.) The advantages of this typed auto declaration style include:
- It guarantees the variable will be initialized; you can’t forget.
- It is self-documenting to show that the code is explicitly requesting a conversion.
- It won’t allow an accidental implicit narrowing conversion.
- It is more symmetric and consistent, both with the basic auto x = init; form and with other parts of C++…
… which brings us to Question 5.
5. Explain how using the style suggested in #4 is consistent with, or actively leverages, the following other C++ features:
Let’s start off this question with some side-by-side examples that give us a taste of the symmetry we gain when we habitually declare variables using modern auto style. Starting with two examples where we don’t need to commit to a type and then two where we do, we see that the right-hand style is not only more robust and maintainable for the reasons already given, but also arguably cleaner and more regular with the type consistently on the right when it is mentioned:
// Classic C++ declaration order     // Modern C++ style
const char* s = "Hello";             auto s = "Hello";
widget w = get_widget();             auto w = get_widget();
employee e{ empid };                 auto e = employee{ empid };
widget w{ 12, 34 };                  auto w = widget{ 12, 34 };
Now consider the (dare we say elegant) symmetry with each of the following.
(a) Heap allocation syntax.
When allocating heap variables, did you notice that the type name is already on the right naturally anyway? And since it’s there, we don’t want to have to repeat it. (I’ll show the raw “new” form for completeness, but prefer make_unique and make_shared in that order for allocation in modern code, resorting to raw new only well-encapsulated inside the implementation of low-level data structures.)
// Classic C++ declaration order     // Modern C++ style
widget* w = new widget{};            /* auto w = new widget{}; */
unique_ptr<widget> w                 auto w = make_unique<widget>();
    = make_unique<widget>();
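A compilable sketch of the right-hand column, using a hypothetical stand-in for widget, confirms that the deduced types are exactly the smart pointer types we would otherwise have spelled out twice:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct widget {               // hypothetical stand-in for the article's widget
    int id = 0;
    widget() = default;
    explicit widget(int i) : id{i} {}
};

inline void heap_allocation_demo() {
    auto u = std::make_unique<widget>(7);     // std::unique_ptr<widget> (C++14)
    auto s = std::make_shared<widget>(9);     // std::shared_ptr<widget>

    static_assert(std::is_same<decltype(u), std::unique_ptr<widget>>::value, "");
    static_assert(std::is_same<decltype(s), std::shared_ptr<widget>>::value, "");
    assert(u->id == 7 && s->id == 9);
}
```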
(b) Literal suffixes, including user-defined literal operators.
Using auto declaration style doesn’t merely work naturally with built-in literal suffixes like ul for unsigned long, plus user-defined literals including standard ones now in draft C++14, but it actively encourages using them:
// Classic C++ declaration order     // Modern C++ style
int x = 42;                          auto x = 42;
float x = 42.;                       auto x = 42.f;
unsigned long x = 42;                auto x = 42ul;
std::string x = "42";                auto x = "42"s;     // C++14
chrono::nanoseconds x{ 42 };         auto x = 42ns;      // C++14
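The right-hand column can be checked mechanically. In this sketch (the function and variable names are made up), each static_assert confirms that the literal alone pins down the intended type. It assumes a C++14 compiler for the s and ns suffixes.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <type_traits>

inline void literal_suffix_demo() {
    using namespace std::literals;    // enables the s and ns suffixes (C++14)

    auto i  = 42;       // int
    auto f  = 42.f;     // float
    auto ul = 42ul;     // unsigned long
    auto s  = "42"s;    // std::string
    auto ns = 42ns;     // std::chrono::nanoseconds

    static_assert(std::is_same<decltype(i),  int>::value, "");
    static_assert(std::is_same<decltype(f),  float>::value, "");
    static_assert(std::is_same<decltype(ul), unsigned long>::value, "");
    static_assert(std::is_same<decltype(s),  std::string>::value, "");
    static_assert(std::is_same<decltype(ns), std::chrono::nanoseconds>::value, "");

    assert(i == 42 && f == 42.0f && ul == 42ul);
    assert(s.size() == 2 && ns.count() == 42);
}
```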
Based on the examples so far, which do you think is more regular? But wait, there’s more…
(c) Named lambda syntax.
(d) Function declarations.
Lambdas have unutterable types, and auto is the best way to capture them exactly and efficiently. But because their declarations are now so similar, let’s consider lambdas and (other) functions together, and in the last two lines of this example also use C++14 return type deduction:
// Classic C++ declaration order     // Modern C++ style
int f( double );                     auto f (double) -> int;
…                                    auto f (double) { /*...*/ };
…                                    auto f = [=](double) { /*...*/ };
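Here are the three right-hand forms as compilable code. The names half, twice, and square are hypothetical, and the second form assumes C++14 return type deduction.

```cpp
#include <cassert>
#include <type_traits>

// Trailing return type: name first, type last.
inline auto half(double x) -> double { return x / 2; }

// C++14 return type deduction: the type follows from the return statement.
inline auto twice(double x) { return x * 2; }

inline void function_style_demo() {
    // Named lambda: auto captures its unutterable closure type exactly.
    auto square = [](double x) { return x * x; };

    static_assert(std::is_same<decltype(half(1.0)),   double>::value, "");
    static_assert(std::is_same<decltype(twice(1.0)),  double>::value, "");
    static_assert(std::is_same<decltype(square(1.0)), double>::value, "");

    assert(half(3.0) == 1.5);
    assert(twice(3.0) == 6.0);
    assert(square(3.0) == 9.0);
}
```

All three declarations read left to right in the same order: auto, name, then parameters and body.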
(e) Template alias declarations.
Modern C++ frees us from the tyranny of un-template-able typedef:
// Classic C++ workaround            // Modern C++ style
typedef set<string> dict;            using dict = set<string>;
template<class T> struct myvec {     template<class T>
    typedef vector<T,myalloc> type;  using myvec = vector<T,myalloc>;
};
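To see that the two spellings really name the same type, here is a compilable sketch. It assumes a stand-in myalloc that simply forwards to std::allocator, since the article does not define one, and it renames the classic struct to myvec_classic so both versions can coexist in one file.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <type_traits>
#include <vector>

// Stand-in for the article's myalloc (an assumption of this sketch).
template<class T>
using myalloc = std::allocator<T>;

// Classic workaround: a typedef nested in a class template.
template<class T>
struct myvec_classic {
    typedef std::vector<T, myalloc<T>> type;
};

// Modern style: an alias template, usable without ::type or typename.
template<class T>
using myvec = std::vector<T, myalloc<T>>;

using dict = std::set<std::string>;

inline void alias_demo() {
    static_assert(std::is_same<myvec_classic<int>::type, myvec<int>>::value,
                  "both spellings name the same type");

    auto v = myvec<int>{ 1, 2, 3 };
    auto d = dict{ "auto", "using" };
    assert(v.size() == 3);
    assert(d.count("auto") == 1);
}
```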
An observation
Have you noticed that the C++ world is moving to a left-to-right declaration style everywhere, of the form
category name = type and/or initializer ;
where “category” can be auto or using?
Take a moment to re-skim the two columns of examples above. Even ignoring correctness and performance advantages, do you find the right-hand column to be most consistent, and most readable?
6. Are there any cases where it is not possible to use the style in #4 to declare all local variables?
There is one case I know of where this style cannot be followed, and it applies to the type-specific auto x = type{ init }; form. In that form, type has to be moveable (even though the move operation will be routinely elided by compilers), so these won’t work:
auto lock = lock_guard<mutex>{ m }; // error, not moveable
auto ai = atomic<int>{}; // error, not moveable
(Aside: For at least some of these cases, an argument could be made that this is actually more of a defect in the type itself, in particular that perhaps atomic<int> should be moveable.)
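For such types the classic declaration order remains available, as in this sketch (the function names are hypothetical). One point beyond the original text: from C++17 on, guaranteed copy elision means the commented-out auto forms are accepted even though the types are not moveable, so this limitation applies to C++11 and C++14 compilers.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

inline int guarded_increment(std::mutex& m, int& counter) {
    // auto lock = std::lock_guard<std::mutex>{ m };   // rejected before C++17
    std::lock_guard<std::mutex> lock{ m };              // classic form always works
    return ++counter;
}

inline int atomic_demo() {
    // auto ai = std::atomic<int>{};                    // rejected before C++17
    std::atomic<int> ai{ 0 };
    ai.fetch_add(5);
    return ai.load();
}

inline void non_moveable_demo() {
    std::mutex m;           // std::mutex is itself not moveable
    auto counter = 0;
    assert(guarded_increment(m, counter) == 1);
    assert(guarded_increment(m, counter) == 2);
    assert(atomic_demo() == 5);
}
```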
Having said that, there are two other cases I know of that you might encounter that may at first look like they don’t work with this auto style, but actually do. Let’s consider those for completeness.
First, the basic form auto x = init; will exactly capture an initializer_list or a proxy type, such as an expression template. This is a feature, not a bug, because you have a convenient way to spell both “capture the list or proxy” and “resolve the computation” depending which you mean, and the default syntax goes to the more efficient one: If you want to efficiently capture the list or proxy, use the basic form which gives you performance by default, and if you mean to force the proxy to resolve the computation, specify the explicit type to ask for the conversion you want. For example:
auto i1 = { 1 }; // initializer_list<int>
auto i2 = 1; // int
auto a = matrix{...}, b = matrix{...}; // some type that does lazy eval
auto ab = a * b; // to capture the lazy-eval proxy
auto c = matrix{ a * b }; // to force computation
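The matrix type above is left abstract, so here is a runnable sketch of the same choice using a proxy type the standard library already has: the reference proxy of std::vector<bool>. This substitutes a different proxy for the article's expression template; the function and variable names are made up.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>
#include <vector>

inline void proxy_demo() {
    auto i1 = { 1 };    // std::initializer_list<int>
    auto i2 = 1;        // int
    static_assert(std::is_same<decltype(i1), std::initializer_list<int>>::value, "");
    static_assert(std::is_same<decltype(i2), int>::value, "");

    auto flags = std::vector<bool>{ true, false };
    auto proxy = flags[0];            // captures the proxy object
    auto value = bool{ flags[0] };    // names the type to force the conversion
    static_assert(std::is_same<decltype(proxy), std::vector<bool>::reference>::value, "");
    static_assert(std::is_same<decltype(value), bool>::value, "");

    flags[0] = false;                 // the proxy tracks the element; the bool does not
    assert(!static_cast<bool>(proxy));
    assert(value);
}
```

Both meanings have a direct spelling: the basic form keeps the proxy, and the typed form asks for the resolved value.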
Second, here is a rare case that you may discover now that we have auto: Due to the mechanics of the C++ grammar, you can’t legally write a multi-word type like long long or class widget in the place where type goes in the auto x = type{ init }; form. However, note that this affects only those two cases:
- The multi-word built-in types like long long, where you’re better off anyway writing a known-width type alias or using a literal.
- Elaborated type specifiers like class widget, where the “class” part is already redundant. The “class widget” syntax is allowed as a compatibility holdover from C which liked seeing struct widget everywhere unless you typedef'd the struct part away.
So just avoid the multi-word form and use the better alternative instead:
auto x = long long{ 42 }; // error
auto x = int64_t{ 42 }; // ok, better
auto x = 42LL; // ok, better
auto y = class X{1,2,3}; // error
auto y = X{1,2,3}; // ok
Summary
We already ignore explicit and exact types much of the time, including with temporary objects, virtual functions, templates, and more. This is a feature, not a bug, because it makes our code less tightly coupled, and more generic, flexible, reusable, and future-proof.
Declaring variables using auto, whether or not we want to commit to a type, offers advantages for correctness, performance, maintainability, and robustness, as well as typing convenience. Furthermore, it is an example of how the C++ world is moving to a left-to-right declaration style everywhere, of the form
category name = type and/or initializer ;
where “category” can be auto or using, and we can get not only correctness and performance but also consistency benefits by using the style to consistently declare local variables (including using literals and user-defined literals), function declarations, named lambdas, aliases, template aliases, and more.
Acknowledgments
Thanks in particular to Scott Meyers and Andrei Alexandrescu for their time and insights in reviewing and discussing drafts of this material. Both helped generate candidate names for this idiom; it was Alexandrescu who suggested the name “AAA (almost always auto)” which I merged with the best names I’d thought of to that point (“auto style” or “auto (+type) style”) to get “AAA Style (almost always auto).” Thanks also to the following for their feedback to improve this article: Adrian, avjewe, mttpd, ned, zadecn, noniussenior, Marcel Wid, J Guy Davidson, Mark Garcia, Jonathan Wakely, “x y.”