{"id":901,"date":"2005-05-11T00:15:02","date_gmt":"2005-05-11T00:15:02","guid":{"rendered":"https:\/\/www.joelonsoftware.com\/?p=901"},"modified":"2016-12-05T19:13:57","modified_gmt":"2016-12-05T19:13:57","slug":"making-wrong-code-look-wrong","status":"publish","type":"post","link":"https:\/\/www.joelonsoftware.com\/2005\/05\/11\/making-wrong-code-look-wrong\/","title":{"rendered":"Making Wrong Code Look Wrong"},"content":{"rendered":"<p>Way back in September 1983, I started my first real job, working at Oranim, a big bread factory in Israel that made something like 100,000 loaves of bread every night in six giant ovens the size of aircraft carriers.<\/p>\n<p>The first time I walked into the bakery I couldn\u2019t believe what a mess it was. The sides of the ovens were yellowing, machines were rusting, there was grease everywhere.<\/p>\n<p>\u201cIs it always this messy?\u201d I asked.<\/p>\n<p>\u201cWhat? What are you talking about?\u201d the manager said. \u201cWe just finished cleaning. This is the cleanest it\u2019s been in weeks.\u201d<\/p>\n<p>Oh boy.<\/p>\n<p>It took me a couple of months of cleaning the bakery every morning before I realized what they meant. In the bakery, clean meant no dough on the machines. Clean meant no fermenting dough in the trash. Clean meant no dough on the floors.<\/p>\n<p>Clean did not mean the paint on the ovens was nice and white. Painting the ovens was something you did every decade, not every day. Clean did not mean no grease. 
In fact there were a lot of machines that needed to be greased or oiled regularly and a thin layer of clean oil was usually a sign of a machine that had just been cleaned.<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" style=\"MARGIN-LEFT: 5px\" border=\"0\" alt=\"This is what a dough rounder looks like.\" align=\"right\" src=\"https:\/\/i0.wp.com\/www.joelonsoftware.com\/wp-content\/uploads\/2005\/05\/DoughRounder.png?w=730&#038;ssl=1\" \/>The whole concept of clean in the bakery was something you had to learn. To an outsider, it was impossible to walk in and judge whether the place was clean or not. An outsider would never think of looking at the inside surfaces of the dough rounder (a machine that rolls square blocks of dough into balls, shown in the picture at right) to see if they had been scraped clean. An outsider would obsess over the fact that the old oven had discolored panels, because those panels were <em>huge<\/em>. But a baker couldn\u2019t care less whether the paint on the outside of their oven was starting to turn a little yellow. The bread still tasted just as good.<\/p>\n<p>After two months in the bakery, you learned how to \u201csee\u201d clean.<\/p>\n<p>Code is the same way.<\/p>\n<p>When you start out as a beginning programmer or you try to read code in a new language it all looks equally inscrutable. 
Until you understand the programming language itself you can\u2019t even see obvious syntactic errors.<\/p>\n<p>During the first phase of learning, you start to recognize the things that we usually refer to as \u201ccoding style.\u201d So you start to notice code that doesn\u2019t conform to indentation standards and Oddly-Capitalized variables.<\/p>\n<p>It\u2019s at this point you typically say, \u201cBlistering Barnacles, we\u2019ve <em>got<\/em> to get some consistent coding conventions around here!\u201d and you spend the next day writing up coding conventions for your team and the next six days arguing about the One True Brace Style and the next three weeks rewriting old code to conform to the One True Brace Style until a manager catches you and screams at you for wasting time on something that can never make money, and you decide that it\u2019s not really a bad thing to only reformat code when you revisit it, so you have about half of a True Brace Style and pretty soon you forget all about that and then you can start obsessing about something else irrelevant to making money like replacing one kind of string class with another kind of string class.<\/p>\n<p>As you get more proficient at writing code in a particular environment, you start to learn to see other things. 
Things that may be perfectly legal and perfectly OK according to the coding convention, but which make you worry.<\/p>\n<p>For example, in C:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>char* dest, src;<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>This is legal code; it may conform to your coding convention, and it may even be what was intended, but when you\u2019ve had enough experience writing C code, you\u2019ll notice that this declares <tt><strong>dest<\/strong><\/tt> as a <tt><strong>char<\/strong><\/tt> <i>pointer<\/i> while declaring <tt><strong>src<\/strong><\/tt> as merely a <tt><strong>char<\/strong><\/tt>, and even if this <em>might<\/em> be what you wanted, it probably isn\u2019t. That code smells a little bit dirty. <\/p>\n<p>Even more subtle:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>if (i != 0)<br \/>&nbsp;&nbsp;&nbsp;&nbsp;foo(i);<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>In this case the code is 100% correct; it conforms to most coding conventions and there\u2019s nothing wrong with it, but the fact that the single-statement body of the <tt>if<\/tt> statement is not enclosed in braces may be bugging you, because you might be thinking in the back of your head, gosh, somebody might insert another line of code there<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>if (i != 0)<br \/> <span style=\"BACKGROUND-COLOR: yellow\">&nbsp;&nbsp;&nbsp;&nbsp;bar(i);<\/span><br \/> &nbsp;&nbsp;&nbsp;&nbsp;foo(i);<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>\u2026 and forget to add the braces, and thus accidentally make <tt><strong>foo(i)<\/strong><\/tt> unconditional! So when you see blocks of code that aren\u2019t in braces, you might sense just a tiny, wee, soup\u00e7on of uncleanliness which makes you uneasy.<\/p>\n<p>OK, so far I\u2019ve mentioned three levels of achievement as a programmer:<\/p>\n<p>1. You don\u2019t know clean from unclean.<\/p>\n<p>2. 
You have a superficial idea of cleanliness, mostly at the level of conformance to coding conventions.<\/p>\n<p>3. You start to smell subtle hints of uncleanliness beneath the surface and they bug you enough to reach out and fix the code.<\/p>\n<p>There\u2019s an even higher level, though, which is what I really want to talk about:<\/p>\n<p>4. You deliberately architect your code in such a way that your nose for uncleanliness makes your code more likely to be correct.<\/p>\n<p>This is the real art: making robust code by literally <em>inventing conventions<\/em> that make errors stand out on the screen.<\/p>\n<p>So now I\u2019ll walk you through a little example, and then I\u2019ll show you a general rule you can use for inventing these code-robustness conventions, and in the end it will lead to a defense of a certain type of Hungarian Notation, probably not the type that makes people carsick, though, and a criticism of exceptions in certain circumstances, though probably not the kind of circumstances you find yourself in most of the time. <\/p>\n<p>But if you\u2019re so convinced that Hungarian Notation is a Bad Thing and that exceptions are the best invention since the chocolate milkshake and you don\u2019t even want to hear any other opinions, well, head on over to Rory\u2019s and read the <a href=\"http:\/\/neopoleon.com\/blog\/posts\/13932.aspx\">excellent comix<\/a> instead; you probably won\u2019t be missing much here anyway; in fact in a minute I\u2019m going to have actual code samples which are likely to put you to sleep even before they get a chance to make you angry. Yep. 
I think the plan will be to lull you almost completely to sleep and then to sneak the Hungarian=good, Exceptions=bad thing on you when you\u2019re sleepy and not really putting up much of a fight.<\/p>\n<p><font size=\"5\"><strong>An Example<\/strong><\/font><\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" style=\"MARGIN-LEFT: 5px\" border=\"0\" alt=\"Somewhere in Umbria\" align=\"right\" src=\"https:\/\/i0.wp.com\/www.joelonsoftware.com\/wp-content\/uploads\/2005\/05\/Umbria.jpg?w=730&#038;ssl=1\" \/><\/p>\n<p>Right. On with the example. Let\u2019s pretend that you\u2019re building some kind of a web-based application, since those seem to be all the rage with the kids these days.<\/p>\n<p>Now, there\u2019s a security vulnerability called the Cross Site Scripting Vulnerability, a.k.a. <a href=\"http:\/\/www.cert.org\/advisories\/CA-2000-02.html\">XSS<\/a>. I won\u2019t go into the details here: all you have to know is that when you build a web application you have to be careful never to repeat back any strings that the user types into forms.<\/p>\n<p>So for example if you have a web page that says \u201cWhat is your name?\u201d with an edit box and then submitting that page takes you to another page that says, Hello, Elmer! (assuming the user\u2019s name is Elmer), well, that\u2019s a security vulnerability, because the user could type in all kinds of weird HTML and JavaScript instead of \u201cElmer\u201d and their weird JavaScript could do narsty things, and now those narsty things appear to come from you, so for example they can read cookies that you put there and forward them on to Dr. Evil\u2019s evil site.<\/p>\n<p>Let\u2019s put it in pseudocode. Imagine that<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>s = Request(\"name\")<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>reads input (a POST argument) from the HTML form. 
If you ever write this code:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><strong><tt>Write \"Hello, \" &amp; Request(\"name\")<\/tt><\/strong><\/p>\n<\/blockquote>\n<p>your site is already vulnerable to XSS attacks. That\u2019s all it takes.<\/p>\n<p>Instead you have to encode it before you copy it back into the HTML. Encoding it means replacing <tt>\"<\/tt> with <tt>&amp;quot;<\/tt>, replacing <tt>&gt;<\/tt> with <tt>&amp;gt;<\/tt>, and so forth. So<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>Write \"Hello, \" &amp; Encode(Request(\"name\"))<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>is perfectly safe.<\/p>\n<p>All strings that originate from the user are <em>unsafe<\/em>. An unsafe string must never be output without being encoded first.<\/p>\n<p>Let\u2019s try to come up with a coding convention that will ensure that if you ever make this mistake, the code will just <em>look<\/em> wrong. If wrong code, at least, <em>looks<\/em> wrong, then it has a fighting chance of getting caught by someone working on that code or reviewing that code.<\/p>\n<p><strong><font size=\"4\">Possible Solution #1<\/font><\/strong><\/p>\n<p>One solution is to encode all strings right away, the minute they come in from the user:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>s = Encode(Request(\"name\"))<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>So our convention says this: if you ever see <tt><strong>Request<\/strong><\/tt> that is not surrounded by <tt><strong>Encode<\/strong><\/tt>, the code must be wrong.<\/p>\n<p>You start to train your eyes to look for naked <tt><strong>Request<\/strong><\/tt>s, because they violate the convention.<\/p>\n<p>That works, in the sense that if you follow this convention you\u2019ll never have an XSS bug, but that\u2019s not necessarily the best architecture. 
For example maybe you want to store these user strings in a database somewhere, and it doesn\u2019t make sense to have them stored HTML-encoded in the database, because they might have to go somewhere that is not an HTML page, like to a credit card processing application that will get confused if they are HTML-encoded. Most web applications are developed under the principle that all strings internally are <em>not<\/em> encoded until the <em>very last moment <\/em>before they are sent to an HTML page, and that\u2019s probably the right architecture.<\/p>\n<p>We really need to be able to keep things around in unsafe format for a while.<\/p>\n<p>OK. I\u2019ll try again. <\/p>\n<p><strong><font size=\"4\">Possible Solution #2<\/font><\/strong><\/p>\n<p>What if we made a coding convention that said that when you <em>write out <\/em>any string you have to encode it?<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>s = Request(\"name\")<br \/><\/strong><\/tt><tt><br \/><strong>\/\/ much later:<br \/><\/strong><\/tt><strong><tt>Write Encode(s)<\/tt><\/strong><\/p>\n<\/blockquote>\n<p>Now whenever you see a naked <tt><strong>Write<\/strong><\/tt> without the <tt><strong>Encode<\/strong><\/tt> you know something is amiss.<\/p>\n<p>Well, that doesn\u2019t quite work\u2026 sometimes you have little bits of HTML around in your code and you <em>can\u2019t<\/em> encode them:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><strong><tt>If mode = \"linebreak\" Then prefix = \"&lt;br&gt;\"<\/tt><\/strong><\/p>\n<p><strong><tt>\/\/ much later:<br \/><\/tt><tt>Write prefix<\/tt><\/strong><\/p>\n<\/blockquote>\n<p>This looks wrong according to our convention, which requires us to encode strings on the way out:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><strong><tt>Write Encode(prefix)<\/tt><\/strong><\/p>\n<\/blockquote>\n<p>But now the <tt>\"<strong>&lt;br&gt;<\/strong>\"<\/tt>, which is supposed to start a new line, gets 
encoded to <tt><strong>&amp;lt;br&amp;gt;<\/strong><\/tt> and appears to the user as a literal <tt><strong>&lt; b r &gt;<\/strong><\/tt>. That\u2019s not right either.<\/p>\n<p>So, sometimes you can\u2019t encode a string when you read it in, and sometimes you can\u2019t encode it when you write it out, so neither of these proposals works. And without a convention, we\u2019re still running the risk that you do this:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>s = Request(\"name\")<\/strong><\/tt><\/p>\n<p><tt>...pages later...<br \/><\/tt><strong><tt>name = s<\/tt><\/strong><\/p>\n<p><tt>...pages later...<br \/><\/tt><strong><tt>recordset(\"name\") = name \/\/ store name in db in a column \"name\"<\/tt><\/strong><\/p>\n<p><tt>...days later...<br \/><\/tt><strong><tt>theName = recordset(\"name\") <\/tt><\/strong><\/p>\n<p><tt>...pages or even months later...<br \/><\/tt><tt><strong>Write theName<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>Did we remember to encode the string? There\u2019s no single place where you can look to see the bug. There\u2019s no place to sniff. If you have a lot of code like this, it takes a ton of detective work to trace the origin of every string that is ever written out to make sure it has been encoded.<\/p>\n<p><strong><font size=\"4\">The Real Solution<\/font><\/strong><\/p>\n<p>So let me suggest a coding convention that works. We\u2019ll have just one rule:<\/p>\n<p>All strings that come from the user must be stored in variables (or database columns) with a name starting with the prefix &#8220;us&#8221; (for Unsafe String). 
All strings that have been HTML encoded or which came from a known-safe location must be stored in variables with a name starting with the prefix &#8220;s&#8221; (for Safe string).<\/p>\n<p>Let me rewrite that same code, changing nothing but the variable names to match our new convention.<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><strong><tt>us = Request(\"name\")<\/tt><\/strong><\/p>\n<p><tt>...pages later...<br \/><strong>usName = us<\/strong><\/tt><\/p>\n<p><tt>...pages later...<br \/><strong>recordset(\"usName\") = usName <\/strong><\/tt><\/p>\n<p><tt>...days later...<br \/><strong>sName = Encode(recordset(\"usName\"))<\/strong><\/tt><\/p>\n<p><tt>...pages or even months later...<br \/><strong>Write sName<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>The thing I want you to notice about the new convention is that now, if you make a mistake with an unsafe string, <em>you can always see it on some single line of code<\/em>, as long as the coding convention is adhered to:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>s = Request(\"name\")<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>is a priori wrong, because you see the result of <tt>Request<\/tt> being assigned to a variable whose name begins with <tt>s<\/tt>, which is against the rules. 
The result of <tt>Request<\/tt> is always unsafe so it must always be assigned to a variable whose name begins with \u201cus\u201d.<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><strong><tt>us = Request(\"name\")<\/tt><\/strong><\/p>\n<\/blockquote>\n<p>is always OK.<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>usName = us<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>is always OK.<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>sName = us<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>is certainly wrong.<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>sName = Encode(us)<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>is certainly correct.<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>Write usName<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>is certainly wrong.<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>Write sName<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>is OK, as is<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>Write Encode(usName)<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>Every line of code can be inspected <em>by itself<\/em>, and if every line of code is correct, the entire body of code is correct.<\/p>\n<p>Eventually, with this coding convention, your eyes learn to see the <tt><strong>Write usXXX<\/strong><\/tt> and know that it\u2019s wrong, and you instantly know how to fix it, too. I know, it\u2019s a little bit hard to see the wrong code at first, but do this for three weeks, and your eyes will adapt, just like the bakery workers who learned to look at a giant bread factory and instantly say, \u201cjay-zuss, nobody cleaned insahd rounduh fo-ah! 
What the hayl kine a opparashun y\u2019awls runnin&#8217; heey-uh?\u201d <\/p>\n<p>In fact we can extend the rule a bit, and rename (or wrap) the <tt><strong>Request<\/strong><\/tt> and <tt><strong>Encode<\/strong><\/tt> functions to be <tt><strong>UsRequest<\/strong><\/tt> and <tt><strong>SEncode<\/strong><\/tt>&#8230; in other words, functions that return an unsafe string or a safe string will start with <tt><strong>Us<\/strong><\/tt> and <tt><strong>S<\/strong><\/tt>, just like variables. Now look at the code:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>us = UsRequest(\"name\")<br \/><\/strong><\/tt><tt><strong>usName = us<br \/><\/strong><\/tt><tt><strong>recordset(\"usName\") = usName <br \/><\/strong><\/tt><strong><tt>sName = SEncode(recordset(\"usName\"))<br \/><\/tt><tt>Write sName<\/tt><\/strong><\/p>\n<\/blockquote>\n<p>See what I did? Now you can spot mistakes by checking that both sides of the equal sign start with the same prefix.<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><span><\/span><strong><u>us<\/u> = <u>Us<\/u>Request(\"name\") <\/strong>\/\/ ok, both sides start with US<br \/><span><\/span><\/tt><tt><span><\/span><strong><u>s<\/u> = <u>Us<\/u>Request(\"name\") <\/strong>\/\/ bug<br \/><span><\/span><\/tt><tt><span><\/span><strong><u>us<\/u>Name = <u>us<\/u> <\/strong>\/\/ ok<br \/><span><\/span><\/tt><span><\/span><tt><strong><u>s<\/u>Name = <u>us<\/u> <\/strong>\/\/ certainly wrong.<br \/><\/tt><tt><strong><u>s<\/u>Name = <u>S<\/u>Encode(us) <\/strong>\/\/ certainly correct.<\/tt><span><\/span><\/p>\n<\/blockquote>\n<p>Heck, I can take it one step further, by renaming <tt><strong>Write<\/strong><\/tt> to <tt><strong>WriteS<\/strong><\/tt> and renaming <tt><strong>SEncode<\/strong><\/tt> to <strong><tt>SFromUs<\/tt><\/strong>:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong><u>us<\/u> = <u>Us<\/u>Request(\"name\")<br \/><\/strong><\/tt><tt><strong><u>us<\/u>Name = 
<u>us<\/u><br \/><\/strong><\/tt><tt><strong>recordset(\"<u>us<\/u>Name\") = <u>us<\/u>Name <br \/><\/strong><\/tt><strong><tt><u>s<\/u>Name = <u>S<\/u>From<u>Us<\/u>(recordset(\"<u>us<\/u>Name\"))<br \/><\/tt><tt>Write<u>S<\/u> <u>s<\/u>Name<\/tt><\/strong><\/p>\n<\/blockquote>\n<p>This makes mistakes even <em>more <\/em>visible. Your eyes will learn to \u201csee\u201d smelly code, and this will help you find obscure security bugs just through the normal process of writing code and reading code.<\/p>\n<p>Making wrong code look wrong is nice, but it\u2019s not necessarily the best possible solution to every security problem. It doesn\u2019t catch every possible bug or mistake, because you might not look at every line of code. But it\u2019s sure a heck of a lot better than nothing, and I\u2019d much rather have a coding convention where wrong code, at least, looked wrong. You instantly gain the incremental benefit that every time a programmer\u2019s eyes pass over a line of code, that particular bug is checked for and prevented.<\/p>\n<p><font size=\"5\"><strong>A General Rule<\/strong><\/font><\/p>\n<p>This business of making wrong code look wrong depends on getting the right things close together in one place on the screen. When I\u2019m looking at a string, in order to get the code right, I need to know, everywhere I see that string, whether it\u2019s safe or unsafe. I don\u2019t want that information to be in another file or on another page that I would have to scroll to. I have to be able to see it <em>right there<\/em> and that means a variable naming convention.<\/p>\n<p>There are a lot of other examples where you can improve code by moving things next to each other. 
Most coding conventions include rules like:<\/p>\n<ul>\n<li>Keep functions short.<\/li>\n<li>Declare your variables as close as possible to the place where you will use them.<\/li>\n<li>Don\u2019t use macros to create your own personal programming language.<\/li>\n<li>Don\u2019t use <tt>goto<\/tt>.<\/li>\n<li>Don\u2019t put closing braces more than one screen away from the matching opening brace.<\/li>\n<\/ul>\n<p>What all these rules have in common is that they are trying to get the relevant information about what a line of code really does physically as close together as possible. This improves the chances that your eyeballs will be able to figure out everything that\u2019s going on. <\/p>\n<p>In general, I have to admit that I\u2019m a little bit scared of language features that hide things. When you see the code<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><tt><strong>i = j * 5;<\/strong><\/tt><\/p>\n<\/blockquote>\n<p>\u2026 in C you know, at least, that <strong><tt>j<\/tt><\/strong> is being multiplied by five and the results stored in <strong><tt>i<\/tt><\/strong>. <\/p>\n<p>But if you see that same snippet of code in C++, you don\u2019t know anything. Nothing. The only way to know what\u2019s really happening in C++ is to find out what types <tt><strong>i<\/strong><\/tt> and <tt><strong>j<\/strong><\/tt> are, something which might be declared somewhere altogether else. That\u2019s because <tt><strong>j<\/strong><\/tt> might be of a type that has <tt><strong>operator*<\/strong><\/tt> overloaded and it does something terribly witty when you try to multiply it. And <tt><strong>i<\/strong><\/tt> might be of a type that has <tt><strong>operator=<\/strong><\/tt> overloaded, and the types might not be compatible so an automatic type coercion function might end up being called. 
And the only way to find out is not only to check the type of the variables, but to find the code that implements that type, and God help you if there\u2019s inheritance somewhere, because now you have to traipse all the way up the class hierarchy all by yourself trying to find where that code really <em>is<\/em>, and if there\u2019s polymorphism somewhere, you\u2019re <em>really<\/em> in trouble because it\u2019s not enough to know what type <tt>i<\/tt> and <tt>j<\/tt> are <em>declared<\/em>, you have to know what type they are <em>right now<\/em>, which might involve inspecting an arbitrary amount of code and you can never really be sure if you\u2019ve looked everywhere thanks to the halting problem (phew!).<\/p>\n<p>When you see <tt><strong>i=j*5<\/strong><\/tt> in C++ you are really on your own, bubby, and that, in my mind, reduces the ability to detect possible problems just by looking at code.<\/p>\n<p>None of this was supposed to matter, of course. When you do clever-schoolboy things like overload <tt>operator*<\/tt>, this is meant to be to help you provide a nice waterproof abstraction. Golly, <tt>j<\/tt> is a Unicode String type, and multiplying a Unicode String by an integer is <em>obviously<\/em> a good abstraction for converting Traditional Chinese to Standard Chinese, right?<\/p>\n<p>The trouble is, of course, that waterproof abstractions aren\u2019t. I\u2019ve already talked about this extensively in <a href=\"https:\/\/www.joelonsoftware.com\/articles\/LeakyAbstractions.html\">The Law of Leaky Abstractions<\/a> so I won\u2019t repeat myself here.<\/p>\n<p>Scott Meyers has made a whole career out of showing you all the ways they fail and bite you, in C++ at least. (By the way, the third edition of Scott\u2019s book <a href=\"http:\/\/www.awprofessional.com\/title\/0321334876\">Effective C++<\/a> just came out; it\u2019s completely rewritten;&nbsp;get your copy today!)<\/p>\n<p>Okay.<\/p>\n<p>I\u2019m losing track. 
I better summarize The Story Until Now:<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p>Look for coding conventions that make wrong code look wrong. Getting the right information collocated all together in the same place on screen in your code lets you see certain types of problems and fix them right away.<\/p>\n<\/blockquote>\n<p><font size=\"5\"><strong>I\u2019m Hungary<\/strong><\/font><\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" style=\"MARGIN-LEFT: 5px\" border=\"0\" alt=\"Lugnano, Umbria, Italy\" align=\"right\" src=\"https:\/\/i0.wp.com\/www.joelonsoftware.com\/wp-content\/uploads\/2005\/05\/Lugnano.jpg?w=730&#038;ssl=1\" \/>So now we get back to the infamous Hungarian notation.<\/p>\n<p>Hungarian notation was invented by Microsoft programmer Charles Simonyi. One of the major projects Simonyi worked on at Microsoft was Word; in fact he led the project to create the world\u2019s first WYSIWYG word processor, something called Bravo at Xerox PARC.<\/p>\n<p>In WYSIWYG word processing, you have scrollable windows, so every coordinate has to be interpreted as either relative to the window or relative to the page, and that makes a big difference, and keeping them straight is pretty important.<\/p>\n<p>Which, I surmise, is one of the many good reasons Simonyi started using something that came to be called Hungarian notation. It looked like Hungarian, and Simonyi was from Hungary, thus the name. In Simonyi\u2019s version of Hungarian notation, every variable was prefixed with a lower case tag that indicated the kind of thing that the variable contained. 
<\/p>\n<p align=\"center\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" border=\"0\" alt=\"For example, if the variable name is rwCol, rw is the prefix.\" src=\"https:\/\/i0.wp.com\/www.joelonsoftware.com\/wp-content\/uploads\/2005\/05\/hungarian.png?resize=130%2C91&#038;ssl=1\" width=\"130\" height=\"91\" \/><\/p>\n<p>I\u2019m using the word <em>kind<\/em> on purpose, there, because Simonyi mistakenly used the word <em>type<\/em> in his paper, and generations of programmers misunderstood what he meant.<\/p>\n<p>If you read Simonyi\u2019s paper closely, what he was getting at was the same kind of naming convention as I used in my example above where we decided that <tt><strong>us<\/strong><\/tt> meant \u201cunsafe string\u201d and <tt><strong>s<\/strong><\/tt> meant \u201csafe string.\u201d They\u2019re both of type <tt><strong>string<\/strong><\/tt>. The compiler won\u2019t help you if you assign one to the other and Intellisense won\u2019t tell you bupkis. But they are semantically different; they need to be interpreted differently and treated differently and some kind of conversion function will need to be called if you assign one to the other or you will have a <em>runtime<\/em> bug. <em>If<\/em> you\u2019re lucky.<\/p>\n<p>Simonyi\u2019s original concept for Hungarian notation was called, inside Microsoft, Apps Hungarian, because it was used in the Applications Division, to wit, Word and Excel. In Excel\u2019s source code you see a lot of <tt><strong>rw<\/strong><\/tt> and <tt><strong>col<\/strong><\/tt> and when you see those you know that they refer to rows and columns. Yep, they\u2019re both integers, but it never makes sense to assign between them. 
In Word, I&#8217;m told, you see a lot of <tt><strong>xl<\/strong><\/tt> and <tt><strong>xw<\/strong><\/tt>, where <tt><strong>xl<\/strong><\/tt> means \u201chorizontal coordinates relative to the layout\u201d and <tt><strong>xw<\/strong><\/tt> means \u201chorizontal coordinates relative to the window.\u201d Both ints. Not interchangeable. In both apps you see a lot of <tt>cb<\/tt> meaning \u201ccount of bytes.\u201d Yep, it\u2019s an int again, but you know so much more about it just by looking at the variable name. It\u2019s a count of bytes: a buffer size. And if you see <tt><strong>xl = cb<\/strong><\/tt>, well, blow the Bad Code Whistle, that is obviously wrong code, because even though <tt><strong>xl<\/strong><\/tt> and <tt><strong>cb<\/strong><\/tt> are both integers, it\u2019s completely crazy to set a horizontal offset in pixels to a count of bytes.<\/p>\n<p>In Apps Hungarian prefixes are used for functions, as well as variables. So, to tell you the truth, I\u2019ve never seen the Word source code, but I\u2019ll bet you dollars to donuts there\u2019s a function called <tt><strong>YlFromYw<\/strong><\/tt> which converts from vertical window coordinates to vertical layout coordinates. Apps Hungarian requires the notation <tt><strong>TypeFromType<\/strong><\/tt> instead of the more traditional&nbsp;<tt><strong>TypeToType<\/strong><\/tt> so that every function name could begin with the type of thing that it was returning, just like I did earlier in the example when I renamed <strong><tt>Encode<\/tt><\/strong> to <strong><tt>SFromUs<\/tt><\/strong>. In fact in proper Apps Hungarian the Encode function would <em>have<\/em> to be named <tt><strong>SFromUs<\/strong><\/tt>. Apps Hungarian wouldn\u2019t really give you a choice in how to name this function. 
That\u2019s a good thing, because it\u2019s one less thing you need to remember, and you don\u2019t have to wonder what kind of encoding is being referred to by the word <tt>Encode<\/tt>: you have something much more precise.<\/p>\n<p>Apps Hungarian was extremely valuable, especially in the days of C programming where the compiler didn\u2019t provide a very useful type system. <\/p>\n<p>But then something kind of wrong happened. <\/p>\n<p>The dark side took over Hungarian Notation.<\/p>\n<p>Nobody seems to know why or how, but it appears that the documentation writers on the Windows team inadvertently invented what came to be known as Systems Hungarian.<\/p>\n<p>Somebody, somewhere, read Simonyi\u2019s paper, where he used the word \u201ctype,\u201d and thought he meant type, like class, like in a type system, like the type checking that the compiler does. He did not. He explained very carefully exactly what he meant by the word \u201ctype,\u201d but it didn\u2019t help. The damage was done. <\/p>\n<p>Apps Hungarian had very useful, meaningful prefixes like \u201cix\u201d to mean an index into an array, \u201cc\u201d to mean a count, \u201cd\u201d to mean the difference between two numbers (for example \u201cdx\u201d meant \u201cwidth\u201d), and so forth.<\/p>\n<p>Systems Hungarian had far less useful prefixes like \u201cl\u201d for long and \u201cul\u201d for \u201cunsigned long\u201d and \u201cdw\u201d for double word, which is, actually, uh, an unsigned long. In Systems Hungarian, the only thing that the prefix told you was the actual data type of the variable.<\/p>\n<p>This was a subtle but complete misunderstanding of Simonyi\u2019s intention and practice, and it just goes to show you that if you write convoluted, dense academic prose nobody will understand it and your ideas will be misinterpreted and then the misinterpreted ideas will be ridiculed even when they weren\u2019t your ideas. 
So in Systems Hungarian you got a lot of <tt>dwFoo<\/tt> meaning \u201cdouble word foo,\u201d and doggone it, the fact that a variable is a double word tells you darn near nothing useful at all. So it\u2019s no wonder people rebelled against Systems Hungarian.<\/p>\n<p>Systems Hungarian was promulgated far and wide; it is the standard throughout the Windows programming documentation; it was spread extensively by books like <a href=\"http:\/\/www.charlespetzold.com\/pw5\/\">Charles Petzold\u2019s Programming Windows<\/a>, the bible for learning Windows programming, and it rapidly became the dominant form of Hungarian, even inside Microsoft, where very few programmers outside the Word and Excel teams understood just what a mistake they had made.<\/p>\n<p>And then came The Great Rebellion. Eventually, programmers who never understood Hungarian in the first place noticed that the misunderstood subset they were using was Pretty Dang Annoying and Well-Nigh Useless, and they revolted against it. Now, there are still some nice qualities in Systems Hungarian, which help you see bugs. At the very least, if you use Systems Hungarian, you\u2019ll know the type of a variable at the spot where you\u2019re using it. But it\u2019s not nearly as valuable as Apps Hungarian.<\/p>\n<p>The Great Rebellion hit its peak with the first release of .NET. Microsoft finally started telling people, \u201cHungarian Notation Is Not Recommended.\u201d There was much rejoicing. I don\u2019t even think they bothered saying why. They just went through the naming guidelines section of the document and wrote, \u201cDo Not Use Hungarian Notation\u201d in every entry. 
Hungarian Notation was so doggone unpopular by this point that nobody really complained, and everybody in the world outside of Excel and Word was relieved at no longer having to use an awkward naming convention that, they thought, was unnecessary in the days of strong type checking and IntelliSense.<\/p>\n<p>But there\u2019s still a tremendous amount of value to Apps Hungarian, in that it increases collocation in code, which makes the code easier to read, write, debug, and maintain, and, most importantly, it makes wrong code look wrong. <\/p>\n<p>Before we go, there\u2019s one more thing I promised to do, which is to bash exceptions one more time. The last time I did that I got in a lot of trouble. In an off-the-cuff remark on the Joel on Software homepage, <a href=\"https:\/\/www.joelonsoftware.com\/items\/2003\/10\/13.html\">I wrote<\/a> that I don\u2019t like exceptions because they are, effectively, an invisible goto, which, I reasoned, is even worse than a goto you can see. Of course millions of people jumped down my throat. The only person in the world who leapt to my defense was, of course, Raymond Chen, who is, by the way, the best programmer in the world, so that has to say something, right?<\/p>\n<p>Here\u2019s the thing with exceptions, in the context of this article. Your eyes learn to see wrong things, as long as there is something to see, and this prevents bugs. In order to make code really, really robust, when you code-review it, you need to have coding conventions that allow collocation. In other words, the more information about what code is doing is located right in front of your eyes, the better a job you\u2019ll do at finding the mistakes. When you have code that says<\/p>\n<blockquote style=\"MARGIN-RIGHT: 0px\" dir=\"ltr\">\n<p><strong><tt>dosomething();<br \/>cleanup();<\/tt><\/strong><\/p>\n<\/blockquote>\n<p>\u2026 your eyes tell you, what\u2019s wrong with that? We always clean up! 
But the possibility that <tt><strong>dosomething<\/strong><\/tt> might throw an exception means that <tt><strong>cleanup<\/strong><\/tt> might not get called. And that\u2019s easily fixable, using <tt><strong>finally<\/strong><\/tt> or whatnot, but that\u2019s not my point: my point is that the only way to know that <tt><strong>cleanup<\/strong><\/tt> is definitely called is to investigate the entire call tree of <tt><strong>dosomething<\/strong><\/tt> to see if there\u2019s anything in there, anywhere, which can throw an exception, and that\u2019s ok, and there are things like checked exceptions to make it less painful, but the real point is that exceptions eliminate collocation. You have to look <em>somewhere else<\/em> to answer a question of whether code is doing the right thing, so you\u2019re not able to take advantage of your eye\u2019s built-in ability to learn to see wrong code, because there\u2019s nothing to see.<\/p>\n<p>Now, when I\u2019m writing a dinky script to gather up a bunch of data and print it once a day, heck yeah, exceptions are great. I like nothing more than to ignore all possible wrong things that can happen and just wrap up the whole damn program in a big ol\u2019 try\/catch that emails me if anything ever goes wrong. Exceptions are fine for quick-and-dirty code, for scripts, and for code that is neither mission critical nor life-sustaining. But if you\u2019re writing an operating system, or a nuclear power plant, or the software to control a high speed circular saw used in open heart surgery, exceptions are extremely dangerous. <\/p>\n<p>I know people will assume that I\u2019m a lame programmer for failing to understand exceptions properly and failing to understand all the ways they can improve my life if only I was willing to let exceptions into my heart, but, too bad. 
The way to write really reliable code is to try to use simple tools that take into account typical human frailty, not complex tools with hidden side effects and leaky abstractions that assume an infallible programmer.<\/p>\n<p><font size=\"4\"><strong>More Reading<\/strong><\/font><\/p>\n<p>If you&#8217;re still all gung-ho about exceptions, read Raymond Chen&#8217;s essay <a href=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2005\/01\/14\/352949.aspx\">Cleaner, more elegant, and harder to recognize<\/a>. &#8220;It is extraordinarily difficult to see the difference between bad exception-based code and not-bad exception-based code&#8230;&nbsp;exceptions are too hard and I&#8217;m not smart enough to handle them.&#8221;<\/p>\n<p>Raymond&#8217;s rant about Death by Macros, <a href=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2005\/01\/06\/347666.aspx\">A rant against flow control macros<\/a>, is about another case where failing to get information all in the same place makes code unmaintainable. &#8220;When you see code that uses [macros], you have to go dig through header files to figure out what they do.&#8221;<\/p>\n<p>For background on the history of Hungarian notation, start with Simonyi&#8217;s original paper, <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/aa260976(v=vs.60).aspx\">Hungarian Notation<\/a>. Doug Klunder <a href=\"http:\/\/www.byteshift.de\/msg\/hungarian-notation-doug-klunder\">introduced this to the Excel team<\/a>&nbsp;in a somewhat clearer paper. 
For more stories about Hungarian and how it got ruined by documentation writers, read <a href=\"http:\/\/blogs.msdn.com\/larryosterman\/archive\/2004\/06\/22\/162629.aspx\">Larry Osterman<\/a>&#8216;s post, especially <a href=\"http:\/\/blogs.msdn.com\/larryosterman\/archive\/2004\/06\/22\/162629.aspx#163721\">Scott Ludwig&#8217;s comment<\/a>, or <a href=\"http:\/\/blogs.msdn.com\/rick_schaut\/archive\/2004\/02\/14\/73108.aspx\">Rick Schaut&#8217;s post<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Way back in September 1983, I started my first real job, working at Oranim, a big bread factory in Israel that made something like 100,000 loaves of&hellip; <span class=\"read-more\"><a class=\"more-link\" href=\"https:\/\/www.joelonsoftware.com\/2005\/05\/11\/making-wrong-code-look-wrong\/\" rel=\"bookmark\">Read more <span class=\"screen-reader-text\">&#8220;Making Wrong Code Look Wrong&#8221;<\/span><\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":true,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[5,2],"tags":[],"class_list":["post-901","post","type-post","status-publish","format-standard","hentry","category-rock-star-developer","category-news"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp
.me\/p83KNI-ex","_links":{"self":[{"href":"https:\/\/www.joelonsoftware.com\/wp-json\/wp\/v2\/posts\/901","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.joelonsoftware.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.joelonsoftware.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.joelonsoftware.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.joelonsoftware.com\/wp-json\/wp\/v2\/comments?post=901"}],"version-history":[{"count":1,"href":"https:\/\/www.joelonsoftware.com\/wp-json\/wp\/v2\/posts\/901\/revisions"}],"predecessor-version":[{"id":3012,"href":"https:\/\/www.joelonsoftware.com\/wp-json\/wp\/v2\/posts\/901\/revisions\/3012"}],"wp:attachment":[{"href":"https:\/\/www.joelonsoftware.com\/wp-json\/wp\/v2\/media?parent=901"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.joelonsoftware.com\/wp-json\/wp\/v2\/categories?post=901"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.joelonsoftware.com\/wp-json\/wp\/v2\/tags?post=901"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}