Hungarian notation is an identifier naming convention in computer programming in which the name of a variable or function indicates its intention or kind, or in some dialects, its type. The original Hungarian notation uses only intention or kind in its naming convention and is sometimes called Apps Hungarian, because it became popular in the Microsoft Apps division during the development of Microsoft Office applications. When the Microsoft Windows division adopted the naming convention, it based the prefixes on the actual data type, and this variant spread widely through the Windows API; it is sometimes called Systems Hungarian notation.
Hungarian notation was designed to be language-independent, and found its first major use with the BCPL programming language. Because BCPL has no data types other than the machine word, nothing in the language itself helps a programmer remember variables' types. Hungarian notation aims to remedy this by providing the programmer with explicit knowledge of each variable's data type.
In Hungarian notation, a variable name starts with a group of lower-case letters which are mnemonics for the type or purpose of that variable, followed by whatever name the programmer has chosen; this last part is sometimes distinguished as the given name. The first character of the given name can be capitalized to separate it from the type indicators (see also CamelCase). Otherwise the case of this character denotes scope.
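The split between prefix and given name described above can be sketched mechanically. The following Python snippet is only an illustration and not part of any Hungarian notation tooling; the function name <code>split_hungarian</code> and the rule that the prefix ends at the first capital letter are assumptions made for this example.

```python
import re

# Assumed rule: the prefix is the leading run of lower-case letters and digits,
# and the given name starts at the first capital letter.
_HUNGARIAN = re.compile(r"^([a-z][a-z0-9]*)([A-Z]\w*)$")


def split_hungarian(name):
    """Return (prefix, given_name), or None if the name has no such shape."""
    match = _HUNGARIAN.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


sz_parts = split_hungarian("szLastName")      # prefix "sz", given name "LastName"
u16_parts = split_hungarian("u16Identifier")  # digits are allowed in the prefix
no_parts = split_hungarian("count")           # no capital letter, so no split
```

The split is purely lexical: an ordinary camel-case name such as <code>lastName</code> would also be divided into <code>last</code> and <code>Name</code>, so a real checker would additionally need a table of the mnemonics in use.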
History
The original Hungarian notation was invented by Charles Simonyi, a programmer who worked at Xerox PARC circa 1972–1981, and who later became Chief Architect at Microsoft. The name of the notation refers to Simonyi's nation of origin and, according to Andy Hertzfeld, to the fact that it made programs "look like they were written in some inscrutable foreign language". Hungarian people's names are "reversed" compared to most other European names; the family name precedes the given name. For example, the anglicized name "Charles Simonyi" was originally "Simonyi Károly" in Hungarian. In the same way, the type name precedes the "given name" in Hungarian notation. The similar Smalltalk "type last" naming style (e.g. aPoint and lastPoint) was common at Xerox PARC during Simonyi's tenure there.
Simonyi's paper on the notation referred to prefixes used to indicate the "type" of information being stored. Examples of such prefixes include:
- <code>pX</code> is a pointer to another type X; this contains very little semantic information.
- <code>d</code> is a prefix meaning difference between two values; for instance, <code>dY</code> might represent a distance along the Y-axis of a graph, while a variable just called <code>y</code> might be an absolute position. This is entirely semantic in nature.
- <code>sz</code> is a null-terminated string. In C, this contains some semantic information because it is not clear whether a variable of type char* is a pointer to a single character, an array of characters or a null-terminated string.
- <code>w</code> marks a variable that is a word. This contains essentially no semantic information at all, and would probably be considered Systems Hungarian.
- <code>b</code> marks a byte, which in contrast to <code>w</code> might carry semantic information: in C the only byte-sized data type is <code>char</code>, so it is sometimes used to hold numeric values. The prefix can resolve the ambiguity over whether the variable holds a value that should be treated as a character or as a number.
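The semantic use of the <code>d</code> prefix from the list above can be sketched as follows. Python is used here only as a neutral language, and the names <code>yTop</code>, <code>yBottom</code>, <code>dyHeight</code> and <code>yShifted</code> are invented for this illustration; they deliberately follow Hungarian style rather than Python's usual naming conventions.

```python
# y  : absolute position on the Y-axis
# dy : difference between two y positions
yTop = 120
yBottom = 300

dyHeight = yBottom - yTop    # y - y yields a dy
yShifted = yTop + dyHeight   # y + dy yields a y again

# yWrong = yTop + yBottom    # y + y: adding two absolute positions looks wrong
```

All four variables are plain integers, so no type checker would separate them; only the prefixes tell the reader which combinations make sense.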
While the notation always uses initial lower-case letters as mnemonics, it does not prescribe the mnemonics themselves. There are several widely used conventions (see examples below), but any set of letters can be used, as long as they are consistent within a given body of code.
Code written in Apps Hungarian may still contain Systems Hungarian prefixes for variables that are defined solely in terms of their type.
Relation to sigils
In some programming languages, a similar notation now called sigils is built into the language and enforced by the compiler. For example, in some forms of BASIC, <code>name$</code> names a string and <code>count%</code> names an integer. The major difference between Hungarian notation and sigils is that sigils declare the type of the variable in the language, whereas Hungarian notation is purely a naming scheme with no effect on the machine interpretation of the program text.
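The point that a Hungarian prefix is not checked by anything can be shown with a minimal Python sketch. The variable names are invented for this example, and Python merely stands in for any language without sigils.

```python
# The prefix "c" promises a count, but nothing enforces that promise.
cApples = 3
cApples = "three"  # accepted by the interpreter; only a human reader would object

# A name without any prefix behaves in exactly the same way.
apples = "three"
same_value = (cApples == apples)
```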
Examples
- <code>bBusy</code> : Boolean
- <code>chInitial</code> : char
- <code>cApples</code> : count of items
- <code>dwLightYears</code> : double word (Systems)
- <code>fBusy</code> : flag (or float)
- <code>nSize</code> : integer (Systems) or count (Apps)
- <code>iSize</code> : integer (Systems) or index (Apps)
- <code>fpPrice</code> : floating-point
- <code>decPrice</code> : decimal
- <code>dbPi</code> : double (Systems)
- <code>pFoo</code> : pointer
- <code>rgStudents</code> : array, or range
- <code>szLastName</code> : null-terminated string
- <code>u16Identifier</code> : unsigned 16-bit integer (Systems)
- <code>u32Identifier</code> : unsigned 32-bit integer (Systems)
- <code>stTime</code> : clock time structure
- <code>fnFunction</code> : function name
The mnemonics for pointers and arrays, which are not actual data types, are usually followed by the type of the data element itself:
- <code>pszOwner</code> : pointer to null-terminated string
- <code>rgfpBalances</code> : array of floating-point values
- <code>aulColors</code> : array of unsigned long (Systems)
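The way pointer and array mnemonics stack in front of an element type can be sketched as a small decoder. This is a toy under stated assumptions: the function name <code>describe_prefix</code> is hypothetical, and the two tables contain only a few mnemonics taken from the examples above. Real prefix sets are larger and can be ambiguous.

```python
# Mnemonics that wrap another type (taken from the examples above).
MODIFIERS = {"p": "pointer to", "rg": "array of", "a": "array of"}

# Mnemonics that name an element type (also taken from the examples above).
BASES = {
    "sz": "null-terminated string",
    "fp": "floating-point value",
    "ul": "unsigned long",
    "ch": "char",
    "dw": "double word",
}


def describe_prefix(prefix):
    """Peel modifiers off the front until a known base mnemonic remains."""
    parts = []
    rest = prefix
    while rest not in BASES:
        for mnemonic, meaning in MODIFIERS.items():
            if rest.startswith(mnemonic):
                parts.append(meaning)
                rest = rest[len(mnemonic):]
                break
        else:
            # No modifier matched and the remainder is not a base mnemonic.
            raise ValueError(f"unknown prefix: {prefix!r}")
    parts.append(BASES[rest])
    return " ".join(parts)


owner = describe_prefix("psz")      # prefix of pszOwner
balances = describe_prefix("rgfp")  # prefix of rgfpBalances
colors = describe_prefix("aul")     # prefix of aulColors
```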
While Hungarian notation can be applied to any programming language and environment, it was widely adopted by Microsoft for use with the C language, in particular for Microsoft Windows, and its use remains largely confined to that area. Use of Hungarian notation was widely evangelized by Charles Petzold's "Programming Windows", the original (and for many readers, the definitive) book on Windows API programming. Thus, many commonly seen constructs of Hungarian notation are specific to Windows:
- For programmers who learned Windows programming in C, probably the most memorable examples are the <code>wParam</code> (word-size parameter) and <code>lParam</code> (long-integer parameter) for the WindowProc() function.
- <code>hwndFoo</code> : handle to a window
- <code>lpszBar</code> : long pointer to a null-terminated string
The notation is sometimes extended in C++ to include the scope of a variable, optionally separated by an underscore. This extension is often also used without the Hungarian type-specification:
- <code>g_nWheels</code> : member of a global namespace, integer
- <code>m_nWheels</code> : member of a structure/class, integer
- <code>m_wheels</code>, <code>_wheels</code> : member of a structure/class
- <code>s_wheels</code> : static member of a class
- <code>c_wheels</code> : static member of a function
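The scope extension listed above can be separated from the rest of a name in the same mechanical way. The sketch below is illustrative only: the function name <code>split_scope</code> is hypothetical, and the table simply restates the scope letters from the list above.

```python
# Scope letters from the list above; the empty key covers the bare "_wheels" form.
SCOPES = {
    "g": "global",
    "m": "member",
    "s": "static member of a class",
    "c": "static member of a function",
    "": "member",
}


def split_scope(name):
    """Return (scope, remainder); scope is None if no scope prefix is present."""
    head, sep, tail = name.partition("_")
    if sep and head in SCOPES and tail:
        return SCOPES[head], tail
    return None, name


global_wheels = split_scope("g_nWheels")  # scope prefix plus Hungarian type prefix
plain_member = split_scope("_wheels")     # bare underscore form
no_scope = split_scope("nWheels")         # no scope prefix at all
```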
Disadvantages
(Some of these apply to Systems Hungarian only.)
Critics argue that the drawbacks of Hungarian notation include:
- The additional type information is a poor substitute for a more descriptive name. For example, <code>sDatabase</code> does not tell the reader what the variable holds; <code>databaseName</code> is more descriptive.
- When names are sufficiently descriptive, the additional type information can be redundant. For example, <code>firstName</code> is most likely a string, so naming it <code>sFirstName</code> only adds clutter to the code.
- The names are harder to remember.
- Multiple variables with different semantics can be used in a block of code with similar names: <code>dwTmp</code>, <code>iTmp</code>, <code>fTmp</code>, <code>dTmp</code>.
Notable opinions
- Robert Cecil Martin (against Hungarian notation and all other forms of encoding): <blockquote>... nowadays HN and other forms of type encoding are simply impediments. They make it harder to change the name or type of a variable, function, member or class. They make it harder to read the code. And they create the possibility that the encoding system will mislead the reader.</blockquote>
- Linus Torvalds (against Systems Hungarian): <blockquote>Encoding the type of a function into the name (so-called Hungarian notation) is brain damaged—the compiler knows the types anyway and can check those, and it only confuses the programmer.</blockquote>
- Steve McConnell (for Apps Hungarian): <blockquote>Although the Hungarian naming convention is no longer in widespread use, the basic idea of standardizing on terse, precise abbreviations continues to have value. Standardized prefixes allow you to check types accurately when you're using abstract data types that your compiler can't necessarily check.</blockquote>
- Bjarne Stroustrup (against Systems Hungarian for C++):<blockquote>No I don't recommend 'Hungarian'. I regard 'Hungarian' (embedding an abbreviated version of a type in a variable name) as a technique that can be useful in untyped languages, but is completely unsuitable for a language that supports generic programming and object-oriented programming — both of which emphasize selection of operations based on the type and arguments (known to the language or to the run-time support). In this case, 'building the type of an object into names' simply complicates and minimizes abstraction.</blockquote>
- Joel Spolsky (for Apps Hungarian): <blockquote>If you read Simonyi's paper closely, what he was getting at was the same kind of naming convention as I used in my example above where we decided that <code>us</code> meant unsafe string and <code>s</code> meant safe string. They're both of type <code>string</code>. The compiler won't help you if you assign one to the other and Intellisense [an intelligent code completion system] won't tell you bupkis. But they are semantically different. They need to be interpreted differently and treated differently and some kind of conversion function will need to be called if you assign one to the other or you will have a runtime bug. If you're lucky. There's still a tremendous amount of value to Apps Hungarian, in that it increases collocation in code, which makes the code easier to read, write, debug and maintain, and, most importantly, it makes wrong code look wrong.... (Systems Hungarian) was a subtle but complete misunderstanding of Simonyi’s intention and practice.</blockquote>
- Microsoft's Design Guidelines discourage developers from using Systems Hungarian notation when they choose names for the elements in .NET class libraries, although it was common on prior Microsoft development platforms like Visual Basic 6 and earlier. These Design Guidelines are silent on the naming conventions for local variables inside functions.
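The <code>us</code>/<code>s</code> convention from the Spolsky quotation above can be sketched in a few lines. This is an illustration under assumptions, not Spolsky's own code: the helper name <code>sFromUs</code> and the variable names are invented here, HTML escaping is assumed to be what makes a string "safe", and Python's standard <code>html.escape</code> stands in for the conversion function.

```python
import html


def sFromUs(usText):
    """Convert an unsafe string (us) into a safe one (s) by HTML-escaping it."""
    return html.escape(usText)


usName = "<script>alert('hi')</script>"  # raw user input, hence "us"
sName = sFromUs(usName)                  # the s on the left matches the s in sFromUs

sGreeting = "Hello, " + sName            # s built from s: looks right
# sGreeting = "Hello, " + usName         # s built from us: looks wrong at a glance
```

Both kinds of value are ordinary strings, so the interpreter accepts either line; the prefixes exist so that the mismatch in the commented-out line is visible to a reader.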
See also
- Leszynski naming convention, a variant of Hungarian for database development
- Camel case, another widespread naming convention
- Polish notation, an unrelated concept with a similar name
External links
- Meta-Programming: A Software Production Method Charles Simonyi, December 1976 (PhD Thesis)
- Hugarian notation - it's my turn now :) – Larry Osterman's WebLog
- Hungarian Notation (MSDN)
- HTML version of Doug Klunder's paper, Idle Loop Software Design, archived May 9, 2023
- RVBA Naming Conventions
- Coding Style Conventions (MSDN)
