In some programming languages, <code>const</code> is a type qualifier (a keyword applied to a data type) that indicates that the data is read-only. While this can be used to declare constants, <code>const</code> in the C family of languages differs from similar constructs in other languages in that it is part of the type, and thus has complicated behavior when combined with pointers, references, composite data types, and type-checking. In other languages, the data is not in a single memory location, but copied at compile time for each use. Languages which use it include C, C++, D, JavaScript, Julia, and Rust.
Introduction
When applied in an object declaration, it indicates that the object is a constant: its value may not be changed, unlike a variable. This basic use – to declare constants – has parallels in many other languages.
However, unlike in other languages, in the C family of languages the <code>const</code> is part of the type, not part of the object. For example, in C, a declaration such as <code>const int x = 1;</code> declares an object <code>x</code> of <code>const int</code> type – the <code>const</code> is part of the type, as if it were parsed "(const int) x" – while in Ada, a declaration such as <code>X : constant INTEGER := 1;</code> declares a constant (a kind of object) <code>X</code> of <code>INTEGER</code> type: the <code>constant</code> is part of the object, but not part of the type.
This has two subtle results. Firstly, <code>const</code> can be applied to parts of a more complex type – for example, <code>const int* const x;</code> declares a constant pointer to a constant integer, while <code>const int* x;</code> declares a variable pointer to a constant integer, and <code>int* const x;</code> declares a constant pointer to a variable integer. Secondly, because <code>const</code> is part of the type, it must match as part of type-checking. For example, the following code is invalid:
<syntaxhighlight lang="cpp">
void f(int& x);
// ...
const int i = 0; // a const object must be initialized in C++
f(i);
</syntaxhighlight>
because the argument to <code>f</code> must be a variable integer, but <code>i</code> is a constant integer. This matching is a form of program correctness, and is known as const-correctness. This allows a form of programming by contract, where functions specify as part of their type signature whether they modify their arguments or not, and whether their return value is modifiable or not. This type-checking is primarily of interest in pointers and references – not basic value types like integers – but also for composite data types or templated types such as containers. It is concealed by the fact that the <code>const</code> can often be omitted, due to type coercion (implicit type conversion) and C being call-by-value (C++ and D are either call-by-value or call-by-reference).
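The contract idea can be made concrete with a minimal sketch. The function names <code>twice</code>, <code>increment</code>, and <code>demo_contract</code> are hypothetical and chosen only for illustration; the sketch assumes a C++17 compiler.

```cpp
#include <cassert>
#include <type_traits>

// const is part of the type: int and const int are distinct types.
static_assert(!std::is_same_v<int, const int>);
static_assert(std::is_same_v<std::remove_const_t<const int>, int>);

// Promises not to modify its argument, so it accepts const and non-const ints.
int twice(const int& x) { return 2 * x; }

// Modifies its argument, so it requires a non-const (variable) int.
void increment(int& x) { ++x; }

// Hypothetical driver exercising both functions.
int demo_contract() {
    const int limit = 20;
    int counter = 1;
    increment(counter);      // OK: counter becomes 2
    // increment(limit);     // would not compile: limit is a const int
    assert(counter == 2);
    return twice(limit) + twice(counter);
}
```

The signature alone tells the caller that <code>twice</code> leaves its argument untouched, while <code>increment</code> may change it.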
Consequences
The idea of const-ness does not imply that the variable as it is stored in computer memory is unwritable. Rather, <code>const</code>-ness is a compile-time construct that indicates what a programmer should do, not necessarily what they can do. Note, however, that in the case of predefined data (such as <code>const char*</code> string literals), C <code>const</code> is often unwritable.
Distinction from constants
While a constant does not change its value while the program is running, an object declared <code>const</code> may indeed change its value while the program is running. A common example is a read-only register in an embedded system, such as one holding the current state of a digital input. The data registers for digital inputs are often declared as <code>const</code> and <code>volatile</code>. The content of these registers may change without the program doing anything (<code>volatile</code>), but it would be ill-formed for the program to attempt to write to them (<code>const</code>).
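The following sketch imitates such a register on a hosted system. It is only a simulation under stated assumptions: on a real target the register would live at a fixed memory-mapped address, whereas here an ordinary variable (<code>simulated_pins</code>, a hypothetical name) stands in for the hardware, and the program sees it only through a <code>const volatile</code> view.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the hardware: on a real target this would be a fixed
// memory-mapped address rather than an ordinary variable.
volatile std::uint32_t simulated_pins = 0;

// The program's view of the register: readable, never writable.
const volatile std::uint32_t& INPUT_STATUS = simulated_pins;

// Simulates the outside world changing the input lines.
void hardware_sets_pins(std::uint32_t value) { simulated_pins = value; }

// Each call performs a fresh read because the view is volatile.
std::uint32_t read_inputs() { return INPUT_STATUS; }

// INPUT_STATUS = 1;  // would not compile: assignment through a const view

void demo_register() {
    assert(read_inputs() == 0);
    hardware_sets_pins(0x5);
    assert(read_inputs() == 0x5);  // the value changed without the view being written
}
```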
Other uses
In addition, a (non-static) member-function can be declared as <code>const</code>. In this case, the <code>this</code> pointer inside such a function is of type <code>const T*</code> rather than merely of type <code>T*</code>. This means that non-const functions for this object cannot be called from inside such a function, nor can member variables be modified. In C++, a member variable can be declared as <code>mutable</code>, indicating that this restriction does not apply to it. In some cases, this can be useful, for example with caching, reference counting, and data synchronization. In these cases, the logical meaning (state) of the object is unchanged, but the object is not physically constant since its bitwise representation may change.
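As an illustration of the caching case, here is a minimal sketch in which a <code>const</code> member function fills in a cache through <code>mutable</code> members. The class <code>Series</code> and its members are hypothetical names, not taken from any library.

```cpp
#include <cassert>

class Series {
public:
    explicit Series(int n) : n_(n) {}

    // Logically read-only: the observable result never changes.
    long sum() const {
        if (!cached_) {
            long total = 0;
            for (int k = 1; k <= n_; ++k) total += k;
            sum_ = total;       // allowed: sum_ is mutable
            cached_ = true;     // allowed: cached_ is mutable
            ++computations_;
            // n_ = 0;          // would not compile: n_ is not mutable
        }
        return sum_;
    }

    int computations() const { return computations_; }

private:
    int n_;
    mutable bool cached_ = false;
    mutable long sum_ = 0;
    mutable int computations_ = 0;
};

void demo_mutable() {
    const Series s(100);
    assert(s.sum() == 5050);
    assert(s.sum() == 5050);          // second call is served from the cache
    assert(s.computations() == 1);
}
```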
Syntax
In C, C++, and D, all data types, including those defined by the user, can be declared <code>const</code>, and const-correctness dictates that all variables or objects should be declared as such unless they need to be modified. Such proactive use of <code>const</code> makes values "easier to understand, track, and reason about", and it thus increases the readability and comprehensibility of code and makes working in teams and maintaining code simpler because it communicates information about a value's intended use. This can help the compiler as well as the developer when reasoning about code. It can also enable an optimizing compiler to generate more efficient code.
Simple data types
For simple non-pointer data types, applying the <code>const</code> qualifier is straightforward. It can go on either side of some types for historical reasons (for example, <code>const char foo = 'a';</code> is equivalent to <code>char const foo = 'a';</code>). On some implementations, using <code>const</code> twice (for instance, <code>const char const</code> or <code>char const const</code>) generates a warning but not an error.
Pointers and references
For pointer and reference types, the meaning of <code>const</code> is more complicated – either the pointer itself, or the value being pointed to, or both, can be <code>const</code>. Further, the syntax can be confusing. A pointer can be declared as a <code>const</code> pointer to writable value, or a writable pointer to a <code>const</code> value, or <code>const</code> pointer to <code>const</code> value. A <code>const</code> pointer cannot be reassigned to point to a different object from the one it is initially assigned, but it can be used to modify the value that it points to (called the pointee<!-- Might be better described somewhere else, hence circular link with possibilities for now. If you point or move this elsewhere, please also take care of the incoming redirects to "Pointee" and "Cray pointer". -->). Reference variables in C++ are an alternate syntax for <code>const</code> pointers. A pointer to a <code>const</code> object, on the other hand, can be reassigned to point to another memory location (which should be an object of the same type or of a convertible type), but it cannot be used to modify the memory that it is pointing to. A <code>const</code> pointer to a <code>const</code> object can also be declared and can neither be used to modify the pointee nor be reassigned to point to another object. The following code illustrates these subtleties:
<syntaxhighlight lang="c">
void foo(int* p, const int* cp, int* const pc, const int* const cpc) {
    *p = 0;     // OK: modifies the pointed to data
    p = NULL;   // OK: modifies the pointer
    *cp = 0;    // Error! Cannot modify the pointed to data
    cp = NULL;  // OK: modifies the pointer
    *pc = 0;    // OK: modifies the pointed to data
    pc = NULL;  // Error! Cannot modify the pointer
    *cpc = 0;   // Error! Cannot modify the pointed to data
    cpc = NULL; // Error! Cannot modify the pointer
}
</syntaxhighlight>
C convention
Following usual C convention for declarations, declaration follows use, and the <code>*</code> in a pointer is written on the pointer, indicating dereferencing. For example, in the declaration <code>int *p</code>, the dereferenced form <code>*p</code> is an <code>int</code>, while the reference form <code>p</code> is a pointer to an <code>int</code>. Thus <code>const</code> modifies the name to its right. The C++ convention is instead to associate the <code>*</code> with the type, as in <code>int* p</code>, and read the <code>const</code> as modifying the type to the left. <code>const int * cp</code> can thus be read as "<code>*cp</code> is a <code>const int</code>" (the value is constant), or "<code>cp</code> is a <code>const int *</code>" (the pointer is a pointer to a constant integer). Thus:
<syntaxhighlight lang="c">
int* p; // *p is an int value
const int* cp; // *cp is a constant int
int* const pc; // pc is a constant int*
const int* const cpc; // cpc is a constant pointer and points to a constant value
</syntaxhighlight>
C++ convention
Following C++ convention of analyzing the type, not the value, a rule of thumb is to read the declaration from right to left. Thus, everything to the left of the star can be identified as the pointed type and everything to the right of the star are the pointer properties. For instance, in our example above, <code>const int*</code> can be read as a writable pointer that refers to a non-writable integer, and <code>int* const</code> can be read as a non-writable pointer that refers to a writable integer.
A more generic rule to aid in understanding complex declarations and definitions works like this:
- find the identifier of the declaration in question
- read as far as possible to the right (i.e., until the end of the declaration or to the next closing parenthesis, whichever comes first)
- back up to where reading began, and read backwards to the left (i.e., until the beginning of the declaration or to the open-parenthesis matching the closing parenthesis found in the previous step)
- once reaching the beginning of the declaration, finish. If not, continue at step 2, beyond the closing parenthesis that was matched last.
When reading to the left, it is important to read the elements from right to left. For example, <code>const int*</code> becomes a pointer to a <code>const int</code>, not a <code>const</code> pointer to an <code>int</code>.
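The right-to-left reading can be checked mechanically with the standard type traits. The sketch below assumes C++17; the alias names and <code>demo_pointers</code> are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

using PtrToConst      = const int*;        // pointer to const int
using ConstPtr        = int* const;        // const pointer to int
using ConstPtrToConst = const int* const;  // const pointer to const int

// Everything left of the star describes the pointee...
static_assert(std::is_const_v<std::remove_pointer_t<PtrToConst>>);
static_assert(!std::is_const_v<std::remove_pointer_t<ConstPtr>>);
// ...and everything right of the star describes the pointer itself.
static_assert(!std::is_const_v<PtrToConst>);
static_assert(std::is_const_v<ConstPtr>);
static_assert(std::is_const_v<ConstPtrToConst>);
// Placing const before or after the pointee type names the same type.
static_assert(std::is_same_v<const int*, int const*>);

int demo_pointers() {
    int a = 1, b = 2;
    const int* cp = &a;
    cp = &b;             // OK: the pointer can be reseated
    int* const pc = &a;
    *pc = 10;            // OK: the pointee can be modified
    assert(a == 10);
    return *cp + a;
}
```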
In some cases C/C++ allows the <code>const</code> keyword to be placed to the right of the type. Here are some examples:
<syntaxhighlight lang="cpp">
const int* cp; // equivalent to int const* cp
const int* const cpc; // equivalent to int const* const cpc
</syntaxhighlight>
Although C/C++ allows such definitions (which closely match the English language when reading the definitions from left to right), the compiler still reads the definitions according to the abovementioned procedure: from right to left. But putting <code>const</code> before what must be constant quickly introduces mismatches between what is intended to be written and what the compiler decides was written. Consider pointers to pointers:
<syntaxhighlight lang="cpp">
int** pp; // a pointer to a pointer to ints
const int** cpp; // a pointer to a pointer to constant int value (not a pointer to a constant pointer to ints)
int* const* pcp; // a pointer to a const pointer to int values (not a constant pointer to a pointer to ints)
int** const ppc; // a constant pointer to pointers to ints (ppc, the identifier, being const makes no sense)
const int** const cppc; // a constant pointer to pointers to constant int values
</syntaxhighlight>
Some choose to write the pointer symbol on the variable, reasoning that attaching it to the type is potentially confusing, as it strongly suggests a pointer "type" which is not necessarily the case in C.
<syntaxhighlight lang="cpp">
// two ways:
int* a; // left-aligned asterisk
int *a; // right-aligned asterisk
int* a, b; // confusing (a is a pointer to an int but b is merely an int)
int *a, b; // a is a pointer to an int and b is an int
int* a, *b; // ugly (a and b are both pointers to ints, but this is an awkward way to write)
int *a, *b;
</syntaxhighlight>
Note that in C#, the asterisk is part of the type rather than of each declarator, so repeating it on every variable name is illegal.
<syntaxhighlight lang="csharp">
int *a, *b; // illegal in C#
int* a, b; // declares two int*
</syntaxhighlight>
Bjarne Stroustrup's FAQ recommends only declaring one variable per line if using the C++ convention, to avoid this issue.
The same considerations apply to defining references and rvalue references:
<syntaxhighlight lang="cpp">
int i = 22;
const int& cr = i;
const int& cr2 = i, cr3 = i;
// confusing:
// cr2 is a reference, but cr3 isn't:
// cr3 is a constant int initialized with i's value
// error: a reference can never be reseated, so const cannot be applied to the reference itself.
int& const rc = i;
// C++:
int&& rref = int(5), value = 10; // confusing: rref is an rvalue reference, but value is a mere int.
int &&rref = int(5), value = 10;
</syntaxhighlight>
More complicated declarations are encountered when using multidimensional arrays and references (or pointers) to pointers. Although it is sometimes argued that such declarations are confusing and error-prone and that they therefore should be avoided or be replaced by higher-level structures, the procedure described at the top of this section can always be used without introducing ambiguities or confusion.
Parameters and variables
<code>const</code> can be declared both on function parameters and on variables (static or automatic, including global or local). The interpretation varies between uses. A <code>const</code> static variable (global variable or static local variable) is a constant, and may be used for data like mathematical constants, such as <code>const double PI = 3.14159</code> (in practice with more digits), or overall compile-time parameters. A <code>const</code> automatic variable (non-static local variable) means that single assignment is happening, though a different value may be used each time, such as <code>const int x_squared = x * x</code>. A <code>const</code> parameter in pass-by-reference means that the referenced value is not modified, which is part of the contract. A <code>const</code> parameter in pass-by-value (or the pointer itself, in pass-by-reference) adds nothing to the interface, since the value has been copied; it only indicates that, internally, the function does not modify its local copy of the parameter (a single assignment). For this reason, some favor using <code>const</code> in parameters only for pass-by-reference, where it changes the contract, but not for pass-by-value, where it exposes the implementation.
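In C++ the difference between the two kinds of parameter shows up in the function type itself. The sketch below assumes C++17; <code>square</code>, <code>count_spaces</code>, and <code>demo_parameters</code> are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Top-level const on a by-value parameter is not part of the function type.
static_assert(std::is_same_v<int(int), int(const int)>);
// const behind a reference is part of the function type (and of the contract).
static_assert(!std::is_same_v<int(int&), int(const int&)>);

// const here only stops the function body from reassigning its own copy.
int square(const int x) { return x * x; }

// const here promises the caller that text is not modified.
std::size_t count_spaces(const std::string& text) {
    std::size_t n = 0;
    for (const char ch : text) {
        if (ch == ' ') ++n;
    }
    return n;
}

void demo_parameters() {
    const std::string sentence = "const by reference";
    assert(count_spaces(sentence) == 2);
    assert(square(7) == 49);
}
```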
C++
In C++, there are four different kinds of <code>const</code>:
- <code>const</code>, the traditional runtime constant
- <code>constexpr</code>, a constant or expression which can be evaluated at compile time (see constexpr)
- <code>consteval</code>, which declares that any call to a function must produce a value which is a compile-time expression
- <code>constinit</code>, which declares that the expression has static (constant) initialisation
Methods
In order to take advantage of the design by contract approach for user-defined types (structs and classes), which can have methods as well as member data, the programmer may tag instance methods as <code>const</code> if they don't modify the object's data members.
Applying the <code>const</code> qualifier to instance methods thus is an essential feature for const-correctness, and is not available in many other object-oriented languages such as Java and C# or in Microsoft's C++/CLI or Managed Extensions for C++.
While <code>const</code> methods can be called by <code>const</code> and non-<code>const</code> objects alike, non-<code>const</code> methods can only be invoked by non-<code>const</code> objects.
The <code>const</code> modifier on an instance method applies to the object pointed to by the "<code>this</code>" pointer, which is an implicit argument passed to all instance methods.
Thus having <code>const</code> methods is a way to apply const-correctness to the implicit "<code>this</code>" pointer argument just like other arguments.
This example illustrates:
<syntaxhighlight lang="cpp">
class Integer {
private:
    int i;
public:
    // note the const tag
    [[nodiscard]]
    int get() const noexcept {
        return i;
    }

    // Note the lack of "const"
    void set(int j) noexcept {
        i = j;
    }
};

void foo(Integer& nonConstInteger, const Integer& constInteger) {
    int y = nonConstInteger.get(); // OK
    int x = constInteger.get(); // OK: get() is const

    nonConstInteger.set(10); // OK: nonConstInteger is modifiable
    constInteger.set(10); // Error! set() is a non-const method and constInteger is a const-qualified object
}
</syntaxhighlight>
In the above code, the implicit "<code>this</code>" pointer to <code>set()</code> has the type "<code>Integer* const</code>"; whereas the "<code>this</code>" pointer to <code>get()</code> has type "<code>Integer const* const</code>", indicating that the method cannot modify its object through the "<code>this</code>" pointer.
Often the programmer will supply both a <code>const</code> and a non-<code>const</code> method with the same name (but possibly quite different uses) in a class to accommodate both types of callers. Consider:
<syntaxhighlight lang="cpp">
import std;
using std::array;
class IntegerArray {
private:
    array<int, 100> data;
public:
    [[nodiscard]]
    int& get(int i) noexcept {
        return data[i];
    }

    [[nodiscard]]
    int const& get(int i) const noexcept {
        return data[i];
    }
};

void foo(IntegerArray& a, const IntegerArray& ca) {
    // Get a reference to an array element
    // and modify its referenced value.
    a.get(5) = 42; // OK! (Calls: int& IntegerArray::get(int))
    ca.get(5) = 42; // Error! (Calls: int const& IntegerArray::get(int) const)
}
</syntaxhighlight>
The <code>const</code>-ness of the calling object determines which version of <code>IntegerArray::get()</code> will be invoked, and thus whether the caller is given a reference with which they can manipulate the private data in the object or only observe it.
The two methods technically have different signatures because their "<code>this</code>" pointers have different types, allowing the compiler to choose the right one. (Returning a <code>const</code> reference to an <code>int</code>, instead of merely returning the <code>int</code> by value, may be overkill in the second method, but the same technique can be used for arbitrary types, as in the Standard Template Library.)
Loopholes to const-correctness
There are several loopholes to pure const-correctness in C and C++. They exist primarily for compatibility with existing code.
The first, which applies only to C++, is the use of <code>const_cast</code>, which allows the programmer to strip the <code>const</code> qualifier, making any object modifiable.
The necessity of stripping the qualifier arises when using existing code and libraries that cannot be modified but which are not const-correct. For instance, consider this code:
<syntaxhighlight lang="cpp">
// Prototype for a function which we cannot change but which
// we know does not modify the pointee passed in.
void myLibFunc(int* p, int size);
void callLibFunc(const int* p, int size) {
myLibFunc(p, size); // Error! Drops const qualifier
int* nonConstPtr = const_cast<int*>(p); // Strip qualifier
myLibFunc(nonConstPtr, size); // OK
}
</syntaxhighlight>
However, any attempt to modify an object that is itself declared <code>const</code> by means of a const cast results in undefined behavior according to the ISO C++ Standard.
In the example above, if <code>p</code> references a global, local, or member variable declared as <code>const</code>, or an object allocated on the heap via <code>new int const</code>, the code is only correct if <code>myLibFunc</code> really does not modify the value pointed to by <code>p</code>.
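The boundary between defined and undefined use of <code>const_cast</code> can be sketched as follows. The function <code>legacy_sum</code> is a hypothetical stand-in for a library routine that is not const-correct but only reads its input.

```cpp
#include <cassert>

// Hypothetical legacy function: not const-correct, but it only reads.
int legacy_sum(int* values, int size) {
    int total = 0;
    for (int k = 0; k < size; ++k) total += values[k];
    return total;
}

// Const-correct wrapper: stripping const is safe only because
// legacy_sum never writes through the pointer.
int sum(const int* values, int size) {
    return legacy_sum(const_cast<int*>(values), size);
}

int demo_const_cast() {
    const int readings[] = {1, 2, 3};  // declared const: must never be written
    int scratch[] = {4, 5};            // not declared const

    const int* view = scratch;
    int* writable = const_cast<int*>(view);
    writable[0] = 10;  // defined behavior: the underlying object is not const
    assert(scratch[0] == 10);

    // const_cast<int*>(readings)[0] = 0;  // undefined behavior: readings is const
    return sum(readings, 3) + sum(scratch, 2);
}
```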
C needs a further loophole for one particular situation. Variables with static storage duration may be defined with an initial value, but the initializer may use only constants such as string constants and other literals; it may not use non-constant elements such as variable names, regardless of whether the initializer elements or the static-duration variable itself are declared <code>const</code>. There is a non-portable way to initialize a <code>const</code> variable that has static storage duration: by carefully constructing a typecast on the left-hand side of a later assignment, a <code>const</code> variable can be written to, effectively stripping away the <code>const</code> attribute and "initializing" it with non-constant elements such as other <code>const</code> variables. Writing into a <code>const</code> variable this way may work as intended, but it causes undefined behavior and seriously contradicts const-correctness:
<syntaxhighlight lang="cpp">
constexpr size_t BUFFER_SIZE = 8 * 1024;
const size_t userTextBufferSize; // initial value depends on const BUFFER_SIZE, can't be initialized here
...
int setupUserTextBox(TextBox* defaultTextBoxType, Rectangle* defaultTextBoxLocation) {
    *(size_t*)&userTextBufferSize = BUFFER_SIZE - sizeof(TextBoxControls); // warning: might work, but not guaranteed by C
    ...
}
</syntaxhighlight>
Another loophole applies both to C and C++. Specifically, the languages dictate that member pointers and references are "shallow" with respect to the <code>const</code>-ness of their owners – that is, a containing object that is <code>const</code> has all <code>const</code> members except that member pointees (and referees) are still mutable. To illustrate, consider this C++ code:
<syntaxhighlight lang="cpp">
struct MyStruct {
    int val;
    int* ptr;
};

void foo(const MyStruct& s) {
    int i = 42;
    s.val = i; // Error: s is const, so val is a const int
    s.ptr = &i; // Error: s is const, so ptr is a const pointer to int
    *s.ptr = i; // OK: the data pointed to by ptr is always mutable, even though this is sometimes not desirable
}
</syntaxhighlight>
Although the object <code>s</code> passed to <code>foo()</code> is constant, which makes all of its members constant, the pointee accessible through <code>s.ptr</code> is still modifiable, though this may not be desirable from the standpoint of <code>const</code>-correctness because <code>s</code> might solely own the pointee.
For this reason, Meyers argues that the default for member pointers and references should be "deep" <code>const</code>-ness, which could be overridden by a <code>mutable</code> qualifier when the pointee is not owned by the container, but this strategy would create compatibility issues with existing code.
Thus, for historical reasons, this loophole remains open in C and C++.
The latter loophole can be closed by using a class to hide the pointer behind a <code>const</code>-correct interface, but such classes either do not support the usual copy semantics from a <code>const</code> object (implying that the containing class cannot be copied by the usual semantics either) or allow other loopholes by permitting the stripping of <code>const</code>-ness through inadvertent or intentional copying.
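A minimal sketch of such a wrapper class follows. It takes the first of the two routes described above, giving up copying altogether. <code>DeepPtr</code> and <code>Holder</code> are hypothetical names, the wrapper does not own or delete the pointee, and the sketch assumes C++17.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Non-owning pointer wrapper whose const-ness propagates to the pointee.
template <typename T>
class DeepPtr {
public:
    explicit DeepPtr(T* p) : ptr_(p) {}

    // Copying from a const DeepPtr would hand out a mutable alias,
    // so this sketch simply forbids copying.
    DeepPtr(const DeepPtr&) = delete;
    DeepPtr& operator=(const DeepPtr&) = delete;

    T& operator*() { return *ptr_; }              // mutable access
    const T& operator*() const { return *ptr_; }  // read-only access

private:
    T* ptr_;
};

struct Holder {
    DeepPtr<int> ptr;
};

// A const Holder now exposes a const pointee as well.
static_assert(std::is_same_v<decltype(*std::declval<Holder&>().ptr), int&>);
static_assert(std::is_same_v<decltype(*std::declval<const Holder&>().ptr), const int&>);

int demo_deep_const() {
    int value = 1;
    Holder h{DeepPtr<int>(&value)};
    *h.ptr = 7;              // OK: h is not const
    const Holder& view = h;
    // *view.ptr = 9;        // would not compile: the pointee is const through view
    assert(*view.ptr == 7);
    return value;
}
```

Because the copy operations are deleted, <code>Holder</code> itself can no longer be copied, which is the trade-off noted above.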
Finally, several functions in the C standard library violate const-correctness before C23, as they accept a pointer to a <code>const</code> character string and return a non-<code>const</code> pointer to a part of the same string. <code>strstr</code> and <code>strchr</code> are among these functions.
Some implementations of the C++ standard library, such as Microsoft's, try to close this loophole by providing two overloaded versions of some functions: a "<code>const</code>" version and a "non-<code>const</code>" version.
Problems
The use of the type system to express constancy leads to various complexities and problems, and has accordingly been criticized and not adopted outside the narrow C family of C, C++, and D. Java and C#, which are heavily influenced by C and C++, both explicitly rejected <code>const</code>-style type qualifiers, instead expressing constancy by keywords that apply to the identifier (<code>final</code> in Java, <code>const</code> and <code>readonly</code> in C#). Even within C and C++, the use of <code>const</code> varies significantly, with some projects and organizations using it consistently, and others avoiding it.
<code>strchr</code> problem
The <code>const</code> type qualifier causes difficulties when the logic of a function is agnostic to whether its input is constant or not, but returns a value which should be of the same qualified type as an input. In other words, for these functions, if the input is constant (const-qualified), the return value should be as well, but if the input is variable (not <code>const</code>-qualified), the return value should be as well. Because the type signature of these functions differs, it requires two functions (or potentially more, in case of multiple inputs) with the same logic – a form of generic programming.
This problem arises even for simple functions in the C standard library, notably <code>strchr</code>; this observation is credited by Ritchie to Tom Plum in the mid-1980s. The <code>strchr</code> function locates a character in a string; formally, it returns a pointer to the first occurrence of the character <code>c</code> in the string <code>s</code>, and in classic C (K&R C) its prototype is:
<syntaxhighlight lang="c">
char* strchr(char* s, int c);
</syntaxhighlight>
The <code>strchr</code> function does not modify the input string, but the return value is often used by the caller to modify the string, such as:
<syntaxhighlight lang="c">
if (p = strchr(q, '/')) {
    *p = ' ';
}
</syntaxhighlight>
Thus on the one hand the input string can be <code>const</code> (since it is not modified by the function), and if the input string is <code>const</code> the return value should be as well – most simply because it might return exactly the input pointer, if the first character is a match – but on the other hand the return value should not be <code>const</code> if the original string was not <code>const</code>, since the caller may wish to use the pointer to modify the original string.
In C++ this is done via function overloading, typically implemented via a template, resulting in two functions, so that the return value has the same <code>const</code>-qualified type as the input:
<syntaxhighlight lang="cpp">
char* strchr(char* s, int c);
const char* strchr(const char* s, int c);
</syntaxhighlight>
These can in turn be defined by a template:
<syntaxhighlight lang="cpp">
template <typename T>
T* strchr(T* s, int c) { ... }
</syntaxhighlight>
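A runnable sketch of the template approach is given below. To avoid clashing with the standard <code>strchr</code>, the function is given the hypothetical name <code>find_char</code>; the body is one straightforward way to implement the search, not the standard library's implementation. It assumes C++17.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// CharT is deduced as char or const char, so the return type
// carries the same const qualification as the argument.
template <typename CharT>
CharT* find_char(CharT* s, int c) {
    const char target = static_cast<char>(c);
    for (;; ++s) {
        if (*s == target) return s;      // also matches the terminator when c == 0
        if (*s == '\0') return nullptr;  // reached the end without a match
    }
}

static_assert(std::is_same_v<decltype(find_char(std::declval<char*>(), 0)), char*>);
static_assert(std::is_same_v<decltype(find_char(std::declval<const char*>(), 0)), const char*>);

int demo_find_char() {
    char path[] = "usr/bin";
    if (char* slash = find_char(path, '/')) {
        *slash = ' ';  // OK: the input was not const, so neither is the result
    }

    const char* fixed = "a/b";
    const char* hit = find_char(fixed, '/');
    // *hit = ' ';     // would not compile: the result is a pointer to const char
    assert(hit == fixed + 1);

    return path[3] == ' ' ? 1 : 0;
}
```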
In D this is handled via the <code>inout</code> keyword, which acts as a wildcard for const, immutable, or unqualified (variable), yielding:
