In the C programming language, data types constitute the semantics and characteristics of storage of data elements. They are expressed in the language syntax in the form of declarations for memory locations or variables. Data types also determine the types of operations or methods of processing of data elements.
The C language provides basic arithmetic types, such as integer and real number types, and syntax to build array and compound types. The C standard library contains additional definitions of support types that have additional properties, such as providing storage with an exact size, independent of the language implementation on specific hardware platforms.
Primary types
Main types
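The C language provides the four basic arithmetic type specifiers <code>char</code>, <code>int</code>, <code>float</code> and <code>double</code> (as well as the Boolean type <code>bool</code>), and the modifiers <code>signed</code>, <code>unsigned</code>, <code>short</code>, and <code>long</code>. The following table lists the permissible combinations in specifying a large set of storage size-specific declarations.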
{| class="wikitable"
|-
! Type
! Explanation
! Size (bits)
! Format specifier
! Range
! Suffix for decimal constants
|-
| <code>bool</code> || Boolean type, added in C23. (Previously added in C99 as <code>_Bool</code>, but its size and its range were not specified.) || 1 (exact) || <code>%d</code> || [<code>false</code>, <code>true</code>] ||
|-
| <code>char</code> || Smallest addressable unit of the machine that can contain the basic character set. It is an integer type. Actual type can be either signed or unsigned. It contains <code>CHAR_BIT</code> bits. Most platforms use two's complement, implying a range of the form [−2<sup>''m''−1</sup>, 2<sup>''m''−1</sup>−1] for an ''m''-bit signed type, e.g. [−128, 127] (<code>SCHAR_MIN == −128</code> and <code>SCHAR_MAX == 127</code>) for an 8-bit <code>signed char</code>. Since C23, the only representation allowed is two's complement, so a signed type whose minimum width is ''n'' bits covers at least [−2<sup>''n''−1</sup>, 2<sup>''n''−1</sup>−1]. || ≥8 || <code>%c</code> || [<code>CHAR_MIN</code>, <code>CHAR_MAX</code>] ||
|-
| <code>unsigned char</code> || Of the same size as <code>char</code>, but guaranteed to be unsigned. Contains at least the [0, 255] range. || ≥8 || <code>%c</code> || [0, <code>UCHAR_MAX</code>] ||
|-
| <code>short</code> || Short signed integer type. Capable of containing at least the [−32767, +32767] range. || ≥16 || <code>%hi</code> or <code>%hd</code> || [<code>SHRT_MIN</code>, <code>SHRT_MAX</code>] ||
|-
| <code>unsigned short</code> || Short unsigned integer type. Contains at least the [0, 65535] range. || ≥16 || <code>%hu</code> || [0, <code>USHRT_MAX</code>] ||
|-
| <code>unsigned int</code> || Basic unsigned integer type. Contains at least the [0, 65535] range. || ≥16 || <code>%u</code> || [0, <code>UINT_MAX</code>] || <code>u</code> or <code>U</code>
|}
In practice, <code>char</code> is usually 8 bits in size and <code>short</code> is usually 16 bits in size (as are their unsigned counterparts). This holds true for platforms as diverse as 1990s SunOS 4 Unix, Microsoft MS-DOS, modern Linux, and Microchip MCC18 for embedded 8-bit PIC microcontrollers. POSIX requires <code>char</code> to be exactly 8 bits in size.
Various rules in the C standard make <code>unsigned char</code> the basic type used for arrays suitable to store arbitrary non-bit-field objects: its lack of padding bits and trap representations, the definition of object representation, and the possibility of aliasing.
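The actual size and behavior of floating-point types also vary by implementation. The only requirement is that <code>long double</code> is not smaller than <code>double</code>, which is not smaller than <code>float</code>. Usually, the 32-bit and 64-bit IEEE 754 binary floating-point formats are used for <code>float</code> and <code>double</code> respectively.
The C99 standard includes new real floating-point types <code>float_t</code> and <code>double_t</code>, defined in <code><math.h></code>. They correspond to the types used for the intermediate results of floating-point expressions when <code>FLT_EVAL_METHOD</code> is 0, 1, or 2. These types may be wider than <code>long double</code>.
C99 also added complex types: <code>float _Complex</code>, <code>double _Complex</code>, <code>long double _Complex</code>. C11 added imaginary types (which were described in an informative annex of C99): <code>float _Imaginary</code>, <code>double _Imaginary</code>, <code>long double _Imaginary</code>. Including the header <code><complex.h></code> allows all these types to be written using <code>complex</code> and <code>imaginary</code> respectively.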
Boolean type
C99 added a Boolean data type <code>_Bool</code>. Additionally, the <code><stdbool.h></code> header defines <code>bool</code> as a convenient alias for this type, and also provides macros for <code>true</code> and <code>false</code>. <code>_Bool</code> functions similarly to a normal integer type, with one exception: any conversion to a <code>_Bool</code> gives 0 (false) if the value equals 0; otherwise, it gives 1 (true). This behavior exists to avoid integer overflows in implicit narrowing conversions. For example, in the following code:
<syntaxhighlight lang=C>
unsigned char b = 256;
if (b) {
    // do something
}
</syntaxhighlight>
Variable <code>b</code> evaluates to false if <code>unsigned char</code> has a size of 8 bits. This is because the value 256 does not fit in the data type, so only its lower 8 bits are kept, giving a zero value. However, changing the type to <code>_Bool</code> causes the previous code to behave normally:
<syntaxhighlight lang=C>
_Bool b = 256;
if (b) {
    // do something
}
</syntaxhighlight>
The type <code>_Bool</code> also ensures true values always compare equal to each other:
<syntaxhighlight lang=C>
_Bool a = 1;
_Bool b = 2;
if (a == b) {
    // this code will run
}
</syntaxhighlight>
In C23, <code>bool</code> (and its values <code>true</code> and <code>false</code>) became a core functionality of the language (making the contents of <code><stdbool.h></code> obsolescent), allowing for the following examples of code:
<syntaxhighlight lang=C>
bool b = true;
if (b) {
    // this code will run
}
</syntaxhighlight>
Bit-precise integer types
Since C23, the language allows the programmer to define integers that have a width of an arbitrary number of bits. Those types are specified as <code>_BitInt(N)</code>, where N is an integer constant expression that denotes the number of bits, including the sign bit for signed types, represented in two's complement. The maximum value of N is provided by <code>BITINT_MAXWIDTH</code> and is at least <code>ULLONG_WIDTH</code>. Therefore, the type <code>_BitInt(2)</code> (or <code>signed _BitInt(2)</code>) takes values from −2 to 1 while <code>unsigned _BitInt(2)</code> takes values from 0 to 3. The type <code>unsigned _BitInt(1)</code> also exists, being either 0 or 1, and has no equivalent signed type. A proposal for C2Y would lift this restriction and allow <code>signed _BitInt(1)</code>, which would have the possible values 0 and −1, removing the special case for a width of 1.
Size and pointer difference types
The C language specification includes the typedefs <code>size_t</code> and <code>ptrdiff_t</code> to represent memory-related quantities. Their size is defined according to the target processor's arithmetic capabilities, not the memory capabilities, such as available address space. Both of these types are defined in the <code><stddef.h></code> header.
<code>size_t</code> is an unsigned integer type used to represent the size of any object (including arrays) in the particular implementation. The <code>sizeof</code> operator yields a value of the type <code>size_t</code>. The maximum value of <code>size_t</code> is provided via <code>SIZE_MAX</code>, a macro constant which is defined in the <code><stdint.h></code> header. <code>size_t</code> is guaranteed to be at least 16 bits wide. Additionally, POSIX includes <code>ssize_t</code>, which is a signed integer type of the same width as <code>size_t</code>.
<code>ptrdiff_t</code> is a signed integer type used to represent the difference between pointers. The result is guaranteed to be meaningful only for pointers of the same type that point into the same array object (or one past its end); other pointer subtractions are not portable.
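The roles of the two types can be sketched as follows. This is a minimal illustration rather than code from any particular library; the helper names <code>sum_ints</code> and <code>element_distance</code> are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: sums `count` ints, using size_t for the
   element count and the loop index. */
long sum_ints(const int *values, size_t count) {
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Hypothetical helper: distance in elements between two pointers.
   Both pointers must point into the same array (or one past its end). */
ptrdiff_t element_distance(const int *first, const int *last) {
    return last - first;
}
```

For an array <code>int a[5]</code>, the element count passed to <code>sum_ints</code> would typically be written <code>sizeof a / sizeof a[0]</code>, which has type <code>size_t</code>. The distance returned by <code>element_distance</code> is negative when <code>last</code> precedes <code>first</code>, which is why a signed type is needed.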
Interface to the properties of the basic types
Information about the actual properties, such as size, of the basic arithmetic types, is provided via macro constants in two headers: <code><limits.h></code> header defines macros for integer types and <code><float.h></code> header defines macros for floating-point types. The actual values depend on the implementation.
Properties of integer types
- <code>CHAR_BIT</code> – size of the char type in bits, commonly referred to as the size of a byte (at least 8 bits)
- <code>SCHAR_MIN</code>, <code>SHRT_MIN</code>, <code>INT_MIN</code>, <code>LONG_MIN</code>, <code>LLONG_MIN</code><small>(C99)</small> – minimum possible value of signed integer types: signed char, signed short, signed int, signed long, signed long long
- <code>SCHAR_MAX</code>, <code>SHRT_MAX</code>, <code>INT_MAX</code>, <code>LONG_MAX</code>, <code>LLONG_MAX</code><small>(C99)</small> – maximum possible value of signed integer types: signed char, signed short, signed int, signed long, signed long long
- <code>UCHAR_MAX</code>, <code>USHRT_MAX</code>, <code>UINT_MAX</code>, <code>ULONG_MAX</code>, <code>ULLONG_MAX</code><small>(C99)</small> – maximum possible value of unsigned integer types: unsigned char, unsigned short, unsigned int, unsigned long, unsigned long long
- <code>CHAR_MIN</code> – minimum possible value of char
- <code>CHAR_MAX</code> – maximum possible value of char
- <code>MB_LEN_MAX</code> – maximum number of bytes in a multibyte character
- <code>BOOL_WIDTH</code> (C23) – bit width of <code>_Bool</code>, always 1
- <code>CHAR_WIDTH</code> (C23) – bit width of <code>char</code>; <code>CHAR_WIDTH</code>, <code>UCHAR_WIDTH</code> and <code>SCHAR_WIDTH</code> are equal to <code>CHAR_BIT</code> by definition
- <code>SCHAR_WIDTH</code>, <code>SHRT_WIDTH</code>, <code>INT_WIDTH</code>, <code>LONG_WIDTH</code>, <code>LLONG_WIDTH</code> (C23) – bit width of <code>signed char</code>, <code>short</code>, <code>int</code>, <code>long</code>, and <code>long long</code> respectively
- <code>UCHAR_WIDTH</code>, <code>USHRT_WIDTH</code>, <code>UINT_WIDTH</code>, <code>ULONG_WIDTH</code>, <code>ULLONG_WIDTH</code> (C23) – bit width of <code>unsigned char</code>, <code>unsigned short</code>, <code>unsigned int</code>, <code>unsigned long</code>, and <code>unsigned long long</code> respectively
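As an illustration of how these macros can be queried, the following sketch checks the minimum magnitudes that the standard guarantees and computes storage sizes in bits. The function names <code>storage_bits</code> and <code>minimum_limits_hold</code> are hypothetical, and only pre-C23 macros are used so that the example does not depend on a C23 compiler.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: storage size in bits of an object occupying
   `size_in_bytes` bytes, counting any padding bits. */
size_t storage_bits(size_t size_in_bytes) {
    return size_in_bytes * CHAR_BIT;
}

/* Hypothetical helper: returns 1 if the implementation meets the
   minimum limits required by the C standard, 0 otherwise.
   A conforming implementation always returns 1. */
int minimum_limits_hold(void) {
    return CHAR_BIT >= 8
        && SCHAR_MAX >= 127
        && UCHAR_MAX >= 255
        && SHRT_MAX >= 32767
        && USHRT_MAX >= 65535U
        && INT_MAX >= 32767
        && UINT_MAX >= 65535U
        && LONG_MAX >= 2147483647L;
}
```

A call such as <code>storage_bits(sizeof(int))</code> gives the number of bits an <code>int</code> occupies on the current implementation; the value is at least 16 but otherwise implementation-dependent.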
Properties of floating-point types
- <code>FLT_MIN</code>, <code>DBL_MIN</code>, <code>LDBL_MIN</code> – minimum normalized positive value of float, double, long double respectively
- <code>FLT_TRUE_MIN</code>, <code>DBL_TRUE_MIN</code>, <code>LDBL_TRUE_MIN</code> (C11) – minimum positive value of float, double, long double respectively
- <code>FLT_MAX</code>, <code>DBL_MAX</code>, <code>LDBL_MAX</code> – maximum finite value of float, double, long double, respectively
- <code>FLT_ROUNDS</code> – rounding mode for floating-point operations
- <code>FLT_EVAL_METHOD</code> (C99) – evaluation method of expressions involving different floating-point types
- <code>FLT_RADIX</code> – radix of the exponent in the floating-point types
- <code>FLT_DIG</code>, <code>DBL_DIG</code>, <code>LDBL_DIG</code> – number of decimal digits that can be represented without losing precision by float, double, long double, respectively
- <code>FLT_EPSILON</code>, <code>DBL_EPSILON</code>, <code>LDBL_EPSILON</code> – difference between 1.0 and the next representable value of float, double, long double, respectively
- <code>FLT_MANT_DIG</code>, <code>DBL_MANT_DIG</code>, <code>LDBL_MANT_DIG</code> – number of <code>FLT_RADIX</code>-base digits in the floating-point significand for types float, double, long double, respectively
- <code>FLT_MIN_EXP</code>, <code>DBL_MIN_EXP</code>, <code>LDBL_MIN_EXP</code> – minimum negative integer such that <code>FLT_RADIX</code> raised to a power one less than that number is a normalized float, double, long double, respectively
- <code>FLT_MIN_10_EXP</code>, <code>DBL_MIN_10_EXP</code>, <code>LDBL_MIN_10_EXP</code> – minimum negative integer such that 10 raised to that power is a normalized float, double, long double, respectively
- <code>FLT_MAX_EXP</code>, <code>DBL_MAX_EXP</code>, <code>LDBL_MAX_EXP</code> – maximum positive integer such that <code>FLT_RADIX</code> raised to a power one less than that number is a normalized float, double, long double, respectively
- <code>FLT_MAX_10_EXP</code>, <code>DBL_MAX_10_EXP</code>, <code>LDBL_MAX_10_EXP</code> – maximum positive integer such that 10 raised to that power is a normalized float, double, long double, respectively
- <code>DECIMAL_DIG</code> (C99) – minimum number of decimal digits such that any number of the widest supported floating-point type can be represented in decimal with a precision of <code>DECIMAL_DIG</code> digits and read back in the original floating-point type without changing its value. <code>DECIMAL_DIG</code> is at least 10.
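The meaning of the epsilon macros can be sketched with a small example. The function names are hypothetical, and <code>volatile</code> is used only to force each intermediate result to be stored in the named type on implementations that evaluate expressions in a wider format.

```c
#include <float.h>

/* Hypothetical helper: gap between 1.0f and the next representable
   float. The volatile stores force rounding to float. */
float float_step_above_one(void) {
    volatile float one = 1.0f;
    volatile float next = one + FLT_EPSILON;
    return next - one;
}

/* Hypothetical helper: the same gap for double. */
double double_step_above_one(void) {
    volatile double one = 1.0;
    volatile double next = one + DBL_EPSILON;
    return next - one;
}
```

Both functions are expected to return exactly the corresponding epsilon macro, since 1.0 plus epsilon is representable by definition.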
Fixed-width integer types
The C99 standard includes definitions of several new integer types to enhance the portability of programs.
Pointers
The name "pointer-to-<code>void</code>" does not imply it points to <code>void</code> memory (as <code>void</code> is an incomplete type with no size), but rather that it points to something of unspecified type. It must be converted to another pointer type before it can be dereferenced. A <code>void *</code> may not point to a function. A call to <code>malloc</code> returns <code>void *</code>, and <code>free</code> takes a <code>void *</code>.
C uses the concept of a null pointer to denote a pointer that does not refer to any valid data. The macro <code>NULL</code> is often used in place of a null pointer, relying on implicit type conversion when possible. However, this usage can be problematic and may be a source of programming errors. In particular, the expansion of <code>NULL</code> may have a pointer type or an integer type, depending on the implementation. C23 introduced the predefined constant <code>nullptr</code> and its type <code>nullptr_t</code> (which has the single value <code>nullptr</code>) to express a null pointer constant. <code>nullptr</code> is unambiguously a pointer, and may convert to any object or function pointer, and allows a specific <code>nullptr_t</code> case in <code>_Generic</code>. The size and alignment of this type is the same as for a pointer to character type (or <code>void *</code>), but other pointer types may still have a different size and alignment; thus not all null pointers are replaceable with <code>nullptr</code>.
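The following sketch illustrates both points: a <code>void *</code> must be converted to an object pointer type before the bytes it refers to can be accessed, and a null pointer can be tested by comparison with <code>NULL</code>. The helper names <code>swap_objects</code> and <code>is_null</code> are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: swaps two objects of `size` bytes each.
   The void * arguments are converted to unsigned char * so that
   the object representations can be accessed byte by byte. */
void swap_objects(void *a, void *b, size_t size) {
    unsigned char *pa = a; /* implicit conversion from void * */
    unsigned char *pb = b;
    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}

/* Hypothetical helper: returns 1 if p is a null pointer. */
int is_null(const void *p) {
    return p == NULL;
}
```

Because any object pointer converts implicitly to <code>void *</code>, <code>swap_objects(&x, &y, sizeof x)</code> works for two objects <code>x</code> and <code>y</code> of the same type without a cast.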
Arrays
For every type <code>T</code>, except void and function types, there exist the types "array of <code>N</code> elements of type <code>T</code>". An array is a collection of values, all of the same type, stored contiguously in memory. An array of size <code>N</code> is indexed by integers from <code>0</code> up to and including <code>N − 1</code>. Here is a brief example:
<syntaxhighlight lang=C>
int a[10]; // array of 10 elements, each of type int
</syntaxhighlight>
Arrays can be initialized with a compound initializer, but not assigned. Arrays are passed to functions by passing a pointer to the first element. Multidimensional arrays are defined as "array of array ...", and all except the outermost dimension must have compile-time constant size:
<syntaxhighlight lang=C>
int aa[10][8]; // array of 10 elements, each of type 'array of 8 int elements'
</syntaxhighlight>
In C, a string is often stored as an array of <code>char</code> (<code>char[]</code>), but this is distinct from a pointer-to-<code>char</code> (<code>char *</code>). A <code>char[]</code> cannot be reassigned and lives where it is defined, although its elements can be modified. A <code>char *</code> can be reassigned to point elsewhere, but when it points to a string literal, the characters it points to must not be modified.
<syntaxhighlight lang="c">
char s[] = "Hello, world!";
char *p = s; // s decays to &s[0], p points to the first character
</syntaxhighlight>
Even if a function declares a parameter as <code>T[]</code>, the parameter is adjusted to <code>T *</code>, and an array passed as the argument decays to a pointer to its first element.
<syntaxhighlight lang="c">
void f(int a[]);
// this is the same as:
void f(int *a);
</syntaxhighlight>
Indexing an array is defined in terms of pointer arithmetic, that is, <code>a[i]</code> is equivalent to <code>*(a + i)</code>.
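The equivalence can be illustrated with two functions that read the same element, one with subscript syntax and one with explicit pointer arithmetic. The function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: reads element i with the subscript operator. */
int nth_by_index(const int a[], size_t i) {
    return a[i];
}

/* Hypothetical helper: reads element i with pointer arithmetic.
   The parameter has the same type as in nth_by_index, because an
   array parameter is adjusted to a pointer. */
int nth_by_pointer(const int *a, size_t i) {
    return *(a + i);
}
```

For any valid index, both functions return the same value.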
Enums
In C, an enum is an integer type whose values are restricted to a set of named constants. Enums cannot be forward-declared. Enums may also directly be assigned values, and are commonly used with <code>switch</code> to enumerate multiple cases.
<syntaxhighlight lang="c">
#include <stddef.h>

enum Status {
    OK = 200,
    NOT_FOUND = 404,
    SERVER_ERROR = 500
};

const char *get_status_string(enum Status status) {
    switch (status) {
        case OK:
            return "Success";
        case NOT_FOUND:
            return "Not found";
        case SERVER_ERROR:
            return "Server error";
        default:
            unreachable(); // C23 macro from <stddef.h>; reaching it is undefined behavior
    }
}
</syntaxhighlight>
Enum constants are known at compile time, and are often a safer means to define integral constants than macros. An enum's underlying type is usually <code>int</code>, but since C23 it can be specified explicitly as any integer type.
<syntaxhighlight lang="c">
enum Color : char {
    RED = 1,
    ORANGE,
    YELLOW,
    GREEN,
    BLUE,
    INDIGO,
    VIOLET
};
</syntaxhighlight>
Because enums are not type-safe, a variable of one enum type can be assigned a value of another enum type. A variable of enum type can also be assigned a value outside the set of its named constants.
<syntaxhighlight lang="c">
enum Color color = YELLOW; // YELLOW = 3
enum Month month = JUNE; // JUNE = 6
color = month; // color is now 6 (INDIGO)
month = 999; // there may not be a value of 999 in enum Month, but still allowed
</syntaxhighlight>
Structs
Structures (structs) aggregate the storage of multiple data items, of potentially differing data types, into one contiguous memory block referenced by a single variable. Members may possibly be padded for memory alignment, and thus it is often recommended to order fields from largest to smallest size for efficient memory usage.
<syntaxhighlight lang=C>
struct Student {
    char name[50];
    unsigned int id;
    unsigned int semester;
    float gpa;
};
// Positional initialization - values match field order
struct Student alice = { "Alice", 123, 2, 3.8f };
// Designated initializer (since C99)
struct Student bob = {
    .name = "Bob",
    .id = 246,
    .semester = 1,
    .gpa = 3.9f
};
</syntaxhighlight>
Structs may also use bit fields to allow fields to share the same storage units, but layouts are implementation-defined.
<syntaxhighlight lang="c">
struct Properties {
    // three fields can be compactly packed in one byte
    unsigned char visible : 1; // visible occupies 1 bit
    unsigned char color : 3;   // color occupies 3 bits
    unsigned char size : 4;    // size occupies 4 bits
};
</syntaxhighlight>
The memory layout of a struct is decided by the language implementation for each platform, with a few restrictions. The memory address of the first member must be the same as the address of the structure itself. Structs may be initialized or assigned to using compound literals. A function may directly return a struct, although this is often not efficient at run-time. Since C99, a struct may also end with a flexible array member.
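The layout rules can be observed with <code>sizeof</code> and the <code>offsetof</code> macro from <code><stddef.h></code>. The struct <code>Mixed</code> and the helper names below are hypothetical and serve only as an illustration; the amount of padding reported depends on the implementation.

```c
#include <stddef.h>

/* Hypothetical struct whose members have different alignment needs. */
struct Mixed {
    char tag;
    double value;
    char flag;
};

/* Hypothetical helper: returns 1 if the address of the struct equals
   the address of its first member, as the standard requires. */
int first_member_shares_address(const struct Mixed *m) {
    return (const void *)m == (const void *)&m->tag;
}

/* Hypothetical helper: number of padding bytes the implementation
   inserted into struct Mixed (implementation-dependent). */
size_t padding_in_mixed(void) {
    return sizeof(struct Mixed) - (2 * sizeof(char) + sizeof(double));
}
```

On many 64-bit platforms <code>padding_in_mixed()</code> is nonzero, because <code>value</code> must be aligned and the struct size is rounded up to a multiple of its alignment; this is the overhead that reordering members can reduce.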
Functions may take a struct as a parameter by value, but this can be expensive for large structs because the entire struct is copied. Passing a pointer is often preferable, since only the pointer itself (typically 4 or 8 bytes) is copied.
<syntaxhighlight lang="c">
#include <stdio.h>

// passing by value: the whole struct is copied into s
void print_student_by_value(struct Student s) {
    printf("Name: %s, ID: %u, in semester %u, with GPA: %.2f\n", s.name, s.id, s.semester, s.gpa);
}

// passing by pointer: only the address is copied; const marks s as read-only here
void print_student_by_pointer(const struct Student *s) {
    printf("Name: %s, ID: %u, in semester %u, with GPA: %.2f\n", s->name, s->id, s->semester, s->gpa);
}
</syntaxhighlight>
Structs can be composed of other structs:
<syntaxhighlight lang="c">
struct Date {
    int year;
    int month;
    int day;
};
struct Birthday {
    char name[50];
    struct Date dob;
};
</syntaxhighlight>
A struct containing a pointer to a struct of its own type is commonly used to build linked data structures:
<syntaxhighlight lang=C>
struct LinkedList {
    void *item; // stores the current item
    struct LinkedList *next; // stores the next list, or NULL if nothing next
};
</syntaxhighlight>
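A minimal sketch of how such a self-referential struct might be used is shown below. The helper functions <code>list_push_front</code> and <code>list_length</code> are hypothetical; the caller provides the storage for each node, so no memory allocation is involved.

```c
#include <stddef.h>

struct LinkedList {
    void *item;              /* stores the current item */
    struct LinkedList *next; /* next node, or NULL at the end */
};

/* Hypothetical helper: links `node` in front of `head` and returns
   the new head. The caller owns the storage of `node`. */
struct LinkedList *list_push_front(struct LinkedList *head,
                                   struct LinkedList *node,
                                   void *item) {
    node->item = item;
    node->next = head;
    return node;
}

/* Hypothetical helper: counts nodes by following next pointers
   until the terminating NULL. */
size_t list_length(const struct LinkedList *head) {
    size_t count = 0;
    for (; head != NULL; head = head->next) {
        count++;
    }
    return count;
}
```

Since <code>item</code> is a <code>void *</code>, the list can refer to objects of any type, but the caller must convert the pointer back to the correct type before dereferencing it.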
Unions
A union type is a special construct that permits access to the same memory block by using a choice of differing type descriptions.
<syntaxhighlight lang="c">
// holds either an integer or floating point value
union Number {
    int i;
    float f;
} d;
d.i = 10; // d now holds 10
d.f = 3.14f; // d now holds 3.14, overwriting 10
</syntaxhighlight>
In the following example, a union of data types may be declared to permit reading the same data either as an integer, a float, or any other user declared type:
<syntaxhighlight lang=C>
union {
    int i;
    float f;
    struct {
        unsigned int u;
        double d;
    } s;
} u;
</syntaxhighlight>
The total size of <code>u</code> is the size of <code>u.s</code>, since <code>s</code> is larger than both <code>i</code> and <code>f</code>. The size of <code>u.s</code> is in turn at least the sum of the sizes of <code>u.s.u</code> and <code>u.s.d</code>, plus any padding inserted for alignment. When assigning something to <code>u.i</code>, some parts of <code>u.f</code> may be preserved if <code>u.i</code> is smaller than <code>u.f</code>.
Reading from a union member is not the same as casting since the value of the member is not converted, but merely read.
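The difference between reading through a union and casting can be sketched as follows. The union <code>FloatBits</code> and both function names are hypothetical, and the bit pattern mentioned afterwards assumes that <code>float</code> is a 32-bit IEEE 754 binary32 value, which is common but not required by the standard.

```c
#include <stdint.h>

/* Hypothetical union overlaying a float with a 32-bit integer.
   Assumes float is 32 bits wide. */
union FloatBits {
    float f;
    uint32_t u;
};

/* Reads the stored bytes of `value` as an integer: no conversion. */
uint32_t reinterpret_float(float value) {
    union FloatBits bits;
    bits.f = value;
    return bits.u;
}

/* Converts the numeric value to an integer, discarding the fraction.
   Only meaningful for non-negative values that fit in uint32_t. */
uint32_t convert_float(float value) {
    return (uint32_t)value;
}
```

Under the stated assumption, <code>convert_float(1.0f)</code> yields 1, whereas <code>reinterpret_float(1.0f)</code> yields the bit pattern <code>0x3F800000</code>.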
Function pointers
Function pointers allow referencing functions with a particular signature. For example, to store the address of the standard function <code>abs</code> in the variable <code>my_int_f</code>:
<syntaxhighlight lang=C>
int (*my_int_f)(int) = &abs;
// the & operator can be omitted, but makes clear that the "address of" abs is used here
</syntaxhighlight>
Function pointers are invoked by name just like normal function calls.
<syntaxhighlight lang="c">
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int (*my_abs)(int) = &abs;
    int x = -42;
    int abs_of_x = my_abs(x); // calls abs through the pointer
    printf("abs(%d) = %d\n", x, abs_of_x);
    return 0;
}
</syntaxhighlight>
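Function pointers are commonly passed as arguments so that the caller can supply behavior. The sketch below shows a hypothetical higher-order helper, <code>apply_twice</code>, and a comparison callback passed to the standard function <code>qsort</code>; the names <code>compare_ints</code> and <code>sort_ints</code> are also hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: applies `op` to `value` two times. */
int apply_twice(int (*op)(int), int value) {
    return op(op(value));
}

/* Comparison callback in the form qsort expects: negative, zero,
   or positive for less than, equal to, or greater than. */
static int compare_ints(const void *lhs, const void *rhs) {
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b);
}

/* Hypothetical helper: sorts `count` ints in ascending order by
   passing compare_ints to qsort as a function pointer. */
void sort_ints(int *values, size_t count) {
    qsort(values, count, sizeof values[0], compare_ints);
}
```

A function name used as an argument, as in <code>apply_twice(abs, -7)</code>, converts to a pointer to that function without an explicit <code>&</code>.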
Type qualifiers
The aforementioned types can be characterized further by type qualifiers, yielding a qualified type. As of C11, there are four type qualifiers in standard C:
- <code>const</code> (C89)
- <code>volatile</code> (C89)
- <code>restrict</code> (C99)
- <code>_Atomic</code> (C11): this qualifier has a private name to avoid clashing with user names; the <code><stdatomic.h></code> header provides more ordinary type names such as <code>atomic_int</code>.
Of these, <code>const</code> is by far the best-known and most used, appearing in the C standard library and encountered in any significant use of the C language, where code is expected to be const-correct. The other qualifiers are primarily used for low-level programming: <code>volatile</code> is intended to suppress compiler optimisations on a variable by indicating that it may change at any time, <code>restrict</code> indicates that the object pointed to by a pointer is accessed only through that pointer, and <code>_Atomic</code> indicates that the object is "atomic", i.e. reads and writes are indivisible.
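The two qualifiers most often seen in function signatures can be sketched as follows. Both function names are hypothetical. In <code>add_arrays</code>, <code>restrict</code> is a promise by the caller that the three arrays do not overlap; the compiler does not check it.

```c
#include <stddef.h>

/* Hypothetical helper: counts occurrences of `wanted` in a string.
   const promises that the function does not modify the characters. */
size_t count_char(const char *text, char wanted) {
    size_t count = 0;
    for (; *text != '\0'; text++) {
        if (*text == wanted) {
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: element-wise sum of two arrays of n ints.
   restrict tells the compiler that out, a, and b refer to
   non-overlapping storage for the duration of the call. */
void add_arrays(size_t n, int *restrict out,
                const int *restrict a, const int *restrict b) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

Calling <code>add_arrays</code> with overlapping arrays would break the promise made by <code>restrict</code> and results in undefined behavior.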
See also
- C syntax
- Uninitialized variable
- Integer (computer science)
- Offsetof
