Types in C Language
In C, every piece of data has a type that the compiler must understand to properly operate on it. The “type” refers to the shared characteristics of similar data, allowing you to know its properties and operations once its type is known.
There are three basic data types: char (character), int (integer), and float (floating-point). More complex types are built from these.
Character Type
The character type represents a single character and is declared with the char keyword.
    char c = 'B';
In this example, c is declared as a character type and assigned the value ‘B’. In C, character constants must be enclosed in single quotes.
Internally, characters are stored in one byte (8 bits). C treats them as integers, so a character is essentially an integer with a width of one byte. Each character corresponds to an integer (defined by the ASCII code), with ‘B’ corresponding to the integer 66.
Different systems may have different default ranges for the char type. Some systems use -128 to 127, while others use 0 to 255. Both ranges cover the ASCII range of 0 to 127.
Integers within the char range can be used interchangeably with characters and assigned to char variables.
    char c = 66;
In this example, assigning 66 to c has the same effect as assigning ‘B’.
Character variables can also participate in arithmetic operations.
    char a = 'B'; // Equivalent to char a = 66;
    char b = 'C'; // Equivalent to char b = 67;

    printf("%d\n", a + b); // Prints 133
Here, adding a and b is treated like adding two integers, and %d prints the result as a decimal integer, yielding 133.
Single quotes themselves are characters, so to represent a single quote in a character constant, you need to escape it.
    char t = '\'';
In this case, t holds the single quote character. Since character constants must be in single quotes, the internal single quote needs to be escaped.
Escape sequences are also used to represent unprintable control characters, which are part of the character type values:
- \a: Alert (causes an alert sound or visual flash)
- \b: Backspace (moves the cursor back one character without deleting it)
- \f: Form feed (moves the cursor to the next page; in modern systems, it behaves like \v)
- \n: Newline
- \r: Carriage return (moves the cursor to the start of the line)
- \t: Horizontal tab (moves the cursor to the next tab stop, usually every 8 characters)
- \v: Vertical tab (moves the cursor to the next vertical tab stop, usually the same column on the next line)
- \0: Null character (the character whose value is 0; different from the digit character '0')
Escape sequences can also use octal and hexadecimal notation:
- \nn: Octal representation of the character, where nn is the octal value
- \xnn: Hexadecimal representation of the character, where nn is the hexadecimal value
    char x = 'B';
    char x = 66;
    char x = '\102'; // Octal
    char x = '\x42'; // Hexadecimal
All four of these declarations are equivalent.
Integer Types
Overview
In C, integer types are used to represent whole numbers. The type is declared using the int keyword.
    int a;
In the example above, an integer variable a is declared.
The size of the int type can vary between different computers. Commonly, an int is stored in 4 bytes (32 bits), but it can also be 2 bytes (16 bits) or 8 bytes (64 bits). The ranges of integers that these types can represent are as follows:
- 16-bit: -32,768 to 32,767
- 32-bit: -2,147,483,648 to 2,147,483,647
- 64-bit: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
Signed and Unsigned
In C, the signed keyword indicates that a type can hold both positive and negative values. Conversely, the unsigned keyword means the type can only hold zero and positive values.
By default, int is signed, which means int is equivalent to signed int. Although the signed keyword is usually omitted, it is not incorrect to include it.
    signed int a;
    // Equivalent to
    int a;
To declare an int without a sign (only non-negative values), use the unsigned keyword:
    unsigned int a;
An unsigned int can represent a larger maximum value compared to a signed int of the same size. For instance, a 16-bit signed int has a maximum value of 32,767, while an unsigned int can go up to 65,535.
The int in unsigned int can be omitted:
    unsigned a;
The char type can also be signed or unsigned:
    signed char c;   // Range: -128 to 127
    unsigned char c; // Range: 0 to 255
Note that the default sign of char (whether it is signed or unsigned) is system-dependent. Unlike int, char is not guaranteed to be signed or unsigned.
Integer Subtypes
When an int type uses 4 or 8 bytes, it may be overkill for small integers. On the other hand, if larger integers are needed, 8 bytes might not be sufficient. To address these issues, C provides three subtypes of integers, allowing for more precise control over the integer range:
- short int (or short): Uses no more space than int, generally 2 bytes (range: -32,768 to 32,767).
- long int (or long): Uses at least as much space as int, and at least 4 bytes.
- long long int (or long long): Uses at least as much space as long, and at least 8 bytes.
    short int a;
    long int b;
    long long int c;
By default, short, long, and long long are signed. You can declare them as unsigned to double the maximum value they can represent:
    unsigned short int a;
    unsigned long int b;
    unsigned long long int c;
The int in these declarations can be omitted:
    short a;
    unsigned short a;

    long b;
    unsigned long b;

    long long c;
    unsigned long long c;
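The size of data types can vary between computers. The standard only guarantees minimum widths: at least 16 bits for short and int, at least 32 bits for long, and at least 64 bits for long long. As a rule of thumb, use short for 16-bit integers, long for 32-bit integers, and long long for 64-bit integers. For 8-bit integers, use char.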
Limits of Integer Types
To determine the maximum and minimum values of integer types on your system, use constants from the header file limits.h. For example:
- SCHAR_MIN and SCHAR_MAX for signed char
- SHRT_MIN and SHRT_MAX for short
- INT_MIN and INT_MAX for int
- LONG_MIN and LONG_MAX for long
- LLONG_MIN and LLONG_MAX for long long
- UCHAR_MAX for unsigned char
- USHRT_MAX for unsigned short
- UINT_MAX for unsigned int
- ULONG_MAX for unsigned long
- ULLONG_MAX for unsigned long long
Using these constants ensures your code remains portable across different systems.
Integer Literals and Formats
By default, integers in C are written in decimal. To represent octal and hexadecimal numbers, use specific prefixes:
- Octal: Prefix with 0 (e.g., 017 or 0377).
    int a = 012; // Octal, equivalent to decimal 10
- Hexadecimal: Prefix with 0x or 0X (e.g., 0xf or 0X10).
    int a = 0x1A2B; // Hexadecimal, equivalent to decimal 6699
Some compilers also support binary literals with the 0b prefix (this was standardized only in C23; before that it was a compiler extension):
    int x = 0b101010;
Different bases are only for representation; they do not affect the actual storage of the integer, which is always in binary. You can mix bases in expressions, such as 10 + 015 + 0x20.
Use the following format specifiers with printf() to display integers in different bases:
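- %d: Decimal
- %o: Octal
- %x: Hexadecimal
- %#o: Octal with prefix 0
- %#x: Hexadecimal with prefix 0x
- %#X: Hexadecimal with prefix 0X
    int x = 100;
    printf("dec = %d\n", x);    // 100
    printf("octal = %o\n", x);  // 144
    printf("hex = %x\n", x);    // 64
    printf("octal = %#o\n", x); // 0144
    printf("hex = %#x\n", x);   // 0x64
    printf("hex = %#X\n", x);   // 0X64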
Floating Point Numbers
In programming, a floating point number is a value with a decimal point, represented in the form $m \times b^e$, where $m$ is the mantissa, $b$ is the base (usually 2), and $e$ is the exponent. This format balances precision and range, allowing for representation of very large or very small numbers.
In C, floating point types are declared using the float keyword. For example:
    float c = 10.5;
Here, c is a floating point variable. The float type typically uses 4 bytes (32 bits) in the IEEE 754 single-precision format: 1 bit for the sign, 8 bits for the exponent, and 23 bits for the mantissa. It provides at least 6 decimal digits of precision and can represent magnitudes from at least $10^{-37}$ to $10^{37}$.
For cases where more precision or range is needed, C offers two larger floating point types:
- double: Typically uses 8 bytes (64 bits) and provides about 15 decimal digits of precision (the standard guarantees at least 10).
- long double: Typically uses 16 bytes, but both its size and its precision vary by system.
Due to precision limits, floating point numbers are approximations, and calculations may not be exact. For example, in C, 0.1 + 0.2 does not exactly equal 0.3, but has a small error:
    if (0.1 + 0.2 == 0.3) // false
Floating point numbers can also be represented using scientific notation, where e separates the decimal part from the exponent:
    double x = 123.456e+3; // Equivalent to 123.456 x 10^3
    double x = 123.456e3;  // Same value, with the + omitted
The + in the exponent can be omitted, and spaces around e are not allowed. Leading or trailing zeros in the decimal part can be omitted:
    0.3E6 // Equivalent to .3E6
    3.0E6 // Equivalent to 3.E6
Boolean Type
Originally, C did not have a dedicated boolean type. Instead, 0 was used for false, and any non-zero value was considered true:
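    int x = 1;
    if (x) {
      printf("x is true!\n");
    }
With the C99 standard, _Bool was introduced to represent boolean values. It is an unsigned integer type that can only hold 0 (false) or 1 (true); assigning any non-zero value to it stores 1:
    _Bool isNormal;

    isNormal = 1;
    if (isNormal)
      printf("Everything is OK.\n");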
The header file stdbool.h defines a bool type alias and constants true and false:
    #include <stdbool.h>

    bool flag = false;
Literal Types
Literals are fixed values directly written into the code.
For example:
    int x = 123;
In this code, x is a variable, and 123 is a literal.
At compile time, literals are also written to memory, so the compiler must assign a data type to them just as it does for variables.
Typically, decimal integer literals (like 123) are assigned the int type by the compiler. If a number is too large for an int, the compiler assigns it long int, and if it exceeds long int, long long int (since C99). Only octal and hexadecimal literals without a suffix can also be given the unsigned variants (unsigned int, unsigned long, unsigned long long) when the value does not fit the signed type of the same width.
Floating-point literals (like 3.14) are assigned the double type.
Literal Suffixes
Sometimes, programmers want to specify a different type for a literal. For instance, the compiler assigns an integer literal to the int type by default, but if a programmer wants to assign it as a long, they can add the suffix l or L to the literal. This signals to the compiler to treat the literal as a long type.
    int x = 123L;
In the above example, the literal 123 has the suffix L, so the compiler will treat it as a long. You can also write it as 123l, but using L is recommended because the lowercase l can be easily confused with the number 1.
Octal and hexadecimal values can also use the l or L suffix to indicate they should be treated as long, such as 020L and 0x20L.
    int y = 0377L;
If you want to specify an unsigned integer, use the suffix u or U.
    int x = 123U;
The L and U suffixes can be combined to indicate an unsigned long. The order and case of L and U do not matter.
    int x = 123LU;
For floating-point numbers, the compiler defaults to the double type. If you want to specify a different type, add the suffix f or F (for float) or l or L (for long double) after the number.
You can also use suffixes with scientific notation.
    1.2345e+10F
To summarize, here are the common literal suffixes:
- f and F: float type.
- l and L: long int for integers, long double for floating-point numbers.
- ll and LL: long long int, such as 3LL.
- u and U: unsigned int, such as 15U or 0377U.
u can be combined with other integer suffixes, and the order doesn’t matter, e.g., 10UL, 10ULL, and 10LLU are all valid.
Here are some examples:
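    int x = 1234;
    long int x = 1234L;
    long long int x = 1234LL;

    unsigned int x = 1234U;
    unsigned long int x = 1234UL;
    unsigned long long int x = 1234ULL;

    float x = 3.14f;
    double x = 3.14;
    long double x = 3.14L;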
Overflow
Each data type has a defined range of values. If a value exceeds this range (either smaller than the minimum or larger than the maximum), an overflow occurs, requiring more binary space than available. If the value exceeds the maximum, it’s called an overflow; if it’s smaller than the minimum, it’s an underflow.
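Generally, the compiler won’t throw an error for overflows. For unsigned types, the excess binary bits are discarded and the value wraps around, leading to unexpected results. For signed types, overflow is undefined behavior, so the result cannot be relied on at all. Therefore, overflows should be avoided.
    unsigned char x = 255;
    x = x + 1;

    printf("%d\n", x); // 0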
In this example, adding 1 to x doesn’t result in 256, but rather 0. This happens because x is an unsigned char with a maximum value of 255 (binary 11111111). Adding 1 causes an overflow, and the highest bit in 256 (binary 100000000) is discarded, leaving 0.
Here’s another example:
    unsigned int ui = UINT_MAX; // 4,294,967,295
    ui++;
    printf("ui = %u\n", ui); // 0
    ui--;
    printf("ui = %u\n", ui); // 4,294,967,295
The constant UINT_MAX is the maximum value for unsigned int. Adding 1 causes an overflow, resulting in 0. Subtracting 1 from 0 returns UINT_MAX.
Overflows are easy to overlook because the compiler won’t issue warnings, so extra caution is needed.
    for (unsigned int i = n; i >= 0; --i) // Error
At first glance, this loop looks fine, but the variable i is an unsigned int, and its minimum value is 0. It cannot produce a value less than 0. When i reaches 0 and is decremented by 1, it doesn’t result in -1, but rather the maximum value for unsigned int, which is always greater than or equal to 0, leading to an infinite loop.
To prevent overflows, the best approach is to compare the result of an operation with the limits of the data type.
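    unsigned int ui;
    unsigned int sum;

    // Wrong: sum + ui has already wrapped around
    if (sum + ui > UINT_MAX) too_big();
    else sum = sum + ui;

    // Correct
    if (ui > UINT_MAX - sum) too_big();
    else sum = sum + ui;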
In the above example, both sum and ui are of unsigned int type, and their sum might cause an overflow. However, you cannot determine an overflow by checking if the sum exceeds UINT_MAX, because sum + ui returns the overflowed result, which cannot be greater than UINT_MAX. The correct method is to compare UINT_MAX - sum with ui.
Here’s another common mistake:
    unsigned int i = 5;
    unsigned int j = 7;

    if (i - j < 0) // Error
      printf("negative\n");
    else
      printf("positive\n");
The above code will always print positive. This is because both i and j are unsigned int, so the result of i - j is also unsigned int, which has a minimum value of 0. It cannot be less than 0. The correct way to write this is:
    if (j > i) // ...
sizeof Operator
The sizeof operator in C is used to determine the number of bytes occupied by a data type or a specific value. Its argument can be a data type keyword, a variable, or a literal value.
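    // Argument as data type
    int x = sizeof(int);

    // Argument as variable
    int i;
    sizeof(i);

    // Argument as literal value
    sizeof(3.14);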
In the first example, sizeof(int) returns the number of bytes used by the int type (typically 4 or 8 bytes). The second example returns the size of the integer variable i, which will be the same as the first. The third example returns the size of the literal 3.14, which is treated as a double by default, so it returns 8.
The return type of sizeof is an unsigned integer, but C does not mandate its specific type—it varies by system. It could be unsigned int, unsigned long, or even unsigned long long, and the corresponding printf() format specifiers are %u, %lu, and %llu respectively. This variability can impact the portability of programs across different systems.
To solve this, C provides the size_t type, defined in the stddef.h header (stdio.h defines it as well), which represents the return type of sizeof across different systems. This helps ensure portability.
C also provides the constant SIZE_MAX, defined in stdint.h, which represents the maximum value that size_t can hold. The valid range for size_t is [0, SIZE_MAX].
For printf(), the specifier %zu is specifically designed to handle size_t values (%zd is its signed counterpart).
    printf("%zu\n", sizeof(int));
In this example, %zu ensures correct output regardless of the underlying type returned by sizeof. If your system doesn’t support %zu, you can cast the value to unsigned long and print it with %lu.
Automatic Type Conversion
In certain situations, C automatically converts one data type to another.
Assignment Operations
- Assigning a floating-point value to an integer variable
When assigning a floating-point value to an integer, C discards the fractional part rather than rounding.
    int x = 3.14;
In this example, the value assigned to x is 3, as the fractional part .14 is discarded.
    int x = 12.99;
Here, x becomes 12, not the rounded value 13.
- Assigning an integer value to a floating-point variable
When assigning an integer to a floating-point variable, C automatically converts the integer to a float.
    float y = 12 * 2;
The value of y is 24.0 because the integer result 24 is automatically converted to a floating-point number.
- Assigning a smaller type to a larger type
When assigning a value of a narrower type (like char) to a wider type (like int), the value is automatically promoted to the larger type.
    char x = 10;
    int i = x + 1;
Here, x is promoted to int before the operation.
- Assigning a larger type to a smaller type
When assigning a larger type (like int) to a smaller type (like char), truncation occurs, and excess bits are discarded.
    int i = 321;
    char ch = i; // ch is 65 (321 - 256)
In this example, ch stores 65 on typical systems with an 8-bit char, because the excess binary digits of 321 are discarded, leaving only the last 8 bits (which represent 65 in decimal).
Similarly, assigning a floating-point value to an integer variable truncates the decimal part.
    double pi = 3.14159;
    int i = pi; // i is 3
Mixed-Type Operations
When different types are mixed in an expression, C converts them to a common type before performing the operation:
- Integer and floating-point operations
In operations involving both integers and floating-point values, integers are converted to floating-point types.
    3 + 1.2 // Result: 4.2
- Floating-point type promotion
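When performing operations between different floating-point types, the smaller type is promoted to the larger type (e.g., float to double).
- Integer type promotion
For integer operations, smaller integer types are promoted to larger types (e.g., short to int). When a signed and an unsigned type of the same width are mixed, the signed value is converted to unsigned.
    int a = -5;
    if (a < sizeof(int))
      do_something();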
Here, a is a signed integer, and sizeof(int) returns an unsigned integer (size_t). C automatically converts a to an unsigned type, so -5 becomes a very large positive number and the comparison a < sizeof(int) is false, which is an unexpected result.
Integer Operations
For integer types smaller than int, the operands are promoted to int before the arithmetic is performed.
    unsigned char a = 66;

    if ((-a) < 0) printf("negative\n");
    else printf("positive\n");
In this example, -a is promoted to int, so the result is negative even though a is unsigned char.
Another example:
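    unsigned char a = 1;
    unsigned char b = 255;
    unsigned char c = 255;

    if ((a - 5) < 0) do_something();
    if ((b + c) > 300) do_something();
In both cases, the operands are promoted to int, so a - 5 is -4 and b + c is 510, and both conditions are true as expected.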
Function Parameter and Return Type Conversion
Function parameters and return values are automatically converted to match the function’s defined types.
    int dostuff(int, unsigned char);

    char m = 42;
    unsigned short n = 43;
    long long int c = dostuff(m, n);
In this case, m and n are converted to int and unsigned char respectively, matching the function’s signature.
Similarly, return values are automatically converted:
    char func(void) {
      int a = 42;
      return a;
    }
Here, the function returns a char, so the int variable a is converted accordingly.
Explicit Type Casting
In principle, automatic type conversion should be avoided to prevent unexpected results. C provides explicit type casting, allowing you to manually convert a value to a specified type.
To cast a value or variable, simply place the desired type in parentheses before the value or variable. This is called “type casting.”
    (unsigned char) ch
The example above casts the variable ch to an unsigned character type.
    long int y = (long int) 10 + 12;
In this example, (long int) explicitly converts 10 to a long int. However, this cast is unnecessary because the assignment operator will automatically convert the right-hand value to the type of the left-hand variable.
Portable Data Types
The width of C’s integer types (short, int, long) can vary across different machines, making it difficult to predict how many bytes these types will occupy. To write more portable code, programmers can control the exact width of integers. The header file stdint.h provides type aliases for this purpose.
- Exact-width integer types guarantee the size of an integer:
  - int8_t: 8-bit signed integer.
  - int16_t: 16-bit signed integer.
  - int32_t: 32-bit signed integer.
  - int64_t: 64-bit signed integer.
  - uint8_t: 8-bit unsigned integer.
  - uint16_t: 16-bit unsigned integer.
  - uint32_t: 32-bit unsigned integer.
  - uint64_t: 64-bit unsigned integer.
These are type aliases that the compiler maps to the appropriate underlying type. For example, if int is 32 bits on a given system, int32_t will map to int. If long is 32 bits, int32_t will map to long.
Example usage:
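    #include <stdio.h>
    #include <inttypes.h> // Includes stdint.h and defines PRId32

    int main(void) {
      int32_t x32 = 45933945;
      printf("x32 = %" PRId32 "\n", x32);
      return 0;
    }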
In this example, x32 is declared as an int32_t, ensuring a width of 32 bits.
- Minimum-width types guarantee a minimum number of bits:
  - int_least8_t
  - int_least16_t
  - int_least32_t
  - int_least64_t
  - uint_least8_t
  - uint_least16_t
  - uint_least32_t
  - uint_least64_t
These types ensure that the integer occupies at least the specified number of bits. For example, int_least8_t is guaranteed to be at least 8 bits wide.
- Fastest minimum-width types ensure the fastest integer operations for a given width:
  - int_fast8_t
  - int_fast16_t
  - int_fast32_t
  - int_fast64_t
  - uint_fast8_t
  - uint_fast16_t
  - uint_fast32_t
  - uint_fast64_t
These types guarantee at least the stated width and are chosen for the fastest operation speed. For instance, int_fast8_t represents the fastest type for storing an 8-bit signed integer. On some machines, processing 32-bit integers may be faster than processing 16-bit integers, so the fastest type might be larger than 8 bits.
- Pointer-sized integer types store pointers as integers:
  - intptr_t: A signed integer capable of holding a pointer (memory address).
  - uintptr_t: An unsigned integer capable of holding a pointer.
- Maximum-width integer types store the largest possible integers:
  - intmax_t: Can store any valid signed integer.
  - uintmax_t: Can store any valid unsigned integer.
These types are at least as wide as long long and unsigned long long.