Types in C Language
In C, every piece of data has a type that the compiler must understand to properly operate on it. The “type” refers to the shared characteristics of similar data, allowing you to know its properties and operations once its type is known.
There are three basic data types: char (character), int (integer), and float (floating-point). More complex types are built from these.
Character Type
The character type represents a single character and is declared with the char keyword.
    char c = 'B';
In this example, c is declared as a character type and assigned the value 'B'. In C, character constants must be enclosed in single quotes.
Internally, characters are stored in one byte (8 bits). C treats them as integers, so a character is essentially an integer with a width of one byte. Each character corresponds to an integer (defined by the ASCII code), with ‘B’ corresponding to the integer 66.
Whether char is signed by default depends on the system. Some systems give it the range -128 to 127, while others use 0 to 255. Both ranges cover the ASCII range of 0 to 127.
Integers within the char range can be used interchangeably with characters and assigned to char variables.
    char c = 66;
In this example, assigning 66 to c has the same effect as assigning 'B'.
Character variables can also participate in arithmetic operations.
    char a = 'B'; // Equivalent to char a = 66;
With a second variable b holding 'C' (67), adding a and b is treated like adding two integers, and printing the sum with %d shows it as a decimal integer, yielding 133.
Single quotes themselves are characters, so to represent a single quote in a character constant, you need to escape it.
    char t = '\'';
In this case, t holds the single quote character. Since character constants must be in single quotes, the internal single quote needs to be escaped.
Escape sequences are also used to represent unprintable control characters, which are part of the character type values:
- \a: Alert (causes an alert sound or visual flash)
- \b: Backspace (moves the cursor back one character without deleting it)
- \f: Form feed (moves the cursor to the next page; in modern systems, it behaves like \v)
- \n: Newline
- \r: Carriage return (moves the cursor to the start of the line)
- \t: Horizontal tab (moves the cursor to the next tab stop, usually every 8 characters)
- \v: Vertical tab (moves the cursor to the next vertical tab stop, usually the same column on the next line)
- \0: Null character (the character whose integer value is 0; different from the digit character '0', whose ASCII value is 48)
Escape sequences can also use octal and hexadecimal notation:
- \nnn: Octal representation of the character, where nnn is one to three octal digits
- \xnn: Hexadecimal representation of the character, where nn is the hexadecimal value
    char x = 'B';
On an ASCII system, this declaration is equivalent to char x = 66;, char x = '\102';, and char x = '\x42';. All four forms store the same value.
Integer Types
Overview
In C, integer types are used to represent whole numbers. The type is declared using the int keyword.
    int a;
In the example above, an integer variable a is declared.
The size of the int type can vary between different computers. Commonly, an int is stored in 4 bytes (32 bits), but it can also be 2 bytes (16 bits) or 8 bytes (64 bits). The ranges of integers that these types can represent are as follows:
- 16-bit: -32,768 to 32,767
- 32-bit: -2,147,483,648 to 2,147,483,647
- 64-bit: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
Signed and Unsigned
In C, the signed keyword indicates that a type can hold both positive and negative values. Conversely, the unsigned keyword means the type can only hold zero and positive values.
By default, int is signed, which means int is equivalent to signed int. Although the signed keyword is usually omitted, it is not incorrect to include it.
    signed int a; // Equivalent to int a;
To declare an int without a sign (only non-negative values), use the unsigned keyword:
    unsigned int a;
An unsigned int can represent a larger maximum value than a signed int of the same size. For instance, a 16-bit signed int has a maximum value of 32,767, while a 16-bit unsigned int can go up to 65,535.
The int in unsigned int can be omitted:
    unsigned a;
The char type can also be signed or unsigned:
    signed char c; // Range: -128 to 127
Note that the default sign of char (whether it is signed or unsigned) is system-dependent. Unlike int, which is always signed unless declared otherwise, plain char is not guaranteed to be either.
Integer Subtypes
When an int type uses 4 or 8 bytes, it may be overkill for small integers. On the other hand, if larger integers are needed, the default int may not be wide enough. To address these issues, C provides three subtypes of integers, allowing for more precise control over the integer range:
- short int (or short): Uses no more space than int, at least 2 bytes and generally exactly 2 (range: -32,768 to 32,767)
- long int (or long): Uses at least as much space as int, at least 4 bytes (commonly 4 or 8)
- long long int (or long long): Uses at least as much space as long, at least 8 bytes
    short int a;
By default, short, long, and long long are signed. You can declare them as unsigned, which roughly doubles the maximum value they can represent:
    unsigned short int a;
The int in these declarations can be omitted:
    short a;
The size of data types can vary between computers, but the standard guarantees minimum widths. Use long when you need at least 32 bits, long long when you need at least 64 bits, and short when 16 bits are enough. For 8-bit integers, use char (preferably signed char or unsigned char, since the sign of plain char varies).
Limits of Integer Types
To determine the maximum and minimum values of integer types on your system, use constants from the header file limits.h. For example:
- SCHAR_MIN and SCHAR_MAX for signed char
- SHRT_MIN and SHRT_MAX for short
- INT_MIN and INT_MAX for int
- LONG_MIN and LONG_MAX for long
- LLONG_MIN and LLONG_MAX for long long
- UCHAR_MAX for unsigned char
- USHRT_MAX for unsigned short
- UINT_MAX for unsigned int
- ULONG_MAX for unsigned long
- ULLONG_MAX for unsigned long long
Using these constants ensures your code remains portable across different systems.
Integer Literals and Formats
By default, integers in C are written in decimal. To represent octal and hexadecimal numbers, use specific prefixes:
- Octal: Prefix with 0 (e.g., 017 or 0377).
    int a = 012; // Octal, equivalent to decimal 10
- Hexadecimal: Prefix with 0x or 0X (e.g., 0xf or 0X10).
    int a = 0x1A2B; // Hexadecimal, equivalent to decimal 6699
Some compilers also support binary literals with the 0b prefix. This was a compiler extension until C23 added it to the standard:
    int x = 0b101010;
Different bases are only for representation; they do not affect the actual storage of the integer, which is always in binary. You can mix bases in expressions, such as 10 + 015 + 0x20 (which is 10 + 13 + 32 = 55).
Use the following format specifiers with printf() to display integers in different bases:
- %d: Decimal
- %o: Octal
- %x: Hexadecimal
- %#o: Octal with prefix 0
- %#x: Hexadecimal with prefix 0x
- %#X: Hexadecimal with prefix 0X
    int x = 100;
Floating Point Numbers
In programming, a floating point number is a value with a decimal point, represented in the form $ m \times b^e $, where $ m $ is the mantissa, $ b $ is the base (usually 2), and $ e $ is the exponent. This format balances precision and range, allowing for representation of very large or very small numbers.
In C, floating point types are declared using the float keyword. For example:
    float c = 10.5;
Here, c is a floating point variable. The float type typically uses 4 bytes (32 bits). On systems that follow IEEE 754, these are 1 sign bit, 8 bits for the exponent, and 23 stored bits for the mantissa (24 bits of precision, counting the implicit leading bit). It provides at least 6 decimal digits of precision and can represent magnitudes from at least $ 10^{-37} $ to $ 10^{37} $.
For cases where more precision or range is needed, C offers two larger floating point types:
- double: Typically uses 8 bytes (64 bits). The standard requires at least 10 decimal digits of precision; an IEEE 754 double provides about 15.
- long double: Often uses 16 bytes, but this can vary by system.
Due to precision limits, floating point numbers are approximations, and calculations may not be exact. For example, in C, 0.1 + 0.2 does not exactly equal 0.3, but has a small error:
    if (0.1 + 0.2 == 0.3) // false
Floating point numbers can also be written in scientific notation, where e separates the decimal part from the exponent:
    double x = 123.456e+3; // Equivalent to 123.456 x 10^3
The + in the exponent can be omitted, and spaces around e are not allowed. A leading or trailing zero in the decimal part can be omitted:
    0.3E6 // Equivalent to .3E6
Boolean Type
Originally, C did not have a dedicated boolean type. Instead, 0 was used for false, and any non-zero value was considered true:
    int x = 1;
With the C99 standard, _Bool was introduced to represent boolean values. It is a distinct unsigned integer type that holds only 0 (false) or 1 (true); any non-zero value assigned to it is converted to 1:
    _Bool isNormal;
The header file stdbool.h defines bool as an alias for _Bool, along with the constants true and false.
Literal types
Literals are fixed values directly written into the code.
For example:
    int x = 123;
In this code, x is a variable, and 123 is a literal.
At compile time, literals are also written to memory, so the compiler must assign a data type to them just as it does for variables.
Typically, decimal integer literals (like 123) are assigned the int type by the compiler. If a number is too large for an int, the compiler assigns it long int, and if it exceeds long int, long long int. Since C99, an unsuffixed decimal literal is never given an unsigned type; octal and hexadecimal literals, however, may also become unsigned int, unsigned long, or unsigned long long when the value does not fit the signed type of the same width.
Floating-point literals (like 3.14) are assigned the double type.
Literal Suffixes
Sometimes, programmers want to specify a different type for a literal. For instance, the compiler assigns an integer literal to the int type by default, but if a programmer wants it to be a long, they can add the suffix l or L to the literal. This signals to the compiler to treat the literal as a long type.
    int x = 123L;
In the above example, the literal 123 has the suffix L, so the compiler will treat it as a long (the value is then converted to int when it is stored in x). You can also write it as 123l, but using L is recommended because the lowercase l can be easily confused with the number 1.
Octal and hexadecimal values can also use the l or L suffix to indicate they should be treated as long, such as 020L and 0x20L.
    int y = 0377L;
If you want to specify an unsigned integer, use the suffix u or U.
    int x = 123U;
The L and U suffixes can be combined to indicate an unsigned long. The order and case of L and U do not matter.
    int x = 123LU;
For floating-point numbers, the compiler defaults to the double type. If you want to specify a different type, add the suffix f (for float) or l (for long double) after the number.
You can also use suffixes with scientific notation.
    1.2345e+10F
To summarize, here are the common literal suffixes:
- f and F: float type.
- l and L: long int for integers, long double for floating-point numbers.
- ll and LL: long long int, such as 3LL.
- u and U: unsigned int, such as 15U or 0377U.
u can be combined with other integer suffixes, and the order doesn't matter, e.g., 10UL, 10ULL, and 10LLU are all valid.
Here are some examples:
    int x = 1234;
Overflow
Each data type has a defined range of values. If a value falls outside this range (either smaller than the minimum or larger than the maximum), representing it would require more binary digits than the type has. Exceeding the maximum is called an overflow; going below the minimum is called an underflow.
Generally, the compiler won't report an error for overflows. For unsigned types, the program keeps running and the excess high-order bits are discarded, so the result wraps around. For signed types, overflow is undefined behavior, and the result cannot be relied on at all. Either way, overflows should be avoided.
    unsigned char x = 255;
In this example, adding 1 to x doesn't result in 256, but rather 0. This happens because x is an unsigned char with a maximum value of 255 (binary 11111111). Adding 1 causes an overflow, and the highest bit in 256 (binary 100000000) is discarded, leaving 0.
Here's another example:
    unsigned int ui = UINT_MAX; // 4,294,967,295
The constant UINT_MAX is the maximum value for unsigned int. Adding 1 causes an overflow, resulting in 0. Subtracting 1 from 0 returns UINT_MAX.
Overflows are easy to overlook because the compiler often won't issue warnings, so extra caution is needed.
    for (unsigned int i = n; i >= 0; --i) // Error
At first glance, this loop looks fine, but the variable i is an unsigned int, and its minimum value is 0. It cannot produce a value less than 0. When i reaches 0 and is decremented by 1, it doesn't result in -1, but rather the maximum value for unsigned int. The condition i >= 0 is therefore always true, leading to an infinite loop.
To prevent overflows, the best approach is to check the operands against the limits of the data type before performing the operation.
    unsigned int ui;
In the above example, both sum and ui are of unsigned int type, and their sum might cause an overflow. However, you cannot detect the overflow by checking whether sum + ui exceeds UINT_MAX, because sum + ui returns the already wrapped result, which can never be greater than UINT_MAX. The correct method is to compare UINT_MAX - sum with ui: if ui is larger, the addition would overflow.
Here's another common mistake:
    unsigned int i = 5;
The above code, which tests whether i - j is less than 0, will always print positive. This is because both i and j are unsigned int, so the result of i - j is also unsigned int, which has a minimum value of 0. It cannot be less than 0. The correct way to write this is:
    if (j > i) // ...
sizeof Operator
The sizeof operator in C is used to determine the number of bytes occupied by a data type or a specific value. Its argument can be a data type keyword, a variable, or a literal value.
    // Argument as data type
With a data type as the argument, sizeof(int) returns the number of bytes used by the int type (typically 4 or 8 bytes). With an integer variable i as the argument, sizeof returns the same value. With the literal 3.14 as the argument, sizeof returns the size of a double, since that is the default type of the literal, so the result is typically 8.
The result of sizeof is an unsigned integer, but C does not mandate its specific type; it varies by system. It could be unsigned int, unsigned long, or even unsigned long long, and the corresponding printf() format specifiers are %u, %lu, and %llu respectively. This variability can impact the portability of programs across different systems.
To solve this, C provides the size_t type, defined in the stddef.h header (and also made available by stdio.h), which represents the result type of sizeof across different systems. This helps ensure portability.
C also provides the constant SIZE_MAX, defined in stdint.h, which represents the maximum value that size_t can hold. The valid range for size_t is [0, SIZE_MAX].
For printf(), the specifier %zu is specifically designed to handle size_t values. (%zd is meant for the corresponding signed type.)
    printf("%zu\n", sizeof(int));
In this example, %zu ensures correct output regardless of the underlying type returned by sizeof. If your system doesn't support %zu, you can cast the value and use %u for unsigned int or %lu for unsigned long.
Automatic Type Conversion
In certain situations, C automatically converts one data type to another.
Assignment Operations
- Assigning a floating-point value to an integer variable
When assigning a floating-point value to an integer, C discards the fractional part rather than rounding.
    int x = 3.14;
In this example, the value assigned to x is 3, as the fractional part .14 is discarded.
    int x = 12.99;
Here, x becomes 12, not the rounded value 13.
- Assigning an integer value to a floating-point variable
When assigning an integer to a floating-point variable, C automatically converts the integer to a floating-point value.
    float y = 12 * 2;
The value of y is 24.0 because the integer result 24 is automatically converted to a floating-point number.
- Assigning a smaller type to a larger type
When assigning a value of a narrower type (like char) to a wider type (like int), the value is automatically promoted to the larger type.
    char x = 10;
Here, x is promoted to int before the operation.
- Assigning a larger type to a smaller type
When assigning a larger type (like int) to a smaller type (like char), truncation occurs, and excess bits are discarded.
    int i = 321;
In this example, assigning i to a char variable ch stores 65, because the excess binary digits of 321 (binary 101000001) are discarded, leaving only the last 8 bits (01000001, which is 65 in decimal).
Similarly, assigning a floating-point value to an integer variable truncates the decimal part.
    double pi = 3.14159;
Mixed-Type Operations
When different types are mixed in an expression, C converts them to a common type before performing the operation:
- Integer and floating-point operations
In operations involving both integers and floating-point values, integers are converted to floating-point types.
    3 + 1.2 // Result: 4.2
- Floating-point type promotion
When performing operations between different floating-point types, the smaller type is promoted to the larger type (e.g., float to double).
- Integer type promotion
For integer operations, smaller integer types are promoted to larger types (e.g., short to int). When a signed operand meets an unsigned operand of the same or greater width, the signed operand is converted to the unsigned type.
    int a = -5;
Here, a is a signed integer, and sizeof(int) returns an unsigned integer (size_t). In a comparison such as a < sizeof(int), C automatically converts a to an unsigned type, so -5 becomes a very large value and the comparison gives an unexpected result.
Integer Operations
For integer types narrower than int, operands are promoted to int before arithmetic is performed.
    unsigned char a = 66;
In this example, a is promoted to int before negation, so -a is negative even though a is unsigned char.
Another example:
    unsigned char a = 1;
In both cases, the operands are promoted to int, so the operations execute as expected.
Function Parameter and Return Type Conversion
Function arguments and return values are automatically converted to match the types in the function's declaration.
    int dostuff(int, unsigned char);
In this case, when variables m and n are passed as arguments, they are converted to int and unsigned char respectively, matching the function's signature.
Similarly, return values are automatically converted:
    char func(void) {
Here, the function returns a char, so the int variable a is converted accordingly.
Explicit Type Casting
In principle, automatic type conversion should be avoided to prevent unexpected results. C provides explicit type casting, allowing you to manually convert a value to a specified type.
To cast a value or variable, simply place the desired type in parentheses before the value or variable. This is called "type casting."
    (unsigned char) ch
The example above casts the variable ch to an unsigned character type.
    long int y = (long int) 10 + 12;
In this example, (long int) explicitly converts 10 to a long int. However, this cast is unnecessary because the assignment operator will automatically convert the right-hand value to the type of the left-hand variable.
Portable Data Types
The width of C's integer types (short, int, long) can vary across different machines, making it difficult to predict how many bytes these types will occupy. To write more portable code, programmers can control the exact width of integers. The header file stdint.h provides type aliases for this purpose.
- Exact-width integer types guarantee the size of an integer:
int8_t: 8-bit signed integer.
int16_t: 16-bit signed integer.
int32_t: 32-bit signed integer.
int64_t: 64-bit signed integer.
uint8_t: 8-bit unsigned integer.
uint16_t: 16-bit unsigned integer.
uint32_t: 32-bit unsigned integer.
uint64_t: 64-bit unsigned integer.
These are type aliases that the compiler maps to the appropriate underlying type. For example, if int is 32 bits on a given system, int32_t will map to int. If int is 16 bits and long is 32 bits, int32_t will map to long.
For example, a variable x32 declared as int32_t is guaranteed to be exactly 32 bits wide.
- Minimum-width types guarantee a minimum number of bits:
int_least8_t
int_least16_t
int_least32_t
int_least64_t
uint_least8_t
uint_least16_t
uint_least32_t
uint_least64_t
These types ensure that the integer occupies at least the specified number of bits. For example, int_least8_t is guaranteed to be at least 8 bits wide.
- Fastest minimum-width types ensure the fastest integer operations for a given width:
int_fast8_t
int_fast16_t
int_fast32_t
int_fast64_t
uint_fast8_t
uint_fast16_t
uint_fast32_t
uint_fast64_t
These types guarantee at least the stated width while favoring speed. For instance, int_fast8_t represents the fastest type that can store an 8-bit signed integer. On some machines, processing 32-bit integers may be faster than processing narrower ones, so the fastest type might be wider than 8 bits.
- Pointer-sized integer types store pointers as integers:
intptr_t: A signed integer capable of holding a pointer (memory address).
uintptr_t: An unsigned integer capable of holding a pointer.
- Maximum-width integer types store the largest possible integers:
intmax_t: Can store any valid signed integer.
uintmax_t: Can store any valid unsigned integer.
These types are at least as wide as long long and unsigned long long.