Types in C Language

In C, every piece of data has a type that the compiler must understand to properly operate on it. The “type” refers to the shared characteristics of similar data, allowing you to know its properties and operations once its type is known.

There are three basic data types: char (character), int (integer), and float (floating-point). More complex types are built from these.

Character Type

The character type represents a single character and is declared with the char keyword.

char c = 'B';

In this example, c is declared as a character type and assigned the value ‘B’. In C, character constants must be enclosed in single quotes.

Internally, a char occupies one byte (8 bits on virtually all modern systems). C treats characters as integers, so a character is essentially an integer with a width of one byte. Each character corresponds to an integer code (on most systems, its ASCII code), with ‘B’ corresponding to the integer 66.

Different systems may have different default ranges for the char type. Some systems use -128 to 127, while others use 0 to 255. Both ranges cover the ASCII range of 0 to 127.

Integers within the char range can be used interchangeably with characters and assigned to char variables.

char c = 66;
// Equivalent to
char c = 'B';

In this example, assigning 66 to c has the same effect as assigning ‘B’.

Character variables can also participate in arithmetic operations.

char a = 'B'; // Equivalent to char a = 66;
char b = 'C'; // Equivalent to char b = 67;

printf("%d\n", a + b); // Outputs 133

Here, adding a and b is treated like adding two integers, and %d prints the result as a decimal integer, yielding 133.

Single quotes themselves are characters, so to represent a single quote in a character constant, you need to escape it.

char t = '\'';

In this case, t holds the single quote character. Since character constants must be in single quotes, the internal single quote needs to be escaped.

Escape sequences are also used to represent unprintable control characters, which are part of the character type values:

\a: Alert (causes an alert sound or visual flash)
\b: Backspace (moves the cursor back one character without deleting it)
\f: Form feed (moves the cursor to the next page; in modern systems, it behaves like \v)
\n: Newline
\r: Carriage return (moves the cursor to the start of the line)
\t: Horizontal tab (moves the cursor to the next tab stop, usually every 8 characters)
\v: Vertical tab (moves the cursor to the next vertical tab stop, usually the same column on the next line)
\0: Null character (the character whose integer value is 0; different from the digit character ‘0’, whose ASCII code is 48)

Escape sequences can also use octal and hexadecimal notation:

\nnn: Octal representation of the character, where nnn is one to three octal digits
\xnn: Hexadecimal representation of the character, where nn is one or more hexadecimal digits

char x = 'B';
char x = 66;
char x = '\102'; // Octal
char x = '\x42'; // Hexadecimal

All four of these declarations are equivalent.

Integer Types

Overview

In C, integer types are used to represent whole numbers. The type is declared using the int keyword.

int a;

In the example above, an integer variable a is declared.

The size of the int type can vary between different computers. Commonly, an int is stored in 4 bytes (32 bits), but it can also be 2 bytes (16 bits) or 8 bytes (64 bits). The ranges of integers that these types can represent are as follows:

16-bit: -32,768 to 32,767
32-bit: -2,147,483,648 to 2,147,483,647
64-bit: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

Signed and Unsigned

In C, the signed keyword indicates that a type can hold both positive and negative values. Conversely, the unsigned keyword means the type can only hold zero and positive values.

By default, int is signed, which means int is equivalent to signed int. Although the signed keyword is usually omitted, it is not incorrect to include it.

signed int a; // Equivalent to int a;

To declare an int without a sign (only non-negative values), use the unsigned keyword:

unsigned int a;

An unsigned int can represent a larger maximum value compared to a signed int of the same size. For instance, a 16-bit signed int has a maximum value of 32,767, while an unsigned int can go up to 65,535.

The int in unsigned int can be omitted:

unsigned a;

The char type can also be signed or unsigned:

signed char c;   // Range: -128 to 127
unsigned char c; // Range: 0 to 255

Note that the default sign of char (whether it is signed or unsigned) is system-dependent. Unlike int, char is not guaranteed to be signed or unsigned.

Integer Subtypes

When an int type uses 4 or 8 bytes, it may be overkill for small integers. On the other hand, if larger integers are needed, 8 bytes might not be sufficient. To address these issues, C provides three subtypes of integers, allowing for more precise control over the integer range:

short int (or short): Uses no more space than int, generally 2 bytes (range: -32,768 to 32,767)
long int (or long): Uses at least as much space as int and at least 4 bytes; commonly 4 or 8 bytes depending on the system
long long int (or long long): Uses at least as much space as long, and at least 8 bytes

short int a;
long int b;
long long int c;

By default, short, long, and long long are signed. You can declare them as unsigned to double the maximum value they can represent:

unsigned short int a;
unsigned long int b;
unsigned long long int c;

The int in these declarations can be omitted:

short a;
unsigned short a;

long b;
unsigned long b;

long long c;
unsigned long long c;

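The size of data types can vary between computers. The standard only guarantees minimums: char is at least 8 bits, short and int at least 16, long at least 32, and long long at least 64. So use long when you need at least 32 bits and long long when you need at least 64; short suffices for 16-bit values and char for 8-bit values.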

Limits of Integer Types

To determine the maximum and minimum values of integer types on your system, use constants from the header file limits.h. For example:

SCHAR_MIN and SCHAR_MAX for signed char
SHRT_MIN and SHRT_MAX for short
INT_MIN and INT_MAX for int
LONG_MIN and LONG_MAX for long
LLONG_MIN and LLONG_MAX for long long
UCHAR_MAX for unsigned char
USHRT_MAX for unsigned short
UINT_MAX for unsigned int
ULONG_MAX for unsigned long
ULLONG_MAX for unsigned long long

Using these constants ensures your code remains portable across different systems.

Integer Literals and Formats

By default, integers in C are written in decimal. To represent octal and hexadecimal numbers, use specific prefixes:

Octal: Prefix with 0 (e.g., 017 or 0377).

int a = 012; // Octal, equivalent to decimal 10

Hexadecimal: Prefix with 0x or 0X (e.g., 0xf or 0X10).

int a = 0x1A2B; // Hexadecimal, equivalent to decimal 6699

Many compilers also support binary literals with the 0b prefix (standardized in C23; before that, a compiler extension):

int x = 0b101010;

Different bases are only for representation; they do not affect the actual storage of the integer, which is always in binary. You can mix bases in expressions, such as 10 + 015 + 0x20.

Use the following format specifiers with printf() to display integers in different bases:

%d: Decimal
%o: Octal
%x: Hexadecimal
%#o: Octal with prefix 0
%#x: Hexadecimal with prefix 0x
%#X: Hexadecimal with prefix 0X

int x = 100;
printf("dec = %d\n", x);       // 100
printf("octal = %o\n", x);     // 144
printf("hex = %x\n", x);       // 64
printf("octal = %#o\n", x);    // 0144
printf("hex = %#x\n", x);      // 0x64
printf("hex = %#X\n", x);      // 0X64

Floating Point Numbers

In programming, a floating point number is a value with a decimal point, represented in the form $ m \times b^e $, where $ m $ is the mantissa, $ b $ is the base (usually 2), and $ e $ is the exponent. This format balances precision and range, allowing for representation of very large or very small numbers.

In C, floating point types are declared using the float keyword. For example:

float c = 10.5;

Here, c is a floating point variable. On most systems, float uses 4 bytes (32 bits) in the IEEE 754 format: 1 bit for the sign, 8 bits for the exponent, and 23 bits for the mantissa (giving 24 bits of precision thanks to an implicit leading bit). It provides at least 6 decimal digits of precision and can represent magnitudes from at least $ 10^{-37} $ to $ 10^{37} $.

For cases where more precision or range is needed, C offers two larger floating point types:

double: Usually 8 bytes (64 bits), with about 15 decimal digits of precision (the standard requires at least 10).
long double: Typically uses 16 bytes, but this can vary by system.

Due to precision limits, floating point numbers are approximations, and calculations may not be exact. For example, in C, 0.1 + 0.2 does not exactly equal 0.3, but has a small error:

if (0.1 + 0.2 == 0.3) // false

Floating point numbers can also be represented using scientific notation, where e separates the decimal part from the exponent:

double x = 123.456e+3; // Equivalent to 123.456 x 10^3
// or simply
double x = 123.456e3;

The + in the exponent can be omitted, and spaces around e are not allowed. Leading or trailing zeros in the decimal part can be omitted:

0.3E6 // Equivalent to .3E6
3.0E6 // Equivalent to 3.E6

Boolean Type

Originally, C did not have a dedicated boolean type. Instead, 0 was used for false, and any non-zero value was considered true:

int x = 1;
if (x) {
  printf("x is true!\n");
}

With the C99 standard, _Bool was introduced to represent boolean values. It is a distinct unsigned integer type that stores only 0 (false) or 1 (true); any non-zero value converted to _Bool becomes 1:

_Bool isNormal;
isNormal = 1;
if (isNormal)
  printf("Everything is OK.\n");

The header file stdbool.h defines a bool type alias and constants true and false:

#include <stdbool.h>

bool flag = false;

Literal Types

Literals are fixed values directly written into the code.

For example:

int x = 123;

In this code, x is a variable, and 123 is a literal.

At compile time, literals are also written to memory, so the compiler must assign a data type to them just as it does for variables.

Typically, decimal integer literals (like 123) are assigned the int type by the compiler. If a number is too large for an int, the compiler assigns it long int, and if it exceeds long int, long long int. Since C99, an unsuffixed decimal literal is never given an unsigned type; octal and hexadecimal literals, however, may also be assigned unsigned int, unsigned long, or unsigned long long when the value doesn’t fit the signed type of the same width.

Floating-point literals (like 3.14) are assigned the double type.

Literal Suffixes

Sometimes, programmers want to specify a different type for a literal. For instance, the compiler assigns an integer literal to the int type by default, but if a programmer wants to assign it as a long, they can add the suffix l or L to the literal. This signals to the compiler to treat the literal as a long type.

long x = 123L;

In the above example, the literal 123 has the suffix L, so the compiler will treat it as a long. You can also write it as 123l, but using L is recommended because the lowercase l can be easily confused with the number 1.

Octal and hexadecimal values can also use the l or L suffix to indicate they should be treated as long, such as 020L and 0x20L.

long y = 0377L;
long z = 0x7fffL;

If you want to specify an unsigned integer, use the suffix u or U.

unsigned int x = 123U;

The L and U suffixes can be combined to indicate an unsigned long. The order and case of L and U do not matter.

unsigned long x = 123LU;

For floating-point numbers, the compiler defaults to the double type. If you want to specify a different type, add the suffix f or F (for float) or l or L (for long double) to the literal.

You can also use suffixes with scientific notation.

1.2345e+10F
1.2345e+10L

To summarize, here are the common literal suffixes:

f and F: float type.
l and L: long int for integers, long double for floating-point numbers.
ll and LL: long long int, such as 3LL.
u and U: unsigned int, such as 15U or 0377U.
u can be combined with other integer suffixes, and the order doesn’t matter, e.g., 10UL, 10ULL, and 10LLU are all valid.

Here are some examples:

int           x = 1234;
long int      x = 1234L;
long long int x = 1234LL;

unsigned int           x = 1234U;
unsigned long int      x = 1234UL;
unsigned long long int x = 1234ULL;

float x       = 3.14f;
double x      = 3.14;
long double x = 3.14L;

Overflow

Each data type has a defined range of values. If a value exceeds this range (either smaller than the minimum or larger than the maximum), an overflow occurs, requiring more binary space than available. If the value exceeds the maximum, it’s called an overflow; if it’s smaller than the minimum, it’s an underflow.

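Generally, the compiler won’t report an error for overflows. For unsigned types, the result wraps around: the excess high-order bits are discarded, which leads to unexpected results. For signed types, overflow is undefined behavior, so the result cannot be relied on at all. Therefore, overflows should be avoided.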

unsigned char x = 255;
x = x + 1;

printf("%d\n", x); // Output: 0

In this example, adding 1 to x doesn’t result in 256, but rather 0. This happens because x is an unsigned char with a maximum value of 255 (binary 11111111). Adding 1 causes an overflow, and the highest bit in 256 (binary 100000000) is discarded, leaving 0.

Here’s another example:

unsigned int ui = UINT_MAX;  // 4,294,967,295
ui++;
printf("ui = %u\n", ui); // 0
ui--;
printf("ui = %u\n", ui); // 4,294,967,295

The constant UINT_MAX is the maximum value for unsigned int. Adding 1 causes an overflow, resulting in 0. Subtracting 1 from 0 returns UINT_MAX.

Overflows are easy to overlook because the compiler often won’t issue a warning, so extra caution is needed.

for (unsigned int i = n; i >= 0; --i) // Error

At first glance, this loop looks fine, but the variable i is an unsigned int, and its minimum value is 0. It cannot produce a value less than 0. When i reaches 0 and is decremented by 1, it doesn’t result in -1, but rather the maximum value for unsigned int, which is always greater than or equal to 0, leading to an infinite loop.

To prevent overflows, the best approach is to compare the result of an operation with the limits of the data type.

unsigned int ui;
unsigned int sum;

// Incorrect
if (sum + ui > UINT_MAX) too_big();
else sum = sum + ui;

// Correct
if (ui > UINT_MAX - sum) too_big();
else sum = sum + ui;

In the above example, both sum and ui are of unsigned int type, and their sum might cause an overflow. However, you cannot determine an overflow by checking if the sum exceeds UINT_MAX, because sum + ui returns the overflowed result, which cannot be greater than UINT_MAX. The correct method is to compare UINT_MAX - sum with ui.

Here’s another common mistake:

unsigned int i = 5;
unsigned int j = 7;

if (i - j < 0) // Error
  printf("negative\n");
else
  printf("positive\n");

The above code will always print positive. This is because both i and j are unsigned int, so the result of i - j is also unsigned int, which has a minimum value of 0. It cannot be less than 0. The correct way to write this is:

if (j > i) // ...

`sizeof` Operator

The sizeof operator in C is used to determine the number of bytes occupied by a data type or a specific value. Its argument can be a data type keyword, a variable, or a literal value.

// Argument as data type
int x = sizeof(int);

// Argument as a variable
int i;
sizeof(i);

// Argument as a literal
sizeof(3.14);

In the first example, sizeof(int) returns the number of bytes used by the int type (typically 4 bytes). The second example returns the size of the integer variable i, which will be the same as the first. The third example returns the size of the literal 3.14, which is treated as a double by default, so it typically returns 8.

The return type of sizeof is an unsigned integer, but C does not mandate its specific type—it varies by system. It could be unsigned int, unsigned long, or even unsigned long long, and the corresponding printf() format specifiers are %u, %lu, and %llu respectively. This variability can impact the portability of programs across different systems.

To solve this, C provides the size_t type, defined in the stddef.h header (and also made available by stdio.h), which represents the type that sizeof yields on any system. This helps ensure portability.

C also provides the constant SIZE_MAX, which represents the maximum value that size_t can hold. The valid range for size_t is [0, SIZE_MAX].

For printf(), the %zu specifier is designed for size_t values. (%zd is the counterpart for the corresponding signed type.)

printf("%zu\n", sizeof(int));

In this example, %zu ensures correct output regardless of the underlying type of size_t. If your system doesn’t support %zu, you can cast the value and use a matching specifier, such as %lu with a cast to unsigned long.

Automatic Type Conversion

In certain situations, C automatically converts one data type to another.

Assignment Operations

Assigning a floating-point value to an integer variable
When assigning a floating-point value to an integer, C discards the fractional part rather than rounding.

int x = 3.14;

In this example, the value assigned to x is 3, as the fractional part .14 is discarded.

int x = 12.99;

Here, x becomes 12, not the rounded value 13.

Assigning an integer value to a floating-point variable
When assigning an integer to a floating-point variable, C automatically converts the integer to a float.

float y = 12 * 2;

The value of y is 24.0 because the integer result 24 is automatically converted to a floating-point number.

Assigning a smaller type to a larger type
When assigning a value of a narrower type (like char) to a wider type (like int), the value is automatically promoted to the larger type.

char x = 10;
char y = 20;
int i = x + y;

Here, x and y are promoted to int before the addition, and the int result is stored in i.

Assigning a larger type to a smaller type
When assigning a larger type (like int) to a smaller type (like char), truncation occurs, and excess bits are discarded.

int i = 321;
char ch = i; // ch holds 65, which is 321 % 256

In this example, ch stores 65 because the excess binary digits of 321 are discarded, leaving only the last 8 bits (which represent 65 in decimal).

Similarly, assigning a floating-point value to an integer variable truncates the decimal part.

double pi = 3.14159;
int i = pi; // i becomes 3

Mixed-Type Operations

When different types are mixed in an expression, C converts them to a common type before performing the operation:

Integer and floating-point operations
In operations involving both integers and floating-point values, integers are converted to floating-point types.

3 + 1.2 // Result: 4.2

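Floating-point type promotion
When performing operations between different floating-point types, the narrower type is converted to the wider type (e.g., float to double).

Integer type promotion
For integer operations, narrower integer types are converted to wider types (e.g., short to int). When a signed integer is mixed with an unsigned integer of the same or greater width, the signed value is converted to the unsigned type, which can produce surprising results: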

int a = -5;
if (a < sizeof(int)) 
  do_something();

Here, a is a signed int, and sizeof(int) yields an unsigned value of type size_t. C converts a to the unsigned type before comparing, so -5 becomes a very large positive number and the condition is false; do_something() is never called.

Integer Operations

For integer types narrower than int, operands are promoted to int before arithmetic is performed.

unsigned char a = 66;
if (-a < 0) printf("negative\n");

In this example, a is promoted to int before the negation, so -a is -66 and the condition is true even though a is an unsigned char.

Another example:

unsigned char a = 1;
unsigned char b = 255;
if ((a - 5) < 0) do_something();
if ((b + 255) > 300) do_something();

In both cases, the expressions are promoted to int, so the operations execute as expected.

Function Parameter and Return Type Conversion

Function parameters and return values are automatically converted to match the function’s defined types.

int dostuff(int, unsigned char);

char m = 42;
unsigned short n = 43;
long long c = dostuff(m, n);

In this case, m and n are converted to int and unsigned char respectively, matching the function’s signature.

Similarly, return values are automatically converted:

char func(void) {
  int a = 42;
  return a;
}

Here, the function returns a char, so the int variable a is converted accordingly.

Explicit Type Casting

In principle, automatic type conversion should be avoided to prevent unexpected results. C provides explicit type casting, allowing you to manually convert a value to a specified type.

To cast a value or variable, simply place the desired type in parentheses before the value or variable. This is called “type casting.”

(unsigned char) ch

The example above casts the variable ch to an unsigned character type.

long int y = (long int) 10 + 12;

In this example, (long int) explicitly converts 10 to a long int. However, this cast is unnecessary because the assignment operator will automatically convert the right-hand value to the type of the left-hand variable.

Portable Data Types

The width of C’s integer types (short, int, long) can vary across different machines, making it difficult to predict how many bytes these types will occupy. To write more portable code, programmers can control the exact width of integers. The header file stdint.h provides type aliases for this purpose.

Exact-width integer types guarantee the size of an integer:
- int8_t: 8-bit signed integer.
- int16_t: 16-bit signed integer.
- int32_t: 32-bit signed integer.
- int64_t: 64-bit signed integer.
- uint8_t: 8-bit unsigned integer.
- uint16_t: 16-bit unsigned integer.
- uint32_t: 32-bit unsigned integer.
- uint64_t: 64-bit unsigned integer.

These are type aliases that the compiler maps to the appropriate underlying type. For example, if int is 32 bits on a given system, int32_t will map to int. If long is 32 bits, int32_t will map to long.

Example usage:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    int32_t x32 = 45933945;
    // PRId32 expands to the correct printf conversion for int32_t
    printf("x32 = %" PRId32 "\n", x32);
    return 0;
}

In this example, x32 is declared as an int32_t, ensuring a width of 32 bits.

Minimum-width types guarantee a minimum number of bits:
- int_least8_t
- int_least16_t
- int_least32_t
- int_least64_t
- uint_least8_t
- uint_least16_t
- uint_least32_t
- uint_least64_t

These types ensure that the integer occupies at least the specified number of bits. For example, int_least8_t is guaranteed to be at least 8 bits wide.

Fastest minimum-width types ensure the fastest integer operations for a given width:
- int_fast8_t
- int_fast16_t
- int_fast32_t
- int_fast64_t
- uint_fast8_t
- uint_fast16_t
- uint_fast32_t
- uint_fast64_t

These types guarantee at least the specified width and are the fastest such types on the platform. For instance, int_fast8_t represents the fastest type for storing an 8-bit signed integer. On some machines, processing 32-bit integers may be faster than processing 16-bit integers, so the fastest type might be larger than 8 bits.

Pointer-sized integer types store pointers as integers:
- intptr_t: A signed integer capable of holding a pointer (memory address).
- uintptr_t: An unsigned integer capable of holding a pointer.

Maximum-width integer types store the largest possible integers:
- intmax_t: Can store any valid signed integer.
- uintmax_t: Can store any valid unsigned integer.

These types are at least as wide as long long and unsigned long long, respectively.