Now that you have seen how to declare variables and constants, let’s take a closer look at the data types available in C#. As you will see, C# is much stricter about the types available and their definitions than some other languages are.
Value Types and Reference Types
Before examining the data types in C#, it is important to understand that C# distinguishes between two categories of data type:
- Value types
- Reference types
The next few sections look in detail at the syntax for value and reference types. Conceptually, the difference is that a value type stores its value directly, whereas a reference type stores a reference to the value. Value types in C# are basically the same as simple types (integer, float, but not pointers or references) in Visual Basic or C++. Reference types are the same as reference types in Visual Basic and are similar to types accessed through pointers in C++.
These types are stored in different places in memory; value types are stored in an area known as the stack, and reference types are stored in an area known as the managed heap. It is important to be aware of whether a type is a value type or a reference type because of the different effect each assignment has. For example, int is a value type, which means that the following statement will result in two locations in memory storing the value 20:
// i and j are both of type int
i = 20;
j = i;
However, consider the following code. For this code, assume that you have defined a class called Vector. Assume that Vector is a reference type and has an int member variable called Value:
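A minimal sketch of such code, assuming a hypothetical Vector class with a single public int field named Value (the class body and the ReferenceDemo wrapper are illustrative assumptions, not the original listing), might look like this:

```csharp
using System;

// Hypothetical class for illustration: a reference type with one int field.
class Vector
{
    public int Value;
}

class ReferenceDemo
{
    static void Main()
    {
        Vector x, y;                  // declares two references; no object exists yet
        x = new Vector();             // creates the single Vector object
        x.Value = 30;
        y = x;                        // y now refers to the same object as x
        Console.WriteLine(y.Value);   // displays 30
        y.Value = 50;                 // changes the one shared object
        Console.WriteLine(x.Value);   // displays 50
    }
}
```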
The crucial point to understand is that after executing this code, there is only one Vector object around. x and y both point to the memory location that contains this object. Because x and y are variables of a reference type, declaring each variable simply reserves a reference; it doesn’t instantiate an object of the given type. This is the same as declaring a pointer in C++ or an object reference in Visual Basic. In neither case is an object actually created. In order to create an object, you have to use the new keyword, as shown. Because x and y refer to the same object, changes made through x will affect y and vice versa. Hence the code will display 30 then 50.
C++ developers should note that this syntax is like a reference, not a pointer. We use the . notation, not ->, to access object members. Syntactically, C# references look more like C++ reference variables. However, behind the superficial syntax, the real similarity is with C++ pointers.
If a variable is a reference, it is possible to indicate that it does not refer to any object by setting its value to null:
y = null;
This is the same as setting a reference to null in Java, a pointer to NULL in C++, or an object reference in Visual Basic to Nothing. If a reference is set to null, then clearly it is not possible to call any nonstatic member functions or access any nonstatic fields through it; doing so would cause an exception to be thrown at runtime.
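The runtime failure described above can be sketched as follows. The Vector class and the ReadValue helper are hypothetical names introduced only for this illustration; the exception type caught is System.NullReferenceException.

```csharp
using System;

// Hypothetical class for illustration.
class Vector
{
    public int Value;
}

class NullDemo
{
    // Reads the field, returning a message instead of letting the exception escape.
    public static string ReadValue(Vector v)
    {
        try
        {
            return v.Value.ToString();
        }
        catch (NullReferenceException)
        {
            return "no object";
        }
    }

    static void Main()
    {
        Vector y = new Vector();
        y.Value = 30;
        Console.WriteLine(ReadValue(y));   // 30

        y = null;                          // y no longer refers to any object
        Console.WriteLine(ReadValue(y));   // no object
    }
}
```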
In languages like C++, the developer can choose whether a given value is to be accessed directly or via a pointer. Visual Basic is more restrictive, taking the view that COM objects are reference types and simple types are always value types. C# is similar to Visual Basic in this regard: whether a variable is a value or reference is determined solely by its data type, so int, for example, is always a value type. It is not possible to declare an int variable as a reference (although in Chapter 6, “Operators and Casts,” which covers boxing, you see it is possible to wrap value types in references of type object).
In C#, basic data types like bool and long are value types. This means that if you declare a bool variable and assign it the value of another bool variable, you will have two separate bool values in memory. Later, if you change the value of the original bool variable, the value of the second bool variable does not change. These types are copied by value.
In contrast, most of the more complex C# data types, including classes that you yourself declare, are reference types. They are allocated upon the heap, have lifetimes that can span multiple function calls, and can be accessed through one or several aliases. The Common Language Runtime (CLR) implements an elaborate algorithm to track which reference variables are still reachable and which have been orphaned. Periodically, the CLR will destroy orphaned objects and return the memory that they once occupied back to the operating system. This is done by the garbage collector.
C# has been designed this way because high performance is best served by keeping primitive types (like int and bool) as value types and larger types that contain many fields (as is usually the case with classes) as reference types. If you want to define your own type as a value type, you should declare it as a struct.
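As a rough illustration of the difference between the two kinds of user-defined type, the following sketch declares one struct and one class with the same field and copies each. The type names PointValue and PointReference are hypothetical.

```csharp
using System;

// Hypothetical types for illustration: same field, different category.
struct PointValue
{
    public int X;       // part of a value type
}

class PointReference
{
    public int X;       // part of a reference type
}

class StructDemo
{
    static void Main()
    {
        PointValue a = new PointValue();
        a.X = 1;
        PointValue b = a;          // copies the whole value
        b.X = 2;                   // changes only the copy
        Console.WriteLine(a.X);    // 1

        PointReference c = new PointReference();
        c.X = 1;
        PointReference d = c;      // copies only the reference
        d.X = 2;                   // changes the shared object
        Console.WriteLine(c.X);    // 2
    }
}
```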
The basic predefined types recognized by C# are not intrinsic to the language but are part of the .NET Framework. For example, when you declare an int in C#, what you are actually declaring is an instance of a .NET struct, System.Int32. This may sound like a small point, but it has a profound significance: it means that you are able to treat all the primitive data types syntactically as if they were classes that supported certain methods. For example, to convert an int i to a string, you can write:
string s = i.ToString();
It should be emphasized that, behind this syntactical convenience, the types really are stored as primitive types, so there is absolutely no performance cost associated with the idea that the primitive types are notionally represented by .NET structs.
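A short sketch of this equivalence follows; the Int32Demo class name is an assumption made for the example. It shows that int and System.Int32 name the same type and that members such as ToString() and MaxValue are available on it.

```csharp
using System;

class Int32Demo
{
    static void Main()
    {
        int i = 42;
        string s = i.ToString();                           // method call on a primitive
        Console.WriteLine(s);                              // 42

        // int is the C# keyword for the .NET struct System.Int32.
        Console.WriteLine(typeof(int) == typeof(Int32));   // True
        Console.WriteLine(int.MaxValue);                   // 2147483647
    }
}
```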
The following sections review the types that are recognized as built-in types in C#. Each type is listed, along with its definition and the name of the corresponding .NET type (CTS type). C# has 15 predefined types: 13 value types and 2 reference types (string and object).
Predefined Value Types
The built-in value types represent primitives, such as integer and floating-point numbers, character, and Boolean types.
C# supports eight predefined integer types, shown in the following table.
Future versions of Windows will target 64-bit processors, which can move bits into and out of memory in larger chunks to achieve faster processing times. Consequently, C# supports a rich palette of signed and unsigned integer types ranging in size from 8 to 64 bits.
Many of these type names will be new to programmers experienced in Visual Basic. C++ and Java developers should be careful; some C# types have the same names as C++ and Java types but have different definitions. For example, in C#, an int is always a 32-bit signed integer. In C++, an int is a signed integer, but the number of bits is platform-dependent (32 bits on Windows). In C#, all data types have been defined in a platform-independent manner to allow for the possible future porting of C# and .NET to other platforms.
A byte is the standard 8-bit type for values in the range 0 to 255 inclusive. Be aware that, in keeping with its emphasis on type safety, C# regards the byte type and the char type as completely distinct, and any programmatic conversions between the two must be explicitly requested. Also be aware that unlike the other types in the integer family, a byte type is by default unsigned. Its signed version bears the special name sbyte.
With .NET, a short is no longer quite so short; it is now 16 bits long. The int type is 32 bits long. The long type reserves 64 bits for values. All integer type variables can be assigned values in decimal or in hex notation. The latter requires the 0x prefix:
long x = 0x12ab;
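The sizes and ranges mentioned above can be checked with a short sketch like the following. The IntegerDemo class name is an assumption; sizeof and the MinValue and MaxValue constants are standard for the predefined integer types.

```csharp
using System;

class IntegerDemo
{
    static void Main()
    {
        // Sizes in bytes of some predefined integer types.
        Console.WriteLine(sizeof(byte));    // 1
        Console.WriteLine(sizeof(short));   // 2
        Console.WriteLine(sizeof(int));     // 4
        Console.WriteLine(sizeof(long));    // 8

        // byte is unsigned; sbyte is its signed counterpart.
        Console.WriteLine(byte.MinValue);   // 0
        Console.WriteLine(byte.MaxValue);   // 255
        Console.WriteLine(sbyte.MinValue);  // -128
        Console.WriteLine(sbyte.MaxValue);  // 127

        // Hex notation uses the 0x prefix.
        long x = 0x12ab;
        Console.WriteLine(x);               // 4779
    }
}
```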
If there is any ambiguity about whether an integer is int, uint, long, or ulong, it will default to an int. To specify which of the other integer types the value should take, you can append one of the following characters to the number:
uint ui = 1234U;
long l = 1234L;
ulong ul = 1234UL;
You can also use lowercase u and l, although the latter could be confused with the integer 1 (one).
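To see which type each suffix selects, one option is to ask the literal for its runtime type, as in this sketch (the SuffixDemo class name is an assumption):

```csharp
using System;

class SuffixDemo
{
    static void Main()
    {
        // GetType() reports the .NET type the compiler chose for each literal.
        Console.WriteLine((1234).GetType().Name);     // Int32
        Console.WriteLine((1234U).GetType().Name);    // UInt32
        Console.WriteLine((1234L).GetType().Name);    // Int64
        Console.WriteLine((1234UL).GetType().Name);   // UInt64
    }
}
```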
Although C# provides a plethora of integer data types, it supports floating-point types as well. They will be familiar to C and C++ programmers.
The float data type is for smaller floating-point values, for which less precision is required. The double data type is bulkier than the float data type but offers twice the precision (15 digits).
If you hard-code a non-integer number (such as 12.3) in your code, the compiler will normally assume that you want the number interpreted as a double. If you want to specify that the value is a float, you append the character F (or f) to it:
float f = 12.3F;
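The difference in size and precision between the two types can be sketched as follows (the FloatDemo class name is an assumption). The comparison is false because 12.3 cannot be represented exactly in binary, and the float keeps fewer bits of it than the double does.

```csharp
using System;

class FloatDemo
{
    static void Main()
    {
        float f = 12.3F;     // F suffix: the literal is a float
        double d = 12.3;     // no suffix: the literal is a double
        // float g = 12.3;   // would not compile: no implicit double-to-float conversion

        Console.WriteLine(sizeof(float));      // 4
        Console.WriteLine(sizeof(double));     // 8

        // Widening the float does not recover the precision it never had.
        Console.WriteLine((double)f == d);     // False
    }
}
```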
The Decimal Type
The decimal type represents higher-precision floating-point numbers, as shown in the following table.
One of the great things about the CTS and C# is the provision of a dedicated decimal type for financial calculations. How you use the 28 digits that the decimal type provides is up to you. In other words, you can track smaller dollar amounts with greater accuracy for cents or larger dollar amounts with more rounding in the fractional area. Bear in mind, however, that decimal is not implemented under the hood as a primitive type, so using decimal will have a performance effect on your calculations.
To specify that your number is a decimal type rather than a double, float, or an integer, you can append the M (or m) character to the value, as shown in the following example:
decimal d = 12.30M;
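One reason decimal suits financial calculations is that decimal fractions such as 0.1 are represented exactly, which is not the case for double. A small sketch, with the DecimalDemo class name assumed for the example:

```csharp
using System;

class DecimalDemo
{
    static void Main()
    {
        decimal dm = 0.1M + 0.2M;
        Console.WriteLine(dm == 0.3M);   // True: decimal stores these fractions exactly

        double db = 0.1 + 0.2;
        Console.WriteLine(db == 0.3);    // False: binary floating point only approximates them

        // decimal bad = 12.30;          // would not compile: the literal is a double
        decimal d = 12.30M;              // M suffix makes the literal a decimal
        Console.WriteLine(d == 12.3M);   // True: trailing zeros do not affect equality
    }
}
```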
The Boolean Type
The C# bool type is used to contain Boolean values of either true or false.
You cannot implicitly convert bool values to and from integer values. If a variable (or a function return type) is declared as a bool, you can only use values of true and false. You will get an error if you try to use zero for false and a non-zero value for true, as is possible to do in C++.
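In practice this means that an integer test has to be written as an explicit comparison. A minimal sketch, in which the BoolDemo class and IsZero helper are hypothetical names:

```csharp
using System;

class BoolDemo
{
    // An explicit comparison produces the bool; there is no implicit int-to-bool conversion.
    public static bool IsZero(int n)
    {
        return n == 0;
    }

    static void Main()
    {
        int count = 0;
        // if (count) { }     // would not compile: int is not implicitly convertible to bool
        // bool flag = 1;     // would not compile either

        bool isEmpty = IsZero(count);
        if (isEmpty)
        {
            Console.WriteLine("count is zero");
        }
    }
}
```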
The Character Type
For storing the value of a single character, C# supports the char data type.
Although this data type has a superficial resemblance to the char type provided by C and C++, there is a significant difference. A C++ char represents an 8-bit character, whereas a C# char contains 16 bits. This is part of the reason that implicit conversions between the char type and the 8-bit byte type are not permitted.
Although 8 bits may be enough to encode every character in the English language and the digits 0-9, they aren’t enough to encode every character in more expansive symbol systems (such as Chinese). In a gesture toward universality, the computer industry is moving away from the 8-bit character set and toward the 16-bit Unicode scheme, of which the ASCII encoding is a subset.
Literals of type char are signified by being enclosed in single quotation marks, for example 'A'. If you try to enclose a character in double quotation marks, the compiler will treat this as a string and report an error.
As well as representing chars as character literals, you can represent them with four-digit hex Unicode values (for example, '\u0041'), as integer values with a cast (for example, (char)65), or as hexadecimal values ('\x0041'). You can also represent them with an escape sequence, as shown in the following table.
C++ developers should note that because C# has a native string type, you don’t need to represent strings as arrays of chars.
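The char literal forms described above, and the explicit conversion that byte requires, can be sketched as follows (the CharDemo class name is an assumption):

```csharp
using System;

class CharDemo
{
    static void Main()
    {
        char a = 'A';          // character literal
        char b = '\u0041';     // four-digit hex Unicode value
        char c = (char)65;     // integer value with a cast
        char d = '\x0041';     // hexadecimal escape

        Console.WriteLine(a == b && b == c && c == d);   // True
        Console.WriteLine(sizeof(char));                 // 2 (bytes, that is, 16 bits)

        // byte e = a;         // would not compile: no implicit char-to-byte conversion
        byte e = (byte)a;      // the conversion must be requested explicitly
        Console.WriteLine(e);  // 65

        string s = "A";        // double quotation marks produce a string, not a char
        Console.WriteLine(s.Length);                     // 1
    }
}
```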