### Floating point variables of different lengths

• Trade-off: accuracy vs. memory space

• Recall that the computer can combine adjacent bytes in the RAM memory to form larger memory cells

• Effect of combining memory cells:

| Combination | # bits in memory cell | Capability |
|---|---|---|
| 1 byte | 8 bits | 2^8 = 256 possible patterns |
| 2 bytes | 16 bits | 2^16 = 65536 possible patterns |
| 4 bytes | 32 bits | 2^32 = 4294967296 possible patterns |
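The pattern counts in the table follow from the rule that a cell of n bits has 2^n possible patterns. The short sketch below recomputes them; the class name `BitPatterns` and the helper `patterns` are made up for this illustration.

```java
public class BitPatterns {
    // Number of distinct bit patterns in a memory cell made of the given
    // number of bytes. Only meant for 1 to 7 bytes, because the result
    // has to fit in a long.
    static long patterns(int bytes) {
        int bits = 8 * bytes;      // each byte contributes 8 bits
        return 1L << bits;         // 2^bits
    }

    public static void main(String[] args) {
        for (int bytes : new int[] {1, 2, 4}) {
            System.out.println(bytes + " byte(s): " + patterns(bytes) + " possible patterns");
        }
    }
}
```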

• Trade-off between accuracy and memory usage

 To obtain a higher accuracy (= more significant digits of accuracy), we need to combine more memory cells

• Fact:

 Arithmetic expressions that use higher accuracy take a longer time to compute (it takes longer to multiply 15 digit numbers than 2 digit numbers)

• Trade-off between accuracy and speed

 When a Java program needs to use more accurate numbers, it will not only use more memory, but it will also take a longer time to complete.

• Different kinds of floating point numbers in Java

• Java provides 2 different sizes of floating point numbers (and variables):

 Single precision floating point numbers (have lower accuracy and use less memory)
 Double precision floating point numbers (have higher accuracy and use more memory)

• This offers programmers more flexibility:

 Computations that have low accuracy requirements can use single precision to save memory space and run faster
 Computations that have high accuracy requirements must use the larger size to attain the required accuracy

• A single precision floating point variable is a variable that:

• uses 4 consecutive bytes of memory as a single 32 bit memory cell

• A single precision floating point variable can represent a floating point number:

 in the range from about −10^38 to 10^38, with about 7 decimal digits of accuracy

• A double precision floating point variable is a variable that:

• uses 8 consecutive bytes of memory as a single 64 bit memory cell

• A double precision floating point variable can represent a floating point number:

 in the range from about −10^308 to 10^308, with about 15 decimal digits of accuracy
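The sizes, ranges and accuracies quoted above can be checked from inside a Java program. The following is a minimal sketch that relies only on constants from the standard `Float` and `Double` classes (`BYTES` requires Java 8 or later); the class name `FloatSizes` is made up for this illustration.

```java
public class FloatSizes {
    public static void main(String[] args) {
        // Memory used by one variable of each type
        System.out.println("float  uses " + Float.BYTES + " bytes");    // 4
        System.out.println("double uses " + Double.BYTES + " bytes");   // 8

        // Largest finite value of each type
        System.out.println("largest float  = " + Float.MAX_VALUE);      // 3.4028235E38
        System.out.println("largest double = " + Double.MAX_VALUE);     // 1.7976931348623157E308

        // The same fraction computed at both precisions
        float  f = 1.0f / 3.0f;
        double d = 1.0 / 3.0;
        System.out.println("1/3 as float  = " + f);   // 0.33333334
        System.out.println("1/3 as double = " + d);   // 0.3333333333333333
    }
}
```

The last two lines show the accuracy difference directly: the float result is correct to about 7 digits, the double result to about 16.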

• Defining single and double precision floating point variables

• We have already learned how to define double precision floating point variables:

 ``` double variableName ; ```

• The syntax used to define single precision floating point variables is similar to the one used to define double precision floating point variables

The only difference is that we need to use a different keyword to denote single precision floating point variables

• Syntax to define single precision floating point variables:

 ``` float variableName ; ```

• Warning: float and double are considered as different types

• What determines a type:

 Each data type uses a unique data encoding method

• Because single and double precision floating point numbers use different encoding methods, Java considers them to be different types
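The two encodings can be inspected from Java itself. The sketch below prints the 32 bit pattern of a float and the 64 bit pattern of a double for the same number, using the standard methods `Float.floatToIntBits` and `Double.doubleToLongBits`. The class name `ShowEncoding` and the helpers `floatBits` and `doubleBits` are made up for this illustration.

```java
public class ShowEncoding {
    // The 32 bit pattern that encodes a float, padded with leading zeros
    static String floatBits(float value) {
        String bits = Integer.toBinaryString(Float.floatToIntBits(value));
        return String.format("%32s", bits).replace(' ', '0');
    }

    // The 64 bit pattern that encodes a double, padded with leading zeros
    static String doubleBits(double value) {
        String bits = Long.toBinaryString(Double.doubleToLongBits(value));
        return String.format("%64s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println("0.5 as float : " + floatBits(0.5f));
        System.out.println("0.5 as double: " + doubleBits(0.5));
    }
}
```

Even for a value as simple as 0.5, the two patterns differ in length and in layout, which is why Java treats float and double as different types.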


• Converting (casting) to a single or a double precision representation

• The computer has built-in machine instructions to convert between different encodings

• The Java programming language provides access to the computer's conversion operations through a number of conversion operators

• Computer jargon:

 Casting operation = a type conversion operation

• Java's Casting operators for single and double precision floating point numbers:

```
(float)   --- convert to the single precision floating point representation
(double)  --- convert to the double precision floating point representation
```

• Example: conversion sequence float ⇒ double ⇒ float

```
public class Casting01
{
   public static void main(String[] args)
   {
      float  x;   // Define single precision floating point
      double y;   // Define double precision floating point

      x = 3.1415927f;   // f denotes "float"

      y = (double) x;   // **** convert to double representation

      System.out.print("Original single precision x = ");
      System.out.println(x);
      System.out.print("Converted double precision y = ");
      System.out.println(y);

      x = (float) y;    // **** convert to float representation

      System.out.print("Re-converted single precision x = ");
      System.out.println(x);
   }
}
```

• How to run the program:

 Save the code above in a file named Casting01.java in a scratch directory
 To compile:   javac Casting01.java
 To run:          java Casting01

Output:

```
Original single precision x = 3.1415927
Converted double precision y = 3.1415927410125732
Re-converted single precision x = 3.1415927
```

Notes:

 The trailing letter f in "3.1415927f" denotes a float typed number (Yes, even numbers are typed in Java)
 Notice that the accuracy of variable x was preserved after the second conversion

• Priority level of the casting operators

• I want to emphasize that (double) and (float) are operators

• The casting operators in Java are unary operators (i.e., they have 1 operand)

Analogy:

| Unary negation operator | Casting operator |
|---|---|
| −x   (negates the value in variable x) | (float)x   (converts the value in variable x) |

• Operators in Java have a priority level

Priority level of casting operators:

| Operator | Priority | Note |
|---|---|---|
| (   ....   ) | Highest | |
| (float)   (double) | Higher | Unary operator, e.g.: (float) 3.0 |
| *   / | High | Binary operator, e.g.: 4 * 5 |
| +   - | Lowest | Binary operator, e.g.: 4 + 5 |

• When operators of different priority appear in one single arithmetic expression, then the operator with the highest priority is executed first.

 It's the same as what you have learned in Elementary School...
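The priority of the casting operator matters in practice: because a cast has a higher priority than +, it applies only to the operand directly to its right unless brackets are used. The sketch below makes this visible by reporting the type of the result. The class name `CastPriority` and the helper `typeOf` are made up for this illustration; `typeOf` works by letting Java wrap the value in a `Float` or `Double` object.

```java
public class CastPriority {
    // Illustrative helper: names the wrapper type of the value it receives
    static String typeOf(Object value) {
        return value.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        double a = 2.5;
        double b = 3.4;

        // The cast applies to a only: float + double gives a double result
        System.out.println(typeOf((float) a + b));      // Double

        // The brackets are evaluated first: the double sum is converted to float
        System.out.println(typeOf((float) (a + b)));    // Float
    }
}
```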

• Very important phenomenon in computer programming: loss of accuracy

• Consider the following example: conversion sequence double ⇒ float ⇒ double

```
public class Casting02
{
   public static void main(String[] args)
   {
      float  x;   // Define single precision floating point
      double y;   // Define double precision floating point

      y = 3.14159265358979;   // A "double" typed value

      x = (float) y;          // **** convert to float representation

      System.out.print("Original double precision y = ");
      System.out.println(y);
      System.out.print("Converted single precision x = ");
      System.out.println(x);

      y = (double) x;         // **** convert to double representation

      System.out.print("Re-converted double precision y = ");
      System.out.println(y);
   }
}
```

• How to run the program:

 Save the code above in a file named Casting02.java in a scratch directory
 To compile:   javac Casting02.java
 To run:          java Casting02

Output:

```
Original double precision y = 3.14159265358979
Converted single precision x = 3.1415927
Re-converted double precision y = 3.1415927410125732
```

Notes:

 Notice that we have lost many digits of accuracy in the double ⇒ float conversion !!!
 A floating point number without a trailing "f" belongs to the data type double

• In the previous 2 examples, we have observed the following phenomenon:

• Conversions used:
• float ⇒ double ⇒ float
• double ⇒ float ⇒ double

Observation:

 When we convert a float to a double and then back to a float, there is no loss in accuracy
 When we convert a double to a float and then back to a double, there is a high loss in accuracy

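• We can understand why we lose accuracy by looking at the direction of each conversion:

Explanation:

 The float ⇒ double ⇒ float steps start with a widening conversion: every float value can be represented exactly as a double, so accuracy is retained
 The double ⇒ float ⇒ double steps start with a narrowing conversion: a float keeps only about 7 of the double's 15 digits, so accuracy is lost and cannot be recovered by converting back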
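The two round trips can also be compared in code by testing whether the value that comes back equals the value that went in. This is a minimal sketch; the class name `RoundTrip` and the helpers `floatSurvives` and `doubleSurvives` are made up for this illustration.

```java
public class RoundTrip {
    // float => double => float: does the original value come back?
    static boolean floatSurvives(float x) {
        return (float) (double) x == x;
    }

    // double => float => double: does the original value come back?
    static boolean doubleSurvives(double y) {
        return (double) (float) y == y;
    }

    public static void main(String[] args) {
        System.out.println(floatSurvives(3.1415927f));          // true
        System.out.println(doubleSurvives(3.14159265358979));   // false
        System.out.println(doubleSurvives(0.5));                // true
    }
}
```

The last line shows that a narrowing conversion does not always lose accuracy: 0.5 needs so few digits that it fits exactly in a float. The conversion is still called unsafe because most double values do not.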

• Another important phenomenon in computer programming: Overflow condition

• When converting a higher accuracy type to a lower accuracy type, you may cause an overflow condition

• Overflow:

 Each data type can represent a certain range of values
 Overflow = storing an out-of-range value into a variable

• Example:

• A double typed variable can store a value in the range of −10^308 ... 10^308
• A float typed variable can only store a value in the range of −10^38 ... 10^38

• Here is a Java program with an overflow condition:

```
public class Overflow1
{
   public static void main(String[] args)
   {
      double d;   // range: -10^(308) .. 10^(308)
      float  f;   // range: -10^(38) .. 10^(38)

      d = 3.1415e100;   // In range of "double", out of range of "float"

      f = (float) d;    // Overflow !!!

      System.out.print("d = ");
      System.out.println(d);
      System.out.print("f = ");
      System.out.println(f);
   }
}
```

Output of this program:

```
d = 3.1415E100
f = Infinity
```

• Conclusion:

 When you convert a value from a higher accuracy type to a lower accuracy type, you may cause a significant loss of information
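A program can detect this kind of overflow by checking whether a finite double turns into an infinite float. The sketch below shows one possible check; the class name `NarrowCheck` and the helper `overflowsFloat` are made up for this illustration, while `Double.isInfinite` and `Float.isInfinite` are standard library methods.

```java
public class NarrowCheck {
    // Illustrative helper: true when d is a finite double whose
    // conversion to float overflows to +Infinity or -Infinity
    static boolean overflowsFloat(double d) {
        return !Double.isInfinite(d) && Float.isInfinite((float) d);
    }

    public static void main(String[] args) {
        System.out.println(overflowsFloat(3.1415e100));   // true:  out of range of float
        System.out.println(overflowsFloat(1.0e30));       // false: in range of float
        System.out.println(overflowsFloat(-1.0e39));      // true:  negative overflow
    }
}
```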

• Safe and unsafe conversion operations

• Safe and unsafe conversions:

 Safe conversion = a conversion from one representation (encoding) to another representation (encoding) where there is no (or very little) loss in accuracy
 Unsafe conversion = a conversion from one representation (encoding) to another representation (encoding) where there is significant loss in accuracy

• We saw in the previous example that:

 double ⇒ float is an unsafe conversion
 float ⇒ double is a safe conversion

• Expressions containing values of different types

• It is common to use different data types in the same Java program

• A very important (but rarely taught) fact about a computer:

• A computer can only operate on data of the same data type

In other words:

 A computer can add two double typed values
 A computer can subtract two double typed values
 And so on.
 A computer can add two float typed values
 A computer can subtract two float typed values
 And so on.

• A computer does not have an instruction to add (or subtract) a double typed value and a float typed value

(This has to do with the encoding method used for different types of data)

• Operations on different types of value:

• In order to perform any operation on two values of differing types, the computer must:

 Convert one of the values into the type of the other value
 Perform the operation on the two values (now of the same type)
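Written out by hand in Java, the two steps look like the second method below: first a cast, then an addition on two double values. The first method leaves the conversion to Java. This is only a sketch to show that both forms compute the same value; the class name `MixedAdd` and both method names are made up for this illustration.

```java
public class MixedAdd {
    // Mixed-type addition as a programmer would normally write it
    static double automatic(float f, double d) {
        return f + d;
    }

    // The same addition with the conversion step written out
    static double explicit(float f, double d) {
        double converted = (double) f;   // step 1: convert to the other type
        return converted + d;            // step 2: add two double values
    }

    public static void main(String[] args) {
        float  f = 2.5f;
        double d = 3.4;

        System.out.println(automatic(f, d));
        System.out.println(explicit(f, d));
        System.out.println(automatic(f, d) == explicit(f, d));   // true
    }
}
```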

• Automatic conversions: when do they occur ?

• There are 5 situations where automatic conversions will take place.

Right now, only two situations are relevant for our discussion:

 During the calculation of an arithmetic expression
 Storing the result in a variable with an assignment operator

• Automatic conversion between float types in arithmetic expressions

• It is extremely inconvenient for programmers to have to write conversion operations when values of different floating point types are used in the same expression

• Java makes writing programs less painful by providing a number of automatic floating point conversions

• Java's automatic floating point promotion:

• Arithmetic promotion of float to double:

• If either operand in a (binary) arithmetic operation is of type double, the other operand is converted to double.

In other words:

```
float + double   (automatic) ⇒   double + double
double + float   (automatic) ⇒   double + double
```

• If a float value is assigned to a double variable, the float value is converted to double.

In other words:

 ``` double variable = float value (automatic) ⇒ double variable = double value ```

• Example 1:

```
float a = 2.5f;       // The f is required: 2.5 alone is a double typed value
double b = 3.4, c;

c = a + b;   // a (float typed) is first converted
             // to a double type
             // Then the addition is performed.
```

• Example 2:

```
float a = 2.5f, b = 3.4f;
double c;

c = a + b;   // a and b are both float typed, so the addition
             // is performed on 2 float values
             // Then the float result is converted to a double
             // type and assigned to c
```
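When both operands are float typed, Java performs the addition in single precision and only converts the result when it is assigned to a double variable. The sketch below makes this visible with values chosen for the purpose: 16777216 is 2^24, and the next whole number, 16777217, cannot be represented as a float. The class name `FloatAddThenWiden` and both method names are made up for this illustration.

```java
public class FloatAddThenWiden {
    // float + float: added in single precision, then the result is widened
    static double sumInFloat(float a, float b) {
        return a + b;
    }

    // a is converted first, so b is promoted too: double + double
    static double sumInDouble(float a, float b) {
        return (double) a + b;
    }

    public static void main(String[] args) {
        float a = 16777216f;   // 2^24
        float b = 1f;

        System.out.println(sumInFloat(a, b));    // 1.6777216E7  (the + 1 is lost)
        System.out.println(sumInDouble(a, b));   // 1.6777217E7
    }
}
```

If the extra accuracy is needed, cast one of the operands to double before adding, as in the second method.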

• Quiz: can you spot what is wrong with this program?

• Java program with an error:

```
public class Caveat1
{
   public static void main(String[] args)
   {
      double a;
      float  b, c;

      a = 2.5f;
      b = 3.4f;

      c = a + b;   // Compilation error !!!
   }
}
```

• How to compile the program:

 Save the code above in a file named Caveat1.java in a scratch directory
 To compile:   javac Caveat1.java
 Compile error !!!

Can you see why the statement c = a + b will cause a compilation error ?

• Because the expression:

 ``` a + b is: double + float ```

the value in variable b is first (automatically) converted to double:

 ``` a + b is: double + float ⇒ double + double ```

So the RHS a + b will produce a result that is of the type double

• Now, the receiving variable c has the type float:

```
    c   =   a + b ;
  ^^^^^     ^^^^^^
  float     double
```

• A double typed value cannot be assigned to a float typed variable

We must use a casting operator:

 ``` c = (float) (a + b); ```

• Note:

• This solution will also work:

 ``` c = (float) a + b; ```

Because the casting operator (float) has a higher priority than the + operator, a is converted to float first, and the addition is then performed on 2 float values.
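For reference, here is a sketch of the quiz program with both repairs applied one after the other. The class name `Caveat1Fixed` is made up for this illustration; everything else follows the original program.

```java
public class Caveat1Fixed {
    public static void main(String[] args) {
        double a;
        float  b, c;

        a = 2.5f;   // the float value is automatically converted to double
        b = 3.4f;

        // Repair 1: add as double + double, then convert the sum to float
        c = (float) (a + b);
        System.out.println(c);

        // Repair 2: convert a to float first, then add as float + float
        c = (float) a + b;
        System.out.println(c);
    }
}
```

Both statements compile because the value on the right-hand side is float typed by the time it is assigned to c. For other values the two repairs may give slightly different results, since the first one rounds once after a double addition and the second one rounds a before adding.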

• The general rule for automatic type conversion in the assignment operation

• General rule in the assignment operation:

• Assignment statement:

```
 variable  =  expression ;
 ^^^^^^^^     ^^^^^^^^^^
  type1         type2
```

• The assignment operator "=" in Java performs safe conversions from type2 ⇒ type1 automatically

• In other words:

• If type1 is a higher accuracy type than type2, then:

 the type2 value is automatically converted to type1 before the assignment statement is executed. (Because the conversion was safe)

• If type1 is a lower accuracy type than type2, then:

 the assignment statement is not allowed
 You must use a casting operator to make the assignment statement valid.

This general rule is applicable when we discuss other numerical data types (like int, short, etc).

• Examples:

```
float x;
double y;

y = x;   ===> higher accuracy type = lower accuracy type
              1. the float value in x is converted to a double
              2. the (converted) value is assigned to y

x = y;   ===> lower accuracy type = higher accuracy type
              This assignment is NOT allowed (see rules above)
              (This is because the conversion is unsafe)

x = (float) y;   ===> 1. The casting operator (float) converts the double
                         value into a float value
                      2. The (converted) float value is assigned to x

y = x + y;   ===> x + y   1. the float value in x is converted to a double
                          2. then + is performed on 2 double values
                  y = double result
                          3. The result is double and can be assigned to y
                             (because y is a double typed variable)

x = x + y;   ===> x + y   1. the float value in x is converted to a double
                          2. then + is performed on 2 double values
                  x = double result
                          3. The result is double and cannot be assigned to x
                             (because x is a float typed variable)
```