Representing and manipulating information

Relevant Reading

This lecture covers material from Chapter 4 of the book.

1. How do we “see” things?#

1.2 Everything is a bit#

Each bit is 0 or 1
By encoding/interpreting sets of bits in various ways
- Computers determine what to do (instructions)
- … and represent and manipulate numbers, sets, strings, etc…
Why bits? Electronic Implementation
- Easy to store with bistable elements.
- Reliably transmitted on noisy and inaccurate wires.

Electronic representation of bits

1.3 Encoding byte values#

Byte = 8 bits
Binary: 0000 0000 to 1111 1111.
Decimal: 0 to 255.
Hexadecimal: 00 to FF.
- Base 16 number representation
- Uses the characters 0 to 9 and A to F.
Example: 15213 (decimal) = 0011 1011 0110 1101 (binary) = 3B6D (hex)

Hex	Decimal	Binary	Binary to Decimal Calculation
0	0	0000	0 * \(2^3\) + 0 * \(2^2\) + 0 * \(2^1\) + 0 * \(2^0\)
1	1	0001	0 * \(2^3\) + 0 * \(2^2\) + 0 * \(2^1\) + 1 * \(2^0\)
2	2	0010	0 * \(2^3\) + 0 * \(2^2\) + 1 * \(2^1\) + 0 * \(2^0\)
3	3	0011	0 * \(2^3\) + 0 * \(2^2\) + 1 * \(2^1\) + 1 * \(2^0\)
4	4	0100	0 * \(2^3\) + 1 * \(2^2\) + 0 * \(2^1\) + 0 * \(2^0\)
5	5	0101	0 * \(2^3\) + 1 * \(2^2\) + 0 * \(2^1\) + 1 * \(2^0\)
6	6	0110	0 * \(2^3\) + 1 * \(2^2\) + 1 * \(2^1\) + 0 * \(2^0\)
7	7	0111	0 * \(2^3\) + 1 * \(2^2\) + 1 * \(2^1\) + 1 * \(2^0\)
8	8	1000	1 * \(2^3\) + 0 * \(2^2\) + 0 * \(2^1\) + 0 * \(2^0\)
9	9	1001	1 * \(2^3\) + 0 * \(2^2\) + 0 * \(2^1\) + 1 * \(2^0\)
A	10	1010	1 * \(2^3\) + 0 * \(2^2\) + 1 * \(2^1\) + 0 * \(2^0\)
B	11	1011	1 * \(2^3\) + 0 * \(2^2\) + 1 * \(2^1\) + 1 * \(2^0\)
C	12	1100	1 * \(2^3\) + 1 * \(2^2\) + 0 * \(2^1\) + 0 * \(2^0\)
D	13	1101	1 * \(2^3\) + 1 * \(2^2\) + 0 * \(2^1\) + 1 * \(2^0\)
E	14	1110	1 * \(2^3\) + 1 * \(2^2\) + 1 * \(2^1\) + 0 * \(2^0\)
F	15	1111	1 * \(2^3\) + 1 * \(2^2\) + 1 * \(2^1\) + 1 * \(2^0\)

Google Spreadsheet demonstrating conversion process

1.4 How are data represented?#

C data type (size in bytes)	typical 32-bit	typical 64-bit	x86_64
char	1	1	1
short	2	2	2
int	4	4	4
long	4	8	8
float	4	4	4
double	8	8	8
pointer	4	8	8

2. Bit-level operations in C#

Boolean algebra was developed by George Boole in the 19th century.
Algebraic representation of logic: encode True as 1 and False as 0.
Operations: AND (&), OR (|), XOR (^), NOT (~).

A	B	A&B	A|B	A^B	~A
0	0	0	0	0	1
0	1	0	1	1	1
1	0	0	1	1	0
1	1	1	1	0	0

General Boolean algebra
- Operate on bit vectors
- Operation applied bitwise.
- All properties of boolean algebra apply.

bitwise boolean operations

Operation and notation
- Boolean operations: &, |, ^, ~.
- Shift operations:
  - Left Shift: x << y
    - Shift bit-vector x left y positions
    - Throw away extra bits on left
    - Fill with 0’s on right
  - Right Shift: x >> y
    - Shift bit-vector x right y positions
    - Throw away extra bits on right
    - Logical shift (for unsigned values)
      - Fill with 0’s on left
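    - Arithmetic shift (for signed values; in C, right-shifting a negative value is implementation-defined, but most compilers shift arithmetically)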
      - Replicate most significant bit on left
  - Undefined Behavior
    - Shift amount < 0 or ≥ word size
- Apply to any “integral” data type: long, int, short, char, unsigned
- View arguments as bit vectors.
- Arguments applied bit-wise.
- Mathematical operations:
  - Bit-wise with carry
  - \(0 + 0 = 0\)
  - \(0 + 1 = 1\)
  - \(1 + 0 = 1\)
  - \(1 + 1 = 0\) and carry \(1\) to the next bit operation (or add 1 to left of the most significant bit position)

Hands-on: bit-level operations in C

Inside your csc231, create another directory called 03-data and change into this directory.
Create a file named bitwise_demo.c with the following contents:

Compile and run bitwise_demo.c.
Confirm that the binary printouts match the corresponding decimal printouts and the expected bitwise operations.

3. Encoding integers#

3.1 Mathematical equation#

Assumption:
- \(X\) is a decimal number
- \(X\) can be represented using \(w\) bits under the form \(x_{w-1}x_{w-2}...x_{i}...x_{1}x_{0}\).
- \(x_i\) is a binary value at bit position \(i\) with \(0\leq i \leq (w - 1)\)
The mathematical equation governing the encoding from an unsigned value of \(X\) into a sequence of binary values \(x_{w-1}x_{w-2}...x_{i}...x_{1}x_{0}\) is:

\(X=\sum_{i=0}^{w-1}x_{i}*2^{i}\)

3.2 What about negative numbers?#

Approaches:
- Reserve the first bit as a sign bit (sign-magnitude)
- One’s complement: The addition of a negative number and its corresponding positive value (complement) in an N-bit binary representation will result in a binary representation that has N ones.
  - For example: in a 3-bit representation, \(2\) is represented as 010, and \(-2\) is represented as 101. Then, \(2+(-2)\) becomes \(010+101=111\).
- Two’s complement: The addition of a negative number and its corresponding positive value (complement) in an N-bit binary representation results in the binary representation of \(2^N\).
  - For example: in a 3-bit representation, \(3\) is represented as 011 and \(-3\) is represented as 101. The sum of these two binary representations is 1000, which is the binary representation of \(2^3\).
Two’s complement is preferred in modern computing design because addition, subtraction, and multiplication of signed integers can use the same bit-level operations as for unsigned integers.
The mathematical equation governing the encoding from a signed value of \(X\) into a 2’s complement sequence of binary values \(x_{w-1}x_{w-2}...x_{i}...x_{1}x_{0}\) is:

\(X=-x_{w-1} * 2^{w-1} + \sum_{i=0}^{w-2}x_{i}*2^{i}\)

For 2’s complement, most significant bit indicates sign.
- 0 for nonnegative
- 1 for negative

Unsigned	Binary	2’s complement	1’s complement
0	0000	0	0
1	0001	1	1
2	0010	2	2
3	0011	3	3
4	0100	4	4
5	0101	5	5
6	0110	6	6
7	0111	7	7
8	1000	-8	-7
9	1001	-7	-6
10	1010	-6	-5
11	1011	-5	-4
12	1100	-4	-3
13	1101	-3	-2
14	1110	-2	-1
15	1111	-1	0

Before C23, the C standard did not mandate using 2’s complement.
- But, most machines do, and we will assume so.

	Decimal	Hex	Binary
short int x	15213	3B 6D	00111011 01101101
short int y	-15213	C4 93	11000100 10010011

2’s complement examples

2’s complement representation depends on the number of bits.
Technical trick: In a \(w\)-bit representation, the 2’s complement encoding of a negative number \(-x\) is the unsigned binary representation of \(2^w - x\). For example, with 5 bits, \(-10\) is encoded as \(32 - 10 = 22\), which is 10110 in binary.
Simple example for 5-bit representation

	-16	8	4	2	1
10	0	1	0	1	0	8 + 2 = 10
-10	1	0	1	1	0	-16 + 4 + 2 = -10

Simple example for 6-bit representation

	-32	16	8	4	2	1
10	0	0	1	0	1	0	8 + 2 = 10
-10	1	1	0	1	1	0	-32 + 16 + 4 + 2 = -10

Complex example

	Decimal	Hex	Binary
short int x	15213	3B 6D	00111011 01101101
short int y	-15213	C4 93	11000100 10010011

Weight	15213		-15213
1	1	1	1	1
2	0	0	1	2
4	1	4	0	0
8	1	8	0	0
16	0	0	1	16
32	1	32	0	0
64	1	64	0	0
128	0	0	1	128
256	1	256	0	0
512	1	512	0	0
1024	0	0	1	1024
2048	1	2048	0	0
4096	1	4096	0	0
8192	1	8192	0	0
16384	0	0	1	16384
-32768	0	0	1	-32768
——	—–	—–	——	——
Sum		15213		-15213

3.3 Numeric ranges#

Unsigned values for w-bit word
- UMin = 0
- UMax = \(2^{w} - 1\)
2’s complement values for w-bit word
- TMin = \(-2^{w-1}\)
- TMax = \(2^{w-1} - 1\)
- -1: 111..1
Values for different word sizes:

	8 (1 byte)	16 (2 bytes)	32 (4 bytes)	64 (8 bytes)
UMax	255	65,535	4,294,967,295	18,446,744,073,709,551,615
TMax	127	32,767	2,147,483,647	9,223,372,036,854,775,807
TMin	-128	-32,768	-2,147,483,648	-9,223,372,036,854,775,808

Observations
- abs(TMin) = TMax + 1
  - Asymmetric range
- UMax = 2 * TMax + 1
C programming
- #include <limits.h>
- Declares constants: ULONG_MAX, LONG_MAX, LONG_MIN
- Platform specific

Challenge

Write a C program called numeric_ranges.c that prints out the values of ULONG_MAX, LONG_MAX, and LONG_MIN. Also answer the following question: If we multiply LONG_MIN by -1, what do we get?
Note: You need to search for the correct format string specifiers.

4. Conversions (casting)#

5. Addition, multiplication, and negation (of integers)#

6. Byte-oriented memory organization#

7. Fractional binary numbers (float and double)#

8. Floating operations#

Basic idea

Compute exact result.
Make it fit into desired precision.
- Possibly overflow if exponent too large
- Possibly round to fit into frac
Rounding modes

	1.40	1.60	1.50	2.50	-1.50
Towards zero	1	1	1	2	-1
Round down	1	1	1	2	-2
Round up	2	2	2	3	-1
Nearest even (default)	1	2	2	2	-2

Nearest even is the default rounding mode.
- Hard to get any other mode without dropping into assembly.
- C99 has support for rounding mode management.
All other modes are statistically biased.
- Sum of set of positive numbers will consistently be over- or under-estimated.