CIS307: Data Transmission

[Chapters 3, 4, 5 - Comer (1999)]

Computing takes place mainly in offices, factories, and homes. Local Area Networks (LANs) connect workstations in a building or campus or factory. Soon enough LANs will also connect the equipment in private homes. These islands of communication need to be interconnected: using telephone lines (provided by the common carriers), cable, and satellite communications. In time mobile computing will become commonplace and then also wearable computers will be integrated in the communication network. The information exchanged is data, voice, video. Our interest is mainly in data (digital).

Transmission Media: Copper Wire (Coaxial Cable, Twisted Pair), Glass Fiber, Microwave, Radio, Infrared. These are the media usually used in communication channels. In considering different kinds of media we are concerned, among other things, with the amount of data that can be transmitted through the medium per unit of time, the power required for the transmission, its attenuation rate, the distortion, the error rate, the cost, and the security.

Simplex/Duplex: If in a channel transmission can take place in only one direction, we say that the transmission is simplex. If the transmission can take place in both directions, but not at the same time, we say that the transmission is half-duplex, and if it can take place at the same time in both directions it is full-duplex.

Multiplexing: We often want to share a communication channel among two or more pairs of communicating entities. This can be done by Time Division Multiplexing (the channel is used in different time slots by different pairs of entities) or by Frequency Division Multiplexing (communicating pairs use different carriers that can propagate simultaneously through the channel). The device that inserts two or more signals into the communication channel is called a multiplexor, and the device that separates the signals out is called a demultiplexor.
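As an illustration of Time Division Multiplexing, here is a minimal sketch in which each stream gets one slot per round. The function names and the use of None as an idle-slot filler are assumptions made for this example, not part of any standard.

```python
from itertools import zip_longest

def tdm_multiplex(streams, idle=None):
    """Interleave the streams: in each round, stream i fills time slot i.
    A stream that has run out of data leaves its slot idle."""
    channel = []
    for slots in zip_longest(*streams, fillvalue=idle):
        channel.extend(slots)
    return channel

def tdm_demultiplex(channel, n_streams, idle=None):
    """Hand slot i of every round back to stream i, dropping idle slots."""
    return [[x for x in channel[i::n_streams] if x is not idle]
            for i in range(n_streams)]

streams = [list("AAAA"), list("BB"), list("CCC")]
channel = tdm_multiplex(streams)
recovered = tdm_demultiplex(channel, len(streams))
```

The idle slots show the cost of fixed slot assignment: the channel carries filler whenever a stream has nothing to send.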

Digital/Analog Signals: The signals we consider are usually electromagnetic waves. They propagate at the speed of light (300,000 km/s) in a vacuum or slightly slower (200,000 km/s) in materials. Signals are analog, i.e. they are represented as a continuous line, usually with a variety of levels. Digital signals are analog signals that can be approximated as having two levels, i.e. they are square waves. A communication channel, usually a local area network, where the signal is digital is called a Baseband System. If the signal instead is analog we talk of a Broadband System. [A different interpretation: baseband system = communication system with only one communication channel; broadband system = system with multiple communication channels.] A codec (coder/decoder) is a device that transforms analog signals to/from binary signals (for example, in telephony, where voice is transformed into a digital signal). For instance the conversion from analog to binary signal can be done using PCM (Pulse Code Modulation). See the Nyquist Sampling Theorem below.

Fourier Analysis: Any signal can be expressed as the sum of sinusoidal signals. When desirable, we can focus on the individual sinusoidal signal instead of the original signal. We talk of the period of the sinusoidal signal (the time it takes to complete 360 degrees of the signal), of its wavelength (the space covered by the signal during one period), and of its frequency (the number of periods that fit in a second). The period is measured in seconds, the wavelength in meters, and the frequency in Hertz. We have

    SignalVelocity = wavelength * frequency

Baud Rate: The number of changes in the signal per unit of time (second). For this rate the number of levels in the signal is immaterial.
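The relation SignalVelocity = wavelength * frequency can be checked numerically. The sketch below solves it for the wavelength; the two example frequencies are chosen only for illustration.

```python
SPEED_IN_VACUUM = 3.0e8    # m/s, the speed of light quoted in these notes
SPEED_IN_MATERIAL = 2.0e8  # m/s, the slower speed quoted for materials

def wavelength_m(velocity_m_per_s, frequency_hz):
    """wavelength = SignalVelocity / frequency"""
    return velocity_m_per_s / frequency_hz

radio = wavelength_m(SPEED_IN_VACUUM, 2.4e9)   # a 2.4 GHz radio signal
wire = wavelength_m(SPEED_IN_MATERIAL, 10e6)   # a 10 MHz signal in a cable
print(f"{radio:.3f} m, {wire:.1f} m")          # 0.125 m, 20.0 m
```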

Bandwidth: If we consider the signals that propagate through a physical medium, there will be one with highest frequency and one with lowest frequency. The difference between these two frequencies is the bandwidth of the medium. For example for phone communication we use frequencies between 300Hz and 3300Hz, for a bandwidth of 3000Hz. Normally a telephone channel is allocated 4KHz. Note that when you hear people talk of baud rate, they usually mean bandwidth.

Digital/Analog Data: Analog data is real (continuous) data, digital data is binary data. The conversion between binary data and analog signals is done by modems (modulator/demodulator). Some modems use 4 wires (2 to transmit/modulate and 2 to receive/demodulate) to connect two computers, each computer with its own modem. Most modems are dialup modems. A computer or terminal can use a dialup modem to connect to the phone network. This modem has only two wires and it has equipment for making and receiving phone calls. The two wires are used both to transmit and to receive (using different frequencies). Cable modems are used to connect a computer to a TV cable network. Data is usually transmitted at different speeds from the cable to the computer and vice versa. Data rates in the Mbps range are possible. Essentially one TV channel (6 MHz) per direction is shared by all the users connected to the cable.

Modulation: Modulation/Demodulation is the conversion between binary data and analog signals. Usually the binary data is used to modify some characteristic of a sinusoidal signal, the carrier. We can represent the binary data by modifying the amplitude of the carrier (Amplitude Modulation), or by modifying its frequency (Frequency Modulation), or its phase (Phase Shift Modulation).
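The three kinds of modulation can be sketched by generating samples of a modulated carrier. The parameter choices here (half amplitude for a 0 bit, doubled frequency for a 1 bit, a 180 degree phase shift for a 1 bit, one bit per unit of time) are assumptions made for illustration; real modems use other values and more elaborate schemes.

```python
import math

def modulate(bits, scheme, carrier_hz=2.0, samples_per_bit=8):
    """Return carrier samples for the bit sequence; each bit lasts one time unit."""
    samples = []
    for i, bit in enumerate(bits):
        amplitude, freq, phase = 1.0, carrier_hz, 0.0
        if scheme == "amplitude":
            amplitude = 1.0 if bit else 0.5       # 1 -> full, 0 -> half amplitude
        elif scheme == "frequency":
            freq = 2 * carrier_hz if bit else carrier_hz
        elif scheme == "phase":
            phase = math.pi if bit else 0.0       # 1 -> carrier shifted 180 degrees
        else:
            raise ValueError("unknown scheme: " + scheme)
        for k in range(samples_per_bit):
            t = i + k / samples_per_bit
            samples.append(amplitude * math.sin(2 * math.pi * freq * t + phase))
    return samples

am = modulate([1, 0, 1], "amplitude")
fm = modulate([1, 0, 1], "frequency")
pm = modulate([1, 0, 1], "phase")
```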

Data Rate: The number of bits transmitted per unit of time through the physical medium (also called throughput). Some examples of data rates:

    Transmission System            Data Rate               Comment
    ==============================================================================
    Telephone Twisted Pair         33.6 kbps               4 kHz telephone channel
    Cable Modem                    500 kbps up to 4 Mbps   CATV cable - shared
    ADSL - Twisted Pair            64-640 kbps out,        Coexists with phone
                                   1.536-6.144 Mbps in
    Radio LAN in 2.4 GHz band      2 Mbps                  IEEE 802.11 wireless LAN
    Ethernet - Twisted Pair        10 Mbps                 Few hundred feet
    Fast Ethernet - Twisted Pair   100 Mbps                Same
    Optical Fiber                  2.4-9.6 Gbps            Using single wavelength

Propagation Delay: The time it takes a signal to propagate from the sender to the receiver. It is the distance divided by the propagation speed of the signal in the medium (close to the speed of light).

Transmission Delay: The time it takes to transmit a message. It is the size of the message in bits divided by the data rate (measured in bps) of the channel over which the transmission takes place.

Round-Trip Delay: The time between the instant we start the transmission of a message and the instant the acknowledgement for the message is received. It is at least equal to 2*PropagationDelay + TransmissionDelay.
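The three delays can be computed directly from their definitions. This is a minimal sketch; the 1000-byte message, 200 m distance, and 10 Mbps rate are hypothetical values chosen for the example.

```python
SPEED_IN_MEDIUM = 2.0e8  # m/s, the propagation speed in materials used in these notes

def propagation_delay(distance_m, speed_m_per_s=SPEED_IN_MEDIUM):
    """Seconds for the signal to travel from sender to receiver."""
    return distance_m / speed_m_per_s

def transmission_delay(message_bits, data_rate_bps):
    """Seconds needed to push all the bits of the message onto the channel."""
    return message_bits / data_rate_bps

def min_round_trip_delay(distance_m, message_bits, data_rate_bps):
    """Lower bound only: ignores the acknowledgement's own transmission time
    and any processing or queueing delay."""
    return (2 * propagation_delay(distance_m)
            + transmission_delay(message_bits, data_rate_bps))

rtt = min_round_trip_delay(200, 8 * 1000, 10e6)
print(f"{rtt * 1e6:.1f} microseconds")   # 802.0 microseconds
```

In this example the transmission delay (800 microseconds) dominates the propagation delay (1 microsecond each way), which is typical of short local links.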

Delay-Throughput Product: It represents the number of bits in transit between the sender and the receiver. It is the product of the propagation delay and the data rate. So in Ethernet, if the sender is 200 meters away from the receiver, the propagation delay is 1 microsecond; thus, since the data rate is 10Mbps, there are 10 bits in transit: no longer at the sender, not yet at the receiver. It tells us how much data must be transmitted before the receiver starts getting it. It also has an impact on the size of the buffers used at the transmitter and receiver. The buffer size should be at least as big as the delay-throughput product, otherwise some data may get lost (if the receiver's buffer is smaller) or the communication medium (the pipe) may not be kept full (if the transmitter's buffer is smaller). Another aspect of this product: suppose that data arrives at the receiver too fast and we want to tell the sender to slow down; then we need to buffer at the receiving end at least 2*Delay-Throughput Product (what was in the pipe when the receiver decides, plus what is transmitted while the signal propagates from receiver to sender).
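The Ethernet example above can be reproduced with a short sketch. The function names are made up for this illustration, and the default speed is the 200,000 km/s figure used in these notes.

```python
def delay_throughput_product(distance_m, data_rate_bps, speed_m_per_s=2.0e8):
    """Bits in transit = propagation delay * data rate."""
    propagation_delay_s = distance_m / speed_m_per_s
    return propagation_delay_s * data_rate_bps

def flow_control_buffer_bits(distance_m, data_rate_bps):
    """Receiver buffering needed to absorb a 'slow down' request:
    what is already in the pipe plus what is sent while the request travels back."""
    return 2 * delay_throughput_product(distance_m, data_rate_bps)

in_transit = delay_throughput_product(200, 10e6)   # about 10 bits
buffer_bits = flow_control_buffer_bits(200, 10e6)  # about 20 bits
```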

Bit-Length: The length of a one-bit signal. It can be easily understood with an example. Consider a communication channel where the data rate is 10Mbps. That means that one bit is transmitted in 1/10^7 seconds (this is the time-to-transmit-one-bit). Since signals propagate in a medium at about 200,000 km/s, i.e. 2*10^8 m/s, the bit-length will be 10^-7 * 2 * 10^8 meters, that is, 20 meters.
The larger the bit-length of a channel, the slower that channel is.
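A small sketch of the bit-length calculation, using the same propagation speed as the example. It also shows why a larger bit-length means a slower channel: the bit-length is inversely proportional to the data rate.

```python
def bit_length_m(data_rate_bps, speed_m_per_s=2.0e8):
    """Meters of medium occupied by one bit = speed * time-to-transmit-one-bit."""
    return speed_m_per_s / data_rate_bps

ethernet = bit_length_m(10e6)        # 20 m at 10 Mbps
fast_ethernet = bit_length_m(100e6)  # 2 m at 100 Mbps
```

Dividing the length of a link by the bit-length gives the number of bits in transit on it, for example 200 m / 20 m = 10 bits.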

Nyquist Sampling Theorem: The maximum data rate C (in bps) of a communication channel (Channel Capacity) with bandwidth B, where we can recognize K levels in the signal, is
            C = 2*B*log2(K)

Another statement of the Sampling Theorem says that we can reconstruct a signal if we sample it at a rate of at least twice its maximum frequency, i.e. two samples per period of its highest-frequency component. The sampling theorem is the foundation for PCM since it tells how to sample the analog signal. For example, to binary encode a telephone voice channel (4KHz), by Nyquist we need 8000 samples per second. We choose to recognize 256 amplitude levels (8 bits). Thus a voice channel requires 8000 bytes per second (64Kbps), one byte every 125 microseconds.
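Both uses of the theorem can be written as small functions. This is a sketch under the assumptions stated in the text (noise-free channel, K distinguishable levels); the function names are chosen for this example.

```python
import math

def nyquist_capacity_bps(bandwidth_hz, levels):
    """C = 2 * B * log2(K) for a noise-free channel."""
    return 2 * bandwidth_hz * math.log2(levels)

def pcm_rate_bps(max_frequency_hz, levels):
    """Data rate of PCM: 2 * f_max samples per second,
    each sample coded with enough bits to name one of the levels."""
    samples_per_second = 2 * max_frequency_hz
    bits_per_sample = math.ceil(math.log2(levels))
    return samples_per_second * bits_per_sample

voice = pcm_rate_bps(4000, 256)            # 8000 samples/s * 8 bits = 64000 bps
two_level = nyquist_capacity_bps(3000, 2)  # 6000 bps on a 3000 Hz channel
sample_interval_us = 1e6 / (2 * 4000)      # 125 microseconds between samples
```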

In theory, in the absence of noise, according to Nyquist's Theorem it is possible to obtain data rates as high as desired by increasing the number of signal levels. Shannon derived a theorem (Channel Coding Theorem) that gives the maximum capacity of a noisy channel as a function of the bandwidth of the channel and of the Signal-to-Noise Ratio:

    Capacity (in bps) = 1/2 * SamplingRate * log2(1 + Signal-to-Noise-Ratio)
that is, from Nyquist's Sampling Theorem,
    Capacity (in bps) = Bandwidth * log2(1 + Signal-to-Noise-Ratio)
For example, in a phone channel with a bandwidth of 3000Hz and a Signal-to-Noise Ratio of 2000, the capacity is about 3000 * 11 = 33Kbps, since log2(2001) is approximately 11.
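The phone channel example can be checked with a few lines. Note that the Signal-to-Noise Ratio here is a plain power ratio (2000), not a value in decibels.

```python
import math

def shannon_capacity_bps(bandwidth_hz, signal_to_noise_ratio):
    """Capacity = Bandwidth * log2(1 + S/N), with S/N given as a plain ratio."""
    return bandwidth_hz * math.log2(1 + signal_to_noise_ratio)

phone = shannon_capacity_bps(3000, 2000)
print(f"{phone / 1000:.1f} kbps")   # just under 33 kbps
```

Doubling the bandwidth doubles the capacity, while doubling the Signal-to-Noise Ratio adds only about one bit per Hertz of bandwidth.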

Elementary Coding Theory: Suppose that we need to transmit a set A of symbols. We will encode the symbols of A using a code K that assigns to each symbol a a binary pattern K(a) of len(K, a) bits. If p(a) is the probability of transmitting symbol a, then the average number of bits used by K to encode a symbol of A is

	the sum of p(a)*len(K,a) for all a in A
For example, if we are using an alphabet with 4 symbols, we can encode these symbols using two bits. But if we know that the first symbol is used 90% of the time, the second 8%, and the other two each 1%, then if we encode them as 0, 10, 110, and 111 then the average length of a symbol becomes 0.9*1+0.08*2+0.01*3+0.01*3 = 1.12 bits, clearly better than 2 bits.
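The average length in this example can be computed mechanically. In the sketch below the symbol names "a" to "d" are placeholders; the prefix check is included because a variable-length code like this one can only be decoded unambiguously if no codeword is the beginning of another.

```python
def average_code_length(probabilities, codewords):
    """Sum of p(a) * len(K, a) over all symbols a."""
    return sum(p * len(codewords[a]) for a, p in probabilities.items())

def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    words = list(codewords.values())
    return not any(i != j and b.startswith(a)
                   for i, a in enumerate(words)
                   for j, b in enumerate(words))

p = {"a": 0.90, "b": 0.08, "c": 0.01, "d": 0.01}
code = {"a": "0", "b": "10", "c": "110", "d": "111"}
avg = average_code_length(p, code)   # 1.12 bits per symbol
```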

We may wonder what is the best that a code can do to minimize the average length of the code of a symbol of A. The answer is given by the Entropy of the probability distribution of the symbols of A, measured in bits when the logarithm is taken in base 2:

	the sum of p(a)*log2(1/p(a)) for all a in A
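As a sketch, the entropy of the four-symbol distribution used in the coding example (90%, 8%, 1%, 1%) can be computed as follows, taking the logarithm in base 2 so that the result is in bits.

```python
import math

def entropy_bits(probabilities):
    """Sum of p * log2(1/p); symbols with p = 0 contribute nothing."""
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)

skewed = entropy_bits([0.90, 0.08, 0.01, 0.01])   # about 0.56 bits
uniform = entropy_bits([0.25, 0.25, 0.25, 0.25])  # 2 bits
```

The skewed distribution has an entropy of about 0.56 bits per symbol, well below the 1.12 bits of the code 0, 10, 110, 111. A code that assigns a whole number of bits to each individual symbol cannot go below 1 bit per symbol, so getting closer to the entropy requires encoding several symbols together.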

RS-232C: Standard for serial data communication (data, electrical, and mechanical levels). Used to transmit character data, for example to connect a keyboard to a computer or a modem to a computer. -15 volts is used to represent 1 and +15 volts to represent 0. When the line is idle the voltage is kept at -15 volts. Each character is transmitted asynchronously (i.e. the time interval between successive characters can be of any length) but within a character bits are transmitted synchronously (i.e. at fixed time intervals). The transmission starts with a start bit (a 0) and ends with a stop bit (a 1). In between are sent 7 bits that encode the character. For example the following drawing represents the character 1100101:

 

(1)-15 ------+     +-----+-----+           +-----+     +-----+-----+ 
             |     |           |           |     |     |     
(0)+15       +-----+           +-----+-----+     +-----+     
             +Start+  1  +  1  +  0  +  0  +  1  +  0  +  1  +Stop +
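The framing shown in the drawing can be sketched in a few lines. The function names are made up for this example, and the data bits are sent in the order in which they are written in the drawing, which is an assumption of the sketch.

```python
def frame_character(data_bits):
    """Wrap 7 data bits with a start bit (0) and a stop bit (1)."""
    if len(data_bits) != 7 or any(b not in (0, 1) for b in data_bits):
        raise ValueError("expected exactly 7 bits, each 0 or 1")
    return [0] + list(data_bits) + [1]

def line_levels(bits, one_volts=-15, zero_volts=15):
    """Map each bit to its line voltage: 1 -> -15 V, 0 -> +15 V."""
    return [one_volts if b == 1 else zero_volts for b in bits]

frame = frame_character([1, 1, 0, 0, 1, 0, 1])
volts = line_levels(frame)
overhead = 2 / len(frame)   # start and stop bits: 2 of the 9 bits sent
```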
The encoding of digital data into binary signals used for RS-232C is called Non-Return-To-Zero-Level (NRZ-L). It is an encoding such that the average voltage on the line is non-zero, thus the line must be grounded. It has another defect: it is not self-synchronizing, in that if we send a long sequence of 0s (or a long sequence of 1s) it is difficult for the receiver to know exactly where each bit is located. An encoding that does not suffer from these problems is Differential Manchester encoding, which is used in token rings (IEEE 802.5); 10Mbps Ethernet (IEEE 802.3) uses the closely related Manchester encoding. In Differential Manchester there is always a level change in the middle of each bit. In addition there is a transition of level at the beginning of a bit if the bit is 0, and no transition if the bit is 1. Notice that this means that Differential Manchester is not efficient, since it requires up to twice as many level transitions as there are bits being transmitted.
    ---------+     +-----+  +--+  +--+     +--+  +-----+
             |     |     |  |  |  |  |     |  |  |     |
             +-----+     +--+  +--+  +-----+  +--+     +--------
                1     1     0     0     1     0     1
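The Differential Manchester rules can be sketched as an encoder that outputs two half-bit levels per bit, together with the matching decoder. Levels are written as 0 and 1 rather than voltages, and the level of the line before the first bit (start_level) is an assumption of the example.

```python
def differential_manchester(bits, start_level=1):
    """Encode bits as a list of half-bit levels (two per bit)."""
    level = start_level
    halves = []
    for bit in bits:
        if bit == 0:
            level ^= 1          # a 0 bit: transition at the start of the bit
        halves.append(level)    # first half of the bit
        level ^= 1              # every bit: transition in the middle
        halves.append(level)    # second half of the bit
    return halves

def decode_differential_manchester(halves, start_level=1):
    """A bit is 1 if the level did not change at its start, 0 if it did."""
    bits = []
    previous = start_level
    for i in range(0, len(halves), 2):
        bits.append(1 if halves[i] == previous else 0)
        previous = halves[i + 1]
    return bits

data = [1, 1, 0, 0, 1, 0, 1]
signal = differential_manchester(data)
```

Because the data is carried by the presence or absence of a transition rather than by the level itself, the decoder still works if the two wires are swapped and every level is inverted, as long as start_level is inverted too.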
Asynchronous and Synchronous Transmission: RS-232 is an example of asynchronous transmission, that is, of transmission where data is not transmitted continuously; instead it is sent as individual characters, where each character carries information to identify its start and its end. It is a simple and cheap way of transmitting information but it has a high overhead (in RS-232 two out of nine bits are overhead, i.e. 22%). In synchronous transmission, whether bit-oriented or character-oriented, each bit occurs at a predictable position. A transmitted block is started and terminated by a well defined delimiter (bit stuffing or byte stuffing - see next section) and in between the data is transmitted in sequence. Synchronous transmission is more complex but has lower overhead (and is thus more efficient in terms of utilization of the communication channel) than asynchronous transmission. Examples of synchronous protocols are BISYNC (byte-oriented) and HDLC (bit-oriented).