Ethernet and streaming


After reading many interesting posts on Ethernet cables and switches I thought it would be good to describe how Ethernet and other networking protocols are used in streaming services. This post does not cover the use of USB, AES or Coax cables. All the words used are factual and taken from other authors. Please feel free to correct any factual inaccuracies. The following topics are covered:

  • Streaming Architecture

  • Media Server

  • Media Renderer

  • Ethernet Data Transmission

  • Ethernet Cable, Noise and Differential Signaling

  • Ethernet Clock

  • Ethernet Switch

  • Summary


Streaming Architecture:

Network audio is based upon error-free bulk data transfer. It uses the same or similar technologies as those used to render the web page you are currently reading. Qobuz, Tidal, Roon and UPnP/DLNA have similar architectures. They all use the Ethernet, IP and TCP protocols in the same manner. They differ in their use of the upper-level protocols. Since Qobuz, Tidal and Roon are closed architectures, we will use UPnP/DLNA as an example.

Basically, a media server provides media discovery and media transport to a media renderer, which converts a binary audio file to analog audio. The following diagram depicts the UPnP/DLNA architecture and protocol suite.

Media Server:

In this example we will transport Jennifer Warnes' First We Take Manhattan from the media server to the media renderer. The file is in uncompressed 96/24 PCM format and is approximately 133 megabytes in length. For simplicity we will ignore the upper-level protocols and concentrate on Ethernet, IP and TCP.

The media server uses the TCP process to divide the file into TCP packets (formally called segments). The TCP process forwards each packet to the IP process, which encapsulates the TCP packet in an IP packet. The IP process forwards the IP packet to the Ethernet process, which encapsulates the IP packet in an Ethernet frame and transmits the frame on an Ethernet cable.

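It is common to limit the TCP/IP packet size so that each packet fits in one Ethernet frame. With the standard 1,500-byte Ethernet payload and 40 bytes of IP and TCP headers, each frame carries at most 1,460 bytes of the file, so more than 91,000 frames are required to transmit it.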
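The frame count can be checked with a few lines of arithmetic. This is only a rough sketch: the 1,500-byte payload limit and the 20-byte IP and TCP headers are assumed typical values (no header options), and the file size is rounded to exactly 133 million bytes.

```python
import math

FILE_SIZE_BYTES = 133_000_000   # "approximately 133 megabytes", rounded (assumption)
MTU = 1500                      # assumed standard Ethernet payload limit, in bytes
IP_HEADER = 20                  # assumed IPv4 header without options
TCP_HEADER = 20                 # assumed TCP header without options

# Bytes of the music file that fit in one frame after the headers are subtracted.
payload_per_frame = MTU - IP_HEADER - TCP_HEADER

# Round up: the last, partly filled frame still has to be sent.
frames_needed = math.ceil(FILE_SIZE_BYTES / payload_per_frame)

print(payload_per_frame)  # 1460
print(frames_needed)      # 91096
```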

Media Renderer:

We will now examine the renderer's processing in a little more detail. The renderer contains a CPU, system memory (RAM) and a system bus. It also contains a CPU clock which is used to synchronize operations (executing CPU instructions, moving data to and from memory, moving data onto the system bus).

Each process element (TCP, IP and Ethernet) has a designated space in system memory in which to perform its work. When data is transferred from one process element to another, it is copied or moved from one location in system memory to another. We will refer to this movement of data between the process elements as streaming.

The Ethernet process creates an Ethernet frame from the electrical signals on the cable. Once a frame has been identified, its checksum (the frame check sequence, a 32-bit CRC) is verified. If any of the bits were received in error, the entire frame is discarded. Otherwise the contents of the Ethernet frame are streamed to the IP process.
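The checksum test can be sketched as follows. This is a simplified illustration, not a real Ethernet receiver: it uses Python's zlib.crc32, which computes the same CRC-32 that Ethernet uses for its frame check sequence, but the helper names, the frame contents and the byte order of the trailer are assumptions made for the example.

```python
import struct
import zlib

def add_fcs(frame_body: bytes) -> bytes:
    """Append a 4-byte CRC-32 to the frame body, as a sender would."""
    return frame_body + struct.pack("<I", zlib.crc32(frame_body))

def fcs_is_valid(frame: bytes) -> bool:
    """Recompute the CRC-32 over the body and compare it with the trailer."""
    body, trailer = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == trailer

frame = add_fcs(b"a few bytes of audio data")   # made-up frame contents
damaged = bytearray(frame)
damaged[5] ^= 0x01                              # flip a single bit "on the wire"

print(fcs_is_valid(frame))            # True
print(fcs_is_valid(bytes(damaged)))   # False
```

A frame that fails this test is simply dropped; Ethernet itself does not ask for it to be sent again.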

The IP process extracts an IP packet from the Ethernet frame. If the packet is valid and its destination address matches the renderer's address, a TCP packet is extracted from the IP packet and streamed to the TCP process.

The TCP process verifies the packet checksum to ensure that the data has not been corrupted. If the checksum does not verify, the packet is discarded. The packet's sequence number is then compared to the next expected sequence number. If the numbers match, the contents of the packet are streamed to the next process in the process chain and the sequence number is acknowledged to the media server. If the received sequence number is less than the expected sequence number, the packet is discarded. If it is greater than the expected sequence number, the packet can be discarded or saved for later processing. The media server will automatically retransmit packets which are not acknowledged within a defined time limit.
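The sequence number rules can be modelled with a toy receiver. This is only a sketch under simplifying assumptions: real TCP sequence numbers count bytes rather than packets, the checksum step is left out, and the class and method names are invented for the illustration.

```python
class Receiver:
    """Toy model of the in-order delivery rule; not a real TCP implementation."""

    def __init__(self) -> None:
        self.expected = 0      # next sequence number we are waiting for
        self.held = {}         # out-of-order packets saved for later
        self.delivered = []    # payloads passed up the chain, in order

    def receive(self, seq: int, payload: str) -> int:
        """Process one packet and return the next expected number (the ack)."""
        if seq < self.expected:
            pass                                  # already delivered: discard
        elif seq > self.expected:
            self.held[seq] = payload              # arrived early: save for later
        else:
            self.delivered.append(payload)
            self.expected += 1
            # Release any saved packets that are now in order.
            while self.expected in self.held:
                self.delivered.append(self.held.pop(self.expected))
                self.expected += 1
        return self.expected

rx = Receiver()
arrivals = [(0, "A"), (2, "C"), (1, "B"), (1, "B"), (3, "D")]
acks = [rx.receive(seq, data) for seq, data in arrivals]

print(acks)           # [1, 1, 3, 3, 4]
print(rx.delivered)   # ['A', 'B', 'C', 'D']
```

Packet 2 arrives early and is held, the duplicate of packet 1 is discarded, and the data still reaches the next process complete and in order.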

After enough data has been accumulated in system memory the process of generating an analog signal from the digital data may begin.


 

Ethernet data transmission:

Ethernet cables transmit binary data. The only two values that can be transmitted are 0 and 1. The transmitter converts a bit value into a voltage for transmission on the wire (modulation). The receiver converts the voltage back into a bit value at the other end (demodulation). The sending and receiving stations agree on a clock rate, also known as frequency, which determines how long each ‘instance’ of voltage must be applied. This process of mapping bits to and from voltages is known as line coding. There are several standards for line coding. For this example we will use 4B5B / MLT-3, which is used by 100 Mb/s Ethernet.

4B5B is a line code used to create a data stream that is self-clocking. 4B5B maps groups of 4 bits of data onto groups of 5 bits for transmission. These 5-bit words are defined in a dictionary and are chosen to ensure that there will be sufficient transitions in the line state to produce a self-clocking signal. The receiver then maps each group of 5 received bits back into 4 data bits. The following table shows the mapping of data bits to transmitted bits and vice versa.
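The mapping can also be written out as a small lookup table in code. This is a minimal sketch: the sixteen data code groups are the standard 4B5B values, while the function names are invented here, and details such as control codes and the order in which a byte's two halves are sent are ignored.

```python
# Standard 4B5B data code groups (4 data bits -> 5 transmitted bits).
FOUR_TO_FIVE = {
    "0000": "11110", "0001": "01001", "0010": "10100", "0011": "10101",
    "0100": "01010", "0101": "01011", "0110": "01110", "0111": "01111",
    "1000": "10010", "1001": "10011", "1010": "10110", "1011": "10111",
    "1100": "11010", "1101": "11011", "1110": "11100", "1111": "11101",
}
FIVE_TO_FOUR = {five: four for four, five in FOUR_TO_FIVE.items()}

def encode_4b5b(data_bits: str) -> str:
    """Replace every group of 4 data bits with its 5-bit code group."""
    groups = [data_bits[i:i + 4] for i in range(0, len(data_bits), 4)]
    return "".join(FOUR_TO_FIVE[g] for g in groups)

def decode_4b5b(line_bits: str) -> str:
    """Replace every group of 5 received bits with the 4 data bits it stands for."""
    groups = [line_bits[i:i + 5] for i in range(0, len(line_bits), 5)]
    return "".join(FIVE_TO_FOUR[g] for g in groups)

data = "00001111"
sent = encode_4b5b(data)

print(sent)                 # 1111011101
print(decode_4b5b(sent))    # 00001111
```

Note that even the all-zero data group is sent as a word containing ones. Every code group contains at least two ones, which is what guarantees transitions on the line.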


Once the bits to be transmitted have been created, they are converted to voltages using the MLT-3 coding. MLT-3 cycles through three voltage levels: -1, 0 and +1. It moves from -1 to 0 to +1, back to 0, then back to -1, repeating indefinitely. The transmitter moves to the next voltage level in the cycle to transmit a one bit and stays at the same level to transmit a zero bit. The following table shows an example MLT-3 transfer:

The receiving station uses MLT-3 to generate a bit from a line voltage and 4B5B to generate data bits from transmitted bits.
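The MLT-3 rule in both directions can be sketched in a few lines. The function names and the assumption that the line starts at level 0 are choices made for this illustration; the input bits are the 4B5B code groups for the data 0000 1111.

```python
CYCLE = [-1, 0, +1, 0]   # the repeating MLT-3 level sequence

def mlt3_encode(bits: str, start_index: int = 1) -> list[int]:
    """Return one level per bit; start_index 1 means the line begins at level 0."""
    index = start_index
    levels = []
    for bit in bits:
        if bit == "1":
            index = (index + 1) % len(CYCLE)   # a one moves to the next level
        levels.append(CYCLE[index])            # a zero stays at the same level
    return levels

def mlt3_decode(levels: list[int], start_level: int = 0) -> str:
    """A change of level is a one; no change is a zero."""
    bits = []
    previous = start_level
    for level in levels:
        bits.append("1" if level != previous else "0")
        previous = level
    return "".join(bits)

line_bits = "1111011101"
levels = mlt3_encode(line_bits)

print(levels)                # [1, 0, -1, 0, 0, 1, 0, -1, -1, 0]
print(mlt3_decode(levels))   # 1111011101
```

The receiver never needs to know the absolute level, only whether the level changed from one bit time to the next.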


 

Ethernet cable, noise and differential signaling

Standards govern Ethernet cable construction. The physical connector on each end of the cable is an 8P8C (eight position, eight contact) modular connector, commonly called RJ45. The number of wires in the cable, the order in which they appear and their assignment to the connector pins are specified by the TIA/EIA-568 wiring standards (T568A and T568B).

Copper-based Ethernet cables bundle eight individual wires. The wires are arranged in four pairs, with the two wires of each pair twisted around each other. This creates four channels through which data can be transmitted. Ethernet cables can be shielded or unshielded. Depending on its type, a shielded cable has shielding around each pair of wires, around all four pairs, or both. A shielded cable must also be used with shielded 8P8C connectors. If the shielding becomes damaged it can act as an antenna and introduce additional electromagnetic noise from the environment.

Copper Ethernet cables are divided into categories based mainly on bandwidth (measured in MHz), maximum data rate (measured in megabits per second) and shielding. Bandwidth is a property of the cable: the highest signal frequency, in cycles per second, that it is specified to carry. 1 MHz is equal to 1 million cycles per second. Cat5 is specified up to 100 MHz (100 million cycles per second). As a general rule, the higher the category number, the better the noise reduction, the lower the attenuation and consequently the higher the bandwidth. For example, Cat6 can handle higher data rates at longer distances than Cat5 can.

Copper Ethernet cables can transmit electromagnetic noise, absorb electromagnetic noise from the environment and generate electromagnetic noise. This noise can affect the transmission of the electrical signals representing binary zeros and ones.

To increase immunity to noise, Ethernet uses differential signaling. In 10 and 100 Mb/s Ethernet, one pair of wires is used to transmit and another pair is used to receive data. The electrical signal is transmitted on both wires of the pair but in opposite polarity. The receiving end interprets the signal as the difference between the two wires that make up the differential pair, so noise that appears equally on both wires cancels out. If a bit is still not recognized properly, the entire frame will be discarded and the data eventually retransmitted.
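A small numerical sketch shows why taking the difference helps. The voltages, the noise samples and the function names are all made up for the illustration, and the noise is assumed to couple onto both wires of the pair equally, which is the ideal case rather than a measurement.

```python
def transmit(bits: str) -> list[tuple[float, float]]:
    """Drive the two wires of a pair with equal and opposite voltages."""
    pair = []
    for bit in bits:
        v = 1.0 if bit == "1" else -1.0   # illustrative voltages, not the real line code
        pair.append((+v, -v))
    return pair

def add_common_noise(pair, noise):
    """Add the same interference sample to both wires of the pair."""
    return [(a + n, b + n) for (a, b), n in zip(pair, noise)]

def receive(pair) -> str:
    """Differential receiver: only the difference between the wires matters."""
    return "".join("1" if (a - b) > 0 else "0" for a, b in pair)

def single_ended(pair) -> str:
    """For comparison: a receiver that looks at one wire only."""
    return "".join("1" if a > 0 else "0" for a, _ in pair)

bits = "10110010"
noise = [0.7, -1.9, 2.5, 0.0, -0.4, 3.1, -2.2, 1.3]   # made-up noise samples
noisy = add_common_noise(transmit(bits), noise)

print(receive(noisy))        # 10110010
print(single_ended(noisy))   # 10110101
```

The differential receiver recovers every bit, while the single-wire reading gets the last three bits wrong for the same noise.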

It should be noted that fiber optic Ethernet cables do not generate, absorb or transmit electromagnetic noise.


 

Ethernet clock

There is no separate clock signal passed on an Ethernet cable. The Ethernet specification requires that the transmitted data be self-clocking. This is accomplished through several mechanisms. The first is that each Ethernet frame begins with a predefined set of 64 bits. Fifty-six bits are the preamble, which consists of alternating one and zero values. The next 8 bits indicate the start of the frame and contain the value 10101011.
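The 64-bit start sequence can be sketched as a bit string. This is a simplified picture that works on ideal bits rather than voltages; the function name and the frame contents are invented for the example.

```python
PREAMBLE = "10" * 28    # 56 alternating bits
SFD = "10101011"        # start of frame delimiter

def frame_start(bitstream: str) -> int:
    """Return the index of the first bit after the delimiter, or -1 if not found."""
    marker = PREAMBLE + SFD
    position = bitstream.find(marker)
    return -1 if position < 0 else position + len(marker)

payload = "1100101000001111"            # made-up frame contents
stream = PREAMBLE + SFD + payload

print(len(PREAMBLE + SFD))              # 64
print(stream[frame_start(stream):])     # 1100101000001111
```

The two consecutive ones at the end of the delimiter break the alternating pattern, which is how the receiver knows the preamble is over and the frame proper begins.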

The receiving station uses the preamble to sync its clock to the transmitting station's clock. This synchronization is performed for every frame received. The receiving station maintains synchronization by recognizing at least two voltage changes over five bit times.

If synchronization is lost, the receiving station will discard bits until it synchronizes at the start of the next frame. After a frame has been transmitted, the transmitting station will wait 96 bit times (the interpacket gap) before it starts transmitting the next frame.
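To put the gap in time units, here is a short calculation, assuming the 100 Mb/s data rate used in the line coding example above. The variable names are only for illustration.

```python
BIT_RATE = 100_000_000         # bits per second (assumed 100 Mb/s Ethernet)
INTERPACKET_GAP_BITS = 96

# Duration of one bit, in whole nanoseconds.
bit_time_ns = 1_000_000_000 // BIT_RATE

# Idle time the transmitter leaves between two frames.
gap_ns = INTERPACKET_GAP_BITS * bit_time_ns

print(bit_time_ns)   # 10
print(gap_ns)        # 960
```

At this rate the pause between frames lasts just under one microsecond.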

Ethernet Switch

Ethernet switches connect multiple devices together. Devices can be cabled to the same switch or to another switch that is connected to it.

Every Ethernet-compatible device has a hard-coded physical address, called a MAC address, that the connecting switch uses to uniquely identify the device. When a switch receives an Ethernet frame, it stores the sending device's MAC address and the port it arrived on in a locally held table called a MAC address table. The switch then checks the MAC address table to see whether the destination MAC address is known. If it is, the switch forwards the frame to the known destination port. If not, the switch floods the frame to all ports except the one it arrived on.
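The learning and forwarding behaviour can be modelled with a dictionary. This is a toy sketch, not how any particular switch is implemented: the class name, the shortened MAC addresses and the four-port layout are invented for the example, and details such as table ageing are ignored.

```python
class LearningSwitch:
    """Toy model of MAC address learning and forwarding."""

    def __init__(self, port_count: int) -> None:
        self.ports = list(range(1, port_count + 1))
        self.mac_table = {}    # MAC address -> port number

    def handle_frame(self, in_port: int, src_mac: str, dst_mac: str) -> list[int]:
        """Learn the sender's port, then return the ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        # Unknown destination: send out of every port except the one it arrived on.
        return [p for p in self.ports if p != in_port]

switch = LearningSwitch(port_count=4)
first = switch.handle_frame(1, "aa:aa", "bb:bb")    # destination not yet known
reply = switch.handle_frame(2, "bb:bb", "aa:aa")    # the reply teaches the switch
second = switch.handle_frame(1, "aa:aa", "bb:bb")   # now forwarded to one port

print(first)    # [2, 3, 4]
print(reply)    # [1]
print(second)   # [2]
```

After a single exchange the switch knows where both devices are, and later frames between them no longer reach the other ports.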

Ethernet switches cannot and do not change the physical properties of the data being transmitted.

Summary

Ethernet cables only transmit two values: 0 or 1.

The bits transmitted on the cable are not the actual data but a representation of the data.

Copper Ethernet cables can transmit electromagnetic noise, absorb electromagnetic noise from the environment and generate electromagnetic noise.

Ethernet design increases immunity to noise, thus reducing transmission bit errors.

Fiber optic Ethernet cables do not transmit electromagnetic noise, absorb electromagnetic noise from the environment or generate electromagnetic noise.

There is no separate clock signal passed on an Ethernet cable; the data is self-clocking. An Ethernet receiver re-synchronizes its clock with the transmitter at the start of every frame.

Ethernet, IP and TCP operate on a block basis (frame/packet). A checksum is used to detect bit errors. If an error is detected the entire block is discarded and eventually retransmitted.

Ethernet switches cannot and do not change the physical properties of the data being transmitted.

The music file is transmitted in small discrete chunks and reassembled in the destination's system memory. The analog signal is generated from bits read from system memory, not from the Ethernet cable.