Warning: This is a development version. The latest stable version is Version 4.0.1.

Content-Aware Coding

One of the key goals of Rely is to provide low-latency loss recovery using ECC/FEC algorithms for latency-sensitive applications. To achieve this goal, we sometimes need the ECC/FEC algorithm to be “aware” of the structure of the content we are protecting.

Background

Before we get to that part though, let’s first consider how Rely works in the general case. Rely is based on a sliding window ECC/FEC algorithm. The following figure illustrates how this works for an input stream of source symbols (could be, e.g., a video stream).

../_images/rely_content_aware_background.svg

As can be seen, the coding window moves as packets are generated. Exactly how the coding window moves depends on the desired packet loss protection, the maximum latency allowed, etc. In the figure above, a single repair packet is generated for every four source symbols. This gives a repair rate of one out of every five transmitted symbols: 1 / 5 = 20%. The basic promise of the ECC/FEC algorithm is then that, as long as no more than 20% of the symbols are lost during transmission, we will be able to decode the original source symbols without the need for retransmissions. This is not totally accurate (we need to account for the loss distribution and the size of the coding window), but as a general rule of thumb we should be able to recover from most of the observed packet loss in this way.
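The arithmetic behind this rule of thumb can be written down in a few lines. The sketch below is not part of Rely; the function name `repair_rate` and its parameters are made up for illustration, and it only computes the fraction of transmitted packets that are repair packets.

```python
from fractions import Fraction


def repair_rate(source_per_window: int, repair_per_window: int) -> Fraction:
    """Fraction of all transmitted packets that are repair packets."""
    total = source_per_window + repair_per_window
    return Fraction(repair_per_window, total)


# Four source symbols followed by one repair packet, as in the figure.
rate = repair_rate(source_per_window=4, repair_per_window=1)
print(f"{float(rate):.0%}")  # 20%
```

Using `Fraction` keeps the result exact, which matters later when the rate is fed into a ceiling operation.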

This works well for general traffic like IP packets, where we just want to protect a generic network stream with packets coming from several independent applications (an example would be using Rely to increase the reliability of a VPN-type service).

However, let’s consider what can happen when we want to protect the traffic coming from a latency-sensitive application where the structure of the content is important.

Example: Video Streaming

Here we will use video streaming as an example, but in general this should hold true for most other types of content delivery where boundaries in the content are important.

The Problem

Let us consider a live video streaming service using H264 video compression. A typical frame rate for such an application is 30 fps (frames per second). This means that a frame is generated every 1/30 s ≈ 33 ms. The frames will have different sizes and therefore translate into a variable number of network packets. The reason H264 produces frames of different sizes is that the compression depends on the amount of activity in the video. If a lot is happening, the compression will be lower and the frames larger. If very little changes from frame to frame, the compression can be very efficient and the frames very small. In addition, H264 can be configured to regularly output what is called an I-frame. An I-frame contains all the information a receiver needs to start decoding the video, and is required for joining on the fly or for catching up after a loss of data. These frames tend to be quite large.

Let’s look at an example output from an H264 video encoder:

../_images/video_encoder_symbols.svg

As can be seen, each source symbol belongs to a specific frame (indicated by its color).

Let’s imagine that we configure Rely to produce one repair packet for every five input packets. This would translate into the following network flow:

../_images/rely_video_encoding_problem.svg

The problem we can observe here is that the repair for frame 0 is not scheduled until after frame 1 is produced by the H264 video encoder. This means that if one of the packets from frame 0 is lost, we have to wait about 33 ms to get repair for it.

Of course, this problem gets even worse if the repair_interval is large and the frame sizes are small. An example of this problem is illustrated here:
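To make the delay concrete, the following sketch models an encoder that emits one repair packet after every `repair_interval` source symbols and reports how long each frame waits for a repair packet that covers all of its symbols. This is a simplified model, not Rely code: the function name, the frame sizes, and the assumption that all symbols of a frame arrive at the same instant are illustrative only.

```python
def repair_wait_ms(frame_sizes, repair_interval, frame_period_ms=1000 / 30):
    """Per frame, the wait until a repair packet covers the whole frame.

    Assumes every frame has at least one symbol. Returns None for a frame
    if the stream ends before such a repair packet is scheduled.
    """
    # Index of the frame that each source symbol belongs to.
    owner = [f for f, size in enumerate(frame_sizes) for _ in range(size)]
    waits = []
    last = 0  # number of source symbols up to and including this frame
    for frame, size in enumerate(frame_sizes):
        last += size
        # Repair is emitted after symbol counts interval, 2 * interval, ...
        # The first one covering this frame is the first multiple >= last.
        trigger = -(-last // repair_interval) * repair_interval
        if trigger > len(owner):
            waits.append(None)
        else:
            trigger_frame = owner[trigger - 1]
            waits.append((trigger_frame - frame) * frame_period_ms)
    return waits


# Hypothetical frame sizes (in source symbols), repair every five symbols.
waits = repair_wait_ms([3, 4, 3], repair_interval=5)
print([round(w) for w in waits])  # [33, 33, 0]
```

In this model, frames 0 and 1 each wait one full frame period for their repair, while frame 2 happens to end exactly on a repair boundary and waits for nothing.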

../_images/worse_rely_video_encoding_problem.svg

In addition, we may observe a related problem: since repair is unaware of the content structure, repair is sometimes generated which covers only part of the content. Again, this means that if losses occur in unfortunate locations, we may have to wait additional frames in order to repair a specific frame.

This is in fact not a new problem; we will often see it as well when using traditional block codes such as Reed-Solomon.

In Rely, this problem can mostly be addressed by specifying a suitable timeout for the symbols in the encoder, see rely::encoder::configure(). In that case, rely::encoder::upcoming_flush() would indicate that the encoder has source symbols that are about to expire, and that we therefore need to run rely::encoder::flush().

Rely’s Solution

In order to avoid this problem using Rely we can ensure that repair is generated for any unprotected source symbols currently stored within the encoder.

Note

We call a source symbol unprotected if it has not yet been included in any repair symbol.

This is done by calling rely::encoder::flush_repair().

Let’s look at the effect of using this mechanism on the repair_rate, and therefore also on the bandwidth consumption.

Content smaller than the repair_interval

Let’s look at an example where we always call rely::encoder::flush_repair() after consuming a video frame:

repair_interval = 4
repair_target = 1

repair_rate = 1 / (1 + 4) = 20%

If the number of source symbols in the video frame equals 3, then the repair generated would be:

repair_generated = ceil(3 / (1 - 0.2)) - 3 = 1

The actual repair rate in this case would then be:

1 / (1 + 3) = 25%.

This is slightly larger than our target of 20%.
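The calculation above can be reproduced with a small helper. This is a sketch of the formula as written in this section, not a statement about how Rely computes it internally; the function name `flush_repair_count` is hypothetical. Exact fractions are used so that the ceiling is not affected by floating-point rounding.

```python
from fractions import Fraction
from math import ceil


def flush_repair_count(unprotected: int, repair_interval: int,
                       repair_target: int) -> int:
    """Repair symbols generated when flushing `unprotected` source symbols."""
    rate = Fraction(repair_target, repair_target + repair_interval)
    return ceil(unprotected / (1 - rate)) - unprotected


frame_symbols = 3
repair = flush_repair_count(frame_symbols, repair_interval=4, repair_target=1)
actual_rate = Fraction(repair, repair + frame_symbols)
print(repair, float(actual_rate))  # 1 0.25
```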

Content Larger or Equal to the repair_interval

If more symbols are added than the size of the repair_interval, we may encode as usual but use rely::encoder::flush_repair() to ensure that repair is also generated when the number of source symbols is not a multiple of the repair_interval.

Let’s take again an example with the same setup as before:

repair_interval = 4
repair_target = 1

repair_rate = 1 / (1 + 4) = 20%

If our input video frame covers five source symbols we will generate repair twice:

  1. First time when the repair_interval is filled.
  2. Second time after consuming the last source symbol from the video frame.

The first time we generate:

repair_generated = ceil(4 / (1 - 0.2)) - 4 = 1

The second time we generate:

repair_generated = ceil(1 / (1 - 0.2)) - 1 = 1

For the entire frame we have two repair packets generated for the five source symbols so our actual repair_rate will be:

2 / (2 + 5) ≈ 28.6%.
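The two steps can be combined into one calculation per frame. The sketch below assumes the encoder holds no unprotected symbols when the frame starts (that is, repair was flushed after the previous frame); the function name `repair_for_frame` is hypothetical and not part of Rely.

```python
def repair_for_frame(frame_symbols: int, repair_interval: int,
                     repair_target: int) -> int:
    """Repair symbols for one frame: regular repair plus a final flush."""
    full_intervals, remainder = divmod(frame_symbols, repair_interval)
    # Regular repair, generated each time the repair_interval is filled.
    repair = full_intervals * repair_target
    if remainder:
        # Flush of the leftover symbols. For an integer remainder r,
        # ceil(r / (1 - rate)) - r equals ceil(r * target / interval).
        repair += -(-remainder * repair_target // repair_interval)
    return repair


frame_symbols = 5
repair = repair_for_frame(frame_symbols, repair_interval=4, repair_target=1)
print(repair, f"{repair / (repair + frame_symbols):.1%}")  # 2 28.6%
```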

Considerations and Observations

As seen in the two examples above, we may avoid “waiting” for data by using Rely’s rely::encoder::flush_repair() functionality. However, as the examples also show, this may increase the bandwidth usage of the application.

So before adopting this as a solution, a few observations and considerations should be made.

  1. If latency is more important than bandwidth usage (and bandwidth is available), this may be a good solution.
  2. If the video bitrate is high, i.e. the source symbols of a single video frame span multiple repair_intervals, then the overhead of flushing repair on the last interval is amortized over the whole frame, and flushing may therefore be a good solution.
  3. If the incoming rate of source symbols is high (e.g. due to multiplexing of several streams), flushing repair may only rarely be needed. It may then be a good solution for the cases where the input rate temporarily drops.
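The amortization in the second observation can be illustrated numerically. The sketch below uses the same setup as the examples above (repair_interval = 4, repair_target = 1) and flushes once at the end of each frame; the function name and the chosen frame sizes are illustrative assumptions.

```python
def effective_repair_rate(frame_symbols: int, repair_interval: int = 4,
                          repair_target: int = 1) -> float:
    """Repair rate when repair is flushed once at the end of each frame."""
    full_intervals, remainder = divmod(frame_symbols, repair_interval)
    # Integer ceiling of remainder * target / interval (0 if no remainder).
    flushed = -(-remainder * repair_target // repair_interval)
    repair = full_intervals * repair_target + flushed
    return repair / (repair + frame_symbols)


# Each frame size leaves one leftover symbol, the worst case for a flush.
for frame_symbols in (1, 5, 21, 101):
    print(frame_symbols, f"{effective_repair_rate(frame_symbols):.1%}")
# 1 50.0%
# 5 28.6%
# 21 22.2%
# 101 20.5%
```

As the frame grows, the single extra repair symbol from the flush matters less and the rate approaches the 20% target.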

Implementation

From an implementation point-of-view there are multiple ways we can use rely::encoder::flush_repair():

  1. One approach is to have a timer fire and call rely::encoder::flush_repair() if you have not received any new data in the past X ms. This has the advantage of adapting to the input rate: you only flush when you need to.
  2. You can also flush for every frame added (if you know where it starts and stops). This aims for the lowest possible latency, i.e., we fully protect a frame without any delay.
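Both approaches can be sketched without the real library. In the code below, `StubEncoder` is a stand-in written for this example and is NOT the Rely API: its `consume()` method and `unprotected` counter are assumptions, and only the name `flush_repair` mirrors the function discussed above. A real integration would also need to send the generated repair packets, which the stub ignores.

```python
import time


class StubEncoder:
    """Hypothetical stand-in for an encoder; not the Rely API."""

    def __init__(self):
        self.unprotected = 0  # source symbols not yet covered by repair
        self.flushes = 0

    def consume(self, symbol: bytes) -> None:
        self.unprotected += 1

    def flush_repair(self) -> None:
        self.unprotected = 0
        self.flushes += 1


class IdleFlusher:
    """Approach 1: flush repair when no input arrived for `idle_ms`."""

    def __init__(self, encoder, idle_ms: float, clock=time.monotonic):
        self.encoder = encoder
        self.idle_s = idle_ms / 1000
        self.clock = clock
        self.last_input = clock()

    def on_symbol(self, symbol: bytes) -> None:
        self.encoder.consume(symbol)
        self.last_input = self.clock()

    def poll(self) -> bool:
        """Call periodically; returns True if a flush was triggered."""
        idle = self.clock() - self.last_input
        if self.encoder.unprotected and idle >= self.idle_s:
            self.encoder.flush_repair()
            return True
        return False


def protect_frame(encoder, frame_symbols) -> None:
    """Approach 2: flush repair right after the last symbol of a frame."""
    for symbol in frame_symbols:
        encoder.consume(symbol)
    encoder.flush_repair()


# Drive the idle flusher with a fake clock so the example is deterministic.
now = [0.0]
flusher = IdleFlusher(StubEncoder(), idle_ms=10, clock=lambda: now[0])
flusher.on_symbol(b"a")
now[0] = 0.005
print(flusher.poll())  # False, only 5 ms idle
now[0] = 0.020
print(flusher.poll())  # True, 20 ms idle with one unprotected symbol
```

The timer-based variant only pays the extra bandwidth when the input actually stalls, while `protect_frame` pays it on every frame in exchange for the lowest latency.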