当前位置：搜档网 › Model-based delay-distortion optimization for video streaming using packet interleaving

Model-based delay-distortion optimization for video streaming using packet interleaving

Model-Based Delay-Distortion Optimization for Video Streaming

Using Packet Interleaving

Yi J.Liang??,John G.Apostolopoulos and Bernd Girod?

Streaming Media Systems Group?Information Systems Laboratory Hewlett-Packard Labs,Palo Alto,CA94304Stanford University,Stanford,CA94305

Invited Paper

ABSTRACT

Bursty channel losses have been shown to generally produce larger total mean-square error distortion in streaming video than an equivalent number of isolated losses.This paper proposes a sim-ple packet interleaving scheme to combat the effect of bursty losses by dispersing them.The optimal interleaver for minimizing the ex-pected total distortion of the decoded video,subject to a delay con-straint,is determined using a model that accurately predicts the expected distortion for different packet loss https://www.sodocs.net/doc/ed7065606.html,pared to other forms of error-resilience,packet interleaving has the ad-vantages of(1)simplicity and(2)not requiring any extra bitrate. For a simple burst loss channel,where each loss event has100ms duration,packet interleaving can provide between.24and.87dB gain over conventional non-interleaved streaming,at the expense of a delay increase of400ms.

1:Introduction

The unreliable and error-prone nature of the wireless channel is one of the major challenges for wireless video streaming.Applica-tions,such as video streaming to handheld devices with the emerg-ing Third Generation(3G)cellular system,have to cope with this lack of QoS guarantees,including packet loss.Furthermore,wire-less channels are af?icted by time-varying fading and interference conditions,which may lead to bursty packet losses[1].

In[2],we?nd that the pattern of packet loss,including the length of burst loss,has a signi?cant effect on video quality. Speci?cally,the total distortion produced by a packet loss pattern is not proportional to the average packet loss rate,as is often as-sumed.For example,the total distortion produced by a burst loss is much higher than that by an equal number of isolated losses.The quality degradation increases as the burst length increases.Burst losses are more dif?cult to conceal due to the amount of informa-tion lost.

The problem of error-resilient video communication has re-ceived signi?cant attention in recent years,and a variety of tech-niques have been proposed to combat channel losses and increase the robustness of communication[3].Examples of recent work in this area include,inter/intra-mode switching[4],[5],[6],[7], the use of Forward Error Correction(FEC)[8],dynamic control of prediction dependency using long-term memory[9],[10],[11], channel-adaptive packet scheduling[12],[13],[14],[15],and the ?This work was performed during a summer internship at HP https://www.sodocs.net/doc/ed7065606.html,e of multiple description coding and path diversity[16],[17], [18].These techniques manage and optimize the dependency across predictively coded packets,or alleviate the negative effect of temporal correlation of the channel.All of these schemes im-prove error-resilience at the expense of increased bitrate.

In this work,we explore a simple packet scheduling scheme, packet interleaving,to convert burst losses into an equivalent num-ber of isolated losses which in general are easier to recover from and produce lower total https://www.sodocs.net/doc/ed7065606.html,pared to other types of error-resilience techniques,packet interleaving provides the ad-vantages of(1)being simple,and(2)it does not require any in-crease in bitrate.A particularly appealing scenario is wireless mul-ticast or broadcast,where the channel is af?icted by fading and in-terference which may lead to burst losses,and these one-to-many applications prohibit the use of retransmission of lost packets.Fur-thermore,packet interleaving may be used in conjunction with other error-resilient techniques.A potential drawback of packet interleaving is that it requires additional delay.However,the re-quired delay,which depends on the channel burst length charac-teristics,generally can be relatively short.

It is bene?cial to compare this work versus prior work.Inter-leavers have been used for decades to improve FEC performance in channel coding by converting burst losses into isolated(random) losses spread over many FEC codewords(instead of the burst loss af?icting a single codeword)which are then easier to correct using a random-error correcting code.However,in this work we identify that from a error-resilient source coding perspective we can reduce the total distortion for video streaming by converting burst losses into isolated losses.Speci?cally,in our case the total number of losses that af?ict the video decoder is the same with or without in-terleaving,however we are changing the pattern of the losses that af?icts the video decoder.

This paper continues in Section2by reviewing the model pro-posed in[2]for accurately estimating the expected total distortion produced by each packet loss pattern.Section3presents the pro-posed packet interleaving scheme,and describes how to determine the optimal interleaver given a delay constraint and knowledge of the channel burst length behavior.Experimental results demon-strating the potential gains are presented in Section4.

2:Modeling the Effect of Packet Losses

Many of the works on packet scheduling,such as[12],[13],[15], use RD optimization techniques to improve the performance of

video communication over lossy channels.The goal of these opti-mization algorithms is to minimize the expected distortion subject to a bitrate constraint.The performance of these RD optimizations crucially depend on an accurate estimate of the distortion that re-sults for different packet loss events.In this work,the proposed packet interleaving scheme also depends on an accurate estimate of the distortion to perform the optimization.

In[2],a model was proposed which accurately estimates the expected mean-squared error distortion for different packet loss patterns.To estimate the distortion,the model explicitly considers the effect of different loss patterns,including burst losses and sep-arated(non-consecutive)losses spaced apart by a lag,and accounts for inter-frame error propagation and the correlation between error frames.

In the model,and also in this paper,we assume that each pre-dictively coded frame(P-frame)is coded into a single packet,so that the loss of a packet corresponds to the loss of an entire frame. The results can also be extended to the case when each frame is coded into multiple packets where the loss of one packet does not result in the loss of an entire frame.A simple loss concealment scheme is assumed where the lost frame is replaced by the previ-ous frame at the decoder output.

This model differs from prior models in that prior models gen-erally assumed that the total distortion is proportional to the num-ber of lost packets,and therefore prior models did not consider the effect of the speci?c packet loss pattern.However,it was shown that the total distortion produced by a packet loss pattern is not proportional to the number of lost packets.For example,the total distortion produced by a burst loss is generally much higher than that produced by an equal number of isolated losses.For a burst loss of length2,it was shown in[2]that the resulting total distor-tion is not equal to the sum of two independent losses occurring at the same locations,but it also includes a cross-correlation term be-tween the two error frames.Due to the correlation term,which is positive in most cases,the distortion resulting from the burst loss is generally greater than the sum of the distortions of two single and independent losses.The distortion as a function of burst length is illustrated in Figure1,which plots(1)the actual measured dis-tortion,(2)an estimate of the distortion assuming that the total distortion is proportional to the number of lost packets(this lin-ear relationship corresponds to a straight line in the log-log plot), and(3)the estimate of the distortion using the model proposed in [2].There are two key observations from this?gure.First,the ac-tual distortion produced by a burst loss is generally greater than an equal number of isolated losses,and hence it is not proportional to the packet loss rate.This difference is approximately1.5dB for a burst loss of length two,and increases with burst length.Second, the model of[2]is able to accurately estimate the total distortion for different loss events,e.g.within±.25dB for a burst loss of length two.

A loss model for two losses separated by a small lag is also derived in[2].In that case the total distortion is determined by two independent components,plus a correlation term between the propagated?rst error and the second error.The total distortion is a function of the lag between the two losses.With the models above,the distortion for more general loss patterns can be obtained by using those models concatenated and combined.

Fig.1.Measured versus estimated total distortion using two mod-els[2],as a function of burst loss length,normalized by total dis-tortion for a single loss.The total distortion is the sum of MSEs of all frames affected by channel errors in the recovery period,not including quantization error.

3:Delay-Distortion Optimized Packet Interleaving

The previous section described that a burst loss generally pro-duces greater total distortion than an equivalent number of isolated losses.This suggests that when communicating over a channel that exhibits burst losses,it would be bene?cial to use interleaving to convert the burst losses into an equal number of isolated losses. In this section we use the loss model from[2]to design the op-timal packet interleaving scheme that maximizes the performance (minimizes total distortion)given knowledge of the burst loss char-acteristics of the channel.Since there exist many approaches for interleaving,we begin by introducing the speci?c packet interleav-ing strategy used in this paper.

3.1:Packet Interleaver

A simple block interleaver is used at the sender to interleave the packets before transmission.Packets are?rst read into the inter-leaver in rows,with each row corresponding to a block of n pack-ets.Packets are transmitted as soon as d rows of packets?ll up,and are transmitted by columns.Here n is referred to as the block size and d is the interleaving depth of the interleaver.Fig.2shows an (n,d)interleaver.For example,consider the case when the trans-mission channel is af?icted by a burst loss of length three.If no packet re-scheduling(interleaving)is performed then packets1,2 and3would be https://www.sodocs.net/doc/ed7065606.html,ing the interleaver shown in Fig.2,packets 1,5and8would be lost if af?icted by the same loss event.Since the same burst loss now affects separated packets instead of suc-cessive packets,according to the loss model discussed in Section 2we expect the resulting distortion to be lower.To illustrate this effect,Fig.3plots the PSNRs of the Claire sequence transmitted over a bursty channel,with and without using a(9,3)interleaver, when the same packet loss realization affects both transmissions. The locations of the lost frames that result for using and not using interleaving,for the same packet loss pattern,is also illustrated in the?gure.It is observed that the simple interleaver leads to lower total distortion by converting the burst losses into isolated losses, and the quality of the reconstructed video is more uniform and also recovers faster from packet losses.Note that the total number of losses is the same in both cases,the difference is the pattern of the losses.

A simple packet interleaver permutes the locations of losses in

!"#$%&'

d n

Packet read in

Packet Fig.2.A block interleaver with block size n =4and interleaving depth d =3.

Fig.3.PSNRs of Claire sequence transmitted,with and without a (9,3)interleaver,over the channel with the same loss realization.Slices are intra updated periodically to facilitate loss recovery.

order to convert burst losses into isolated losses.The effectiveness of the interleaver depends on the block size and the interleaving depth of the interleaver,and the loss characteristics of the channel.With an interleaving depth of d ,a burst loss of length B can be converted into a shorter burst with a maximum length of B/d ,where x denotes the smallest integer not smaller than x .In an ideal case,when d ≥B ,the burst loss is converted into isolated losses.In this case,the separation between any two losses is either n or n ?1.

A larger interleaver is more effective in that it can convert a longer burst loss into isolated losses or increase the separation of the converted isolated losses.However,this is at the cost of higher latency.At the client,an interleaved packet received cannot be used until all the packets it depends on are received.For a (n,d )interleaver,the n -th packet in the original order suffers from the highest delay,which has to be transmitted in the ((n ?1)·d +1)-th place.Hence,the decoding delay corresponding to a (n,d )interleaver is (n ?1)×(d ?1),and a trade-off exists between the effectiveness in permuting the packets and the latency.It should be noted that generally a large delay is not required since,as shown in Section 4,as n ×d increases beyond a certain point,further increase in n or d does not necessarily improve the performance,i.e.a larger interleaver is not always better.Also note that the total delay here is not the typical n ×d which arises in channel coding situations,since we do not have the delay of applying FEC across the entire interleaved data.In the next subsection,we determine the optimal interleaver (n,d )under certain delay constraints.

3.2:Optimal Interleaving

We use the set K orig ={k 1,k 2,...}to denote the indices of the original lost packets when transmitted over the channel with no interleaving.With interleaving,the losses are redistributed across packets,and the loss indices are a function of the inter-leaver parameters.We use K =I (n,d,K orig )to denote the indices of the lost packets when a (n,d )interleaver is used,where I (·)is the functional representation of the interleaver (n,d ),and K orig denotes the indices of the lost packets before interleav-ing.In the example in Fig.3,where two bursts of losses occur,K orig ={17,18,19,62,63,64},while K =I (9,3,K orig )={6,15,23,57,66,74}.

The total distortion D of the decoded video sequence,which depends on the loss pattern,is a function of the lost packets K ,and hence a function of the interleaver used,(n,d ).If the chan-nel loss statistics are known (for instance,the distribution of B is known)we are able to design the optimal interleaver (n opt ,d opt )that achieves the lowest distortion given a delay constraint.The problem is formally stated as follows:given the channel loss char-acteristics,and the delay constraint C delay ,determine the optimal interleaver (n opt ,d opt ),such that the total distortion of the de-coded video sequence D [I (n,d,K orig )]is minimized,i.e.,

(n opt ,d opt )=

arg min n,d :(n ?1)×(d ?1)≤C delay D [I (n,d,K orig )].

This is a delay-distortion optimization problem.To solve for the optimal n and d ,we need to estimate the distortion that results for different loss patterns K .This is achieved using our proposed loss model discussed in Section 2.

When the characteristics of a channel are known,e.g.,the probability distribution for burst loss length B ,the distortion used above for optimal interleaver selection is the expected distortion.The optimal interleaver is selected to minimize the expected dis-tortion.3.3:Algorithm

Given an estimate of the channel loss characteristics,we can esti-mate the probability of different loss patterns and hence the associ-ated loss events K orig .For a given delay constraint C delay ,we de-termine all factorizations of n and d ,such that (n ?1)×(d ?1)≤C delay .For each set of interleaver parameters (n,d ),we calculate the indices I (n,d,K orig )of the redistributed losses.For a par-ticular loss event K =I (n,d,K orig ),we are able to estimate the corresponding total distortion,D [K ],using the loss model dis-cussed in Section 2.The estimated distortion for a particular loss event K ,and for a particular video sequence,can also be stored at the sender or streaming server for future use.

In Fig.4,we present an example of determining the optimal interleaver (n,d )for a delay constraint of 13frames,and given a simple channel loss model where each loss event corresponds to a burst loss of length 3packets (frames).The detailed experimental conditions are described in Section 4.There are 34eligible inter-leavers for this delay,including (14,2),which has a delay of 13,and (2,13),(3,7),(4,5),(5,4),(7,3),(13,2),which all have de-lays of 12,and many other factorizations (not shown in example)which have lower delays.For each of these eligible interleavers (n,d ),the total distortion is calculated using the loss model,and averaged result over multiple loss realizations.

Fig.4.Determining the optimal interleaver given a delay con-straint of13frames.Candidate interleavers include(2,13),(3,7), (4,5),(5,4),(7,3),(13,2),and(14,2),with only the block size

n labeled on the horizontal axis.The distortion reduction is calcu-lated based on the conventional case when no interleaving is used.

The reduction in total distortion by using different interleavers (only the best7out of34are shown)is plotted in Fig.4for the F oreman and Claire sequences.Results shown are given both by actual measurements and model estimation.Note that the model predicts that the(7,3)interleaver provides the maximum improvement(minimum distortion),which agrees with the mea-sured results.Intuitively,since the burst loss length is3,it is suf-?cient to have the interleaving depth d=3to convert the burst loss into separated losses,and a large block size n provides the advantage of further separating the losses.However as n further increases,d has to drop to satisfy the delay constraint.When d drops to below3,the performance degrades since a burst loss can no longer be converted into isolated losses.It is also important to note that an interleaver with higher delay does not necessarily give better performance.In this example,the(14,2)interleaver, with a delay of13,does not outperform some of the other inter-leavers with a delay of12,due to the limited factorizations and hence limited choices of(n,d).Also note that the same inter-leaver was identi?ed as optimal by both the model and the mea-surements for both the F oreman and Claire sequences.This demonstrates the high accuracy of the model-based distortion es-timation and its effectiveness in selecting the optimal interleaver to minimize the total distortion.This example also suggests that while the optimal interleaver depends on the channel burst loss characteristics,it may not strongly depend on the speci?c video sequence to be transmitted.

4:Experimental Results

This section presents experimental results to illustrate the poten-tial performance gain that may be achieved by using the proposed simple interleaving scheme for a channel that exhibits a signi?-cant amount of burst loss.In addition,we investigate the trade-off between performance gain from larger interleavers and the corre-sponding delay.We use a simple bursty channel model to illustrate the effects.We simulate that time is divided into100ms intervals, with each interval corresponding to3packets(frames)for a frame rate of30fps.Each interval may be in either a good state or a bad state.In a good state,3consecutive packets are received;while in a bad state,3are lost.Each time interval is assumed to be indepen-dent and

identically distributed(Bernoulli),with the probability that a time interval is in the bad state is0.10.The average packet Fig.5.Optimal PSNR versus delay constraint.Experimental data points(eligible interleavers)are marked with circles,and only those corresponding to optimal interleavers are marked with stars. loss rate is therefore also0.10,i.e.10%of the packets are lost.

Our primary reason for choosing this simple channel model is that it simpli?es interpretation of the results.

Video sequences are coded using JM2.0of the emerging

JVT/H.26L video compression standard.Four standard test se-

quences in QCIF format are used,Foreman,Mother-Daughter,

Salesman and Claire.Each has280frames at30fps,and each is

coded with a constant quantization level of16which produces an

average PSNR of about36dB.The?rst frame of each sequence

is intra-coded,and all subsequent frames are coded as P-frames.

Every4frames a slice is intra updated to improve error-resilience, corresponding to an intra-frame update period of N=4×9=36

frames.The distortions are obtained by averaging the results for

6random channel loss realizations shifted across the whole se-

quence,or,a total of280×6loss realizations.

For different delay constraints,all of the eligible interleavers are identi?ed and their performances are then estimated.The

PSNRs for Foreman and Claire with different interleavers as a

function of delay constraint are shown in Fig.5.Note that the

PSNRs shown in Fig.5are the averaged results for all frames,in-cluding both good and error-af?icted frames,in a sequence;while

the quantities shown in Fig.1and4are the normalized total dis-

tortion of the error-af?icted frames only.For a particular delay

constraint C delay,an optimal interleaver(n opt,d opt)is found us-ing the algorithm in Subsection3.3.Although many eligible in-

terleavers are tested and marked in the plots,only those that pro-

vide optimal performance are marked with stars.For example,

for C delay=12frames,(n opt,d opt)=(7,3)is found.For 12

It is observed from Fig.5that using interleaver(5,3)with a

delay of8frames(267ms)provides a gain of0.67dB over the

case of no interleaving for https://www.sodocs.net/doc/ed7065606.html,ing interleaver(7,3)with a delay of12frames(400ms),increases the gain to0.72dB.

For Claire sequence,gains of0.81dB and0.93dB are

achieved for delays of333and533ms,respectively.It is also

Table1.Gain in PSNR(dB)provided by the optimal interleaver for different delay constaints.

Delay Foreman Mother Salesman Claire

(frame/ms)

8/2670.670.160.320.60

12/4000.720.240.360.87

16/5330.720.240.360.93 observed that as the delay constraint exceeds16frames,the per-formance of eligible interleavers does not further increase.This is because as the interleaver becomes larger,each burst loss is sepa-rated farther apart,and the isolated losses of one burst loss come closer to the isolated losses corresponding to the next burst loss in the sequence.Also note that as the distances between isolated losses increases(and approach the intra-period)the losses begin to act as independent losses and any further increase in the spacing between losses does not lead to further reduction in total distortion.

The gains in PSNR for all four video test sequences examined in the experiments are listed in Table1,for different delay con-straints and corresponding optimal interleavers.Note that these gains are obtained without requiring any increase in bitrate.The optimal interleavers with delay of8and12frames are(5,3)and (7,3),respectively,for all sequences,which indicates the optimal interleaver’s weak dependence on the sequence.

5:Conclusions

Burst losses in packetized compressed video produce greater to-tal distortion than an equivalent number of isolated losses.This paper proposed a simple packet interleaving(or packet schedul-ing)scheme to combat bursty channel losses by converting the burst losses into isolated losses.Given knowledge of the channel burst loss characteristics,the optimal interleaver is determined for a given delay constraint,in order to minimize the total distortion as seen by the receiver.Speci?cally,the optimal interleaver is deter-mined by using an accurate model for predicting the distortion that results for different packet loss patterns.The proposed approach of packet interleaving is simple,provides an improvement without requiring any additional bit rate,and can be used in conjunction with other forms of error-resilient coding and communication.

References

[1]Theodore S.Rappaport,Wireless Communications:Princi-

ples and Practice,Prentice Hall,2nd edition,2001.

[2]Y.J.Liang,J.G.Apostolopoulos,and B.Girod,“Analysis of

packet loss for compressed video:Does burst-length matter?,”

IEEE International Conference on Acoustics,Speech,and Sig-nal Processing,ICASSP’03,submitted.

[3]S.Wenger,G.D.Knorr,J.Ott,and F.Kossentini,“Error re-

silience support in H.263+,”IEEE Trans.Circuits Syst.Video Technol.,vol.8,no.7,pp.867–877,Nov.1998.

[4]J.Y.Liao and J.D.Villasenor,“Adaptive intra update for video

coding over noisy channels,”in Proceedings IEEE Interna-

tional Conference on Image Processing,Lausanne,Switzer-land,Sept.1996,vol.3,pp.763–6.

[5]R.O.Hinds,T.N.Pappas,and J.S.Lim,“Joint block-based

video source/channel coding for packet-switched networks,”

in Proceedings of the SPIE VCIP98,San Jose,CA,Oct.1998, vol.3309,pp.124–33.

[6]K.Stuhlm¨u ller,N.F¨a rber,M.Link,and B.Girod,“Analysis

of video transmission over lossy channels,”IEEE Journal on Selected Areas in Communications,vol.18,no.6,pp.1012–32,June2000.

[7]R.Zhang,S.L.Regunathan,and K.Rose,“Video coding with

optimal inter/intra-mode switching for packet loss resilience,”

IEEE Journal on Selected Areas in Communications,vol.18, no.6,pp.966–976,June2000.

[8]W.Tan and A.Zakhor,“Video multicast using layered FEC

and scalable compression,”IEEE Trans.Circuits Syst.Video Technol.,vol.11,no.3,pp.373–87,Mar.2001.

[9]T.Wiegand,N.F¨a rber,and B.Girod,“Error-resilient video

transmission using long-term memory motion-compensated prediction,”IEEE Journal on Selected Areas in Communi-cations,vol.18,no.6,pp.1050–1062,June2000.

[10]M.Budagavi and J.D Gibson,“Multiframe video coding for

improved performance over wireless channels,”IEEE Trans-actions on Image Processing,vol.10,no.2,pp.252–265, Feb.2001.

[11]Yi J.Liang and B.Girod,“Low-latency streaming of pre-

encoded video using channel-adaptive bitstream assembly,”

in Proc.of the IEEE International Conference on Multimedia and Expo(ICME),Lausanne,Switzerland,Aug.2002,vol.1, pp.873–876.

[12]P.A.Chou and Z.Miao,“Rate-distortion optimized stream-

ing of packetized media,”IEEE Trans.Multimedia,Feb.2001, submitted.https://www.sodocs.net/doc/ed7065606.html,/?pachou.

[13]M.Kalman,E.Steinbach,and B.Girod,“R-D optimized

media streaming enhanced with adaptive media playout,”in Proc.IEEE International Conference on Multimedia,ICME-2002,Lausanne,Switzerland,Aug.2002,vol.1,pp.869–872.

[14]S.J.Wee,W.Tan,J.G.Apostolopoulos,and M.Etoh,“Opti-

mized video streaming for networks with varying delay,”in Proc.of the IEEE International Conference on Multimedia and Expo(ICME),Lausanne,Switzerland,Aug.2002,vol.2, pp.89–92.

[15]J.Chakareski,P.A.Chou,and B Aazhang,“Computing rate-

distortion optimized policies for streaming media to wireless clients,”in Proc.Data Compression Conference,Snowbird, UT,Apr.2002,pp.53–62.

[16]John G.Apostolopoulos,“Reliable video communication

over lossy packet networks using multiple state encoding and path diversity,”in Proceedings Visual Communication and Image Processing,Jan.2001,pp.392–409.

[17]S.Lin,S.Mao,Y.Wang,and S.Panwar,“A reference picture

selection scheme for video transmission over ad-hoc networks using multiple paths,”in Proc.of the IEEE International Con-ference on Multimedia and Expo(ICME),Aug.2001. [18]Y.J.Liang,E.Setton,and B.Girod,“Channel-adaptive video

streaming using packet path diversity and rate-distortion opti-mized reference picture selection,”in Proceedings of IEEE 5th Workshop on Multimedia Signal Processing,St.Thomas, Virgin Island,Dec.2002.

Model-based delay-distortion optimization for video streaming using packet interleaving

相关文档

最新文档