Audio/Video Transport Working Group Minutes

Reported by Stephen Casner and Colin Perkins

The AVT working group met in two sessions at the 55th IETF meeting in Atlanta. In the first session, the group discussed RTP payload formats for MIDI, DTMF digits and tones, iLBC speech, ATRAC-X audio, and uncompressed video. The session ended with an important discussion of the issues to be resolved for IESG approval of Secure RTP. In the second session, the discussion focused on RTP payload formats for MPEG-4 and JVT video plus RTCP extensions for voice quality reporting and for SSM sessions, and RTP retransmission. A bonus topic on the RGL codec and payload format was squeezed in at the end.

Introduction, Document Status, and Open Issues

This meeting began with an update by Steve Casner on document publication status, including a few issues identified for documents in the queue. One RFC was published since the last meeting (RFC 3389 on Comfort Noise payload format), two are in the RFC editor queue (the MIME registration for the payload formats in the RTP profile and the SDP bandwidth modifiers for RTCP bandwidth, both blocked on the RTP specification), and seven are with the IESG. Of the latter, two are the RTP specification and A/V profile (revisions of RFC 1889 and 1890) which have been "tentatively approved". Final approval is pending preparation of a set of "RFC Editor notes" to be passed with the drafts to the RFC Editor to implement the changes requested by the IESG and the resolution of comments by the working group while the documents have been under IESG review. Steve Casner will prepare these notes for approval by the Area Directors.

Some of those RFC Editor notes implement the resolution of an issue with the RTP A/V profile that was raised just before the previous (54th) IETF meeting.
This was a request to change the sample packing order for G.726 audio encoding to be consistent with the packing order for ATM AAL2 transport as specified in ITU-T Recommendation I.366.2 Annex E. A request for comments on the proposal to make this change was sent to at least ten relevant mailing lists in IETF and ITU-T. The number of comments was surprisingly small, which indicates that there may not be many implementations of G.726 transport in RTP. However, the comments did indicate that both packing orders are in use and that there are parties opposed to making the change in addition to those who proposed the change.

The conclusion reached by the chairs in consultation with the Area Directors is that we need to define MIME subtypes for two payload formats reflecting the two packing orders. We generally prefer not to have multiple choices because of the risk of incompatibility that imposes, but we are forced into it in this case by an incompatibility that already exists. Furthermore, both packing orders are specified in separate areas of ITU-T (AAL2 and X.400 mail).

In order to make clear the incompatibility between the existing G726-* payload formats and the AAL2 packing, we will add a note to the A/V profile section that specifies those formats, pointing out the incompatibility and stating that a second set of payload formats named AAL2-G726-* will be specified in a separate document. Or, if the IESG agrees, the AAL2-G726-* specification will be added as a new section in the profile. One problem would be that the profile is to be published as a Draft Standard, which means there should first be two interoperable implementations. Alternatively, a separate draft can be produced quickly to be published as a Proposed Standard.

Flemming Andreasen asked why not make the existing G726-* names indicate the AAL2 packing and make up new names for the existing payload formats for RTP.
The primary justification for that approach would be if most implementations used the existing name to indicate a payload format with the same packing as ITU-T I.366.2. That appears not to be the case. The real issue is not the name but the interpretation of static payload type 2 which is assigned to G726-32, since most implementations are probably using the 32K rate and using the static payload type rather than the MIME name. This incompatible interpretation exists and can't be avoided. Consequently, we will deprecate the use of static payload type 2. All systems should negotiate a dynamic payload type using the MIME subtypes G726-32 or AAL2-G726-32 depending upon which packetization they want to use. A longer summary of the comments and the details of the conclusion was posted by the chairs to the AVT mailing list on November 14, just before this meeting.

Five other drafts have been submitted to the IESG but not yet accepted for publication. These include enhanced CRTP and TCRTP, the secure RTP profile, the payload format for EVRC/SMV speech, and the payload format for distributed speech recognition. Our Area Director Allison Mankin asked for some changes on ECRTP and TCRTP; revisions were submitted. Discussion of the issues for the secure RTP profile is covered later in these minutes.

Several drafts are in (extended) working group last call. The RTCP feedback profile draft was updated for this meeting to address comments from the last call, but the authors did not have time to complete a "wording cleanup" pass they want to do, so we will wait for that and give the WG a last chance to read it before passing it on to the IESG. Steve Casner asked for the feedback simulation draft to be updated and resubmitted so it can accompany the feedback profile as an Informational RFC to help convince the IESG that congestion control issues have been properly addressed. José Rey said he would try to do this.
The MPEG-4 payload format has been revised to address comments regarding the section on interleaving; those were discussed in the second AVT session.

Two drafts specify unequal error protection: the ULP and UXP FEC mechanisms. At the previous AVT meeting, Steve Casner requested that the ULP draft be changed to update and replace RFC 2733 FEC rather than extend it. The motivation is to correct an unfortunate design choice in RFC 2733 resulting in the X, P and CC bits in the RTP header not following the usual rules (these bits are the XOR of the bits in the protected packets instead) and thus requiring a special case for header validation. A new draft-ietf-avt-ulp-07.txt was submitted in response to this request, but the new design repeats all of the RTP header in the FEC payload so the overhead is too large at 7 octets. It may be possible to just insert the problem bits into the FEC header by reducing the mask size instead. This will be discussed with the authors, and others are asked to comment as well.

Finally, the SMPTE 292 video draft completed last call in October but needed a few tweaks to the security considerations and references.

Steve also mentioned one new document that is not otherwise on the agenda: draft-kreuter-avt-rtp-clearmode-00.txt, a CLEARMODE payload format that is just the same as PCMU (G.711) audio except that the bits carry ISDN data rather than audio. A question is what media type should be used in an SDP description since the bits are not necessarily audio. There is also the possibility of charter overlap with the PWE3 working group. Comments are requested.

MIDI Wire Protocol Packetization (MWPP)

Colin Perkins, sitting in for John Lazzaro, gave an update on the MIDI Wire Protocol Packetization (draft-ietf-avt-mwpp-midi-rtp-05.txt). This revision incorporates many changes reflecting WG comments (the change log itself is 2 pages).
There are about 20 open issues remaining, however; John plans a -06 revision early in the New Year to list those issues and proposed resolutions, and then a -07 revision to incorporate the consensus and be ready for working group last call.

This normative draft on the payload format is now accompanied by a new informative draft intended as an implementer's guide for MWPP. It includes a walk-through of sample coding techniques intended to help those in the MIDI community who are totally unfamiliar with RTP applications. The new draft is not finished; comments are requested on the approach and what should be added or removed.

In parallel with the document preparation, a reference implementation of MWPP in the sfront program is tracking the spec for validation. The MIDI Manufacturers Association has also provided comments and positive feedback on the MWPP work.

John has also been contacted by an IEEE WG that is forming to develop transport of MIDI directly over Ethernet (without IP). He asks whether there are any standards or work on using RTP, SDP, RTSP, and SIP in that mode. Anyone with information should let us know. One answer would be, "Don't do that."

RTP end-to-end liveness test

Henning Schulzrinne presented a topic resulting from a discussion on the mailing list: Flemming Andreasen had asked whether the RFC 2833 tones payload format could be extended to include an active end-to-end liveness test (an RTP "ping"). The purpose is to detect problems above the IP level that might be induced by NATs or firewalls; some risks are that the function could be used for DoS attacks or result in multicast implosion.

One solution, which doesn't require anything new, is to just rely on RTCP reception reports. A dummy RTP packet, perhaps with no payload, can be sent if no real traffic is being sent. RTCP already accommodates multicast scaling, although the consequence is that the RTCP response is not immediate. The delay is probably not an unreasonable wait.
Not all receivers implement RTCP, but you can distinguish that case from a problem in the RTP forward path by whether you don't get any RTCP at all or just don't get an RR indicating receipt of the RTP packet.

A second solution involves signaling (e.g., in SDP) an RTP "ping" capability, then sending a special type of RTP packet that would elicit a response packet sent to a signaled address or to the source address/port of the request packet. But this solution poses the potential for DoS and implosion problems requiring complicated solutions, some of which are already in RTCP. That's likely a killer.

Flemming favors the RTCP solution, but wants faster response in the case of success. Could the RTCP interval be reduced? Steve Casner responded that the RTCP feedback timing rules would be appropriate. Dave Oran asked why we need a dummy packet, why not just send comfort noise? Magnus Westerlund pointed out that works fine for audio sessions. For others, an empty payload may be needed. Flemming confirmed this because in some SIP scenarios early media packets can cut off ringback tone. Dave continued that this was all started by people who don't do RTCP... they should just do it! Roni Even said monitoring RTCP is important because if the other side dies, there may be no other indication that the packets go into a black hole. Maybe this just needs a hints-for-implementers document. Henning will put the discussion on his RTP web page.

RTP Payload for DTMF Digits, Tones and Signals

Henning Schulzrinne discussed draft-ietf-avt-rfc2833bis-02.txt, which updates the payload format for DTMF tones in RFC 2833. This payload format transports DTMF and other tones in the form of named events as an alternative to encoding the tone waveform with low fidelity when a high-compression codec is in use. There is also a second mode in which tones are specified by their component frequencies.
An amazing amount of email has been received with comments and requests for additions, so many people must be implementing and using this payload format. The changes from the -00 revision are:

  - Addition of a formal notion of state to clarify that signals such as on/off-hook and the ABCD bits used on T1 trunks represent sets of states out of which only one can be active. Also, the notion of soft state was added for signals that reset to default value after a period of time.
  - Clarification that events longer than the maximum duration (about 8 seconds for an 8 kHz RTP clock) can be expressed as the concatenation of multiple events.
  - Clarification of which tones can meaningfully have a volume specified.
  - Addition of a few data tones and clarification of the meaning and naming of ANS signals.

Colin Perkins expressed concern that the state additions may be introducing too much application semantics into the protocol. Henning responded that the concern is understood, but that for the few cases that exist the semantics are already fixed.

There are two open issues. The first is that some signals (in particular MF R1 signals) have acquired different names or descriptions over the decades, some of which are not even documented well by the ITU, so help is requested to supply definitive references for the complete and correct text.

The second issue is more significant. Some (potential) users of the payload format want to pass the signals required for fax setup and negotiation, but this involves a non-trivial number of bits sent as 300-baud V.21 modem data. Sending these bits as a sequence of tones is very inefficient at one symbol (tone) per packet. This could be improved in various ways, but any significant improvement would require redefining the fields of the payload format to be interpreted differently. There is a real concern that we're slipping down a dangerous slope of mission creep: this is not a signaling protocol.
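Two pieces of arithmetic from this discussion can be sketched in a few lines of Python. This is an illustrative sketch, not text from the draft. It assumes the 16-bit duration field and 4-octet event payload of RFC 2833, uncompressed IPv4/UDP/RTP headers totalling 40 octets, and one V.21 bit per symbol. The function names are hypothetical, and redundant end-of-event packets are ignored.

```python
# Sketch only: names and the header-size assumptions are illustrative.
MAX_DURATION_UNITS = 0xFFFF  # largest value of a 16-bit duration field


def max_event_seconds(clock_hz=8000):
    """Longest duration one event can express at the given RTP clock rate."""
    return MAX_DURATION_UNITS / clock_hz


def split_event(total_units):
    """Express a long event as a concatenation of shorter events."""
    segments = []
    remaining = total_units
    while remaining > 0:
        segment = min(remaining, MAX_DURATION_UNITS)
        segments.append(segment)
        remaining -= segment
    return segments


def v21_one_symbol_per_packet(baud=300, header_bytes=40, payload_bytes=4):
    """Packet rate and on-the-wire bit rate when each symbol gets its own packet."""
    packets_per_second = baud
    wire_bps = packets_per_second * (header_bytes + payload_bytes) * 8
    return packets_per_second, wire_bps


print(max_event_seconds())            # about 8.19 seconds at 8 kHz
print(split_event(20 * 8000))         # a 20-second event as three segments
print(v21_one_symbol_per_packet())    # 300 packets/s to carry 300 bit/s of data
```

Under these assumptions, carrying 300 bit/s of V.21 data costs roughly 105 kbit/s on the wire, which is the inefficiency referred to above.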
The purpose of this payload format is to convey tones with more fidelity than low-rate codecs can provide, and to allow the receiver to avoid the need to implement tone detection for some scenarios. Do we want to support a full-featured fax negotiation as a sequence of named events? Or should we say that if you want to do fax you should do T.38 or whatever else might be appropriate, and deprecate what is in RFC 2833 for V.21 now? Either extend to do the whole job in a reasonable way, or don't do it at all.

Jim Rafferty, who has participated in IP-FAX standards work, commented that T.38 has its pluses and minuses. A number of people in ITU might be interested in an RTP-based alternative to T.38, but he questioned whether it is worth doing at this point in time. Flemming Andreasen agreed that this payload format shouldn't be a new way of sending fax, but there is a strong need for it in the initial phases of call establishment (V.8, V.8bis, V.25), and most of these signals are sent using V.21. Steve Casner took off his chairman's hat to express the opinion that we should do nothing more than provide for the sending of tones. If it is feasible for some applications to send each bit of V.21 data as a tone in one RFC 2833 packet, that's fine, but we should do nothing to provide a higher-density representation. Flemming asked for a review of the code points that are included; the CI signal is there, but TM and JM are not, and might be useful.

Henning said the important point is to get this work completed, and that requires interop testing to allow advancing to Draft Standard. The number of points in the matrix is large, including features such as redundancy; plus for each codepoint the matrix needs to state what it means to be supported. One attendee indicated that the tones portion of the draft (specified by frequencies) has been implemented, but a second would be needed for interop testing.
Robert Sparks has posted an initial draft of the matrix and others have volunteered to help. They plan to gather as much interop input as possible at SIPit, but for those who are not going to be there, please send interop input to Robert Sparks (see draft-sparks-avt-2833-interop-00.txt).

An attendee asked if other forms of DTMF-represented coding can be added, e.g. some signaling supplementary services as defined by Telcordia related to voiceband data transmission. Henning replied that there is more room to add tones that fit the design of the payload format. If there is something that exists now, and preferably is already implemented since we want to get to Draft Standard, send the info: common name, succinct description, and citable reference. However, the list is intended to be extensible after the draft is published; there is an IANA registration mechanism.

Payload format for iLBC Speech

Alan Duric presented an update of two drafts on the iLBC speech codec and its associated payload format, in draft-ietf-avt-ilbc-codec-00.txt and draft-ietf-avt-rtp-ilbc-00.txt, respectively (each was preceded by two revisions as individual submissions). Extensive changes were made to the iLBC codec since the last meeting. The number of bits per frame was reduced from 416 to 399 bits to fit in 50 bytes while at the same time the quality was improved and the complexity was significantly reduced (to less than G.729a). A/B tests by the authors and by third parties confirmed the quality improvement which derives from the addition of a 57th sample in the quantized residual state and an increase in the number of bits allocated to gain (utilizing bits freed elsewhere). A demo SIP client with the iLBC codec is available by request from alan.duric@globalipsound.com.

Steve Casner asked why the 400th bit should not be used for something more than setting it to zero. Alan replied that several ideas have been proposed and that these will be sent on the mailing list.
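The frame-size arithmetic behind the "400th bit" question can be checked with a short Python sketch. The function names are hypothetical. The 30 ms frame duration used for the bit-rate figure is an assumption, since these minutes do not state the frame length.

```python
# Sketch only: function names are illustrative, not from the iLBC drafts.
def frame_layout(frame_bits):
    """Return (octets needed, unused padding bits) for one codec frame."""
    frame_bytes = (frame_bits + 7) // 8      # round up to whole octets
    pad_bits = frame_bytes * 8 - frame_bits  # bits left over in the last octet
    return frame_bytes, pad_bits


def payload_bitrate(frame_bytes, frame_ms):
    """Payload bit rate in bit/s for one frame of frame_bytes every frame_ms."""
    return frame_bytes * 8 * 1000 / frame_ms


print(frame_layout(416))  # previous frame size: whole octets, no spare bit
print(frame_layout(399))  # new frame size: 50 octets with one spare bit

# ASSUMPTION: 30 ms frames; the frame duration is not given in these minutes.
print(payload_bitrate(50, 30))
```

The single padding bit reported for the 399-bit frame is the bit Steve Casner asked about.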
Steve also commented that the codec seems to be still changing a lot. We don't want to progress this until the codec format has stabilized. Alan responded that no further changes are expected on the codec itself. This round of changes completes the work on reduction of frame size and complexity as planned. Plans for a 20ms frame option may be dropped because the need does not appear strong. Comments on that are requested. Work on voice activity detection is ongoing; this may be paired with the RFC 3389 Comfort Noise. That work is expected to be completed in time for interop testing planned for the next SIPit in February.

Steve asked whether the sorting of bits for ULP is intended to be applied across frames, because the payload format draft is not clear on this. The answer is yes. Steve said that is appropriate (otherwise the sorting does no good), but it is a lot of work which gives no advantage in environments without ULP at lower layers. It may be necessary to allow both sorted and unsorted modes as in the AMR codec. We'd like feedback from implementers about the cost and utility of the ULP sorting.

Alan asked about the possibility of adding another document giving the qualification criteria for the codec. Steve replied that this would need to be standards track to be effective, but the status even of standardizing the codec itself is still not entirely clear. Generally IETF avoids conformance testing.

Stephan Wenger asked when the general issue of standardizing media codecs in IETF will be resolved. Steve Casner replied that, although the Transport ADs were consulted and were in favor of this work before it started, we won't know the answer for sure until the work is submitted to the whole IESG for approval.

RTP Payload Format for ATRAC-X

Matthew Romaine presented a new payload format for ATRAC-X audio in draft-hatanaka-avt-rtp-atracx-00.txt. Sony's ATRAC family of perceptual codecs is used in MDs and solid-state recorders.
The -X version supports multiple channels in a wide range of data rates from 8kbps to 1.4Mbps. The payload format supports multiplexing of multiple streams and metadata within a single session, redundant data to mitigate packet loss, and fragmentation. The draft details the segmentation of streams into segments and the association of segments from different streams in the same time slot.

Two open issues were identified; the first was how to manage the allocation of metadata identifiers. Some appropriate body could assign static identifiers, as is done in MIDI, or the assignments could be a dynamic free-for-all. There was no input on this.

The second issue is the determination of the RTP timestamp: the draft currently specifies transmit time, but it has already been pointed out that a presentation (sampling) timestamp is needed to allow synchronization with other streams. The problem is that a single session might carry multiple sampling rates. Steve Casner offered the example of MPEG audio in which the timestamp clock rate is always 90kHz synchronized to the sampling clock, which may vary in rate. Could a similar arrangement be used here? Magnus Westerlund suggested that if different rates are needed, perhaps different RTP sessions should be used.

Steve Casner asked why the multiplexing of streams is built into the payload format rather than using multiplexing at the UDP/RTP level. Is the format derived from something already in use on MD or other media and therefore hard to change, or is it a new design that is part of the payload format and therefore open to discussion? Matthew responded that the format was developed with streaming in mind; it is supposed to be extensible. Multiple bit rates are supported for scalable QoS, and they have specified multi-channel configurations up to 7.1 but it could be expanded to 32 channels. The benefit is payload overhead. Steve asked how this would be used for QoS: keep some parts of the packet and throw away others? That does not work.
It might make sense for the file format to contain multiple rates for scalability, but the packets should only contain the rate appropriate for the receiver or you have not achieved the goal of fitting the available bandwidth. If you need to deliver different rates to different receivers, send different streams, or layered coding for multicast. Roni Even echoed this concern; if the multiplexing of streams is for redundancy, the draft needs to explain the relationship between the fragments, redundant segments, etc.

Colin Perkins asked why redundancy was built into the payload format rather than using RFC 2198. The authors were unaware of 2198. Steve also pointed out that for redundancy to be useful the redundant copy may need to be separated further in time than one slot. He also suggested that it would be useful for the authors to review several of the other payload formats since several of the architectural ideas commonly used in AVT have been missed, such as separate streams for separate needs.

Magnus asked if it is possible for fragments to be independently decoded, or must a segment be fully reassembled to decode it. Matthew said the answer depends on the encoder, and that he needs to look into this further. In summary, this payload format may need quite a bit of change from what is defined so far.

RTP Payload Format for Uncompressed Video

Ladan Gharai presented updates to draft-ietf-avt-uncomp-video-01.txt. In addition to the correction of editorial nits and the inclusion of an applicability statement and a comparison to RFC 2431 (BT.656 video), some new features were added: 12- and 16-bit sample sizes join the 8- and 10-bit sizes specified previously, and monochrome, 4:4:4:4 chrominance subsampling, and RGBA color representations were added. The payload header was unchanged except that the 'M' bit was renamed 'C' to avoid possible confusion with the marker bit in the RTP header.
The draft has established a list of mandatory SDP fmtp parameters and a partial list of optional parameters. The authors are still working on the representation of these parameters, but will complete this work for the next draft. Ladan identified a few open issues. Currently only packed sample formats are provided; the authors are considering adding planar and macro-blocked formats as well. The planar format, in which color planes are sent separately, is straightforward; it would be identified by an SDP parameter. However, it is unclear whether it makes sense to have packed and macro-block formats in the same payload format. To accommodate macro-blocks, width and length parameters would have to be added to the payload header (there is room), and then the packed format would be indicated by a macro-block size of 1. Stephan Wenger would like to see the planar representation added, but has doubts about a macro-block-based scheme. There are applications for which it would be useful, but there are too many complications related to interlacing. You can't assume that the shape of a macro-block will be 16x16 in a progressive scan or in one field. Sometimes a macro-block is a different size with parts from both fields. It is also affected by transcoding. A second open issue is the transport of interlaced 4:2:0 color subsampling. This has been discussed on the mailing list and work is still in progress. Lastly, for interlaced video, there is a question whether the two fields should have distinct timestamps. A problem is that for the current 90kHz timestamp clock rate which increments at 3003 for 29.97fps NTSC video, a fractional increment of 1501.5 would be needed for the intermediate field timestamp, but the RTP timestamp is an integer. It should be possible instead to derive the timestamp from header bits and the frame rate. 
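The timestamp arithmetic described here can be reproduced exactly with rational numbers. The Python sketch below is illustrative only: it takes 29.97fps to mean 30000/1001 frames per second, and it rounds the second field's timestamp up to an integer as one possible choice. The helper name is hypothetical and is not taken from the draft.

```python
import math
from fractions import Fraction

CLOCK_HZ = 90000                 # RTP timestamp clock rate for video
NTSC_FPS = Fraction(30000, 1001) # exact value usually written as 29.97fps

frame_ticks = Fraction(CLOCK_HZ) / NTSC_FPS  # ticks per frame
field_ticks = frame_ticks / 2                # ticks per field (not an integer)


def field_timestamp(frame_index, second_field):
    """Integer timestamp for a field, rounding the half-tick up (an assumption)."""
    ticks = frame_index * frame_ticks
    if second_field:
        ticks += field_ticks
    return math.ceil(ticks)


print(frame_ticks)                # 3003
print(field_ticks)                # 3003/2
print(field_timestamp(0, True))   # second field of frame 0
print(field_timestamp(1, False))  # first field of frame 1
```

Because the rounding error never exceeds half a tick and does not accumulate from frame to frame, the field timestamps stay within one tick of their exact values.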
Stephan explained that you need to have a timestamp for every field in order to indicate the proper mapping of fields between 24fps film content and 30fps video using 3-2-pulldown because an individual field may be repeated so they do not always appear in even-odd pairs. However, we don't need to worry about the exact timestamp value for this; it would be safe to round up to the next integer.

Resolution of comments on draft-ietf-avt-srtp-05.txt

The first session ended with a discussion of IESG security concerns regarding the Secure RTP profile (draft-ietf-avt-srtp-05.txt). For this discussion, Allison Mankin introduced herself as the Transport AD for this group, Eric Rescorla as security advisor to the Transport Area, and Steve Bellovin who is one of the Security Area Directors.

Eric Rescorla started by noting that the SRTP profile has some unusual design features: it uses AES in counter mode, rather than in CBC mode, and it offers a choice of several message authentication codes (MACs), including no authentication. These features, in particular the option to use AES in counter mode with no authentication, don't make security folks comfortable.

Eric then summarized his understanding of the issues that require SRTP to use these modes of operation. The first is latency, since shorter packets mean less latency for voice and MACs consume bandwidth. Secondly, wireless channels are noisy and packets often contain bit errors. If integrity checks are used in this environment, the bandwidth consumption will be excessive and the bit errors may lead to unacceptable packet discard rates due to failure of the integrity checks.

Eric moved on to explain that counter mode has no integrity protection unless protected by a MAC. This is not obviously a problem for voice, one of the key applications for SRTP, but may be a problem for other types of content.
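The point about counter mode can be illustrated with a toy stream cipher in Python. This is a sketch under loud assumptions: the keystream is built from SHA-256 over a counter as a stand-in for AES in counter mode, the keys and message are made up, and the 80-bit truncated HMAC-SHA1 tag only mimics the size of a strong MAC. It is not the SRTP transform.

```python
import hashlib
import hmac


def keystream(key, nonce, length):
    """Toy keystream: hash of key, nonce and a block counter (stand-in for AES-CTR)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_crypt(key, nonce, data):
    """Encrypt or decrypt by XOR with the keystream (the same operation both ways)."""
    stream = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def tag(auth_key, data, tag_len=10):
    """HMAC-SHA1 truncated to tag_len octets (10 octets = 80 bits)."""
    return hmac.new(auth_key, data, hashlib.sha1).digest()[:tag_len]


# Hypothetical keys and message, for illustration only.
key, auth_key, nonce = b"k" * 16, b"a" * 20, b"n" * 12
plaintext = b"PAY 0100 USD"
ciphertext = xor_crypt(key, nonce, plaintext)

# An attacker who knows the message layout flips bits without knowing the key.
tampered = bytearray(ciphertext)
tampered[4] ^= ord("0") ^ ord("9")
tampered = bytes(tampered)

decrypted = xor_crypt(key, nonce, tampered)
mac_ok = hmac.compare_digest(tag(auth_key, ciphertext), tag(auth_key, tampered))

print(decrypted)  # the amount has changed and decryption raised no error
print(mac_ok)     # the tag over the original ciphertext no longer matches
```

Without the tag check the receiver has no way to notice the change, which is why the MAC matters more for a counter-mode cipher than the choice between counter and CBC mode.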
From a security viewpoint, it is desirable to use SRTP with a MAC, but the default MAC in SRTP is a weak 32-bit code and there is the option to use SRTP without integrity protection (there is also a strong MAC option). The choices lead to the threat of modified message streams and forged traffic, unless the optional strong MAC is used. Two solutions to this problem were proposed:

  - Make the MAC mandatory and add FEC after encryption to correct bit errors so that the integrity check will work on somewhat corrupted packets. There was considerable discussion of this in private email with the authors, who were opposed on the grounds that it expands packets and makes SRTP uneconomic for cellular links, which already employ link-layer FEC. Eric was not convinced by these arguments, citing the qualitative nature of the concerns rather than hard numbers giving performance impact.
  - Define a wireless voice profile for SRTP where the MAC protects only the control data leaving the media data unprotected. The reduced MAC causes limited packet expansion, but is less sensitive to bit-errors than SRTP as currently specified. Other types of traffic will use a mandatory 80-bit MAC.

Mark Baugher noted that SRTP has the ability to use strong integrity protection now, but it's not the default. The question is whether the vendors or the users should be able to make the determination, based on their environment and their application, whether they want a strong MAC or not. Steve Bellovin agreed with this formulation, but noted that the IESG has a strong preference for protocols that are secure by default, and a protocol won't be published unless it has strong mandatory-to-implement security. If a protocol has weaker security options, it needs a Security Considerations section that describes the environments where the weaker options may be acceptable, and explains the consequences and tradeoffs of selecting those options.
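The kind of "hard numbers" asked for above could start from a simple packet-expansion calculation. The Python sketch below is illustrative only: the 40-octet packet size is a hypothetical figure chosen for the example, not a measurement from the SRTP authors or from these minutes, and the function name is invented.

```python
# Sketch only: the packet size below is a made-up example value.
def mac_expansion(tag_bits, packet_bytes):
    """Return (tag size in octets, expansion as a percentage of the packet)."""
    tag_bytes = tag_bits // 8
    return tag_bytes, 100.0 * tag_bytes / packet_bytes


HYPOTHETICAL_PACKET_BYTES = 40  # ASSUMPTION: a small voice packet

for bits in (32, 80):
    tag_bytes, percent = mac_expansion(bits, HYPOTHETICAL_PACKET_BYTES)
    print(f"{bits}-bit MAC: {tag_bytes} octets, {percent:.1f}% expansion")
```

Real figures would depend on the codec, the packetization interval, and whether header compression is in use on the link.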
Eric Rescorla asked Steve Bellovin if it was acceptable for SRTP to have the option of no authentication? Steve answered that it was permitted in certain other situations, but would take detailed analysis to show where it is safe and useful and where it isn't. Mark Baugher asked if changing the default mandatory transforms, adding CBC mode as an option, would satisfy concerns? Steve Bellovin answered that, assuming you meet requirements for safely using counter mode, there is no strong need for CBC mode; the MAC is much more critical. Mark Baugher asked if the security folks are not happy with the default 32-bit HMAC-SHA1? Steve Bellovin replied that he needs to think more on that, but the group needs to better analyse the environment before he can make a good decision. Allison Mankin reminded the group that SRTP is for all environments, and expressed her preference for a specification where the MAC was mandatory in all cases, with a possible exception for cellular telephony. Elisabetta Carrara reminded Allison that SRTP includes a 32-bit MAC by default, and that stronger options are specified. Eric Rescorla again noted that it is necessary to analyse individual threats and the environment, giving numbers to characterize the impact of security on performance. Elisabetta noted that the MAC cannot be used in cellular telephony, since that environment cannot afford the bandwidth of the MAC. She reminded the group that the requirements driving ROHC and UDPlite also apply to SRTP. Steve Bellovin replied this is the sort of thing that has to go into the security considerations section, explaining why the environment has these requirements and how they affect security. Allison commented that the draft is intended to be general purpose, but is optimized for cellular use. The default transform needs to be suitable for the general case, with a non-optional MAC if counter mode is used, and justification why weaker options are present for cellular operation. 
Steve Casner asked if there was a problem with changing the defaults to be more general-purpose, signaling specific settings for telephony applications, and clearly documenting the rationale in the security considerations section. There were no objections.

Liaison statement from MPEG

Steve Casner started the second session by reading a liaison statement the group has received from the MPEG committee, stating that they have revised the RTP Payload Format for MPEG-4 taking into account comments from the last AVT working-group last call, and requesting publication of the draft as an RFC.

MIME Type Registration for MPEG-4

Jan van der Meer, sitting in for Young-Kwon Lim, outlined the draft draft-lim-mpeg4-mime-01.txt that specifies MIME type registrations for the MPEG-4 file formats and their relation to the MPEG4-on-IP framework (ISO/IEC 14496-8).

Steve Casner noted that this draft includes some discussion of RTP MIME parameters, which needs to be moved to the payload format drafts. Steve also expressed concern that the previous versions of the full framework document, submitted to the IETF, had problems which needed to be resolved but it's not clear that these have been addressed in ISO. There is a need to address these issues in future, especially if this MIME registration and the framework conflict.

Mike Coleman asked about the difference between streams and files, in this context, since MPEG-4 streams are not well defined. Steve Casner and Colin Perkins clarified that this draft should cover only the MP4 file format, and that the RTP payload format drafts will contain MIME types for use with RTP.

Stephan Wenger asked about the presumed existence of an informational RFC, pointing to the MPEG4-on-IP framework. Colin Perkins and Steve Casner explained that this was agreed in the AVT meeting at the 52nd IETF (Salt Lake City).

RTP Payload Format for MPEG-4

Jan van der Meer discussed draft-ietf-avt-mpeg4-simple-05.txt, the RTP Payload Format for MPEG-4.
This document is in working group last call and several comments, mostly editorial, have been received. The main issues are the suggested replacement of the "Profile" parameter with "InterleaveDelay", and whether RTP timestamps should be allowed to go backwards when interleaving. These have been discussed in AVT, and in MPEG and ISMA, and it has been agreed to allow both features.

Current discussion on the mailing list is on the exact meaning of interleave delay and emission rules. This discussion continued in the meeting with Steve Casner, Stephan Wenger, Colin Perkins and Andrea Basso commenting on the RTP system model and how it leaves much to the discretion of the receiver when compared to the MPEG buffer model. They saw no need for the emission rules, viewing them as implementation details that do not need to be specified. In addition, they noted that the characteristics of an IP network are such that the sender cannot control the buffering at the receiver. This also led to discussion of the definition of the interleaving delay, with concern being expressed that the attempt to define the delay precisely is unnecessary, since what is really needed is a hint to the receiver suggesting a starting estimate of the buffering delay. Much of the complexity comes from trying to tightly bound the interleaving delay, and a tight bound is neither necessary nor feasible.

Stephan Wenger asked what the impact would be of pulling interleaving out of the payload format. Colin Perkins said that this is not possible, but we may consider leaving the interleave delay parameter in, and letting the sender choose an appropriate value without saying how to do that.

Mike Coleman asked about the draft status, since it is not available in the archives and because parts of the MPEG committee believe it is complete, but it clearly is not. Steve Casner noted that the draft will be in the archives after the meeting. Steve and Colin also noted that the current working group last call is not completed.
There will be time to review any changes introduced before the draft is advanced.

RTP Payload Format for JVT Video

Stephan Wenger discussed draft-ietf-avt-rtp-h264-00.txt, the payload format for JVT video. This updates draft-wenger-avt-rtp-jvt-01.txt to align with the latest JVT specification and adds MTAPs with 8-, 16-, 24- and 32-bit timestamp offsets (as discussed at the previous AVT meeting). Stephan is considering removing the 8- and 32-bit timestamp offsets, since they are not believed to be useful.

The next open issue is the relation between this payload format and the MPEG-4 payload format, since JVT video is referenced as part of MPEG-4. Stephan believes that using the MPEG-4 format for JVT is not acceptable, since MTAPs and STAPs cannot be sent efficiently with that format. He also believes that full binary compatibility between the JVT payload format and the MPEG-4 payload format is not achievable. However, it is possible to define a common operation point, providing compatibility at the expense of limited optimization.

Steve Casner noted that the draft specifies use of the latest timestamp when doing AU aggregation, but that other payload formats use the oldest timestamp. Stephan agreed that this is an issue, and that it should be changed.

Mike Coleman noted that section 3 says the draft is "not intended to be used with MPEG-4 systems" and asked for clarification of what is meant. It is possible to use it with MPEG-4 systems, but there are some features of this draft that are not compatible with the MPEG-4 payload format. Jan van der Meer noted that some in MPEG will ask "what are the features offered with this draft that cannot be supported by the MPEG-4 payload format?" Stephan answered that the main reasons are STAPs and aggregation, which cannot be supported efficiently, and multiple fragments of AUs, which are vital but not supported in the MPEG-4 payload format. There was some discussion of this, and it may be appropriate to clarify in a future version of the draft.
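To put the MTAP timestamp-offset widths mentioned above into perspective, the following sketch computes the largest offset each field width can express. It assumes a 90 kHz RTP timestamp clock, which is conventional for video payload formats but is not stated in these minutes; the names `CLOCK_HZ`, `max_offset_seconds` and `offset_table` are illustrative only.

```python
# Assumption: 90 kHz RTP clock (conventional for video; not stated in the minutes).
CLOCK_HZ = 90_000


def max_offset_seconds(bits: int, clock_hz: int = CLOCK_HZ) -> float:
    """Largest time span an unsigned offset field of `bits` bits can express."""
    if bits <= 0:
        raise ValueError("bits must be positive")
    return (2 ** bits - 1) / clock_hz


def offset_table(widths=(8, 16, 24, 32)):
    """Map each field width in bits to its maximum offset in seconds."""
    return {bits: max_offset_seconds(bits) for bits in widths}


if __name__ == "__main__":
    for bits, seconds in offset_table().items():
        print(f"{bits:2d}-bit offset: up to {seconds:.4f} s")
```

Under this assumption an 8-bit offset spans under 3 ms, which is less than one frame interval at typical video frame rates, while a 32-bit offset spans more than 13 hours. This may be part of why those two widths were questioned, although the minutes do not record the reasoning.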
RTCP Reporting Extensions

Alan Clark discussed draft-ietf-avt-rtcp-report-extns-01.txt, on RTCP reporting extensions. This is the combination of the various reporting extensions drafts discussed in Yokohama, with the addition of loss run-length encoding, updated VoIP metrics, and security and IANA considerations.

Colin Perkins noted that the IANA considerations section needs work to specify the registration in detail, and will supply detailed comments offline. Colin also asked whether the jitter buffer metrics are useful and match implementations. Have implementors looked at the draft to see if the information is meaningful in their context? Alan Duric noted that jitter buffer and PLC functions are separate in the draft, but these are sometimes combined in implementations. Alan Clark said that the broad intent is to provide rough information for diagnostic purposes, not an exact description of an implementation.

Alan Clark noted the need to be management friendly, even if SRTP is used. Accordingly, he would like to add a note to the draft indicating that the SRTP E bit can be used to send extended RTCP report frames in plaintext, even if encryption has been selected as the default setting. Colin agreed that this might be possible, but noted that the draft shouldn't specify a security policy. Steve Casner also noted that the draft should talk about this issue in the security considerations section.

Steve Casner also highlighted that a receive-only endpoint will not know the RTT that is supposed to be included in the VoIP metrics report, since the RTCP mechanism works only for senders. Alan Clark replied that these metrics are expected to be used for full-duplex conversations. Steve said that in that case, the draft needs to make clear in what scenarios the VoIP metrics report is applicable.

RTCP Extensions for SSM

Jörg Ott described changes to draft-ietf-avt-rtcpssm-02.txt, the RTCP extensions for source-specific multicast. The main changes are to the security considerations.
In addition, SSRC distribution has been removed from this version and cumulative values are now included in the distribution. There are "work-in-progress" changes to the IANA considerations section and to use the XR packet formats (on this subject, Jörg noted that there are several proposed RTCP extensions using packet type 205, and we need to resolve this conflict).

The security considerations section has been significantly reworked, with the assumptions that we need to maintain low overhead, that the session parameters are securely distributed out of band, and that the security weaknesses should be addressed at the transport layer and above, since weaknesses may exist in the SSM layer below. The threats identified are denial of service, packet forgery, session replay and eavesdropping. The draft also categorizes threats according to the direction of the traffic flow, and discusses the trust models. Colin Perkins approved of the security considerations section, but would like this draft to discuss specific applications and the mandatory security behavior for those applications (e.g. how to use SSM with RTSP and SIP).

Jörg highlighted the issue of the relation to other I-Ds, since this draft uses the features of the extended RTCP reporting draft. He asked about the schedule for the RTCP reporting extensions draft. Alan Clark would like to get the RTCP Reporting Extensions draft done quickly, and was willing to cooperate on the IANA issues, ensuring they are aligned.

Jörg asked whether future drafts relating to RTCP should include a section on SSM considerations. Steve Casner was not sure that we need to establish a requirement, but noted that this draft should have a section giving advice to authors of RTCP extensions that might be affected by SSM.

Open issues include cumulative BYE packets, a possible revision to the message format, discussion of the relation to other RTP/RTCP extensions, and completion of the IANA considerations. A revised draft is expected by the end of the year.
Retransmission

The RTP retransmission format (draft-ietf-avt-rtp-retransmission-03.txt) was discussed by José Rey. This is the merger of the two previous drafts, as was discussed in Yokohama. The new draft uses a dynamic payload format to indicate the original payload type of the retransmission. It supports session multiplexing, with streams associated using an a=fmtp parameter and FID, and SSRC multiplexing using an a=fmtp parameter to associate the retransmission with the original stream. José also outlined the RTSP considerations regarding SSRC multiplexing. There will be a minor revision shortly, which is expected to be ready for last call.

Anders Klemets asked whether one MUST NOT do session multiplexing and SSRC multiplexing in the same session. It was clarified that this is correct.

RGL codec and payload format

The final presentation was a brief outline of the RGL lossless G.711 codec, by Michael Ramalho, which was presented as a possible future work item. Steve Casner noted that standardizing codecs is not entirely within the scope of AVT, and will need discussion, as with iLBC. Drafts will be submitted shortly after the meeting.