XCON Working Group                                         S. Srinivasan
Internet-Draft                                                  T. Moore
Intended status: Informational                     Microsoft Corporation
Expires: September 5, 2007                                 March 4, 2007

              Media usages and SDP in the XCON data model
              draft-srinivasan-xcon-usecases-mediausage-01
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 5, 2007.
Copyright Notice
Copyright (C) The IETF Trust (2007).
Abstract
This document describes how media streams are associated with the
XCON data model for the various media usages captured in the XCON
conferencing scenarios [11].
Srinivasan & Moore Expires September 5, 2007 [Page 1]
Internet-Draft mediausage March 2007
Table of Contents
1. Introduction
2. Terminology
3. Media stream definitions
   3.1. Available media clarification
   3.2. Per-user or per-endpoint media definitions
4. SDP negotiation and conferencing media usage
   4.1. Criteria for including media label attribute in SDP
   4.2. Mapping of media label (in SDP) to media id
5. Media controls definitions
6. Media scenarios
   6.1. An example mixer model
      6.1.1. Conference notification example
   6.2. Common audio/video scenarios
      6.2.1. Muting an audio stream
      6.2.2. Pausing a video stream
   6.3. Changing media streams
   6.4. Changing media sources
7. Security Considerations
8. Acknowledgements
9. References
   9.1. Normative References
Authors' Addresses
Intellectual Property and Copyright Statements
1. Introduction
This document clarifies the use of the SDP-level attributes involved
in negotiating media with a conferencing server. RFC 4574 [3]
describes a mechanism for labeling media streams so that they can be
identified, but leaves the offer/answer model as an implementation
detail. RFC 4575 [2] and the XCON extensions to the conference event
package [10] describe mechanisms for notifying conference state and
define a data model for centralized conferencing. They do not,
however, specify the semantics of those attributes or their use by
the conferencing server in the signaling and media negotiations
performed with the conferencing client. This document attempts to
close that gap by suggesting a means of establishing the relationship
between media and the conferencing state information maintained by
the XCON conferencing server (for which the data model is described
in [7]).
2. Terminology
In this document, the key words "MUST", "MUST NOT", "REQUIRED",
"SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as
described in BCP 14, RFC 2119, and indicate requirement levels for
compliant implementations.
3. Media stream definitions
The XCON framework [1] describes a framework for establishing and
participating in a centralized conference. The sections that follow
discuss the media aspect of conferencing in more detail.
3.1. Available media clarification
The available-media XML element in RFC 4575 [2] allows a conferencing
system to provide a list of media stream inputs or outputs. A
conferencing client joining the conference via the conferencing
server (e.g., the focus) typically subscribes to the conference event
package as seen in [11]. The conference event package provides a
mechanism for conferencing clients to be notified of conference state
as described in the conference framework [1]. The XCON event package
[10] further extends RFC 4575 [2] to specify controls (based on [7])
for media streams negotiated by conferencing clients. The XCON data
model [7] is derived from the data model described in RFC 4575 [2],
so for all practical purposes any reference to [2] also applies to
the XCON data model [7] and the XCON extensions to the conference
event package [10], with a few exceptions (refer to [10] for the key
differences from the conference event package in [2]).
As more conferencing systems begin to offer more than one stream of
the same type (such as video), conferencing clients ought to be
capable of rendering multiple streams, as offered by conferencing
systems, in an interoperable manner. Mechanisms such as grouping of
SDP media lines [8] and SDP media content [9] further help in
achieving this goal. It is, however, important to note that unless a
conferencing client understands the context in which these streams
ought to be rendered, it may not be able to render streams that it is
not aware of. This document only addresses the problem of
associating the SDP media information (in signaling protocols such as
SIP) with the media information supplied in the conferencing document
(refer to [7]).
Media streams offered in a conference in RFC 4575 [2] are each
identified via a label. This XML element is defined as optional. A
media label, however, SHOULD be assigned if more than one stream of
the same media type is offered by the conference. The label is also
used to associate the media 'id' attribute (as described in the
subsequent sections) with the corresponding stream (m-line) in the
SDP. This enables conference-aware clients to negotiate SDP media in
relation to the conference event package data received from the
conference server without having to deal with the specifics of the
ordering of SDP media lines as required or specified by the
conferencing server. The label is also unique within the
conference-info context as defined in RFC 4575 [2]. A label is
typically created when a conference is scheduled, either via
conference blueprints [1] or through some other means. A new label
MAY, however, appear in the available-media element after the
conference is active, and conferencing clients may decide to render
these new streams as required (based on local policy). When a
conference is activated and a conferencing client receives a
notification with the conference state, the conferencing server
SHOULD label the media streams. The conferencing client may then use
this information, or may discover available media via signaling (for
example, using SIP OPTIONS), to join the conference and to start
receiving media, provided it understands the context in which the
specific media needs to be rendered. Mechanisms such as SDP media
content [9] further aid in providing conferencing clients with this
context.
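The labeling rules above can be illustrated with a small check. This
is only a sketch: the dictionary shape used for an available-media
entry and the function name check_available_media are assumptions
made for illustration and are not part of any XCON specification.

```python
from collections import Counter


def check_available_media(entries):
    """Return a list of problems found in available-media entries.

    Each entry is a dict with a 'type' key and an optional 'label' key.
    This dict shape is an assumption made for illustration only.
    """
    problems = []
    per_type = Counter(entry["type"] for entry in entries)
    seen_labels = set()
    for entry in entries:
        label = entry.get("label")
        if label is None:
            # A label SHOULD be present when a type is offered more than once.
            if per_type[entry["type"]] > 1:
                problems.append("unlabeled " + entry["type"] + " stream")
            continue
        if label in seen_labels:
            # Labels are unique within the conference-info context.
            problems.append("duplicate label " + label)
        seen_labels.add(label)
    return problems


# One audio and two video streams, each with its own label.
example = [
    {"label": "10", "type": "audio"},
    {"label": "11", "type": "video"},
    {"label": "12", "type": "video"},
]
print(check_available_media(example))  # prints []
```

A single unlabeled stream of a given type passes the check, which
mirrors the optional nature of the label when no ambiguity exists.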
3.2. Per-user or per-endpoint media definitions
Streams sent from a specific user's endpoint device are usually
negotiated via some form of signaling session. The conference event
package schema contains media XML elements within the
users/user/endpoint elements. The media XML element in RFC 4575 [2]
refers to a media stream, of which there may be more than one. Each
media stream being sent from the conferencing client to the
conference server is identified, within the conference event package,
via an 'id' (refer to [2] for more information).
4. SDP negotiation and conferencing media usage
This section explains the semantics of the media label and its usage
within the XCON framework. The media label in the conferencing data
model or the conferencing event package maps to the media label
defined within SDP in RFC 4574 [3]. The media label is used to
identify and associate streams in the SDP offer/answer model to the
specific streams within conferencing. This section will explain how
this is done.
4.1. Criteria for including media label attribute in SDP
RFC 4574 [3] suggests that the label may appear in either the offer
or the answer and is used to identify the local stream in whichever
of the two carries it. This section describes how conferencing
servers should integrate the label into the offer/answer model and
associate it with the data model (and thus [10]). All conferencing
clients and servers MUST follow the offer/answer model as described
in [6]. The following sections only describe the usage of the media
label in the context of conferencing within the offer/answer model.
The conferencing server SHOULD include an SDP 'label' attribute (as
defined in RFC 4574 [3]) for each stream in the SDP sent from the
conferencing server to the conferencing client (in either the offer
or the answer). If there are two or more streams of the same media
type (as defined in RFC 4575 [2], Section 5.3.4, with the type being
one of the values registered for "media" of SDP [5]), the
conferencing server SHOULD include the label for each stream in the
SDP sent from the server. The media label MUST follow the normative
text in RFC 4575 [2] and RFC 4574 [3].
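As an illustration of the recommendation above, the following sketch
emits one 'label' attribute per m-line. The function name build_sdp
and the tuple layout of its input are assumptions made for this
example. Like the SDP examples later in this document, it omits
several lines (such as o= and s=), so the result is not a complete
session description.

```python
def build_sdp(streams, address="131.164.74.2"):
    """Build a partial SDP body with one labelled m-line per stream.

    Each stream is a (media_type, port, payload_type, label) tuple; this
    tuple layout is an assumption made for illustration. The o= and s=
    lines are omitted, so the output is not a complete SDP.
    """
    lines = ["v=0", "c=IN IP4 " + address, "t=0 0"]
    for media_type, port, payload_type, label in streams:
        lines.append("m={} {} RTP/AVP {}".format(media_type, port, payload_type))
        # RFC 4574 label attribute, carried at the media level.
        lines.append("a=label:" + label)
    return "\r\n".join(lines) + "\r\n"


sdp = build_sdp([("audio", 30000, 0, "10"), ("video", 30002, 31, "11")])
print(sdp)
```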
4.2. Mapping of media label (in SDP) to media id
Because the 'id' XML attribute (in Section 5.8 of [2]) is not
directly carried in the SDP (or in any signaling, for that matter),
the label attribute also serves to map the media 'id' defined in the
data model to the media label defined in the data model.
Consequently, a conferencing client will not be able to negotiate
different m-lines with the same label within the same conference via
separate signaling sessions.
[[ Note: Fixing this will require a new SDP attribute for conveying
the media id in SDP ]]
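The mapping can be sketched as follows. The pairing of Bob's media
ids with labels (34 with '10', 35 with '11', 36 with '12') is an
assumption made for this example, as are the function names. Because
the lookup is keyed by label, two m-lines carrying the same label
cannot be told apart, which is the limitation noted above.

```python
def labels_by_mline(sdp):
    """Return the label of each m-line in order (None if unlabeled)."""
    labels = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # A new media section starts; its label is not known yet.
            labels.append(None)
        elif line.startswith("a=label:") and labels:
            labels[-1] = line[len("a=label:"):]
    return labels


def media_ids_for_sdp(sdp, label_to_id):
    """Map each m-line to the data-model media id that shares its label."""
    return [label_to_id.get(label) for label in labels_by_mline(sdp)]


# Hypothetical view of one endpoint's media elements: label -> id.
bob_media = {"10": 34, "11": 35, "12": 36}

sdp = (
    "v=0\r\n"
    "c=IN IP4 131.164.74.2\r\n"
    "t=0 0\r\n"
    "m=audio 30000 RTP/AVP 0\r\n"
    "a=label:10\r\n"
    "m=video 30002 RTP/AVP 31\r\n"
    "a=label:11\r\n"
)
print(media_ids_for_sdp(sdp, bob_media))  # prints [34, 35]
```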
5. Media controls definitions
Media controls may be defined at the global conference-info level
(under available-media as specified in Section 4.1.5.1 of [7]) or may
be defined for a specific user and endpoint's media stream (as
specified in Section 4.5.2.1 of [7]). The former definition should
typically override the latter. For example, if the global audio is
muted, then none of the participants' audio should be unmuted. The
control values for the endpoint's media stream may still have mute
set to false, but this value is ignored while the global control is
set to true.
Note that the global controls only refer to controls for the streams
coming from the conference mixer and do not refer to controls for
media streams being sent from the user's endpoint to the mixer.
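One possible reading of the override rule is sketched below. The
function name effective_mute is hypothetical, and treating the
per-endpoint value as decisive when the global control is not set is
an assumption; the text above only states what happens when the
global control is set to true.

```python
def effective_mute(global_mute, endpoint_mute):
    """Return True if the stream should be treated as muted.

    A global mute of True wins over any per-endpoint value. Falling
    back to the per-endpoint value otherwise is an assumption.
    """
    if global_mute:
        return True
    return bool(endpoint_mute)


# Global mute is active, so the endpoint's 'false' is ignored.
print(effective_mute(True, False))  # prints True
```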
6. Media scenarios
6.1. An example mixer model
   [to-mixer streams]                    [from-mixer streams]

                         |----mixer----|
   userid=23 , id=34 ----|             |--- label='10'
   userid=23 , id=35 ----|             |--- label='11'
   userid=23 , id=36 ----|             |--- label='12'
   userid=24 , id=24 ----|             |
   userid=24 , id=35 ----|             |
                         |-------------|

The mixer shown above takes in some set of input streams and mixes
them in some form or manner to produce output streams. This document
does not cover how the streams are grouped and/or mixed; it only
shows, with an example, how the media inputs and outputs tie into the
conferencing data model and signaling. For further information refer
to RFC 4575 [2]. The examples shown below are for information
purposes only and are offered to aid in understanding the solution
presented in this document.
The 'label' parameter above identifies the media stream from and to
the mixer. The streams to the mixer, from a specific user and
endpoint, are identified by an 'id' in the conferencing data model.
The label is unique throughout the conferencing data model. The id
is unique within the endpoint media element in the data model and is
generated by the conferencing server. Furthermore, each user is
identified by a user identifier (refer to [4]).
Consider that label '10' is the stream containing the audio mix of
all audio input streams, offered to every participant. Label '11' is
a video mix that uses one of the layouts described in the scenarios
section, and label '12' is an alternate, voice-activated mix of the
video streams.
Ids 34, 35 and 36 for userid 23 are the user's main audio, main video
and secondary video streams respectively. Ids 24 and 35 for userid
24 are the user's main audio and main video streams respectively.
Let us also consider that the mixer mixes the incoming video streams
from the participants (going to the mixer) into both the label '11'
and label '12' streams. Also, the mixer accepts at most a single
input stream from the client (in any sendrecv media stream) and
rejects the rest. This is one specific mixer model; other mixer
models may interpret the input streams differently. The next section
covers how this specific mix appears in the offer/answer model in
SDP.
Note that the floor control aspects of the streams above are omitted
here for brevity as floor control is defined as being optional.
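The example mixer can be written down as plain data. The class and
function names below are hypothetical, and routing each input to
every output of the same media type is a simplification assumed for
this sketch; it ignores the single-input restriction and any layout
or voice-activation logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToMixer:
    userid: int
    media_id: int
    media_type: str


@dataclass(frozen=True)
class FromMixer:
    label: str
    media_type: str


# Streams of the example mixer. Note that id 35 appears under both
# users: the id is only unique within an endpoint's media elements.
TO_MIXER = [
    ToMixer(23, 34, "audio"),
    ToMixer(23, 35, "video"),
    ToMixer(23, 36, "video"),
    ToMixer(24, 24, "audio"),
    ToMixer(24, 35, "video"),
]
FROM_MIXER = [
    FromMixer("10", "audio"),
    FromMixer("11", "video"),
    FromMixer("12", "video"),
]


def outputs_fed_by(stream):
    """Labels of the from-mixer streams that an input contributes to."""
    return [out.label for out in FROM_MIXER if out.media_type == stream.media_type]


print(outputs_fed_by(TO_MIXER[1]))  # prints ['11', '12']
```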
6.1.1. Conference notification example
The notification example given below corresponds to the mixer model
defined above. The available-media element lists the media labels as
defined. Note that the media labels '11' and '12' are defined with a
status element of sendrecv.
Using the offer/answer model described earlier, users Bob (userid=23)
and Carl (userid=24) have joined the focus and negotiated media
streams as shown in the notification below. It is useful to note
that Bob has chosen to receive all video streams, while Carl has
opted for only the secondary, voice-activated video stream.
A conferencing system may well expose Bob's input stream directly
(without mixing) to the participants of the conference if it deems
this necessary, since Bob has the role of presenter. It may do so,
for example, by creating a new label on the fly to expose this stream
to the conferencing clients.
The notification below is what a presenter (Bob) may receive.
[XML notification example. The markup was lost; the surviving
element text is listed below in document order.]

   available-media:
      main audio, audio, sendrecv, true
      main video, video, sendrecv, true
      secondary video, video, sendrecv, true
   user: Bob Hoskins, presenter
      endpoint: Bob's Laptop, connected, dialed-out
         main audio, audio, 432424, sendrecv, true, true
         main video, video, 324255, sendrecv, true, true
         secondary video, video, 1324255, recvonly, true
         full info, hsjh8980vhsb78vav738dvbs8954jgjg8432
   user: Carl, participant
      endpoint: Carl's video phone, connected, dialed-in
         main audio, audio, 242443, sendrecv, true, true
         secondary video, video, 632425, sendrecv, true, true
         full info, aachsjh8980vhsb78ffvav738dvbsa8954jgjg8432

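As a hedged illustration of how a client might read the
available-media part of such a notification, the sketch below parses
a hand-written fragment that uses only RFC 4575 [2] elements. The
entity URI and the function name are assumptions, and the XCON
control elements are left out; the fragment is not the original
figure.

```python
import xml.etree.ElementTree as ET

NS = {"ci": "urn:ietf:params:xml:ns:conference-info"}

# Hand-written fragment (an assumption, not the original example).
DOC = """<conference-info xmlns="urn:ietf:params:xml:ns:conference-info"
    entity="sip:conf@example.com" state="full" version="1">
  <conference-description>
    <available-media>
      <entry label="10">
        <display-text>main audio</display-text>
        <type>audio</type>
        <status>sendrecv</status>
      </entry>
      <entry label="11">
        <display-text>main video</display-text>
        <type>video</type>
        <status>sendrecv</status>
      </entry>
      <entry label="12">
        <display-text>secondary video</display-text>
        <type>video</type>
        <status>sendrecv</status>
      </entry>
    </available-media>
  </conference-description>
</conference-info>"""


def available_media(xml_text):
    """Return {label: {'type': ..., 'status': ...}} for each entry."""
    root = ET.fromstring(xml_text)
    path = "ci:conference-description/ci:available-media/ci:entry"
    result = {}
    for entry in root.findall(path, NS):
        result[entry.get("label")] = {
            "type": entry.findtext("ci:type", namespaces=NS),
            "status": entry.findtext("ci:status", namespaces=NS),
        }
    return result


for label, info in available_media(DOC).items():
    print(label, info["type"], info["status"])
```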
6.2. Common audio/video scenarios
The following sections are examples of how conference controls and
the SDP may be interpreted. This section will not cover the usages
of all the controls defined in the XCON data model [7]. [[TBD]]
6.2.1. Muting an audio stream
6.2.1.1. Mute all participants
Muting all participants (in other words, activating the control or
setting the value to 'true') in the conference typically means that
for the entire duration where mute is applicable, all current and
future participants of the conference are muted and will not receive
any audio. Typically this control is available to presenter or
moderator roles in a conferencing system. Setting this control
overrides any user-specific control settings specified (see the next
few sections). Because no audio flows to any participant while the
control is active, activating it may, in turn, cause the conferencing
focus to re-negotiate SDP with the various participants to stop media
flowing as and when necessary. This is entirely up to local policy.
Note that doing so may cause changes in conference state (with
per-endpoint media elements and controls, their respective ids and
their default states changing).
In the example mixer, the control appears under the available-media
element as shown below.
[XML fragment: markup not preserved; control value "true"]
6.2.1.2. Muting to-mixer stream from a specific participant
A stream sent from a participant to the mixer may be mixed in any
form or manner. For example, it may appear in multiple media outputs
from the mixer (though that is not the case in this specific
example). Activating this control would therefore cause this input
not to appear in any of the outputs from the mixer. Similar to the
previous scenario, activating this control may end up re-negotiating
SDP.
In the example mixer, the control appears under the media element for
each user and endpoint. Bob's controls are shown below.
[XML fragment: markup not preserved; control value "true"]
SDP from the conferencing server may look like
(some elements omitted)
v=0
c=IN IP4 131.164.74.2
t=0 0
m=audio 30000 RTP/AVP 0
a=label:10
Note that even though the above SDP does not contain any information
about the media id, the label provides a mapping of the specific
m-line to the media section in the data model.
6.2.1.3. Muting from-mixer stream to a specific participant
This is a control on a specific stream that is sent from the mixer to
the participant, as negotiated via SDP. It is mostly optional, and
many conferencing systems may opt not to implement such a control. A
client may simply stop sending the media to the output device rather
than activating this control to mute the stream. In that case the
mixer still sends media packets towards the participant, consuming
bandwidth on the network and CPU on the mixer. Activating this
control would stop media being sent back from the mixer to the
participant. Similar to the previous scenarios, activating this
control may end up re-negotiating SDP.
In the example mixer, the control appears under the media element for
each user and endpoint. Bob's controls are shown below.
[XML fragment: markup not preserved; control value "true"]
SDP from the conferencing server may look like
(some elements omitted)
v=0
c=IN IP4 131.164.74.2
t=0 0
m=audio 30000 RTP/AVP 0
a=label:10
As before, note that even though the above SDP does not contain any
information about the media id, the label provides a mapping of the
specific m-line to the media section in the data model.
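If a server does choose to re-negotiate, one conceivable way to stop
the from-mixer stream is to change the direction attribute of the
labelled m-line in its next offer. The sketch below is an assumption,
not something this document mandates: the function name is
hypothetical, and choosing 'recvonly' (from the server's point of
view, as defined in [6]) is only one possibility.

```python
DIRECTION_LINES = {"a=sendrecv", "a=sendonly", "a=recvonly", "a=inactive"}


def set_direction(sdp, label, direction):
    """Return sdp with the m-line carrying `label` set to `direction`."""
    if "a=" + direction not in DIRECTION_LINES:
        raise ValueError("unknown direction: " + direction)
    # Split into the session-level part and one list per media section.
    sections = [[]]
    for line in sdp.splitlines():
        if line.startswith("m="):
            sections.append([])
        sections[-1].append(line)
    out = list(sections[0])
    for section in sections[1:]:
        if "a=label:" + label in section:
            # Drop any existing direction attribute, then add the new one.
            section = [l for l in section if l not in DIRECTION_LINES]
            section.append("a=" + direction)
        out.extend(section)
    return "\r\n".join(out) + "\r\n"


offer = (
    "v=0\r\n"
    "c=IN IP4 131.164.74.2\r\n"
    "t=0 0\r\n"
    "m=audio 30000 RTP/AVP 0\r\n"
    "a=label:10\r\n"
)
print(set_direction(offer, "10", "recvonly"))
```

Only the media section whose label matches is touched, so the label
again stands in for the media id that SDP does not carry.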
6.2.2. Pausing a video stream
6.2.2.1. Pausing video to all participants
Pausing the video being sent to all participants (in other words,
activating the control or setting the value to 'true') in the
conference typically means that for the entire duration where pause
is applicable, all current and future participants of the conference
would not receive video. Typically this control is available to
presenter or moderator roles in a conferencing system. Setting this
control overrides any user-specific control settings specified (see
the next few sections). Because no video flows to any participant
while the control is active, activating it may, in turn, cause the
conferencing focus to re-negotiate SDP with the various participants
to stop media flowing as and when necessary. This is entirely up to
local policy. Note that doing so may cause changes in conference
state (with per-endpoint media elements and controls, their
respective ids and their default states changing).
In the example mixer, the control appears under the available-media
element as shown below.
[XML fragment: markup not preserved; control value "true"]
6.2.2.2. Pausing to-mixer stream from a specific participant
A stream sent from a participant to the mixer may be mixed in any
form or manner. For example, it may appear in multiple media outputs
from the mixer (as is the case in the example). Activating this
control would therefore cause this input not to appear in any of the
outputs from the mixer. Similar to the previous scenario, activating
this control may end up re-negotiating SDP.
In the example mixer, the control appears under the media element for
each user and endpoint. Bob's controls are shown below. Activating
this control would result in Bob not being shown in any of the output
streams.
[XML fragment: markup not preserved; control value "true"]
SDP from the conferencing server may look like
(some elements omitted)
v=0
c=IN IP4 131.164.74.2
t=0 0
m=video 30002 RTP/AVP 31
a=label:11
As before, note that even though the above SDP does not contain any
information about the media id, the label provides a mapping of the
specific m-line to the media section in the data model.
6.2.2.3. Pausing video from-mixer stream to a specific participant
This is a control on a specific stream that is sent from the mixer to
the participant, as negotiated via SDP. It is mostly optional, and
many conferencing systems may opt not to implement such a control. A
client may simply stop sending the media to the display device rather
than activating this control to pause the stream. In that case the
mixer still sends media packets towards the participant, consuming
bandwidth on the network and CPU on the mixer. Activating this
control would stop media being sent back from the mixer to the
participant. Similar to the previous scenarios, activating this
control may end up re-negotiating SDP.
In the example mixer, the control appears under the media element for
each user and endpoint. Bob's controls are shown below.
[XML fragment: markup not preserved; control value "true"]
SDP from the conferencing server may look like
(some elements omitted)
v=0
c=IN IP4 131.164.74.2
t=0 0
m=video 30002 RTP/AVP 31
a=label:11
As before, note that even though the above SDP does not contain any
information about the media id, the label provides a mapping of the
specific m-line to the media section in the data model.
6.3. Changing media streams
TBD
6.4. Changing media sources
TBD
7. Security Considerations
TBD
8. Acknowledgements
Thanks to Gonzalo Camarillo and Roni Even for useful comments.
9. References
9.1. Normative References
[1] Barnes, M., "A Framework and Data Model for Centralized
Conferencing", draft-ietf-xcon-framework-05 (work in progress),
September 2006.
[2] Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session
Initiation Protocol (SIP) Event Package for Conference State",
RFC 4575, August 2006.
[3] Levin, O. and G. Camarillo, "The Session Description Protocol
(SDP) Label Attribute", RFC 4574, August 2006.
[4] Boulton, C. and M. Barnes, "A User Identifier for Centralized
Conferencing (XCON)", draft-boulton-xcon-userid-00.txt
(work in progress), October 2006.
[5] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006.
[6] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
Session Description Protocol (SDP)", RFC 3264, June 2002.
[7] Novo, O., Camarillo, G., and D. Morgan, "A Common Conference
Information Data Model for Centralized Conferencing (XCON)",
draft-ietf-xcon-common-data-model-04 (work in progress),
March 2007.
[8] Camarillo, G., Holler, J., and H. Schulzrinne, "Grouping of Media
Lines in the Session Description Protocol (SDP)", RFC 3388,
December 2002.
[9] Hautakorpi, J. and G. Camarillo, "The Session Description Protocol
(SDP) Content Attribute", RFC 4796, February 2007.
[10] Srinivasan, S. and R. Even, "Conference event package extensions
for the XCON framework",
draft-srinivasan-xcon-eventpkg-extensions-00
(work in progress), February 2007.
[11] Even, R. and N. Ismail, "Conferencing Scenarios", RFC 4597,
July 2006.
Authors' Addresses
Srivatsa Srinivasan
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052, USA
Email: srivats@microsoft.com
Tim Moore
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052, USA
Email: timmoore@microsoft.com
Full Copyright Statement
Copyright (C) The IETF Trust (2007).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).