Meeting Minutes: RMONMIB WG Interim

Area:     OPS
WG:       RMONMIB
Date:     January 10 - 11, 2000
Time:     9:00 - 12:00, 13:00 - 18:00+ each day
Location: Boston, MA USA
Minutes:  Andy Bierman & Russell Dietz

A) Agenda:

The RMONMIB WG will hold an interim meeting to focus on one new work
item: Application Performance Measurement (APM). The WG has specific
goals for the first phase of this work. Please read the charter text at:

  ftp://ftpeng.cisco.com/ftp/rmonmib/rmonmib_charter_nov99_b.txt

Day 1:
  - Agenda bashing
  - Translate APM charter text into APM functional requirements and
    scope of work
  - Discuss APM protocol classification framework
    - includes protocol directory enhancements needed for APM
  - Discuss APM metrics classification framework

Day 2:
  - Discuss APM results classification and reporting framework
    - includes bins vs. statistics discussion
  - Examine existing APM I-D contributions and other published
    solution proposals
    - draft-dietz-apmmon-mib-00.txt
    - draft-warth-rmon2-artmib-01.txt
    - others? (submission deadline is 1/8/00 17:00 EDT)
  - Plan first release of APM document deliverables

B) Minutes

[Day 1 1/10/2000]

1) Agenda Bashing

No agenda changes were suggested, although the agenda order was not
strictly followed.

2) Translate charter text into APM functional requirements

The meeting started by reviewing the charter and how its requirements
should be translated into an actual MIB implementation. The charter
text can be found at:

  http://www.ietf.org/html.charters/rmonmib-charter.html

There was some general discussion on the measurement of application
performance, and on the measurement of transport performance and its
relationship to application performance.

2.2) Configuring the agent capabilities.

The group reached these conclusions:

  - The APM standard does not need to support installation and
    configuration of any specific APM measurement techniques.
  - The APM standard should support client based, probe based, active
    and synthetic traffic monitoring, and even non-network based
    mechanisms.
  - The focus of the APM standard will be on defining how the system
    reports and provides the information gathered via some APM
    collection technique.
  - The APM standard must define the capabilities of the device, at
    various levels. An NMS application should expect the same results
    from different APM collection devices, for each 'test' with the
    same global ID and/or identical 'test' attributes.

2.3) APM Terms

The group then tried to establish some basic terms and an overall
architectural model.

2.3.1) APM Data - The actual measurements, kept in some internal
representation, collected from one or more APM Collection Points. This
data is converted into one or more APM Reports, in an
implementation-specific manner. APM Data is outside the scope of the
APM standard.

2.3.2) APM Device - A device which contains an SNMP agent which
implements the APM MIB. For the purpose of the standard, the APM device
is the entity that is reporting the APM results, in the form of
standard MIB objects. An APM Device can obtain APM data from one or
more APM Collection Points.

2.3.3) APM Collection Point - The general location, within the spectrum
of possible vantage points, at which APM Data is collected for one or
more APM Studies (and perhaps more than one APM device). The
interaction between an APM device and an APM Collection Point is
outside the scope of the APM standard.

2.3.4) APM Study Class - [called APM Study at the interim] The unique
set of parameters that distinguishes a particular APM collection
mechanism and/or capability.
E.g.,
  - the set of protocols collected
  - test attributes (standard and proprietary)
  - collection technique(s)
  - APM Collection Point location(s)
  - report capabilities
  - global vendor ID
  - vendor-specific Study Class ID

2.3.5) APM Study - [called APM Study instance at the interim] This is
an instantiation of a particular APM Study Class. Some parameters, such
as:
  - the dataSource to monitor
  - any resource restrictions to enforce
  - the specific set of result data to collect
are relevant only in the context of an APM Study, not an APM Study
Class.

2.3.6) APM Report Class - [not formalized at the interim] The type of
APM report produced on behalf of a particular APM Study Class. The
group identified these distinct APM Report Classes:
  - bin based aggregation
  - statistical based aggregation
  - exception based reporting

2.3.7) APM Report - the set of result data for a particular APM Study.
This is in the form of standard MIB objects and notifications in the
APM MIB.

2.3.8) APM Location - the notion of where an APM Collection Point is
obtaining APM Data. At the highest level, the possible locations are:
  - client
  - network
  - server
The actual APM MIB will contain a more refined definition of APM
Location.

2.3.9) APM Class - [not formalized at the meeting] The group identified
three classes of APM technology:
  - passive collection: no test traffic is introduced into the system
    being measured. APM Data is derived from observable 'user activity'
    in the actual operating environment.
  - active collection: some form of test traffic is introduced into the
    system being measured. APM Data is derived from measured attributes
    of this test traffic.
  - intrusive collection: actual network traffic is modified for the
    purpose of APM Data collection. APM Data is derived from measured
    attributes of this modified network traffic.
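
The relationship between the terms in section 2.3 can be pictured as a
small data model. The Python sketch below is illustrative only: every
class and field name is hypothetical and none is taken from an APM MIB
draft. It shows just the split between parameters that belong to an APM
Study Class and those that belong to an individual APM Study.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class CollectionClass(Enum):
    """The three classes of APM technology listed in 2.3.9."""
    PASSIVE = "passive"
    ACTIVE = "active"
    INTRUSIVE = "intrusive"


class ReportClass(Enum):
    """The three APM Report Classes listed in 2.3.6."""
    BINS = "bin based aggregation"
    STATISTICS = "statistical based aggregation"
    EXCEPTIONS = "exception based reporting"


class Location(Enum):
    """The highest-level APM Locations listed in 2.3.8."""
    CLIENT = "client"
    NETWORK = "network"
    SERVER = "server"


@dataclass(frozen=True)
class StudyClass:
    """Parameters that distinguish a collection mechanism (2.3.4)."""
    vendor_id: str                     # global vendor ID (hypothetical form)
    study_class_id: str                # vendor-specific Study Class ID
    protocols: Tuple[str, ...]         # set of protocols collected
    collection: CollectionClass        # collection technique
    locations: Tuple[Location, ...]    # Collection Point location(s)
    report_classes: Tuple[ReportClass, ...]  # report capabilities


@dataclass
class Study:
    """An instantiation of a StudyClass (2.3.5)."""
    study_class: StudyClass
    data_source: str                   # the dataSource to monitor
    max_entries: int                   # one example of a resource restriction
    results: List[str] = field(default_factory=list)  # result data to collect


# Two studies can share one study class while monitoring different sources.
web_passive = StudyClass(
    vendor_id="example-vendor",
    study_class_id="web-passive-1",
    protocols=("http",),
    collection=CollectionClass.PASSIVE,
    locations=(Location.NETWORK,),
    report_classes=(ReportClass.BINS, ReportClass.EXCEPTIONS),
)
study_a = Study(web_passive, data_source="ifIndex.1", max_entries=1000)
study_b = Study(web_passive, data_source="ifIndex.2", max_entries=500)
```

The study class is frozen because, in this reading of the minutes, it
describes a fixed capability, while a study carries the per-instance
settings.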

2.4) Measuring the User Experience

The main goal of APM is to provide quantitative test results which
directly reflect the experience perceived by the user of a network
service. The group discussed examples such as:
  - entire WEB page load time, not just individual HTTP GETs
  - entire application transaction time, not transport round trip time
  - classification granularity to application verb level, since
    different verbs require different metrics

2.5) APM Functional Requirements Summary

The group listed these requirements at the end of this discussion:
  - measure end-user experience
  - link net based measurements
  - probe caps
  - test caps
  - collect point caps
  - notion of location
  - no config, just start and stop tests
  - quantitative analysis
  - relation to RMON-2
  - PD verbs
  - finer classification:
    - URL
    - MIME type
    - verbs
  - relation to IPPM metrics
  - don't reinvent terms
  - dataSource granularity (not specific to APM)
    - subset of an interface
    - IP addresses
    - VLAN
    - DSCP

3) APM Framework

The group spent a great deal of time discussing APM Framework issues.

3.1) Metric Classification

Steve Waldbusser presented a model for classifying the user experience
that was well-received by the group.

At the highest level, the network interactions experienced by a real
person can be divided into two metrics:
  - Availability
    The percentage of (usually) time that a network service is
    available for use.
  - Responsiveness
    The perceived speed of the network service. This is measured
    differently, depending on transaction type.

At the highest level, the protocol types experienced by a real person
can be divided into three classes:
  - transaction-oriented
    The payload size is relatively constant for all transactions of
    this type. Examples include SNMP, pop3.login and html/http.
    Responsiveness is application response time. Availability over a
    given time interval is the number of successful transactions
    divided by the total number of transaction attempts in that
    interval.
  - throughput-oriented
    The availability and responsiveness are related to the size of the
    user data for transactions of this type. Examples include ftp.get.
  - streaming-oriented
    The availability and responsiveness are characterized by more
    complex metrics than throughput-oriented protocols. Examples
    include real-audio.

The following chart was presented and used for reference throughout the
meeting.

                      Trans.       Throughput   Streaming
   +----------------+------------+------------+------------+
   | Availability   | % avail.   | % avail.   | % avail.   |
   +----------------+------------+------------+------------+
   | Responsiveness | tenths of  | sec/Gbit   | parts per  |
   |                | seconds    |            | million    |
   +----------------+------------+------------+------------+

A great deal of discussion focused on complex protocols like 'WEB' (see
sec x.x), and issues related to classifying transactions. Some
'meta-protocols' like this can be classified into different 'boxes', or
classified into more than one box at once. Most protocols will fall
into only one column, and both metrics for that column apply to the
protocol.

A great deal of discussion focused on the availability metric for
'Streaming' protocols. It is roughly defined as the ratio of time that
the service is degraded or interrupted to total service time, and is
measured in parts per million. It is difficult (through packet
observation) to determine the true availability of some applications in
this class, since many factors such as codec(s) used, negotiated
bandwidth, playout buffer size, and user intervention (e.g. 'pause
button'), influence the user experience. Measuring the user experience
for bi-directional streaming protocols (e.g., VOIP) is also quite
difficult, and may require special consideration in the APM MIB.

The group discussed the units for the responsiveness metric. The units
for all types of responsiveness metrics should be normalized, such that
lower values indicate faster responsiveness.
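
The units in the chart above can be made concrete with a few lines of
arithmetic. The Python sketch below is only an illustration under
stated assumptions: the function names are hypothetical, and the inputs
(success counts, elapsed times, payload sizes, degraded time) are
assumed to be already measured by some collection technique that the
minutes leave out of scope. In each responsiveness function a lower
value means a faster or less degraded service.

```python
def availability_percent(successes: int, attempts: int) -> float:
    """Successful transactions divided by attempts, as a percentage."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return 100.0 * successes / attempts


def transaction_responsiveness(response_time_s: float) -> float:
    """Application response time in tenths of seconds."""
    return response_time_s * 10.0


def throughput_responsiveness(elapsed_s: float, payload_bits: int) -> float:
    """Transfer time normalized by payload size, in seconds per gigabit."""
    if payload_bits <= 0:
        raise ValueError("payload_bits must be positive")
    return elapsed_s / (payload_bits / 1e9)


def streaming_degradation_ppm(degraded_s: float, total_s: float) -> float:
    """Degraded or interrupted time over total service time, in ppm."""
    if total_s <= 0:
        raise ValueError("total_s must be positive")
    return 1e6 * degraded_s / total_s
```

For example, 98 successful transactions out of 100 attempts gives 98%
availability, and a 4 second transfer of 500,000,000 bits gives
8 sec/Gbit.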

3.2) APM Study Classification

Each APM Study Class has characteristics which must be identified by an
NMS application, in order to properly interoperate with APM Devices.

3.2.1) APM Architecture

There are several distinct APM architectures which can be supported by
the APM MIB. (See section 2.3.9).

3.2.2) Set of Metrics

The metrics supported by each APM Study Class must be identified in the
APM standard.

3.2.3) Set of Protocols

The protocols supported by each APM Study Class must be identified in
the APM standard. The RMON-2 protocolDirTable is used to identify
protocols for APM.

3.2.4) APM Capabilities

The APM standard will contain support for identifying the generic
feature capabilities of an APM Device, similar to the
dataSourceCapsTable in the SMON MIB. This includes collection and
output capabilities.

3.2.5) Vendor ID

Each APM Study Class must be assigned a unique 'registration OID' by
the WG (for any standard APM study classes), or by each vendor (for
vendor-specific APM study classes).

3.3) APM Protocol Classification

The group discussed various aspects of protocol classification, and
protocol directory enhancements that would benefit APM.

3.3.1) Meta Protocols

It is desirable to define meta-protocols like 'WEB', to better measure
the user experience. WEB (called capital-WEB) would represent a 'WEB
page' as the transaction boundary, even though there may be many http
flows needed to actually complete the 'WEB page transaction'.
Meta-protocol capabilities and definitions need to be specified as
extensions to the protocolDirectoryGroup of RMON-2.

3.3.2) APM Protocol Requirements

The following issues regarding protocols were raised:

Application performance and Network performance
  - There is a need for a table that relates Application to Network
    protocols supported
  - There may be a need for URL and Cookie analysis via Persistent
    names.
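
The 'WEB page' transaction boundary described in 3.3.1 can be
illustrated with a short Python sketch. It assumes, purely for
illustration, that each observed http flow has already been tagged with
the page it belongs to; how that association is made was not settled at
the meeting. The names HttpFlow, page_id and page_load_times are
hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class HttpFlow(NamedTuple):
    page_id: str    # assumed label tying the flow to one WEB page
    start: float    # flow start time, in seconds
    end: float      # flow end time, in seconds


def page_load_times(flows: Iterable[HttpFlow]) -> Dict[str, float]:
    """Return the elapsed time per page, from first start to last end."""
    spans: Dict[str, Tuple[float, float]] = {}
    for flow in flows:
        first, last = spans.get(flow.page_id, (flow.start, flow.end))
        spans[flow.page_id] = (min(first, flow.start), max(last, flow.end))
    return {page: last - first for page, (first, last) in spans.items()}
```

The page-level time spans all of the constituent flows, so it can be
much larger than the response time of any single HTTP GET.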

[Day 2: Tuesday 1/11/00]

3.3.3) Protocol Verbs

There was some discussion on the need for Protocol Verb Identifiers,
similar to the PI Macros in RFC 2074. These verbs would need to be
registered, and identified in a MIB. Text strings need to be handled
somehow. The WG may need a document that includes all of the VERBs or
other test based elements that need children. May need to hand that off
to IANA.

3.3.4) Security Considerations

There may be security implications for SSL, IP-sec, etc. Encouraging
measurement could cause people to discourage deployment. There may also
be security issues related to active monitoring, i.e., injection of
packets into the network.

3.4) APM Studies

The group agreed that the basic capabilities of each APM Study [Class]
should be listed in a read-only MIB of some kind. The following
attributes of an APM Study were listed:

APM Study =
  - Set of application protocols
  - Set of metrics
  - Data Source
  - Control of Output
  - Output
    - Distributions
    - History
    - Out of profile events

3.3.5) Partial Flows

There was some discussion on the impact of partial flow information on
an APM Study and APM Reports. Partial flows can occur if an APM
collection starts after a flow has begun, or ends while a flow is still
in progress. There was concern about flow information appearing in more
than one report. The 'curFlowTable' was left as a detail to be worked
out later.

3.5) MIB Presentations

There were three MIBs presented to the group.

3.5.1) ART MIB (Albin Warth, Netscout)

Document: draft-warth-rmon2-artmib-01.txt

There was some discussion on the need for more metrics than just
response time, and the use of one response time distribution
configuration for all protocols.

There was a lot of discussion on 'bins' vs. 'stats', which carried over
into the next presentation. There is strong support for 'bins' (used in
ART MIB), and not as much support for 'stats' (used in APMMON MIB).
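
The difference between 'bins' and 'stats' can be seen by reducing the
same set of samples both ways. The Python sketch below is a generic
illustration and does not reproduce the object definitions of the ART
MIB or the APMMON MIB; the function names and the bin boundaries are
assumptions chosen for the example.

```python
import bisect
import math
from typing import Dict, List, Sequence


def bin_counts(samples: Sequence[float],
               upper_bounds: Sequence[float]) -> List[int]:
    """Count samples per bin; the final bin catches values above all bounds.

    upper_bounds must be sorted ascending. A sample equal to a bound is
    counted in the bin that the bound closes.
    """
    counts = [0] * (len(upper_bounds) + 1)
    for sample in samples:
        counts[bisect.bisect_left(upper_bounds, sample)] += 1
    return counts


def summary_stats(samples: Sequence[float]) -> Dict[str, float]:
    """Reduce samples to count, min, max, mean and population std. dev."""
    if not samples:
        raise ValueError("samples must not be empty")
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n
    return {
        "count": n,
        "min": min(samples),
        "max": max(samples),
        "mean": mean,
        "stddev": math.sqrt(variance),
    }
```

With response times of 1, 2, 2, 5 and 30 tenths of a second and bin
bounds of 1, 5 and 10, the bins show that one sample fell far outside
the others, while the mean of 8 alone does not.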

3.5.2) APMMON MIB (Russell Dietz, Apptitude)

Document: draft-dietz-apmmon-mib-00.txt

Discussion focused mainly on the linkage between transport layer
metrics and 'user experience metrics'. There was a lot of interest in
'drill-down metrics' and 'flow decomposition' for APM reports. The
group decided these features are desirable, but they should be in
another MIB, called the Transport Metrics MIB (TPM).

Another topic of interest was the use of statistical sampling data
collection vs. pre-defined 'goodness range' distributions. It was
decided that more work was needed, but the MIB(s) should (optionally)
support both types of report output.

3.5.3) APM MIB (Steve Waldbusser, Lucent)

Document: not yet published

There was a lot of discussion on the aggregation methods in the APM
Reports:
  - Client-based aggregation
  - Server-based aggregation
  - Client-server aggregation
  - NO protocol or applications aggregation.

There was some discussion on a possible exception-based reporting
mechanism, for flows or transactions that fall out of some range for
some metric.

There was also some discussion on the possible need for a pre-filtering
mechanism for flow analysis, e.g., filtering on addresses and/or
protocols and/or applications.
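
The aggregation, exception and pre-filter ideas from 3.5.3 can be
sketched together. The Python below is a hypothetical illustration, not
the design of the unpublished APM MIB: the Transaction record, the mode
names and the threshold are all assumptions. The protocol is always
part of the aggregation key, reflecting the point that protocols and
applications are not aggregated together.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Set, Tuple


class Transaction(NamedTuple):
    client: str
    server: str
    protocol: str
    response_tenths: int   # response time in tenths of seconds


# Each mode maps a transaction to its aggregation key.
KEYS = {
    "client": lambda t: (t.protocol, t.client),
    "server": lambda t: (t.protocol, t.server),
    "client-server": lambda t: (t.protocol, t.client, t.server),
}


def aggregate(transactions: Iterable[Transaction],
              mode: str) -> Dict[tuple, Tuple[int, int]]:
    """Return {key: (transaction count, total response time)} for a mode."""
    key_of = KEYS[mode]
    totals = defaultdict(lambda: [0, 0])
    for t in transactions:
        entry = totals[key_of(t)]
        entry[0] += 1
        entry[1] += t.response_tenths
    return {key: (count, total) for key, (count, total) in totals.items()}


def pre_filter(transactions: Iterable[Transaction],
               protocols: Optional[Set[str]] = None,
               addresses: Optional[Set[str]] = None) -> List[Transaction]:
    """Keep transactions matching the given protocols and/or addresses."""
    return [
        t for t in transactions
        if (protocols is None or t.protocol in protocols)
        and (addresses is None
             or t.client in addresses or t.server in addresses)
    ]


def out_of_range(transactions: Iterable[Transaction],
                 max_tenths: int) -> List[Transaction]:
    """Return transactions whose response time exceeds the threshold."""
    return [t for t in transactions if t.response_tenths > max_tenths]
```

Applying pre_filter before aggregate limits the work done per report,
and out_of_range selects the candidates for exception-based reporting.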

4) Summary

The group listed the following 'To Do' list at the end of the meeting:
  - persistent labels
  - protocol verbs
  - exception-based reporting
  - pre-filter (select by net-address or protocol)
  - APM Arch: type of transaction
    - transaction-oriented
    - throughput-oriented
    - streaming-oriented
  - APM Caps
  - location
  - vendor ID in study
  - list of metrics
  - microflows
  - transaction in progress, partial flows

5) Attendees

  Andy Bierman          abierman@cisco.com
  Russell Dietz         rsdietz@apptitude.com
  Bob Massad            massadb@netscout.com
  Albin Warth           albin@netscout.com
  Tom Nisbet            tnisbet@visualnetworks.com
  Emil Drottar          emil_drottar@ne.3com.com
  Adam Liss             adam_liss@ion-networks.com
  Pat Dochrty           patrick_docherty@ion-networks.com
  Carl Mower            cmower@nortelnetworks.com
  Brad Carey            bradski@concord.com
  Viveu Anand           anand@nxs-americas.com
  Gino Barille          barille@swiss.nexus-ag.com
  Brian Pratt           pratt@nxs-americas.com
  Renate Boergerding    boergerding@swiss.nexus-ag.com
  Raymond Chudzinski    rchudzinski@verio.net
  Bob Cole              rgcole@att.com
  Dan Romascanu         dromasca@lucent.com
  Dan Hansen            dhansen@nai.com
  Bert Wijnen           wignen@vnet.ibm.com
  Steve Waldbusser      waldbusser@lucent.com
  Shekhar Kshirsagar    skshivsa@nortelnetworks.com
  Russ Currie           currier@netscout.com
  Francisco Aquino      faquino@concord.com