Minutes of IPoIB meeting 3/18/2002 (about 42 people in the room)

Administrative items were reviewed:
- blue sheets
- minute taker (Jim Pinkerton)
- agenda

Proposed agenda:
- link and multicast draft (Jerry) 30 min
- architecture/encap (Vivek) 30 min
- Advanced capabilities (Vivek) 30 min
- Next steps (Jerry) 15 min

Jerry reviewed the changes from the multicast draft -00 to the -01. He stated that the changes were minor, to address feedback he had received.

IPoIB Link Boundary:
- Treat four IB layers as L2 to IP
- IB partitions <-> IP links
- IB partition may span multiple IB subnets
  o Must use right scope bits
- Leave IB cross-subnet unicast/multicast details to IBTA

The first point above allows the IPoIB spec to largely ignore the details of when/whether packets cross IB subnets (with the caveat on scope bits above).

Multicast Address mapping

ServiceRecord vs. algorithmic mapping - we chose the latter because the advantage of ServiceRecord wasn't clear. Algorithmic mapping also avoids an additional lookup.

How many address bits to use? Decided to use the whole multicast group ID (see spec). Some concerns that some implementations only enable the lower 23 bits on Ethernet. Some feeling that this was "implementation bugs" and that there was no reason we shouldn't use the full address space.

Embed an IPoIB signature and P_Key in MGIDs. The IBTA makes each MGID unique to an IB subnet, not to a partition. Thus the P_Key must be part of the multicast address to ensure the multicast address is unique across the IB subnet.

MTU

Default MTU? - Some folks wanted a default, others asked why. Current design is to let the admin decide and set it in the MCGroupRecord. Must work for both unicast and multicast.

Minimal MTU? - Thomas stated he felt there needed to be a minimum MTU. Jerry commented that IPv6 is 1280, IB physical minimum is 2048. Jim Pinkerton mentioned that his concern is complexity. It seems easier to just require 2048 byte MTUs as a minimum.
Quite a bit of discussion on the pluses of setting a clear direction on what IPoIB requires vs. what folks could implement. Thomas and JimP argued for a minimum 2 KB MTU; Vivek and (???) argued for not specifying one and allowing the admin to control this.

Multicast Sender

Does the sender need to join the target multicast group to send? The current draft is written with the assumption that it does not.

Multicast Forwarding (routing)
- Current RFC 2236/2710 require interfaces on routers to be in promiscuous multicast mode.
- Solution: IPoIB driver module always sends a copy to the all-router multicast group
- Router and listener will receive duplicate copy
  o Driver skips joins on routers
  o Concern that it feels hacky.
  o Vivek comments he has an alternative solution which doesn't require two copies

What if multicast join fails?
- Join the all-router multicast group
- Will receive every multicast packet
  o Need to filter packets

Vivek's Presentation

Agenda:
- Review architecture draft
- Encapsulation draft
- Advanced capabilities
- DHCPv4

He has received quite a few comments on the -00, but on the -01 they have been primarily grammatical points and typos.

He briefly reviewed some of the main architectural themes of the arch spec: what the link characteristics are, and the definition of a multicast GID. The "401B" and "601B" are "IPv4/6 over IB". He pointed out that the default scope for multicast is "local", and that it can be made greater once the IBTA defines how this is done.

Vivek points out that the current draft requires two multicast packets to be sent.

Vivek reviewed how Ethernet handles this:
- IP_ADD_MEMBERSHIP
  o Join the IP multicast group
  o Ask interface to update reception filters
  o Send IGMP/MLD report
- IPoIB interface filter
  o Join IB multicast group
    * Create if it doesn't exist
  o Send IGMP/MLD report
- Router cannot forward unless it joins every group.

Possible solution:
- Ensure IGMP/MLD report reaches router
  o Host sends report to all-routers MGID. One of:
    * Send to all-routers MGID and IP group MGID
    * Send report to broadcast MGID only (all receive it)
    * Solution is within the IPoIB interface driver
  o Router joins MGID for the group

Jerry pointed out that this possible solution requires a smarter router - it has to parse IGMP (two versions) and MLD. Jerry is also concerned about the leave operation - it falls back to a timer rather than a one-to-one mapping from IP layer to IB layer. Vivek agrees, but feels it is appropriate since IB is not Ethernet, and feels it can be localized to just the driver in an end-node. For a router he's not as sure. He mentioned though that Voltaire (an IB router company) has reviewed the proposal and thought it was workable.

Leaving/Deleting MGIDs
- never deleted by IPoIB elements
  o IB spec lists IB algorithms for pruning
  o MGIDs may end up shared with other protocols
- IB_leave MGID when IP leaves IP mcast
- If pure sender, IB_Leave after some idle time
- Router IB_Leaves if no listeners and no need to forward packets

Vivek pointed out that the issue is not whether the leave is optional - that is required. It's the deletion of the MGID.

Link layer address in current draft is 20 bytes: GID:QPN:reserved (16+3+1)

Vivek reviewed the positions on whether ethertype is required.

Discussion on advanced architecture draft. Vivek reviewed the slides. Advantages of using RC mode (larger MTU, APM, multiple QPs). The focus is to keep the onus of complexity on the advanced modes, and keep UD simple.

Suggested flags are RC | UC | RD | QPN, possibly SDP. UD is always implied, thus doesn't need a flag.

Interoperability rules:
- default - set to zero on transmit, ignore on receive
- Advanced interfaces
  o Set supported capabilities on transmit
  o Process on receive

Walked through some examples. Question on how two nodes try to decide on advanced capability. Vivek's preference is to keep this outside of the spec and allow implementation choice. Use of multiple QPNs, use SIDR. Vivek walked through a possible packet format.
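
As an illustration only, the interoperability rules for the capability flags can be sketched as below. The minutes do not record bit positions, so the values here are hypothetical, and treating "process on receive" as taking the capabilities both ends support is an assumption, not something the draft is known to specify.

```python
from enum import IntFlag


class Capability(IntFlag):
    """Advanced capability flags. Bit positions are hypothetical."""
    RC = 0x01   # reliable connected
    UC = 0x02   # unreliable connected
    RD = 0x04   # reliable datagram
    QPN = 0x08  # use of multiple QPNs
    SDP = 0x10  # listed only as "possibly SDP"


# UD is always implied, so it has no flag; zero means "UD only".
NONE = Capability(0)


def flags_on_transmit(supported: Capability, advanced: bool) -> Capability:
    # Default interfaces set the flags to zero on transmit;
    # advanced interfaces set their supported capabilities.
    return supported if advanced else NONE


def flags_on_receive(received: Capability, supported: Capability,
                     advanced: bool) -> Capability:
    # Default interfaces ignore the flags on receive.
    # Assumption: an advanced interface "processes" the flags by keeping
    # only the capabilities that both ends support.
    return (received & supported) if advanced else NONE
```

For example, an advanced node supporting RC and QPN that hears from a peer advertising RC and UC would end up with RC only, while a default node would ignore the peer's flags entirely.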
Vivek has gotten some comments from the reflector (large MTU size, eliminate TCP checksum, add SDP flag). JimP stated that he felt the main value in this approach is the larger MTU. Vivek agreed. JimP voiced a concern that the reflector seemed to think that path MTU could be leveraged to support this. JimP's opinion is that this much more closely maps to the ATM model, where a circuit is set up to a particular destination. OSs that support path MTU would not be able to enable this capability easily. There seems to be some consensus on this point.

DHCP Discussion

Vivek voiced some concern about re-opening the argument of LID/GID, and while he would be fine with just using the LID, he does not recommend re-opening the issue. Thus he is recommending that we use the broadcast GID for the server reply.

Proposal: Htype = 32, hlen = 0

Always use client identifier.
- 4 bytes (default zero)
  o distinguishes multiple IPoIB interfaces per port
  o timestamp or QPN - client needs to remember
- 16 bytes (GID)

Claims no change to the DHCP server.

Jerry comments on . Jerry would like to see a solution that does not require either a server side or a client side change. JimP and several others comment that we can probably get away without a server change, but the client will have to be changed to some degree.

Another concern was making sure it was clear that the client-id should be preserved across a reboot.

Jerry mentioned that he's not that concerned about using the broadcast. He is concerned that there are other protocols out there that we haven't looked at yet (similar to DHCP in that there were open issues), and strongly recommends folks start looking at other protocols on IPoIB.

MIB Status - led by Sean Harnedy

Reviewed the MIBs:
- textual convention MIB
- Interface MIB
- Subnet management agent MIB - has a revised version going to the reflector shortly.

Some comments were to possibly make the counters 64 bits. Pointed out that we have drafts on 2 out of 5 chartered MIBs.
Jerry commented that we don't need editors - we have a lack of participation - we need draft writers, not editors.

Next Steps:

Basic drafts - resolve remaining issues (ethertype specifically). Jerry asked the group whether people felt that there had been enough discussion on the reflector. Quick summary of issues. The main technical issue is that if ethertype is not present, then a random protocol can't run on top. Several folks discussed their perspectives. Some stated it is cleaner to have it explicit. Some claimed it is cleaner not to. Vivek pointed out that in the advanced case, ethertype is not needed. Jerry pointed out that even in the advanced case you might want to share the QP.

Jerry stated his personal opinion (not as co-chair) that he is not sure he buys the argument that it is okay to dedicate the QP. He thinks it is valid from an IBTA perspective - a QP is a service point. Jerry's concern is that there may not be that many QPs.

JimP also stated he felt it was cleaner, and it doesn't cost very much for a lot of flexibility in the future.

Jerry asked for any more comments. None offered.

Sense of the room
- People voting for ethertype - 8
- People voting against ethertype - 2
- People who don't care one way or the other - 5
- Those who need more time to have an opinion - 2

Folks that need more time stated that getting more opinion from the reflector would be a good thing.

Jerry is concerned that we haven't gotten enough active review of the drafts. He also asked for more authors.

MIBs - need more participation
Advanced features: need more discussion
Issue last call before next IETF?