ChangeSet 1.1238, 2003/06/18 16:54:00-07:00, david-b@pacbell.net

[PATCH] USB: SMP ehci-q.c 1010 BUG()

Stefano Barbato wrote:
> Dual PIII
> kernel 2.4.21-rc2 (w/ SMP)  (2.5.69 below)
> ...
>
> I put a few printk before the BUG() and I found that the offending if() is
> this:
>         if(qh->qh_state != QH_STATE_LINKED
>                                 && qh->qh_state != QH_STATE_UNLINK_WAIT)
>
> because qh_state were QH_STATE_COMPLETING.

I got a similar SMP report recently, but without info about
which clause was failing -- which is a key clue, thanks!!

The COMPLETING state is used only while a QH is being
scanned for completed TDs.  (Think CPU-0 irq handler.)
Looking at the handful of places that call the routine
reporting the BUG(), a couple seem like they could make
trouble with multiple CPUs in the driver.


 drivers/usb/host/ehci-q.c |   11 +++++++----
 1 files changed, 7 insertions(+), 4 deletions(-)


diff -Nru a/drivers/usb/host/ehci-q.c b/drivers/usb/host/ehci-q.c
--- a/drivers/usb/host/ehci-q.c	Wed Jun 18 17:35:29 2003
+++ b/drivers/usb/host/ehci-q.c	Wed Jun 18 17:35:29 2003
@@ -958,11 +958,11 @@
 	qh->qh_state = QH_STATE_IDLE;
 	qh->qh_next.qh = 0;
 	qh_put (ehci, qh);			// refcount from reclaim 
-	ehci->reclaim = 0;
-	ehci->reclaim_ready = 0;
 
 	/* other unlink(s) may be pending (in QH_STATE_UNLINK_WAIT) */
 	next = qh->reclaim;
+	ehci->reclaim = next;
+	ehci->reclaim_ready = 0;
 	qh->reclaim = 0;
 
 	qh_completions (ehci, qh, regs);
@@ -981,8 +981,10 @@
 			timer_action (ehci, TIMER_ASYNC_OFF);
 	}
 
-	if (next)
+	if (next) {
+		ehci->reclaim = 0;
 		start_unlink_async (ehci, next);
+	}
 }
 
 /* makes sure the async qh will become idle */
@@ -1086,7 +1088,8 @@
 			if (list_empty (&qh->qtd_list)) {
 				if (qh->stamp == ehci->stamp)
 					action = TIMER_ASYNC_SHRINK;
-				else if (!ehci->reclaim)
+				else if (!ehci->reclaim
+					    && qh->qh_state == QH_STATE_LINKED)
 					start_unlink_async (ehci, qh);
 			}
 
