[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Linas Veps » Thu, 25 Aug 2005 08:50:15


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Various PCI bus errors can be signaled by newer PCI controllers. The
core error recovery routines are architecture dependent. This patch adds
a recovery infrastructure for the PPC64 pSeries systems.

Signed-off-by: Linas Vepstas < XXXX@XXXXX.COM >

--
arch/ppc64/kernel/Makefile | 2
arch/ppc64/kernel/eeh.c | 543 +++++++++++++++++++++++++++++------------
arch/ppc64/kernel/eeh_driver.c | 361 +++++++++++++++++++++++++++
arch/ppc64/kernel/eeh_event.c | 116 ++++++++
arch/ppc64/kernel/eeh_event.h | 52 +++
arch/ppc64/kernel/rtas_pci.c | 5
include/asm-ppc64/eeh.h | 105 +++++--
include/asm-ppc64/prom.h | 10
include/asm-ppc64/rtas.h | 2
9 files changed, 1001 insertions(+), 195 deletions(-)

Index: linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.13-rc6-git9.orig/arch/ppc64/kernel/eeh.c 2005-08-19 12:52:31.000000000 -0500
+++ linux-2.6.13-rc6-git9/arch/ppc64/kernel/eeh.c 2005-08-23 16:53:05.000000000 -0500
@@ -1,32 +1,33 @@
/*
+ *
* eeh.c
* Copyright (C) 2001 Dave Engebretsen & Todd Inglett IBM Corporation
- *
+ *
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
- *
+ *
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
- *
+ *
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/

-#include <linux/bootmem.h>
+#include <linux/delay.h>
#include <linux/init.h>
+#include <linux/irq.h>
#include <linux/list.h>
-#include <linux/mm.h>
-#include <linux/notifier.h>
#include <linux/pci.h>
#include <linux/proc_fs.h>
#include <linux/rbtree.h>
#include <linux/seq_file.h>
#include <linux/spinlock.h>
+#include <asm/atomic.h>
#include <asm/eeh.h>
#include <asm/io.h>
#include <asm/machdep.h>
@@ -34,6 +35,7 @@
#include <asm/atomic.h>
#include <asm/systemcfg.h>
#include "pci.h"
+#include "eeh_event.h"

#undef DEBUG

@@ -49,8 +51,8 @@
* were "empty": all reads return 0xff's and all writes are silently
* ignored. EEH slot isolation events can be triggered by parity
* errors on the address or data busses (e.g. during posted writes),
- * which in turn might be caused by dust, vibration, humidity,
- * radioactivity or plain-old failed hardware.
+ * which in turn might be caused by low voltage on the bus, dust,
+ * vibration, humidity, radioactivity or plain-old failed hardware.
*
* Note, however, that one of the leading causes of EEH slot
* freeze events are buggy device drivers, buggy device mic
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Paul Macke » Thu, 25 Aug 2005 09:50:10

Linas Vepstas writes:

In this patch at least, your mailer seems to have blanked out lines
that match ^[-+]$. Could you send them to me again with a different
mailer or put them on a web or ftp site somewhere?

Thanks,
Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/

 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Paul Macke » Thu, 25 Aug 2005 16:40:11


I got 3 copies of each of these mails, one directly, one through
linuxppc64-dev and one through linux-kernel. It looks like the copies
that came through the mailing lists are OK but the copy that came
directly to me is corrupted. Weird.

Regards,
Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by John Ros » Fri, 26 Aug 2005 01:00:24

i Linas-

I like the idea of splitting the recovery stuff into its own driver. A
few comments on the last reorg patch:

...
...

Inconsistent spacing in new code... </nit>


This probably isn't the right header description for this file :)


This function breaks if rpaphp is compiled as a module. It's probably
bad for kernel code to depend on symbols exported from modules. This
raises two larger questions: Where should this new driver sit, and
should it be possible to compile it as a module as well?

...

Header doesn't match the function :)


This dependence on struct hotplug_slot might be problematic as we
restrict the registration of "PCI hotplug" slots to exclude PHBs, VIO,
and embedded slots. Noticed your comment to this effect. I can work
with you offline on this.

...

The second part of this conditional will never be true, as this has just
been checked above.


This conditional will always be true, as this has also ben checked
above.


How about a pointer to a struct of EEH fields? Folks are touchy about
adding anything PCI-specific to device nodes, especially since most DNs
aren't PCI at all.

Thanks-
John

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Linas Veps » Fri, 26 Aug 2005 01:40:11

On Wed, Aug 24, 2005 at 10:45:31AM -0500, John Rose was heard to remark:

Yes, this file is a little ball of ugliness that resulted from moving
things out of the rpaphp directory; and, yes, it's rather
un-reconstructed. I released it under the "release early" program.

The meta-issue that I'd like to reach consensus on first is whether
there should be any hot-plug recovery attempted at all. Removing
hot-plug-recovery support will make many of the issues you raise
to be moot.


I attempted to remove all of the pci-related stuff from this struct,
and got a crash in very very early boot (before the transition from
real to virtual addressing). Not sure why, I was surprised. It seems
related to the flattening of the device ndode tree. I'll try again
soon.

--linas
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Paul Macke » Fri, 26 Aug 2005 09:20:08

Linas Vepstas writes:


Yes, this probably the thorniest issue we have.

My feeling is that the unplug half of it is probably fairly
uncontroversial, but the replug half is a can of worms. Would you
agree with that?

Is it udev that handles the hotplug notifications on the userspace
side in all cases (do both RHEL and SLES use udev, for instance)?
How hard is it to add a new sort of notification, on the kernel side
and in udev?

I think what I'd like to see is that when a slot gets isolated and the
driver doesn't have recovery code, the kernel calls the driver's
unplug function and generates a hotplug event to udev. Ideally this
would be a variant of the remove event which would say "and by the
way, please try replugging this slot when you've finished handling the
remove event" or something along those lines.

Thoughts?

Paul.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Benjamin H » Fri, 26 Aug 2005 10:00:14


I'm still trying to understand why we care. What prevents us from just
uplugging the previous device and re-plugging right away ? After all,
the driver->remove() function is supposed to guarantee that no HW access
will happen after it returns and that everything was unmapped.

Of course, we'll possibly end up with a different ethX or whatever, but
I don't see the problem with that ... It's hopeless to think we might
manage to keep that identical anyway, unless the driver implements
proper error recovery.

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Linas Veps » Sat, 27 Aug 2005 01:20:15

n Thu, Aug 25, 2005 at 10:10:45AM +1000, Paul Mackerras was heard to remark:

Actually, no. There are three issues:
1) hotplug routines are called from within kernel. GregKH has stated on
multiple occasions that doing this is wrong/bad/evil. This includes
calling hot-unplug.

2) As a result, the code to call hot-unplug is a bit messy. In
particular, there's a bit of hoop-jumping when hotplug is built as
as a module (and said hoops were wrecked recently when I moved the
code around, out of the rpaphp directory).

3) Hot-unplug causes scripts to run in user-space. There is no way to
know when these scripts are done, so its not clear if we've waited
long enough before calling hot-add (or if waiting is even necessary).


Yes, and it seems to work fine despite the fact that the current
sles9/rhel4 use rather oooooold versions of udev, which is criminal,
according to Kay Seivers. I have not tested new versions of udev;
I assume new versions will work "even better".


Why? To acheive what goal? (Keep in mind that user-space eeh solutions
seem to fail when the affected device is storage, since block devices
and filesystems don't like the underlying storage to wink out.)


Ahh, yes, this addresses the timing issue raised in point 3). However,
I'm thinking that the timing issue is not really an issue, depending on
how udev is designed. For example, consider a 100% cpu loaded, heavy
i/o loaded PC, and now we rapidly unplug and replug some USB key.
Presumably, udev will handle this gracefully. The kernel error recovery
essentially looks the same to udev: the burp on the bus looks like a
rapid-fire unplug-replug.

BTW, zSeries has similar concerns, where channels can come and go,
causing thousands of unplug/replug udev events in rapid succession.
(This was discussed on the udev mailing lists in the past). It might
be interesting to have the zSeries folks discuss the current EEH design,
as this is something they have far more experience with than any of
the pc or unix crowd. I personally have not discussed with any zSeries
people.

--linas

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Linas Veps » Sat, 27 Aug 2005 01:30:13

On Thu, Aug 25, 2005 at 10:49:03AM +1000, Benjamin Herrenschmidt was heard to remark:

Yep, but that's not an issue, since all the various device-naming
schemes are supposed to be fixing this. Its a distinct problem;
it needs to be solved even across cold-boots.

(Didn't I ever tell you about the day I added a new disk controller to
my system, and /dev/hda became /dev/hde and thus /home was mounted on
/usr and /var as /etc and all hell broke loose? Owww, device naming
is a serious issue for home users and even more so for enterprise-class
users).

--linas


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Benjamin H » Sat, 27 Aug 2005 06:50:14


Ok, so what is the problem then ? Why do we have to wait at all ? Why
not just unplug/replug right away ?


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Paul Macke » Sat, 27 Aug 2005 08:20:08

Benjamin Herrenschmidt writes:


We'd have to be absolutely certain that the driver could not possibly
take another interrupt or try to access the device on behalf of the
old instance of the device by the time it returned from the remove
function. I'm not sure I'd trust most drivers that far...

Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Benjamin H » Sat, 27 Aug 2005 08:50:11


Hrm... If a driver gets that wrong, then it will also blow up when
unloaded as a module. All drivers should be fully shut down by the time
they return from remove(). free_irq() is synchronous as is iounmap() and
both of those are usually called as part of remove(). I wouldn't be too
worried here.

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Paul Macke » Tue, 30 Aug 2005 15:50:12

Linas Vepstas writes:


One way to clean this up would be to make rpaphp the driver for the
EADS bridges (from the pci code's point of view). Then it would
automatically get included in the error recovery process and could do
whatever it should.


OK, so let's just add a new hotplug event called KOBJ_ERROR or
something, which tells userspace that an error has occurred which has
made the device inaccessible. Greg, would that be OK?

The only trouble with that is that if we don't have a hotplug bridge
driver loaded for the slot, we can't tell the driver that its device
has gone away. I guess that's not a big problem in practice.

Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Linas Veps » Wed, 31 Aug 2005 01:00:12

On Fri, Aug 26, 2005 at 07:43:57AM +1000, Benjamin Herrenschmidt was heard to remark:


Paranoia + old versions of udev. I beleive that older versions of udev
(such as the ones currently shipping with Red Hat RHEL4 and SuSE SLES9)
failed to serialize events properly. I beleive that the newer versions
do serialize, but have not verified/tested.

--linas

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

[patch 8/8] PCI Error Recovery: PPC64 core recovery routines

Post by Linas Veps » Wed, 31 Aug 2005 01:10:06

On Fri, Aug 26, 2005 at 09:37:36AM +1000, Benjamin Herrenschmidt was heard to remark:



:) We've discovered two, so far, that blow up when unloaded:
lpfc and e1000. I beleive these are now fixed in mainline.

--linas
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/