LCGatUSC.HardwareFailures r1.23 - 10 Feb 2011 - 17:32 - MarcosASeco

-- MarcosASeco - 14 Mar 2007
| Node | Failure | Date of Failure | Action Taken | Date of Repair |
| lhcb018.usc.cesga.es | kernel panic | 12/03/2006 | system restarted | - |
| lhcb061.usc.cesga.es | faulty memory: DIMM4 | 20/02/2007 | module replaced | - |
| lhcb063.usc.cesga.es | SCSI timeout, lost disk access | 08/03/2007 | system restarted | - |
| lhcb025.usc.cesga.es | faulty disk | 07/07/2007 | disk replaced | - |
| lhcb062.usc.cesga.es | SCSI timeout, lost disk access | 10/07/2007 | system restarted | - |
| lhcb020.usc.cesga.es | faulty disk | 27/07/2007 | disk replaced | - |
| lhcb052.usc.cesga.es | python process consumed all available memory | 30/07/2007 | system restarted | - |
| lhcb036.usc.cesga.es | faulty disk | 10/08/2007 | disk replaced | - |
| lhcb029.usc.cesga.es | faulty disk | 19/09/2007 | disk replaced | - |
| lhcb041.usc.cesga.es | faulty disk | 19/09/2007 | disk replaced | - |
| lhcb021.usc.cesga.es | faulty disk | 10/10/2007 | disk replaced | - |
| lhcb022.usc.cesga.es | faulty disk | 10/10/2007 | disk replaced | - |
| lhcb031.usc.cesga.es | faulty disk | 10/10/2007 | disk replaced | - |
| lhcb038.usc.cesga.es | faulty disk | 10/10/2007 | disk replaced | - |
| lhcb023.usc.cesga.es | faulty disk | 05/12/2007 | disk replaced | - |
| lhcb030.usc.cesga.es | faulty disk | 05/12/2007 | disk replaced | - |
| lhcb069.usc.cesga.es | faulty motherboard | 08/01/2008 | motherboard replaced | 12/02/2008 |
| lhcb079.usc.cesga.es | faulty motherboard | 20/02/2008 | motherboard and power supply replaced | 05/03/2008 |
| lhcb036.usc.cesga.es | probably a faulty power supply | 12/03/2008 | in progress (now working) | - |
| lhcb079.usc.cesga.es | faulty motherboard and power supply | 18/03/2008 | motherboard and power supply replaced | 23/04/2008 |
| lhcb080.usc.cesga.es | faulty motherboard and power supply | 18/03/2008 | motherboard and power supply replaced | 23/04/2008 |
| lhcb086.usc.cesga.es | faulty motherboard and power supply | 18/03/2008 | motherboard and power supply replaced | 23/04/2008 |
| lhcb074.usc.cesga.es | faulty power supply | 08/04/2008 | power supply replaced | 25/04/2008 |
| lhcb085.usc.cesga.es | faulty power supply | 06/05/2008 | power supply exchanged with the one in lhcb070 | 20/05/2008 |
| lhcb070.usc.cesga.es | faulty sensor fan and faulty power supply (coming from lhcb085 on 20/05/2008) | 13/05/2008 | motherboard and power supply replaced | 02/07/2008 |
| lhcb074.usc.cesga.es | unknown | 02/06/2008 | problem solved after unplugging and re-plugging all the memory | 13/06/2008 |
| lhcb054.usc.cesga.es | faulty DIMM | 30/06/2008 | DIMM replaced | 28/07/2008 |
| lhcb079.usc.cesga.es | after too many 'Correctable ECC' errors the machine would not reboot because it was unable to find any memory | 15/10/2008 | problem disappeared after unplugging and re-plugging all the memory | 16/10/2008 |
| lhcb066.usc.cesga.es | Periodically one bank of memory was disabled because too many 'Correctable ECC' errors occurred; after a reboot things returned to normal. The cause was a faulty DIMM. The faulty DIMM was identified by moving half of the modules to another machine (lhcb064), where the failures then appeared. The failures were reproducible by running memtest long enough | 08/04/2008 | faulty DIMM replaced | 03/11/2008 |
| lhcb065.usc.cesga.es | On 20/10/2008 the memory of lhcb079 was exchanged with the memory of this machine; after around a week and several 'Correctable ECC' errors the machine would not reboot because it was unable to find any memory | 01/11/2008 | All problems in the Caton machines were related to the power supply. The power supplies were changed on all machines between August and September 2008 | 01/04/2009 |
| lhcb027.usc.cesga.es | faulty disk | 17/04/2009 | disk replaced | 20/04/2009 |
| nodo077.inv.usc.es | faulty disk | 01/02/2011 | disk swapped with the one from nodo025 | 03/02/2011 |
| nodo069.inv.usc.es | faulty disk | 01/02/2011 | disk swapped with the one from nodo026 | 03/02/2011 |
| nodo065.inv.usc.es | faulty motherboard and power supply | 01/02/2011 | motherboard and power supply swapped with those from nodo025 | 03/02/2011 |
| nodo109.inv.usc.es | faulty disk | 07/02/2011 | disk replaced | 10/02/2011 |


Copyright © 1999-2019 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.