#319820 - 27/02/2009 01:26
11-hour long fsck and still running...
|
old hand
Registered: 15/02/2002
Posts: 1049
|
I'm 11.5 hours into a manual fsck on a 60GB drive. Quite a few duplicate/bad blocks so far. Dead or dying drive, right?
I'm doing this because a sync was interrupted when I tripped over the power supply and unplugged the empeg power. I had been having intermittent "no hard drive detected" errors on startup, but I had been attributing those to an IDE header that needs soldering.
Is there any situation where a properly functioning 60GB disk would take this long to complete fsck?
Thanks again, guys.
Jim
|
#319829 - 27/02/2009 08:04
Re: 11-hour long fsck and still running...
[Re: TigerJimmy]
|
carpal tunnel
Registered: 20/12/1999
Posts: 31600
Loc: Seattle, WA
|
Good question. Mark, would cable or header trouble also cause this? My guess is yes it might.
Did you look at the cable and header before starting the FSCK?
|
#319831 - 27/02/2009 09:04
Re: 11-hour long fsck and still running...
[Re: tfabris]
|
carpal tunnel
Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
|
"My guess is yes it might."
Mine too...
"Did you look at the cable and header before starting the FSCK?"
A very good question; an fsck that intermittently finds that it can't write to the drive is likely to do more harm than good.
Peter
|
#319834 - 27/02/2009 12:10
Re: 11-hour long fsck and still running...
[Re: TigerJimmy]
|
carpal tunnel
Registered: 29/08/2000
Posts: 14496
Loc: Canada
|
"I'm 11.5 hours into a manual fsck on a 60GB drive. Quite a few duplicate/bad blocks so far. Dead or dying drive, right?"
Dunno. But as usual, a serial port log will tell the true story. No point in even speculating without first looking there.
11.5 hours is too long. Something else is wrong. If you are doing this from (J)emplode, then the app may simply have gotten confused. Use the serial port and Ctrl-C, and kick off the fsck by hand from the command line.
-ml
|
#319897 - 02/03/2009 15:42
Re: 11-hour long fsck and still running...
[Re: mlord]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
This was a command line fsck. It finally finished after about 23 hours. The second drive completed in under 3 minutes. That rules out the interface, doesn't it? I'll post a serial port log shortly.
Thanks!
|
#319902 - 02/03/2009 16:31
Re: 11-hour long fsck and still running...
[Re: TigerJimmy]
|
carpal tunnel
Registered: 20/12/1999
Posts: 31600
Loc: Seattle, WA
|
"That rules out the interface, doesn't it?"
Hm, Mark would be able to answer this definitively, but I don't think so. I think that if some of the connections on the header or cable were bad, it might have the capability to cause errors on one drive but not on the other. Especially if the connection problems were on the cable connector that connects to the drive (just like pictured in the FAQ).
|
#320164 - 09/03/2009 14:09
Re: 11-hour long fsck and still running...
[Re: mlord]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
Those are disk errors, right???
Thanks in advance,
Jim
empeg-car bootstrap v1.02 20001106 (hugo@empeg.com)
If there is anyone present who wants to upgrade the flash, let them speak now, or forever hold their peace...it seems not.
Let fly the Penguins of Linux!
e000 v1.04
Copying kernel...
Calling linux kernel...
Uncompressing Linux..................................... done, booting the kernel.
Linux version 2.2.17-rmk5-np17-empeg55-hijack-v508 (hijack@rtr.ca) (gcc version 2.95.3 20010315 (release)) #2 Fri Jan 9 16:06:35 EST 2009
Processor: Intel StrongARM-1100 revision 11
Checking for extra DRAM:
c1000000: wrote ffffffff, read e28cc001
NetWinder Floating Point Emulator V0.94.1 (c) 1998 Corel Computer Corp.
empeg-car player (hardware revision 9, serial number 40103176) 16MB DRAM
Command line: mem=16m
Calibrating delay loop... 207.67 BogoMIPS
Memory: 15000k/16M available (996k code, 20k reserved, 364k data, 4k init)
Dentry hash table entries: 2048 (order 2, 16k)
Buffer cache hash table entries: 16384 (order 4, 64k)
Page cache hash table entries: 4096 (order 2, 16k)
POSIX conformance testing by UNIFIX
Linux NET4.0 for Linux 2.2
Based upon Swansea University Computer Society NET3.039
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
TCP: Hash tables configured (ehash 16384 bhash 16384)
IrDA (tm) Protocols for Linux-2.2 (Dag Brattli)
Starting kswapd v 1.5
SA1100 serial driver version 4.27 with no serial options enabled
ttyS00 at 0xf8010000 (irq = 15) is a SA1100 UART
ttyS01 at 0xf8050000 (irq = 17) is a SA1100 UART
ttyS02 at 0xf8030000 (irq = 16) is a SA1100 UART
Signature is 206f6972 'rio '
Tuner: loopback=0, ID=-1
show_message("Hijack v508 by Mark Lord")
empeg display initialised.
empeg dsp audio initialised
empeg dsp mixer initialised
empeg dsp initialised
empeg audio-in initialised, CS4231A revision a0
empeg remote control/panel button initialised.
empeg usb initialised, PDIUSBD12 id 1012
empeg state support initialised 0089/88c1 (save to d0004500).
empeg RDS driver initialised
empeg power-pic driver initialised (first boot)
RAM disk driver initialized: 16 RAM disks of 4096K size
empeg single channel IDE
Probing primary interface...
ide_data_test: wrote 0x0000 read 0xff80
ide_data_test: wrote 0xffff read 0xff80
ide_data_test: wrote 0xaaaa read 0xaa80
ide_data_test: wrote 0x5555 read 0x5580
ide_data_test: wrote 0x0000 read 0xff20
ide_data_test: wrote 0xffff read 0xff20
ide_data_test: wrote 0xaaaa read 0xaa20
ide_data_test: wrote 0x5555 read 0x5520
ide_data_test: wrote 0x0000 read 0xff20
ide_data_test: wrote 0xffff read 0xff20
ide_data_test: wrote 0xaaaa read 0xaa20
ide_data_test: wrote 0x5555 read 0xd720
hda: TOSHIBA MK8025GAS, ATA DISK drive
ide_data_test: wrote 0x0000 read 0xff00
ide_data_test: wrote 0xffff read 0xff00
ide_data_test: wrote 0xaaaa read 0xaa00
ide_data_test: wrote 0x5555 read 0x5500
hda: TOSHIBA MK8025GAS, ATA DISK drive
ide_data_test: wrote 0x0000 read 0xff00
ide_data_test: wrote 0xffff read 0xff00
ide_data_test: wrote 0xaaaa read 0xaa00
ide_data_test: wrote 0x5555 read 0x5500
hda: TOSHIBA MK8025GAS, ATA DISK drive
ide_data_test: wrote 0x0000 read 0xff00
ide_data_test: wrote 0xffff read 0xff00
ide_data_test: wrote 0xaaaa read 0xaa00
ide_data_test: wrote 0x5555 read 0xd500
hda: TOSHIBA MK8025GAS, ATA DISK drive
ide_data_test: wrote 0x0000 read 0xff00
ide_data_test: wrote 0xffff read 0xff00
ide_data_test: wrote 0xaaaa read 0xaa00
ide_data_test: wrote 0x5555 read 0x5500
hda: TOSHIBA MK8025GAS, ATA DISK drive
ide_data_test: wrote 0x0000 read 0xff00
ide_data_test: wrote 0xffff read 0xff00
ide_data_test: wrote 0xaaaa read 0xaa00
ide_data_test: wrote 0x5555 read 0x5500
hda: TOSHIBA MK8025GAS, ATA DISK drive
ide0 at 0x000-0x007,0x038 on irq 6
hda: TOSHIBA MK8025GAS, 76319MB w/0kB Cache, CHS=9729/255/63
empeg-flash driver initialized
smc chip id/revision 0x3349
smc9194.c:v0.12 03/06/96 by Erik Stahlman (erik@vt.edu)
SMC9194: SMC91C94(r:9) at 0x4008000 IRQ:7 INTF:TP MEM:6144b MAC 00:02:d7:28:0c:68
Partition check:
hda: unknown partition table
RAMDISK: ext2 filesystem found at block 0
RAMDISK: Loading 320 blocks [1 disk] into ram disk... |/-\|/-\|/-\|/-\|/-\done.
EXT2-fs warning: checktime reached, running e2fsck is recommended
VFS: Mounted root (ext2 filesystem).
empeg-pump v0.03 (19980601) Press Ctrl-A to enter pump...attempt to access beyond end of device
03:05: rw=0, want=2, limit=0
dev 03:05 blksize=1024 blocknr=1 sector=2 size=1024 count=1
EXT2-fs: unable to read superblock
Kernel panic: VFS: Unable to mount root fs on 03:05
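(A note for anyone reading the last few lines of this log: the kernel names the root device as 03:05, a major:minor pair printed in hex. Below is a minimal Python sketch of decoding it, assuming the classic Linux IDE numbering where major 3 is the primary interface and each drive gets 64 minors. The helper name decode_ide_dev is made up for illustration.)

```python
def decode_ide_dev(dev):
    """Translate a kernel 'MM:mm' hex device pair into a /dev name.

    Assumption: classic Linux IDE layout, where major 3 is the primary
    interface, minors 0-63 belong to hda and minors 64-127 to hdb.
    """
    major_s, minor_s = dev.split(":")
    major, minor = int(major_s, 16), int(minor_s, 16)
    if major != 3 or minor > 127:
        raise ValueError("not a primary-IDE device: " + dev)
    drive = "ab"[minor // 64]   # first or second drive on the interface
    part = minor % 64           # 0 means the whole disk
    return "/dev/hd" + drive + (str(part) if part else "")


print(decode_ide_dev("03:05"))
```

Under those assumptions the panic refers to partition 5 of hda, which is consistent with the "hda: unknown partition table" line earlier in the log: with no partition table there is no partition 5 to read (limit=0).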
|
#320169 - 09/03/2009 14:35
Re: 11-hour long fsck and still running...
[Re: TigerJimmy]
|
carpal tunnel
Registered: 29/08/2000
Posts: 14496
Loc: Canada
|
Nope.
But it looks like this is a new disk? One that has never had empeg stuff on it before?
-ml
|
#320185 - 09/03/2009 18:23
Re: 11-hour long fsck and still running...
[Re: mlord]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
Ah, no. It's an old disk that the builder image wouldn't (re)build. So I installed the developer image and zeroed out the partition table (the first part of the manual build process) and then tried the builder image again (because I was lazy and doing other stuff). Now I just get Hard Disk Not Found...
Edited by TigerJimmy (09/03/2009 18:25)
|
#320186 - 09/03/2009 18:26
Re: 11-hour long fsck and still running...
[Re: TigerJimmy]
|
carpal tunnel
Registered: 29/08/2000
Posts: 14496
Loc: Canada
|
Well, the drive is fine. So zero the partition table (again), and then grab builder_bigdisk_v4.upgrade and zap it with that.
-ml
|
#320187 - 09/03/2009 18:46
Re: 11-hour long fsck and still running...
[Re: mlord]
|
carpal tunnel
Registered: 20/12/1999
Posts: 31600
Loc: Seattle, WA
|
"Well, the drive is fine. So zero the partition table (again), and then grab builder_bigdisk_v4.upgrade and zap it with that."
Which is located at http://rtr.ca/bigdisk/ by the way.
This will take care of the drive, but he still needs to solve this problem he mentioned in his original post:
"I had been having intermittent no hard drive detected errors on startup"
|
#320189 - 09/03/2009 19:56
Re: 11-hour long fsck and still running...
[Re: mlord]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
That's the builder that didn't do it the first time...
OK, I'm done for the day, so I'll tear into it and see if I can't make it work.
Thanks!
|
#320190 - 09/03/2009 20:20
Re: 11-hour long fsck and still running...
[Re: TigerJimmy]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
The builder seems to have worked now. I think the issue may have been that the only drive attached was the slave? Could that be a problem?
|
#320191 - 09/03/2009 20:49
Re: 11-hour long fsck and still running...
[Re: TigerJimmy]
|
carpal tunnel
Registered: 20/12/1999
Posts: 31600
Loc: Seattle, WA
|
"the issue may have been that the only drive attached was the slave? Could that be a problem?"
If you neglected to remove the slave jumper before attempting to run the builder on it, yeah, that would do it. There has to be a master there before a slave will work, so if you've got only one drive in there, it's gotta be jumperless (i.e., a master).
|
#320203 - 10/03/2009 14:03
Re: 11-hour long fsck and still running...
[Re: tfabris]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
Yes, that was the problem. I had one drive only, plugged in to the slave position on the cable, and jumpered as a slave, but with no master drive installed. Changed all that around and ran the builder with the disk on the master and it built and worked. In retrospect, probably a stupid mistake.
|
#320204 - 10/03/2009 14:04
Re: 11-hour long fsck and still running...
[Re: tfabris]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
"Well, the drive is fine. So zero the partition table (again), and then grab builder_bigdisk_v4.upgrade and zap it with that."
"Which is located at http://rtr.ca/bigdisk/ by the way. This will take care of the drive, but he still needs to solve this problem he mentioned in his original post: I had been having intermittent no hard drive detected errors on startup"
Yeah, I'm pretty sure that Stu has this fixed for me. He's resoldered the IDE header and recrimped the cable. I haven't had the detection problems while running on my spare (except for when the disk wasn't building because it was a slave-only configuration).
|
#320208 - 10/03/2009 15:40
Re: 11-hour long fsck and still running...
[Re: mlord]
|
carpal tunnel
Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
|
"Those are disk errors, right???
[...]
ide_data_test: wrote 0x0000 read 0xff00
ide_data_test: wrote 0xffff read 0xff00
ide_data_test: wrote 0xaaaa read 0xaa00
ide_data_test: wrote 0x5555 read 0x5500"
I think these messages have worried a lot of people, over the years... if these differences between "wrote" and "read" are perfectly normal, is there perhaps some way the message could be reworded to sound less error-like?
Peter
|
#320210 - 10/03/2009 15:46
Re: 11-hour long fsck and still running...
[Re: peter]
|
carpal tunnel
Registered: 20/12/1999
Posts: 31600
Loc: Seattle, WA
|
Yup, as Peter is saying, the messages tend to look a bit like errors even when they aren't. Don't know how Mark could make them look less like errors... because until the drive is fully spun up and functioning, they really are errors (if I'm understanding the way they work correctly). (For completeness' sake, here is the description of how the IDE data test messages are used.)
|
#320240 - 11/03/2009 12:22
Re: 11-hour long fsck and still running...
[Re: peter]
|
carpal tunnel
Registered: 29/08/2000
Posts: 14496
Loc: Canada
|
Mmm.. I wonder if perhaps this:
ide_probe: wrote 0xffff read 0xff00
??
|
#320246 - 11/03/2009 12:49
Re: 11-hour long fsck and still running...
[Re: mlord]
|
carpal tunnel
Registered: 18/01/2000
Posts: 5683
Loc: London, UK
|
"Mmm.. I wonder if perhaps this:
ide_probe: wrote 0xffff read 0xff00
??"
Maybe highlight the ones that match:
ide_probe: wrote 0xffff read 0xff00
ide_probe: wrote 0xffff read 0xffff - OK
That way, normal disks go from being not OK (without shouting) to OK (with shouting); broken disks never state OK. Bit less scary?
_________________________
-- roger
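(For illustration, Roger's suggestion could be sketched roughly as follows. This is Python rather than the actual Hijack kernel code, and format_probe is a hypothetical name; only the message wording comes from the posts above.)

```python
def format_probe(wrote, read):
    """Format one probe result, flagging only readbacks that match exactly."""
    line = "ide_probe: wrote 0x%04x read 0x%04x" % (wrote, read)
    if wrote == read:
        line += " - OK"   # shout only when the test pattern came back intact
    return line


# The two example cases from the suggestion above.
for wrote, read in [(0xFFFF, 0xFF00), (0xFFFF, 0xFFFF)]:
    print(format_probe(wrote, read))
```

A mismatching line is printed exactly as in the renamed message, so nothing is lost for diagnostics; only the matching case gains a suffix.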
|
#320249 - 11/03/2009 14:02
Re: 11-hour long fsck and still running...
[Re: Roger]
|
carpal tunnel
Registered: 29/08/2000
Posts: 14496
Loc: Canada
|
Perhaps, yes.
I wonder though, if then it will lead to even more inquiries as to why some of the tests "fail" (not "OK") whereas others don't.
People are strange beasts at times. The only way to keep them from asking is to remove the messages (MS style). But these are incredibly useful diagnostics, so they're staying put.
Cheers
|
#320252 - 11/03/2009 14:12
Re: 11-hour long fsck and still running...
[Re: mlord]
|
carpal tunnel
Registered: 18/01/2000
Posts: 5683
Loc: London, UK
|
"The only way to keep them from asking is to remove the messages (MS style)."
As I'm currently spending a few months in the development end of our support team, I'm beginning to come round to that point of view.
_________________________
-- roger
|
#320260 - 11/03/2009 14:59
Re: 11-hour long fsck and still running...
[Re: Roger]
|
carpal tunnel
Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
|
"That way, normal disks go from being not OK (without shouting) to OK (with shouting); broken disks never state OK. Bit less scary?"
A bit, maybe, but note that the drive in this thread never tests OK by that criterion (which is also the criterion in the FAQ) -- the bottom four bits come back zero the whole time (and the bottom 8 most of the time), but apparently this is still actually OK?
Peter
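(The observation about the low bits can be checked mechanically. Here is a rough Python sketch that parses a few ide_data_test lines copied from the boot log earlier in the thread and reports which bits of the readback differ from what was written. The function names and the regular expression are illustrative assumptions, not part of any empeg tool.)

```python
import re

# Matches lines such as "ide_data_test: wrote 0x5555 read 0x5500".
LINE_RE = re.compile(
    r"ide_data_test: wrote 0x([0-9a-fA-F]{4}) read 0x([0-9a-fA-F]{4})"
)


def parse_pairs(log):
    """Return a list of (wrote, read) integer pairs found in boot-log text."""
    return [(int(w, 16), int(r, 16)) for w, r in LINE_RE.findall(log)]


def differing_bits(wrote, read):
    """Return bit positions (0 = least significant) where the readback differs."""
    diff = wrote ^ read
    return [bit for bit in range(16) if diff & (1 << bit)]


# A few lines copied from the boot log posted earlier in the thread.
SAMPLE = """\
ide_data_test: wrote 0x0000 read 0xff80
ide_data_test: wrote 0x5555 read 0xd720
ide_data_test: wrote 0xffff read 0xff00
ide_data_test: wrote 0x5555 read 0x5500
"""

pairs = parse_pairs(SAMPLE)
for wrote, read in pairs:
    print("wrote 0x%04x read 0x%04x differs at bits %s"
          % (wrote, read, differing_bits(wrote, read)))

# Check whether the bottom four bits of every readback are zero.
print(all(read & 0x000F == 0 for _, read in pairs))
```

This only describes what the log shows. As Mark says in the next reply, the values are only meaningful in the context of a known hardware fault.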
|
Top
|
|
|
|
#320261 - 11/03/2009 16:21
Re: 11-hour long fsck and still running...
[Re: peter]
|
carpal tunnel
Registered: 29/08/2000
Posts: 14496
Loc: Canada
|
Yup, that drive is just fine.
The "data test" messages are *ONLY* meaningful in the context of a known hardware fault.
-ml
|
#320262 - 11/03/2009 16:33
Re: 11-hour long fsck and still running...
[Re: mlord]
|
carpal tunnel
Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
|
"Mmm.. I wonder if perhaps this:
ide_probe: wrote 0xffff read 0xff00
??"
How about just:
ide_probe: 0xffffff00
That way you'd still get all the data, but most users wouldn't even perceive that the two halves of the number were in some way "meant" to be the same.
Peter
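(A small Python sketch of that packed format, assuming the written word goes in the high 16 bits and the readback in the low 16 bits, as the example value suggests. The function names are hypothetical and this is not the kernel's code.)

```python
def format_probe_packed(wrote, read):
    """Pack the written word (high half) and the readback (low half) into one value."""
    packed = ((wrote & 0xFFFF) << 16) | (read & 0xFFFF)
    return "ide_probe: 0x%08x" % packed


def unpack_probe(packed):
    """Recover (wrote, read) from a packed 32-bit probe value."""
    return (packed >> 16) & 0xFFFF, packed & 0xFFFF


print(format_probe_packed(0xFFFF, 0xFF00))
print(unpack_probe(0xFFFFFF00) == (0xFFFF, 0xFF00))
```

Anyone who needs the diagnostic can still split the number back into its two halves, which is the point of the suggestion: no information is removed, it just stops looking like a comparison.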
|