[direct floppy] Validation of floppy drive format tables #1731

Merged
merged 1 commit into from
Sep 23, 2023

ghaerr
Owner

@ghaerr ghaerr commented Sep 22, 2023

Updates and verifies floppy drive probe_types[] table from master minor_types[] array.

Adds floppy format names to minor_types[] array. No changes made to floppy formats in minor_types[].

There are only two format changes to probe_types[]: the second probe of a 360k floppy used in QEMU, and the 720k second probe now tests for 360k, rather than checking again for the same unchanged entry.

Some floppy format names were changed back to the original naming scheme: e.g. instead of "360k/slow" and "360k/fast", these are named "360k/PC" and "360k/AT". The (original) naming scheme is meant to correspond to the probed floppy size based on the hardware drive type that might be typically found from the IBM "PC" or "AT" timeframe. So "360k/AT" means a 360k floppy format in an AT machine (which would typically be a 1.2M drive, not a 360k drive).

@Mellvik: I finally figured out the issue with QEMU and the seemingly strange data rate for the second probe (now called "360k/AT"). This is a 360k floppy in a 1.2M drive, which uses a 300k data rate, consistent with the entry in the minor_types[] array. The probe table has been updated accordingly.

The "size/PC" vs "size/AT" naming scheme can be interpreted that the former is the original drive type, while the latter is that same drive capacity in a newer drive type, e.g. 360k/AT = 360k in 1.2M drive for 5.25" discs, and 360k/AT = 360k in 720k drive for 3.5" discs.

The probe tables were painstakingly verified with just the two changes made along with format name change to match the entries in the minor drive type array. This will allow the next step to change the probe_types array to just use indices into the minor types array, as well as allowing say, drive minor number 5 (=4+1) to specify 360k/PC and display that name in the driver startup, which wasn't possible prior with NULL name entries.

The current auto-probes will only work with two probes, which on first probe expect the floppy format to match the CMOS setting, and the second probe allows a single second-guessed format. When the probe_types array is changed into using indices into supported floppy formats, a zero-terminated list will be able to be used to specify a longer chain of disk formats to try, starting from a given CMOS-reported drive type.
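A minimal sketch of what a zero-terminated probe chain could look like. The struct, the index values and the `try_fn` callback are all hypothetical and only illustrate the idea; they are not the driver's actual tables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down format entry: only what this sketch needs. */
struct fmt {
    const char *name;   /* e.g. "360k/PC" */
    int rate_kbps;      /* data rate tried for this format */
    int stretch;        /* 1 = double the track number when seeking */
};

/* Index 0 is reserved so that 0 can terminate a probe chain. */
const struct fmt formats[] = {
    { NULL,      0,   0 },
    { "360k/PC", 250, 0 },
    { "1.2M",    500, 0 },
    { "360k/AT", 300, 1 },
};

/* Probe chains for two CMOS drive types, as zero-terminated index lists. */
const unsigned char chain_cmos_360k[]  = { 1, 3, 0 };
const unsigned char chain_cmos_1200k[] = { 2, 3, 0 };

/* Hypothetical media check: nonzero if a read succeeds with this format. */
typedef int (*try_fn)(const struct fmt *f);

/* Walk the chain until a format works; return its index, or 0 if none did. */
int probe_chain(const unsigned char *chain, try_fn try_format)
{
    assert(chain != NULL && try_format != NULL);
    for (; *chain != 0; chain++)
        if (try_format(&formats[*chain]))
            return *chain;
    return 0;
}

/* Stand-in for real I/O: pretend only a 300 kbps read succeeds. */
int only_300k_works(const struct fmt *f)
{
    return f->rate_kbps == 300;
}
```

With this layout, adding a third or fourth probe for a CMOS type is just a longer list; the walking code does not change.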

All this may seem quite complicated, but I think it's coming together!

@ghaerr ghaerr merged commit d998335 into master Sep 23, 2023
2 checks passed
@ghaerr ghaerr deleted the formats branch September 23, 2023 17:58
@Mellvik
Contributor

Mellvik commented Sep 23, 2023

This is a great piece of work, @ghaerr. I look forward to diving in and moving it over.

I've been thinking about the 300/250 issue and while not finding much on the net, here's the explanation - which may unfortunately complicate the probe again (and invalidate the testing I've done on physical hw).

There are two types of dual density 5.25in drives: single speed and dual speed. The HD (1.2M) drives run at 360 rpm, the 360k drives run at 300. So the big drives (should) change to 300 rpm when probing low density. This is technically correct and delivers the 250kbps rate.

But: Some smart IBM people thought this was too complicated (expensive) and decided to support just one speed, 360. Which works fine except the data rate for low density diskettes is now 300, not 250.

So this has nothing to do with AT or PC, but the type of drive installed. Your Compaq - if it has a 5.25in drive - has a dual speed drive. So the optimal probing is to do both... you cannot ask the drive what it does...
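The 250 vs 300 figures follow from the spindle speeds alone. A small sketch of that arithmetic, assuming the data rate scales linearly with rotation speed (the function name is made up for illustration):

```c
#include <assert.h>

/*
 * Data rate seen by the controller when a disk written for nominal_rpm
 * spins at actual_rpm instead: the bit cells pass the head
 * proportionally faster. Integer arithmetic is enough for these values.
 */
int effective_rate_kbps(int nominal_kbps, int nominal_rpm, int actual_rpm)
{
    assert(nominal_rpm > 0 && actual_rpm > 0);
    return nominal_kbps * actual_rpm / nominal_rpm;
}
```

So a 250kbps low density disk spun at 360 rpm instead of 300 rpm comes out as 250 * 360 / 300 = 300kbps, which is the single speed drive case.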

@ghaerr
Owner Author

ghaerr commented Sep 24, 2023

Thanks for the explanation of the dual drive speeds @Mellvik, I was unaware of the history and it gives a very good reason why the data rates are different for 1.2M vs 360k drives.

Agreed with regards to PC vs AT. The drive naming scheme during probing uses 360/PC vs 360/AT to indicate the drive type most typically found on either a PC (360k drive) or an AT (1.2M drive), rather than confusing the user with something like "360k/1.2M", (meaning 360k disk in 1.2M drive) which can seem ambiguous.

So the optimal probing is to do both... you cannot ask the drive what it does...

No we can't ask the drive what it does, but we do ask CMOS what the drive is, and then that drive type (which for this discussion would either be likely 360k or 1.2M for 5.25" systems), is used for starting the two-probe auto probe. For our discussion, if the CMOS drive type is 360k, the auto probe will try 360k/PC then 360k/AT. If the CMOS drive type is 1.2M then the auto probe will try 1.2M then 360k/AT. (The data rate for 360k/PC is 250, and 300 for 360k/AT). All these are in the probe_types[] tables and will be further adjustable when I move to using zero-terminated indices into minor_types[] for probing, which will allow for more than just two probes per CMOS drive type.

So everything should work now for 360k discs in 1.2M or 360k drives, including the proper data rate :) Take a look at the tables to see if we need different probing sequences for the other CMOS drive types.

@ghaerr
Owner Author

ghaerr commented Sep 24, 2023

So everything should work now for 360k discs in 1.2M or 360k drives

@Mellvik, well, I spoke too soon on that! After further testing of this PR on QEMU, I ran into a problem maybe you can help me with: although I thought the DF driver was working on QEMU in the last series of enhancements to it, it actually hasn't been, and now errors so badly it won't boot a 360k drive with root=df0 in /bootopts.

Long story short, I finally traced the problem down to the seek_track = track << floppy->stretch; in redo_fd_request. What appears to be happening (which only occurs on larger block numbers than the root directory since I never tested it fully), is that QEMU isn't working when the FDC is told to seek to a track larger than the number of cylinders on the disk when stretch is 1 and the drive supposedly requires "track doubling".

Here's the entry for the "360k/AT" floppy (note that stretch is 1, and data rate is 0x01 (=300k required for QEMU)):

    { 720,  9, 2, 40, 1, 0x23, 0x01, 0xDF, 0x50, "360k/AT"},    /* 360kB in 1.2MB drive */

The source code comment defines "stretch":

 * Stretch tells if the tracks need to be doubled for some
 * types (ie 360kB diskette in 1.2MB drive etc).

This seemingly indicates that a 1.2M drive needs to have the track number doubled using seek_track = track << floppy->stretch; when stretch is 1, which is the case for 360k floppies in 1.2M drives. This doesn't work on QEMU. I only happened to find it when booting from a floppy where /dev/tty1 open from /bin/getty happened to be a larger block and the track number doubled from 39 to 78, which of course isn't present on a 360k drive, but apparently is required to be sent to the controller when using 1.2M drives.
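To make the doubling concrete, here is a tiny sketch of the same shift as a standalone function (the function name is illustrative only, not taken from the driver):

```c
#include <assert.h>

/* Physical cylinder sent with the seek command for a logical track. */
int seek_track_for(int track, int stretch)
{
    assert(track >= 0 && (stretch == 0 || stretch == 1));
    return track << stretch;
}
```

Logical track 39 with stretch 1 becomes cylinder 78, while track 0 stays 0 either way, so only I/O beyond the first tracks shows any difference.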

My question is: is this true, do 1.2M drives need double the track pulses in order to work?

I have the following code in redo_fd_request which will work on QEMU:

    //seek_track = track << floppy->stretch;
    seek_track = track;

Obviously if real hardware requires the track number to be doubled when 360k floppies are in 1.2M drives, I can't comment out the stretch (required for QEMU to work) for real hardware to work. I'm not sure what to do since the kernel doesn't know it's operating on QEMU!!! Ugh. Previous versions of the DF driver "seemed" to work on QEMU, but only when doing light-duty work like "ls -l /", which doesn't ever use block numbers that cause the track doubling to error on QEMU.

It would seem that QEMU ignores seek_track which is used with the FD_SEEK command to step the track motor UNTIL it gets larger than the max track number, and just uses the track variable to actually set the appropriate track number using the FD_READ/FD_WRITE command in the separate setup_rw_floppy routine. Hopefully what I'm talking about makes sense.

Thank you!

@Mellvik
Contributor

Mellvik commented Sep 24, 2023

Thanks again @ghaerr -

So the optimal probing is to do both... you cannot ask the drive what it does...

No we can't ask the drive what it does, but we do ask CMOS what the drive is, and then that drive type (which for this discussion would either be likely 360k or 1.2M for 5.25" systems), is used for starting the two-probe auto probe. For our discussion, if the CMOS drive type is 360k, the auto probe will try 360k/PC then 360k/AT. If the CMOS drive type is 1.2M then the auto probe will try 1.2M then 360k/AT. (The data rate for 360k/PC is 250, and 300 for 360k/AT). All these are in the probe_types[] tables and will be further adjustable when I move to using zero-terminated indices into minor_types[] for probing, which will allow for more than just two probes per CMOS drive type.

Admittedly I haven't looked at the code yet, but there is still a caveat - which is what I alluded to when I said

to do both ...

... above. A 1.2M 5.25 drive is probably (I do not have any numbers on this) just as likely to be dual speed (i.e. 250kbps for 360 format) as single speed (300kbps 360 format). So safe probing actually requires testing both. If 360/300 fails, go on to 360/250. Or maybe you're doing that already?

@Mellvik
Contributor

Mellvik commented Sep 24, 2023

My question is: is this true, do 1.2M drives need double the track pulses in order to work?

Interesting @ghaerr.

And most likely a minor thing. Yes, physical drives do require the track doubling. Also - while I'm running QEMU booted from floppy all the time (as in ALL the time in order to detect immediately if I break something and it doesn't cost anything in terms of speed), I'm running 1.44M so I cannot make many claims regarding 5.25 and 360k format. That said, I'm creating minix file systems, mounting FAT and cat'ing the entire drive to /dev/null w/o problems, so it's working.

Whether this drive is dual speed or single speed, I don't know - probably single speed, since you reported that as being set when you moved the source across.

I will take a look at the code tomorrow morning.

Is the drive being recognized as 1.2M at boot?

@ghaerr
Owner Author

ghaerr commented Sep 24, 2023

And most likely a minor thing.

Unfortunately I think not - It seems (at least my) QEMU is broken, unless you can show 360k floppy working on 1.2M drive in QEMU. Kluge fix coming, see below.

I'm running 1.44M so I cannot make many claims regarding 5.25
I'm creating minix file systems, mounting FAT and cat'ing the entire drive to /dev/null w/o problems, so it's working.

That's good to know. Except 1.44M never has stretch bit set. Stretch bit is set only on 360k floppy in 720k drive, and 360k floppy in 1.2M drive - and those fail miserably on QEMU. This wasn't caught earlier since my very early 360k testing was only mounting a floppy and doing a few copies, which never caused the seek track to exceed half of the drive. When the track is being doubled, QEMU fails its emulation during FD_SEEK track when track > 38 for 360k/AT floppies. You'll see what I mean when you test 360k floppies on 1.2M drives in QEMU.

Is the drive being recognized as 1.2M at boot?

Yes. QEMU shows 1.2M CMOS on both 360k floppy and 1.2M floppies. I now understand the complexity of it all - it probably has to, since the program being emulated may think it's a 1.2M drive with a 360k floppy in it. Unfortunately, force-setting the base_type[0] to 360k/PC won't fix it, since QEMU requires the 360k data rate to be 300k, which is the 360k/AT floppy in our tables, which also requires stretch stepping since that's the 360k floppy in 1.2M drive scenario... get it? No one ever said getting this direct floppy driver working was easy!!!! :)

This QEMU problem could possibly be fixed by creating another 360k floppy type - one that uses the 300k data rate but no stretch setting. But this would also require force-setting the CMOS to 360k in the driver source somehow, since QEMU always sets CMOS 1.2M drive type on 360k images. (BTW, this is why 360k worked prior to this PR since I had erroneously copied the 360k/PC floppy setting and just changed the data rate, but stretch remained 0 - which would not have worked on real hardware).
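A sketch of what such an extra entry might look like next to the existing one. The field order follows the table row quoted earlier in this thread; the field names are an assumption borrowed from the classic Linux floppy driver, and the "360k/QEMU" name is purely hypothetical.

```c
#include <assert.h>

/* Field names assumed; field order matches the quoted table row. */
struct floppy_fmt {
    unsigned int size;        /* total sectors */
    unsigned int sect;        /* sectors per track */
    unsigned int head;        /* heads */
    unsigned int track;       /* cylinders */
    unsigned int stretch;     /* 1 = double the track number on seek */
    unsigned char gap;
    unsigned char rate;       /* 0x01 = 300k, as noted in the thread */
    unsigned char spec1;
    unsigned char fmt_gap;
    const char *name;
};

/* Existing entry: 360kB in 1.2MB drive (real hardware). */
const struct floppy_fmt fmt_360k_at =
    { 720, 9, 2, 40, 1, 0x23, 0x01, 0xDF, 0x50, "360k/AT" };

/* Hypothetical QEMU-only variant: same 300k rate, no track doubling. */
const struct floppy_fmt fmt_360k_qemu =
    { 720, 9, 2, 40, 0, 0x23, 0x01, 0xDF, 0x50, "360k/QEMU" };

/* Sanity check: size must equal sectors * heads * cylinders. */
int geometry_consistent(const struct floppy_fmt *f)
{
    assert(f != 0);
    return f->size == f->sect * f->head * f->track;
}
```

The two entries differ only in the stretch field, which is why a probe that reads track 0 cannot tell them apart.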

Whether this drive is dual speed or single speed

I'm not following you here on why that is important - we can't change the drive speed using any mechanism in the DF driver. Is the comment basically relating to the required data rate speed being different between the two? If so, yes, for QEMU the 360k/AT floppy entry is required - which has both 300k data rate as well as track doubling, the latter required for real hardware.

I'm thinking a temp fix for now will have to be informing the kernel about QEMU using qemu=1 in bootopts, (along with root=df0) and ignoring track doubling in the driver when set. This will allow continued QEMU development on 360k drives which I want to keep working for possible @toncho11 testing until more data is forthcoming on testing with 360k floppies in real hardware 1.2M drives.

@Mellvik
Contributor

Mellvik commented Sep 24, 2023

@ghaerr ,
What I’ve been trying to say for a few rounds is that 2 probes won’t cut it. You need 3 because you never know whether the 1.2M drive is dual speed or not.

@ghaerr
Owner Author

ghaerr commented Sep 24, 2023

What I’ve been trying to say for a few rounds is that 2 probes won’t cut it.

That's a great idea but how would it work? Real hardware requires the stretch bit and higher data rate. QEMU requires no stretch bit and higher data rate. Using the original tables, a 360k in 1.2M drive requires both. If a specially-created QEMU table entry were created without stretch, it would probe correctly and match on real hardware, then fail when the track doubling wasn't set.

The problem now is that "probing" uses the actually requested I/O sector, rather than a "test" probe. So probing may have to be rewritten to 1) probe a test sector prior to real I/O, and 2) use a high track value to wash out the QEMU bug. This in turn means real hardware has to move to track 79 and back at startup... possibly twice.

[EDIT: What is happening is that QEMU is accepting the stretch bit entry after probing, because it works, but only until a large block number. The initial block number after mount is typically the superblock, and that probes correctly even with track doubling, so QEMU takes it. Then later when the filesystem is used, the I/O starts failing...]
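A small sketch of why a probe on track 0 is blind to the stretch setting and why a dedicated test probe would want a high track. Both helper names are invented for illustration.

```c
#include <assert.h>

/* Nonzero if seeking to `track` differs between stretch=0 and stretch=1. */
int probe_detects_stretch(int track)
{
    assert(track >= 0);
    return (track << 1) != track;
}

/* Pick the last logical cylinder, where doubling moves the head furthest. */
int test_probe_track(int cylinders)
{
    assert(cylinders > 0);
    return cylinders - 1;
}
```

For a 40 cylinder format the test probe would use logical track 39, whereas the superblock on track 0 gives the same seek target with or without doubling.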

@Mellvik
Contributor

Mellvik commented Sep 24, 2023

What I’ve been trying to say for a few rounds is that 2 probes won’t cut it.

That's a great idea but how would it work? Real hardware requires the stretch bit and higher data rate.

I'm missing something. Real hw @ 360k requires stretch plus right data rate. We don't know the data rate until we try so we try both.

If QEMU fails where real hw works, we have - like you say - a QEMU problem/bug. Or QEMU is simply more pedantic about programming than real hw, a situation I just had with the directhd driver. I got occasional read errors from the 32m minix image built by the makefile. No errors on a 30+ year old physical drive. It made no sense at all. Turned out an IDE cmd parameter was - let's say, in a grey area.

I'm coming back to my systems in the morning, will see if I can duplicate the symptoms.

@ghaerr
Owner Author

ghaerr commented Sep 24, 2023

Real hw @ 360k requires stretch plus right data rate. We don't know the data rate until we try so we try both.

Yes, that's exactly what the driver does: we first try the 250k data rate, then the 300k rate. The 300k rate also requires stretch, since it's a 1.2M drive. QEMU won't work when the track is doubled. The problem is that if we add a third entry in between that doesn't have track doubling (a kluge only for QEMU), the real hardware probe will likely pass, since the first block on a mount is usually the superblock on track 0, which is the same when doubled!

will see if I can duplicate the symptoms.

Thanks. I think instead of kluging a /bootopts entry and kernel QEMU variable, I'll just add a compile-time fix for 360k floppies on QEMU for the time being, instead of rewriting the driver probe, which entails lots of changes.

@Mellvik
Contributor

Mellvik commented Sep 25, 2023

@ghaerr,
I got the same symptoms - and there is definitely a bug in the QEMU emulation of the 765.

After a seek, the seek-complete interrupt routine issues a SENSE_INTERRUPT command to get status. This command returns ST0 and the new cylinder number; the cylinder is the second result byte, which the driver code refers to as ST1.

What's happening is that in this configuration (360k in 1.2M), when passing cylinder 40, QEMU starts dividing the cylinder # by 2, reporting 21 instead of 42 as the next, then 22 etc. IOW - when passing 40, QEMU suddenly finds out that this is a 40 track medium and starts counting real tracks instead of stretched tracks. Really weird. I've run continuous comparisons between the QEMU traces and physical - the latter runs perfectly all the time.

The fix below is about as bad as the fix you already have, but serves to point at the actual problem. It should be combined with using the QEMU=1 setting which is already available in bootopts - as you suggested.

I have not put much effort into investigating this further - there may be something else triggering this. Looking at the traces - it has been a while - reminds me that a dive into possible optimizations is still pending.

static void seek_interrupt(void)
{
    /* sense drive status */
    DEBUG("seekI-");
    output_byte(FD_SENSEI);
    /* QEMU workaround: the ST1 != seek_track test is removed
     * (alternatively, add a test for half the track #) */
    if (result() != 2 || (ST0 & 0xF8) != 0x20 /*|| ST1 != seek_track*/) {
        printk("%s%d: seek failed\n", DEVICE_NAME, current_drive);
        recalibrate = 1;
        bad_flp_intr();
        redo_fd_request();
        return;
    }
    /* QEMU workaround: trust seek_track rather than the reported ST1 */
    current_track = seek_track /*ST1*/;
    setup_rw_floppy();
}
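The "test for half the track #" alternative mentioned in the fix could be sketched as a standalone predicate like this. The function name and parameters are hypothetical; in the driver the values would come from ST1, seek_track and floppy->stretch.

```c
#include <assert.h>

/*
 * Accept the cylinder reported after a seek if it matches the requested
 * seek_track or, for stretched formats only, half of it (the value QEMU
 * was observed to report once past cylinder 40).
 */
int seek_result_ok(int reported, int seek_track, int stretch)
{
    assert(seek_track >= 0);
    if (reported == seek_track)
        return 1;
    return stretch && reported == (seek_track >> 1);
}
```

This keeps the check strict for non-stretched formats, at the cost of letting a genuinely wrong half-track position pass on real hardware with stretched formats.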

@Mellvik
Contributor

Mellvik commented Sep 25, 2023

BTW - there is a simple way of detecting QEMU without expanding the env setting in bootopts- I've been using this on and off:

The IDE drives have a serial number. QEMU has a command line option to set the IDE serial, which is then picked up in ide_query. Here's the qemu setting: -global ide-hd.serial=EQUMI_ED \

It's byte swapped, thus unreadable and comes out in the IDE ID block like this:

eth: ne0 at 0x300, irq 12, (ne2k) MAC 52:54:00:12:34:56, flags 0x80
rhd: Raw access to block devices configured
0040 003F 0000 0010 7E00 0200 003F 0000 0000 0000 4551 554D 495F 4544 2020 2020 <--- Serial # starts at word 10
2020 2020 2020 2020 0003 0200 0004 322E 352B 2020 2020 5145 4D55 2048 4152 4444 
4953 4B20 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 2020 8010 
0001 0B00 0000 0200 0200 0007 003F 0010 003F F810 0000 0110 F810 0000 0007 0007 
IDE CHS: 63/16/63 serial# QEMU_IDE            
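A rough sketch of how the serial could be pulled out of the ID block and checked. It reads the low byte of each word first, which is what the "serial# QEMU_IDE" line above suggests the driver does; the function names are made up, and this only helps when an IDE drive is attached with that serial configured.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SERIAL_FIRST_WORD 10   /* serial number occupies words 10..19 */
#define SERIAL_WORDS      10

/*
 * Copy the serial number out of a 256-word identify block, low byte of
 * each word first, and trim trailing spaces. out must hold 21 bytes.
 */
void ide_serial(const unsigned short *id, char *out)
{
    size_t n = 0;
    assert(id != NULL && out != NULL);
    for (int i = 0; i < SERIAL_WORDS; i++) {
        unsigned short w = id[SERIAL_FIRST_WORD + i];
        out[n++] = (char)(w & 0xFF);
        out[n++] = (char)((w >> 8) & 0xFF);
    }
    while (n > 0 && out[n - 1] == ' ')
        n--;
    out[n] = '\0';
}

/* Hypothetical detection helper: nonzero if the serial starts with "QEMU". */
int looks_like_qemu(const unsigned short *id)
{
    char serial[2 * SERIAL_WORDS + 1];
    ide_serial(id, serial);
    return strncmp(serial, "QEMU", 4) == 0;
}
```

Fed with the words from the dump (4551 554D 495F 4544 followed by spaces), the decoded string is "QEMU_IDE".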

@ghaerr
Owner Author

ghaerr commented Sep 25, 2023

I got the same symptoms - and there is definitely a bug in the QEMU emulation of the 765.

Thanks for the testing and the QEMU bug confirmation @Mellvik. I have also confirmed the ST1 register mismatch.

It should be combined with using the QEMU=1 setting which is already available in bootopts

I'm thinking it's perhaps better not to open the door to a kernel specially coded for QEMU... or at least to delay it? At least we have a driver known to work on real hardware and a compile-time #define workaround for QEMU testing.

I have not put much effort into investigating this further - there may be something else triggering this.

That's fine, thank you. I hope soon to start testing on the Compaq Portable.
