[df] Add 300k data xfer rate to 360k floppy FDC data #1724
Conversation
@toncho11, depending on whether the above 360k image works, here's another one that has even more debug/error messages. You're welcome to try/test this one first, thank you! |
Thank you again for taking your time to test this.
Well, this is disappointing... it seems changing the floppy data rate from 250K to 300K (which was required to get QEMU to work) didn't have any effect on your system. Unfortunately, the first image doesn't have any enhanced diagnostics to see what is going on with the probing process.
Arrghh... I sent you a version which has serial console enabled... and the output is going to the serial port! Can you try this attached version once more, please? This will display the probing process during boot and help us understand what is happening. I am returning from my travels early next week and then I will continue testing on real hardware. Thank you! |
BTW is that a CF/SD hard drive you have attached, or a real hard drive? Does it boot ELKS? |
Yes, I have XT-IDE and then I have attached IDE to Micro SD card adapter to it. It works perfectly. Yes, it boots ELKS. |
Thanks @toncho11. I'm sorry the display and outcome seem not much different from the first three times. Your test is useful though because it shows that an I/O error is occurring in the first auto-probe, without advancing to the next auto-probe. What this means is that your FDC controller is not giving an error condition such that the next drive auto-probe type is tried, even though apparently it is eventually failing the sector read and giving an I/O error at the end.
I am currently reading about FDC controllers in this long but interesting article: https://wiki.osdev.org/Floppy_Disk_Controller. In it, they state that "almost all" modern systems use a newer FDC controller chip known as an 82077AA which also emulates most prior chipsets, and that chips prior to that can be wildly incompatible: "The floppy subsystem is probably the worst. As the functionality evolved, some of the bit definitions were actually reversed in meaning, not merely made obsolete."
BTW, debug output on QEMU shows it's emulating an 82077, which is probably why the DF driver is working on it. From the new debug display, it shows your Amstrad has an 8272A chip, one of the earlier ones. The article has this to say about it: "it might be a good idea to abort and not support the floppy subsystem. Almost all of the code based on this article will work, even on the oldest chipsets -- but there are a few commands that will not."
So I will enhance this PR with further debug information, then when I return home will test on some real hardware systems to see what FDC chips they may use. In the meantime, it would be nice to find further documentation on the differences between the 8272A and 82077, as it seems the DF driver will require some additional changes in order to run on an 8272A, or at least your Amstrad. As I mentioned initially, @Mellvik has got this driver working on a number of his own systems, but it's unclear which chipsets those or others contain, especially as the driver startup message was not always displayed (which is now fixed). (BTW @Mellvik, the |
Very good to know, thank you! I would think we should be able to get the original IBM PC floppy controller working with DF. It sounds like the issue may not be with the Amstrad, just that the DF driver doesn't yet work on 8272A chips. I'll be diving deeper into the reason why the driver fails the sector read but doesn't fail the auto-probe; could be that's one of the issues between the different chip types. More info soon. |
Thank you @ghaerr! |
I found some great resources on the history of and programming of the FDC controller, as well as the 82077AA data sheet and its PC/AT compatibility (section 7.0). The original 8272A found on pre-PC/AT systems doesn't have all the internal registers that the later 82072A chip in the PC/AT has. Most modern systems including the PS/2 now use the 82077AA chip, which can emulate both prior (and other) chips based on its hardware setup.
I've temporarily commented out all non-8272A compatible code for further testing on an early system like yours. In addition, there were many places in the DF driver where the FDC I/O might get an error or otherwise abort, but no message was sent to the display, so I've added tons more messages for the time being.
It would be nice to try one last time (LOL :) booting from one of your older systems now that I've spent quite a bit of time diving deeply into this subject matter. Here's another 360k image that matches what's pushed to this PR; if you have any time to try it again, that would be great. I'm back from traveling early next week. This version should at least give a decent error message as to what is happening, which will give input on further differences between the two chip types.
Here's QEMU's boot screen with the same image. Note it's emulating an 82077A: Thank you! |
Sure. I will do it when I have some time. |
After reading way too much about FDC controller chips and the adapter boards they were embedded in, it was realized that the FDC chip alone is not enough information to properly handle reading and writing to the various "FDC" registers, as some are implemented in the chip itself, and others are implemented on the adapter card. The latest commit fixes things so that hopefully all chips and systems from the IBM PC to modern 82077-equipped systems might work. (See here for a great historical writeup about all this). Here's a summary for those that might be interested:
The 2.88M floppy entries were also updated with proper perpendicular mode entries from the Linux 2.0 floppy driver. Perpendicular mode and FIFO configure command testing was performed on QEMU (not that that means a whole lot, but another bug was fixed where the
Attached is the latest 360k image for testing: |
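For context on the perpendicular mode command mentioned above, here is a minimal sketch (not the DF driver's actual code) of issuing PERPENDICULAR MODE (0x12) on an 82077-class controller; output_byte() stands in for whatever helper hands one byte to the FDC data register, and 0x03 (GAP | WGATE) is the value 1 Mbps 2.88M media needs:

```c
/* Sketch only: 82077-class controllers accept PERPENDICULAR MODE (0x12);
 * an 8272A does not implement it.  output_byte() is assumed to hand one
 * byte to the FDC data register after polling the MSR for readiness. */
#define FD_PERPENDICULAR 0x12

static void fdc_perpendicular_mode(int is_1mbps_media)
{
    output_byte(FD_PERPENDICULAR);
    output_byte(is_1mbps_media ? 0x03 : 0x00);  /* GAP | WGATE for 2.88M, else off */
}
```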
It is on the Amstrad, not the IBM 5160. |
Thank you @ghaerr - this is useful. The drive type/density selection code in the driver is a historical mess from the 90s in dire need of cleanup. Admittedly I haven't looked much at it yet, and your updates will surely come in handy when I pull out the 360k drives. A couple of additions to your FDC summary:
That's right, these registers were outside the chip, part of the adapter. The early PCs had the DOR; the CCR came with the AT in order to handle multi-density floppies; the DIR was an attempt to recognize floppy changes to protect stupid users (that's all of us from time to time, right?). All systems until ca. 1990 - including the PS/2 - used the same 765 (or Intel clone) controller chip.
The PS/2 used the same NEC765 as the other models. The PS/2 was introduced in 1986, while the 82072 became available in 1990, and the 82077 in 1995. The PS/2 introduced incompatibilities in the external registers which never made it mainstream. IMO - for 16-bit/real mode systems (like ELKS and TLVC), there is not much point in adding code for the 765 successors like the 82072 and later, which came into use with the 486 and Pentium processors. Adding 2880k support for QEMU is useful, but other than that simplicity is key. BTW - any physical 2880k 3.5in drive will happily read and write 1440k floppies. -M |
Great. I'm fairly confident the IBM 5160 will also run. (Also, I'm assuming you ran the "latest" 360k image I posted, right? Both should work, no need to test again.)
Thank you for your testing, much appreciated. @Mellvik did all the initial heavy lifting on this, thank you @Mellvik! I was a bit worried as to whether things would start working, or require a hardware deep dive. It's nice to see your startup screen!
The driver had been gathering dust for almost 30 years but is the same one Linus originally wrote for very early Linux. It's unclear if it ever ran on ELKS back then, and @Mellvik grabbed it and performed the painstaking work of getting it running on a real mode kernel and also included XMS support. I'm fine tuning it, removing lots of extra crud, and have added error messages for every possible situation so it will be much easier to debug on the "next" computer. The driver itself is quite complicated, especially because it is fully interrupt-driven, and never busy waits for motor on/off or command completion.
Which brings me to the next point - the version you tested has async I/O support turned on - this means that the kernel will timeshare any other runnable tasks while waiting for floppy I/O to complete. While this may not make any noticeable difference if you're just running a shell and commands, it can make a huge difference with background processes and daemons. @Mellvik typically runs a full network stack and many other processes, for which async I/O is a big enhancement. Thank you! |
Your original driver used 250k, but QEMU requires 300k. I didn't want to touch the 250k rate since I thought you had said you'd tested with 360k drives before (and the driver used 250k rate). @toncho11's system is booting on 250k rate, but his system likely doesn't have CCR, and my driver won't write CCR unless CPU >= 286. So I left both in.
Ok, I'll experiment around with that. The current probe works well enough to boot QEMU, but will error out of control if more than two probes are needed. (It also starts at 360k, which probably isn't a good idea).
I don't really know, but the for loops in the source code count from 0 to 10000. So I figure there's the possibility of lots of waiting!! I suppose we could display the number of loops required to actually write the FDC if we really cared. Meanwhile, writing the CONFIGURE byte takes very little time/code if we know it's an 82077. Neither driver handles or tests for an 82072, and I agree it's not needed. So in a way, slow systems will stay slow with the driver, and very fast systems with an 82077 will not waste time busy looping when their CPU is powerful enough to do interesting things otherwise.
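To make the 0..10000 loop concrete, here's a minimal sketch of the classic bounded busy-wait used to hand a command byte to the FDC; the port and bit names are the conventional ones for the primary controller, not necessarily the DF driver's identifiers:

```c
#define FD_STATUS    0x3f4   /* Main Status Register (MSR), primary FDC */
#define FD_DATA      0x3f5   /* data/FIFO register */
#define STATUS_READY 0x80    /* RQM: FDC ready to transfer a byte */
#define STATUS_DIR   0x40    /* DIO: 1 = FDC-to-CPU direction */

/* Hand one command/parameter byte to the FDC, bounded by the kind of
 * 0..10000 busy-wait discussed above; returns -1 if the FDC never became
 * ready, in which case the caller typically resets the controller. */
static int output_byte(unsigned char byte)
{
    int counter;

    for (counter = 0; counter < 10000; counter++) {
        unsigned char status = inb(FD_STATUS) & (STATUS_READY | STATUS_DIR);
        if (status == STATUS_READY) {       /* ready, direction CPU -> FDC */
            outb(byte, FD_DATA);
            return 0;
        }
    }
    return -1;
}
```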
Oh I get what you're saying now: you mean DOR (digital OUTPUT register). This was on the adaptor board and controls drive motors. DIR (Digital Input Register) was added for PC/ATs on the adaptor and is internal to 82077s; it is read to get the DCL (Drive Change Line) from the floppy to indicate media changed. My point was that you can't read the DIR and go through a reset if set on the early systems, because they don't have it, and that is likely a reason why @toncho11's system (an early PC) wouldn't boot with the direct driver.
It has certainly improved one heck of a lot since you started working on it! :) Now that I've learned way too much about FDCs and more PC adaptor board history, I'm interested in seeing whether we can get it working mainstream instead of BIOS. Time will tell but we need a wider variety of test machines. |
Another little tidbit on the DIR: The reason reading DIR and going through a reset (including the ridiculously slow method required with the 8272A, which may require two special seeks because it's limited to 77 tracks) is tricky: The DCL (drive change line) coming from the floppy is connected to some logic that also has the drive seek line connected to it. So the DIR line on PC/AT+ equipment won't change until the FDC has had a seek operation, which then turns off DIR on the adaptor or controller. However, on older systems like the original PC, there is no DIR port, so you're just reading a nonexistent I/O port. If bit 7 happens to read high, a reset and seek sequence is initiated, which does nothing; DIR will be read again and it'll still be high, and in the old driver |
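Purely for illustration, here is a sketch of the failure mode just described; seek_track() is a hypothetical helper (not a DF driver function), and the point is only that the DSKCHG latch clears on a step pulse on AT-class adapters, while on an original PC the port doesn't exist at all:

```c
#define FD_DIR      0x3f7   /* Digital Input Register: PC/AT and later only */
#define DIR_CHANGED 0x80    /* DSKCHG latch */

/* Illustrative only.  On AT+ hardware the latch clears after the head is
 * stepped (a seek away and back).  On a PC/XT there is no DIR port: the
 * read returns floating bus data, and if bit 7 happens to be high, no
 * amount of seeking will ever clear it, so this recovery path can loop. */
static int media_changed(int drive)
{
    if (inb(FD_DIR) & DIR_CHANGED) {
        seek_track(drive, 1);   /* hypothetical helper: step the head */
        seek_track(drive, 0);
        return (inb(FD_DIR) & DIR_CHANGED) != 0;
    }
    return 0;
}
```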
Sorry for the confusion, @ghaerr, |
it's been too long and too many changes since I did that testing, so I honestly don't remember. I'll have to redo that.
I don't think there is much looping at all and that we're looking at leftovers from development/debugging. I'll look at that when the current rewrite of the IDE driver (pre interrupt) is done.
Right. Actually, (IIRC) the drive change line has been notoriously unreliable, and I have been wanting (yes, another one) to put it to test. It's on the list ... |
Interesting, no wonder it failed!
A safe assumption. Leaving out the 82072, which on a closer look seems to have passed away without anyone noticing (at least in the PC world), the NEC765A aka i8272 was the only FDC chip used in PCs.
Sounds like the smart fix, thanks. |
For the record - I just tested a 360k floppy again, code as ported over from TLVC, and it mounts fine. 250kHz is definitely the correct clock rate - AMOF the 'big' overview of floppy technologies does not mention any drive type or format using 300kHz (see below), so the QEMU requirement is a mystery. When I mount a 360k drive on QEMU (as reported before), it mounts just fine while the CMOS drive type reports 1.2M.
|
One more thing, @ghaerr:
Apropos the 82077, I just noticed one thing that may make a difference in real performance. The 82077 - unlike the NEC765 - does implied seeks. Which means that the driver can bypass all the logic catching the track boundaries which splits transfers - in turn losing revolutions while reading or writing. I'm going to test that on the 386SX-SBC. |
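For reference, a minimal sketch of turning on implied seeks on an 82077 via the CONFIGURE command (0x13); EIS is bit 6 of the configure byte, bits 0-3 are the FIFO threshold, and the command doesn't exist on an 8272A. output_byte() is the same assumed helper as in the earlier sketch, not the DF driver's actual code:

```c
#define FD_CONFIGURE 0x13

/* Enable implied seek (EIS) so read/write commands seek to the target
 * cylinder themselves; sketch only, 82077-class controllers. */
static void fdc_enable_implied_seek(void)
{
    output_byte(FD_CONFIGURE);
    output_byte(0);             /* first parameter byte is always zero */
    output_byte(0x40 | 0x0A);   /* EIS=1, EFIFO=0 (FIFO enabled), threshold=10 */
    output_byte(0);             /* PRETRK: write precompensation start track */
}
```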
Thanks for the (hopefully) definitive floppy table! I've been wanting to find such a thing. I'll compare this with the DF driver entries. I also have a plan on how to effectively merge the two driver floppy tables, using the technique found in Linux 2.0: the second probe table just uses indices into the first table. This should help keep track of the multitude of formats we've got or may need.
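A minimal sketch of that Linux 2.0-style arrangement, with illustrative (not actual DF driver) field names and entries: one master format table, and per-CMOS-drive-type probe lists that are just indices into it:

```c
/* Master format table; 'rate' uses the standard CCR/DSR encoding
 * (0 = 500 kbps, 1 = 300 kbps, 2 = 250 kbps, 3 = 1 Mbps). */
static struct floppy_format {
    unsigned int size;              /* total sectors */
    unsigned char sect, head, track;
    unsigned char rate;
} formats[] = {
    /* 0 */ {  720,  9, 2, 40, 2 }, /* 360k in a 360k drive, 250 kbps */
    /* 1 */ { 2400, 15, 2, 80, 0 }, /* 1.2M, 500 kbps */
    /* 2 */ { 1440,  9, 2, 80, 2 }, /* 720k, 250 kbps */
    /* 3 */ { 2880, 18, 2, 80, 0 }, /* 1.44M, 500 kbps */
    /* 4 */ {  720,  9, 2, 40, 1 }, /* 360k at 300 kbps (1.2M/360 RPM drive) */
};

/* Per-CMOS-type probe order: indices into formats[] rather than duplicated
 * entries, which is the space saving the Linux 2.0 driver gets. */
static unsigned char probe_1200k[] = { 1, 4 };  /* 1.2M first, then 360k-in-1.2M */
static unsigned char probe_1440k[] = { 3, 2 };  /* 1.44M first, then 720k */
```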
Thank you, very good to know. Was this on a machine that did or did not have CCR for setting a data rate?
At this point, all I know for sure is that I had to get auto-probing working (or force 300k) in order to get QEMU working. And looking at the SeaBIOS source, yes, the 300k is in their tables, so it's likely comparing DSR/CCR values when emulating its FDC. I moved the 300k version to the second (probed) position in the 360k table so the 250k data rate should be correctly used for real hardware.
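In terms of the illustrative formats[] sketch earlier in this thread (not the DF driver's actual table), that ordering would simply be:

```c
/* Sketch: for a CMOS 360k drive, probe the native 250 kbps entry first
 * (real hardware), and the 300 kbps entry second (what QEMU/SeaBIOS
 * expects when presenting 360k media through a 1.2M-class drive). */
static unsigned char probe_360k[] = { 0, 4 };   /* indices into formats[] above */
```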
I'm going to need to check again whether that works with my driver - our drivers have diverged somewhat and I remember having to get auto-probing to work in order to read 360k w/CMOS 1.2M, which is how I got dragged into this part of the rabbit hole in the first place :)
Yes, I'm aware of that. IMO that sounds like a nice enhancement to the driver for modern machines, and you'd be the perfect guy to write it! :) Another enhancement to the 82077 (and possibly 82072?) is the much improved reset capability (possibly called soft reset?) that allows software to reset the controller without having to go through a full recalibrate and other stuff the 8272A requires. |
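A sketch of the "soft reset" being referred to, assuming the standard primary-FDC port: the 82072/82077 Data rate Select Register has a self-clearing reset bit, so no adapter-level reset strobing is needed (the 8272A has no DSR and must be reset through the adapter's DOR instead):

```c
#define FD_DSR 0x3f4   /* Data rate Select Register (write), 82072/82077 only */

/* Sketch only: bit 7 of the DSR resets the controller and clears itself;
 * bits 1-0 carry the data-rate select (same encoding as the CCR). */
static void fdc_soft_reset(unsigned char rate)
{
    outb(0x80 | (rate & 0x03), FD_DSR);
}
```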
This was on the 386SBC, which is using the 82077 - while the last time I tested, it was on the Compaq Portable 386. My XT class machine is still pending.
Yes that's right - there is the soft reset capability, and I'm now (tempted by the fact that I do have such a machine) leaning towards including some of the '077 features in the driver - if only out of curiosity. It seems most of the (few lines of) code would be detection (which is INITPROC anyway), the rest just simple if's to skip code, not to add code. BTW I'm unsure about the connection between the DIR and the seek you mentioned, I cannot find it in the HW schematics from either IBM or COMPAQ. Where did you find that? -M |
Glad you're leaning that way - since the driver already has CONFIGURE/FIFO and PERPENDICULAR support (~20 lines), your new features can be added with no change to the detection code. I would suggest you look at the new
I think the detection code is complete and simple (in my driver). That's potentially a really good idea to then use
Not that we would, but just imagine a scenario where all the ugly
BTW, I did exactly that in my first round of removing any DIR register reading when trying to debug @toncho11's boot problem. I ended up adding back in the code, but only when |
I've read so much lately I can't find exactly that portion, but the OS/2 Museum FDC article states: "The change line signal is active if a floppy is removed from a drive, and deactivated when a disk is inserted and the drive head stepper motor receives a pulse." What that implies is the DIR line isn't directly connected to the drive DCL (disk change line), but is also connected to a flip-flop connected to the stepper line.
Here is IMO some required reading, the best of the many articles I've been reading on FDC (yes, it is entirely true there is no one place for all this info):
- http://www.brokenthorn.com/Resources/OSDev20.html. A fantastic article with tons of history on FDC hardware and operation. Highly recommended!
- https://www.isdaman.com/alsos/hardware/fdc/floppy.htm. Lots of super information on each of the FDC registers. Also highly recommended.
- https://wiki.osdev.org/Floppy_Disk_Controller#DIR_register.2C_Disk_Change_bit. More good information on FDC programming. Read the "The IBM PC/AT" section on the DIR register and disk change bit. Also has good info on recalibrate/seek issues (which I still don't entirely understand, but it looks like you've gone deep into).
- http://www.osdever.net/documents/82077AA_FloppyControllerDatasheet.pdf. The definitive source on the 82077, and especially useful for the "PC/AT" emulation modes. The "Register Set Compatibility" in Section 7.1 is extremely useful.
- https://forum.vcfed.org/index.php?threads/fdc-pin-34-ready-vs-disk-change.41371/. A somewhat interesting conversation that also mentions a seek is required after a disk is inserted or removed.
That should be enough reading to keep you up for a while! :) I kept all of these open in browser tabs during the last big round of getting the DF driver working on ELKS. |
Yet one more tidbit on DIR operation and requirements: as I was writing the above post and re-reading information on the "DIR" floppy changed bit - I realized we don't need it. @toncho11's system works fine without it. I think I misunderstood something important about that bit: I had thought that once that bit was set (indicating drive changed), the 8272A controller had to be issued a long set of seeks/resets, etc. in order to function. What I think now is that an 8272A controller doesn't care or know anything about drive change lines - the DIR DCL (drive change line) bit can be ignored; all that extra code is only required to RESET the bit.
For kernels that support media change, this is a requirement. Our kernel doesn't support drive media changing at all; the original driver was from Linux, which does support such a thing. Our kernel doesn't have any code to do anything even when the bit does change, so what are we doing reading the bit and trying to reset it? IMO, the answer is wasting a lot of code space and time. And that particular code is very ugly and IMO full of possible problems, especially on multiple FDCs.
The real question is whether a newer controller, e.g. an 82077, requires the bit to be reset, and I think the answer is no. This could be easily tested by taking the last 360k image uploaded for @toncho11, and running it on a PC/AT or higher system, and a system with an 82077. I would bet that the image would boot and run fine.
Having said all this, I propose we now seriously consider the idea of ifdefing out the code entirely, for all systems. That would free up much more code space, which IMO would be much better used for some of @Mellvik's ideas about implementing implied seek or fast reset operations on modern systems. This also fits with the proposed idea of keeping the driver as simple as possible. |
If you put ifdefs then it will require recompilation for each controller? What will be the default? |
No - the ifdefs would just remove all code for checking "floppy changed" since ELKS doesn't do anything when the media is changed anyway. It would work on all systems and just be smaller. If any controller does actually require resetting of DIR DCL when changed, then all the code would have to remain in. The point is that I think the code isn't needed, and the user is responsible for leaving media in the drive. The media must be unmounted or at least synced before changing media or powering off.
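A minimal sketch of what such an ifdef could look like; CONFIG_FLOPPY_MEDIA_CHANGE, invalidate_buffers() and reset_and_reseek() are hypothetical names used only to illustrate the shape of the change, not actual DF driver or kernel identifiers:

```c
#define FD_DIR      0x3f7
#define DIR_CHANGED 0x80

static void check_media_change(int drive)
{
#ifdef CONFIG_FLOPPY_MEDIA_CHANGE        /* hypothetical config option */
    if (inb(FD_DIR) & DIR_CHANGED) {
        invalidate_buffers(drive);       /* hypothetical: drop cached blocks */
        reset_and_reseek(drive);         /* hypothetical: clear the DSKCHG latch */
    }
#endif
    /* With the option unset this compiles to nothing: the kernel never acts
     * on media change, so DIR is never read and no reset sequence is needed. */
}
```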
In the 82077 data sheet Section 2.18a page 11, it states that the DSK CHG bit (bit 7) in DIR just reflects the state of the DSKCHG input pin on the controller. There isn't any other discussion of this bit affecting the operation of the FDC. For the 8272A, there is no DSKCHG pin on the FDC; DSKCHG and DIR are part of the adaptor board (present on PC/AT or later only, and then replaced when the 82077 chip was used). So it seems that if the DF driver only intends to support 8272A FDC behavior, with possible unrelated extensions for 82077s only, we could be good to go with removing all code associated with |
Thank you @ghaerr. Yes, I am familiar with most of these references - lots of incredibly good knowledge. I got curious about the DIR issue because - to my knowledge - disk change detection used to not work on PCs. The vcfed.org reference was a good refresher on that. The drives simply don't tell the adapter about disk removal, so there is no way it can work on most systems. Technically possible but ignored by most (drive) vendors. So IMHO we can safely continue to ignore the DIR. |
Currently, the code is not ifdef'd out - DIR is checked on 8272A PC/AT systems, as well as for 82077. Are you saying you're ok with ifdef'ing out the code that references DIR, and never reading it? [EDIT: my suggestion would be to use |
Yes. This is another one of the 'come back to later' issues in the driver. I had this alarm in my head about the media change detection but couldn't remember what it was. Your vcfed.org reference refreshed my memory. So my intention is to delete it, maybe leave something ifdef'ed for documentation. |
Thanks for your input. I'll make another driver change and ifdef out all the unneeded code. Some later day when the kernel supports media change it can be easily added back in. Interestingly enough, this will leave the driver interfacing with a basic 8272A FDC, with all other registers assumed on the adaptor, OR an 82077 with CONFIGURE/PERPENDICULAR capabilities and the same registers on chip. For the case of the IBM PC and PC/XT where CCR is non-existent, that isn't checked for and the data rate is output regardless to a non-existent port. With all FD_DIR stuff removed, that simplifies the code quite a bit, although there's still code in for multiple seeks, since apparently the 8272A FDC is limited to 77 tracks and doesn't know where the head actually is. This keeps the driver very simple and compatible, with enhancements added only for 82077+ systems.
Actually, I just checked the original (Linus) driver. It was there, but that was also when using the dma library, which allows for multiple drivers using DMA. What I don't actually know is whether the DMA chip has to have all the configuration done at once, or whether each output byte can work on its own, through using different registers. I think the latter, since the register address set is quite large on the DMA chip. |
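To make the question concrete, here is a sketch of the usual way 8237 channel 2 gets programmed for a floppy transfer, one port write at a time; each register is latched independently, the only shared state being the low/high byte flip-flop (cleared via port 0x0C), which is the main reason a driver might want the sequence to run uninterrupted. Port numbers are the standard PC ones; the buffer must not cross a 64k physical boundary. This is an illustration, not the DF driver's actual DMA code:

```c
#define DMA_MASK_REG  0x0A
#define DMA_MODE_REG  0x0B
#define DMA_FLIPFLOP  0x0C
#define DMA2_ADDR     0x04   /* channel 2 address register */
#define DMA2_COUNT    0x05   /* channel 2 count register */
#define DMA2_PAGE     0x81   /* channel 2 page register */

/* Sketch only: program DMA channel 2 for one floppy transfer.
 * to_memory != 0 means a disk read (DMA writes to memory, mode 0x46);
 * otherwise a disk write (DMA reads from memory, mode 0x4A). */
static void setup_floppy_dma(unsigned long physaddr, unsigned int len, int to_memory)
{
    outb(0x06, DMA_MASK_REG);                    /* mask channel 2 */
    outb(0x00, DMA_FLIPFLOP);                    /* reset the byte flip-flop */
    outb(to_memory ? 0x46 : 0x4A, DMA_MODE_REG); /* single transfer, increment */
    outb(physaddr & 0xFF, DMA2_ADDR);            /* address low byte */
    outb((physaddr >> 8) & 0xFF, DMA2_ADDR);     /* address high byte */
    outb((physaddr >> 16) & 0xFF, DMA2_PAGE);    /* 64k page */
    outb((len - 1) & 0xFF, DMA2_COUNT);          /* count is length minus one */
    outb(((len - 1) >> 8) & 0xFF, DMA2_COUNT);
    outb(0x02, DMA_MASK_REG);                    /* unmask channel 2 */
}
```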
I don't see a problem here. The driver is interfacing with an adapter, not a chip, and it's not relevant whether the registers are here or there. It may not even matter if the CCR is missing; the driver should notice that the test failed and continue.
This is not correct. I seem to remember off the top of my head that it's only the return-to-track-0 command that is limited to 77 pulses, which isn't much of a problem in practice.
I've been thinking about this one and come to the conclusion that it's better to leave it in. It's cheap, just protecting a few instructions, and it makes the issuing of the command predictable to the driver. The point is to get the transfer going as fast as possible so the system can go back to doing other stuff while the DMA proceeds. I haven't thought about that before, but now that the DMA initialization is part of the driver proper and not exposed, it may even be an idea to move the protection outside of the DMA routine. Will look at the code tomorrow, thanks! |
Agreed - I was only talking about the required reset procedure for 8272A chips when trying to reset the DIR DSKCHG bit, not normal FDC operation. IIRC there are improvements that could still be made on normal operation drive resets when an 82077 is present, to do with less drive seeking.
I have to completely disagree on this - in almost no cases does disabling interrupts "speed up" an operation (other than of course allowing only that code to run), and in every case disabling interrupts can cause unintended harmful effects to kernel timers and any device that uses interrupts to signal immediate processing requirements. In my view of kernel programming, interrupts should be disabled only when absolutely necessary, which for the ELKS kernel being non-reentrant means only using them to protect variables that are read (or written) by both the kernel and an interrupt handler. Since kernel code is otherwise never interrupted and never time sliced, disabling is not required, since the DMA registers are never accessed via an interrupt handler, nor by other kernel code. [EDIT: @Mellvik, the DMA pins (not registers) will be "accessed" during a DMA I/O operation, but that would all be occurring after the driver has programmed the DMA in the first place, and would not be affected by register operations anyway, since the driver uses
In the DMA setup case we're talking about here, I found that what seems like a few outb instructions actually generates quite a bit of code - almost as much as a full-blown syscall entry with interrupts disabled (35+ instructions). The effect of the syscall interrupt disabling on older systems was that reliable serial input rates without dropping characters dropped from 38400 to 9600, obviating the need for the "fast" serial driver which skips the
I plan on some real hardware testing for this and other FDC enhancements and will report findings. |
I'm not following you on this one @ghaerr. Assuming that we're completely ignoring the DIR because it's useless, AFAIK resetting the controller chip is as simple as flipping the reset bit in the DOR and getting the head back to tr0 - one or possibly two commands. This should not happen very often so the fact that it takes some time if the head is far in, shouldn't be all that much of a deal. I have mentioned before that the initial floppy initialization in the directfd driver is not good (slow) and needs some work related to seeking. I discovered this only when listening to the drives (the physical machines are not in my office). DOS does it faster and nice, IOW not acceptable :-) .
I think we can agree to disagree on this one @ghaerr. We do, however, agree on the principle: Block interrupts only when necessary. Also, I think we agree that the general DMA 'driver', which may be called from any other driver or the kernel itself, needs interrupt protection while a DMA channel is being established. This protection is no longer required when there is only one user, like now. But the effect of the irq blocking on the rest of the system is the same. It was acceptable before, it's acceptable now. |
@Mellvik: I want to say I enjoy hearing your perspective on things - and I usually learn something as a result. My comments on our above conversation follow.
I'm behind you on what the floppy controller or drive actually has to do on a failed command, timeout, or reset, as my work on the direct driver is so far only using QEMU, and I have yet to go through the painstaking process of debugging real hardware like you've been doing! :) I thought I had read somewhere that the 8272A had to go through extra steps when commands didn't work, and sometimes required a chip reset; I don't really know what the drive itself might have to be told to do. Skipping DIR entirely, I was under the impression there might be areas of the driver that could be improved regarding reset processing on both 8272A and 82077, and that they might be different.
Yes - I'm interested in more technical details of this kind of problem. Is this related to probing, reset activity, or just handling seek properly with the FDC? With more information, I can learn more about the various FDCs and hopefully we can come up with a fast driver that doesn't take a long time to either initialize or reset itself after errors.
After writing my previous post, talking about the need for protection, and further reading here, I realized I think I am in error - there isn't a need for any special (interrupt disabling) protection at all for the DMA chip or
Of course, this is just talk. I would like to learn the true answer scientifically (result proven through testing), rather than philosophically (result through argument). I will remove all the unneeded code and test extensively on QEMU and real hardware, as I've been planning on testing my driver on my Compaq Portable 386 for some time now.
OTOH, you are one of the few users that really puts ELKS/TLVC to work by actually using many of its features for real work. I suggest that you consider extending the interrupt disabling all the way through DMA setup to the actual FDC command being sent - and then run the system doing heavy floppy I/O simultaneously with either incoming serial SLIP, or incoming network packets, to see whether there is any noticeable I/O speed difference, as well as whether any serial or network packets are lost. This is important, as perhaps we're worrying about nothing, or not... Only real world testing will tell.
From my perspective, I want to know whether my understanding of the kernel is correct, which needs to be proven through testing. And I'm very interested in learning more about how long interrupts can be disabled before data loss occurs in serial and network packets. Thank you! |
Likewise @ghaerr. Sometimes frustrating, but coming from different 'sides' with different expectations and experiences is what makes this interesting and educational.
From my perspective the 'reset regime' is more like a driver level convenience: When we're no longer sure about the state of things, let's reset and get a clean slate. The 765 (I keep calling it that because that's the original and that's the one I was working with in the late 70s) isn't particularly difficult in that regard. Like most controller chips from that period (such as the Ethernet chips we've had fun with over the years), the 765 has an external reset pin which needs to be pulled in order to reset it. The adapter adds a register or just a port to pulse or explicitly set/reset that reset line - in this case one of the bits in the DOR. The 82077 is a different generation and has a 'soft' reset command instead (actually in addition to), which is nice but not all that different from a driver point of view.
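For completeness, a sketch of the DOR-based reset just described, using the standard primary-FDC port; this works on everything from the original adapter to an 82077 (which keeps the same DOR layout on-chip). It is an illustration rather than either driver's actual code:

```c
#define FD_DOR        0x3f2   /* Digital Output Register */
#define DOR_NRESET    0x04    /* bit 2: 0 holds the controller in reset */
#define DOR_DMA_GATE  0x08    /* bit 3: enable DMA and interrupts */

/* Sketch only: pulse the reset line via the DOR, keeping motor/drive-select
 * bits intact.  After releasing reset the FDC raises an interrupt and the
 * driver is expected to issue SENSE INTERRUPT STATUS for each drive. */
static void fdc_reset_via_dor(unsigned char current_dor)
{
    outb(current_dor & ~DOR_NRESET, FD_DOR);               /* assert reset */
    outb(current_dor | DOR_NRESET | DOR_DMA_GATE, FD_DOR); /* release it */
}
```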
I'm referring to the first initialization (
This is an important observation, I didn't know that. Is this good or even acceptable? With the synchronous BIOS-based IO I guess it was a given, but it seems to me this is preventing us from taking advantage of the asynchronous nature of block IO that is now becoming exploitable with the new drivers. Shouldn't the decision to allow or deny parallel IO operations be made at the driver level, not in the kernel? Again it may be an academic discussion using big-system logic for very small systems. OTOH there is the permanent race to push more performance out of these limited resources.
Like - I'm sitting with this floppy/IDE ISA16 interface in my hand, which I'm considering adding to the SBC (386SX) system that has a southbridge with all the standard IO already. It has floppy irq6 and hd irq14. Supporting shared interrupts with parallel IO in the floppy interface is not viable with the current driver - just too much state. But parallel IO on the HD interface would be interesting (and the southbridge HD interface can probably be changed to use irq15 or any other number). How's that for a weekend at the keyboard?
BTW - the acknowledgement that the kernel enforces block IO single threading doesn't change my earlier statement that 'protecting' the DMA setup may have a performance benefit. Maybe. Minimal cost - in particular after having spent the morning in the |
Agreed, this is where the sh.. hits the fan, and yes, I am sort of doing this all the time, although rarely in an 'empirical' way. I do believe removing irq protection throughout the directfd driver including the (local) DMA setup will work, but as I said before, I don't think it's desirable from a performance point of view. Let the show begin. |
Thanks for the explanation. It helps to know we're only talking about strobing a reset pin from the adaptor board, rather than having to necessarily go through a full seek, etc each reset.
Well - the _init routine doesn't do anything at all now, it just registers the block device and reserves the major number. The _open routine queries the chip type, then executes |
It is important to differentiate between the business of "simulated" async execution of user programs versus an "async" interrupt-driven driver. In the former, user program "simultaneous" operation is simulated by sleeping a task, say, during I/O and the scheduler letting others proceed. In an interrupt-driven driver, only a single I/O operation may be allowed at a time. The FDC can't handle simultaneous I/O requests, so the programmer necessarily has to make the decision to handle and allow only one at a time. Thus my comment that the floppy driver is single threaded.
What this means is that while the kernel may queue I/O requests from multiple executing user programs and then sleep them, they all go into a single request queue (per device of course), and then in this case the floppy driver dequeues them absolutely one at a time, and waits until the currently executing I/O is complete before considering the next one. In actual operation, when the first I/O request is added to an empty queue, the do_fd_request routine is called to kick off I/O, and any subsequent I/O operations are started when
So, given the single-task nature of the FDC controller, all direct floppy requests are designed to be guaranteed single-kernel-thread operations, thus also removing any need for protection. Note this doesn't have to be the case for a controller (perhaps hard disk?) that actually might be able to handle multiple drives simultaneously. In that case, the driver variables would have to be protected, especially in the case of separate I/O complete interrupts occurring per controller.
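Purely to illustrate the flow described above - the names follow the old Linux floppy driver style (CURRENT, end_request()) and stand in for the request-queue head, completion helper and I/O kick-off, so this is not the DF driver's exact code:

```c
/* Illustrative sketch of single-threaded request handling: the kernel
 * funnels all block I/O for the device into one queue, and the driver
 * only ever has one request in flight. */
static int fdc_busy;            /* set while an I/O is in progress */

static void do_fd_request(void)
{
    if (fdc_busy || CURRENT == NULL)
        return;                 /* busy: the completion interrupt restarts us;
                                   empty: nothing to start */
    fdc_busy = 1;
    start_floppy_io(CURRENT);   /* hypothetical: program DMA + FDC, then return;
                                   completion arrives via the FDC interrupt */
}

/* Completion side, running off the FDC interrupt. */
static void floppy_io_complete(void)
{
    end_request(1);             /* mark CURRENT done and advance the queue */
    fdc_busy = 0;
    do_fd_request();            /* immediately start the next queued request */
}
```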
Thanks for the reminder - I knew that, but forgot. Given my predilection for absolutely minimizing time while interrupts are disabled, this could be bad. I think the answer is scientific, rather than philosophical - we need to measure how many incoming serial or network packets are lost while running the FDC hard, then make decisions as to whether the FDC driver in fact can work well in a heavy async multitasking environment or needs changes. I'll add that to my list, along with your point about coming up with an estimate for NR_REQUESTS.
Can't wait to see the HD/IDE interrupt driven driver :) Agreed an interrupt driven TTY output driver is now entirely doable, given ELKS and TLVC kernels support async I/O. After thinking a bit about it, I realized that doing so opens up a debug problem when
How a < 10ms advantage is going to change anything on a floppy with +/- ~500msec timing issues is a bit beyond me, given the SLIP and NIC corruption possibilities. But I would like to hear the results of testing :) |
Thanks @ghaerr - great run-through!! Obviously I'm well aware of all this, I did the drivers, remember? My question/observation was about your claim that the kernel prevents multithreaded IO. Since you're discussing low-level/driver level IO only, I'm assuming that not to be the case. Also, while a single physical component like a floppy adapter (of that age) is single threaded by nature, it doesn't prevent the driver from running parallel IO on multiple controllers, which was the scenario I discussed. The current directfd driver obviously cannot do that because it wasn't designed to, it's easy enough to fix if desirable.
Such measurements are desirable but require quite a bit of rigging. What I plan to do is to collect (maybe just print for now) jiffies before and after entering 'do not disturb' mode. That gives a fair indication, for the actual platform, of how much time is spent protected.
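A minimal sketch of that bracketing, assuming the usual clr_irq()/set_irq() style helpers and reusing the setup_floppy_dma() sketch from earlier; with a 10ms tick this only catches gross overruns, but it is cheap:

```c
/* Sketch: note jiffies around the protected region and complain if the
 * region ever spans a timer tick. */
static void timed_dma_setup(unsigned long physaddr, unsigned int len, int to_memory)
{
    unsigned long t = jiffies;

    clr_irq();                                  /* the protection under discussion */
    setup_floppy_dma(physaddr, len, to_memory);
    set_irq();

    if (jiffies != t)                           /* at least one 10ms tick elapsed */
        printk("df: DMA setup spanned %d tick(s)\n", (int)(jiffies - t));
}
```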
That's a good point, but I'd rather face that when it becomes a problem. printk's are mixed up quite a bit already, so it's nothing new really, just potentially worse. |
Lol, yes I know. Sometimes I get a bit pedantic but try to explain from top to bottom for those less familiar.
The kernel does not prevent multithreaded I/O, but single-threads the request queue. So to be clear - the kernel collects I/O requests from processes which are funneled into a singular queue per device. The kernel will only kick off a single I/O request (i.e. single-threaded I/O start), but its entirely up to the driver as to how to process each request, and when to read the next request and start processing it. The latter will always happen at interrupt time until the request queue empties, at which time the cycle starts again.
Kind of - the hardware timer runs at 10ms, which is a lot of instructions and doesn't measure smaller intervals very well. I typically run |
Yes, and I forget why I put it back in. It would seem that the initialization done by the BIOS before booting the system should be enough. I need to do some experimentation with this. |
I'll enable the FIFO_UNTESTED now and see what happens. Might be interesting since one of the hardware systems has the '077. |
Good point! Thanks. Will get back to this issue later. |
Allows for DF autodetect to use 300K data transfer rate (as well as previous 250K data transfer rate) for floppy probe. This version boots from QEMU and uses the 300K data transfer rate as the first rate to probe from the SeaBIOS tables at https://github.com/coreboot/seabios/blob/master/src/hw/floppy.c.
This PR intentionally hard-wires floppy 0 to be 360k in an attempt at getting @toncho11's Amstrad PC to boot using the DF driver. After seeing results, the PR may be modified.
Also adds more debug display information to see what might be going on.
Continues @toncho11's discussion from #1721 (comment).
@toncho11, here's a prebuilt image for you to test again with, thank you!
fd360.img.zip