Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test Motor/Encoder Appears to Write to EEPROM 1000s of times #488

Open
madgrizzle opened this issue Mar 27, 2019 · 31 comments
Open

Test Motor/Encoder Appears to Write to EEPROM 1000s of times #488

madgrizzle opened this issue Mar 27, 2019 · 31 comments

Comments

@madgrizzle
Copy link
Contributor

madgrizzle commented Mar 27, 2019

while (i < 1000){
motorGearboxEncoder.motor.directWrite(255);
i++;
maslowDelay(1);
if (sys.stop){return;}
}

I've been hunting down an issue in some custom firmware I'm working on and believe that their may be an issue with the stock firmware where it appears that when the test motor/encoder function is run, it writes encoder steps to eeprom at least 2000 times per axis.

The loop in the axis' test function executes maslowDelay(1) for each direction 1000 times. maslowDelay() calls execSystemRealtime() continually until the 1 ms delay is up. execSystemRealtime() does a number of things, including calling systemSaveAxesPosition(), where, if all axes are "detached", settingsSaveStepstoEEPROM() is executed. During the axis test function, I don't see anywhere an axis gets "attached" so I assume they are all still "detached". I've found that my custom firmware does actually call settinsSaveStepstoEEPROM during this time (so it appears the axes remain detached), and I don't think I changed anything from the stock firmware in this area that would make it behave differently. Obviously, if its truly writing to EEPROM so many times, that's a bad thing.

@BarbourSmith
Copy link
Member

Very interesting, and excellent sleuthing. My memory about this are very vague, but I thought that if a motor is 'detached' it won't move. DirectWrite() may be overriding that however.

@madgrizzle
Copy link
Contributor Author

madgrizzle commented Mar 28, 2019

The directWrite mechanism passes a force=true to the write routine and therefore it ignores whether the motor is attached or not. Is there harm in attaching the motor before doing the encoder test and detaching it when its done? If the motor was 'attached', then the eeprom wouldn't be written to.

@madgrizzle
Copy link
Contributor Author

If this is truly writing to EEPROM, then other places that do maslowDelay with motors detached will also contribute to wearing out the EEPROM. I see it in the setSpindlePower code where there may be 6000 EEPROM writes done. It's also in the G4 dwell command... worrisome if true.

@BarbourSmith
Copy link
Member

Yes, we absolutely don't want to be writing to the eeprom that much. It should only write once two seconds after the last command.

@madgrizzle
Copy link
Contributor Author

madgrizzle commented Mar 28, 2019

There's something I must be missing because no one's EEPROM should be working anymore. The .ino file's loop command continually executes execSystemRealtime() and that calls systemSaveAxesPosition() and that calls settingsSaveStepstoEEPROM() if the axes are detached. If the machine is idle for more than 2 seconds, the axes are supposed to detach.. So everytime loop runs when the machine is idle, it will be writing to EEPROM...

When should steps be saved to EEPROM? I think it should save once, and only once, when all axes have been detached. Maybe we need a flag that gets set when the steps are written to EEPROM and gets cleared when an axis is attached and that flag is what's checked to determine whether or not to write to EEPROM? Perhaps add a bool to system_t structure?

@madgrizzle
Copy link
Contributor Author

madgrizzle commented Mar 28, 2019

I put together a solution but have no idea how to make a PR for it since my fork's master has a bunch of other stuff in it that's custom. I'm no github expert. It's simple changes to four files (system.h, system.cpp, settings.cpp, and axis.cpp). Any suggestions on how to do a PR? I tried editing each file online off the Maslow/Firmware but it seemed to create "patches" for each file. Is that ok?

Edit: Well, I took the first patch and added all the other changes to it. I created a PR for it for review:
#489

@madgrizzle
Copy link
Contributor Author

Is there any reason to not attach the axis during the motor/encoder test? I don't see any reason not to, but since it doesn't and uses a force override, I ask.

@BarbourSmith
Copy link
Member

Is there any reason to not attach the axis during the motor/encoder test?

I think that the issue was that when they are attached the regular PID controler is trying to control them and during the test we want them to spin no matter what someone has entered in their settings.

@madgrizzle
Copy link
Contributor Author

Ok, make sense. Do we want to have encoder steps after the test? I think most of the time people are testing the motors/encoders they are doing so with the chains attached. We could easily set the flag at the end of the test function if saving is desired.

@BarbourSmith
Copy link
Member

I think that sounds like an excellent idea. For people testing without the chains attached (which is what I was picturing when I created the test), there is no harm in saving the correct position.

@madgrizzle
Copy link
Contributor Author

madgrizzle commented Mar 29, 2019

I wonder if eeprom.h or even the chip itself has code that avoids eeprom writes when what is to be written is the same as what's already there.. would seem possible that's there to have that to minimize where.

Edit: appears that the update function in eeprom.h will do that check, but the put function that's used to save the structure does not.

@madgrizzle
Copy link
Contributor Author

Actually, put uses update (found it searching in a different manner)

https://www.arduino.cc/en/Reference/EEPROMPut

"Note
This function uses EEPROM.update() to perform the write, so does not rewrites the value if it didn't change."

That explains why the EEPROM isn't worn out.. whew. Still, if we can avoid the EEPROM.put calls, we should. I don't know if I can update the PR, but I could add the thing about saving steps after test in a new PR.

@madgrizzle
Copy link
Contributor Author

Apparently editing the code in patch-4 branch (what the PR is based on) automatically updated the PR.

@BarbourSmith
Copy link
Member

Good sleuthing, that explains why it hasn't come up earlier, but I am still glad to have a fix.

Isn't GitHub pretty sweet that way 😏

@madgrizzle
Copy link
Contributor Author

Sad thing is that I think this issue came up a while back because it seems so familiar. Nevertheless, it's good to call functions only when needed. What is interesting is that the eepeom.put command actually interferes with motor operations on a teensy (causes motor to stutter). Some conflict or something no one on teensy forum seems to know about.

@esspe2
Copy link
Contributor

esspe2 commented May 7, 2019

Hi,
What about something like this (move directWrite() outside of the loop) ?

    //move the motor
    motorGearboxEncoder.motor.directWrite(255);
    while (i < 100){
        i++;
        maslowDelay(10);
        if (sys.stop){return;}
    }
    //stop the motor
    motorGearboxEncoder.motor.directWrite(0);

I see no reason to repeat 1000 times the directWrite() call: all parameters and variables stay identical between successive calls.
The PR #498 has just been filed about it.

Did I understand well that each maslowDelay() call does an eeprom write ?
So at least a 10ms delay would seem better (but it isn't included in the PR yet, tell me what);
following the ATMega2560 datasheet on page 22, it is said that 3.3ms are needed to complete the write.
The example listing starts by checking for completion of previous write, which makes me think that, yes, 1ms is really too fast for the eeprom.

@blurfl
Copy link
Collaborator

blurfl commented May 8, 2019

Did I understand well that each maslowDelay() call does an eeprom write ?

@madgrizzle did some sleuthing and found that maslowDelay uses EEPROM.update(0, which only writes to the EEPROM if the value has changed. This seems like the desired action, to keep a record of the values but only write when necessary.

@madgrizzle
Copy link
Contributor Author

I don't know how the eeprom.update code works but I suspect it might do a read, then a comparison, and then write if there's a change (to avoid fatiguing the eeprom).. but that takes time to do so the modifications avoid it all since it's not necessary.

@esspe2
Copy link
Contributor

esspe2 commented May 9, 2019

I'm thinking of putting some pair of counters around settingsSaveStepstoEEprom(), just to check how often it is called with the bit EECR & (1<<EEPE) set and cleared.

When the eeprom isn't ready, one of two things could happen:

  • the write is ignored
  • the write is delayed
    In the latter case, that could be kind of a surprise because maslowDelay(1) could last 3msecs instead of 1, following what is written on page 23 of the datasheet.

Just to keep it at hand, the call chain to the eeprom write is like this:

 maslowDelay() 
 -> execSystemRealtime() 
 -> systemSaveAxesPosition() 
 -> settingsSaveStepstoEEprom() 
 -> EEPROM.put(310, sysSteps)

@BarbourSmith
Copy link
Member

I would be interested in seeing the results of that

@esspe2
Copy link
Contributor

esspe2 commented May 9, 2019

Sure!
still thinking, something like that...

# Settings.h
#ifdef DEBUG
extern long howManyTimesEEPEWasSet;
extern long howManyTimesEEPEWasClear;
#define debugEEPE() if (EECR & (1<<EEPE)) howManyTimesEEPEWasSet++; else howManyTimesEEPEWasClear++;
#endif

#Settings.cpp
void settingsSaveStepstoEEprom(){
(...)
      debugEEPE();
      EEPROM.put(310, sysSteps);
(...)
}

@esspe2
Copy link
Contributor

esspe2 commented May 9, 2019

Meanwhile, I have found some good news reading the arduino reference file:
$HOME/arduino-1.8.2/reference/www.arduino.cc/en/Reference/EEPROMPut.html

Syntax
EEPROM.put(address, data)

This function uses EEPROM.update() to perform the write, so does not rewrites the value if it didn't change.

By chance that right function is used here in Setting.cpp (by design perhaps!).
That could be the main reason to explain the absence of wear even if 1000s of writes were asked.

Likewise, it would be good to replace the last calls EEPROM.write() by EEPROM.update() in settingsWipe(); if the byte is already 0, no need to clear it another time.

And I haven't yet seen a check of the EEPE bit in the arduino code. Still checking before implementing my debug code...

@esspe2
Copy link
Contributor

esspe2 commented May 9, 2019

And the result is ... surprise ... :

howManyTimesEEPEWasSet: 0
howManyTimesEEPEWasClear: +1

Tadaa!

I mean, that happened each time I have typed "B04" on the Arduino Monitor console (to test the motors)...

ie after having typed 21 times that command I got this (Look for the "==============>" ) :

B04
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
Testing L motor:
<Idle,MPos:0,0,0,WPos:0.000,0.000,0.000>
==============> SET: 0 #CLR: 21
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
Direction 1 - Fail
<Idle,MPos:0,0,0,WPos:0.000,0.000,0.000><Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
Direction 2 - Fail
<Idle,MPos:0,0,0,WPos:0.000,0.000,0.000><Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
<Idle,MPos:0.00,0.00,0.00,WPos:0.000,0.000,0.000>
[PE:0.00,0.00,127]
Testing R motor:
<Idle,MPos:0,0,0,WPos:0.000,0.000,0.000>
==============> SET: 0 #CLR: 21

I'm puzzled...

So, maybe yeah, there were not so many writes after all, or I have made some huge mistake, but we are far from what I feared: two large growing numbers. No need for long integers ;-)

And at the end, there is still no confirmed bug about the writes, which is promising after all.

@BarbourSmith
Copy link
Member

The plot thickens...but at least it keeps thickening in a good way

@madgrizzle
Copy link
Contributor Author

Can you explain what the results mean? Is this based upon the current master of firmware?

@esspe2
Copy link
Contributor

esspe2 commented May 10, 2019

It's built from this branch:
https://github.com/esspe2/MaslowCNCFirmware/tree/count-eeprom-writes

Then I flashed an arduino mega2560, and opened the monitor console.
The lines with Idle or with PE appear regularly and can be ignored for this test.
You can see my B04, then the lines with arrows, the counters didn't increment during the test, but just before and afterwards.
They didn't increment either while doing nothing for 30 minutes.

@madgrizzle
Copy link
Contributor Author

ok, I just wanted to make sure this was post-PR.. I'm not sure why it writes just before the test because the code doesn't set sys.writeStepsToEEPROM to true until all motor tests are completed. The only other time it gets set to true is when the axis is attached, but I could never find code that did that in response to a B04 command.

@esspe2
Copy link
Contributor

esspe2 commented May 10, 2019

You asked the right question!
Actually the motorGearboxEncoder.motor.directWrite(255); is still inside the loop, as before the PR(mine, I mean)...
I still can't figure how to explain the result deeper.
I'm not immune to bugs either, still need to cross-check all that I have changed, and find who does the lone write.

@esspe2
Copy link
Contributor

esspe2 commented May 15, 2019

Digging deeper, here is what I have found, typing B04 two times in a monitor window, slightly filtered for clarity,
I have put a counter in each of the functions of the 'call chain' which are displayed during the B04 test :

B04
Testing L motor:
= FORWARD ========> SET: 0 #CLR: 0[ maslowDelay:1 execSystemRealtime:64381
        systemSaveAxesPosition:64381 SaveStepstoEEprom:0]
Direction 1 - Fail
= BACKWARD =======> SET: 0 #CLR: 0[ maslowDelay:1001 execSystemRealtime:74588 
        systemSaveAxesPosition:74588 SaveStepstoEEprom:0]
Direction 2 - Fail
Testing R motor:
= FORWARD ========> SET: 0 #CLR: 0[ maslowDelay:2002 execSystemRealtime:89870
        systemSaveAxesPosition:89870 SaveStepstoEEprom:0]
Direction 1 - Fail
= BACKWARD =======> SET: 0 #CLR: 0[ maslowDelay:3002 execSystemRealtime:100122
        systemSaveAxesPosition:100122 SaveStepstoEEprom:0]
Direction 2 - Fail
Testing Z motor:
= FORWARD ========> SET: 0 #CLR: 0[ maslowDelay:4003 execSystemRealtime:115393 
        systemSaveAxesPosition:115393 SaveStepstoEEprom:0]
Direction 1 - Fail
= BACKWARD =======> SET: 0 #CLR: 0[ maslowDelay:5003 execSystemRealtime:125652
        systemSaveAxesPosition:125652 SaveStepstoEEprom:0]
Direction 2 - Fail
Tests complete.
ok
(10 seconds later...)
B04
Testing L motor:
= FORWARD ========> SET: 0 #CLR: 1[ maslowDelay:6004 execSystemRealtime:248358
        systemSaveAxesPosition:248358 SaveStepstoEEprom:1]
Direction 1 - Fail
= BACKWARD =======> SET: 0 #CLR: 1[ maslowDelay:7004 execSystemRealtime:258608
        systemSaveAxesPosition:258608 SaveStepstoEEprom:1]
Direction 2 - Fail
Testing R motor:
= FORWARD ========> SET: 0 #CLR: 1[ maslowDelay:8005 execSystemRealtime:273862
        systemSaveAxesPosition:273862 SaveStepstoEEprom:1]
Direction 1 - Fail
= BACKWARD =======> SET: 0 #CLR: 1[ maslowDelay:9005 execSystemRealtime:284113
        systemSaveAxesPosition:284113 SaveStepstoEEprom:1]
Direction 2 - Fail
Testing Z motor:
= FORWARD ========> SET: 0 #CLR: 1[ maslowDelay:10006 execSystemRealtime:299417
        systemSaveAxesPosition:299417 SaveStepstoEEprom:1]
Direction 1 - Fail
= BACKWARD =======> SET: 0 #CLR: 1[ maslowDelay:11006 execSystemRealtime:309661
        systemSaveAxesPosition:309661 SaveStepstoEEprom:1]
Direction 2 - Fail
Tests complete.
ok

maslowDelay(), called 1000 times, seems to call around 10 times execSystemRealtime();
systemSaveAxesPosition() is only used by execSystemRealtime() which explains the identical numbers.
At the end, SaveStepstoEEprom() is only called unfrequently, maybe only when the positions have changed.

Still going in a nice direction then.

@madgrizzle
Copy link
Contributor Author

Sounds logical.. I think if you take out the mod I made with the addition of sys.writeStepsToEEPROM test then you'll get markedly different results.. but the amount of times the eeprom is written to may not actually change.

@esspe2
Copy link
Contributor

esspe2 commented May 15, 2019

OK, well thought!
having removed the test on sys.writeStepsToEEPROM, here is the same test, with the same 10 seconds delay in between:

B04
Testing L motor:
= FORWARD ==> SET: 0 #CLR: 11625 [maslowDelay:1 execSystemRealtime:11625
         systemSaveAxesPosition:11625 settingsSaveStepstoEEprom:11625 Axis attached:]
Direction 1 - Fail
= BACKWARD => SET: 0 #CLR: 16735 [maslowDelay:1001 execSystemRealtime:16735
         systemSaveAxesPosition:16735 settingsSaveStepstoEEprom:16735 Axis attached:]
Direction 2 - Fail
Testing R motor:
= FORWARD ==> SET: 0 #CLR: 24046 [maslowDelay:2002 execSystemRealtime:24046
         systemSaveAxesPosition:24046 settingsSaveStepstoEEprom:24046 Axis attached:]
Direction 1 - Fail
= BACKWARD => SET: 0 #CLR: 29158 [maslowDelay:3002 execSystemRealtime:29158
         systemSaveAxesPosition:29158 settingsSaveStepstoEEprom:29158 Axis attached:]
Direction 2 - Fail
Testing Z motor:
= FORWARD ==> SET: 0 #CLR: 36768 [maslowDelay:4003 execSystemRealtime:36768
         systemSaveAxesPosition:36768 settingsSaveStepstoEEprom:36768 Axis attached:]
Direction 1 - Fail
= BACKWARD => SET: 0 #CLR: 41885 [maslowDelay:5003 execSystemRealtime:41885
         systemSaveAxesPosition:41885 settingsSaveStepstoEEprom:41885 Axis attached:]
Direction 2 - Fail
Tests complete.
ok
B04
Testing L motor:
= FORWARD ==> SET: 0 #CLR: 108887 [maslowDelay:6004 execSystemRealtime:108887
         systemSaveAxesPosition:108887 settingsSaveStepstoEEprom:108887 Axis attached:]
Direction 1 - Fail
= BACKWARD => SET: 0 #CLR: 113625 [maslowDelay:7004 execSystemRealtime:113625
         systemSaveAxesPosition:113625 settingsSaveStepstoEEprom:113625 Axis attached:]
Direction 2 - Fail
Testing R motor:
= FORWARD ==> SET: 0 #CLR: 121241 [maslowDelay:8005 execSystemRealtime:121241
         systemSaveAxesPosition:121241 settingsSaveStepstoEEprom:121241 Axis attached:]
Direction 1 - Fail
= BACKWARD => SET: 0 #CLR: 126017 [maslowDelay:9005 execSystemRealtime:126017
         systemSaveAxesPosition:126017 settingsSaveStepstoEEprom:126017 Axis attached:]
Direction 2 - Fail
Testing Z motor:
= FORWARD ==> SET: 0 #CLR: 133632 [maslowDelay:10006 execSystemRealtime:133632
         systemSaveAxesPosition:133632 settingsSaveStepstoEEprom:133632 Axis attached:]
Direction 1 - Fail
= BACKWARD => SET: 0 #CLR: 138731 [maslowDelay:11006 execSystemRealtime:138731
         systemSaveAxesPosition:138731 settingsSaveStepstoEEprom:138731 Axis attached:]
Direction 2 - Fail
Tests complete.
ok

Which makes me understand that the EEPROM.put(310, sysSteps); is called VERY often, and by chance, as you saw before, it calls EEPROM.update() which does some useful filtering.
Btw no axis was attached at the time of the test, here is what was prepared for it;

    Serial.print(F(           " Axis attached:"));
    if (leftAxis.attached())  Serial.print(F("L "));
    if (rightAxis.attached()) Serial.print(F("R "));
    if (zAxis.attached())     Serial.print(F("Z "));

From my point of view, your mod stays relevant, great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants