Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record advance time #457

Merged
merged 34 commits into from
Nov 10, 2024
Merged

Conversation

NQNStudios
Copy link
Collaborator

@NQNStudios NQNStudios commented Sep 25, 2024

Fix #437

  • Record and replay actions from Recharge spell #422
  • Implement --verbose flag for replay recording, which records internal testing actions (for now just advance_time)
  • This PR I think covers making the vast majority of action cases perform advance_time() identically to original runtime behavior. I would say it's highly likely that I missed edge cases, but this version of the code is much more correct.
  • Also of note: obsessively matching the behavior of advance_time() while correct and good, is not entirely necessary for replay integrity as long as action replays always handle the did_something flag correctly.

To test this and find every discrepancy, I used these two replays which together contain one of every action (I'm pretty sure):

Archive.zip

There are some major caveats to those 2 replays, which is why I am not staging them in the repository:

  • They were both recorded with different builds, each in a modified state
  • In the first one, I deleted some actions at the end which led to the crash "Ask About" text input crash #455
  • As I was testing and fixing, I changed the way some actions record themselves. Rather than re-record the whole logs, I would edit the log to match the changing behavior (but I could have made a mistake).
  • They take very very long.

Some things about these replays are cool:

  • They recreate pretty long play sessions, appearing to match the state of the original playthrough perfectly, including in combat (which is fun to watch)
  • You can see the replays reproduce some UI bugs which would be impossible to check with unit testing
  • If I ever demo OpenBoE at a public event, I'll probably make replays to have looping when no one's actively playing it :)

This is a wibbly wobbly PR which I'm afraid rebasing for cleaner history, would get too hairy. So I haven't tried squashing anything yet. I also don't think it would be proper to squash the whole PR into one commit, because it shows the evolution of the --verbose flag and my reasoning for it. All the contiguous "fix some cases" commits could probably be squashed together easily, though.

My next recording system PRs after this one will do the following to address the dangling problems:

  • Add two folders, test/replays/short and test/replays/long
  • Add a much shorter log that records one of everything FROM A CLEAN REPO STATE, WITHOUT MANUAL MODIFICATIONS
  • Add a bash script test/replays.sh which runs the replays, with the options to fail fast and/or skip the long ones.
  • On success of a comprehensive run of all replays, this script will update a file test/replays/success_{platform}.txt which records the latest gitrev info unless the repository has been modified. This file can be staged, and potentially the CI for release builds could check for its existence and assert that a pull request comes with a recent success SHA for all 3 platforms (though this requires a dev to have access to all of Mac/Windows/Linux, which I do for now, but obviously not everyone will and my computers might break some day.)
  • Get the replay test to pass on all 3 platforms so that success files can exist for all 3.

@NQNStudios
Copy link
Collaborator Author

That win-scons one failed for github action reasons. Needs to re-run

@CelticMinstrel
Copy link
Member

though this requires a dev to have access to all of Mac/Windows/Linux, which I do for now

I technically have access to all 3 too, though my Windows/Linux computer has been turned off for quite awhile and probably needs servicing.

@@ -1282,6 +1282,7 @@ bool use_space(location where) {
// Note ... if this is a container, the code must first process any specials. If
// specials return false, can't get items inside. If true, can get items inside.
// Can't get items out in combat.
// TODO this always returns false. Why?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect it's supposed to return s1.

@CelticMinstrel
Copy link
Member

This one looks wrong: 762feec

One other that looked wrong is handle_alchemy, but I'm not sure of that – I think it should advance time if you make a potion, but it certainly shouldn't advance time if you open the alchemy menu and then cancel. Were both those cases tested?

There are a couple of TODO notes that need removing now. One is the one you can see in 47b4456 and the other is the original TODO note about validating advance_time.

@NQNStudios
Copy link
Collaborator Author

This one looks wrong: 762feec

I believe it's correct -- screen shifting needs to call advance_time() with need_redraw true but did_something false

@NQNStudios
Copy link
Collaborator Author

One other that looked wrong is handle_alchemy, but I'm not sure of that – I think it should advance time if you make a potion, but it certainly shouldn't advance time if you open the alchemy menu and then cancel. Were both those cases tested?

Oh, I was definitely rushing though when I recorded handle_alchemy. I have to redo that one and make sure to get both cases

@CelticMinstrel
Copy link
Member

CelticMinstrel commented Sep 25, 2024

This one looks wrong: 762feec

I believe it's correct -- screen shifting needs to call advance_time() with need_redraw true but did_something false

Ah, I guess that half makes sense. It suggests advance_time isn't quite the right name for it, but close enough I guess.

Oh, I was definitely rushing though when I recorded handle_alchemy. I have to redo that one and make sure to get both cases

What about other dialog things that can be cancelled? Like casting a spell, or deleting a PC… or giving a stackable item with charges…

@NQNStudios
Copy link
Collaborator Author

NQNStudios commented Sep 26, 2024

No, I didn't comprehensively test canceling cancelable actions. That actually is probably gonna reveal new bugs and discrepancies

EDIT: ...or not. Because --verbose just records what happens. Actually I would expect those cases to just work. We'll see.

@NQNStudios
Copy link
Collaborator Author

NQNStudios commented Sep 26, 2024

Ah, I guess that half makes sense. It suggests advance_time isn't quite the right name for it, but close enough I guess.

Right. advance_time only actually advances time if an action set did_something to true. The other two flags just redraw the screen and print out buffer messages. did_something is very rarely set separately from the business logic of actions' functions in boe.actions.cpp, which means we possibly had leeway to keep calling advance_time after every action in the first place.

Unless advance_time() changed behaviors in the future. Then we would be glad that we got it right.

@NQNStudios
Copy link
Collaborator Author

NQNStudios commented Sep 26, 2024

> Because --verbose just records what happens. Actually I would expect those cases to just work.

I can express this better.

Actions are calling synchronous dialogs to either complete the action or cancel it. After the synchronous dialog runs, they're calling advance_time.

At replay-time, the dialogs also run synchronously and complete the action logic (accounting for cancel) before checking advance_time. So it should match without extra work.

I think this is completely wrong. I'm too many hours deep on this for today. Not thinking clearly.

@NQNStudios
Copy link
Collaborator Author

https://github.com/calref/cboe/pull/457/files#diff-0fb416d013a449a9dcaaf68891fc4bf89204fbb9a69ea747e6c27141026ea24aL424

if this ever triggers a spell being cast outside of combat, it should probably advance time. Inside of combat, I assume the Action Points system would handle advancing the time.

@CelticMinstrel
Copy link
Member

How is the base64 content formatted when a load game action is recorded? If possible I'd like it to be broken into fixed-length lines (except for the final line). It could maybe use a CDATA section… I think that might be required to preserve the line breaks with the way you've implemented the replay system, though this isn't really a case where whitespace needs to be preserved at all… but a base64 decoder is supposed to ignore whitespace anyway so it shouldn't matter.

Copy link
Member

@CelticMinstrel CelticMinstrel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you planning to finish this anytime soon?

@NQNStudios
Copy link
Collaborator Author

NQNStudios commented Nov 9, 2024 via email

@NQNStudios
Copy link
Collaborator Author

How is the base64 content formatted when a load game action is recorded? If possible I'd like it to be broken into fixed-length lines (except for the final line). It could maybe use a CDATA section… I think that might be required to preserve the line breaks with the way you've implemented the replay system, though this isn't really a case where whitespace needs to be preserved at all… but a base64 decoder is supposed to ignore whitespace anyway so it shouldn't matter.

The Linux man page for base64 says 76 is the default chars-per-line of base64 output:
https://linux.die.net/man/1/base64

I can't find any definitive standard about how many chars per line there should be, so I'll use 76.

@CelticMinstrel CelticMinstrel merged commit 77297c2 into calref:master Nov 10, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When replaying, match the original advance_time() behavior
2 participants