Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GuiTraceRay from theater ray intersection instead of camera position. #1782

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

saurtron
Copy link
Collaborator

@saurtron saurtron commented Nov 25, 2024

Work done

  • Calculate optimal position to GuiTraceRay from and apply to all GuiTraceRay (the ones with groundOnly=false).

Explanation

Background

Implementation

  • I define theater as the vertical plane defined by the intersection of camera to far bottom frustum corners with the map maxheight+maxunitaltitude.
  • The theater window is basically the bottom of the frustum intersected with the maximum height + the maximum altitude from airplanes. This defines the closest units we can actually see, even if the camera position is far back.
  • Then calculate the theater ray intersection as the intersection of the theater plane with the ray.
  • Note: For now I don't use the real maxunitaltitude but a hardcoded number, this needs to actually get the maxunitaltitude from somewhere. This comes from unitdefs+some additional logic, and I think can be maintained by the engine, or maybe already does not sure about this. I did check and 500.0 is good for bar atm, but ofc this needs to be improved before merging.
  • We could optimize a bit by not doing if camera direction is just looking down with some threshold, but I think the calculations are not too expensive anyways.

Remarks

  • I think this is the proper logic but of course needs some thought, the idea is tracing from as close as possible to scan less quads.
  • Might be I'm reimplementing some functionality already present in engine, please let me know if this is the case. Specially I couldn't find a way to get a plane equation from norm+point.
  • Also maybe the "theater" terminology is not the best.

Before and after

  • Top images are from current master, bottom are with the PR applied.
  • The white squares are actual quads scanned by GuiTraceRay.
  • Note how the top images quickly become a search from the other side of the map going back.
  • Note how the corners (represented by squares) I'm drawing go further back than the frustum drawn at minimap, this is because they intersect with ground, but I'm intersecting with max altitude basically.
  • Blue square is the camera position, red square the intersection of the ray with the near frustum plane (iirc, I tried this first, but it's also not best).

rayorigin

Diagram

theater

* Theater is the vertical plane defined by the intersection of camera to
  far bottom frustum corners with the map maxheight+maxunitaltitude.
* Calculate the theater ray intersection as the intersection of the
  theater plane with the ray.
@sprunk
Copy link
Collaborator

sprunk commented Nov 25, 2024

Units can be arbitrarily high due to impulse and Lua so there is no such thing as maximum altitude.

@saurtron
Copy link
Collaborator Author

saurtron commented Nov 25, 2024

Units can be arbitrarily high due to impulse and Lua so there is no such thing as maximum altitude.

Well, in practice atm this is not the case so I think it's worth it to consider. In bar in a typical map with some high areas the max map altitude can be 500 or 1000, units max altitude around 500. Camera can be at 4000 or 5000+.

Anyways, even in the case there's no absolute maximum it can be considered to maintain the current highest altitude as an unsynced value, actually I thought the engine might already be maintaining something like this itself but couldn't check properly yet.

@sprunk
Copy link
Collaborator

sprunk commented Nov 25, 2024

ZK has orbital flingers so the "in practice" isn't even true, but if it was I still wouldn't want to sacrifice capabilities / behaviour correctness in name of perf, as a matter of principle.

Keeping some sort of max visible object height cached (such that the behaviour is always correct) sounds fine.

@saurtron
Copy link
Collaborator Author

saurtron commented Nov 25, 2024

Added keeping track of unit altitude.

if it was I still wouldn't want to sacrifice capabilities / behaviour correctness in name of perf, as a matter of principle

Of course I also wouldn't do it, the initial value was just a placeholder while discussing, was just proposing either we keep track of absolute altitude (as implemented atm) or have some kind of limit set that would be impossible to breach unless the game sets it higher in some way, or maybe the engine had something already, but looking at it doesn't seem to be the case.

Actually if you think about it, maybe the engine does have some higher limit by itself, due to different variables (like unitdef), we just don't know what it is. Of course lua can set it higher, but we can track the limit when lua changes something.

Anyways for now I'm updating the limit pretty much everywhere quadField is getting updated, plus also resetting it on UnitHandler UpdateUnits as that seemed reasonable. I think that should be ok.

The thing i'm not totally sure atm is I'm keeping the limit with unit->pos.y + unit->radius, but not totally sure I shouldn't be using midPos+radius maybe.

Copy link
Collaborator

@sprunk sprunk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks decent

rts/Game/Camera.cpp Outdated Show resolved Hide resolved
rts/Sim/Units/UnitHandler.cpp Outdated Show resolved Hide resolved
rts/Game/UI/GuiHandler.cpp Outdated Show resolved Hide resolved
@sprunk
Copy link
Collaborator

sprunk commented Nov 25, 2024

The thing i'm not totally sure atm is I'm keeping the limit with unit->pos.y + unit->radius, but not totally sure I shouldn't be using midPos+radius maybe.

It probably should use the top of the selection volume which probably means some nonsense like

const auto& selVol = unit->selectionVolume;
const auto top = selVol.GetWorldSpacePos(unit).y + selVol.GetBoundingRadius();

Features are also subject to the trace-ray so should be included in the bounds calculation. Both units and features are derived from the CSolidObject class which is the one that has selectionVolume as a member so the handling would be the same.

@saurtron
Copy link
Collaborator Author

saurtron commented Nov 27, 2024

It probably should use the top of the selection volume which probably means some nonsense like

const auto& selVol = unit->selectionVolume;
const auto top = selVol.GetWorldSpacePos(unit).y + selVol.GetBoundingRadius();

Added this, although I think it'd be good to find a lighter method. Note we don't need exact results but just to be higher than the exact results. Anyways it's not extremely expensive, but we will be looping all units all the time to keep this updated (otherwise there's no way to reset the altitude and it would just keep going up), plus running it on UnitMove.

Features are also subject to the trace-ray so should be included in the bounds calculation. Both units and features are derived from the CSolidObject class which is the one that has selectionVolume as a member so the handling would be the same.

Added feature support too. I added it inside featureHandler in a similar way to how unitHandler is doing it. I'm doing the altitude reset less often though, as I don't expect this to go up a lot also there is no loop already on features so I had to add one. The good thing is we can check earlier and if the feature max altitude is lower than ground or unit altitude we don't need to loop it.

rts/Game/Camera.cpp Outdated Show resolved Hide resolved
rts/Sim/Features/FeatureHandler.cpp Outdated Show resolved Hide resolved
rts/Game/Camera.cpp Outdated Show resolved Hide resolved
@sprunk
Copy link
Collaborator

sprunk commented Nov 27, 2024

If #1774 is to be merged then everything that updates the respective max should have a visibility check, otherwise the new /debug view would leak information about the existence of high-flying stuff within the fog of war.

@saurtron
Copy link
Collaborator Author

If #1774 is to be merged then everything that updates the respective max should have a visibility check, otherwise the new /debug view would leak information about the existence of high-flying stuff within the fog of war.

Yes I was thinking to make that cheat mode only.

@sprunk
Copy link
Collaborator

sprunk commented Nov 27, 2024

Also works.

@@ -203,6 +220,9 @@ void CFeatureHandler::Update()

updateFeatures.erase(iter, updateFeatures.end());
}
if ((gs->frameNum & 63) == 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I set it to 62 because 2 secs felt right

How about

Suggested change
if ((gs->frameNum & 63) == 0) {
/* Value arbitrary, represents a perf tradeoff:
* larger values decrease per-feature overhead over time for the usual state where features don't move vertically,
* smaller values decrease traceray overhead if features often enter and leave the stratosphere. */
static constexpr auto UPDATE_RATE = 2 * GAME_SPEED;
if (gs->frameNum % UPDATE_RATE == 0) {

Copy link
Collaborator Author

@saurtron saurtron Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we also change the check above this for destroyed features for consistency too? Otherwise it feels a bit ad-hoc. I did it like this for internal consistency in the class tbh.

Also one concern here is everyone doing checks on frameNum % (n * GAME_SPEED), then all n multipliers of game speed would see perf hit with everyone queueing to do extra work there. So maybe want to add an offset here.

Right now all modules choose "slots" quite randomly resulting in random frames with increased load I guess.

Ideally maybe everyone should query for free/next update slots with some given period stepping so work is distributed among those, probably neither % or & standalone are really good in the end.

Maybe smth like:

// example for doing specific slot in 32 (2⁵ frame groups).
static constexpr int PERIOD_STEPPING = 5;
static constexpr int UPDATE_MASK = std::pow(2, PERIOD_STEPPING)-1;
auto slot = engine->getNextSlot(PERIOD_STEPPING);

 if ((frameNum+slot) & UPDATE_MASK))
    // do work..

I think it could even be improved to sync between different periods. Lua could also use the mechanism, the only issue would be methods who want to do it right now and then with a specific period, but for those probably best would be to actually do it right now somehow, and then conform to the slot assigned by the engine.

Right now here, I think the consistency is important so the function uses the same method for both checks. Later maybe a method like above could be used to properly distribute load among processes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we also change the check above this for destroyed features for consistency too? Otherwise it feels a bit ad-hoc. I did it like this for internal consistency in the class tbh.

Eventually yes. There are many x & 31 left all over engine that should mostly become x % GAME_SPEED at some point, but some care needs to be taken for each to make sure there isn't a regression. In particular I am unsure if the one above is safe to change this way because running at a slightly slower rate than unit CommandAI slow updates may be on purpose to make sure CAI has an opportunity to process things.

Right now here, I think the consistency is important so the function uses the same method for both checks.

Sure. I don't mind either way.

everyone doing checks on frameNum % (n * GAME_SPEED), then all n multipliers of game speed would see perf hit with everyone queueing to do extra work there. So maybe want to add an offset here. Right now all modules choose "slots" quite randomly resulting in random frames with increased load I guess.

Sure. Most of this sort of periodic work is reasonably light though, I'd expect everything just has an offset of 0 but there aren't really spikes every second. Also there may be hidden assumptions so again it would take some care to make sure not to break things (e.g. handler A's periodic update must run right after handler B's and it's not made explicit anywhere, but it sort of happens to work because they just currently have the same offset and period - I remember something like this in resource income stat collection but there may be others).

Lua could also use the mechanism, the only issue would be methods who want to do it right now and then with a specific period

Lua is a monolith from engine PoV so it is up to Lua to arrange work for individual wupgets, e.g. they could sign up dynamically to the wupgetHandler or use Script.DelayByFrames.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

care needs to be taken for each to make sure there isn't a regression.

Indeed, any such rework would have to be done with extreme care to understand everything going on.

Also there may be hidden assumptions so again it would take some care to make sure not to break things (e.g. handler A's periodic update must run right after handler B's and it's not made explicit anywhere, but it sort of happens to work because they just currently have the same offset and period

Current situation is fragile tho, because there might be such a situation happening and might be hard to notice, and any day someone may change some of the periods but then break any such assumptions, like if we decide to change both periods at featureHandler in this PR.

@saurtron
Copy link
Collaborator Author

saurtron commented Nov 30, 2024

diagram to document the theater idea:

theater

ray origin is the same as camera position, note how it's much before any possible intersection with units

const auto ftl = GetFrustumVert(CCamera::FRUSTUM_POINT_FTL);

// check the bottom frustum is parallel to ground and lower than the top frustum
if ((std::abs(fbl.y-fbr.y) > std::numeric_limits<float>::epsilon()) || (fbl.y >= ftl.y))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

epscmp exists, perhaps use that? Maybe std::numerlic_limits<T>::epsilon() could even become the default for its 3rd arg:

template<class T> inline bool epscmp(const T a, const T b, const T eps) {

can have error much bigger than epsilon even when the camera is parallel
to groung.
@saurtron
Copy link
Collaborator Author

saurtron commented Dec 1, 2024

When analyzing the PR with @lhog I noticed the terrible overscanning is only happening when zoom fov is being used (not normal by default, at least in BAR).

In normal situation it's still overscanning when not looking more or less down, although it's not that bad.

Also fixed some error conditions when determininig the closest point. Now I believe it works good.

Still, I think this PR would benefit from some profiling before thinking about merge, since under "normal" conditions I believe it's not overscanning too much for most users (just for the ones who use more of a Supreme commander style camera).

Also likely the Camera::NearTheaterIntersection method could have a cheaper test to bail out early, but still the main performance cost in this PR is keeping all altitudes updated.

updated image with normal camera overtracing representation

I posted the updated image at PR description for reference, but posting here as well:

(check the total longitude of the ray, not the width, width is a bit different because angle and position of ray is a bit different in both images, so ray width is affecting differently)

at the top images you can see the blue square (camera pos), that's where we normally trace from.

note: previously it was proposed here to use near frustum to mouse ray intersection as starting point, but that seems to be (almost) the same as camera position, needs to be investigated further.

raystart

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants