Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update tidy logic #18

Merged
merged 16 commits into from
Jun 10, 2024
Merged

Update tidy logic #18

merged 16 commits into from
Jun 10, 2024

Conversation

dpaddon
Copy link
Contributor

@dpaddon dpaddon commented Apr 16, 2024

  • Updated the min distance/time filtering logic to check each point relative to the last kept point, rather than the previous point in the input list. This previous behaviour could cause too many points to be filtered out.
  • For input traces of > maximumPoints coordinates, previously we split the output into multiple features. Now, we randomly sample maximumPoints probes from the trace by removing a random probe until we hit the desired maximumPoints. Note this happens after time/distance filtering.
  • Updated the geojson files used for comparison in the tests, so that these conform to the new logic.
  • Updated the filtering to allow minimumDistance and minimumTime to be set to 0 (in this case the only effect will be the random sampling if the input length is > maximumPoints). Previously, supplying 0 for these values would cause them to be set to the defaults of 10 and 5 respectively.
  • Use an array of probe indices to keep rather than selecting the points directly. This should make it easier in future to allow the function to return properties other than coordTimes, if they are present in the input.

geojson-tidy.js Outdated Show resolved Hide resolved
geojson-tidy.js Outdated Show resolved Hide resolved
geojson-tidy.js Outdated
Comment on lines 114 to 118
while (keepIdxs.length > (filter.maximumPoints - 2)) {
// randomly select one of the elements to remove
removeIdx = Math.floor(Math.random() * keepIdxs.length)
keepIdxs = keepIdxs.filter((element, index) => index !== removeIdx)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because of .filter this is O(n) to remove one point.

I would suggest either use a set for O(1) removals or generate all the points and filter them out in one go

Copy link
Contributor Author

@dpaddon dpaddon Apr 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you generate all the points first, you also have to keep track of which ones you have already "removed". By doing it this way, the list is already modified when you move to the next iteration.
This part won't see much heavy usage, as it comes after the time/distance filtering, which naturally reduces the length of the array. I think for now it's not worth optimising too far.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not do something equivalent to

const pointsToRemoveSet = new Set(...) // populate somehow

// at the very end call .filter once
keepIdxs = keepIdxs.filter(element => !pointsToRemoveSet.has(element));

I think this would be much cheaper in the end.

I am just thninking what if there are 1000 points and we need to remove all but 100, we cannot afford to make that many .filter calls

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also have to keep track of which ones you have already "removed".

not necessarily, I think this can be done efficiently too. For example, shuffle array and take first X points as points to keep/to remove

geojson-tidy.js Outdated Show resolved Hide resolved
@dpaddon dpaddon requested a review from a team as a code owner April 18, 2024 14:08
@dpaddon dpaddon requested a review from Valiunia April 18, 2024 14:11
Copy link

@tautrimas tautrimas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@dpaddon dpaddon merged commit 30d94d9 into master Jun 10, 2024
2 checks passed
@dpaddon dpaddon deleted the update-tidy-logic branch June 10, 2024 09:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants