Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix multidimensional data pages baking logic #4478

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

rakyi
Copy link
Contributor

@rakyi rakyi commented Jan 23, 2025

  • Don't delete mdims in GrapherBaker
  • Delete old mdims in MultiDimBaker
  • Improve the console output
  • Document issue with TSConfig and getting ESNext types

@rakyi rakyi requested a review from mlbrgl January 23, 2025 10:55
@@ -308,7 +309,6 @@ const bakeGrapherPageAndVariablesPngAndSVGIfChanged = async (
imageMetadataDictionary
)
)
console.log(outPath)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We output the slug in the progress bar already, this doesn't add anything useful on top of that.

@@ -432,7 +431,7 @@ export const bakeAllChangedGrapherPagesVariablesPngSvgAndDeleteRemovedGraphers =
)

const progressBar = new ProgressBar(
"bake grapher page [:bar] :current/:total :elapseds :rate/s :etas :name\n",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear this is an ETA in the output, which is already cluttered, and doesn't seem useful.

@@ -451,11 +450,18 @@ export const bakeAllChangedGrapherPagesVariablesPngSvgAndDeleteRemovedGraphers =
async (knex) => await bakeSingleGrapherChart(job, knex),
db.TransactionCloseMode.KeepOpen
)
progressBar.tick({ name: `slug ${job.slug}` })
progressBar.tick({ name: job.slug })
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeating slug on every line doesn't add anything useful.

@rakyi rakyi force-pushed the fix-mdim-baking-logic branch from 6e05aa2 to 8385cc3 Compare January 23, 2025 11:29
@owidbot
Copy link
Contributor

owidbot commented Jan 23, 2025

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-fix-mdim-baking-logic

SVG tester:

Number of differences (default views): 0 ✅
Number of differences (all views): 0 ✅

Edited: 2025-01-23 11:44:17 UTC
Execution time: 1.23 seconds

@rakyi rakyi force-pushed the fix-mdim-baking-logic branch from 8385cc3 to 2a6a3d2 Compare January 23, 2025 12:31
@rakyi rakyi linked an issue Jan 23, 2025 that may be closed by this pull request
@rakyi rakyi force-pushed the fix-mdim-baking-logic branch 3 times, most recently from fab579d to af22b53 Compare January 24, 2025 09:59
* Don't delete mdims in GrapherBaker
* Delete old mdims in MultiDimBaker
* Improve the console output
* Document issue with TSConfig and getting ESNext types
@rakyi rakyi force-pushed the fix-mdim-baking-logic branch from af22b53 to 2d02649 Compare January 24, 2025 10:01
Copy link
Member

@mlbrgl mlbrgl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

My main concern is about the repetition of deleteOldGrapher / deleteOldMultiDimPages:
1- code redundancy: they largely do the same thing
2- execution redundancy: the deletion of obsolete /grapher pages happens twice (once at the grapher step, and another time at the mdim step)

2 is more of an optimzation for full bakes, and doesn't probably count for much so I wouldn't necessarily complexify here. Also it's nice that each step is self-contained when we do partial bakes.

The near duplication of deleteOldGrapher in 1 is a bit more cumbersome, and it would be more straightforward to have a single function that deals with the overall deletion of /grapher pages (even if it called twice, as per 2).

It would be also a good opportunity to deal with the confusion around redirects to explorers.

On the one hand, we don't delete grapher pages that have been redirected to explorers:

const toRemove = without(oldSlugs, ...newSlugs)
// do not delete grapher slugs redirected to explorers
.filter((slug) => !isPathRedirectedToExplorer(`/grapher/${slug}`))

On the other hand, we don't bake grapher pages that have been redirected to explorers:

// Avoid baking paths that have an Explorer redirect.
// Redirects take precedence.
if (isPathRedirectedToExplorer(`/grapher/${grapher.slug}`)) {
console.log(`⏩ ${grapher.slug} redirects to explorer`)
return
}

Meanwhile, deleteOldMultiDimPages does delete grapher pages that have been redirected to explorers, and effectively overrides deleteOldGrapher's safeguard. If this safeguard is no longer relevant, I'd remove it rather than having it in one place and not the other, which is confusing.

console.log(`DELETING ${slug}`)
const path = `${bakedSiteDir}/grapher/${slug}.html`
await fs.unlink(path)
console.log(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

console.log(path) necessary? We have the same console.log in deleteOldGrapher().

Also, instead of:

DELETING coronavirus-cfr
localBake/grapher/coronavirus-cfr.html

we could maybe simply output:

DELETING localBake/grapher/coronavirus-cfr.html

?

}
const publishedSlugs = await getAllPublishedMultiDimDataPageSlugs(knex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could get the list of published slugs from multiDimsBySlug potentially, to avoid the extra db query?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Correctly delete unpublished multidimensional data pages when baking
3 participants