🎉 Add related research and writing via content graph to data pages #2739

Merged
merged 25 commits into from
Nov 27, 2023
Commits
15e1c46
:construction: start drafting new table
danyx23 Oct 25, 2023
365a41d
:construction: tweak full posts link generation, almost complete
danyx23 Oct 25, 2023
d681df2
:sparkles: add updating of PostLink to wp update hook
danyx23 Oct 25, 2023
a80c490
:construction: WIP - query for related research and writing
danyx23 Oct 25, 2023
8d4b75e
:bug: fix group by
danyx23 Oct 25, 2023
63a9737
:tada: start showing related research and writing
danyx23 Oct 25, 2023
b8f28fd
:hammer: add temporary thumbnail rendering
danyx23 Oct 25, 2023
5204cf5
:bug: fix wordpress authors display
danyx23 Oct 25, 2023
6480d6f
:hammer: tweak related research query
danyx23 Oct 25, 2023
6cd3a2d
🤖 style: prettify code
danyx23 Oct 25, 2023
e9738a9
:honeybee: fix lint issues
danyx23 Oct 25, 2023
50c0f71
:hammer: add tooling to get pageview data into local mysql
danyx23 Oct 25, 2023
f7a695c
:hammer: make sure pageviews are 0 and not null
danyx23 Oct 25, 2023
7ab6aaf
:sparkles: use thumbnails for wp posts
danyx23 Oct 31, 2023
b726d7f
:hammer: add tags to content that is retrieved
danyx23 Nov 1, 2023
156daeb
:hammer: incorporate tags when matching related research
danyx23 Nov 2, 2023
a8e8f74
:honeybee: fix accidental commits in launch.json
danyx23 Nov 2, 2023
796f8c4
:hammer: fix filter query
danyx23 Nov 2, 2023
f640483
:hammer: fix page title fallback to chart title
danyx23 Nov 2, 2023
99dc5ba
:bug: fix url not showing up in citation
danyx23 Nov 6, 2023
be2a07f
:hammer: hide charts thumbnails in all charts block for single charts
danyx23 Nov 6, 2023
a07aa34
:hammer: hard code link redirects from country templates to selector
danyx23 Nov 7, 2023
78c20b5
Simplify find postlink
danyx23 Nov 16, 2023
b617889
🔨incorporate feedback
danyx23 Nov 16, 2023
2943fbe
:lipstick: (lint) remove unused variable
sophiamersmann Nov 24, 2023
1 change: 1 addition & 0 deletions .eslintignore
@@ -15,3 +15,4 @@ wordpress/web/wp/wp-content/**
wordpress/vendor/**
packages/@ourworldindata/*/dist/
dist/
.vscode/
20 changes: 17 additions & 3 deletions .vscode/launch.json
@@ -13,6 +13,9 @@
"skipFiles": [
"<node_internals>/**"
],
"skipFiles": [
"<node_internals>/**"
],
"type": "node"
},
{
@@ -25,6 +28,10 @@
"${fileBasenameNoExtension}.js",
"--watch"
],
"args": [
"${fileBasenameNoExtension}.js",
"--watch"
],
"console": "integratedTerminal"
// "internalConsoleOptions": "neverOpen"
},
@@ -70,7 +77,7 @@
"skipFiles": [
"<node_internals>/**"
],
"type": "node"
"type": "node",
},
{
"name": "Run SVGTester",
@@ -79,17 +86,24 @@
"skipFiles": [
"<node_internals>/**"
],
"skipFiles": [
"<node_internals>/**"
],
"type": "node",
"args": [
"-g",
"367"
]
"args": [
"-g",
"367"
]
},
{
"name": "Launch admin server",
"program": "${workspaceFolder}/itsJustJavascript/adminSiteServer/app.js",
"request": "launch",
"type": "node"
"type": "node",
},
{
"name": "Attach to node",
@@ -115,4 +129,4 @@
"port": 9000
}
]
}
}
31 changes: 18 additions & 13 deletions Makefile
@@ -24,23 +24,24 @@ help:
@echo 'Available commands:'
@echo
@echo ' GRAPHER ONLY'
@echo ' make up start dev environment via docker-compose and tmux'
@echo ' make down stop any services still running'
@echo ' make refresh (while up) download a new grapher snapshot and update MySQL'
@echo ' make migrate (while up) run any outstanding db migrations'
@echo ' make test run full suite (except db tests) of CI checks including unit tests'
@echo ' make dbtest run db test suite that needs a running mysql db'
@echo ' make svgtest compare current rendering against reference SVGs'
@echo ' make up start dev environment via docker-compose and tmux'
@echo ' make down stop any services still running'
@echo ' make refresh (while up) download a new grapher snapshot and update MySQL'
@echo ' make refresh.pageviews (while up) download and load pageviews from the private datasette instance'
@echo ' make migrate (while up) run any outstanding db migrations'
@echo ' make test run full suite (except db tests) of CI checks including unit tests'
@echo ' make dbtest run db test suite that needs a running mysql db'
@echo ' make svgtest compare current rendering against reference SVGs'
@echo
@echo ' GRAPHER + WORDPRESS (staff-only)'
@echo ' make up.full start dev environment via docker-compose and tmux'
@echo ' make down.full stop any services still running'
@echo ' make refresh.wp download a new wordpress snapshot and update MySQL'
@echo ' make refresh.full do a full MySQL update of both wordpress and grapher'
@echo ' make up.full start dev environment via docker-compose and tmux'
@echo ' make down.full stop any services still running'
@echo ' make refresh.wp download a new wordpress snapshot and update MySQL'
@echo ' make refresh.full do a full MySQL update of both wordpress and grapher'
@echo
@echo ' OPS (staff-only)'
@echo ' make deploy Deploy your local site to production'
@echo ' make stage Deploy your local site to staging'
@echo ' make deploy Deploy your local site to production'
@echo ' make stage Deploy your local site to staging'
@echo

up: export DEBUG = 'knex:query'
@@ -136,6 +137,10 @@ refresh:
@echo '==> Updating grapher database'
@. ./.env && DATA_FOLDER=tmp-downloads ./devTools/docker/refresh-grapher-data.sh

refresh.pageviews:
@echo '==> Refreshing pageviews'
yarn && yarn buildTsc && yarn refreshPageviews

refresh.wp:
@echo '==> Downloading wordpress data'
./devTools/docker/download-wordpress-mysql.sh
7 changes: 6 additions & 1 deletion baker/GrapherBaker.tsx
@@ -28,6 +28,7 @@ import {
getRelatedArticles,
getRelatedCharts,
getRelatedChartsForVariable,
getRelatedResearchAndWritingForVariable,
isWordpressAPIEnabled,
isWordpressDBEnabled,
} from "../db/wpdb.js"
@@ -227,7 +228,7 @@ export async function renderDataPageV2({
}
const datapageData = await getDatapageDataV2(
variableMetadata,
grapherConfigForVariable ?? {}
grapher ?? {}
)

const firstTopicTag = datapageData.topicTagsLinks?.[0]
@@ -272,6 +273,10 @@ export async function renderDataPageV2({
variableId,
grapher && "id" in grapher ? [grapher.id as number] : []
)

datapageData.relatedResearch =
await getRelatedResearchAndWritingForVariable(variableId)

return renderToHtmlPage(
<DataPageV2
grapher={grapher}
37 changes: 36 additions & 1 deletion baker/postUpdatedHook.ts
@@ -7,8 +7,12 @@ import { exit } from "../db/cleanup.js"
import { PostRow } from "@ourworldindata/utils"
import * as wpdb from "../db/wpdb.js"
import * as db from "../db/db.js"
import { buildReusableBlocksResolver } from "../db/syncPostsToGrapher.js"
import {
buildReusableBlocksResolver,
getLinksToAddAndRemoveForPost,
} from "../db/syncPostsToGrapher.js"
import { postsTable, select } from "../db/model/Post.js"
import { PostLink } from "../db/model/PostLink.js"
const argv = parseArgs(process.argv.slice(2))

const zeroDateString = "0000-00-00 00:00:00"
@@ -141,6 +145,37 @@ const syncPostToGrapher = async (
db.knexTable(postsTable).where({ id: postId })
)
)[0]

if (postRow) {
const existingLinksForPost = await PostLink.findBy({
sourceId: wpPost.ID,
})

const { linksToAdd, linksToDelete } = getLinksToAddAndRemoveForPost(
postRow,
existingLinksForPost,
postRow.content,
wpPost.ID
)

// TODO: unify our DB access and then do everything in one transaction
if (linksToAdd.length) {
console.log("linksToAdd", linksToAdd.length)
await PostLink.createQueryBuilder()
.insert()
.into(PostLink)
.values(linksToAdd)
.execute()
}

if (linksToDelete.length) {
console.log("linksToDelete", linksToDelete.length)
await PostLink.createQueryBuilder()
.where("id in (:ids)", { ids: linksToDelete.map((x) => x.id) })
.delete()
.execute()
}
}
return newPost ? newPost.slug : undefined
}
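
The hunk above relies on `getLinksToAddAndRemoveForPost`, whose body is not part of this diff. The following is only a minimal sketch of how such a reconciliation could work, assuming links are compared by their stored fields. The names `LinkLike`, `linkKey` and `diffLinks` are hypothetical and do not come from the PR.

```typescript
// Hypothetical shape, loosely modelled on the columns of posts_links.
interface LinkLike {
    id?: number
    target: string
    queryString: string
    hash: string
    componentType: string
    text: string
}

// Build a comparison key from the fields that identify a link.
// Assumption: two links with identical fields count as the same link.
const linkKey = (link: LinkLike): string =>
    [
        link.target,
        link.queryString,
        link.hash,
        link.componentType,
        link.text,
    ].join("\u0000")

// Return the links that have to be inserted and the ones that have to be deleted
// so that the stored links match the links found in the current post content.
function diffLinks(
    existing: LinkLike[],
    desired: LinkLike[]
): { linksToAdd: LinkLike[]; linksToDelete: LinkLike[] } {
    const existingKeys = new Set(existing.map(linkKey))
    const desiredKeys = new Set(desired.map(linkKey))
    return {
        linksToAdd: desired.filter((l) => !existingKeys.has(linkKey(l))),
        linksToDelete: existing.filter((l) => !desiredKeys.has(linkKey(l))),
    }
}
```

Comparing by key rather than by `id` matters here because links freshly extracted from the post content have no database id yet.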

26 changes: 26 additions & 0 deletions db/migration/1692042923850-AddPostsLinks.ts
@@ -0,0 +1,26 @@
import { MigrationInterface, QueryRunner } from "typeorm"

export class AddPostsLinks1692042923850 implements MigrationInterface {
// [Review comment, Member] A tragedy of alphabetization 🥲 (image attached)

public async up(queryRunner: QueryRunner): Promise<void> {
await queryRunner.query(`-- sql
CREATE TABLE posts_links (
id int NOT NULL AUTO_INCREMENT,
sourceId int NOT NULL,
target varchar(2047) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_cs NOT NULL,
linkType enum('url','grapher','explorer', 'gdoc') CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_cs DEFAULT NULL,
componentType varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_cs NOT NULL,
text varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_cs NOT NULL,
queryString varchar(2047) COLLATE utf8mb4_0900_as_cs NOT NULL,
hash varchar(2047) COLLATE utf8mb4_0900_as_cs NOT NULL,
PRIMARY KEY (id),
KEY sourceId (sourceId),
CONSTRAINT posts_links_ibfk_1 FOREIGN KEY (sourceId) REFERENCES posts (id)
) ENGINE=InnoDB;`)
}

public async down(queryRunner: QueryRunner): Promise<void> {
await queryRunner.query(`-- sql
DROP TABLE IF EXISTS posts_links;
`)
}
}
47 changes: 47 additions & 0 deletions db/model/PostLink.ts
@@ -0,0 +1,47 @@
import { Entity, PrimaryGeneratedColumn, Column, BaseEntity } from "typeorm"
import { formatUrls } from "../../site/formatting.js"
import { Url } from "@ourworldindata/utils"
import { getLinkType, getUrlTarget } from "@ourworldindata/components"

@Entity("posts_links")
export class PostLink extends BaseEntity {
@PrimaryGeneratedColumn() id!: number
// TODO: posts is not a TypeORM but a Knex class so we can't use a TypeORM relationship here yet

@Column({ type: "int", nullable: false }) sourceId!: number

@Column() linkType!: "gdoc" | "url" | "grapher" | "explorer"
@Column() target!: string
@Column() queryString!: string
@Column() hash!: string
@Column() componentType!: string
@Column() text!: string

static createFromUrl({
url,
sourceId,
text = "",
componentType = "",
}: {
url: string
sourceId: number
text?: string
componentType?: string
}): PostLink {
const formattedUrl = formatUrls(url)
const urlObject = Url.fromURL(formattedUrl)
const linkType = getLinkType(formattedUrl)
const target = getUrlTarget(formattedUrl)
const queryString = urlObject.queryStr
const hash = urlObject.hash
return PostLink.create({
target,
linkType,
queryString,
hash,
sourceId,
text,
componentType,
})
}
}
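
`createFromUrl` stores a link as three separate pieces: `target`, `queryString` and `hash`. As a rough illustration of that decomposition, here is a sketch that splits a URL string by hand. It is not the implementation of `Url.fromURL` or `getUrlTarget`, which are not shown in this diff and may normalise the target further. `LinkParts` and `splitLink` are hypothetical names, and keeping the leading `?` and `#` is an assumption.

```typescript
// Hypothetical shape mirroring the target / queryString / hash columns.
interface LinkParts {
    target: string
    queryString: string
    hash: string
}

// Split a URL into the part before the query, the query string and the fragment.
// Assumption: the leading "?" and "#" are kept as part of the stored values.
function splitLink(url: string): LinkParts {
    const hashIndex = url.indexOf("#")
    const hash = hashIndex === -1 ? "" : url.slice(hashIndex)
    const withoutHash = hashIndex === -1 ? url : url.slice(0, hashIndex)

    const queryIndex = withoutHash.indexOf("?")
    const queryString = queryIndex === -1 ? "" : withoutHash.slice(queryIndex)
    const target =
        queryIndex === -1 ? withoutHash : withoutHash.slice(0, queryIndex)

    return { target, queryString, hash }
}

const exampleParts = splitLink(
    "https://ourworldindata.org/grapher/life-expectancy?tab=map#note"
)
console.log(exampleParts)
```

Storing the query string and hash separately from the target makes it possible to match all links to the same chart regardless of which tab or selection they point at.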
49 changes: 49 additions & 0 deletions db/refreshPageviewsFromDatasette.ts
@@ -0,0 +1,49 @@
// db/refreshPageviewsFromDatasette.ts
import fetch from "node-fetch"
import Papa from "papaparse"
import * as db from "./db.js"

async function downloadAndInsertCSV(): Promise<void> {
// [Review comment, Member] Nice! 🙌

const csvUrl = "http://datasette-private/owid/pageviews.csv?_size=max"
const response = await fetch(csvUrl)

if (!response.ok) {
throw new Error(
`Failed to fetch CSV: ${response.statusText} from ${csvUrl}`
)
}

const csvText = await response.text()
const parsedData = Papa.parse(csvText, {
header: true,
})

if (parsedData.errors.length > 1) {
console.error("Errors while parsing CSV:", parsedData.errors)
return
}

const onlyValidRows = [...parsedData.data].filter(
(row) => Object.keys(row as any).length === 5
) as any[]

console.log("Parsed CSV data:", onlyValidRows.length, "rows")
console.log("Columns:", parsedData.meta.fields)

await db.knexRaw("TRUNCATE TABLE pageviews")

await db.knexInstance().batchInsert("pageviews", onlyValidRows)
console.log("CSV data inserted successfully!")
}

const main = async (): Promise<void> => {
try {
await downloadAndInsertCSV()
} catch (e) {
console.error(e)
} finally {
await db.closeTypeOrmAndKnexConnections()
}
}

main()
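
The script keeps only rows with exactly five keys before inserting them. A plausible reason, stated here as an assumption, is that a header-based CSV parse can yield incomplete row objects, for example for a trailing blank line. The sketch below shows that filtering step in isolation. `keepCompleteRows` and the column names are hypothetical, since the real `pageviews` columns are not listed in this diff.

```typescript
// A parsed CSV row: column name -> raw string value.
type CsvRow = Record<string, string>

// Keep only rows that have exactly the expected number of columns.
function keepCompleteRows(rows: CsvRow[], expectedColumns: number): CsvRow[] {
    return rows.filter((row) => Object.keys(row).length === expectedColumns)
}

// Hypothetical sample data; the column names are placeholders.
const sampleRows: CsvRow[] = [
    { day: "2023-10-01", url: "/a", views_7d: "10", views_14d: "20", views_365d: "300" },
    { day: "" }, // incomplete row, e.g. produced by a trailing blank line
]

const completeRows = keepCompleteRows(sampleRows, 5)
console.log(completeRows.length)
```

Passing the expected column count as a parameter avoids hard-coding `5`, so the check does not silently drop every row if the table gains a column.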