Commit

Merge branch 'feat/migration-MT-update' into test-mt-dsfr
m-maillot committed Oct 18, 2024
2 parents 68c8baf + b7eb204 commit 3cda874
Showing 9 changed files with 2,153 additions and 5,939 deletions.
1 change: 0 additions & 1 deletion .github/workflows/release.yml
@@ -1,7 +1,6 @@
name: Release
on:
schedule:
# * is a special character in YAML so you have to quote this string
- cron: "00 21 * * *"
repository_dispatch:
types: manual_release
28 changes: 28 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,31 @@
# [4.700.0](https://github.com/SocialGouv/fiches-travail-data/compare/v4.699.0...v4.700.0) (2024-09-24)


### Features

* disable scheduled released until new version ([#422](https://github.com/SocialGouv/fiches-travail-data/issues/422)) ([3a072b6](https://github.com/SocialGouv/fiches-travail-data/commit/3a072b612301caee576a8d8a16014001982948bc))

# [4.699.0](https://github.com/SocialGouv/fiches-travail-data/compare/v4.698.0...v4.699.0) (2024-09-20)


### Features

* **data:** 20240920_2121 update ([0f16a9f](https://github.com/SocialGouv/fiches-travail-data/commit/0f16a9f015660ff8daa32e12b1a49673ede2f500))

# [4.698.0](https://github.com/SocialGouv/fiches-travail-data/compare/v4.697.0...v4.698.0) (2024-09-19)


### Features

* **data:** 20240919_2120 update ([ccc4e65](https://github.com/SocialGouv/fiches-travail-data/commit/ccc4e65adf3ceb9d16155990b01fc0b278f2bb28))

# [4.697.0](https://github.com/SocialGouv/fiches-travail-data/compare/v4.696.0...v4.697.0) (2024-09-17)


### Features

* **data:** 20240917_2121 update ([58020e9](https://github.com/SocialGouv/fiches-travail-data/commit/58020e93ae4f8e0b41180775d621ff43d4bedac6))

# [4.696.0](https://github.com/SocialGouv/fiches-travail-data/compare/v4.695.0...v4.696.0) (2024-09-15)


578 changes: 155 additions & 423 deletions data/fiches-travail.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
 {
   "name": "@socialgouv/fiches-travail-data",
-  "version": "4.696.0",
+  "version": "4.700.0",
   "main": "build/index.js",
   "module": "build/index.js",
   "files": [
1,494 changes: 904 additions & 590 deletions src/fetch-data/__tests__/__snapshots__/parseDom.test.ts.snap

Large diffs are not rendered by default.

3,901 changes: 0 additions & 3,901 deletions src/fetch-data/__tests__/agents-chimiques-dangereux-acd.html

This file was deleted.

2,027 changes: 1,015 additions & 1,012 deletions src/fetch-data/__tests__/article-complex-html.html

Large diffs are not rendered by default.

29 changes: 26 additions & 3 deletions src/fetch-data/__tests__/parseDom.test.ts

Large diffs are not rendered by default.

32 changes: 24 additions & 8 deletions src/fetch-data/parseDom.js
@@ -136,6 +136,16 @@ export const textClean = (text, noNbsp = false) => {
     .trim();
 };

+const duplicateContent = (sections, highlight) => {
+  if (highlight) {
+    return (
+      sections.find((section) => highlight.text.includes(section.text)) !==
+      undefined
+    );
+  }
+  return false;
+};
+
 function parseHTMLSections(dom) {
   const document = dom.window.document;

@@ -180,7 +190,14 @@ function parseHTMLSections(dom) {
     sections.push(section);
   });

-  if (sections.find((section) => section.html === "")) {
+  const cleanSections = sections.map((section) => ({
+    ...section,
+    // Sometimes, we have all the html in a section
+    // We check a second times and delete HTML from the h2 found
+    // (H2 should not be in a section)
+    html: removeExtraH2(section.html),
+  }));
+  if (cleanSections.find((section) => section.html === "")) {
     return [
       {
         title: "Contenu",
@@ -189,13 +206,9 @@ function parseHTMLSections(dom) {
       },
     ];
   }
-  return sections.map((section) => ({
-    ...section,
-    // Sometimes, we have all the html in a section
-    // We check a second times and delete HTML from the h2 found
-    // (H2 should not be in a section)
-    html: removeExtraH2(section.html),
-  }));
+  if (cleanSections) {
+    return cleanSections;
+  }
 }

 const removeExtraH2 = (html) => {
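
The two `parseHTMLSections` hunks move the `removeExtraH2` pass ahead of the empty-section check, so a section that only becomes empty once its stray `<h2>` content is stripped now triggers the single "Contenu" fallback. The sketch below illustrates why the order matters. The body of `removeExtraH2` is not shown in this diff, so the version here is a hypothetical stand-in that simply drops everything from the first `<h2` onward; `hasEmptySection` and the sample data are likewise illustrative only.

```javascript
// Hypothetical stand-in: the real removeExtraH2 is not visible in this diff.
// This version cuts the HTML at the first "<h2" it finds.
const removeExtraH2 = (html) => {
  const index = html.indexOf("<h2");
  return index === -1 ? html : html.slice(0, index);
};

// Illustrative helper mirroring the `find(... html === "")` test in the diff.
const hasEmptySection = (sections) =>
  sections.find((section) => section.html === "") !== undefined;

// Made-up sample: the second section consists entirely of a nested <h2> block.
const sections = [
  { title: "Intro", html: "<p>Intro</p>" },
  { title: "Wrapper", html: "<h2>Next</h2><p>Body</p>" },
];

// Old order: the check runs on the raw sections and finds nothing empty.
const emptyBefore = hasEmptySection(sections);

// New order: clean first, then check; the second section is now empty.
const cleanSections = sections.map((section) => ({
  ...section,
  html: removeExtraH2(section.html),
}));
const emptyAfter = hasEmptySection(cleanSections);

console.log(emptyBefore, emptyAfter); // false true
```

Note that in the committed code `cleanSections` is always an array, so the final `if (cleanSections)` guard is always truthy and behaves like a plain `return cleanSections;`.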
@@ -329,6 +342,9 @@ export function parseDom(dom, id, url) {
   let sections = parseHTMLSections(dom);

   const highlight = parseHighlight(dom);
+  if (duplicateContent(sections, highlight)) {
+    sections = [];
+  }
   if (highlight) {
     sections.unshift(highlight);
   }
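
The effect of the new `duplicateContent` guard in `parseDom` can be sketched as follows. `duplicateContent` is copied from the hunk above; `mergeHighlight` is a hypothetical helper that mirrors the three `parseDom` statements, and the sample objects are invented for illustration. The sketch assumes `parseHighlight` returns a falsy value when no highlight exists, which this diff does not confirm.

```javascript
// Copied from the diff: true when at least one section's text is contained
// in the highlight's text.
const duplicateContent = (sections, highlight) => {
  if (highlight) {
    return (
      sections.find((section) => highlight.text.includes(section.text)) !==
      undefined
    );
  }
  return false;
};

// Hypothetical helper mirroring the parseDom lines in the hunk above.
// It works on a copy so the caller's array is not mutated.
const mergeHighlight = (sections, highlight) => {
  let result = [...sections];
  if (duplicateContent(result, highlight)) {
    result = [];
  }
  if (highlight) {
    result.unshift(highlight);
  }
  return result;
};

// Invented sample data.
const highlight = { title: "En bref", text: "Summary. Full body text." };
const duplicated = [{ title: "Contenu", text: "Full body text." }];
const distinct = [{ title: "Contenu", text: "Something else entirely." }];

console.log(mergeHighlight(duplicated, highlight).length); // 1 (highlight only)
console.log(mergeHighlight(distinct, highlight).length); // 2 (highlight + section)
console.log(mergeHighlight(distinct, undefined).length); // 1 (section only)

// Edge case: String.prototype.includes("") is always true, so a section
// with empty text counts as a duplicate of any highlight.
console.log(duplicateContent([{ title: "Vide", text: "" }], highlight)); // true
```

One consequence worth noting: a single matching section is enough to discard every section, including ones whose text does not appear in the highlight.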