-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HelpDot] detect infinite redirect cycles in redirects.csv #40434
Changes from 23 commits
f31b294
7b8f8b9
a640d41
104e9e4
6637db8
26d8d5e
22795da
1972309
32c569c
22075e5
033e6c0
e430bb0
458bc35
08ee704
da44110
bfe2096
0232f44
409b1c1
eb6c38e
08e2545
2b5015f
3295539
1f2abbd
188b68d
307508e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,64 @@ | ||||||||||
import {parse} from 'csv-parse'; | ||||||||||
import fs from 'fs'; | ||||||||||
|
||||||||||
const parser = parse(); | ||||||||||
const adjacencyList: Record<string, string[]> = {}; | ||||||||||
|
||||||||||
function addEdge(source: string, target: string) { | ||||||||||
if (!adjacencyList[source]) { | ||||||||||
adjacencyList[source] = []; | ||||||||||
} | ||||||||||
adjacencyList[source].push(target); | ||||||||||
} | ||||||||||
|
||||||||||
function isCyclic(currentNode: string, visited: Map<string, boolean>, backEdges: Map<string, boolean>): boolean { | ||||||||||
visited.set(currentNode, true); | ||||||||||
backEdges.set(currentNode, true); | ||||||||||
|
||||||||||
// Do a depth first search for all the neighbours. If a node is found in backedge, a cycle is detected. | ||||||||||
const neighbours = adjacencyList[currentNode]; | ||||||||||
if (neighbours) { | ||||||||||
for (const node of neighbours) { | ||||||||||
if (!visited.has(node)) { | ||||||||||
if (isCyclic(node, visited, backEdges)) { | ||||||||||
return true; | ||||||||||
} | ||||||||||
} else if (backEdges.has(node)) { | ||||||||||
return true; | ||||||||||
} | ||||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
backEdges.delete(currentNode); | ||||||||||
|
||||||||||
return false; | ||||||||||
} | ||||||||||
|
||||||||||
function detectCycle(): boolean { | ||||||||||
const visited: Map<string, boolean> = new Map<string, boolean>(); | ||||||||||
const backEdges: Map<string, boolean> = new Map<string, boolean>(); | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why not make them global variables? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. no particular reason. done! |
||||||||||
|
||||||||||
for (const [node] of Object.entries(adjacencyList)) { | ||||||||||
if (!visited.has(node)) { | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. linter does not like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't like that lint rule tbh, but this is fine for now |
||||||||||
if (isCyclic(node, visited, backEdges)) { | ||||||||||
const cycle = Array.from(backEdges.keys()); | ||||||||||
console.log(`Infinite redirect found in the cycle: ${cycle.join(' -> ')} -> ${node}`); | ||||||||||
return true; | ||||||||||
} | ||||||||||
} | ||||||||||
} | ||||||||||
return false; | ||||||||||
} | ||||||||||
|
||||||||||
fs.createReadStream(`${process.cwd()}/docs/redirects.csv`) | ||||||||||
.pipe(parser) | ||||||||||
.on('data', (row) => { | ||||||||||
// Create a directed graph of sourceURL -> targetURL | ||||||||||
addEdge(row[0], row[1]); | ||||||||||
}) | ||||||||||
.on('end', () => { | ||||||||||
if (detectCycle()) { | ||||||||||
process.exit(1); | ||||||||||
} | ||||||||||
process.exit(0); | ||||||||||
}); |
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you could safely remove this if statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there will be nodes that have no neighbours because this is a directed graph.
eg:
Graph will look like this
urlA -> urlB
urlB
has no neighbours that we can go to. So we'll try to iterate over undefined in next line