-
Notifications
You must be signed in to change notification settings - Fork 607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
helpers for rkey and tid syntax; validate rkey at record creation time #1738
Changes from 4 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
|
||
# not base32 | ||
3jzfcijpj2z21 | ||
0000000000000 | ||
|
||
# too long/short | ||
3jzfcijpj2z2aa | ||
3jzfcijpj2z2 | ||
|
||
# old dashes syntax not actually supported (TTTT-TTT-TTTT-CC) | ||
3jzf-cij-pj2z-2a | ||
|
||
# high bit can't be high | ||
zzzzzzzzzzzzz | ||
kjzfcijpj2z2a |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# 13 digits | ||
# 234567abcdefghijklmnopqrstuvwxyz | ||
|
||
3jzfcijpj2z2a | ||
7777777777777 | ||
3zzzzzzzzzzzz |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
export const ensureValidRecordKey = (rkey: string): void => { | ||
if (rkey.length > 512 || rkey.length < 1) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i'm pretty tempted to limit this to less than 512 I think we talked about 16 at one point (tho i think that's too low). but maybe 64? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not totally against that, but there are a couple things to discuss. I opened an issue in the specs repo: bluesky-social/atproto-website#250 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. cool cool - that doesn't have to hold this up 👍 |
||
throw new InvalidRecordKeyError('record key must be 1 to 512 characters') | ||
} | ||
// simple regex to enforce most constraints via just regex and length. | ||
if (!/^[a-zA-Z0-9_~.-]{1,512}$/.test(rkey)) { | ||
throw new InvalidRecordKeyError('record key syntax not valid (regex)') | ||
} | ||
if (rkey == '.' || rkey == '..') | ||
throw new InvalidRecordKeyError('record key can not be "." or ".."') | ||
} | ||
|
||
export const isValidRecordKey = (rkey: string): boolean => { | ||
try { | ||
ensureValidRecordKey(rkey) | ||
} catch (err) { | ||
if (err instanceof InvalidRecordKeyError) { | ||
return false | ||
} | ||
throw err | ||
} | ||
|
||
return true | ||
} | ||
|
||
export class InvalidRecordKeyError extends Error {} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
export const ensureValidTid = (tid: string): void => { | ||
if (tid.length != 13) { | ||
throw new InvalidTidError('TID must be 13 characters') | ||
} | ||
// simple regex to enforce most constraints via just regex and length. | ||
if (!/^[234567abcdefghij][234567abcdefghijklmnopqrstuvwxyz]{12}$/.test(tid)) { | ||
throw new InvalidTidError('TID syntax not valid (regex)') | ||
} | ||
} | ||
|
||
export const isValidTid = (tid: string): boolean => { | ||
try { | ||
ensureValidTid(tid) | ||
} catch (err) { | ||
if (err instanceof InvalidTidError) { | ||
return false | ||
} | ||
throw err | ||
} | ||
|
||
return true | ||
} | ||
|
||
export class InvalidTidError extends Error {} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../../../../interop-test-files/syntax/recordkey_syntax_invalid.txt |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../../../../interop-test-files/syntax/recordkey_syntax_valid.txt |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../../../../interop-test-files/syntax/tid_syntax_invalid.txt |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../../../../interop-test-files/syntax/tid_syntax_valid.txt |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
import { ensureValidRecordKey, InvalidRecordKeyError } from '../src' | ||
import * as readline from 'readline' | ||
import * as fs from 'fs' | ||
|
||
describe('recordkey validation', () => { | ||
const expectValid = (r: string) => { | ||
ensureValidRecordKey(r) | ||
} | ||
const expectInvalid = (r: string) => { | ||
expect(() => ensureValidRecordKey(r)).toThrow(InvalidRecordKeyError) | ||
} | ||
|
||
it('conforms to interop valid recordkey', () => { | ||
const lineReader = readline.createInterface({ | ||
input: fs.createReadStream( | ||
`${__dirname}/interop-files/recordkey_syntax_valid.txt`, | ||
), | ||
terminal: false, | ||
}) | ||
lineReader.on('line', (line) => { | ||
if (line.startsWith('#') || line.length == 0) { | ||
return | ||
} | ||
expectValid(line) | ||
}) | ||
}) | ||
|
||
it('conforms to interop invalid recordkeys', () => { | ||
const lineReader = readline.createInterface({ | ||
input: fs.createReadStream( | ||
`${__dirname}/interop-files/recordkey_syntax_invalid.txt`, | ||
), | ||
terminal: false, | ||
}) | ||
lineReader.on('line', (line) => { | ||
if (line.startsWith('#') || line.length == 0) { | ||
return | ||
} | ||
expectInvalid(line) | ||
}) | ||
}) | ||
}) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
import { ensureValidTid, InvalidTidError } from '../src' | ||
import * as readline from 'readline' | ||
import * as fs from 'fs' | ||
|
||
describe('tid validation', () => { | ||
const expectValid = (t: string) => { | ||
ensureValidTid(t) | ||
} | ||
const expectInvalid = (t: string) => { | ||
expect(() => ensureValidTid(t)).toThrow(InvalidTidError) | ||
} | ||
|
||
it('conforms to interop valid tid', () => { | ||
const lineReader = readline.createInterface({ | ||
input: fs.createReadStream( | ||
`${__dirname}/interop-files/tid_syntax_valid.txt`, | ||
), | ||
terminal: false, | ||
}) | ||
lineReader.on('line', (line) => { | ||
if (line.startsWith('#') || line.length == 0) { | ||
return | ||
} | ||
expectValid(line) | ||
}) | ||
}) | ||
|
||
it('conforms to interop invalid tids', () => { | ||
const lineReader = readline.createInterface({ | ||
input: fs.createReadStream( | ||
`${__dirname}/interop-files/tid_syntax_invalid.txt`, | ||
), | ||
terminal: false, | ||
}) | ||
lineReader.on('line', (line) => { | ||
if (line.startsWith('#') || line.length == 0) { | ||
return | ||
} | ||
expectInvalid(line) | ||
}) | ||
}) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is coarse and lexicon-unaware, I wonder if we should push this check down into the repo code. If the PDS is lexicon-aware, it could enforce the stricter key type check as a part of record validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to me in terms of correctness.
repo
: I'm not super confident what the blast radius of impact would be of verifying NSID+RecordKey would be in, eg,repo/src/util.ts
parseDataKey()
. That seems to run both on writes and reads? So could break things like the existing with-:
record. We could initially check only in the create/write path, maybe creating a separateparseDataKeyPermissive
as a migration/transition thing.PDS
: I expect PDS to be lexicon-aware enough to do generic Lexicon validation, possibly by fetching/resolving (and caching) new Lexicon files in the future. For that, passingrkey
to the "validate record" function would make sense to me.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dholms any feels on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a little bit more comfortable living here for the time being.
We don't do NSID validation on collection names in the repo code & that strikes me as similar to this.
So far the repo/MST is pretty unaware of these syntatic things & is primarily concerned with the tree structure & cbor encoding. Though I think there's a good argument for pushing some more lexicon/syntax awareness into it.
In short, I'd leave it here for now but consider adjusting where we draw the abstraction boundary