When developing signatures for text-based formats, it would be useful to have a built-in way to handle whitespace, and potentially line breaks as well.
Many programming languages are whitespace-agnostic: whitespace does not affect how the program is processed. Python is one exception.
Consider the following excerpts from files in the Smart Game Format (SGF, https://www.red-bean.com/sgf/):
(
;GM[1]FF[3]
(;GM[1]FF[3]
( ;GM[1]FF[3]
Each file contains the same code. However, the first example has a line break (and possibly other whitespace) after the opening parenthesis, the second example has no whitespace, and the third example has a single space after the opening parenthesis.
Functionally, all three excerpts are valid (as they would be with HTML, Perl, etc.), but the PRONOM signatures for all three would be different.
I'm thinking of a new signature value which indicates "some number of blank spaces, tabs, and/or linebreaks here".
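To illustrate the semantics I have in mind, here is a rough Python sketch. It uses an ordinary regular expression, not PRONOM signature syntax, and the names `WS`, `SGF_START`, and `looks_like_sgf` are made up for this example. The `WS` fragment stands in for the proposed "any amount of whitespace" value.

```python
import re

# Hypothetical stand-in for the proposed signature value:
# zero or more spaces, tabs, carriage returns, or line feeds.
WS = rb"[ \t\r\n]*"

# Opening parenthesis, optional whitespace, then the SGF properties
# taken from the three excerpts above.
SGF_START = re.compile(rb"\(" + WS + rb";GM\[1\]FF\[3\]")


def looks_like_sgf(data: bytes) -> bool:
    """Return True if data begins with the whitespace-tolerant pattern."""
    return SGF_START.match(data) is not None


# The three excerpts from this issue, as byte strings.
samples = [
    b"(\n;GM[1]FF[3]",  # line break after the parenthesis
    b"(;GM[1]FF[3]",    # no whitespace
    b"( ;GM[1]FF[3]",   # single space after the parenthesis
]
results = [looks_like_sgf(s) for s in samples]
```

With a value like this, a single pattern would cover all three excerpts, instead of one fixed byte sequence per whitespace variant.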
Does this make sense, or am I missing an easier way to create signatures that cover all of the above possibilities (plus all the variations allowed in many text-based formats)?