Ability to check for inconsistent use of capitalization #25
Comments
Processing all words, then all two-word sequences, would be quite intensive. Is it safe, for example, to ignore terms shorter than 4 characters? |
Do you need to check more than two adjacent words? |
For the most part. I would like "NUL" vs "Nul" or "null", and "ID", "Id", and "id" to be flagged. Terms of three letters or fewer with unusual capitalization tend to be abbreviations.
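To make the request concrete, here is a minimal sketch of how single-word variants such as "ID", "Id", and "id" could be grouped. It is not code from this project; the function name `case_variants` and the letters-only tokenization are assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

def case_variants(text):
    """Group words by lowercase form; keep groups with more than one spelling."""
    groups = defaultdict(Counter)
    # Assumption: a "word" is a run of ASCII letters, so short terms are kept.
    for word in re.findall(r"[A-Za-z]+", text):
        groups[word.lower()][word] += 1
    return {key: dict(forms) for key, forms in groups.items() if len(forms) > 1}

sample = "The ID field holds the id. A NUL byte ends the Id string."
print(case_variants(sample))
```

On this sample the sketch reports "id" with three spellings, but it also reports "The"/"the", so sentence-initial capitals would need separate handling. It would not pair "NUL" with "null" either, because the two differ in spelling and not only in case; that pairing would need an explicit alias list.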
Field names and response codes can have more than two capitalized words (e.g., Destination Connection ID Length field, Proxy Authentication Required). It would be helpful if the variations like the following were flagged:
|
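For multi-word terms such as the field names and response codes mentioned above, one possible approach is to compare word n-grams case-insensitively. The sketch below is only an illustration under that assumption; `ngram_variants` is a hypothetical name, and the sample sentence is made up.

```python
import re
from collections import Counter, defaultdict

def ngram_variants(text, n):
    """Group n-word sequences by lowercase form; keep those spelled more than one way."""
    words = re.findall(r"[A-Za-z]+", text)
    groups = defaultdict(Counter)
    for i in range(len(words) - n + 1):
        gram = " ".join(words[i:i + n])
        groups[gram.lower()][gram] += 1
    return {key: dict(forms) for key, forms in groups.items() if len(forms) > 1}

sample = ("A server may send Proxy Authentication Required. "
          "Clients handle proxy authentication required differently.")
print(ngram_variants(sample, 3))
```

Running this for n = 2, 3, 4 would cover longer names, at the cost of the extra processing raised earlier in the thread. The sketch ignores sentence boundaries and will also report shorter n-grams contained in a longer inconsistent term.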
This is an interesting problem, and is probably solvable, but it's not currently solved in any code we have. |
I don't know if it would help to stop searching once an all-lowercase word is found. The example above would then be
That doesn't matter. Checking terms for caps consistency is something we do manually now. For example, if I see both "Id" and "ID" used, I use case-sensitive search (on singular constructions, which also finds plurals) to get counts and see how the terms are used. If there are a lot of "ID"s but only a couple of "Id"s, then I'll check if I can update those "Id"s to "ID". If the use is mixed (e.g., 22 "ID"s, 24 "Id"s), then I'll ask authors for guidance. |
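The manual decision described above could be sketched roughly as follows. The `classify` function and the 0.8 dominance threshold are assumptions for illustration, not an RPC rule; the first set of counts is made up, and the second is the mixed example from the comment.

```python
def classify(counts, dominance=0.8):
    """Suggest an action for one term, given a {spelling: count} mapping."""
    if len(counts) < 2:
        return ("consistent", None)
    total = sum(counts.values())
    top_form, top_count = max(counts.items(), key=lambda item: item[1])
    # Assumption: one spelling "dominates" when it holds at least 80% of uses.
    if top_count / total >= dominance:
        return ("update", top_form)
    return ("ask authors", None)

print(classify({"ID": 30, "Id": 2}))   # made-up counts: ('update', 'ID')
print(classify({"ID": 22, "Id": 24}))  # mixed example: ('ask authors', None)
```

A report that lists the counts next to the suggested action would leave the final call with the editor, which matches how the check is done by hand today.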
I think there is something to discuss on plurals that does matter. If the entire document is |
One option would be to have a set list of common words to check for, rather than processing every possible word or term sequence in the document. |
It would be great to have this:
It would still be helpful to have this
RFC terminology is incredibly diverse, and there are WG 'dialects' where one working group may develop a capitalization style that may not be used in another working group. The RPC strives for consistency within a doc and within clusters, but it can be difficult to make documents consistent across time and areas. We have working style sheets for documents that we edit, but we don't currently capture the final decisions, just the questions. So we don't have great data for building a checklist. It would be awesome if we could save final style sheets and use them as checklists in the future. For instance, while editing a document that normatively references RFC NNNN, you could load the rfcNNNN style sheet (and the style sheets of all the RFCs that your doc normatively references) and check your document's capitalization against those. |
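If final style sheets were saved, checking a document against one might look something like the sketch below. No such style sheet format exists yet according to this thread, so the plain list of preferred spellings and the name `check_against_style_sheet` are purely hypothetical.

```python
import re

def check_against_style_sheet(text, preferred_terms):
    """Return {preferred spelling: [other spellings found in text]}."""
    findings = {}
    for term in preferred_terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        # Case-insensitive search, then keep only matches that differ in case.
        others = [m for m in re.findall(pattern, text, flags=re.IGNORECASE) if m != term]
        if others:
            findings[term] = others
    return findings

style_sheet = ["Connection ID", "NUL"]  # hypothetical saved decisions
text = "Each connection id is followed by a NUL byte. The Connection ID is opaque."
print(check_against_style_sheet(text, style_sheet))
```

Style sheets from several normatively referenced RFCs could be merged into one list before the check, though conflicts between working group "dialects" would still need a human decision.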
I think you answered what you would want if a document contained both of the example sentences I gave, but I don't understand the output numbers. |
I made up the counts in my example above. |
Just like #24, but the report captures inconsistent capitalization of terms within the text. For example:
Note that the RPC uses title case for section titles, so if a term is only capitalized in a heading, this would not be considered inconsistent and does not need to be highlighted.
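One way to honor the heading exception is to leave section titles out of the counts entirely. The sketch below assumes the document has already been split into hypothetical `(kind, text)` pairs, where `kind` is either `"heading"` or `"body"`; how that split is produced is outside the sketch.

```python
import re
from collections import Counter, defaultdict

def body_variants(blocks):
    """Find case variants using body text only, so title-case headings are ignored."""
    groups = defaultdict(Counter)
    for kind, text in blocks:
        if kind == "heading":
            continue  # title case in headings is expected, not an inconsistency
        for word in re.findall(r"[A-Za-z]+", text):
            groups[word.lower()][word] += 1
    return {key: dict(forms) for key, forms in groups.items() if len(forms) > 1}

blocks = [
    ("heading", "Connection Migration"),
    ("body", "During connection migration, the endpoint probes a path."),
]
print(body_variants(blocks))  # {} because the capitalized form appears only in the heading
```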