-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detection of duplicates in API reference, some to remove #277
Detection of duplicates in API reference, some to remove #277
Conversation
Hi Sadie - thanks for this. I reckon that we could keep |
Sounds good to me. I'll update the PR shortly to get the duplicates other than those removed, in the least appropriate location. |
@davidhassell I have now updated the PR to remove the majority of the rest of the duplicates, so that the output of the script is: $ ./check_docs_api_coverage
WARNING: some methods are listed multiple times inside one class file/page. Decide if the duplicates are intended and if not remove them. They are:
cfdm.Data.sum
cfdm.Data.unique
All non-private methods are documented in line with your idea to leave just those two under So this should be ready for re-review. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Sadie, looks good, thanks. I have a question about the partial matches, though ...
# so we must account for that. Checking next character | ||
# of duplicate(s) is something other than a newline or | ||
# whitespace, seems robust and simplest. | ||
end_loc = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume I'm missing something here, but count
doesn't count partial matches, so why is this needed?
>>> a = ['abc', 'abcd', 'abcde']
>>> a.count('abc')
1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's needed because what's being done is count from one larger string, being the string read from the full file, rather than a list of sub-strings as per your example, in which case we have as an equivalent:
>>> a = "abc abcd abcde"
>>> a.count("abc")
3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless, if you can think of a more elegant way to go about the logic I've added, I am happy to hear. I might be able to think of an alternative myself if I mull over it, but since it's a convenience script for automating checks I am satisfied as long as it works and isn't (too) slow 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good - thanks for the explanation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following the short discussion, I now understand, and therefore approve, all of these changes. Thanks, Sadie.
Took a slight tangent from my reviewing of #272 when considering API reference coverage and spotting there are some methods listed multiple times in the same file or even under the same sub-heading.
I thought I'd make a small update to the
check_docs_api_coverage
script so we can automate the detection of those when we are already checking that they are all present. So with this change we are checking they are all present precisely once and not just not zero times like before (sub-string method name complications are dealt with - see code comments), unless we've deliberately chosen to duplicate them under different headings. A warning is raised reporting any duplicates for consideration.I've removed any duplication that seems unintended in the API reference source restructured text, thought left in some cases where it looks like we might want to put methods under multiple headings. Namely, at PR opening the script still reports these which I haven't touched:
as opposed to, previously, a longer list:
Happy to remove those duplicates from the first snippet too, though? But the main thing is the automation of the check and that I've removed methods duplicated under the same heading, which can be blitzed to tidy.