-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Review MFTINDX.py
with mypy --strict
#67
Review MFTINDX.py
with mypy --strict
#67
Conversation
The line `from indxparse.MFT import *` obscured symbol definitions from `mypy`, and is also flagged as a code style issue by `flake8`. This patch makes the symbol imports explicit. In addition to the unit tests of `make check`, this patch is written to pass `flake8` review when tested with the following: ```bash flake8 --ignore=E501,W503 indxparse/MFTINDX.py ``` (`flake8` is intentionally not proposed for addition to CI yet due to needing to address other issues.) Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose. Signed-off-by: Alex Nelson <[email protected]>
This was found while reviewing variable name re-use. Signed-off-by: Alex Nelson <[email protected]>
`mypy` assigns a type to a symbol on its first encounter within a function body. When the symbol is reused later, even if for a whole new purpose, the original name assignment is preserved. If the type is incongruous, a type error is raised. Reviewing `MFTINDX.py` with `mypy --strict` showed several instances of variable re-use (including in `for` loops), with reassignment to different types. This patch renames symbols to avoid ambiguity. Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose. Signed-off-by: Alex Nelson <[email protected]>
Since `MFTRecord.standard_information()` could fail to find the SI Attribute, a block of logic that assumed the SI would be found is revised in this patch to handle the not-found case. Signed-off-by: Alex Nelson <[email protected]>
Signed-off-by: Alex Nelson <[email protected]>
Since `MFTRecord.filename_information()` could fail to find a FN Attribute, a line of output that assumed the FN would be found is revised in this patch to handle the not-found case. Signed-off-by: Alex Nelson <[email protected]>
Type-review flagged that no method `Runlist.entries` exists. Signed-off-by: Alex Nelson <[email protected]>
One of the methods flagged as needing type annotations by `mypy --strict indxparse/MFTINDX.py` is `NTFSFile.mft_get_record_buf`. This method is written in a way that, for one unfamiliar with calling context, could return a potentially null result. Review of the code paths to this function show that, at present, an `array.array` will be returned. But, `MFT.py` would require significant rearchitecting (likely, subclassing `NTFSFile`) to guarantee that. Rather than return `Optional[array]`, this patch adds a `raise ValueError` in accordance with one usage in `MFTINDX.py`. Handling `None` looked possible, but more complex, than a blunt fail, due to its usage in the recursive path-constructing function `mft_record_build_path`. Some of the other type additions in this patch are side effects from `mypy` flagging needs, and some are side effects from reviewing the call paths to `mft_get_record_buf`. Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose. Signed-off-by: Alex Nelson <[email protected]>
These signatures were added to reduce type-issues reported by `mypy --strict indxparse/MFTINDX.py`. `MFT.py` typically only needed return types on methods to satisfy this objective. As a progress note: some hundreds of issues remain if `mypy --strict` is run against `MFT.py`. One less-obvious signature revision happened: `mft_get_record_by_path` now returns `Optional[MFTRecord]` instead of `Union[MFTRecord, bool]`, to remain consistent with the control flow around its sole caller, in `MFTINDX.py`. Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose. A follow-on patch will add further signatures in `MFTINDX.py`. Signed-off-by: Alex Nelson <[email protected]>
With this patch, `MFTINDX.py` passes `mypy --strict`. A follow-on patch will add this status to the testing. Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose. Signed-off-by: Alex Nelson <[email protected]>
This is in partial satisfaction of adding mypy type checking, noted on Issue 38. Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose. References: * williballenthin#38 Signed-off-by: Alex Nelson <[email protected]>
This is a continuation of the coding style trialed in Pull Request 62. References: * williballenthin#62 Signed-off-by: Alex Nelson <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tons of good changes here. thanks for your patience and attention to detail as you modernize this old code!
|
||
self.declare_field( | ||
"qword", "child_vcn", align(self.current_field_offset(), 0x8) | ||
) | ||
|
||
def filename_information(self): | ||
def filename_information(self) -> FilenameAttribute: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the type hints are nice additions. thanks!
@@ -1159,14 +1172,14 @@ class MFT_RECORD_FLAGS: | |||
MFT_RECORD_IS_DIRECTORY = 0x2 | |||
|
|||
|
|||
def MREF(mft_reference): | |||
def MREF(mft_reference) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def MREF(mft_reference) -> int: | |
def MREF(mft_reference: int) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this suggestion, by the way. I recall that mypy --strict
used to require that arguments in the information flow path would also be required to be annotated, but it looks like the review strictness now only reaches return types when checking imported modules. I purposefully left this here to see if it was a quirk of my environment, but, nope.
""" | ||
Given a MREF/mft_reference, return the record number part. | ||
""" | ||
return mft_reference & 0xFFFFFFFFFFFF | ||
|
||
|
||
def MSEQNO(mft_reference): | ||
def MSEQNO(mft_reference) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def MSEQNO(mft_reference) -> int: | |
def MSEQNO(mft_reference: int) -> int: |
@@ -141,14 +171,21 @@ def record_bodyfile(ntfsfile, record, inode=None, attributes=None): | |||
return ret | |||
|
|||
|
|||
def node_header_bodyfile(options, node_header, basepath): | |||
def node_header_bodyfile( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much better, thank you
offset: int, | ||
prefix: str, | ||
progress: bool, | ||
slack: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
having this many parameters is rather smelly, but it was equivalent and simply obfuscated by options
before.
it's interesting to reread the code i wrote a decade ago and see what i agree and disagree with today.
This patch series has three primary objectives:
mypy --strict
review of${top_srcdir}/indxparse
, identifying an exemplar script.MFTINDX.py
passmypy --strict
.options
(argparse.Namespace
) argument-passing style fromMFTINDX.py
.Some bugs were fixed as type review was occurring. For instance, some variables that were reused in longer functions, with different types at different points, instigated variable renames, and highlighted an old copy-paste error obscured by a variable name persisting after a loop. Also, some code branches were updated to handle returned
None
s from functions not previously highlighted as returning someOptional
value.Each patch's log message describes its purpose.
Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.