New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

converted elf_analysis plugin to new base class #1266

Open

jstucke wants to merge 1 commit into master from elf-analysis-plugin-v1

Collaborator

jstucke commented Sep 11, 2024

No description provided.

jstucke requested a review from maringuu

September 11, 2024 15:39

jstucke self-assigned this

jstucke force-pushed the elf-analysis-plugin-v1 branch from c14b112 to b307c7e Compare

September 12, 2024 11:32

jstucke force-pushed the elf-analysis-plugin-v1 branch from b307c7e to afec839 Compare

December 3, 2024 08:46

codecov-commenter commented Dec 3, 2024 •

edited

Loading

Codecov Report

Attention: Patch coverage is 99.44134% with 1 line in your changes missing coverage. Please review.

Project coverage is 91.86%. Comparing base (ab95ac5) to head (5b1d342).
Report is 5 commits behind head on master.

Files with missing lines	Patch %	Lines
...alysis/hardware_analysis/code/hardware_analysis.py	66.66%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1266      +/-   ##
==========================================
- Coverage   92.48%   91.86%   -0.62%     
==========================================
  Files         379      376       -3     
  Lines       24115    21044    -3071     
==========================================
- Hits        22302    19332    -2970     
+ Misses       1813     1712     -101

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

maringuu requested changes

View reviewed changes

src/plugins/analysis/elf_analysis/code/elf_analysis.py Outdated Show resolved Hide resolved

src/helperFunctions/hash.py Outdated Show resolved Hide resolved

src/plugins/analysis/elf_analysis/code/elf_analysis.py Outdated Show resolved Hide resolved

src/plugins/analysis/elf_analysis/code/elf_analysis.py Outdated Show resolved Hide resolved

jstucke force-pushed the elf-analysis-plugin-v1 branch from 700a7d4 to 64db148 Compare

December 31, 2024 13:24

jstucke force-pushed the elf-analysis-plugin-v1 branch from 64db148 to cb519ee Compare

January 8, 2025 16:54

Collaborator Author

jstucke commented Jan 8, 2025 •

edited

Loading

I made some additional changes:

"libraries" were missing from the result
fixed the _get_elf_analysis_libraries() method to work with the new result structure
deduplicated imports and exports (for whatever reason, they can sometimes contain duplicates in lief)

maringuu requested changes

View reviewed changes

src/plugins/analysis/elf_analysis/code/elf_analysis.py Outdated Show resolved Hide resolved

src/plugins/analysis/elf_analysis/code/elf_analysis.py

Comment on lines +287 to +288

		def _deduplicate_functions(lief_functions: Iterable[lief.Function]) -> list[tuple[int, str]]:
		return sorted({(f.address, f.name) for f in lief_functions})

Collaborator

maringuu Jan 16, 2025

Can you file a bug report upstream and put a link to it in the code?
Then when someone stumbles upon this code again, they can see if it is still needed.

Collaborator Author

jstucke Jan 16, 2025

I'm actually not sure whether or not this is intended. In the old plugin this wasn't an issue, since it also "deduplicated" the results using sets, so in the plugin this is a new bug

Collaborator

maringuu Feb 6, 2025

Just my two cents on how multiple symbols can have the same name.

The ELF specification is defined as part of the System V ABI [1].
According to it, multiple non-global symbols can have the same name in multiple object files [2].

src/plugins/analysis/elf_analysis/code/elf_analysis.py Outdated

+                      exported_functions: List[ElfSymbol]
+                      imported_functions: List[ElfSymbol]
+                      libraries: List[str]
+                      mod_info: Optional[List[str]] = Field(description='Key value pairs with module information.')

Collaborator

maringuu Jan 16, 2025

Could you reword the description and make more clear that this is related to Linux kernel modules (Isn't it?).
To me the description does not convey much information that is not conveyed by the type and name of the field.

Also, how do 'key value pairs' fit in an Optinal[List[str]] ?

Collaborator Author

jstucke Jan 16, 2025

I agree, a list of key value pairs doesn't make a lot of sense. I converted it to a dict.

jstucke force-pushed the elf-analysis-plugin-v1 branch from cb519ee to ed406c1 Compare

January 16, 2025 12:55

jstucke requested a review from maringuu

January 16, 2025 12:55

jstucke force-pushed the elf-analysis-plugin-v1 branch from ed406c1 to 24c1c7e Compare

January 16, 2025 13:05


          feat: converted elf_analysis plugin to new base class

5b1d342

and also removed the redundant normalize_lief_items method

jstucke force-pushed the elf-analysis-plugin-v1 branch from 24c1c7e to 5b1d342 Compare

January 16, 2025 13:06

maringuu requested changes

View reviewed changes

src/plugins/analysis/elf_analysis/code/elf_analysis.py

Comment on lines +287 to +288

		def _deduplicate_functions(lief_functions: Iterable[lief.Function]) -> list[tuple[int, str]]:
		return sorted({(f.address, f.name) for f in lief_functions})

Collaborator

maringuu Feb 6, 2025

Just my two cents on how multiple symbols can have the same name.

The ELF specification is defined as part of the System V ABI [1].
According to it, multiple non-global symbols can have the same name in multiple object files [2].

src/plugins/analysis/elf_analysis/code/elf_analysis.py

Comment on lines +54 to +71

+                          entrypoint=header.entrypoint,
+                          file_type=header.file_type.__name__,
+                          header_size=header.header_size,
+                          identity_abi_version=header.identity_abi_version,
+                          identity_class=header.identity_class.__name__,
+                          identity_data=header.identity_data.__name__,
+                          identity_os_abi=header.identity_os_abi.__name__,
+                          identity_version=header.identity_version.__name__,
+                          machine_type=header.machine_type.__name__.lower(),
+                          numberof_sections=header.numberof_sections,
+                          numberof_segments=header.numberof_segments,
+                          object_file_version=header.object_file_version.__name__,
+                          processor_flag=header.processor_flag,
+                          program_header_size=header.program_header_size,
+                          program_headers_offset=header.program_header_offset,
+                          section_header_size=header.section_header_size,
+                          section_headers_offset=header.section_header_offset,
+                          section_name_table_idx=header.section_name_table_idx,

Collaborator

maringuu Feb 6, 2025

Not a huge fan of using __name__ to convert the constants to human readable names.
The lief names are not documented somehwere accessible (are they?).
For example, the header.file_type as defined in the typing stubs does not have any documentation.
Furthermore, the lief names do not match the names in the official specification.

I'm not sure what the right way to proceed is here. From a GUI user perspective human readable names are important. From a API user perspective the numeric values would be preferable.

Also, why does this work at all? It does not work with standard enum.Enum's and is not defined in lief's typing stubs.

Collaborator Author

jstucke Feb 13, 2025 •

edited

Loading

Not a huge fan of using name to convert the constants to human readable names.
The lief names are not documented somehwere accessible (are they?).

neither am I, but sadly this seems to be the only way to reproduce the old output. The alternative would be this: #1266 (comment)

I'm not sure what the right way to proceed is here. From a GUI user perspective human readable names are important. From a API user perspective the numeric values would be preferable.

since the barrier for a GUI user to work with numeric values is arguably higher, IMHO using strings is the better approach

Also, why does this work at all? It does not work with standard enum.Enum's and is not defined in lief's typing stubs.

It probably has something to do with the fact that lief is not actually python code, but rather a shared library (.so) with python bindings

Collaborator Author

jstucke commented Feb 13, 2025

Just my two cents on how multiple symbols can have the same name.
The ELF specification is defined as part of the System V ABI [1].
According to it, multiple non-global symbols can have the same name in multiple object files [2].

I'm not sure this is the problem here. In cases of duplicate entries, every entry seems to have a duplicate (with the same offset). I'm still not sure why

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet