Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 3f02a06, commit 5cd060c
Showing 15 changed files with 1,823 additions and 0 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
# Change-impact-analysis Tool | ||
|
||
The Change Impact Analysis Tool generates a visual report of the changes to header files and source files between two Linux kernel versions, identified by two tags in the Linux kernel repository (`old_tag` and `new_tag`). The report lets developers review what changed in `new_tag` relative to `old_tag`. | |
|
||
The diff report covers only the subset of files in the Linux repository that are compiled when building the kernel, which keeps the report focused on compile-time source code. | |
|
||
## Table of Contents | |
|
||
- [How to Use](#how-to-use) | ||
- [Files Generated](#files-generated) | ||
- [Structure of the Tool](#structure-of-the-tool) | ||
- [I. Compilation File List Generation](#i-compilation-file-list-generation) | ||
- [II. Git Diff Report Generation](#ii-git-diff-report-generation) | ||
- [III. Commit Metadata Retrieval](#iii-commit-metadata-retrieval) | ||
- [IV. Web Script Generation](#iv-web-script-generation) | ||
|
||
## How to Use | |
|
||
To use this tool in a Linux environment (compatible with Ubuntu and Debian), follow these steps: | |
|
||
Clone the repository: | ||
|
||
```bash | ||
git clone <repository_url> | ||
``` | ||
|
||
Navigate to the cloned repository: | ||
|
||
```bash | ||
cd <repository_directory> | ||
``` | ||
|
||
Execute the tool by specifying the old and new tags: | ||
|
||
```bash | ||
./run_tool.sh <old_tag> <new_tag> | ||
``` | ||
|
||
## Files Generated | ||
|
||
**/build_data:** | ||
|
||
- `sourcefile.txt` - List of all built source code files | ||
- `headerfile.txt` - List of all built dependency files | ||
- `git_diff_sourcefile.txt` - Git diff report for source code files | ||
- `git_diff_headerfile.txt` - Git diff report for dependency files | ||
- `tokenize_header.json` - Metadata for commit git diff for dependency files | ||
- `tokenize_source.json` - Metadata for commit git diff for source files | ||
|
||
**/web_source_codes:** | ||
|
||
- `index.html` - Open this file in a browser to view the report | |
|
||
## Structure of the Tool | ||
|
||
The tool operates through a structured process to generate a comprehensive change impact analysis report. Here's a detailed breakdown of its operation: | ||
|
||
### I. Compilation File List Generation | ||
|
||
#### Header File | ||
|
||
During Linux kernel compilation, `Makefile.build` invokes `fixdep` (built from `$K/scripts/basic/fixdep.c`) to generate a `.cmd` file for each source file; the `.cmd` file records the dependency information collected during compilation. | |
|
||
This tool applies a patch (`fixdep-patch.file`) to `scripts/basic/fixdep.c` so that, while the kernel is being built, the dependency information is also written out as a **list of header files**. | |
|
||
#### Source code | ||
|
||
This tool leverages the `$K/scripts/clang-tools/gen_compile_commands.py` script to generate a `compile_commands.json` file. This file documents all source files involved in the compilation process. The `gen_compile_commands.py` script traverses each `.cmd` file to aggregate the list of source files. | ||
|
||
Then, the tool invokes `parse_json.py` to parse `compile_commands.json`, generating **a list of source files**. | ||
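
As a rough illustration of this step, the sketch below reduces `compile_commands.json` entries to a de-duplicated list of repository-relative source paths. It is not the tool's `parse_json.py`: the function name `list_source_files` and the sample paths are hypothetical, and it only assumes the standard compilation-database fields `directory` and `file`.

```python
import json
from pathlib import PurePosixPath


def list_source_files(compile_commands, repo_root):
    """Return sorted, de-duplicated source paths relative to repo_root.

    `compile_commands` is the parsed content of compile_commands.json:
    a list of objects that each carry a "directory" and a "file" field.
    """
    root = PurePosixPath(repo_root)
    files = set()
    for entry in compile_commands:
        path = PurePosixPath(entry["file"])
        # "file" may be relative; in that case resolve it against "directory".
        if not path.is_absolute():
            path = PurePosixPath(entry["directory"]) / path
        try:
            files.add(str(path.relative_to(root)))
        except ValueError:
            # The file lives outside the repository, so leave it out.
            continue
    return sorted(files)


# Hypothetical sample data shaped like a compilation database.
sample = json.loads("""
[
  {"directory": "/work/linux", "file": "/work/linux/kernel/fork.c", "command": "gcc -c kernel/fork.c"},
  {"directory": "/work/linux", "file": "init/main.c", "command": "gcc -c init/main.c"},
  {"directory": "/work/linux", "file": "/work/linux/kernel/fork.c", "command": "gcc -c kernel/fork.c"},
  {"directory": "/tmp", "file": "/tmp/generated.c", "command": "gcc -c generated.c"}
]
""")

source_files = list_source_files(sample, "/work/linux")
print("\n".join(source_files))
```

Storing repository-relative paths is a design choice made for this sketch: it lets the same list be passed directly to `git diff -- <path>` in the next stage.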
|
||
### II. Git Diff Report Generation | ||
|
||
Using the file lists, the tool generates two separate git diff reports (a dependency diff report and a source diff report) covering the changes from **old_tag** to **new_tag**. | |
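
The per-file loop behind such a report can be sketched in Python as follows. This is an illustrative equivalent, not the tool's own shell implementation: the helper names (`git_diff_command`, `run_git`, `build_diff_report`) and the injectable `runner` parameter are assumptions made so the logic can be exercised without a kernel checkout, and the sketch leaves out any check that a file still exists in the new tag.

```python
import subprocess


def git_diff_command(old_tag, new_tag, path):
    """Build the `git diff` invocation for one file between two tags."""
    return ["git", "diff", old_tag, new_tag, "--", path]


def run_git(cmd, repo_dir):
    """Run a git command inside repo_dir and return its standard output."""
    result = subprocess.run(cmd, cwd=repo_dir, capture_output=True,
                            text=True, check=True)
    return result.stdout


def build_diff_report(paths, old_tag, new_tag, repo_dir=".", runner=run_git):
    """Concatenate the non-empty per-file diffs into one report string."""
    sections = []
    for path in paths:
        diff = runner(git_diff_command(old_tag, new_tag, path), repo_dir)
        body = diff.rstrip("\n")
        if body:
            # One section per changed file, separated by a blank line.
            sections.append("Diff for " + path + "\n" + body + "\n\n")
    return "".join(sections)


def fake_runner(cmd, repo_dir):
    """Stand-in for git so the example runs anywhere (hypothetical data)."""
    if cmd[-1] == "kernel/fork.c":
        return "@@ -1 +1 @@\n-a\n+b\n"
    return ""


report = build_diff_report(["kernel/fork.c", "init/main.c"],
                           "v6.9", "v6.10", runner=fake_runner)
print(report)
```

With the default `runner`, the same call would shell out to `git` in `repo_dir`; files whose diff is empty contribute nothing to the report.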
|
||
### III. Commit Metadata Retrieval | ||
|
||
Based on the git diff reports, the tool retrieves commit metadata for each newly added line in the reports. | |
|
||
- **Tokenization**: If multiple commits modify a single line between the two tags, the tool splits that line into smaller tokens and associates the commit information with the relevant tokens. The tokenization results are stored in JSON files. | |
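
One possible way to attribute tokens to commits is sketched below. The tool's actual tokenizer is not shown in this section, so everything here is an assumption: the token regex, the function names, and the use of `difflib.SequenceMatcher` to carry attribution across successive versions of a line. The commit IDs in the sample are made up.

```python
import re
from difflib import SequenceMatcher

# Assumed token rule: runs of word characters, or single punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(line):
    """Split a source line into word and punctuation tokens."""
    return TOKEN_RE.findall(line)


def attribute_tokens(versions):
    """Map each token of the final line to the commit that last changed it.

    `versions` is a list of (commit_id, line_text) pairs, oldest first.
    Returns a list of (token, commit_id) pairs for the newest version.
    """
    if not versions:
        return []
    first_commit, first_text = versions[0]
    attributed = [(tok, first_commit) for tok in tokenize(first_text)]
    for commit, text in versions[1:]:
        old_tokens = [tok for tok, _ in attributed]
        new_tokens = tokenize(text)
        matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged tokens keep their earlier attribution.
                updated.extend(attributed[i1:i2])
            elif tag in ("replace", "insert"):
                # New or rewritten tokens belong to the current commit.
                updated.extend((tok, commit) for tok in new_tokens[j1:j2])
            # "delete": the tokens are gone, so nothing is carried over.
        attributed = updated
    return attributed


history = [
    ("aaa111", "int x = 0;"),
    ("bbb222", "int x = 1;"),
    ("ccc333", "long x = 1;"),
]
tokens = attribute_tokens(history)
for token, commit in tokens:
    print(token, commit)
```

In this sample, `long` is attributed to the last commit, `1` to the middle one, and the untouched tokens to the first, which is the kind of per-token record that could then be serialized to JSON.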
|
||
### IV. Web Script Generation | ||
|
||
Using the git diff reports and metadata stored in JSON files, the tool generates a web report displaying the changes. | ||
|
||
The web report consists of three HTML files: | |
|
||
- `index.html`: the entry page, with links to: | |
  - `sourcecode.html`: renders the content of the source diff report, with an embedded URL and an on-hover metadata box for each newly added line/token in new_tag. | |
  - `header.html`: renders the content of the dependency diff report, with an embedded URL and an on-hover metadata box for each newly added line/token in new_tag. |
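
The embedded URL and on-hover metadata could be rendered roughly as in the sketch below. This is a simplified assumption, not the tool's generator: it uses the plain HTML `title` attribute as the hover box, and the metadata keys (`commit`, `author`, `summary`, `url`) and the example URL are hypothetical.

```python
from html import escape


def render_token(token, meta=None):
    """Render one token; tokens with metadata become links with a hover box."""
    text = escape(token)
    if meta is None:
        return text
    tooltip = " | ".join([meta["commit"], meta["author"], meta["summary"]])
    href = escape(meta["url"], quote=True)
    title = escape(tooltip, quote=True)
    return '<a href="' + href + '" title="' + title + '">' + text + "</a>"


def render_line(tokens):
    """Render a list of (token, meta_or_None) pairs as one HTML line."""
    return " ".join(render_token(tok, meta) for tok, meta in tokens)


# Hypothetical metadata for a single newly added token.
meta = {
    "commit": "abc123",
    "author": "A. Dev",
    "summary": 'fix "x" handling',
    "url": "https://example.org/commit/abc123",
}
line_html = render_line([("if", None), ("a<b", meta)])
print(line_html)
```

Escaping both the token text and the attribute values matters here because kernel source lines routinely contain `<`, `>`, `&`, and quotes.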
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,162 @@ | ||
#!/bin/bash | ||
# | ||
# Script to build the kernel, collect compiled file lists using modified kernel scripts, | ||
# and generate a git diff report based on the collected lists. | ||
|
||
set -e | ||
|
||
DEFAULT_TAG1="v6.9" | ||
DEFAULT_TAG2="v6.10" | ||
TAG1="${TAG1_ENV:-$DEFAULT_TAG1}" | ||
TAG2="${TAG2_ENV:-$DEFAULT_TAG2}" | ||
|
||
# check and install gcc-11 and libssl-dev if not already installed | |
install_package_safe() { | |
    if ! command -v gcc-11 &> /dev/null; then | |
        sudo apt-get update | |
        sudo apt-get install -y gcc-11 | |
    else | |
        echo "GCC-11 is already installed." | |
    fi | |
    # libssl-dev is a library package, not a command, so query dpkg instead of `command -v` | |
    if ! dpkg -s libssl-dev &> /dev/null; then | |
        sudo apt-get update | |
        sudo apt-get install -y libssl-dev | |
    else | |
        echo "libssl-dev is already installed." | |
    fi | |
} | |
|
||
# safely apply a patch to linux kernel | ||
apply_patch() { | ||
# shellcheck disable=SC2154 | ||
local patch_path="$root_path/scripts/change-impact-tools/fixdep-patch.file" | ||
|
||
# Stash local changes, if any | |
if ! git diff --quiet; then | ||
git stash | ||
fi | ||
|
||
# Abort `git am` only if there is a patch being applied | ||
if git am --show-current-patch &> /dev/null; then | ||
git am --abort | ||
fi | ||
echo "path check: $(pwd)" | ||
git apply < "$patch_path" | ||
echo "applied the git patch" | ||
echo "path check: $(pwd)" | ||
} | ||
|
||
# parse the JSON file | ||
parse_source_json_file() { | ||
local python_path="$root_path/scripts/change-impact-tools/build_scripts/parse_json.py" | ||
# shellcheck disable=SC2154 | ||
local cloned_repo_name="/$clone_dir/" | ||
local input_path="$root_path/scripts/clang-tools/compile_commands.json" | ||
local output_path="$root_path/scripts/change-impact-tools/build_data/sourcefile.txt" | ||
|
||
display_file_head "$root_path/scripts/clang-tools" "compile_commands.json" 3 | ||
python3 "$python_path" "$cloned_repo_name" "$input_path" "$output_path" | ||
display_file_head "$root_path/scripts/change-impact-tools/build_data" "sourcefile.txt" 3 | ||
} | ||
|
||
# generate the build file list after building the kernel | ||
generate_compiled_file_lists() { | ||
# Generate compiled source files list | ||
local json_output_path="$root_path/scripts/clang-tools/compile_commands.json" | ||
echo "path check: $(pwd)" | ||
python3 scripts/clang-tools/gen_compile_commands.py -o "$json_output_path" | ||
|
||
parse_source_json_file | ||
echo "source compiled filelist generated to sourcefile.txt" | ||
|
||
# Generate compiled header files list | ||
|
||
local output_list="$root_path/scripts/change-impact-tools/build_data/headerfile.txt" | ||
local output_json="$root_path/scripts/change-impact-tools/build_data/source_dep.json" | ||
local dep_path="dependency_file.txt" | ||
local python_tool_path="$root_path/scripts/change-impact-tools/build_scripts/parse_dep_list.py" | ||
|
||
python3 "$python_tool_path" "$dep_path" "$output_json" "$output_list" | ||
echo "dependency compiled filelist generated to headerfile.txt" | |
|
||
} | ||
|
||
# clean up the working directory | ||
cleanup_working_directory() { | ||
git reset --hard | ||
git clean -fdx | ||
} | ||
|
||
# generate diff for build between TAG1 and TAG2 | ||
generate_git_diff() { | ||
|
||
# collect and setup input & output file | ||
local file_type="${1:-source}" | |
local root_build_data_path="$root_path/scripts/change-impact-tools/build_data" | ||
local diff_input="$root_build_data_path/sourcefile.txt" | ||
local diff_output="$root_build_data_path/filtered_diff_source.txt" | ||
|
||
if [ "$file_type" = "header" ]; then | ||
echo "[generate_git_diff] Generating dependency git diff report ..." | ||
diff_input="$root_build_data_path/headerfile.txt" | ||
diff_output="$root_build_data_path/filtered_diff_header.txt" | ||
else | ||
echo "[generate_git_diff] Generating source git diff report ..." | ||
fi | ||
|
||
while IFS= read -r file | ||
do | ||
if git show "$TAG2:$file" &> /dev/null; then | ||
local diff_result | ||
diff_result=$(git diff "$TAG1" "$TAG2" -- "$file") | ||
if [[ -n "$diff_result" ]]; then | ||
{ | ||
echo "Diff for $file" | ||
echo "$diff_result" | ||
echo "" | ||
} >> "$diff_output" | ||
|
||
fi | ||
fi | ||
done < "$diff_input" | ||
echo "[generate_git_diff] Git diff report for $file_type files saved to $diff_output" | |
|
||
} | ||
|
||
|
||
if [ $# -eq 2 ]; then | ||
TAG1="$1" | ||
TAG2="$2" | ||
fi | ||
|
||
# Fetch tags from the repository | ||
git fetch --tags | ||
echo "Generating source file list for $TAG1" | ||
git checkout "$TAG1" | ||
echo "starting to run make olddefconfig" | ||
make olddefconfig | ||
echo "finished make olddefconfig" | ||
|
||
|
||
# Preparation before running make | ||
apply_patch | ||
install_package_safe | ||
|
||
# Build linux kernel | ||
echo "the current os-linux version: " | ||
cat /etc/os-release | ||
|
||
echo "start running make" | ||
make HOSTCC=gcc-11 CC=gcc-11 | ||
echo "finished compile kernel using gcc 11" | ||
|
||
|
||
# Collect build metadata | ||
generate_compiled_file_lists | ||
|
||
# Generate git diff report | ||
generate_git_diff source | ||
generate_git_diff header | ||
|
||
# Clean up the working directory | ||
cleanup_working_directory |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
#!/bin/bash | ||
# | ||
# Fetch name and email information for Linux kernel contributors | |
|
||
DEFAULT_TAG="v6.10" | ||
TAG="${1:-$DEFAULT_TAG}" | ||
git checkout "$TAG" | ||
|
||
echo "Starting to generate the email name list ..." | ||
# shellcheck disable=SC2154 | ||
git shortlog -e -s -n HEAD > "$curr_dir"/build_data/name_list.txt | ||
|
||
# shellcheck disable=SC2154 | ||
if [ -s "$curr_dir"/build_data/name_list.txt ]; then | ||
echo "build_data/name_list.txt created successfully" | ||
else | ||
echo "build_data/name_list.txt is empty or not created" | ||
fi | ||
|
||
echo "Finished generating name list" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,71 @@ | ||
""" | ||
Parse the dependency list generated by the patched `fixdep.c`. | |
This script takes three arguments: | |
1. The path of the dependency list | |
2. The output path for a JSON file | |
3. The output path for the list of header files | |
Usage: | |
    parse_dep_list.py <dep_list_path> <output_json_path> | |
        <output_header_file_list_path> | |
""" | ||
import re | ||
import argparse | ||
import json | ||
|
||
# Regular expression patterns | ||
source_file_pattern = re.compile(r'^source file := (.+)$') | ||
|
||
# Function to parse the input data | ||
|
||
|
||
def parse_dependencies(dep_list_file, output_json, output_dep_list): | |
    """Parse dependency file generated by 'fixdep.c'.""" | |
    dependencies = [] | |
    dep_set = set() | |
    current_source_file = None | |
|
||
    for line in dep_list_file: | |
        line = line.strip() | |
        if not line: | |
            continue | |
|
||
        source_match = source_file_pattern.match(line) | |
        if source_match: | |
            current_source_file = source_match.group(1) | |
            dependencies.append({ | |
                'source_file': current_source_file, | |
                'dependency_files': [] | |
            }) | |
        else: | |
            # Guard against dependency lines that appear before any `source file :=` entry | |
            if dependencies: | |
                dependencies[-1]['dependency_files'].append(line) | |
            dep_set.add(line) | |
|
||
    # Write the de-duplicated dependency list in sorted order so the output is stable | |
    with open(output_dep_list, 'w', encoding='utf-8') as output_list_file: | |
        for header_file in sorted(dep_set): | |
            output_list_file.write(header_file + '\n') | |
|
||
    # Dump dependencies into JSON file | |
    with open(output_json, 'w', encoding='utf-8') as json_file: | |
        json.dump(dependencies, json_file, indent=4) | |
|
||
|
||
if __name__ == "__main__": | |
    parser = argparse.ArgumentParser( | |
        description="Process dependency list generated while compiling kernel.") | |
    parser.add_argument('input_file', type=str, | |
                        help="Path to input dependency file") | |
    parser.add_argument('output_json', type=str, | |
                        help="Path to output JSON file") | |
    parser.add_argument('output_header_list', type=str, | |
                        help="Path to output dependency list file") | |
|
||
    args = parser.parse_args() | |
|
||
    with open(args.input_file, 'r', encoding='utf-8') as input_file: | |
        parse_dependencies(input_file, args.output_json, | |
                           args.output_header_list) | |
|
||
    print("Dependency parsing complete.") |