LLVM verification #356

Draft
wants to merge 7 commits into base: main
Changes from 1 commit
Binary file modified ext/auto-inst/__pycache__/parsing.cpython-310.pyc
Binary file not shown.
46 changes: 30 additions & 16 deletions ext/auto-inst/parsing.py
@@ -244,23 +244,37 @@ def get_repo_instructions(repo_directory):

 def find_json_key(instr_name, json_data):
     """
-    Attempt to find a matching key in json_data for instr_name, considering different
-    naming conventions: replacing '.' with '_', and trying various case transformations.
+    Find a matching instruction in json_data by comparing against AsmString values.
+    Returns the matching key if found, None otherwise.
+
+    Args:
+        instr_name (str): The instruction name from YAML
+        json_data (dict): The JSON data containing instruction information
+
+    Returns:
+        str or None: The matching key from json_data if found, None otherwise
     """
-    lower_name = instr_name.lower()
-    lower_name_underscore = lower_name.replace('.', '_')
-    variants = {
-        lower_name,
-        lower_name_underscore,
-        instr_name.upper(),
-        instr_name.replace('.', '_').upper(),
-        instr_name.capitalize(),
-        instr_name.replace('.', '_').capitalize()
-    }
-
-    for v in variants:
-        if v in json_data:
-            return v
+    # First, normalize the instruction name for comparison
+    instr_name = instr_name.lower().strip()
+
+    # Search through all entries in json_data
+    for key, value in json_data.items():
Collaborator

You can use the !instanceof metadata to search only through things with an encoding; I pointed this out in the comment about the initial approach. This saves you from iterating through aliases (and other data like isel patterns).

Collaborator Author

From what I can see, !instanceof follows the structure below, but I'm finding it hard to understand how to use it for parsing. Could you explain further?

{
  "!instanceof": {
    "AES_1Arg_Intrinsic": [
      "int_arm_neon_aesimc",
      "int_arm_neon_aesmc"
    ],
    "AES_2Arg_Intrinsic": [
      "int_arm_neon_aesd",
      "int_arm_neon_aese"
    ],
    "ALUW_rr": [
      "ADDW",
      "ADD_UW",
      "DIVUW",
      "DIVW",
      "MULW",
      "PACKW",
      "REMUW",
      "REMW",
      "ROLW",
      "RORW",
      "SH1ADD_UW",
      "SH2ADD_UW",
      "SH3ADD_UW",
      "SLLW",
      "SRAW",
      "SRLW",
      "SUBW"
    ],
    "ALU_ri": [
      "ADDI",
      "ANDI",
      "ORI",
      "SLTI",
      "SLTIU",
      "XORI"
    ],

Collaborator

So, let's assume you parsed the whole JSON object into a Python variable called json.

json["!instanceof"] is a map from TableGen classes to lists of definition names that are instances of that class (or its subclasses). These definition names also appear as keys in the top-level json.

You would use code like the following:

for def_name in json["!instanceof"]["RVInstCommon"]:
  def_data = json[def_name]
  ...

This saves you having to look at all the tablegen data that is not an instruction (so an alias or a pattern or CSR or something).

Note you'll still have to look at isPseudo and isCodeGenOnly and potentially exclude items where one or both of those is true.
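
A rough sketch of that loop with the isPseudo / isCodeGenOnly filter applied might look like the following. It assumes the dump is already loaded into a dict (named data here so it does not shadow Python's json module) and that the two flags are stored as 0/1 values. The helper name iter_real_instructions and the toy entries are made up for illustration; they are not part of this PR or of real tblgen output.

```python
def iter_real_instructions(data, base_class="RVInstCommon"):
    """Yield (def_name, def_data) for defs of base_class, skipping pseudo
    and codegen-only entries."""
    for def_name in data.get("!instanceof", {}).get(base_class, []):
        def_data = data.get(def_name)
        if not isinstance(def_data, dict):
            continue
        # Assumption: flags are dumped as 0/1; a missing flag counts as false.
        if def_data.get("isPseudo") or def_data.get("isCodeGenOnly"):
            continue
        yield def_name, def_data


# Toy data shaped like the dump (hypothetical entries for illustration only).
data = {
    "!instanceof": {"RVInstCommon": ["ADD", "PseudoLI", "ADDI"]},
    "ADD": {"AsmString": "add\t$rd, $rs1, $rs2", "isPseudo": 0, "isCodeGenOnly": 0},
    "PseudoLI": {"AsmString": "li\t$rd, $imm", "isPseudo": 1, "isCodeGenOnly": 0},
    "ADDI": {"AsmString": "addi\t$rd, $rs1, $imm12", "isPseudo": 0, "isCodeGenOnly": 0},
}

names = [name for name, _ in iter_real_instructions(data)]
print(names)  # ['ADD', 'ADDI']
```

Whether codegen-only entries should always be dropped is a judgment call; the filter is kept in one place so it is easy to relax.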

+        if not isinstance(value, dict):
+            continue
+
+        # Get the AsmString value and normalize it
+        asm_string = safe_get(value, 'AsmString', '').lower().strip()
+        if not asm_string:
+            continue
+
+        # Extract the base instruction name from AsmString
+        # AsmString might be in format like "add $rd, $rs1, $rs2"
+        # We want just "add"
+        base_asm_name = asm_string.split()[0]
+
+        if base_asm_name == instr_name:
+            return key
+
     return None

def run_parser(json_file, repo_directory, output_file="output.txt"):
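
For reference, one possible way to combine the review suggestion with the AsmString comparison is sketched below. This is not the code in this PR: the function name find_json_key_sketch is hypothetical, it uses plain dict.get instead of the repository's safe_get helper, and it assumes that RVInstCommon is the right class to look up under "!instanceof" and that isPseudo / isCodeGenOnly are dumped as 0/1 values.

```python
def find_json_key_sketch(instr_name, json_data, base_class="RVInstCommon"):
    """Return the def name whose AsmString mnemonic equals instr_name, or None."""
    target = instr_name.lower().strip()

    # Only visit defs that are instances of the instruction base class.
    for key in json_data.get("!instanceof", {}).get(base_class, []):
        value = json_data.get(key)
        if not isinstance(value, dict):
            continue
        # Assumption: pseudo and codegen-only defs should not match.
        if value.get("isPseudo") or value.get("isCodeGenOnly"):
            continue

        asm_string = value.get("AsmString")
        if not isinstance(asm_string, str):
            continue

        # "add\t$rd, $rs1, $rs2" -> "add"; split() handles tabs and spaces.
        parts = asm_string.lower().split()
        if parts and parts[0] == target:
            return key

    return None


# Toy data (hypothetical entries, not real tblgen output).
data = {
    "!instanceof": {"RVInstCommon": ["ADD", "C_ADD", "PseudoLI"]},
    "ADD": {"AsmString": "add\t$rd, $rs1, $rs2", "isPseudo": 0, "isCodeGenOnly": 0},
    "C_ADD": {"AsmString": "c.add\t$rs1, $rs2", "isPseudo": 0, "isCodeGenOnly": 0},
    "PseudoLI": {"AsmString": "li\t$rd, $imm", "isPseudo": 1, "isCodeGenOnly": 0},
    "NotAnInstruction": {"AsmString": "sub $rd, $rs1, $rs2"},
}

print(find_json_key_sketch("c.add", data))  # C_ADD
print(find_json_key_sketch("li", data))     # None (pseudo is skipped)
```

Entries that are not listed under the chosen class, such as NotAnInstruction in the toy data, are never visited, which is the saving the review comment describes.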