First, we check the structure of the transaction, which includes the version, locktime, inputs, and outputs.
- Verifies required fields: `"version"`, `"locktime"`, `"vin"`, and `"vout"`.
- Checks field types: `"version"` and `"locktime"` must be integers.
- Calls `validate_vin()` and `validate_vout()` for further validation.
- Checks each `"vin"` element for required fields and types.
- Verifies `"txid"`, `"vout"`, `"prevout"`, `"scriptsig"`/`"scriptsig_asm"`, `"witness"`, `"is_coinbase"`, and `"sequence"`.
- Checks each `"vout"` element for required fields and types.
- Verifies `"scriptpubkey"`, `"scriptpubkey_asm"`, `"scriptpubkey_type"`, `"scriptpubkey_address"`, and `"value"`.
- Checks if the script is empty, a valid push data script, a valid opcode, or a valid combination of push data and opcodes.
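The field and type checks described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the expected types, the helper `has_fields`, and the choice to treat `"witness"` and `"scriptpubkey_address"` as optional (checked only when present) are all assumptions made for the example.

```python
# Assumed field/type tables for illustration; the project's real checks may differ.
TX_FIELDS = {"version": int, "locktime": int, "vin": list, "vout": list}
VIN_FIELDS = {"txid": str, "vout": int, "prevout": dict, "scriptsig": str,
              "scriptsig_asm": str, "is_coinbase": bool, "sequence": int}
VIN_OPTIONAL = {"witness": list}               # assumption: only type-checked if present
VOUT_FIELDS = {"scriptpubkey": str, "scriptpubkey_asm": str,
               "scriptpubkey_type": str, "value": int}
VOUT_OPTIONAL = {"scriptpubkey_address": str}  # assumption: only type-checked if present


def _type_ok(value, expected):
    # bool is a subclass of int in Python, so reject True/False for int fields.
    if expected is int and isinstance(value, bool):
        return False
    return isinstance(value, expected)


def has_fields(obj, required, optional=None):
    """Hypothetical helper: check that obj is a dict with correctly typed fields."""
    if not isinstance(obj, dict):
        return False
    for key, expected in required.items():
        if key not in obj or not _type_ok(obj[key], expected):
            return False
    for key, expected in (optional or {}).items():
        if key in obj and not _type_ok(obj[key], expected):
            return False
    return True


def validate_vin(vin):
    """Every input must carry the required fields with the expected types."""
    return all(has_fields(item, VIN_FIELDS, VIN_OPTIONAL) for item in vin)


def validate_vout(vout):
    """Every output must carry the required fields with the expected types."""
    return all(has_fields(item, VOUT_FIELDS, VOUT_OPTIONAL) for item in vout)


def check_transaction(tx):
    """Top-level structure check, then per-input and per-output checks."""
    return (has_fields(tx, TX_FIELDS)
            and validate_vin(tx["vin"])
            and validate_vout(tx["vout"]))
```

Keeping the expected fields in plain dictionaries makes it easy to adjust which fields are mandatory without touching the checking logic.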
Checking Transaction Structure in Mempool: check_structure_transactions(mempool_folder, valid_folder)
- Iterates through JSON files in the `mempool_folder`.
- Loads transaction data and calls `check_transaction()` for validation.
- Moves valid transactions to the `valid_folder` and prints messages for valid and invalid transactions.
- Handles errors during file processing.
These algorithms work together to validate the structure of Bitcoin transactions, checking the presence and types of required fields, validating the `"vin"` and `"vout"` lists, and verifying the signature script. The code also provides a function to validate multiple transaction files in the mempool folder and move valid transactions to a separate folder.
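A rough sketch of the folder-level loop is shown below. The `check_transaction` here is only a placeholder standing in for the real validator, and returning the list of moved file names is an addition made for this example so the behaviour is easy to inspect.

```python
import json
import os
import shutil


def check_transaction(tx):
    """Placeholder (assumption): stands in for the project's check_transaction()."""
    return isinstance(tx, dict) and all(
        key in tx for key in ("version", "locktime", "vin", "vout"))


def check_structure_transactions(mempool_folder, valid_folder):
    """Move structurally valid transaction files into valid_folder."""
    os.makedirs(valid_folder, exist_ok=True)
    moved = []  # returned for illustration only
    for filename in sorted(os.listdir(mempool_folder)):
        if not filename.endswith(".json"):
            continue
        path = os.path.join(mempool_folder, filename)
        try:
            with open(path, "r") as f:
                tx = json.load(f)
        except (OSError, ValueError) as exc:
            # json.JSONDecodeError is a subclass of ValueError.
            print(f"Error processing {filename}: {exc}")
            continue
        if check_transaction(tx):
            shutil.move(path, os.path.join(valid_folder, filename))
            moved.append(filename)
            print(f"Valid transaction: {filename}")
        else:
            print(f"Invalid transaction: {filename}")
    return moved
```

Files that fail to parse are reported and left in place, so one corrupt file does not stop the rest of the folder from being processed.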
Step-by-step implementation for creating the TXID and then updating the JSON file in the `mempool_valid` folder.
- Initialize necessary modules and functions:
  - Import the required modules: `json`, `struct`, `hashlib`, and `os`.
  - Define the `compact_size` function to serialize the size of data in the transaction.
  - Define the `create_txid` function to create the transaction ID (txid) based on the JSON data.
- Specify the folder path:
  - Set the `folder_path` variable to the path of the folder containing the JSON files (e.g., `"./mempool"`).
- Iterate over the JSON files in the folder:
  - Use `os.listdir(folder_path)` to get a list of files in the specified folder.
  - Iterate over each file in the folder using a loop.
  - Check if the file has a `.json` extension using the `endswith()` method.
- Read the JSON file:
  - Construct the full file path by joining `folder_path` and the current `filename` using `os.path.join()`.
  - Open the JSON file in read mode using `open()` and the `"r"` flag.
  - Read the contents of the file using `file.read()` and store it in the `json_data` variable.
- Create the txid:
  - Call the `create_txid` function, passing `json_data` as an argument.
  - Inside the `create_txid` function:
    - Parse the JSON data into a dictionary using `json.loads()`.
    - Serialize the transaction data using the `struct.pack()` and `compact_size()` functions.
    - Concatenate the serialized data to create the `tx_data` bytes.
  - Calculate the double SHA-256 hash of `tx_data` using `hashlib.sha256()`.
  - Reverse the byte order of the hash using `[::-1]` and convert it to a hexadecimal string using `hex()`.
  - Store the resulting txid in the `txid` variable.
- Append the txid to the JSON data:
  - Load the JSON data from `json_data` into a dictionary using `json.loads()`.
  - Create a new dictionary called `updated_data` with `"txid"` as the first property and its value set to the generated `txid`.
  - Use the `update()` method to merge the original JSON data dictionary into the `updated_data` dictionary, ensuring that `"txid"` remains the first property.
- Write the updated JSON data back to the file:
  - Open the same JSON file in write mode using `open()` and the `"w"` flag.
  - Write the `updated_data` dictionary to the file using `json.dump()`.
  - Specify an indentation of 2 spaces using `indent=2` for better readability.
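The steps above can be sketched as follows. This is an illustrative version, not the contents of `create_txid.py`: it assumes each input carries `txid`, `vout`, `scriptsig`, and `sequence`, and each output carries `scriptpubkey` and `value` (in satoshis), and it serializes the transaction without witness data, which is the form the txid is computed over. The helper `with_txid_first` is a hypothetical name for the merge step.

```python
import hashlib
import json
import struct


def compact_size(n):
    """Encode an integer as a Bitcoin CompactSize (variable-length) field."""
    if n < 0xFD:
        return struct.pack("<B", n)
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def create_txid(json_data):
    """Return the txid (hex string) for a transaction given as a JSON string."""
    tx = json.loads(json_data)
    tx_data = struct.pack("<I", tx["version"])
    tx_data += compact_size(len(tx["vin"]))
    for vin in tx["vin"]:
        script_sig = bytes.fromhex(vin["scriptsig"])
        tx_data += bytes.fromhex(vin["txid"])[::-1]  # previous txid, little-endian
        tx_data += struct.pack("<I", vin["vout"])
        tx_data += compact_size(len(script_sig)) + script_sig
        tx_data += struct.pack("<I", vin["sequence"])
    tx_data += compact_size(len(tx["vout"]))
    for vout in tx["vout"]:
        script_pubkey = bytes.fromhex(vout["scriptpubkey"])
        tx_data += struct.pack("<Q", vout["value"])
        tx_data += compact_size(len(script_pubkey)) + script_pubkey
    tx_data += struct.pack("<I", tx["locktime"])
    # Double SHA-256, then reverse the bytes for the conventional display order.
    digest = hashlib.sha256(hashlib.sha256(tx_data).digest()).digest()
    return digest[::-1].hex()


def with_txid_first(json_data):
    """Hypothetical helper: return the transaction dict with "txid" as its first key."""
    updated_data = {"txid": create_txid(json_data)}
    updated_data.update(json.loads(json_data))
    return updated_data
```

Because Python dictionaries preserve insertion order, creating `updated_data` with `"txid"` first and then calling `update()` keeps it as the first property when the result is written with `json.dump(updated_data, f, indent=2)`.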
During the verification process I found that there were a large number of P2WPKH transactions (around 3000-4000).
There were also many P2TR transactions. Verifying P2TR is out of the scope of this project, so P2TR is considered valid by default.
- Parse the transaction JSON:
  - Load the transaction JSON data into a Python dictionary called `tx`.
- Iterate over each input (`vin`) in the transaction:
  - For each input, retrieve the scriptPubKey and scriptSig from the transaction data.
- Extract the public key hash from the scriptPubKey:
  - The public key hash is located in the scriptPubKey field, typically starting from the 7th character and ending 4 characters before the end.
- Extract the signature and public key from the scriptSig:
  - The scriptSig contains the signature and public key. The signature length is obtained by converting the 3rd and 4th characters of the scriptSig from hexadecimal to integer.
  - The signature starts after the length and continues for the specified number of bytes. The public key starts immediately after the signature.
- Hash the extracted public key using SHA-256.
- Apply the RIPEMD-160 hash function to the SHA-256 hash to obtain the public key hash.
- Compare the newly computed public key hash with the one extracted from the scriptPubKey.
  - If the hashes don't match, print an error message indicating a public key hash mismatch and return `False`.
- Decode the extracted public key hash from hexadecimal to bytes.
- Create a `VerifyingKey` object using the extracted public key and the SECP256k1 elliptic curve.
- Verify the signature using the `verify_digest` method of the `VerifyingKey` object.
  - The `verify_digest` method takes the signature, the decoded public key hash, and the hash function (SHA-256) as arguments.
  - If the signature verification fails, print an error message indicating the failure and return `False`.
- If all inputs pass the verification steps without any errors:
  - Return `True` to signify a valid transaction.
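The parsing and hash-comparison part of these steps might look like the sketch below. It covers only the public key hash check, not the ECDSA verification. The function names are hypothetical, and the offsets follow the standard P2PKH encoding (`76a914{hash}88ac` for the scriptPubKey, and a scriptSig of the form push length, signature, push length, public key), so they may differ from the slicing used in the project's own code. `hashlib.new("ripemd160")` only works when the underlying OpenSSL build exposes RIPEMD-160.

```python
import hashlib


def extract_pubkey_hash(script_pubkey_hex):
    """Return the 20-byte hash (hex) from a P2PKH scriptPubKey: 76a914{hash}88ac."""
    if not (len(script_pubkey_hex) == 50
            and script_pubkey_hex.startswith("76a914")
            and script_pubkey_hex.endswith("88ac")):
        raise ValueError("not a standard P2PKH scriptPubKey")
    return script_pubkey_hex[6:-4]


def split_script_sig(script_sig_hex):
    """Split <len><signature><len><pubkey> into (signature_hex, pubkey_hex).

    Assumes each item is preceded by a single push-length byte. The returned
    signature still ends with its one-byte sighash type.
    """
    sig_len = int(script_sig_hex[0:2], 16)             # length in bytes
    sig_end = 2 + sig_len * 2                          # two hex chars per byte
    signature = script_sig_hex[2:sig_end]
    key_len = int(script_sig_hex[sig_end:sig_end + 2], 16)
    public_key = script_sig_hex[sig_end + 2:sig_end + 2 + key_len * 2]
    return signature, public_key


def hash160(data):
    """RIPEMD-160 of SHA-256; requires a hashlib build that provides 'ripemd160'."""
    return hashlib.new("ripemd160", hashlib.sha256(data).digest()).digest()


def pubkey_matches(script_pubkey_hex, script_sig_hex):
    """True if the scriptSig's public key hashes to the scriptPubKey's hash."""
    _, public_key = split_script_sig(script_sig_hex)
    computed = hash160(bytes.fromhex(public_key)).hex()
    return computed == extract_pubkey_hash(script_pubkey_hex)
```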
Verifying a P2WPKH transaction is a bit different from verifying a traditional one. We need to build a message from the transaction and then verify that message against the signature using the public key and the elliptic curve.
- Creating the preimage of the transaction:
  - `preimage = version + hash256(inputs(txid+vout)) + hash256(sequence) + inputs + scriptcode + amount + sequence + hash256(outputs) + locktime`
  - The script code is a modified version of the ScriptPubKey from the output we're spending. To create the scriptcode, we find the ScriptPubKey on the output we want to spend, extract the public key hash, and place it into the following P2PKH ScriptPubKey structure:
    `scriptcode = 1976a914{publickeyhash}88ac`
  - Next, we append the signature hash type to the end of the preimage, which here is SIGHASH_ALL (0x01).
  - Finally, we create the message:
    `message = hash256(preimage)`
- Now verify the message and signature created for the transaction using ECDSA.
- If verification succeeds, return true.
NOTE: hash256(x) = sha256(sha256(x))
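As a rough illustration of the preimage construction, the sketch below follows the BIP143 layout for a single input signed with SIGHASH_ALL. It is not the project's `p2wpkh_validation` code: the function name `p2wpkh_message` is hypothetical, and it assumes each input's `prevout` object holds the spent output's `scriptpubkey` (of the form `0014{publickeyhash}`) and `value` in satoshis. All integers are serialized little-endian, and the sighash type is appended as a 4-byte value.

```python
import hashlib
import struct


def hash256(data):
    """Double SHA-256, as defined in the note above."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def compact_size(n):
    """CompactSize encoding, used here for output script lengths."""
    if n < 0xFD:
        return struct.pack("<B", n)
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def outpoint(vin):
    """txid (little-endian) followed by the 4-byte output index."""
    return bytes.fromhex(vin["txid"])[::-1] + struct.pack("<I", vin["vout"])


def p2wpkh_message(tx, input_index):
    """Return (preimage, message) for one P2WPKH input with SIGHASH_ALL."""
    vin = tx["vin"][input_index]
    all_outpoints = b"".join(outpoint(i) for i in tx["vin"])
    all_sequences = b"".join(struct.pack("<I", i["sequence"]) for i in tx["vin"])
    all_outputs = b""
    for out in tx["vout"]:
        script = bytes.fromhex(out["scriptpubkey"])
        all_outputs += struct.pack("<Q", out["value"]) + compact_size(len(script)) + script

    # Assumed prevout layout: scriptpubkey is 0014{20-byte public key hash}.
    pubkey_hash = vin["prevout"]["scriptpubkey"][4:]
    scriptcode = bytes.fromhex("1976a914" + pubkey_hash + "88ac")

    preimage = (
        struct.pack("<I", tx["version"])
        + hash256(all_outpoints)
        + hash256(all_sequences)
        + outpoint(vin)
        + scriptcode
        + struct.pack("<Q", vin["prevout"]["value"])
        + struct.pack("<I", vin["sequence"])
        + hash256(all_outputs)
        + struct.pack("<I", tx["locktime"])
        + struct.pack("<I", 1)  # SIGHASH_ALL
    )
    return preimage, hash256(preimage)
```

The returned `message` is the 32-byte digest that the ECDSA signature from the witness is checked against, using the public key that accompanies it.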
- Transaction structure validation: `structural_check.py`
- Creating TXID and updating the JSON structure: `create_txid.py`
- Verifying P2PKH transactions: `p2pkh_validation`
- Verifying P2WPKH transactions: `p2wpkh_validation`
Present the results of your solution, and analyze the efficiency of your solution.
Discuss any insights gained from solving the problem, and outline potential areas for future improvement or research. Include a list of references or resources consulted during the problem-solving process.