Let's now decode a Segwit transaction. Given all the changes we made in the previous lesson, this should be relatively easy to set up!
We'll start by updating our Decodable
implementation for Transaction
:
#[derive(Debug)]
impl Decodable for Transaction {
fn consensus_decode<R: Read>(r: &mut R) -> Result<Self, Error> {
let version = Version::consensus_decode(r)?;
let inputs = Vec::<TxIn>::consensus_decode(r)?;
if inputs.is_empty() {
let segwit_flag = u8::consensus_decode(r)?;
match segwit_flag {
1 => {
let mut inputs = Vec::<TxIn>::consensus_decode(r)?;
let outputs = Vec::<TxOut>::consensus_decode(r)?;
for txin in inputs.iter_mut() {
txin.witness = Witness::consensus_decode(r)?;
}
if !inputs.is_empty() && inputs.iter().all(|input| input.witness.is_empty()) {
Err(Error::ParseFailed("witness flag set but no witnesses present"))
} else {
Ok(Transaction {
version,
inputs,
outputs,
lock_time: u32::consensus_decode(r)?,
})
}
}
// We don't support anything else
x => Err(Error::UnsupportedSegwitFlag(x)),
}
// non-segwit
} else {
Ok(Transaction {
version,
inputs,
outputs: Vec::<TxOut>::consensus_decode(r)?,
lock_time: u32::consensus_decode(r)?,
})
}
}
}
Changes:
- After decoding the version and the inputs, the next thing we want to do is check the input count. If it's zero, then this is likely a marker for a segwit transaction. If it's not, it's a legacy transaction, and we'll simply go ahead to decoding the outputs and the locktime. The following byte after the marker is the flag. Currently, the only supported flag is
1
. Anything else and we'll throw a new error that we create called theUnsupportedSegwitFlag
(exactly how the rust-bitcoin library does it). We'll take a look at ourError
enum in a moment. - With a valid segwit flag, we then move forward by decoding the inputs and the outputs. What follows after that is the witness data. We know that for every input there is a witness stack, or a collection of elements that comprise the witness for that particular input. Another way we can think about this is a
vec
of byte collections or aVec<Vec<u8>>
for each input. Though the witness is segregated towards the end of the raw transaction data, it's convenient to display the witness with each input for serialization. So we'll add this field toTxIn
struct, which we'll look at below. Here's an example output of a decoded segwit transaction, which we use for another integration test case:
{
"transaction_id": "17e1fcaae34575d0d1566c1ae64bf4c9f7b7b9df0ff505015d3eb72460fe3a61",
"version": 2,
"inputs": [
{
"txid": "0c0fe4cc11c477231ad80de3496e20f40cc3088797c50aec8996e955c87e46d2",
"vout": 1,
"txinwitness": [
"3044022036c03ad8796f865c9348403fb705d5b984a4ef9565e8b0c81a1069f0f36bbeeb022034e9d5679e9783a441586fae034c78c60854ed71b7b53e6ef169e4f58153356101",
"0355dd8af3cbfe5c3d3424b441069455a59ce0c8d5fe628da0913dae55037ef928"
],
"sequence": 4294967294
}
],
"outputs": [
{
"amount": 0.02034575,
"script_pubkey": "00146f048d1381aa546a3e89e87f7549efc45f150b7f"
},
{
"amount": 0.01035945,
"script_pubkey": "0014d850c02b89821f0f189ca7e81756c102241f7f40"
}
],
"locktime": 2422463
}
- If the witness flag is present but the witness stack if not available for all of the inputs, we'll return a new
Error::ParseFailed
. - Two new methods you may not be familiar with but are very commonly used to iterate over collections of data are the
iter
anditer_mut
methods. These methods convert a collection, such as aVec
or an array, into anIterator
type. This is what allows us to step through a collection in a loop for example, and provides many other methods such asmap
,filter
,find
, etc. which are very useful for traversing a collection or modifying it. Take a look at the documentation for theall
method.
Let's take a look at our updated Error
enum:
#[derive(Debug)]
pub enum Error {
Io(std::io::Error),
ParseFailed(&'static str),
UnsupportedSegwitFlag(u8),
}
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Error::Io(ref e) => write!(f, "IO error: {}", e),
Error::ParseFailed(s) => write!(f, "parse failed: {}", s),
Error::UnsupportedSegwitFlag(swflag) =>
write!(f, "unsupported segwit version: {}", swflag),
}
}
}
impl std::error::Error for Error {}
Changes:
- We've added the
Error::ParseFailed
andError::UnsupportedSegwitFlag
error variants and implementedDisplay
for them. - You might notice the syntax
&'static str
. You're probably familiar already with&str
, so what does&'static str
mean? Well, the tick mark indicates alifetime
parameter. Whenever we have a field in a struct or enum that is a borrowed reference to some data, we have to indicate to our program somehow how long that data is expected to live relative to encapsulating structure. Why? Because what would happen if we pass around a struct and then at some point the field's data goes out of scope. The struct would have a field with a dangling pointer which is a memory no-no. So our borrowed reference must always outlive the struct (or encapsulating structure). In this case,'static
means that the borrowed reference will live for as long as the program runs. In other words, we won't have to worry about the underlyingString
data on the heap being deallocated while the program runs. This might sound pretty confusing. Don't worry if its difficult to understand. Lifetimes are one of the most challenging aspects of learning Rust and require lots of practice to really understand. We'll revisit this topic in a bonus section later on.
Next, let's look at our updated TxIn
struct and new Witness
struct:
...
#[derive(Debug)]
pub struct TxIn {
pub previous_txid: Txid,
pub previous_vout: u32,
pub script_sig: String,
pub sequence: u32,
pub witness: Witness,
}
impl Serialize for TxIn {
fn serialize<S: Serializer>(&self, s: S) -> Result<S::Ok, S::Error> {
let mut txin = s.serialize_struct("TxIn", 4)?;
txin.serialize_field("txid", &self.previous_txid)?;
txin.serialize_field("vout", &self.previous_vout)?;
if self.witness.is_empty() {
txin.serialize_field("scriptSig", &self.script_sig)?;
} else {
txin.serialize_field("txinwitness", &self.witness)?;
}
txin.serialize_field("sequence", &self.sequence)?;
txin.end()
}
}
#[derive(Debug)]
pub struct Witness {
content: Vec<Vec<u8>>,
}
impl Witness {
pub fn new() -> Self {
Witness { content: vec![] }
}
fn is_empty(&self) -> bool {
self.content.is_empty()
}
}
impl Serialize for Witness {
fn serialize<S: Serializer>(&self, s: S) -> Result<S::Ok, S::Error> {
use serde::ser::SerializeSeq;
let mut seq = s.serialize_seq(Some(self.content.len()))?;
for elem in self.content.iter() {
seq.serialize_element(&hex::encode(&elem))?;
}
seq.end()
}
}
...
Changes:
- We've added custom serialization for our
TxIn
struct. We set field names that match whatbitcoin-cli
uses and will optionally show thescriptSig
or thewitness
depending on what's present. - Our
Witness
struct has acontent
field which is essentially just aVec<Vec<u8>>
. We added anew
method which we'll call when first decoding aTxIn
and before adding the witness to it. An empty witness will be added to eachTxIn
when it's first decoded and we have theis_empty
method as a convenient method we can call to determine if the witness is empty or not. This eventually calls theis_empty
method on aVec
. - We want to serialize the
Witness
so that it shows up as an array of hex-encoded bytes. We can use theserialize_seq
method to do this and callhex::encode
on each byte collection in the encapsulatingVec
.
Lastly, let's look at our Decodable
implementations:
...
impl Decodable for TxIn {
fn consensus_decode<R: Read>(r: &mut R) -> Result<Self, Error> {
Ok(TxIn {
previous_txid: Txid::consensus_decode(r)?,
previous_vout: u32::consensus_decode(r)?,
script_sig: String::consensus_decode(r)?,
sequence: u32::consensus_decode(r)?,
witness: Witness::new(),
})
}
}
impl Decodable for Witness {
fn consensus_decode<R: Read>(r: &mut R) -> Result<Self, Error> {
let mut witness_items = vec![];
let count = u8::consensus_decode(r)?;
for _ in 0..count {
let len = CompactSize::consensus_decode(r)?.0;
let mut buffer = vec![0; len as usize];
r.read_exact(&mut buffer).map_err(Error::Io)?;
witness_items.push(buffer);
}
Ok(Witness{ content: witness_items })
}
}
...
Changes:
- When we first decode a
TxIn
, we'll add a new emptyWitness
field. We'll fill this out later on while decoding aTransaction
. - The
Witness
has the following structure: a count of the stack items with each item preceded by aCompactSize
to indicate it's length. So we'll decode the first byte to get the count of witness items. Then, we'll decode the variable length field and push that into our witnessvec
. Finally we'll return theWitness
struct with the content of ourwitness_items
.
Alright! This should all work now! Let's update our integration test.
use std::fs;
#[test]
fn test_legacy() {
let raw_transaction_hex = "010000000242d5c1d6f7308bbe95c0f6e1301dd73a8da77d2155b0773bc297ac47f9cd7380010000006a4730440220771361aae55e84496b9e7b06e0a53dd122a1425f85840af7a52b20fa329816070220221dd92132e82ef9c133cb1a106b64893892a11acf2cfa1adb7698dcdc02f01b0121030077be25dc482e7f4abad60115416881fe4ef98af33c924cd8b20ca4e57e8bd5feffffff75c87cc5f3150eefc1c04c0246e7e0b370e64b17d6226c44b333a6f4ca14b49c000000006b483045022100e0d85fece671d367c8d442a96230954cdda4b9cf95e9edc763616d05d93e944302202330d520408d909575c5f6976cc405b3042673b601f4f2140b2e4d447e671c47012103c43afccd37aae7107f5a43f5b7b223d034e7583b77c8cd1084d86895a7341abffeffffff02ebb10f00000000001976a9144ef88a0b04e3ad6d1888da4be260d6735e0d308488ac508c1e000000000017a91476c0c8f2fc403c5edaea365f6a284317b9cdf7258700000000";
let json = transaction_decoder_22::run(raw_transaction_hex.to_string()).unwrap();
let expected = fs::read_to_string("tests/test_transaction_legacy.json").unwrap();
assert_eq!(expected, json);
}
#[test]
fn test_segwit() {
let raw_transaction_hex = "02000000000101d2467ec855e99689ec0ac5978708c30cf4206e49e30dd81a2377c411cce40f0c0100000000feffffff028f0b1f00000000001600146f048d1381aa546a3e89e87f7549efc45f150b7fa9ce0f0000000000160014d850c02b89821f0f189ca7e81756c102241f7f4002473044022036c03ad8796f865c9348403fb705d5b984a4ef9565e8b0c81a1069f0f36bbeeb022034e9d5679e9783a441586fae034c78c60854ed71b7b53e6ef169e4f58153356101210355dd8af3cbfe5c3d3424b441069455a59ce0c8d5fe628da0913dae55037ef928bff62400";
let json = transaction_decoder_22::run(raw_transaction_hex.to_string()).unwrap();
let expected = fs::read_to_string("tests/test_transaction_segwit.json").unwrap();
assert_eq!(expected, json);
}
The tests
directory has been updated with the expected json outputs for the legacy and segwit transactions. These should both pass now!
Let's also test this out with cargo run
.
$ cargo run -- 02000000000101d2467ec855e99689ec0ac5978708c30cf4206e49e30dd81a2377c411cce40f0c0100000000feffffff028f0b1f00000000001600146f048d1381aa546a3e89e87f7549efc45f150b7fa9ce0f0000000000160014d850c02b89821f0f189ca7e81756c102241f7f4002473044022036c03ad8796f865c9348403fb705d5b984a4ef9565e8b0c81a1069f0f36bbeeb022034e9d5679e9783a441586fae034c78c60854ed71b7b53e6ef169e4f58153356101210355dd8af3cbfe5c3d3424b441069455a59ce0c8d5fe628da0913dae55037ef928bff62400
This should output the following:
{
"transaction_id": "17e1fcaae34575d0d1566c1ae64bf4c9f7b7b9df0ff505015d3eb72460fe3a61",
"version": 2,
"inputs": [
{
"txid": "0c0fe4cc11c477231ad80de3496e20f40cc3088797c50aec8996e955c87e46d2",
"vout": 1,
"txinwitness": [
"3044022036c03ad8796f865c9348403fb705d5b984a4ef9565e8b0c81a1069f0f36bbeeb022034e9d5679e9783a441586fae034c78c60854ed71b7b53e6ef169e4f58153356101",
"0355dd8af3cbfe5c3d3424b441069455a59ce0c8d5fe628da0913dae55037ef928"
],
"sequence": 4294967294
}
],
"outputs": [
{
"amount": 0.02034575,
"script_pubkey": "00146f048d1381aa546a3e89e87f7549efc45f150b7f"
},
{
"amount": 0.01035945,
"script_pubkey": "0014d850c02b89821f0f189ca7e81756c102241f7f40"
}
],
"locktime": 2422463
}
Congratulations! Your program can now decode any transaction. Feel free to play with this a bit. Visit mempool.space/testnet, find a random transaction hex, decode it and compare your results!
- Lifetimes: https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html
- Iterators: https://doc.rust-lang.org/book/ch13-02-iterators.html
Check out these additional resources:
- Practice at Exercism.io: https://exercism.org/tracks/rust
- Effective Rust: https://effective-rust.com/cover.html
- Programming Rust: https://www.oreilly.com/library/view/programming-rust-2nd/9781492052586/