Appendix A: Validation

Whenever we’re teaching and talking about these techniques, one question that comes up over and over is "Where should I do validation? Does that belong with my business logic, in the domain model, or is that an infrastructural concern?"

As with any architectural question, the answer is: it depends!

The most important consideration is that we want to keep our code well-separated so that each part of the system is simple. We don’t want to clutter our code with irrelevant detail.

What is validation anyway?

When people use the word "validation" they usually mean a process where they test the inputs of an operation to make sure that they match some criteria. Inputs that match the criteria are considered "valid", and inputs that don’t are "invalid".

If the input is invalid, then the operation can’t continue, but should exit with some kind of error.

In other words, validation is about creating pre-conditions. We find it useful to separate our pre-conditions into three sub-types: syntax, semantics, and pragmatics.

Validating Syntax

In linguistics, the syntax of a language is the set of rules that govern the structure of grammatical sentences. For example, in English, the sentence "Allocate three units of TASTELESS-LAMP to order twenty-seven" is grammatically sound while the phrase "hat hat hat hat hat hat wibble" is not. We can describe grammatically correct sentences as "well-formed".

How does this map to our application? Here are some examples of syntactic rules:

  • An Allocate command must have an orderid, a sku, and a quantity.

  • A quantity is a positive integer

  • A sku is a string

These are rules about the shape and structure of incoming data. An Allocate command without a sku or order id isn’t a valid message. It’s the equivalent of the phrase "Allocate three to".

We tend to validate these rules at the edge of the system. Our rule of thumb is that a message handler should only ever receive a message that is well-formed and contains all required information.

One option is to put your validation logic on the message type itself:

Example 1. Validation on the message class (src/allocation/commands.py)
import json
from dataclasses import dataclass

from schema import And, Schema, Use


@dataclass
class Allocate(Command):

    _schema = Schema({  #(1)
        'orderid': str,
        'sku': str,
        'qty': And(Use(int), lambda n: n > 0)
    }, ignore_extra_keys=True)

    orderid: str
    sku: str
    qty: int

    @classmethod
    def from_json(cls, data):  #(2)
        data = json.loads(data)
        return cls(**cls._schema.validate(data))
  1. The schema[1] library lets us describe the structure and validation of our messages in a nice declarative way.

  2. The from_json method reads a string as JSON, and turns it into our message type.

This can get repetitive, though, since we need to specify our fields twice, so we might want to introduce a helper library that can unify the validation and declaration of our message types.

Example 2. A command factory with schema (src/allocation/commands.py)
import json
from dataclasses import make_dataclass

from schema import And, Schema, Use


def command(name, **fields):  #(1)
    # Parse the JSON string, then validate only the fields we declared.
    schema = Schema(And(Use(json.loads), Schema(fields, ignore_extra_keys=True)))
    cls = make_dataclass(name, fields.keys())  #(2)
    cls.from_json = lambda s: cls(**schema.validate(s))  #(3)
    return cls

def greater_than_zero(x):
    return x > 0

quantity = And(Use(int), greater_than_zero)  #(4)

Allocate = command(  #(5)
    'Allocate',
    orderid=int,
    sku=str,
    qty=quantity
)

AddStock = command(
    'AddStock',
    sku=str,
    qty=quantity
)
  1. The command function takes a message name, plus kwargs for the fields of the message payload, where the name of the kwarg is the name of the field, and the value is the parser.

  2. We use the make_dataclass function from the dataclasses module to dynamically create our message type.

  3. We patch the from_json method onto our dynamic dataclass.

  4. We can create reusable parsers for quantity, or sku, etc. to keep things DRY.

  5. Declaring a message type becomes a one-liner.

This comes at the expense of losing the types on your dataclass, so bear that trade-off in mind.
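
One way to soften that trade-off, sketched here with only the standard library rather than the schema library, is to keep an ordinary typed dataclass and derive the parsing from its annotations. The from_json helper, the Quantity type, and the ValidationError class below are hypothetical names chosen for illustration; they are not part of the code above.

```python
import json
from dataclasses import dataclass, fields


class ValidationError(Exception):
    """Hypothetical error type for malformed messages."""


class Quantity(int):
    """A positive integer; the annotation doubles as the parser."""

    def __new__(cls, value):
        number = super().__new__(cls, value)
        if number <= 0:
            raise ValueError(f"quantity must be positive, got {number}")
        return number


def from_json(cls, raw):
    """Build a dataclass from JSON, using each field's annotation as its parser."""
    data = json.loads(raw)
    kwargs = {}
    for field in fields(cls):
        if field.name not in data:
            raise ValidationError(f"missing field: {field.name}")
        try:
            kwargs[field.name] = field.type(data[field.name])
        except (TypeError, ValueError) as e:
            raise ValidationError(f"bad value for {field.name}: {e}") from e
    return cls(**kwargs)  # keys we did not declare are simply never read


@dataclass
class Allocate:
    orderid: str
    sku: str
    qty: Quantity


cmd = from_json(
    Allocate,
    '{"orderid": "o1", "sku": "TASTELESS-LAMP", "qty": "3", "note": "x"}',
)
```

In this sketch the field types stay visible on the class, the extra note key is ignored, and a zero or negative qty raises ValidationError. It is less expressive than a schema, so it only suits messages whose fields can be parsed by calling a type.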

Postel’s Law and the Tolerant Reader Pattern

Postel’s law, or the Robustness Principle, tells us "Be liberal in what you accept, and conservative in what you emit". We think this applies particularly well in the context of integration with our other systems. The idea here is that we should be strict whenever we’re sending messages to other systems, but as lenient as possible when we’re receiving messages from others.

For example, our system could validate the format of a sku. We’ve been using made-up skus like "UNFORGIVING-CUSHION" and "MISBEGOTTEN-POUFFE". These follow a simple pattern: two words separated by a dash, where the first word is an adjective and the second is the type of product.

Developers love to validate this kind of thing in their messages, and reject anything that looks like an invalid sku. This causes horrible problems down the line when some anarchist releases a product named "COMFY-CHAISE-LONGUE" or when a snafu at the supplier results in a shipment of "CHEAP-CARPET-2".

Really, as the allocation system, it’s none of our business what the format of a SKU might be. All we need is an identifier, so we can simply describe it as a string. This means that the Procurement system can change the format whenever they like, and we won’t care.

This same principle applies to order numbers, customer phone numbers, and much more. For the most part, we can ignore the internal structure of strings.

Similarly, developers love to validate incoming messages with tools like JSON schema, or to build libraries that validate incoming messages and share them among systems. This likewise fails the robustness test.

Let’s imagine, for example, that the procurement system adds new fields to the ChangeBatchQuantity message that record the reason for the change, and the email of the user responsible for the change.

Since these fields don’t matter to the allocation service, we should simply ignore them. We can do that in the schema library by passing the keyword arg ignore_extra_keys=True.

This pattern, where we extract only the fields we care about and do minimal validation of them, is the Tolerant Reader pattern.
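
As a rough sketch of a tolerant reader using only the standard library, the function below pulls out just the fields it needs and never looks at the rest. The ref and qty field names, the reason and email keys, and the read_change_batch_quantity helper are assumptions made for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class ChangeBatchQuantity:
    ref: str  # assumed field name: the batch reference, treated as an opaque string
    qty: int  # assumed field name: the new quantity


def read_change_batch_quantity(raw: str) -> ChangeBatchQuantity:
    """Read only the fields this service depends on; ignore everything else."""
    data = json.loads(raw)
    return ChangeBatchQuantity(ref=str(data["ref"]), qty=int(data["qty"]))


# The sender has added fields we have never heard of; the reader is unaffected.
incoming = json.dumps({
    "ref": "batch-001",
    "qty": 50,
    "reason": "stock count correction",
    "email": "someone@example.com",
})
message = read_change_batch_quantity(incoming)
```

Because the reader never enumerates the keys it does not use, the sending system can keep adding fields without a coordinated release.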

Tip
Validate as little as possible. Only read the fields you need, and don’t over-specify their contents. This will help you to stay robust when other systems change over time. Resist the temptation to share message definitions between systems; instead, make it easy to define the data you depend on. For more info, see Martin Fowler’s article on the Tolerant Reader pattern.

Is Postel Always Right?

Mentioning Postel can be quite triggering to some people. They will tell you that Postel is the precise reason that everything on the Internet is broken and we can’t have nice things. Ask Hynek about SSLv3 one day.

We like the tolerant reader approach in the particular context of event-based integration between services that we control, because it allows for independent evolution of those services.

If you’re in charge of an API that’s open to the public on the big bad Internets, there might be good reasons to be more conservative about what inputs you allow.

Validating At The Edge

Earlier, we said that we want to avoid cluttering our code with irrelevant detail. In particular, we don’t want to code defensively inside our domain model. Instead, we want to make sure that requests are known to be valid before our domain model or use-case handlers see them. This helps our code to stay clean and maintainable over the long-term. We sometimes refer to this as "validating at the edge of the system".

As well as keeping your code clean and free of endless checks and asserts, bear in mind that invalid data wandering through your system is a time bomb; the deeper it gets, the more damage it can do, and the fewer tools you have to respond to it.

Back in [chapter_06_uow] we said that the message bus was a great place to put cross-cutting concerns, and validation is a perfect example of that. Here’s how we might change our bus to perform validation for us.

Example 3. Validation
class MessageBus:

    def handle_message(self, name: str, body: str):
        try:
            message_type = next(mt for mt in EVENT_HANDLERS.keys() if mt.__name__ == name)
            message = message_type.from_json(body)
            self.handle([message])
        except StopIteration:
            raise KeyError(f"Unknown message name {name}")
        except ValidationError as e:
            logging.error(
                f'invalid message of type {name}\n'
                f'{body}\n'
                f'{e}'
            )
            raise e

Here’s how we might use that method from our Flask API endpoint.

Example 4. API bubbles up validation errors (src/allocation/flask_app.py)
@app.route("/change_quantity", methods=['POST'])
def change_batch_quantity():
    try:
        # Pass the raw request body through; the bus parses and validates it
        bus.handle_message('ChangeBatchQuantity', request.get_data(as_text=True))
    except ValidationError as e:
        return bad_request(e)
    except exceptions.InvalidSku as e:
        return jsonify({'message': str(e)}), 400
    return 'OK', 200

def bad_request(e: ValidationError):
    return e.code, 400

And here’s how we might plug it in to our asynchronous message processor:

Example 5. Validation errors when handling redis messages (src/allocation/redis_pubsub.py)
def handle_change_batch_quantity(m, bus: messagebus.MessageBus):
    try:
        bus.handle_message('ChangeBatchQuantity', m)
    except ValidationError:
        print('Skipping invalid message')
    except exceptions.InvalidSku as e:
        print(f'Unable to change stock for missing sku {e}')

Notice that our entry points are solely concerned with how to get a message from the outside world, and how to report success or failure. Our message bus takes care of validating our requests and routing them to the correct handler, and our handlers are exclusively focused on the logic of our use case.

Tip
Usually when you receive an invalid message there’s little you can do but log the error and continue. At MADE we use metrics to count how many messages a system receives, and how many of those are successfully processed, skipped, or invalid. Our monitoring tools will alert us if we see spikes in the numbers of bad messages.
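
A minimal sketch of that kind of counting follows. The consume and process functions and the locally defined ValidationError and SkipMessage classes are hypothetical stand-ins, and a real system would send these counts to its metrics tooling rather than keep them in a Counter.

```python
from collections import Counter


class ValidationError(Exception):
    """Stand-in for the validation library's error type."""


class SkipMessage(Exception):
    """Stand-in for a 'nothing to do' signal from a handler."""


metrics = Counter()


def consume(raw_messages, process):
    """Feed each message to `process`, counting outcomes and never stopping on bad input."""
    for raw in raw_messages:
        metrics["received"] += 1
        try:
            process(raw)
        except ValidationError:
            metrics["invalid"] += 1
        except SkipMessage:
            metrics["skipped"] += 1
        else:
            metrics["processed"] += 1


def process(raw):
    # Hypothetical handler: rejects empty payloads, skips duplicates.
    if not raw:
        raise ValidationError("empty message")
    if raw == "duplicate":
        raise SkipMessage("already seen")


consume(["ok", "", "duplicate", "ok"], process)
```

An alert on the ratio of invalid to received messages is one way to notice the spikes described above.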

Validating Semantics

While syntax is concerned with the structure of messages, semantics is the study of meaning in messages. The sentence "undo no dogs from ellipsis four" is syntactically valid, and has the same structure as the sentence "allocate one teapot to order five", but it’s meaningless.

Example 6. A meaningless message
{
  "orderid": "superman",
  "sku": "zygote",
  "qty": -1
}

We can read this JSON blob as an Allocate command but we can’t successfully execute it, because it’s nonsense.

We tend to validate semantic concerns at the message handler layer with a kind of contract-based programming.

Example 7. Preconditions (src/allocation/ensure.py)
"""
This module contains pre-conditions that we apply to our handlers.
"""

class MessageUnprocessable(Exception):  #(1)

    def __init__(self, message):
        self.message = message

class ProductNotFound(MessageUnprocessable):  #(2)
    """
    This exception is raised when we try to perform an action on a product
    that doesn't exist in our database.
    """

    def __init__(self, message):
        super().__init__(message)
        self.sku = message.sku

def product_exists(event, uow):  #(3)
    product = uow.products.get(event.sku)
    if product is None:
        raise ProductNotFound(event)
  1. We use a common base class for errors that mean a message is invalid.

  2. Using a specific error type for this problem makes it easier to report on and handle the error. For example, it’s easy to map ProductNotFound to a 404 in Flask.

  3. product_exists is a precondition. If the condition is False, we raise an error.

This keeps the main flow of our logic in the service layer clean and declarative:

Example 8. ensure in use in services (src/allocation/services.py)
# services.py

from allocation import ensure, model

def allocate(event, uow):
    line = model.OrderLine(event.orderid, event.sku, event.qty)
    with uow:
        ensure.product_exists(event, uow)

        product = uow.products.get(line.sku)
        product.allocate(line)
        uow.commit()

We can extend this technique to make sure that we apply messages idempotently. For example, we want to make sure that we don’t insert a batch of stock more than once.

If we get asked to create a batch that already exists, we’ll log a warning and continue to the next message.

Example 9. Raise SkipMessage exception for ignorable events (src/allocation/services.py)
class SkipMessage(Exception):
    """
    This exception is raised when a message can't be processed, but there's no
    incorrect behavior. For example, we might receive the same message multiple
    times, or we might receive a message that is now out of date.
    """

    def __init__(self, reason):
        self.reason = reason

def batch_is_new(event, uow):
    batch = uow.batches.get(event.batchid)
    if batch is not None:
        raise SkipMessage(f"Batch with id {event.batchid} already exists")

Introducing a SkipMessage exception lets us handle these cases in a generic way in our message bus.

Example 10. The Bus Now Knows How To Skip (src/allocation/messagebus.py)
class MessageBus:

    def handle_message(self, message):
        try:
            ...
        except SkipMessage as e:
            logging.warning(f"Skipping message {message.id} because {e.reason}")
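
To see how these pieces fit together, here is a self-contained sketch with an in-memory stand-in for the unit of work. FakeUnitOfWork, CreateBatch, create_batch, and this free-standing handle_message function are hypothetical simplifications rather than the classes used elsewhere; the point is only that a duplicate message is logged and dropped instead of being applied twice.

```python
import logging
from dataclasses import dataclass


class SkipMessage(Exception):
    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason


@dataclass
class CreateBatch:  # hypothetical message type
    batchid: str
    sku: str
    qty: int


class FakeUnitOfWork:
    """In-memory stand-in: `batches` maps batch id to the message that created it."""

    def __init__(self):
        self.batches = {}


def batch_is_new(event, uow):
    if uow.batches.get(event.batchid) is not None:
        raise SkipMessage(f"Batch with id {event.batchid} already exists")


def create_batch(event, uow):
    batch_is_new(event, uow)  # precondition uses the same unit of work as the handler
    uow.batches[event.batchid] = event


def handle_message(message, uow):
    try:
        create_batch(message, uow)
    except SkipMessage as e:
        logging.warning("Skipping message %s because %s", message, e.reason)


uow = FakeUnitOfWork()
first = CreateBatch("batch-001", "TASTELESS-LAMP", 100)
handle_message(first, uow)
handle_message(CreateBatch("batch-001", "TASTELESS-LAMP", 999), uow)  # duplicate: skipped
```

After both calls, the store still holds only the first batch with its original quantity, so replaying the message is harmless.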

There are a couple of pitfalls to be aware of here. Firstly, we need to be sure that we’re using the same unit of work that we use for the main logic of our use-case. Otherwise we open ourselves to irritating concurrency bugs.

Secondly, we should try to avoid putting all our business logic into these pre-condition checks. As a rule of thumb, if a rule can be tested inside our domain model, then it should be tested in the domain model.

Validating Pragmatics

Pragmatics is the study of how we understand language in context. After we have parsed a message and grasped its meaning, we still need to process it in context. For example, if you get a comment on a pull request saying "I think this is very brave", it may mean that the reviewer admires your courage, unless they’re British, in which case they’re trying to tell you that what you’re doing is insanely risky, and only a fool would attempt it. Context is everything.

In software terms, the pragmatics of an operation are usually managed by the domain model. When we receive a message like "allocate three million units of SCARCE-CLOCK to order 76543", the message is syntactically valid, and semantically valid, but we’re unable to comply because we don’t have the stock available.
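
A minimal sketch of such a rule living in the domain model, assuming a simplified Batch class and an OutOfStock exception (both hypothetical names for this illustration):

```python
from dataclasses import dataclass


class OutOfStock(Exception):
    """Raised by the domain model when a well-formed, meaningful request can't be met."""


@dataclass
class OrderLine:
    orderid: str
    sku: str
    qty: int


@dataclass
class Batch:
    sku: str
    available_quantity: int

    def allocate(self, line: OrderLine) -> None:
        # A business rule, not an input check: do we have enough stock right now?
        if line.qty > self.available_quantity:
            raise OutOfStock(f"Out of stock for sku {line.sku}")
        self.available_quantity -= line.qty


batch = Batch("SCARCE-CLOCK", available_quantity=10)
batch.allocate(OrderLine("76542", "SCARCE-CLOCK", 4))  # fine: 6 units remain

try:
    batch.allocate(OrderLine("76543", "SCARCE-CLOCK", 3_000_000))
except OutOfStock as e:
    outcome = str(e)
```

Nothing about the second order line is malformed or meaningless; whether it succeeds depends entirely on the current state of the batch, which is why the check sits in the model and not at the edge.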

Tip
Once you’ve validated the syntax and semantics of your commands at the edges of your system, the domain is the place for the rest of your validation. Validation of pragmatics is often a core part of your business rules.

Validation Wrap-Up

Validation means different things to different people

When talking about validation, make sure you’re clear about what you’re validating. We find it useful to think about syntax, semantics, and pragmatics: the structure of messages, the meaningfulness of messages, and the business logic governing our response to messages.

Validate at the edge when possible

Validating required fields and the permissible ranges of numbers is boring and we want to keep it out of our nice clean code base. Handlers should only ever receive valid messages.

Only validate what you require

Use the Tolerant Reader pattern: only read the fields your application needs and don’t over-specify their internal structure. If you can treat it as an opaque string, that will make life easier.

Spend time writing helpers for validation

Having a nice declarative way to validate incoming messages and apply preconditions to your handlers will make your codebase much cleaner. It’s worth investing time to make boring code easy to maintain.

Locate each of the three types of validation in the right place

Validating syntax can happen on message classes, validating semantics can happen in the service layer / on the message bus, and validating pragmatics belongs in the domain model.