assignments/parallel-firewall: Add parallel firewall assignment
Add parallel firewall assignment for the compute chapter.

The goal is to introduce students to parallel programming
with a single producer multiple consumer problem that uses
a thread-safe ring buffer to transfer data between threads.

Signed-off-by: Andrei Stan <[email protected]>
andreistan26 committed Nov 3, 2024
1 parent 18277f0 commit f6de3b5
Showing 23 changed files with 1,521 additions and 1 deletion.
2 changes: 1 addition & 1 deletion config.yaml
@@ -303,7 +303,7 @@ docusaurus:
subsections:
- Mini Libc/: chapters/software-stack/libc/projects/mini-libc/
- Memory Allocator/: content/assignments/memory-allocator/
- Parallel Graph/: content/assignments/parallel-graph/
- Parallel Firewall/: content/assignments/parallel-firewall/
- Mini Shell/: content/assignments/minishell/
- Asynchronous Web Server/: content/assignments/async-web-server/
- Exams:
193 changes: 193 additions & 0 deletions content/assignments/parallel-firewall/README.md
@@ -0,0 +1,193 @@
# Parallel Firewall

## Objectives

- Learn how to design and implement parallel programs
- Gain experience using the POSIX threading API
- Learn how to convert a serial program into a parallel one

## Statement

A firewall is a program that checks network packets against a series of filters, which decide whether each packet is dropped or allowed to continue to the upper levels of the TCP/IP stack.
In this project, instead of a network card, a producer thread places packets into a circular buffer, from which consumer threads take packets and process them to decide whether they advance to the upper levels of the stack.
You will have to implement the circular buffer and the consumer threads so that the firewall produces a log file with its decisions ordered by timestamp.

## Support Code

The support code consists of the directories:

- `src/` contains the skeleton for the parallelized firewall and the already implemented serial code.
  You will have to implement the missing parts marked as `TODO`.

- `utils/` contains utility files used for debugging and logging.

- `tests/` contains tests used to validate and grade the assignment.

## Implementation

### Firewall Threads

In order to parallelize the firewall we have to distribute the packets to multiple threads.
The packets will be added to a shared data structure (visible to all threads) by a `producer` thread and processed by multiple `consumer` threads.
Each `consumer` thread picks a packet from the shared data structure, checks it against the filter function and writes the packet hash together with the drop/accept decision to a log file.
`consumer` threads stop waiting for new packets from the `producer` thread and exit when the `producer` thread closes the connection to the shared data structure.

The `consumer` threads **must not do any form of busy waiting**.
When there are new packets that need to be handled, the `consumer` threads must be **notified**.
**Waiting in a `while()` loop or sleeping is not considered a valid synchronization mechanism and points will be deducted.**
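
One way to notify sleeping consumers is a condition variable paired with a mutex. The sketch below is only an illustration under stated assumptions: `struct packet_signal` and its functions are hypothetical names that do not appear in the skeleton. Note that the `while` around `pthread_cond_wait()` is not busy waiting: the thread sleeps inside the call, and the loop only re-checks the condition after a wake-up (wake-ups can be spurious).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical helper, not part of the assignment skeleton. */
struct packet_signal {
	pthread_mutex_t lock;
	pthread_cond_t has_work;
	int pending;	/* packets announced but not yet claimed */
	bool closed;	/* set once the producer is done */
};

void packet_signal_init(struct packet_signal *s)
{
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->has_work, NULL);
	s->pending = 0;
	s->closed = false;
}

/* Producer side: announce one packet and wake one sleeping consumer. */
void packet_signal_post(struct packet_signal *s)
{
	pthread_mutex_lock(&s->lock);
	s->pending++;
	pthread_cond_signal(&s->has_work);
	pthread_mutex_unlock(&s->lock);
}

/* Producer side: no more packets will come; wake every consumer. */
void packet_signal_close(struct packet_signal *s)
{
	pthread_mutex_lock(&s->lock);
	s->closed = true;
	pthread_cond_broadcast(&s->has_work);
	pthread_mutex_unlock(&s->lock);
}

/*
 * Consumer side: sleep until a packet is pending or the producer closed.
 * Returns true if a packet was claimed, false if closed and drained.
 */
bool packet_signal_wait(struct packet_signal *s)
{
	bool got;

	pthread_mutex_lock(&s->lock);
	while (s->pending == 0 && !s->closed)
		pthread_cond_wait(&s->has_work, &s->lock);	/* sleeps, no spinning */
	assert(s->pending >= 0);
	got = s->pending > 0;
	if (got)
		s->pending--;
	pthread_mutex_unlock(&s->lock);
	return got;
}
```

A consumer would call `packet_signal_wait()` in its main loop and exit once it returns `false`.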

Implement the `consumer` related functions marked with `TODO` in the `src/consumer.c` file.

### Ring Buffers

A ring buffer (or a circular buffer) is a data structure that stores its elements in a circular fixed size array.
One of the advantages of using such a data structure as opposed to an array is that it acts as a FIFO, without the overhead of moving the elements to the left as they are consumed.
The shared ring buffer has the following fields:

- `write_pos`: index in the buffer used by the `producer` thread for appending new packets.
- `read_pos`: index in the buffer used by the `consumer` threads to pick packets.
- `cap`: the size of the internal buffer.
- `data`: pointer to the internal buffer.
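
To illustrate how `read_pos` and `write_pos` wrap around the fixed-size array, here is a minimal single-threaded sketch. It is not the skeleton's code: `struct demo_ring`, its functions, and the extra `len` counter (used to tell a full buffer from an empty one) are assumptions made for this example, and no synchronization is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single-threaded ring; `len` is an assumed extra field. */
struct demo_ring {
	char *data;
	size_t cap;
	size_t read_pos;
	size_t write_pos;
	size_t len;	/* bytes currently stored */
};

/* Copy `size` bytes in. Returns 0 on success, -1 if there is no room. */
int demo_ring_put(struct demo_ring *r, const void *src, size_t size)
{
	size_t first;

	assert(r->len <= r->cap);
	if (size > r->cap - r->len)
		return -1;
	/* Part that fits before the end of the array; the rest wraps to index 0. */
	first = r->cap - r->write_pos;
	if (first > size)
		first = size;
	memcpy(r->data + r->write_pos, src, first);
	memcpy(r->data, (const char *)src + first, size - first);
	r->write_pos = (r->write_pos + size) % r->cap;
	r->len += size;
	return 0;
}

/* Copy `size` bytes out. Returns 0 on success, -1 if too few are stored. */
int demo_ring_get(struct demo_ring *r, void *dst, size_t size)
{
	size_t first;

	assert(r->len <= r->cap);
	if (size > r->len)
		return -1;
	first = r->cap - r->read_pos;
	if (first > size)
		first = size;
	memcpy(dst, r->data + r->read_pos, first);
	memcpy((char *)dst + first, r->data, size - first);
	r->read_pos = (r->read_pos + size) % r->cap;
	r->len -= size;
	return 0;
}
```

Consuming data only advances `read_pos`; nothing is shifted inside the array, which is the advantage over a plain array mentioned above.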

Apart from these fields you have to add synchronization primitives in order to allow multiple threads to access the ring buffer in a deterministic manner.
You can use mutexes, semaphores, condition variables and other synchronization mechanisms offered by the `pthread` library.

You will have to implement the following interface for the ring buffer:

- `ring_buffer_init()`: initialize the ring buffer (allocate memory and synchronization primitives).
- `ring_buffer_enqueue()`: add elements to the ring buffer.
- `ring_buffer_dequeue()`: remove elements from the ring buffer.
- `ring_buffer_destroy()`: free up the memory used by the ring buffer.
- `ring_buffer_stop()`: finish up using the ring buffer for the calling thread.
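
As a rough illustration of how such an interface could fit together, the sketch below implements a blocking ring of `int` values with one mutex and two condition variables. Everything here is an assumption made for the example: the `sync_ring_*` names, the signatures, the `int` element type, and the stop semantics (enqueue fails after stop, dequeue drains the remaining elements and then fails). The assignment's real signatures are defined in `src/ring_buffer.h`, and its buffer stores packet bytes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical ring of ints, for illustration only. */
struct sync_ring {
	int *data;
	size_t cap, read_pos, write_pos, len;
	bool stopped;
	pthread_mutex_t lock;
	pthread_cond_t not_empty;
	pthread_cond_t not_full;
};

int sync_ring_init(struct sync_ring *r, size_t cap)
{
	assert(cap > 0);
	r->data = malloc(cap * sizeof(*r->data));
	if (r->data == NULL)
		return -1;
	r->cap = cap;
	r->read_pos = r->write_pos = r->len = 0;
	r->stopped = false;
	pthread_mutex_init(&r->lock, NULL);
	pthread_cond_init(&r->not_empty, NULL);
	pthread_cond_init(&r->not_full, NULL);
	return 0;
}

/* Sleeps while the ring is full. Returns 0, or -1 if the ring was stopped. */
int sync_ring_enqueue(struct sync_ring *r, int value)
{
	pthread_mutex_lock(&r->lock);
	while (r->len == r->cap && !r->stopped)
		pthread_cond_wait(&r->not_full, &r->lock);
	if (r->stopped) {
		pthread_mutex_unlock(&r->lock);
		return -1;
	}
	r->data[r->write_pos] = value;
	r->write_pos = (r->write_pos + 1) % r->cap;
	r->len++;
	pthread_cond_signal(&r->not_empty);
	pthread_mutex_unlock(&r->lock);
	return 0;
}

/* Sleeps while the ring is empty. Returns 0, or -1 once stopped and drained. */
int sync_ring_dequeue(struct sync_ring *r, int *value)
{
	pthread_mutex_lock(&r->lock);
	while (r->len == 0 && !r->stopped)
		pthread_cond_wait(&r->not_empty, &r->lock);
	if (r->len == 0) {
		pthread_mutex_unlock(&r->lock);
		return -1;
	}
	*value = r->data[r->read_pos];
	r->read_pos = (r->read_pos + 1) % r->cap;
	r->len--;
	pthread_cond_signal(&r->not_full);
	pthread_mutex_unlock(&r->lock);
	return 0;
}

/* Mark the ring as finished and wake every sleeping thread. */
void sync_ring_stop(struct sync_ring *r)
{
	pthread_mutex_lock(&r->lock);
	r->stopped = true;
	pthread_cond_broadcast(&r->not_empty);
	pthread_cond_broadcast(&r->not_full);
	pthread_mutex_unlock(&r->lock);
}

void sync_ring_destroy(struct sync_ring *r)
{
	pthread_cond_destroy(&r->not_full);
	pthread_cond_destroy(&r->not_empty);
	pthread_mutex_destroy(&r->lock);
	free(r->data);
}
```

Letting dequeue drain the remaining elements after a stop means no packet is lost when the producer finishes before the consumers do.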

### Log File

The output of the firewall will be a log file with the rows containing the firewall's decision, the hash of the packet and its timestamp.
The actual format can be found in the serial implementation (at `src/serial.c`).

When processing the packets in parallel, the threads will finish their work in a non-deterministic order.
We would like the logs to be sorted by packet timestamp, that is, in the order in which the packets came in from the producer.
Thus, the `consumers` should insert the packet information into the log file such that the result is ordered by timestamp.

The logs must be written to the file in ascending order during packet processing.
**Sorting the log file after the consumer threads have finished processing is not considered a valid synchronization mechanism and points will be deducted.**
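
One possible way to keep the log ordered while packets are still being processed is a "ticket" scheme: each packet receives a sequence number in arrival order, and a consumer may write its line only when that number comes up. The sketch below is one option among several, not the reference solution. `struct log_gate` and its functions are hypothetical, and it assumes that packets leave the buffer in ascending timestamp order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical helper: lets writers through strictly in ticket order. */
struct log_gate {
	pthread_mutex_t lock;
	pthread_cond_t turn_changed;
	unsigned long next_ticket;	/* next number to hand out */
	unsigned long now_serving;	/* number currently allowed to write */
};

void log_gate_init(struct log_gate *g)
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->turn_changed, NULL);
	g->next_ticket = 0;
	g->now_serving = 0;
}

/*
 * Hand out the next sequence number. For the ordering to hold, this must
 * happen atomically with taking the packet out of the shared buffer
 * (for example inside the same critical section); this sketch does not
 * enforce that by itself.
 */
unsigned long log_gate_take_ticket(struct log_gate *g)
{
	unsigned long ticket;

	pthread_mutex_lock(&g->lock);
	ticket = g->next_ticket++;
	pthread_mutex_unlock(&g->lock);
	return ticket;
}

/* Sleep until it is this ticket's turn, write the line, pass the turn on. */
void log_gate_write(struct log_gate *g, unsigned long ticket,
		    FILE *out, const char *line)
{
	pthread_mutex_lock(&g->lock);
	/* A ticket that was already served would wait forever. */
	assert(ticket >= g->now_serving);
	while (g->now_serving != ticket)
		pthread_cond_wait(&g->turn_changed, &g->lock);
	fputs(line, out);
	g->now_serving++;
	pthread_cond_broadcast(&g->turn_changed);
	pthread_mutex_unlock(&g->lock);
}
```

The filter itself still runs in parallel, outside the gate; only the final write is serialized, so no sorting pass is needed once the threads finish.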

## Operations

### Building

To build both the serial and the parallel versions, run `make` in the `src/` directory:

```console
student@so:~/.../content/assignments/parallel-firewall$ cd src/

student@so:~/.../assignments/parallel-firewall/src$ make
```

That will create the `serial` and `firewall` binaries.

## Testing and Grading

Testing is automated.
Tests are located in the `tests/` directory.

To test and grade your assignment solution, enter the `tests/` directory and run `grade.sh`.

```console
student@so:~/.../content/assignments/parallel-firewall$ cd tests/
```

```console
student@so:~/.../content/assignments/parallel-firewall/tests$ ./grade.sh
```

Note that this requires the linters to be available.
The easiest way to test the project is to use a Docker-based setup with everything installed and configured (see the [README.checker.md](README.checker.md) file for instructions).

To create the tests, run:

```console
student@so:~/.../content/assignments/parallel-firewall/tests$ make check
```

To remove the tests, run:

```console
student@so:~/.../content/assignments/parallel-firewall/tests$ make distclean
```

When using `grade.sh` you will get a maximum of 90/100 points for general correctness and a maximum of 10/100 points for coding style.

### Restrictions

- Threads must yield the CPU when waiting on an empty or full buffer, i.e. they must not busy-wait.
- The logs must be written in ascending timestamp order as the packets are processed, not after the processing is done.

### Grades

- 10 points are awarded for a single-consumer solution that also implements the ring buffer.
- 50 points are awarded for a multi-consumer solution.
- 30 points are awarded for a multi-consumer solution that writes the logs in sorted order (bearing in mind the above restrictions).

### Running the Checker

Each test is worth a number of points.
The maximum grade is `90`.

A successful run will show the output:

```console
student@so:~/.../assignments/parallel-firewall/tests$ make check
[...]
Test [ 10 packets, sort False, 1 thread ] ...................... passed ... 3
Test [ 1,000 packets, sort False, 1 thread ] ...................... passed ... 3
Test [20,000 packets, sort False, 1 thread ] ...................... passed ... 4
Test [ 10 packets, sort True , 2 threads] ...................... passed ... 5
Test [ 10 packets, sort True , 4 threads] ...................... passed ... 5
Test [ 100 packets, sort True , 2 threads] ...................... passed ... 5
Test [ 100 packets, sort True , 4 threads] ...................... passed ... 5
Test [ 1,000 packets, sort True , 2 threads] ...................... passed ... 5
Test [ 1,000 packets, sort True , 4 threads] ...................... passed ... 5
Test [10,000 packets, sort True , 2 threads] ...................... passed ... 5
Test [10,000 packets, sort True , 4 threads] ...................... passed ... 5
Test [20,000 packets, sort True , 2 threads] ...................... passed ... 5
Test [20,000 packets, sort True , 4 threads] ...................... passed ... 5
Test [ 1,000 packets, sort False, 4 threads] ...................... passed ... 5
Test [ 1,000 packets, sort False, 8 threads] ...................... passed ... 5
Test [10,000 packets, sort False, 4 threads] ...................... passed ... 5
Test [10,000 packets, sort False, 8 threads] ...................... passed ... 5
Test [20,000 packets, sort False, 4 threads] ...................... passed ... 5
Test [20,000 packets, sort False, 8 threads] ...................... passed ... 5

Checker: 90/100
```

### Running the Linters

To run the linters, use the `make lint` command in the `tests/` directory:

```console
student@so:~/.../assignments/parallel-firewall/tests$ make lint
[...]
cd .. && checkpatch.pl -f checker/*.sh tests/*.sh
[...]
cd .. && cpplint --recursive src/ tests/ checker/
[...]
cd .. && shellcheck checker/*.sh tests/*.sh
```

Note that the linters have to be installed on your system: [`checkpatch.pl`](https://github.com/torvalds/linux/blob/master/scripts/checkpatch.pl), [`cpplint`](https://github.com/cpplint/cpplint), [`shellcheck`](https://www.shellcheck.net/).
They also need to have certain configuration options.
It's easiest to run them in a Docker-based setup with everything configured.

### Fine-Grained Testing

Input test cases are located in `tests/in/` and are generated by the checker.
The expected results are generated by the checker by running the serial implementation.
To run a single test, use the command below while in the `src/` directory:

```console
student@so:~/.../assignments/parallel-firewall/src$ ./firewall ../tests/in/test_<num_packets>.in <output_file> <number_of_consumers>
```

The results of the serial and parallel implementations must be identical for the test to pass.
Empty file.
32 changes: 32 additions & 0 deletions content/assignments/parallel-firewall/src/Makefile
@@ -0,0 +1,32 @@
BUILD_DIR := build
UTILS_PATH ?= ../utils
CPPFLAGS := -I$(UTILS_PATH)
CFLAGS := -Wall -Wextra

CFLAGS += -ggdb -O0
LDLIBS := -lpthread

SRCS:= ring_buffer.c producer.c consumer.c packet.c $(UTILS_PATH)/log/log.c
HDRS := $(patsubst %.c,%.h,$(SRCS))
OBJS := $(patsubst %.c,%.o,$(SRCS))

.PHONY: all pack clean always

all: firewall serial

firewall: $(OBJS) firewall.o
	$(CC) $(CPPFLAGS) $(CFLAGS) -o $@ $^ $(LDLIBS)

serial: $(OBJS) serial.o
	$(CC) $(CPPFLAGS) $(CFLAGS) -o $@ $^ $(LDLIBS)

$(UTILS_PATH)/log/log.o: $(UTILS_PATH)/log/log.c $(UTILS_PATH)/log/log.h
	$(CC) $(CPPFLAGS) $(CFLAGS) -c -o $@ $<

pack: clean
	-rm -f ../src.zip
	zip -r ../src.zip *

clean:
	-rm -f $(OBJS) serial.o firewall.o
	-rm -f firewall serial
35 changes: 35 additions & 0 deletions content/assignments/parallel-firewall/src/consumer.c
@@ -0,0 +1,35 @@
// SPDX-License-Identifier: BSD-3-Clause

#include <pthread.h>
#include <fcntl.h>
#include <unistd.h>

#include "consumer.h"
#include "ring_buffer.h"
#include "packet.h"
#include "utils.h"

void consumer_thread(so_consumer_ctx_t *ctx)
{
	/* TODO: implement consumer thread */
	(void) ctx;
}

int create_consumers(pthread_t *tids,
		int num_consumers,
		struct so_ring_buffer_t *rb,
		const char *out_filename)
{
	(void) tids;
	(void) num_consumers;
	(void) rb;
	(void) out_filename;

	for (int i = 0; i < num_consumers; i++) {
		/*
		 * TODO: Launch consumer threads
		 */
	}

	return num_consumers;
}
20 changes: 20 additions & 0 deletions content/assignments/parallel-firewall/src/consumer.h
@@ -0,0 +1,20 @@
/* SPDX-License-Identifier: BSD-3-Clause */

#ifndef __SO_CONSUMER_H__
#define __SO_CONSUMER_H__

#include "ring_buffer.h"
#include "packet.h"

typedef struct so_consumer_ctx_t {
	struct so_ring_buffer_t *producer_rb;

	/* TODO: add synchronization primitives for timestamp ordering */
} so_consumer_ctx_t;

int create_consumers(pthread_t *tids,
		int num_consumers,
		so_ring_buffer_t *rb,
		const char *out_filename);

#endif /* __SO_CONSUMER_H__ */
78 changes: 78 additions & 0 deletions content/assignments/parallel-firewall/src/firewall.c
@@ -0,0 +1,78 @@
// SPDX-License-Identifier: BSD-3-Clause

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <pthread.h>

#include "ring_buffer.h"
#include "consumer.h"
#include "producer.h"
#include "log/log.h"
#include "packet.h"
#include "utils.h"

#define SO_RING_SZ (PKT_SZ * 1000)

pthread_mutex_t MUTEX_LOG;

void log_lock(bool lock, void *udata)
{
	pthread_mutex_t *LOCK = (pthread_mutex_t *) udata;

	if (lock)
		pthread_mutex_lock(LOCK);
	else
		pthread_mutex_unlock(LOCK);
}

void __attribute__((constructor)) init()
{
	pthread_mutex_init(&MUTEX_LOG, NULL);
	log_set_lock(log_lock, &MUTEX_LOG);
}

void __attribute__((destructor)) dest()
{
	pthread_mutex_destroy(&MUTEX_LOG);
}

int main(int argc, char **argv)
{
	so_ring_buffer_t ring_buffer;
	int num_consumers, threads, rc;
	pthread_t *thread_ids = NULL;

	if (argc < 4) {
		fprintf(stderr, "Usage %s <input-file> <output-file> <num-consumers:1-32>\n", argv[0]);
		exit(EXIT_FAILURE);
	}

	rc = ring_buffer_init(&ring_buffer, SO_RING_SZ);
	DIE(rc < 0, "ring_buffer_init");

	num_consumers = strtol(argv[3], NULL, 10);

	if (num_consumers <= 0 || num_consumers > 32) {
		fprintf(stderr, "num-consumers [%d] must be in the interval [1-32]\n", num_consumers);
		exit(EXIT_FAILURE);
	}

	thread_ids = calloc(num_consumers, sizeof(pthread_t));
	DIE(thread_ids == NULL, "calloc pthread_t");

	/* create consumer threads */
	threads = create_consumers(thread_ids, num_consumers, &ring_buffer, argv[2]);

	/* start publishing data */
	publish_data(&ring_buffer, argv[1]);

	/* TODO: wait for child processes to finish execution */
	(void) threads;

	free(thread_ids);

	return 0;
}
