Logflow collects nginx layer 7 data and layer 4 packet data and sends it to Kafka, serialized with Protocol Buffers. Because messages are encoded with Protocol Buffers, serialization is compact and fast. The project is still at a very early stage and will keep improving.
It takes the data from a gRPC request and publishes it to Kafka. Logflow uses gRPC client streaming
to collect the data.
A single request can be published to multiple topics, and the topics can change on every request. At the time of writing,
all possible topics must be listed in the .env
file before you start the server.
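The topic rule above can be sketched as a small check: load the configured topics once at startup, then reject any topic in a request that was not configured. This is only an illustration under stated assumptions. The variable name `KAFKA_TOPICS`, the comma-separated format, and both function names are hypothetical and are not taken from the Logflow source.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseTopics splits a comma-separated list (as it might appear in .env)
// into a set of allowed topic names, ignoring blank entries.
func parseTopics(raw string) map[string]bool {
	allowed := make(map[string]bool)
	for _, t := range strings.Split(raw, ",") {
		if t = strings.TrimSpace(t); t != "" {
			allowed[t] = true
		}
	}
	return allowed
}

// rejectedTopics returns the requested topics that were not configured.
func rejectedTopics(allowed map[string]bool, requested []string) []string {
	var rejected []string
	for _, t := range requested {
		if !allowed[t] {
			rejected = append(rejected, t)
		}
	}
	return rejected
}

func main() {
	// KAFKA_TOPICS is a hypothetical variable name used only in this sketch.
	raw := os.Getenv("KAFKA_TOPICS")
	if raw == "" {
		raw = "hello_world,nginx_logs"
	}
	allowed := parseTopics(raw)

	// A request may name several topics; report the ones that are not allowed.
	fmt.Println(rejectedTopics(allowed, []string{"hello_world", "unknown_topic"}))
}
```

Validating against a fixed set at startup keeps a misspelled topic in a request from silently creating or targeting an unintended topic.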
- Kafka
- Go
- Protocol Buffers (for development only)
Docker environments are not ready yet, so the dependencies need to be installed manually.
First, clone the repository:
// https
git clone https://github.com/thearyanahmed/logflow.git
// or ssh
git clone git@github.com:thearyanahmed/logflow.git
// github cli
gh repo clone thearyanahmed/logflow
Then cd into the directory:
cd logflow
If you don't have ZooKeeper and Kafka running, follow the ZooKeeper and Kafka instructions further down.
Once ZooKeeper and Kafka are running, copy .env.example to .env in the logflow
directory:
cp .env.example .env
Make sure the .env values are set up correctly, especially if you have changed any Kafka config or if some default ports/values are already in use on your machine.
To start the server, run
go run main.go --action serve
This starts the TCP server on RPC_PORT
(from .env). The default is 5053.
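As a rough sketch of the port handling described above, the snippet below reads RPC_PORT from the environment and falls back to 5053 when it is unset. The helper name `rpcAddr` is hypothetical, and the real server may load .env differently (for example through a dotenv library).

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// rpcAddr builds the listen address from the given port,
// falling back to the documented default of 5053.
func rpcAddr(port string) string {
	if port == "" {
		port = "5053"
	}
	return net.JoinHostPort("", port)
}

func main() {
	addr := rpcAddr(os.Getenv("RPC_PORT"))

	// Open the TCP listener that a gRPC server would then be attached to.
	lis, err := net.Listen("tcp", addr)
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer lis.Close()

	fmt.Println("listening on", lis.Addr())
}
```

If the listen call fails, the most likely cause is that the port is already in use, which is the case the .env note above warns about.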
Then start the UDP server by running
go run server/udp_server.go
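For readers unfamiliar with UDP listeners in Go, here is a minimal sketch of the general idea: bind a UDP socket, then read one datagram at a time. It is not the code in server/udp_server.go. The function name `roundTrip` is hypothetical, and the demo sends a datagram to itself on a loopback port chosen by the OS so that it can run without nginx.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// roundTrip opens a UDP listener on a free loopback port, sends payload
// to it, and returns whatever the listener received.
func roundTrip(payload string) (string, error) {
	// Port 0 asks the OS for any free port; a real server would use a fixed one.
	conn, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer conn.Close()

	// Stand in for nginx by sending a single datagram to the listener.
	sender, err := net.Dial("udp", conn.LocalAddr().String())
	if err != nil {
		return "", err
	}
	defer sender.Close()
	if _, err := sender.Write([]byte(payload)); err != nil {
		return "", err
	}

	// Do not block forever if the datagram is lost.
	if err := conn.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return "", err
	}
	buf := make([]byte, 65535) // large enough for any single UDP datagram
	n, _, err := conn.ReadFrom(buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	msg, err := roundTrip("hello from a fake nginx")
	if err != nil {
		fmt.Println("udp error:", err)
		return
	}
	fmt.Println("received:", msg)
}
```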
Add the following to your nginx config:
log_format tufin escape=json
'{'
'"time":"$msec",'
'"connection":"$connection",'
'"request":"$request",'
'"status":"$status",'
'"user_agent":"$http_user_agent"'
'}';
# and inside your server block
server {
...
access_log syslog:server=localhost:6060,facility=local7,tag=nginx,severity=info tufin;
}
Make sure the port in server=$host:$port
matches .env's UDP_SERVER_PROT.
After that, visit any page of the web app served by this nginx config, and it should start streaming access logs to Kafka.
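To illustrate what arrives on the UDP side, the sketch below decodes one syslog datagram carrying the JSON produced by the log_format above. The struct and function names are hypothetical, and the sample syslog header is an assumption for illustration. The parser simply skips everything before the first `{`, so it does not depend on the exact header layout.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// AccessLog mirrors the fields of the log_format above.
// The type name is hypothetical, not taken from the Logflow source.
type AccessLog struct {
	Time       string `json:"time"`
	Connection string `json:"connection"`
	Request    string `json:"request"`
	Status     string `json:"status"`
	UserAgent  string `json:"user_agent"`
}

// parseSyslogLine skips the syslog header and decodes the JSON payload.
func parseSyslogLine(line string) (AccessLog, error) {
	var entry AccessLog
	start := strings.Index(line, "{")
	if start < 0 {
		return entry, errors.New("no JSON payload found")
	}
	err := json.Unmarshal([]byte(line[start:]), &entry)
	return entry, err
}

func main() {
	// Sample datagram; the header before the JSON is an assumed layout.
	line := `<190>Oct 11 22:14:15 web1 nginx: {"time":"1633990455.123","connection":"7","request":"GET / HTTP/1.1","status":"200","user_agent":"curl/7.79.1"}`

	entry, err := parseSyslogLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(entry.Status, entry.Request)
}
```

All fields are strings because the log_format wraps every variable in quotes, including the status code and the timestamp.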
Notes
- Before running the client, make sure you have ZooKeeper and Kafka running, and the topics have been created.
- The current Docker image does not work. I believe this is due to a bug described in wurstmeister/kafka-docker#issue-516; I get the same error as in segmentio/kafka-go#issues/682. For the time being, you'll need to run Kafka and ZooKeeper manually.
- The program is at a very early stage, and its structure is subject to change.
First run ZooKeeper and the Kafka broker. Go to your Kafka installation directory and run the following, each in its own terminal:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
If this is your first time running them, you probably don't have any topics. You'll need at least one. The following command
creates a topic named hello_world:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic hello_world
- Running a Kafka consumer (optional): use this if you want to watch the data arriving in Kafka directly.
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic hello_world --from-beginning
There is no Docker image for this yet, but there will be soon. If you run Kafka inside Docker, use your Docker container's address in the .env, e.g. localhost:39092.
At the moment there is only one flag you can pass when running main.go:
--action
Possible values: serve, client, help. E.g.: go run main.go --action serve
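A flag like this could be handled with Go's standard flag package, roughly as sketched below. This is not the actual main.go. The helper `validAction` and the choice of `help` as the default value are assumptions made for the example.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// validAction reports whether action is one of the documented values.
func validAction(action string) bool {
	switch action {
	case "serve", "client", "help":
		return true
	}
	return false
}

func main() {
	// Go's flag package accepts both -action and --action.
	action := flag.String("action", "help", "one of: serve, client, help")
	flag.Parse()

	if !validAction(*action) {
		fmt.Fprintf(os.Stderr, "unknown action %q\n", *action)
		os.Exit(2)
	}
	fmt.Println("selected action:", *action)
}
```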
Enterprise Network Flow Collector (IPFIX, sFlow, Netflow) from Verizon Media
The high-scalability sFlow/NetFlow/IPFIX collector used internally at Cloudflare
GopherCon 2016: John Leon - Packet Capture, Analysis, and Injection with Go
Capturing HTTP packets the hard way
LISA16: Linux 4.X Tracing Tools: Using BPF Superpowers
Sniffing Creds with Go, A Journey with libpcap
Collecting NGINX Plus Monitoring Statistics with Go
Packet Capture, Injection, and Analysis with Go Packet
These ^ are gems
- Installations & setup
- Setup kafka
- Use docker container for kafka
- Decide on protobuf ( grpc / rpc )
- Setup a nginx docker
- Enable .env support
- Draw system architecture
- Connect to kafka
- Setup a kafka client to receive messages
- Setup a kafka producer
- Write tests
- Get some log data and test the whole system
- Dockerize full app
- Prepare dummy NginxLogRequest, Packet & Headers
- Prepare a client to test grpc and kafka producer with dummy data
- Also see manually if kafka consumer is consuming messages
- Document everything (ongoing)