Proof of concept | Collecting nginx log data and pushing it to kafka with the help of gRPC.


Logflow

What does it do?

Logflow collects nginx layer 7 (access log) data and layer 4 packet data and sends it to Kafka, serialized with protocol buffers. Protobuf keeps the payloads compact and fast to encode and decode. The project is still at a very early stage, and we will keep improving it.

Logflow receives the data over a client-streaming gRPC connection and publishes it into Kafka.

It can publish a single request into multiple topics, and the topics can change on every request. At the time of writing, however, all possible topics must be declared (in the .env file) before you start the server.
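As a sketch of what a client-streaming contract with per-request topics could look like (the service, message, and field names below are illustrative, not the project's actual .proto; only NginxLogRequest is a name that appears in this README):

```proto
syntax = "proto3";

package logflow;

// Client-streaming RPC: the client pushes many log entries over a
// single connection; the server answers once when the stream closes.
service LogService {
  rpc Collect (stream NginxLogRequest) returns (PublishSummary);
}

message NginxLogRequest {
  // A single request can target several Kafka topics.
  repeated string topics = 1;
  // Serialized log payload (layer 7 access log or layer 4 packet data).
  bytes payload = 2;
}

message PublishSummary {
  int64 received = 1;
}
```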

Dependencies

  1. Kafka
  2. Go
  3. Protocol Buffers (for development)

A Docker environment is not ready yet, so the dependencies need to be installed manually.

Running the Application

First, clone the repository:

// https 
git clone https://github.com/thearyanahmed/logflow.git

// or ssh
git clone [email protected]:thearyanahmed/logflow.git

// github cli
gh repo clone thearyanahmed/logflow

Then cd into the directory:

cd logflow

If you don't have ZooKeeper and Kafka running, you can follow the instructions in the Start ZooKeeper and Kafka section below.

.ENV

Once you have ZooKeeper and Kafka running, in the logflow directory run

cp .env.example .env

Make sure the .env values are set up correctly, in case you have changed any Kafka config or if some default ports/values are already in use on your machine.
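For reference, a minimal .env might look like the sketch below. The values are the ones referenced elsewhere in this README (5053 as the gRPC default, 6060 in the nginx example), but the key names are assumptions; check .env.example for the authoritative list.

```
# Illustrative key names; values match the defaults mentioned in this README.
RPC_PORT=5053
UDP_SERVER_PORT=6060
KAFKA_BROKER=localhost:9092
KAFKA_TOPICS=hello_world
```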

To start the server, run

go run main.go --action serve

This starts a TCP server listening on RPC_PORT (from .env). The default is 5053.

Then start the UDP server by running

go run server/udp_server.go

Add the following to your nginx config,

log_format tufin escape=json
'{'
     '"time":"$msec",'
     '"connection":"$connection",'
     '"request":"$request",'
     '"status":"$status",'
     '"user_agent":"$http_user_agent"'
'}';


# and inside your server block 
server {
    ...
    access_log syslog:server=localhost:6060,facility=local7,tag=nginx,severity=info tufin;
}

Make sure the server=$host:$port value matches UDP_SERVER_PORT from .env.

After that, visit any page of the web app that uses this nginx config; it should start streaming access logs to Kafka.
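As a rough sketch of what the UDP server has to do with each datagram (an illustration, not Logflow's actual code): nginx prepends a syslog header to every message, so the JSON document produced by the log_format above starts at the first `{`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// accessLog mirrors the fields emitted by the "tufin" log_format above.
type accessLog struct {
	Time       string `json:"time"`
	Connection string `json:"connection"`
	Request    string `json:"request"`
	Status     string `json:"status"`
	UserAgent  string `json:"user_agent"`
}

// parsePayload extracts the JSON body from a syslog datagram such as
// `<190>Oct 10 13:55:36 host nginx: {...}` by skipping everything
// before the first '{'.
func parsePayload(datagram string) (*accessLog, error) {
	i := strings.Index(datagram, "{")
	if i < 0 {
		return nil, fmt.Errorf("no JSON payload in datagram")
	}
	var entry accessLog
	if err := json.Unmarshal([]byte(datagram[i:]), &entry); err != nil {
		return nil, err
	}
	return &entry, nil
}

func main() {
	sample := `<190>Oct 10 13:55:36 localhost nginx: {"time":"1633873970.123","connection":"7","request":"GET / HTTP/1.1","status":"200","user_agent":"curl/7.68.0"}`
	entry, err := parsePayload(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(entry.Request, entry.Status) // GET / HTTP/1.1 200
}
```

In the real server this parsing would happen inside the UDP read loop before the entry is handed to the gRPC client.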

Notes

  • Before running the client, make sure you have ZooKeeper and Kafka running, and the topics have been created.
  • The current Docker image does not work, which I believe is due to a bug in wurstmeister/kafka-docker#issue-516; I get the same error as segmentio/kafka-go#issues/682. Thus, you'll need to run Kafka and ZooKeeper manually for the time being.
  • The program is at a very early stage. Program structure is subject to change.

Architecture

Logflow Architecture

Start ZooKeeper and Kafka

First run ZooKeeper and Kafka brokers. Go to your kafka installation directory and run

bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties

If this is your first time running them, you probably don't have any topics yet; you'll need at least one. The following command creates a topic named hello_world:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic hello_world
  • Running a Kafka consumer (optional), in case you want to watch the data arrive in Kafka directly:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic hello_world --from-beginning

With Docker

There is no Docker image ready for this yet, but one is planned. If you want to run Kafka inside Docker, simply use your container's address, e.g. localhost:39092, in the .env file.

Flags

At the moment there is only one flag you can pass when running main.go.

  1. --action Possible values: serve, client, help. Example: go run main.go --action serve

Useful Resources

Enterprise Network Flow Collector (IPFIX, sFlow, Netflow) from Verizon Media

The high-scalability sFlow/NetFlow/IPFIX collector used internally at Cloudflare

GopherCon 2016: John Leon - Packet Capture, Analysis, and Injection with Go

Capturing HTTP packets the hard way

LISA16: Linux 4.X Tracing Tools: Using BPF Superpowers

Sniffing Creds with Go, A Journey with libpcap

Collecting NGINX Plus Monitoring Statistics with Go

GoPacket by Google

Packet Capture, Injection, and Analysis with Go Packet

BCC HTTP Filter

Logkit by Qiniu

NGINX log data format

These ^ are gems

Todos

  • Installations & setup
    • Setup kafka
    • Use docker container for kafka
    • Decide on protobuf ( grpc / rpc )
    • Setup a nginx docker
  • Enable .env support
  • Draw system architecture
  • Connect to kafka
  • Setup a kafka client to receive messages
  • Setup a kafka producer
  • Write tests
  • Get some log data and test the whole system
  • Dockerize full app
  • Prepare dummy NginxLogRequest, Packet & Headers
  • Prepare a client to test grpc and kafka producer with dummy data
  • Also see manually if kafka consumer is consuming messages
  • Document everything (ongoing so far)

At the moment, no Docker image is included for any component.

More to come
