A POC application that reads large files using a stream-processing approach.
Two approaches are implemented:
- Stream: simple buffering with backpressure, aggregating with Akka actors
- Kafka: separating reading from processing with Kafka
Tech stack:
- Scala 2.13
- sbt 1.4.7
- Java 11
- Docker Compose
1- Generate a large CSV file in the following format:
1, "DE"
2, "NL"
1, "US"
3, "US"
You can create a 10 GB file by running largecsvgenerator.go:
> go run largecsvgenerator.go
WARNING: A file named largefile.csv must be inside the project's root folder.
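For reference, each line pairs a product id with a quoted country code. A minimal parsing sketch in Scala (`Purchase` and `parseLine` are illustrative names, not the project's actual code):

```scala
// Illustrative only: parse one CSV line into a (product id, country) record.
final case class Purchase(productId: Int, country: String)

def parseLine(line: String): Purchase = {
  val Array(id, country) = line.split(",", 2).map(_.trim)
  Purchase(id.toInt, country.stripPrefix("\"").stripSuffix("\""))
}

parseLine("""1, "DE"""") // Purchase(1, "DE")
```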
1- To run the stream solution:
> export SOLUTION="stream"
> sbt run
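A rough sketch of what the stream solution does, assuming Akka Streams: the file is read with backpressure, framed into lines, and aggregated into per-product country counts. The actual project aggregates with Akka actors; this sketch folds in a `Sink` for brevity, and all names are illustrative:

```scala
import java.nio.file.Paths

import akka.actor.ActorSystem
import akka.stream.scaladsl.{FileIO, Framing, Sink}
import akka.util.ByteString

object StreamSketch extends App {
  implicit val system: ActorSystem = ActorSystem("stream-sketch")
  import system.dispatcher

  // Read the file with backpressure, split it into lines, parse each line.
  val done = FileIO.fromPath(Paths.get("largefile.csv"))
    .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1024, allowTruncation = true))
    .map(_.utf8String.split(",", 2).map(_.trim))
    .collect { case Array(id, country) =>
      id.toInt -> country.stripPrefix("\"").stripSuffix("\"")
    }
    // Aggregate counts per product and country (the project does this in actors).
    .runWith(Sink.fold(Map.empty[Int, Map[String, Long]]) {
      case (acc, (id, country)) =>
        val counts = acc.getOrElse(id, Map.empty[String, Long])
        acc.updated(id, counts.updated(country, counts.getOrElse(country, 0L) + 1L))
    })

  done.foreach { totals =>
    totals.toSeq.sortBy(_._1).foreach(println)
    system.terminate()
  }
}
```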
2- To run the Kafka solution, first start Kafka using Docker Compose:
> docker-compose up
then:
> export SOLUTION="kafka"
> sbt run
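A rough sketch of the Kafka variant, assuming Alpakka Kafka, a local broker on localhost:9092, and a topic named "lines" (all assumptions; the project's actual client, broker address, and topic may differ). Reading and processing run as two independent streams decoupled by the topic:

```scala
import java.nio.file.Paths

import akka.actor.ActorSystem
import akka.kafka.scaladsl.{Consumer, Producer}
import akka.kafka.{ConsumerSettings, ProducerSettings, Subscriptions}
import akka.stream.scaladsl.{FileIO, Framing}
import akka.util.ByteString
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.{StringDeserializer, StringSerializer}

object KafkaSketch extends App {
  implicit val system: ActorSystem = ActorSystem("kafka-sketch")

  val producerSettings = ProducerSettings(system, new StringSerializer, new StringSerializer)
    .withBootstrapServers("localhost:9092")

  // Reader side: publish raw file lines to the topic.
  FileIO.fromPath(Paths.get("largefile.csv"))
    .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1024, allowTruncation = true))
    .map(line => new ProducerRecord[String, String]("lines", line.utf8String))
    .runWith(Producer.plainSink(producerSettings))

  val consumerSettings = ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
    .withBootstrapServers("localhost:9092")
    .withGroupId("aggregator")
    .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")

  // Processor side: consume independently of the reader; aggregation would happen here.
  Consumer.plainSource(consumerSettings, Subscriptions.topics("lines"))
    .map(_.value)
    .runForeach(println)
}
```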
Results are aggregated and displayed on the console continuously; each output line pairs a product id with its per-country counts.
Example:
...
(90,EnrichedProduct(90,Map(NL -> 17852, DE -> 17674, US -> 17624)))
(91,EnrichedProduct(91,Map(US -> 17915, NL -> 17688, DE -> 17755)))
(92,EnrichedProduct(92,Map(NL -> 17568, DE -> 17476, US -> 17770)))
(93,EnrichedProduct(93,Map(DE -> 17690, US -> 17841, NL -> 17584)))
(94,EnrichedProduct(94,Map(DE -> 17675, NL -> 17631, US -> 17560)))
(95,EnrichedProduct(95,Map(NL -> 17717, US -> 17550, DE -> 17790)))
(96,EnrichedProduct(96,Map(US -> 17609, DE -> 17921, NL -> 17732)))
(97,EnrichedProduct(97,Map(US -> 17441, NL -> 17804, DE -> 17370)))
(98,EnrichedProduct(98,Map(DE -> 17775, US -> 17797, NL -> 17658)))
(99,EnrichedProduct(99,Map(NL -> 17711, US -> 17629, DE -> 17813)))
(100,EnrichedProduct(100,Map(US -> 17830, NL -> 17675, DE -> 17805)))