
[Go Hack] feature addition and refactor #18

Open · wants to merge 22 commits into master

Conversation


@yadunandan004 commented on May 15, 2022

This PR includes the following main changes:

##===================================================================##

Retry mechanism:

This module is used to attempt retries of an operation in case of failure.

lbcluster/retry_module.go

type Retry struct {
    signal            chan int  // provides the retry signal
    done              chan bool // marks the end of retry
    tick              chan bool
    currentCount      int
    maxCount          int // maximum count of retries permitted
    retryStarted      bool
    maxDuration       time.Duration // max duration for which retries are allowed
    retryDuration     time.Duration
    prevRetryDuration time.Duration
    logger            logger.Logger
}

  • Max duration can be set via SetMaxDuration(maxDuration time.Duration) error.
  • Max retry count can be set via SetMaxCount(maxCount int) error.
  • The retry backoff time grows according to the Fibonacci sequence.
  • Retries are allowed until either the max count or the max duration is reached, whichever comes first.
  • Any function that needs to be retried (mostly network-IO related) can be wrapped by passing it to the retry executor Execute(executor func() error) error. The implementation is in lbcluster/lbcluster_dns.go:78; a usage sketch follows this list.
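
For illustration, a minimal usage sketch of the retry executor. The constructor name NewRetry, its arguments, the limit values, the log variable (a logger.Logger instance), and the wrapped refreshDNS call are assumptions made for this example; only SetMaxCount, SetMaxDuration, and Execute come from the module described above.

// Hypothetical construction; the real constructor name and signature may differ.
retry := NewRetry(log, 10*time.Second)

if err := retry.SetMaxCount(5); err != nil {
    log.Error(err.Error())
}
if err := retry.SetMaxDuration(2 * time.Minute); err != nil {
    log.Error(err.Error())
}

// Execute keeps invoking the function with Fibonacci backoff until it
// succeeds or one of the limits (max count or max duration) is reached.
if err := retry.Execute(func() error {
    return refreshDNS() // placeholder for the network-IO call being retried
}); err != nil {
    log.Error("retries exhausted: " + err.Error())
}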

##===================================================================##

Logging Refactor:

This includes a general refactor of logging and also adds snapshot support.

logger/lbcluster_log.go

  • A factory function is now used to create the logger: NewLoggerFactory(logFilePath string) (Logger, error). Any logging-related errors are handled during factory initialization.

type Logger interface {
    EnableDebugMode()
    EnableWriteToSTd()
    StartSnapshot(d time.Duration)
    GetLogFilePath() string
    Info(s string)
    Warning(s string)
    Debug(s string)
    Error(s string)
}

  • The above interface is used as the logger type everywhere. Its methods should be used instead of mutating variables directly.
  • Log-based snapshots: logs can now be rotated into snapshots by providing a cycle time via StartSnapshot(d time.Duration). Once the cycle time expires, a new snapshot is created.
  • Log snapshots are created in the following format (a usage sketch follows the example below):

sample.0.log, sample.1.log
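
A minimal usage sketch based on the factory and interface above; the file name, snapshot cycle, and message are illustrative values rather than defaults taken from the code.

log, err := logger.NewLoggerFactory("sample.log")
if err != nil {
    // log-related errors surface here, during factory initialization
    return err
}
log.EnableDebugMode()
log.EnableWriteToSTd()
log.StartSnapshot(24 * time.Hour) // rotates into sample.0.log, sample.1.log, ...
log.Info("load balancer daemon started")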

##===================================================================##

Metric Collection:

Adds a feature to collect metrics for DNS updates. This can be used for telemetry purposes.

Default port: 8081

  • Exposes a GET API /metric to fetch all the metrics recorded by the server for that day.
  • The server can be started via NewMetricServer(metricDirectoryPath string) error. Implemented in lbd.go:189 (but disabled by default).
  • A new record can be written via WriteRecord(property Property) error. Implemented in lbcluster/lbcluster_dns.go:86; a usage sketch follows the struct below.

type Property struct {
    RoundTripStartTime time.Time     `json:"start_time" csv:"start_time" rw:"r"`
    RoundTripEndTime   time.Time     `json:"end_time" csv:"end_time" rw:"r"`
    RoundTripDuration  time.Duration `json:"round_trip_duration"`
}

The above are the currently supported metrics; the functionality can be extended to add more.
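
For illustration, a sketch of how the server might be started and how a record could be written around a DNS round trip. The directory path, the updateDNS call, and the log variable are placeholders; only NewMetricServer, WriteRecord, and Property come from the changes described above.

// Start the metric server (left disabled by default in lbd.go).
if err := NewMetricServer("/var/lib/lbd/metrics"); err != nil {
    log.Error("could not start metric server: " + err.Error())
}

// Record one DNS round trip.
start := time.Now()
dnsErr := updateDNS() // placeholder for the actual DNS update
end := time.Now()

record := Property{
    RoundTripStartTime: start,
    RoundTripEndTime:   end,
    RoundTripDuration:  end.Sub(start),
}
if err := WriteRecord(record); err != nil {
    log.Error("could not write metric record: " + err.Error())
}
if dnsErr != nil {
    log.Error("DNS update failed: " + dnsErr.Error())
}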

##===================================================================##

Concurrency Addition

  • Concurrency has been added to the following two methods (a sketch of the pattern follows this list):
    • EvaluateHosts(hostsToCheck map[string]lbhost.Host) in lbcluster/lbcluster.go:294
    • SNMPDiscovery() in lbhost/lbhost.go:93
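
The diff is not reproduced here, so the following is only a sketch of the usual fan-out shape such concurrency takes in Go (a sync.WaitGroup over per-host goroutines); the helper name and the body are placeholders, not the PR's actual implementation.

// evaluateConcurrently is a hypothetical helper showing the fan-out pattern.
func evaluateConcurrently(hostsToCheck map[string]lbhost.Host) {
    var wg sync.WaitGroup
    for name, host := range hostsToCheck {
        wg.Add(1)
        go func(name string, host lbhost.Host) {
            defer wg.Done()
            // The per-host work (e.g. the SNMP discovery) runs here;
            // any shared results need a mutex or a channel.
        }(name, host)
    }
    wg.Wait()
}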

##===================================================================##

General Refactors

Refactors were done with the following in mind (a short sketch of the first two points follows this list):

  • Avoid direct use of variables to read or change state; getter and setter methods are used instead.
  • Avoid passing too many or redundant params; common params such as the logger are stored on the underlying struct instead.
  • Remove redundant code.
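
As an illustration of the first two points, a short sketch; the type and field names here are hypothetical and not taken from the diff.

// clusterManager is a hypothetical type illustrating the pattern.
type clusterManager struct {
    logger logger.Logger // common dependency stored on the struct, not passed around
    state  string
}

// State reads the current state instead of exposing the field directly.
func (c *clusterManager) State() string { return c.state }

// SetState mutates the state through a method rather than the raw field.
func (c *clusterManager) SetState(s string) {
    c.state = s
    c.logger.Debug("state changed to " + s)
}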

Appropriate test cases have been added for all of the changes mentioned above.

@@ -0,0 +1,7 @@
package model

type CluserConfig struct {
Contributor:

Could it be that this is supposed to be ClusterConfig? Note that it will have to be changed in the other 14 places

Author:

Sure, will change this.


func (r *Retry) run() {
    start := make(chan bool)
    go func() {
Contributor:

First of all, thanks for this contribution! I appreciate a lot that you took the time to look into the code and think of ways of improving it.

I'm concerned about the retry approach. If I get it right, the retry is done on a different goroutine. Imagine that this is going over 200 aliases and, for some reason, there are issues with the DNS and they start failing. All those retries will start piling up. Taking into account that the main loop retries every 10 seconds, couldn't this turn into a DoS?

I have the feeling that the current approach of failing once and hoping that the next iteration will fix it might be less risky.

Author:

Sure, that scenario does make a case for removing the retry call. Should I remove the whole retry module or just the call (and keep the module for possible future use)?

@mehrankamal:

@yadunandan004 could we try to use a CircuitBreaker pattern here instead of a plain retry? It works like this:

Once you retry a previous message, say after 10 seconds, and it fails, each subsequent retry gets a longer cooldown time before the next attempt, for example retryTime = retryTime * 2 or some other increasing function. Once a retry is successful, we can either set retryTime = DEFAULT_RETRY_TIME or decrease it by a multiplicative factor.

Reference: https://github.com/cloud-native-go/examples/blob/main/ch04/circuitbreaker.go

What do you think, @psaiz ?

Author (@yadunandan004, Jun 13, 2022):

I see. Yes, I do have a computeNextRetryTime method (line 126) that computes the next retry time using a Fibonacci algorithm. Would that suffice, or should I use a different algorithm for that purpose?

@mehrankamal left a comment:

Just some feedback and suggested improvements for the retry_module. Take them with a pinch of salt.



func (r *Retry) run() {
    start := make(chan bool)
    go func() {
        ticker := time.NewTicker(1 * time.Minute)


The ticker duration should be configurable from the Retry type, I think, since we may want different retry durations for different services.

Author:

Sure. That can be configured when instantiating via the constructor itself (line 24).

Comment on lines +72 to +75
if r.currentCount == r.maxCount {
    r.done <- true
    return
}


Is it semantically correct to send true on the r.done channel even in the case r.currentCount == r.maxCount? Also, should it not return an error when all retries have failed?

Author (@yadunandan004, Jun 13, 2022):

The module never returns that true value to the caller; it is only there to terminate the loop. The module does return an error once the maximum number of retries has been reached (line 122).

Author (@yadunandan004):

I have addressed all the comments; I hope that is sufficient. Please let me know if there are any remaining gaps. If it looks good, could the branch be merged?
