[Go Hack] feature addition and refactor #18
base: master
Conversation
model/cluster_config.go (outdated)
@@ -0,0 +1,7 @@
package model

type CluserConfig struct {
Could it be that this is supposed to be `ClusterConfig`? Note that it will have to be changed in the other 14 places.
Sure, will change this.
func (r *Retry) run() {
    start := make(chan bool)
    go func() {
First of all, thanks for this contribution! I really appreciate that you took the time to look into the code and think of ways of improving it.
I'm concerned about the retry approach. If I understand it correctly, the retry is done on a different goroutine. Imagine that this is going over 200 aliases and, for some reason, there are issues with the DNS and they start failing. All those retries will start piling up. Taking into account that the main loop retries every 10 seconds, this might turn into a DoS, wouldn't it?
I have the feeling that the current approach of failing once, and hoping that the next iteration will fix it, might be less risky.
Sure, that scenario does make the case for removing the retry call. Should I remove the whole retry module or just the call (and perhaps reserve the module for future use)?
@yadunandan004 could we try to use a `CircuitBreaker` pattern here instead of a plain retry? It works like this: once you retry a previous message, let's say after 10 seconds, and it fails, the next retry will have a longer cooldown before the one after, for example `retryTime = retryTime * 2` or some other increasing function. Once a retry is successful, we can either set `retryTime = DEFAULT_RETRY_TIME` or decrease it by a multiplicative factor.
Reference: https://github.com/cloud-native-go/examples/blob/main/ch04/circuitbreaker.go
What do you think, @psaiz ?
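A minimal sketch of that increasing-cooldown idea (the names `defaultRetryTime` and `maxRetryTime` are illustrative, not taken from this repository):

```go
package retry

import "time"

// Hypothetical defaults, not taken from this repository.
const (
	defaultRetryTime = 10 * time.Second
	maxRetryTime     = 5 * time.Minute
)

// backoff doubles the cooldown after every failure and resets it after a
// success, which is the increasing-cooldown behaviour described above.
type backoff struct {
	retryTime time.Duration
}

func newBackoff() *backoff {
	return &backoff{retryTime: defaultRetryTime}
}

// next returns how long to wait before the next attempt, given whether the
// previous attempt succeeded.
func (b *backoff) next(succeeded bool) time.Duration {
	if succeeded {
		b.retryTime = defaultRetryTime // reset on success
		return b.retryTime
	}
	b.retryTime *= 2 // grow the cooldown after a failure
	if b.retryTime > maxRetryTime {
		b.retryTime = maxRetryTime // cap it so it never grows unbounded
	}
	return b.retryTime
}
```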
I see. Yes, I do have a `computeNextRetryTime` method (line 126) that computes the next retry time using a Fibonacci algorithm. Would that suffice, or should I use another algorithm for that purpose?
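For illustration, a Fibonacci-style backoff along those lines could look like the following sketch (the real `computeNextRetryTime` in the PR may be structured differently):

```go
package retry

import "time"

// fibBackoff yields Fibonacci-spaced delays: 1s, 1s, 2s, 3s, 5s, 8s, ...
type fibBackoff struct {
	prev, curr time.Duration
}

func newFibBackoff() *fibBackoff {
	return &fibBackoff{prev: 0, curr: time.Second}
}

// next returns the following delay in the Fibonacci sequence.
func (f *fibBackoff) next() time.Duration {
	f.prev, f.curr = f.curr, f.prev+f.curr
	return f.prev
}
```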
Just some feedback and suggested improvements for the `retry_module`. Take them with a pinch of salt.
func (r *Retry) run() {
    start := make(chan bool)
    go func() {
        ticker := time.NewTicker(1 * time.Minute)
The ticker duration should be configurable from the `Retry` type, I think, because we may want to have different retry durations depending on the service.
Sure. That can be configured while instantiating, via the constructor itself (line 24).
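For reference, a configurable tick interval could be wired through the constructor roughly like this (the field and constructor names are assumptions, not necessarily those used in the PR):

```go
package retry

import "time"

// Retry is a sketch; field and constructor names are assumptions.
type Retry struct {
	tickInterval time.Duration
	done         chan bool
}

// NewRetry lets each service pick its own retry interval instead of the
// hard-coded 1 * time.Minute.
func NewRetry(tickInterval time.Duration) *Retry {
	return &Retry{
		tickInterval: tickInterval,
		done:         make(chan bool),
	}
}

func (r *Retry) run() {
	ticker := time.NewTicker(r.tickInterval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			// attempt the retry here
		case <-r.done:
			return
		}
	}
}
```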
if r.currentCount == r.maxCount {
    r.done <- true
    return
}
Is it semantically correct to send `true` on the `r.done` channel even when `r.currentCount == r.maxCount`? Also, shouldn't it return some error in case all retries have failed?
The module never returns the `true` value to the caller; it is only used to terminate the loop. The module does return an error once the maximum number of retries has been reached (line 122).
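A minimal sketch of that separation, with assumed field names (the real struct in `retry_module.go` may differ): the `done` channel is only an internal stop signal, and the caller only ever receives an error value.

```go
package retry

import (
	"errors"
	"time"
)

// Retry is a sketch with assumed fields.
type Retry struct {
	maxCount     int
	currentCount int
	tick         time.Duration
	done         chan bool
	err          error
}

// run retries executor on every tick and signals done when it either succeeds
// or exhausts maxCount. The value sent on done never reaches the caller.
func (r *Retry) run(executor func() error) {
	ticker := time.NewTicker(r.tick)
	defer ticker.Stop()
	for range ticker.C {
		if r.err = executor(); r.err == nil {
			r.done <- true // success: stop the loop
			return
		}
		r.currentCount++
		if r.currentCount == r.maxCount {
			r.err = errors.New("max retries reached")
			r.done <- true // exhausted: stop the loop
			return
		}
	}
}

// Execute is what the caller sees: it waits for the loop to finish and returns
// only an error (nil on success), not the boolean sent on the channel.
func (r *Retry) Execute(executor func() error) error {
	go r.run(executor)
	<-r.done
	return r.err
}
```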
I have addressed all comments; I hope that is sufficient. Please let me know if there are any more gaps. If it looks good, however, could the branch be merged?
This PR includes the following main changes:
##===================================================================##
Retry mechanism:
This is used to attempt retries in case of failure.
- File: lbcluster/retry_module.go
- Exposed methods:
  - SetMaxDuration(maxDuration time.Duration) error
  - SetMaxCount(maxCount int) error
  - Execute(executor func() error) error
- Implementation is present in lbcluster/lbcluster_dns.go:78 (see the usage sketch below).
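A rough usage sketch of that interface; only `SetMaxDuration`, `SetMaxCount` and `Execute` are taken from the PR description, while the constructor name, import path and the `updateDNS` helper are illustrative assumptions:

```go
package main

import (
	"log"
	"time"

	"lbd/lbcluster" // assumed import path; adjust to the real module path
)

// updateDNS stands in for the DNS update wrapped at lbcluster_dns.go:78.
func updateDNS() error { return nil }

func main() {
	r := lbcluster.NewRetry() // assumed constructor name
	if err := r.SetMaxDuration(2 * time.Minute); err != nil {
		log.Fatal(err)
	}
	if err := r.SetMaxCount(5); err != nil {
		log.Fatal(err)
	}
	if err := r.Execute(updateDNS); err != nil {
		log.Printf("giving up after retries: %v", err)
	}
}
```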
##===================================================================##
Logging Refactor:
This includes a general refactor of logging, and also adds snapshot support to logging.
- File: logger/lbcluster_log.go
- NewLoggerFactory(logFilePath string) (Logger, error). Any log-based errors are handled during factory initialization.
- StartSnapshot(d time.Duration) (see the usage sketch below).
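A hedged usage sketch; `NewLoggerFactory` and `StartSnapshot` are the names given above, while the import path, the log file path and the assumption that `StartSnapshot` is a method on the returned `Logger` are illustrative:

```go
package main

import (
	"log"
	"time"

	"lbd/logger" // assumed import path
)

func main() {
	// Build the logger via the factory; log-related errors surface here.
	lg, err := logger.NewLoggerFactory("/var/log/lbd/lbd.log") // path is illustrative
	if err != nil {
		log.Fatalf("could not initialise logger: %v", err)
	}
	// Start periodic snapshots, e.g. once per hour.
	lg.StartSnapshot(1 * time.Hour)
}
```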
##===================================================================##
Metric Collection:
Added a feature to collect metrics for DNS updates. This can be used for telemetry purposes.
- Default port: 8081
- Endpoint: /metric, to fetch all the metrics related to the server for that day.
- NewMetricServer(metricDirectoryPath string) error. Implemented in lbd.go:189 (but disabled by default).
- WriteRecord(property Property) error. Implemented in lbcluster/lbcluster_dns.go:86.
The above are the currently supported metrics; this functionality can be extended to add more. A usage sketch follows below.
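A hedged sketch of how this might be wired up; `NewMetricServer` and `WriteRecord` are the names given above, while the import path, the directory path and the `Property` contents are assumptions:

```go
package main

import (
	"log"

	"lbd/metric" // assumed package name and import path
)

func main() {
	// Start the metric server (listens on the default port 8081 per the description).
	if err := metric.NewMetricServer("/var/lib/lbd/metrics"); err != nil {
		log.Fatalf("could not start metric server: %v", err)
	}
	// Elsewhere, after a DNS update, a record could be written with something like:
	//   _ = metric.WriteRecord(metric.Property{ /* fields depend on the actual type */ })
	select {} // keep the process alive so the /metric endpoint stays reachable
}
```

The day's metrics would then be reachable with, for example, `curl http://localhost:8081/metric` (port and path as listed above).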
##===================================================================##
Concurrency Addition:
- EvaluateHosts(hostsToCheck map[string]lbhost.Host) in lbcluster/lbcluster.go:294
- SNMPDiscovery() in lbhost/lbhost.go:93
A sketch of the fan-out pattern follows below.
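A rough sketch of the kind of fan-out this refers to; it assumes `SNMPDiscovery` can be called on an `lbhost.Host`, and the import path and function name are illustrative, not the actual code at `lbcluster.go:294`:

```go
package lbcluster

import (
	"sync"

	"lbd/lbhost" // assumed import path
)

// evaluateHostsConcurrently runs SNMPDiscovery for every host in its own
// goroutine and waits for all of them to finish.
func evaluateHostsConcurrently(hostsToCheck map[string]lbhost.Host) {
	var wg sync.WaitGroup
	for _, host := range hostsToCheck {
		wg.Add(1)
		go func(h lbhost.Host) {
			defer wg.Done()
			h.SNMPDiscovery() // discover host state concurrently
		}(host)
	}
	wg.Wait()
}
```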
##===================================================================##
General Refactors:
Refactors were done with the following few things in mind.
Appropriate test cases have been added to all the above-mentioned changes.