Connection manager repeatedly retries connection for dead process [JIRA: RIAK-2379] #723

Open
ian-mi opened this issue Feb 11, 2016 · 0 comments

ian-mi commented Feb 11, 2016

When the calling process exits or crashes before a connection completes, the riak_core_connection process crashes with noproc when it attempts to invoke the connected callback. The connection manager then repeatedly retries the connection with the same, now dead, PID, producing repeated noproc errors.

Seen at a customer site, where the fssource process crashed with:

2016-02-07 21:37:52 =SUPERVISOR REPORT====
     Supervisor: {local,riak_repl2_fssource_sup}
     Context:    child_terminated
     Reason:     {normal,{gen_server,call,[<11363.32378.80>,cluster_name,120000]}}
     Offender:   [{pid,<0.15265.27>},{name,822094670998632891489572718402909198556462055424},{mfargs,{riak_repl2_fssource,start_link,undefined}},{restart_type,temporary},{shutdown,5000},{child_type,worker}]

This was then followed by periodic noproc crashes such as:

2016-02-07 21:37:52 =ERROR REPORT====
** State machine <0.15266.27> terminating 
** Last message in was {tcp,#Port<0.819352>,<<131,104,2,100,0,2,111,107,104,2,100,0,8,102,117,108,108,115,121,110,99,104,3,97,3,97,0,97,0>>}
** When State == wait_for_protocol
**      Data  == {state,ranch_tcp,#Port<0.819352>,fullsync,[{3,0},{2,0},{1,1}],[{keepalive,true},{nodelay,true},{packet,4},{active,false}],riak_repl2_fssource,<0.15265.27>,"riak_tpsrvc_test2_iscc_104",[{clustername,"riak_tpsrvc_test2_iscc_104"},{ssl_enabled,false}],[{clustername,"riak_tpsrvc_test2_corp_104"},{ssl_enabled,false}],{10,253,50,54},9080}
** Reason for termination = 
** {noproc,{gen_server,call,[<0.15265.27>,{connected,#Port<0.819352>,ranch_tcp,{{REDACTED},9080},{fullsync,{3,0},{3,0}},[{clustername,"riak_tpsrvc_test2_corp_104"},{ssl_enabled,false}]},120000]}}
2016-02-07 21:37:52 =CRASH REPORT====
  crasher:
    initial call: riak_core_connection:init/1
    pid: <0.15266.27>
    registered_name: []
    exception exit: {{noproc,{gen_server,call,[<0.15265.27>,{connected,#Port<0.819352>,ranch_tcp,{{REDACTED},9080},{fullsync,{3,0},{3,0}},[{clustername,"riak_tpsrvc_test2_corp_104"},{ssl_enabled,false}]},120000]}},[{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,622}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
    ancestors: [<0.15251.27>]
    messages: []
    links: [#Port<0.819352>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 610
    stack_size: 27
    reductions: 1637
  neighbours:

The crashes always reference the same PID, and the behaviour continues until the node is restarted.
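
For illustration only, here is a minimal sketch of how the connected callback could be guarded so that a dead caller produces an error tuple instead of a noproc crash. The module name `connected_guard`, the function `notify_connected/3`, and the `owner_dead` atom are hypothetical and not part of riak_core; only `gen_server:call/3` and its exit reasons are standard OTP. This does not claim to be how riak_core_connection is or should be fixed.

```erlang
-module(connected_guard).
-export([notify_connected/3]).

%% Hypothetical helper for illustration; not part of riak_core.
%% Sends Msg to Owner via gen_server:call/3 and converts the exits that
%% mean "the owner is gone" into an error tuple instead of crashing.
-spec notify_connected(pid(), term(), timeout()) ->
          {ok, term()} | {error, owner_dead}.
notify_connected(Owner, Msg, Timeout) ->
    try gen_server:call(Owner, Msg, Timeout) of
        Reply -> {ok, Reply}
    catch
        %% Owner had already exited before the call was made.
        exit:{noproc, _} -> {error, owner_dead};
        %% Owner exited normally while the call was in flight.
        exit:{normal, _} -> {error, owner_dead}
        %% Any other exit (timeout, abnormal crash) still propagates.
    end.
```

On `{error, owner_dead}` the connection process could stop cleanly and the pending request could be dropped rather than retried. How that cancellation would be wired into the connection manager is not shown here, since it depends on internals this issue does not describe.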

@Basho-JIRA Basho-JIRA changed the title Connection manager repeatedly retries connection for dead process Connection manager repeatedly retries connection for dead process [JIRA: RIAK-2379] Feb 11, 2016