Lorre is extremely aggressive on the tezos-node #950

Open · ghost opened this issue Dec 1, 2020 · 9 comments

ghost commented Dec 1, 2020

Hi,
I have a tezos-node (mainnet) with Lorre and the Conseil API connected to it.
I do not use your Docker images; I run the processes myself.

My problem is that while Lorre is running I can barely connect to the tezos-node anymore, e.g. it takes 20 to 60 seconds to check /chains/main/blocks/head:

 time curl -s --noproxy "*" --connect-timeout 60 --max-time 60 -X GET -H 'Content-Type: application/json' 'http://*****/chains/main/blocks/head' | jq '.header.level'
1239159

real    0m0.050s
user    0m0.008s
sys     0m0.005s

time curl -s --noproxy "*" --connect-timeout 60 --max-time 60 -X GET -H 'Content-Type: application/json' 'http://*****/chains/main/blocks/head' | jq '.header.level'
1239163

real    0m20.426s
user    0m0.008s
sys     0m0.013s
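
To keep sampling the latency while Lorre runs, a small loop like the following works (just a sketch; $NODE stands for the redacted endpoint above):

while true; do
  curl -s -o /dev/null -w '%{time_total}\n' --max-time 60 "$NODE/chains/main/blocks/head"
  sleep 10
done

It shows the same pattern as the timings above: tens of milliseconds while Lorre is idle, tens of seconds once it starts indexing.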

My current Lorre config looks like this:

platforms: [ {
  name: tezos
  network: mainnet
  enabled: true
  node: {
    protocol: "http"
    hostname: "****"
    port: *****
    pathPrefix: ""
    }
  }
]

lorre {

  request-await-time: 120 s
  get-response-entity-timeout: 90s
  post-response-entity-timeout: 1s

  sleep-interval: 5 s
  bootup-retry-interval: 10 s
  bootup-connection-check-timeout: 10 s
  fee-update-interval: 20
  fees-average-time-window: 3600
  depth: newest
  chain-events: []
  block-rights-fetching: {
    init-delay: 2 minutes
    interval: 60 minutes
    cycles-to-fetch: 5
    cycle-size: 4096
    fetch-size: 200
    update-size: 16
    enabled: true
  }

  batched-fetches {
    account-concurrency-level: 5
    block-operations-concurrency-level: 10
    block-page-size: 500
    block-page-processing-timeout: 1 hour
    account-page-processing-timeout: 15 minutes
    delegate-page-processing-timeout: 15 minutes
  }

  db {
    dataSourceClass: "org.postgresql.ds.PGSimpleDataSource"
    properties {
      user: "***********"
      password: "***********"
      url: "jdbc:postgresql://************"
    }
  }

}

akka {
  tezos-streaming-client {
    max-connections: 10
    max-open-requests: 512
    idle-timeout: 10 minutes
    pipelining-limit: 7
    response-entity-subscription-timeout: 15 seconds
  }
  tezos-dispatcher {
    type: "Dispatcher"
    executor: "thread-pool-executor"
    throughput: 1

    thread-pool-executor {
      fixed-pool-size: 16
    }
  }

  http {
    server {
      request-timeout: 5 minutes
      idle-timeout: 5 minutes
    }
  }
}

I built Lorre from the master branch today.

What can I do to make it less aggressive?

ivanopagano (Contributor) commented:

You can start by halving a couple of values in the akka.tezos-streaming-client section.

Try something like:

max-connections: 5 # <- half the number of concurrent open connections
max-open-requests: 512
idle-timeout: 10 minutes
pipelining-limit: 7
response-entity-subscription-timeout: 15 seconds

This should essentially cut the number of ongoing requests in half, because Lorre will use fewer connections.
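
Note that pipelining-limit multiplies this: assuming these settings map onto Akka HTTP's host-connection-pool (as the names suggest), up to max-connections × pipelining-limit requests can be in flight at once, i.e. 5 × 7 = 35 with your current values. A crude way to see how that load feels to your node is to fire that many head requests in parallel (just a sketch; $NODE stands for your redacted RPC endpoint):

seq 35 | xargs -P 35 -I{} curl -s -o /dev/null -w '%{time_total}\n' "$NODE/chains/main/blocks/head"

If the timings blow up here as well, the node itself is the bottleneck, and lowering pipelining-limit to 1 would cap the in-flight requests at max-connections.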

What I don't know for sure is why your tezos-node should have less capacity than the one we use in our Docker images. Unless the node can auto-tune based on available system resources?
Did you set any custom configuration to run the tezos-node?

ghost (Author) commented Dec 2, 2020

@ivanopagano
Thank you for the advice - I am testing it now. I run the tezos-node like this:

tezos-node run -v --history-mode=archive --data-dir=/tezos --network=mainnet --rpc-addr=0.0.0.0:8732 --config-file=mainnet.json --connections=5

where mainnet.json contains:

{
  "data-dir": "/tezos",
  "p2p": {
    "bootstrap-peers": [
      "boot.tzbeta.net",
      "dubnodes.tzbeta.net:9732",
      "franodes.tzbeta.net:9732",
      "sinnodes.tzbeta.net:9732",
      <... many more peers ...>
    ],
    "listen-addr": "[::]:9732"
  }
}

ghost (Author) commented Dec 2, 2020

@ivanopagano
I tried it like this:

akka {
  tezos-streaming-client {
    max-connections: 3
    max-open-requests: 256
    idle-timeout: 10 minutes
    pipelining-limit: 7
    response-entity-subscription-timeout: 15 seconds
  }
  tezos-dispatcher {
    type: "Dispatcher"
    executor: "thread-pool-executor"
    throughput: 1

    thread-pool-executor {
      fixed-pool-size: 16
    }
  }

  http {
    server {
      request-timeout: 5 minutes
      idle-timeout: 5 minutes
    }
  }
}

And it did not improve the situation. Any other ideas?

ghost (Author) commented Dec 9, 2020

I have experimented a little more.
First I lowered the akka values as follows:

akka {
  tezos-streaming-client {
    max-connections: 3
    max-open-requests: 128
    idle-timeout: 10 minutes
    pipelining-limit: 7
    response-entity-subscription-timeout: 15 seconds
  }
  tezos-dispatcher {
    type: "Dispatcher"
    executor: "thread-pool-executor"
    throughput: 1

    thread-pool-executor {
      fixed-pool-size: 16
    }
  }

  http {
    server {
      request-timeout: 5 minutes
      idle-timeout: 5 minutes
    }
  }
}

It still did not make a tangible difference. So as a workaround I set up a second tezos-node on the same machine: one that I can query directly and one that is dedicated to Conseil.
This way it works and I get quick responses.
What I take from this is that it is not a hardware/IO/network issue, since things run better with more processes than with fewer.
It seems to me that either there is something like a "max-rpc-calls-per-second" limit on the tezos-node, or Conseil ignores my akka config?
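
One way I could check whether the connection cap is honored at all would be to count the TCP connections Lorre keeps open to the node's RPC port from the Lorre host (a sketch, assuming the default RPC port 8732):

ss -tn 'dport = :8732' | tail -n +2 | wc -l

If that number stays at or below max-connections, the akka config is probably being picked up, and the slowdown would come from how heavy each request is rather than from how many connections are open.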

ghost (Author) commented Dec 27, 2020

Hi there, any news on this? Do you have a suggestion for what to do?

ghost (Author) commented Jan 13, 2021

Hi, any idea what I should do? The advice I received did not have any effect. Conseil keeps paralyzing the tezos-node.

jun0tpyrc commented Jan 14, 2021

I have a docker-compose setup of Conseil + PostgreSQL + tezos-node running for mainnet.
Most tunings did not help much until I decided to scale my instance up to an 8-core / 32 GB memory one with a fast gp3 disk on AWS, which together seems to have solved the I/O bottleneck for me.
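
For anyone hitting the same wall, a quick way to check whether disk I/O is the bottleneck while Lorre is syncing (assuming the sysstat package is installed):

iostat -x 5 3

High %util and await values on the node's data-directory disk point to I/O saturation rather than a Conseil or tezos-node problem.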

vishakh (Contributor) commented Jan 29, 2021

Please try the latest release and let us know how it looks. There is improved logging so it should be easier to identify the root issue.

https://github.com/Cryptonomic/Conseil/releases/tag/2021-january-release-35

vishakh (Contributor) commented Jan 29, 2021

@g574 @jun0tpyrc Please see the above comment about the latest release.
