-
Notifications
You must be signed in to change notification settings - Fork 402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API-level proxy detection and implementation resolution #1655
Comments
Why was the bytecode being loaded? Afaik we don't return the bytecode in the I'm thinking if it makes sense to save the proxy information in the db, possibly in the In general, in APIv1 or v2 this option to get the proxy should be disabled by default and should be optional. If it's enabled by default it will result in an RPC call for every |
You are right. The bytecode isn't loaded at the moment. But we would need to if we want to save on the RPC request here.
Yes it is at a fixed storage slot, but still you need to resolve. We would save the RPC request if we store the implementation address. Edit: Thinking about this again, and we should definitely not store the implementation address. The reason for implementing the resolution on-the-fly is that we don't index the chain and don't know when to update the implementation address.
If we have the bytecode it won't result in an RPC call for every request, only if we detect that it is a proxy. But I agree that this extra complexity shouldn't be the default. So making it optional is good. |
I actually wanted to refer to Minimal Proxies (ERC-1167) here so not EIP1967, my bad. So, is it trivial to derive the implementation address? |
Yes, for Minimal Proxies this is true. No RPC call needed. |
To summarize from the conversation above:
Anything else we should consider before implementing? |
To be clear we already want to add this to APIv1 as in the first message, right? In that case we should be able to easily turn it on/off for the proof of concept. Also, we are not saving the implementation address in the DB for sure, but do we want to save the info on whether this contract is a proxy and what type of proxy it is? Finally, can you run a quick analysis on how many additional RPC calls this would cause? Like, how many requests we have now, estimate some percentage for proxy contracts (maybe look at #1632 ?), and have a rough daily request count. So that we can estimate the potential cost impact of this |
I would rather not do it. It seems not too complex to me in terms of computation to detect a proxy. But to prove this, I can run some tests / measurements. Then we can decide if it is worth storing. |
I'm thinking from the prospective of someone using our exports, if a contract is a proxy and which type is a valuable information. It makes sense to store the proxy type in the database so that others don't have to recalculate it. |
That's a very good point. The detection should then just be integrated into the verification process IMO. How about implementing this as simple first imlementation that runs the detection when the API is called, and add the DB functionality in another issue? The DB change will also need some migration process again. |
To gather some statistics, I ran a script detecting proxies over the complete production database. The script is using the whatsabi library. Number of proxies in Sourcify's DBNumber of proxies detected: 2,207,547 (of 5,471,192 analysed matches) The number of proxies is quite high (40% of all contracts). This makes sense when you have a look at #1632. Proxy typesHere are the numbers of individual types of proxies that were detected:
Note that there is a small portion of Estimate of RPC callsWe currently have roughly 18,000 daily requests to the Multi proxy contractsThe whatsabi library can return multiple proxies for a provided bytecode, because a contract can potentially implement multiple proxy standards. I found a lot of contracts for which actually multiple proxies are returned. There were 206,246 such contracts while the number of returned proxies ranged from 2 to 31. I still need to find out if these resolve to the same implementation addresses and if some are a problem of the whatsabi library. I definitely also saw a few returning multiple proxy standards, while many returned the same standard multiple times. PerformanceI also measured the time it took to run the detection per bytecode. I found a median of 0.028 millisecond for running the detection algorithm on my machine. So the only bottleneck of this feature can be the rpc calls for resolving the implementation address. |
That is a very insightful breakdown, thanks! 7k RPC calls for 18k current ones is not trivial but feels ok. I also suspect
Yeah very interesting. What source code are they associated with? Maybe an assembly Solidity code that compiles to the exact same bytecode? |
I pushed my script to here: https://github.com/sourcifyeth/data-analysis-scripts Will probably add another script, for analysing the multi proxy contracts further. The issue with them is we might not be able to just return one implementation address for them from our API. For Multiple implementation addresses would also mean multiple RPC calls, and therefore I would not like to return multiple addresses. In such a case we can maybe just return |
Regarding that, there are definitely proxies that have source code but have a fixed implementation address for example taken in the constructor. See this as an example: https://sepolia.etherscan.io/address/0x2f4eeccbe817e2b9f66e8123387aa81bae08dfec |
How does other block explorers handle multiple proxies? Because normally they have an
So does WhatsABI classify this example and the ERC-1176 Minimal Proxy both as FixedProxy? IMO there should be a separate class for eip1167. |
Good question. I will check Etherscan while I check the data a bit more.
To my understanding, yes. But let me try it out actually Edit: I tried it with https://optimistic.etherscan.io/address/0x6c5aDf5eDd2C77b63d6Fed3851c7C30E3f6D3C2b |
So what I found regarding EtherscanEtherscan does not support them. Either they were not shown as proxy and also not verifiable as proxy, or they showed just one implementation address of many. In the second case, it was also not possible to use the "Read as proxy" and "Write as proxy" functionalities. A DiamondProxy on Etherscan: https://etherscan.io/address/0x69460570c93f9de5e2edbc3052bf10125f0ca22d BlockscoutBlockscout adds a dropdown to select the implementation which you want to interact with when you want to "read/write as proxy". The API returns actually an array of implementations (also for other types of proxies):
Same contract on Blockscout: https://eth.blockscout.com/address/0x69460570c93f9DE5E2edbC3052bf10125f0Ca22d?tab=read_proxy LouperLouper is an Inspector for DiamonProxies. I just want to add this tool here, because I found it quite helpful. Here is the above contract on Louper: https://louper.dev/diamond/0x69460570c93f9de5e2edbc3052bf10125f0ca22d?network=mainnet#facets |
To be able to support multiple implementations, we decided to have the following field in the response:
I also updated the issue description. |
Shouldn't we also include the proxy type in the implementation or somewhere else? Because my understanding is we have |
Yes, in the issue description above, I already have a |
Yes but this should also be in the API response, no? |
That code field in the description is how I designed the API response. Or what do you mean? For API v2 I will propose something later. |
Ok I missed the [
{
"address": "0x6F1D75a53a8805DcA5347aE5F3eDE035CAE3CBC1",
"chainIds": [
{
"chainId": "5",
"status": "perfect",
"isProxy": true,
"proxyType": "EIP1967Proxy",
"implementations": [
{
address: "0x751D7C0Cf91a9b7704541b44E5fF7BeC3D2caA6F",
name: "Logic contract"
}
]
}
]
}
] My only question would be, there can only be one proxyType but multiple implementations, right? |
One main use case that I'm not sure if we considered is the following. I was trying to verify the contract https://sepolia.etherscan.io/address/0x5a6b9d7b73e5bc32044d2ca014555dc542bebc45#code for testing traces. The contract is created by a factory and is actually a minimal proxy. The verification failed because of course the contract's bytecode is just the minimal proxy runtime bytecode. I think Sourcify should automatically handle the resolution of the implementation and marking the proxy and the implementation appropriately. |
Yes exactly.
Yes. Still, I saw some cases where the library returned multiple kinds of proxies, but these were rather a limitation of the library / bug. I could work around that, so there will definitely be only one proxyType for one contract. See this issue for further reference. The only case when we would have multiple implementations would be
I agree with that. I think that the proxy verification should be integrated into the verification process, and it should be possible to run our verify endpoint with minimal proxies. I propose to make this and the required db schema change with our planned verification flow refactoring. |
As it has been a feature that was requested frequently, we decided to implement proxy detection and resolution of the implementation contract as an API-level functionality. This is made possible by the whatsabi library which detects major proxy styles by only looking at the bytecode. As a first proof of concept we implement this in API v1 before we incorporate it in the API v2 design.
Spec
On a request on the
check-all-by-addresses
endpoint, the bytecode is loaded from the db as before. We use the bytecode to callwhatsabi.autoload
. This will detect if the contract is a proxy and which type of proxy. It works without the need of rpc calls. If the address is a proxy we resolve the implementation address. This will make use of an rpc call depending on the type of proxy.The response of the API should look like this then:
If a contract is not verified, we don't have the bytecode. In this case we can let whatsabi fetch the bytecode as a fallback, and show the proxy status even if a contract is not verified. This is necessary to also support Minimal Proxies (#1624).
ToDos
FixedProxy
contracts in our DBThe text was updated successfully, but these errors were encountered: