Skip to content
This repository has been archived by the owner on Nov 2, 2018. It is now read-only.

Pagination for /wallet/addresses and /wallet/transactions #2511

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

hudaniel
Copy link

@hudaniel hudaniel commented Dec 7, 2017

New to golang. Please let me know what could be improved.

Addressing this Issue: #1478

Changes:
Adding an optional page query to /wallet/addresses and optional queries confirmPage and unconfirmedPage to /wallet/transactions. Not specifying these queries will return the same response as before to better support the command line tools.

The response of these endpoints given the page queries will reply with a total_pages field which will tell the client how many pages are left, beginning with 0.

@tbenz9
Copy link
Collaborator

tbenz9 commented Dec 7, 2017

I think it makes more sense to have total_pages start at 1 instead of 0, even if a response only has 1 entry, that should still count as 1 page.

Is it possible to specify how many results are included in each response instead of hard-coding it at 25?

@mindjuice
Copy link

Most pagination APIs don't require you to send page numbers, but rather item numbers.

You provide start as the starting item number, and limit as the maximum number of items to return. Then for the next request, you increment start by the number of items received.

This is more flexible and lets you have multiple ways of "paging":

  • Move forward one item: [ next ]
  • Move forward 5 items: [ > ]
  • Move forward 10 items: [ >> ]

@lukechampine
Copy link
Member

It's also standard to include links to the next/previous pages in the header of the response. Here's an example from the GitHub API: https://developer.github.com/v3/guides/traversing-with-pagination. We don't have to add those now, but it's a good thing to keep in mind.

@tbenz9
Copy link
Collaborator

tbenz9 commented Dec 7, 2017

@DavidVorick as the reporter of #1478 do you have a preference on having the pagination delineate by total pages or item numbers? It seems there are pros/cons to both ideas and either would probably work fine.

@hudaniel
Copy link
Author

hudaniel commented Dec 8, 2017

Cool, I've seen pagination done both ways. But yeah the one that specifies the start might be better.

@hudaniel
Copy link
Author

hudaniel commented Dec 8, 2017

sample response for

localhost:9980/wallet/addresses?start=1&limit=5
{
    "addresses": [
        "2f03d9fcf3de30b905ae5cb992aed0f2a82dac9bc3ddbce630c8fe3b64fa633e6782fde4549e",
        "371bd3af1f153d246403c442ee91c6b042f3d7d1e42b35b61b2f8a2e6f2ff7e9691a51612e68",
        "40315eec5bbb186c3ec02fd385db8bd7125dc42d8709d8cf4cec7f85535ecb5afca1812d7cb7",
        "4267a2f33659f98770b4d90db7a0ac0af4bfb241c27bfc5721b9f3a347129e32cf58118354f9",
        "75ca2454b01b78b1b1aadd69c78f619aac03faf6d045fe012c25b28dfdf1c56f46c91b1cc77e"
    ],
    "pagination": {
        "start": 1,
        "limit": 5,
        "total_pages": 2
    }
}

@hudaniel
Copy link
Author

hudaniel commented Dec 8, 2017

Also, I'm not understanding the tests. I think the failure are unrelated to my changes?

@DavidVorick
Copy link
Member

Hey, sorry to let this go for two weeks without a response.

If you rebase to the current master and push, some of the test failures should disappear. The master branch has been in an unstable state for a while, but I think that we've finally cleaned it up.

With regards to the pagination, my preference would be to have a starting item and a maximum number of entries to return. Let's keep it simple, and just start with that for now, instead of also returning links for 'next' and 'prev'.

api/api.go Outdated

"github.com/NebulousLabs/Sia/build"
"github.com/NebulousLabs/Sia/modules"
)

const (
// size of each page for paginated API responses
DEFAULT_PAGINATION_SIZE = 25
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably default to something more like 1000. Also, convention is to put constants like this in consts.go, and the naming scheme would be DefaultPaginationSize instead of using all caps and underscores.

api/api.go Outdated
PaginationQueryIsValid bool
Start int
Limit int
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to run go fmt here. If you don't run go fmt then the build will always fail.

Best thing to do though is to run make test-long before you push code, because that'll do all the formatting, linting, vetting, and of course testing for you. Takes about 10 minutes on a desktop w/SSD.

@DavidVorick
Copy link
Member

have to run for a call, will continue review later. All functions need docstrings that explain what they do. This code overall is not very well commented.

@tbenz9
Copy link
Collaborator

tbenz9 commented Dec 20, 2017

@hudaniel I would recommend becoming familiar with this document for naming conventions and Sia developer tips. Thanks for your work so far, we're excited to see this PR get merged once it's complete.

https://github.com/NebulousLabs/Sia/blob/master/doc/Developers.md

@hudaniel
Copy link
Author

Thanks for the review, I will address these comments after the holidays!

@tbenz9
Copy link
Collaborator

tbenz9 commented Feb 22, 2018

@hudaniel Any chance we can get you to revive this PR? I know several people are still interested in the feature and you have got the best chance of pushing it through.

@hudaniel
Copy link
Author

hudaniel commented Feb 22, 2018 via email

@hudaniel
Copy link
Author

updated @tbenz9 @DavidVorick

@DavidVorick
Copy link
Member

Thanks, we will be looking at this as soon as possible.

@mtlynch
Copy link
Contributor

mtlynch commented Feb 26, 2018

Thanks for doing this work! In my Sia load test, I'm seeing siad latency reaching up to 300+ seconds, so pagination is a welcome feature.

An issue with this implementation is that it doesn't protect the client against changes to the underlying data between paginated calls.

This doesn't feel like such a big deal with transactions because transactions can't be deleted and the API guarantees chronological order, but consider if we apply this to something like /renter/files where items can be removed and the server does not guarantee ordering.

Imagine this scenario:

  1. The server has files [X, Y, Z]
  2. Client calls /renter/files?start=0&limit=1
  3. Server returns {file=X, start=0, limit=1, total_pages=3}
  4. Server adds file W, so now it has files [W, X, Y, Z]
  5. Client calls /renter/files?start=1&limit=1
  6. Server returns {file=X, start=1, limit=1, total_pages=4} <- dupe result

What does the client do in that situation? Do they have to throw everything away and start the query over? If the underlying data is changing more frequently than the client can iterate through the total pages, the client would never finish a query.

APIs usually handle this with page tokens. BigQuery does this, for example.

Instead of the server reporting the state at the time of each page call, they give the client a token associated with a set of immutable results. Even if the underlying data changes, the server will let the client iterate through the results snapshotted at the time when the page token was created.

Relatedly, it would be nice to allow the client to do a "dry" query where they request no data except the page token. My reading of the current implementation is that the client can't get metadata about the total size of the list unless they request at least one element of data. This is awkward because what most clients want to do is something like:

token := getPageToken(query)
while token.hasMoreElements() {
  results = append(results, getNextPage(token))
}

@lukechampine
Copy link
Member

@mtlynch That's a nice property to have, but the implementation would be difficult. Could tokens be tacked on afterward without breaking compatibility?

I'm also wondering if pagination is really required for wallet transactions. You can already query them by block height, so shouldn't the client be able to handle the pagination? There also isn't an issue of reordering/invalidation, because new transactions should only be added to the end of the list.

@tbenz9
Copy link
Collaborator

tbenz9 commented Feb 27, 2018

I'm also wondering if pagination is really required for wallet transactions.

"All wallet calls which return potentially large amounts of data (like the transactions or the addresses) need to support pagination in the API." --David Vorick in #1478

@mtlynch
Copy link
Contributor

mtlynch commented Feb 27, 2018

@mtlynch That's a nice property to have, but the implementation would be difficult. Could tokens be tacked on afterward without breaking compatibility?

Would it be possible to make a page token, but just have it get invalidated any time an element is added or removed in the underlying data?

That way, the server doesn't have to maintain snapshots, but the client knows when they need to re-start a paged query because the data shifted. It should be backwards compatible from the client's end because they have to handle page token expiration anyway. If we later improve the server so that it keeps page tokens (and their associated snapshots) valid for 10 mins, then the clients don't have to change their code, but they'll encounter expired tokens much less frequently.

It still has the problem that the client can't achieve a single, consistent "view" of the underlying data, but that doesn't seem critical.

Copy link
Contributor

@ChrisSchinnerl ChrisSchinnerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing this! I'd just like to see some refactoring to simplify the code.The functionality looks fine to me at a first glance but we still need tests for each endpoint that supports pagination in our new siatest package.

@@ -29,6 +32,25 @@ type Error struct {
// be valid or invalid depending on the current state of a module.
}

type PaginationRequest struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need that struct at all. We can just return multiple return values in the methods that use it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw all exported structs and methods need a comment.

Limit: paginationRequest.Limit,
TotalPages: totalPages,
}
return pagination, paginationResponse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pagination and paginationResponse have exactly the same fields. Why not just do a
return startPageIndex, endingPageIndex, totalPages? That reduces the length of the method by 50%. You might even want to use named return values instead. That way it is just a return` at the end of the function.

build.Critical("pagination request limit must be greater than 0")
}
slice := reflect.ValueOf(sliceInterface)
if slice.Kind() != reflect.Slice {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be a bit complicated for just getting the length of arbitrary data. Why not just change sliceInterface interface{} to dataLen int ?

// returns the starting index, end index, and total pages of a given slice for the page
func GetPaginationIndicesAndResponse(sliceInterface interface{}, paginationRequest PaginationRequest) (PaginationWrapper, PaginationResponse) {
if paginationRequest.Limit <= 0 {
build.Critical("pagination request limit must be greater than 0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build.Critical should be used as a last resort. In production this will only print to the terminal and will continue the execution of the program. In a debug build this panics.
You should rather return an errors.New("pagination request limit must be greater than 0") and write that error to the http response later.

startingPageIndex := int(math.Min(float64(paginationRequest.Start), float64(slice.Len())))
endingPageIndex := int(math.Min(float64(paginationRequest.Start+paginationRequest.Limit), float64(slice.Len())))
totalPages := slice.Len() / paginationRequest.Limit
if math.Mod(float64(slice.Len()), float64(paginationRequest.Limit)) != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably don't need math.Mod with all the float casting. Instead you can do if slice.Len() % paginationRequest.Limit != 0

@@ -162,3 +184,77 @@ func WriteJSON(w http.ResponseWriter, obj interface{}) {
func WriteSuccess(w http.ResponseWriter) {
w.WriteHeader(http.StatusNoContent)
}

func GetPaginationDefaultRequest(req *http.Request) PaginationRequest {
return GetPaginationRequest(req, "start", "limit")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the custom names for start and limit. If we do that, the names of the fields of PaginationResponse should probably also be custom to match the fields in the request.

End int
TotalPages int
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are probably enough new pagination types and methods to move the generic ones into a node/api/pagination.go file instead of having them in api.go.
The ones that are specific to a module can still be in the corresponding [module].go file.

Addresses: addresses,
Pagination: paginationResponse,
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above I would change the call to GetPaginationRequest so that you get something like the following:

start, limit, err := GetPaginationDefaultRequest(req)
...
if err == errNoPagination {
  // legacy behavior
  return
}
// new behavior

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can create a custom error like errNoPagination by declaring it at the top of the file. You can find an example in modules/transactionpool/transactionpool.go where we declare errNilCS and errNilGateway at the top.
Of course that also means you have to do a return errNoPagination in GetPaginationRequest if no params were specified.

ConfirmedTransactions: confirmedTxns,
UnconfirmedTransactions: unconfirmedTxns,
})
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to avoid nested if statements. You can get rid of the else by returning after the call to WriteJSON. In the API we usually don't do anything after calling WriteJSON or WriteError which means you can usually simplify your code by returning right afterwards.

@@ -49,6 +49,9 @@ var (
SiafundCount = NewCurrency64(10000)
SiafundPortion = big.NewRat(39, 1000)
TargetWindow BlockHeight

// size of each page for paginated API responses
DefaultPaginationSize = 50
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually try to keep the constants in a consts.go file within the module it is used in. Since it is only 1 declaration you can probably add defaultPaginationSize to a node/api/pagination.go file together with the code or move it into a node/api/consts.go file.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants