Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add (nonblocking) read(buf, len) to Stream class #54

Open
matthijskooijman opened this issue Aug 22, 2014 · 9 comments
Open

Add (nonblocking) read(buf, len) to Stream class #54

matthijskooijman opened this issue Aug 22, 2014 · 9 comments
Assignees

Comments

@matthijskooijman
Copy link
Collaborator

Currently, the Stream class only has a (virtual) read() method to read one character at a time. It does have a readBytes() method to read multiple bytes at a time, but hat one blocks (with a timeout) and is not virtual, so it resorts to single-character reads.

It would be good to have a virtual multiple-byte reading method, which allows:

  • Callers to easily read multiple bytes into a buffer
  • Stream implementations to do efficient multiple-byte reads (e.g. prevent one SPI transfer per byte)

The Client class already has exactly this method, probably for the above reasons. Adding a virtual version of it to Stream means that Client implementations will support it out of the box.

We can also add a default implementation to the Stream class (using the existing read() method), that should keep all existing Stream implementations working. The default implementation would be something like:

Something like this (untested):

int read(uint8_t *buf, size_t size) {
    if (!available())
        return -1;
    int res = 0;
    while((int c = read()) >= 0 && size--)
        buf[res] = c;
    return res;
}

Note that this has a slightly peculiar return value: it returns -1 when
there is no data, to match the (Ethernet)Client implementation (which
returns 0 when the connection is closed). I guess it would make sense
swapping these, but that would break backward compatibility for the
Ethernet and (to a lesser degree, see arduino/Arduino#2251) the Wifi library.

How does all this sound?

@PaulStoffregen
Copy link

Why not just make readBytes() virtual? When the timeout is set to zero, it's the same as this proposed function.

@millerabel
Copy link

The proposed implementation above could be improved in a couple of small ways:
(by replacing the while loop in the above code...)

// Don't read if there is no place to put the byte:
while (size-- && (int c = read()) >= 0) {

    // Increment res after each byte is stored:
    buf[res++] = c;

    // Don't block when we run out of bytes:
    if (!available())
        break;
}

With these changes,

  • If no bytes available before calling, returns -1.
  • Otherwise, returns the actual number of bytes that were available(), read() and stored in buf.
  • If called with size == 0, returns zero without calling read().

Improve readBytes() by making it virtual, to allow redefinition, and with the above as the zero timeout behavior. There should remain a blocking variant of readBytes() that does not accept a timeout parameter.

@matthijskooijman
Copy link
Collaborator Author

Why not just make readBytes() virtual? When the timeout is set to zero, it's the same as this proposed function.

I don't really like this approach, but hadn't thought about why until now. Two problems I can see:

  • If you want to read multiple bytes from within a library, you have to modify the timeout that the user may have set. I don't really like a "change, read, change back" approach in the first place, but I don't think there is even a way to get the current timeout.
  • Every subclass that implements the efficient multiple-read thing, will also have to re-implement the timeout functionality. Likely to cause small differences in implementation as well.

Hence I'd vote for adding a new virtual method instead.

If no bytes available before calling, returns -1.

I don't really like this - there's no point to it except backward compatibility, which I'd be willing to break for 1.6.0. For ethernetclient, I don't think there is any point in distinguishing between "no data" and "connection closed" (if an application cares, it can just check EthernetClient::connected()). Applications can just check for <= 0 if they want to support 1.6 and pre-1.6, only problem would be programs that check for == -1 or < 0 now.

 // Don't block when we run out of bytes:
if (!available())
    break;

I don't think this is needed, calling read() doesn't block either.

while (size-- && (int c = read()) >= 0) {

I don't really like decrementing size before we're sure we actually have something to read, since the value of size will be wrong after the loop. IT doesn't really matter since it is not used there, but for future-proofness, how is this?

size_t read(uint8_t *buf, size_t size) {
    size_t res = 0;
    while (res < size && (int c = read()) >= 0)
        buf[res++] = c;
    return res;
}

I think this has all the points you mentioned, except the -1 return value.

I think

@millerabel
Copy link

I think this proposal improves on the prior proposals. I agree with your reasoning, and it is tighter for sure. Thank you.

However, none of the proposals that rely on repeatedly calling read() are appropriate for bus-based systems. In SPI for example, it is inefficient to establish and release bus select, and to re-address each byte transfer individually.

For the NFC library and application I'm developing, I must maintain > 1 Mb/s transfer rate between the Arduino and target. And the target relies on chip select continuity to define the boundaries of continuous addressing. The hardware is fully capable of supporting this (I've run it up to 4 Mb/s). The standard io library ought to support a virtual bulk transfer method that can be overridden so it's not dependent on repeated read() calls.

In overridden virtual readBytes() / writeBytes() / transferBytes() methods I would:

Establish bus master, then transfer all buffered bytes, (bidirectionally in the case of transferBytes, optionally using the same buffer for outgoing and incoming byte flow), then release the bus.

Overwriting readBytes(), writeBytes(), or transferBytes() would permit this implementation. And in my library, I would redefine read() in terms of readBytes as a special case, reversing the implementation dependency for this one type of bus. Same for write() in terms of writeBytes().

There would be no externally visible API difference, but performance would be greatly improved. So higher level libraries that depend on read() and readBytes() continue to work.

A reach goal would be to support the bulk transferBytes() also asynchronously, if a standard Arduino library pattern emerges for that (direct call to setup transfer, then with IRQ continuation support). I'll leave that to a separate discussion.

@matthijskooijman
Copy link
Collaborator Author

I'm not sure I understood everything you said on first reading (no more time now), but the idea was to make the new read function virtual, so it can be overridden. The above implementation is just the default implementation for existing subclasses / subclasses that do not care. Would that sufficiently address your usecase?

@millerabel
Copy link

Yes, if the new "size_t read(uint8_t *buf, size_t size)" is virtual, and there is a similar virtual writeBytes function, I can support my use case. (I write readBytes as a shorthand for this method signature).

I would then override the single byte read(uint8_t c) in terms of the overridden readBytes function. Same for write and writeBytes.

I will have to further study how to correctly implement the timeout behavior in a bus transfer.

@PaulStoffregen
Copy link

If you want to read multiple bytes from within a library, you have to modify the timeout that the user may have set. I don't really like a "change, read, change back" approach in the first place, but I don't think there is even a way to get the current timeout.

Are there really compelling use cases where both a library and the main sketch would need to use readBytes(), with different timeout settings?

Nearly all the libraries that use a Stream pointer or reference are designed to manage all the communication on that port. Bridge, Xbee, Firmata, are just a few examples. Are there any existing and widely used libraries that realistically need readBytes() with a configure, read, reconfigure approach to allow the sketch to also read from the same Stream?

Every subclass that implements the efficient multiple-read thing, will also have to re-implement the timeout functionality. Likely to cause small differences in implementation as well.

This is equally true for read() or readBytes(). Today, Print class objects that implement block write have their own implementations, if they don't use the fallback one.

@cmaglie cmaglie self-assigned this May 28, 2015
@sandeepmistry sandeepmistry transferred this issue from arduino/Arduino Sep 16, 2019
@lasselukkari
Copy link

lasselukkari commented Jan 18, 2021

I really would like to see this fixed. I have a class that takes Stream as a input. My own class is also derived from the Stream. Most of the the time the the wrapped Stream given as input is actually a Client but now I can not use the version of read that would be using the buffer and be a lot more performant. I can go around the problem by wrapping the input Stream to a Client class that has all the Client specific functions such as connect stubbed. This way the input could be a Stream or a Client but this feels stupid.

In my opinion it does not make any sense that the Client and Stream class would have any differences in the read and write functions. The Client should just extend the Stream with the IP related functions. I should not need to use the Client instead Stream to be able to read properly.

@lasselukkari
Copy link

lasselukkari commented Jan 19, 2021

@aentinger and @facchinm what do you think? Could this be fixed. The changes would be completely backwards compatible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants