
Reading a file line by line does not pause #39

Open
cimi opened this issue Jun 24, 2013 · 1 comment


cimi commented Jun 24, 2013

I've tried this simple code using node-lazy:

var fs = require('fs');
var Lazy = require('lazy');

var count = 0; // lines seen so far

// datafile and normalizeAttributes are defined elsewhere in the project.
var productStream = function (readstream) {
  return new Lazy(readstream)
    .lines.forEach(function (line) {
      console.log(count++);
      if (count > 10) {
        console.log("Should stop");
        readstream.pause();
      }
      return normalizeAttributes(JSON.parse(line.toString().slice(0, -1)));
    });
};

console.log(productStream(fs.createReadStream(datafile)).take(5));

Lazy does not stop after the tenth line; it sweeps through the whole file. I've tried with a fairly large one (20 000 lines). If I handle the stream events and split the lines myself, I get the expected behavior.

I've also asked this on Stack Overflow, but since I got no answer there I thought I'd go straight to the source. :)

@mpelikan

Initially, I thought the same thing with a project that I was working on. However, after I dug into it, it appears that Lazy is working properly.

The stream that you pass into Lazy is being paused; however, you need to consider the stream's block size. In earlier versions of Node, you could pass a bufferSize option to fs.createReadStream, but it is now deprecated and ignored. As a result, you may not realize that the stream reads in a large amount of data with each call (OS dependent).

Lazy handles all of this behind the scenes, but the filesystem reads arrive in large chunks, and pausing the stream only prevents further reads. It does not pause Lazy's processing of the buffer that a read has already returned.

In my case, each fs.read returned thousands of lines. When I paused the stream inside one of Lazy's lines.forEach() callbacks, the stream did pause, but the thousands of lines already in Lazy's buffer were processed before the pause took effect.
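
To make the effect concrete, here is a rough sketch of the behaviour described above. It is a simplified model, not node-lazy's actual implementation: makeSource, processLines and makeChunk are hypothetical names, and the chunk size of 1000 lines is just an assumption for illustration. The first run pauses the source on the tenth line and still sees the whole first chunk; the second run also returns false from the callback so the line loop itself can stop.

```javascript
// Simplified model only; none of these helpers come from node-lazy or Node.

// A fake stream: each read() hands back one whole chunk unless paused.
function makeSource(chunks) {
  var paused = false;
  var next = 0;
  return {
    pause: function () { paused = true; },
    read: function () {
      if (paused || next >= chunks.length) return null;
      return chunks[next++];
    }
  };
}

// Splits every delivered chunk into lines and calls onLine for each one.
// If onLine returns false, the rest of the current chunk is skipped too.
function processLines(source, onLine) {
  var chunk;
  while ((chunk = source.read()) !== null) {
    var lines = chunk.split('\n').filter(Boolean);
    for (var i = 0; i < lines.length; i++) {
      if (onLine(lines[i]) === false) return;
    }
  }
}

// Builds one chunk of 1000 newline-terminated lines.
function makeChunk(start) {
  var lines = [];
  for (var i = 0; i < 1000; i++) lines.push('line ' + (start + i));
  return lines.join('\n') + '\n';
}

// Case 1: pausing the source only prevents the NEXT read.
var sourceA = makeSource([makeChunk(0), makeChunk(1000)]);
var seenWithPause = 0;
processLines(sourceA, function () {
  seenWithPause++;
  if (seenWithPause === 10) sourceA.pause();
});

// Case 2: the callback also tells the line loop to stop.
var sourceB = makeSource([makeChunk(0), makeChunk(1000)]);
var seenWithGuard = 0;
processLines(sourceB, function () {
  seenWithGuard++;
  if (seenWithGuard === 10) {
    sourceB.pause();
    return false;
  }
});

console.log(seenWithPause); // 1000: the first chunk is processed in full
console.log(seenWithGuard); // 10
```

In this model the pause is honoured in both cases; the difference is only whether the lines already sitting in the current chunk keep flowing to the callback. With real streams the chunk boundaries depend on how much each read returns, so the exact number of extra lines will vary.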
