Issues: mosaicml/streaming
Issues list
Expose StreamingDataset world (or world_size and rank) as argument
enhancement
New feature or request
#854
opened Dec 19, 2024 by
lukemelas
Will cache eviction logic take previously-existing shards into account?
#844
opened Dec 5, 2024 by
jamin-chen
Pipeline Parallelism (Supported? How to?)
enhancement
New feature or request
#827
opened Nov 14, 2024 by
casper-hansen
UnicodeDecodeError: ... Efficient way to debug the dataset with streaming?
enhancement
New feature or request
#820
opened Nov 1, 2024 by
TAYmit
Choose JPEG compression level
enhancement
New feature or request
#811
opened Oct 24, 2024 by
cabreraalex
Support for on-the-fly filtering
enhancement
New feature or request
#800
opened Oct 9, 2024 by
ColinToft
Make epoch_sample_ids cachable
enhancement
New feature or request
#792
opened Sep 28, 2024 by
janEbert
Dataset does not work after stopping training
bug
Something isn't working
#781
opened Sep 15, 2024 by
gluonfield
JointWriter: Allow shard file appending
bug
Something isn't working
#775
opened Sep 5, 2024 by
janEbert
File exists: '/000000_epoch_shape' when using the ddp strategy from pytorch lightning
bug
Something isn't working
#767
opened Aug 25, 2024 by
elbamos
Estimate total shards at the beginning of data conversion
enhancement
New feature or request
#742
opened Aug 3, 2024 by
abhijithneilabraham
huge temp files while uploading data using MDS writer
bug
Something isn't working
#734
opened Jul 24, 2024 by
MaxxP0
Replication changes sample order
bug
Something isn't working
#725
opened Jul 15, 2024 by
CodeCreator
'File exists: "/00000_locals"' when integrated with deepspeed training scripts
bug
Something isn't working
#717
opened Jul 8, 2024 by
Clement25
All processes allocate memory on rank 0 during StreamingDataset initialization in a distributed setting
bug
Something isn't working
#716
opened Jul 2, 2024 by
ohallstrom
Optional dependency for different storages?
enhancement
New feature or request
#709
opened Jun 24, 2024 by
huxuan
Suboptimal usage of 8xH100 GPUs - Streaming dataloader speed significantly fluctuates across batches
bug
Something isn't working
#686
opened May 25, 2024 by
VSehwag
Last entry in the dataset is causing "Relative sample index $x is not present" error
bug
Something isn't working
#677
opened May 20, 2024 by
isidentical