Pure python solution #3

Open
abhishekkrthakur opened this issue Feb 8, 2021 · 2 comments

Comments

@abhishekkrthakur
Owner

Here is a solution that I thought of. It runs in 5.5 seconds on my machine:

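import glob
import os
import time

start = time.time()

# Directory holding the input CSV files
path = "../data/"
# Match every CSV file in the input directory
all_files = glob.glob(os.path.join(path, "*.csv"))

# Header row, written once at the top of the merged file
cols = ["f_0", "f_1", "f_2", "f_3", "f_4", "f_5", "f_6", "f_7", "f_8", "f_9", "target"]
with open("out.csv", "w") as out:
    out.write("%s\n" % ",".join(cols))
    for filename in all_files:
        with open(filename) as f:
            for idx, line in enumerate(f):
                # Skip each input file's own header (line 0)
                if idx > 0:
                    out.write(line)
end = time.time()
# Elapsed wall-clock time in seconds
print(end - start)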
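
A possible variant of the script above, offered only as a sketch: it copies each file in binary mode with `shutil.copyfileobj` instead of looping over lines in Python. The function name `merge_csvs` and its parameters are hypothetical, not part of the original script. It assumes every input file starts with exactly one header line and ends with a newline, and no timing claim is made for it.

```python
import glob
import os
import shutil


def merge_csvs(pattern, out_path, header):
    """Concatenate the CSV files matching `pattern` into `out_path`.

    Hypothetical helper: writes `header` once, then appends every input
    file minus its first line. Assumes each input ends with a newline.
    """
    out_abs = os.path.abspath(out_path)
    with open(out_path, "wb") as out:
        out.write((",".join(header) + "\n").encode("utf-8"))
        # sorted() makes the row order reproducible between runs
        for filename in sorted(glob.glob(pattern)):
            # Never copy the output file into itself
            if os.path.abspath(filename) == out_abs:
                continue
            with open(filename, "rb") as f:
                f.readline()  # skip this file's own header line
                shutil.copyfileobj(f, out)  # bulk-copy the remaining bytes
```

It could be called as `merge_csvs("../data/*.csv", "out.csv", cols)`, with `cols` being the header list from the script above.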
@sbarthwal

Could you (@abhishekkrthakur) please share your machine configuration?
I ask because the script takes 124.09613084793091 s on my machine.
My machine configuration:
MacBook Pro (15-inch, 2019)
2.3 GHz Intel Core i9
16 GB 2400 MHz DDR4
Radeon Pro 560X 4 GB
Intel UHD Graphics 630 1536 MB

Thank you

@abhishekkrthakur
Owner Author

Core i7, 32 GB RAM, but that shouldn't matter. 124 s versus 5 s is a huge difference! Something else is wrong.
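
One way to narrow down a gap like this, sketched here under assumptions rather than as a known fix: time a pass that only reads the input files and writes nothing. The helper name `time_read_only` is hypothetical. If the read-only pass is already slow, the reads are the likely bottleneck (for example a cold disk cache or a synced or network folder); if it is fast, the time is probably going into the writes to out.csv.

```python
import glob
import time


def time_read_only(pattern):
    """Read every line of every file matching `pattern`, writing nothing.

    Hypothetical diagnostic helper: returns (line_count, elapsed_seconds).
    """
    start = time.perf_counter()
    n_lines = 0
    for filename in glob.glob(pattern):
        with open(filename) as f:
            for _ in f:
                n_lines += 1
    return n_lines, time.perf_counter() - start
```

Calling `time_read_only("../data/*.csv")` and comparing the elapsed time with the full merge script's time shows roughly how the cost splits between reading and writing. Running it twice also helps, since the second run usually benefits from the operating system's file cache.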
