Data deduplication is an essential and critical component of backup systems: essential because it reduces storage space requirements, and critical because the throughput of the entire backup operation depends on it. Traditional backup workloads consist of large data streams with high locality, which existing deduplication techniques rely on to provide reasonable throughput. In this project, our team studied and implemented four deduplication methods: Fixed-Length Chunking, Variable-Length Chunking, Sliding Gate Chunking, and Extreme Binning.
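To illustrate the difference between the first two methods, the sketch below contrasts fixed-length chunking with content-defined (variable-length) chunking using a Rabin-Karp style rolling hash. This is a minimal Python sketch, not the project's actual code; the window size, mask, and chunk-size limits are illustrative choices.

```python
# Minimal sketch (not the project's code) contrasting fixed-length chunking
# with content-defined (variable-length) chunking.  Window size, mask, and
# chunk-size limits below are illustrative choices.

def fixed_length_chunks(data: bytes, size: int = 4096):
    """Cut every `size` bytes; one inserted byte shifts every later boundary,
    so previously identical chunks no longer match."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_length_chunks(data: bytes, window: int = 48, mask: int = 0x0FFF,
                           min_size: int = 2048, max_size: int = 16384):
    """Cut where the low bits of a rolling hash over the last `window` bytes
    are zero, so boundaries depend on content and realign after insertions."""
    B, M = 257, (1 << 31) - 1        # hash base and modulus (illustrative)
    B_w = pow(B, window, M)          # B**window mod M, to drop the oldest byte
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = (h * B + byte) % M                       # add the newest byte
        if i >= window:
            h = (h - data[i - window] * B_w) % M     # drop the oldest byte
        length = i - start + 1
        if (length >= min_size and (h & mask) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

With content-defined boundaries, prepending data to a stream shifts only the first chunk or two, while fixed-length chunking re-cuts everything after the insertion point, which is why the variable-length methods deduplicate shifted data much better.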
The results obtained from the project clearly show the trade-off between deduplication efficiency and computation time. From the client's point of view, computation time should be small, but storage should also be used efficiently. So, for a company setting where both efficiency and time matter, we conclude that the Extreme Binning algorithm is the best choice. It performs well even when data is prefixed (bytes inserted at the start of a stream), because it uses variable-length chunking, whose content-defined boundaries survive such shifts.
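The sketch below illustrates the two-tier indexing idea behind Extreme Binning, where each file's minimum chunk fingerprint selects a single bin and deduplication is performed only within that bin. It is a hedged illustration that reuses the `variable_length_chunks` sketch above; the class and attribute names (`ExtremeBinningStore`, `primary`, `bins`) are invented for this example, not taken from the project's implementation.

```python
# Hedged sketch of the Extreme Binning idea: a small in-RAM primary index maps
# each file's representative (minimum) chunk fingerprint to one bin, and chunks
# are deduplicated only against that bin.  Names are illustrative only;
# `variable_length_chunks` is the sketch shown earlier.
import hashlib
from collections import defaultdict


class ExtremeBinningStore:
    def __init__(self):
        self.primary = {}               # representative fingerprint -> bin id (RAM)
        self.bins = defaultdict(dict)   # bin id -> {chunk fp: chunk} (disk in practice)
        self.stored_bytes = 0           # bytes actually kept after deduplication

    def backup_file(self, data: bytes) -> None:
        chunks = variable_length_chunks(data)            # content-defined chunks
        fps = [hashlib.sha1(c).hexdigest() for c in chunks]
        rep = min(fps)                                   # representative chunk ID
        bin_id = self.primary.setdefault(rep, rep)       # one bin per representative
        bin_index = self.bins[bin_id]                    # only this bin is consulted
        for fp, chunk in zip(fps, chunks):
            if fp not in bin_index:                      # dedup only within the bin
                bin_index[fp] = chunk
                self.stored_bytes += len(chunk)
```

Because only one bin is consulted per file, the in-memory index stays small and lookups stay fast, at the cost of occasionally storing a chunk twice when similar files land in different bins; this is the efficiency-versus-time trade-off referred to above.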
This project was done in 2017; the contents of the original repositories were copied into this repository for backup in 2020.