All the labs were done using Google Cloud Platform (GCP) for the course CSCI 5253: Datacenter Scale Computing.
The course covers the primary problem-solving strategies, methods, and tools needed for data-intensive programs that run on large collections of computers, typically called "warehouse scale" or "data-center scale" computers. The course also examines methods and algorithms for processing data-intensive applications, methods for deploying and managing large collections of computers in an on-demand infrastructure, and issues of large-scale computer system design.
- A quick setup of the Google Cloud Platform
- Converting WordCount Map-Reduce example to URLCount using Hadoop
- Chain Mappers/Reducers application using Hadoop on Google Cloud Platform
- Application that demonstrates PySpark and its DataFrame (DF) functions
- Constructing VM instances on Google Cloud Platform programmatically
- Compare REST and gRPC API calls for latency and bandwidth on Google Cloud Platform
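
As a rough illustration of the WordCount-to-URLCount conversion listed above, the sketch below mimics the map and reduce phases in plain Python rather than Hadoop. It is not the lab's actual code: the function names (`map_urls`, `reduce_counts`, `url_count`) are hypothetical, and it assumes a "URL" means the value of a double-quoted `href="..."` attribute in the input text.

```python
import re
from collections import Counter

# Assumption: a URL is whatever appears inside a double-quoted href attribute.
HREF_PATTERN = re.compile(r'href="([^"]*)"')


def map_urls(line):
    """Map phase: emit a (url, 1) pair for every href found in one input line."""
    return [(url, 1) for url in HREF_PATTERN.findall(line)]


def reduce_counts(pairs):
    """Reduce phase: sum the counts emitted for each distinct URL."""
    totals = Counter()
    for url, count in pairs:
        totals[url] += count
    return dict(totals)


def url_count(lines):
    """Run the map phase over all lines, then reduce the combined pairs."""
    pairs = []
    for line in lines:
        pairs.extend(map_urls(line))
    return reduce_counts(pairs)
```

The only change from a word count is the map step: instead of splitting each line into words, it extracts the URLs, while the reduce step still sums the counts per key.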