Parallel and Distributed Computing
This is a collection of parallel and distributed computing projects I have completed. The levels of parallelism range from data-level SIMD, to thread-level OpenMP, to Spark-based MapReduce.
- Project 1: Homemade Numpy (spec)
  - Design and implement a slower version of NumPy that supports cache-optimized parallel matrix computations.
  - Highlights: C, SIMD, OpenMP
- Project 2: Yelp Ratings Prediction (spec)
  - Use the MapReduce programming paradigm in Spark to parallelize a Naive Bayes classifier with a Bag of Words model that predicts Yelp review ratings.
  - Highlights: Python, Spark, MapReduce
- Project 3: Parallel Huffman Coding (report)
  - Implement a parallel algorithm to generate Huffman codes.
  - Highlights: Java multithreading
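
To give a feel for the idea behind Project 2, here is a minimal sketch of Naive Bayes training expressed as a map step and a reduce step. It is not the project code: it uses plain Python's `map` and `functools.reduce` as a stand-in for Spark's `flatMap` and `reduceByKey`, and the toy `REVIEWS` data, the function names, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter
from functools import reduce

# Hypothetical toy data: (star rating, review text) pairs.
REVIEWS = [
    (5, "great food great service"),
    (5, "great place"),
    (1, "terrible food"),
    (1, "terrible slow service"),
]


def map_review(review):
    """Map step: emit ((rating, word), 1) for every word in one review."""
    rating, text = review
    return [((rating, word), 1) for word in text.lower().split()]


def reduce_counts(acc, pairs):
    """Reduce step: sum the emitted counts per (rating, word) key."""
    for key, count in pairs:
        acc[key] += count
    return acc


def train(reviews):
    """Build bag-of-words counts per rating, plus review counts per rating."""
    word_counts = reduce(reduce_counts, map(map_review, reviews), Counter())
    class_counts = Counter(rating for rating, _ in reviews)
    vocab = {word for _, word in word_counts}
    return word_counts, class_counts, vocab


def predict(text, word_counts, class_counts, vocab):
    """Return the rating with the highest log-posterior (add-one smoothing)."""
    total_reviews = sum(class_counts.values())
    best_rating, best_score = None, -math.inf
    for rating, n_reviews in class_counts.items():
        total_words = sum(c for (r, _), c in word_counts.items() if r == rating)
        score = math.log(n_reviews / total_reviews)  # log prior
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            likelihood = (word_counts[(rating, word)] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_rating, best_score = rating, score
    return best_rating


if __name__ == "__main__":
    model = train(REVIEWS)
    print(predict("great service", *model))
```

Because each review is mapped independently and the reduce step only sums counts per key, the same structure can be distributed across workers, which is what makes the classifier a natural fit for MapReduce.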