K-State remotely hosted a Big Data workshop taught by the Pittsburgh Supercomputing Center on their Bridges supercomputer. This course is based entirely on videos from that two-day workshop, held April 7-8, 2020. Anyone taking this online course can work through the videos at their own pace, complete the exercises and homework assignments, and test themselves using the quizzes at the end.
Everything needed has been adapted and tested for the Beocat cluster computer at Kansas State University and the BeoShock cluster at Wichita State University. Each user can run their jobs interactively or submit them through the Slurm batch scheduler.
For each section, there is a video to listen to and some PDF slides that you can follow along with, plus directions here on how to do the same work on Beocat/BeoShock. The > sign at the start of lines below represents the command line prompt on Beocat/BeoShock, and >>> represents the prompt you’ll get when you start pyspark or python if you run interactively.