Spring 2019 - Applied Distributed Systems

(Previously called Science Gateway Architectures)

  • Course: CSCI-B 649, Topics in Systems, Computer Science, School of Informatics and Computing, Indiana University
  • Instructors: Marlon Pierce, marpierc@iu.edu; Suresh Marru, smarru@iu.edu

  • Class Schedule: Friday from 1:00 pm to 3:30 pm in I2 (Informatics East), Room 130
  • Office Hours: To Be Scheduled

Course Overview

Science gateways are distributed computing environments that enable scientists to conduct computational experiments on computing clouds and supercomputers. They have revolutionized bioinformatics, computational chemistry, nano-engineering, atmospheric science, and other scientific fields by bringing unprecedented computing power to a broad community of scientists. Gateways are also interesting topics in their own right. Modern gateway systems use microservice architectures and DevOps principles in their design and operations, adopting lessons learned from cloud-based Software as a Service activities. Distributed systems are designed to scale software so that it can handle large amounts of data and achieve better performance. They inherently face challenges related to scaling, because a system consists of multiple processes that may run on different hardware systems. The challenges in distributed systems can be broadly categorized as follows:

  • Scalability
  • Efficiency: The system handles large amounts of data, so performance is important
  • Fault tolerance: Multiple processes now work together to solve a problem. If one process goes down, the system may not be able to solve the problem.
  • Operation: Easy scaling reduces operational complexity and cost
  • Avoid over-engineering
  • Ability to work with multiple devices
  • Change management
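
As a rough illustration of the fault tolerance challenge listed above, here is a minimal Python sketch of failover: a task is sent to a list of replica workers, and a worker that is down is skipped in favor of the next one. Every name here (run_with_failover, crashed_worker, healthy_worker, AllReplicasFailed) is hypothetical and chosen only for this example. Treating ConnectionError as "the process is down" is an assumption, not something prescribed by the course.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no replica could complete the task."""


def run_with_failover(task: int, replicas: Sequence[Callable[[int], int]]) -> int:
    """Try each replica in order and return the first successful result."""
    errors = []
    for replica in replicas:
        try:
            return replica(task)
        except ConnectionError as exc:
            # Assumption: a connection error means this process is down,
            # so record the failure and move on to the next replica.
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed for task {task!r}")


def crashed_worker(task: int) -> int:
    # Hypothetical worker standing in for a process that has gone down.
    raise ConnectionError("worker unreachable")


def healthy_worker(task: int) -> int:
    # Hypothetical worker that squares its input.
    return task * task


if __name__ == "__main__":
    # The first replica fails, so the result comes from the second one.
    print(run_with_failover(7, [crashed_worker, healthy_worker]))
```

Without the second replica, the single failure would make the whole task fail, which is the situation the fault tolerance item describes. Real systems usually add timeouts, retries with backoff, and health checks on top of this idea.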

In this course, students will be divided into development teams, and each team will build a distributed software-as-a-service system from scratch. Teams will be encouraged to explore alternative technologies and approaches for building systems, and to learn DevOps principles such as containerization, continuous integration, and continuous deployment for deploying robust cloud services. Students will also be introduced to the Apache Software Foundation’s open community governance principles for open source software and will learn how to interact effectively with Apache Software Foundation projects in order to become committers and project management committee members. Finally, students will have an opportunity to apply what they have learned to Apache Airavata-based science gateways.

Course Objectives

  • Provide a high-level, broad understanding of the application of core distributed computing systems concepts to “Software as a Service” systems that support scientific research and education.
  • Study both abstract concepts and practical techniques for building science gateways.
  • Provide hands-on experience in developing a science gateway while working with open source philosophies modeled after those of the Apache Software Foundation.
  • Apply the general concepts of distributed systems and understand the state of the art in applicable areas.

Course Outcomes

  • Demonstrate an applied understanding of microservice architectures and their underlying distributed systems foundations.
  • Demonstrate an applied understanding of the DevOps principles of continuous integration and delivery in the development and operations of science gateways.
  • Demonstrate an understanding of open source practices, particularly those of the Apache Software Foundation.
  • Demonstrate an ability to develop remote job submission interfaces to computational cyberinfrastructure such as IU's Big Red 2 supercomputer.
  • Demonstrate an ability to develop a simple metadata management system.
  • Demonstrate an ability to develop and consume API services.

Course Structure

Course Goal: Working in teams of 2 to 3, students will learn modern distributed computing concepts, apply them to a standalone Apache Airavata, and contribute their work to the code base.

  • All assignment reports are individual assignments even though projects are executed within groups.
  • The course will focus on learning basic distributed computing concepts, microservices, DevOps, etc. Students interested in applying these concepts to Apache Airavata can do so in the Advanced Science Gateway Architectures course.


The course will be taught by Marlon Pierce and Suresh Marru, who lead the Pervasive Technology Institute’s Science Gateways Research Center and are members of the Apache Software Foundation and project management committee members for the Apache Airavata open source software.


  • Homework Assignments:
    • Each assignment is worth 18 points
      • You will get 18 points for your submission
      • You will get 2 points for peer reviewing other submissions
    • There will be 4 assignments
    • Assignments will be graded on functionality of the submission, which must be documented in your wiki
    • Each assignment is submitted as a Wiki entry in GitHub
    • Each project gets a GitHub repo in https://github.com/airavata-courses
    • Each assignment must be submitted for initial peer grading via GitHub issues, followed by full submission (with corrections) to Canvas.
  • Presentations and Reports
    • Each student makes a presentation (5 points)
    • Each student submits a report (5 points)
    • Midterm: 10 points
    • Final: 10 points


During the course, instructors will provide references to journal and conference papers. A good understanding of the concepts discussed in these referenced papers will greatly help in absorbing the course material.

Beginner Materials

Open Collaboration

  • Reusing and building upon ideas or code are major parts of modern software development. As a professional programmer you will never write anything from scratch. This class is structured such that all solutions are public. You are encouraged to learn from the work of your peers. We won’t hunt down people who are simply copying and pasting solutions, because without challenging themselves, they are simply wasting their time and money taking this class.

  • Please respect the terms of use and/or license of any code you find, and if you reimplement or duplicate a design or code from elsewhere, credit the original source.