Science Gateway Architectures

Course Overview


Science gateways are distributed computing environments that enable scientists to conduct computational experiments on computing clouds and supercomputers. They have revolutionized bioinformatics, computational chemistry, nano-engineering, and other scientific fields by bringing unprecedented computing power to a broad community of scientists.

Gateways are an interesting topic in their own right. Modern gateway systems are moving toward microservice architectures and DevOps principles, employing containerization (using Docker), continuous integration, and continuous delivery.

Many gateways are also investigating how to integrate Apache’s “big data” and cloud computing software projects, such as Apache Mesos, Apache Spark, Apache Samza, Apache jclouds, and Apache Kafka. Whether to use RPC or message-oriented middleware at scale is also an open question, as is the choice between NoSQL and relational database approaches for gateway data management. Finally, because gateways typically include Web-based user environments, choosing the right technologies and crafting the right user experience are challenging problems.

In this course, students will be divided into development teams, and each team will build a science gateway software-as-a-service system from scratch. Teams will be encouraged to explore alternative technologies and approaches for building science gateways and to learn DevOps principles for deploying robust cloud services. Students will also be introduced to the Apache Software Foundation’s open community governance principles for open source software and will learn how to interact effectively with Apache Software Foundation projects in order to become committers and project management committee members.

Course Objectives

Course Outcomes


The course will be taught by Marlon Pierce and Suresh Marru, who lead the IU Science Gateway Group and are project management committee members for the Apache Airavata open source software.


Students should be familiar with Linux and Unix operating systems, basic networking concepts, one or more programming languages, databases, the basics of Web development, and version control systems.

Course Outline

Week 1 - January 12th


Assignment 1

Week 2 - January 19th


Week 3 - January 26th


Assignment 2.1: Peer Review Project 1 (1 point)

Assignment 2.2: Peer Review Project 2 (1 point)

Project Milestone 1 Due: Application that submits and monitors a job remotely on Karst

Week 4 - February 2nd


Week 5 - February 9th


Project Milestone 2 Due (10 Points)

Week 6 - February 16th


Assignment 4

Week 7 - February 23rd


Week 8 - March 1st


Project Milestone 3

Week 9 - March 8th


Project Milestone 4

Spring Break - March 13th to 18th

Week 10 - March 22nd - Midterm Demos, Presentations

Midterm Presentations

Assignment 5

Week 11 - March 29th


Project Milestone 5

Week 12 - April 5th


Assignment 6

Week 13 - April 12th


Project Milestone 6

Week 14 - April 19th


Assignment 7

Week 15 - April 26th - Final Demos, Presentations


Project Milestone 7

Finals Week - May 2nd - 6th

Student Teams



During the course, the instructors will provide references to journal and conference papers. A good understanding of the concepts discussed in these referenced papers will greatly help in absorbing the course material.

Beginner Materials

Reference Papers

Open Collaboration