Science Gateway Architectures

Course Overview

The following video describes the course:

Science gateways are distributed computing environments that enable scientists to conduct computational experiments on computing clouds and supercomputers. They have revolutionized bioinformatics, computational chemistry, nano-engineering, and other scientific fields by bringing unprecedented computing power to a broad community of scientists.

Gateways are interesting topics in their own right. Modern gateway systems are moving toward microservice architectures and DevOps principles, employing containerization (using Docker), continuous integration, and continuous delivery.

Many gateway projects are also investigating how to integrate Apache “big data” and cloud computing software projects such as Apache Mesos, Apache Spark, Apache Samza, Apache jclouds, and Apache Kafka. Whether to use RPC or message-oriented middleware at scale is also an open question, as is the choice between NoSQL and relational database approaches for gateway data management. Finally, because gateways typically include Web-based user environments, choosing the right technologies and crafting the right user experience are challenging problems.

In this course, students will be divided into development teams, and each team will build a science gateway software-as-a-service system from scratch. Teams will be encouraged to explore alternative technologies and approaches to building science gateways and to learn DevOps principles for deploying robust cloud services. Students will also be introduced to the Apache Software Foundation’s open community governance principles for open source software and will learn how to interact effectively with Apache Software Foundation projects in order to become committers and project management committee members.

Course Objectives

Course Outcomes

Instructors

The course will be taught by Marlon Pierce and Suresh Marru, who lead the IU Science Gateway Group and are project management committee members for the Apache Airavata open source software.

Prerequisites

Students should be familiar with Linux and Unix operating systems, basic networking concepts, one or more programming languages, databases, the basics of Web development, and version control systems.

Course Outline

Week 1 - January 12th

Lectures

Assignment 1

Week 2 - January 19th

Lectures

Week 3 - January 26th

Lectures

Assignment 2.1: Peer Review Project 1 (1 point)

Assignment 2.2: Peer Review Project 2 (1 point)

Project Milestone 1 Due: Application that submits and monitors a job remotely on Karst

Week 4 - February 2nd

Lectures

Week 5 - February 9th

Lectures

Project Milestone 2 Due (10 points)

Week 6 - February 16th

Lectures

Assignment 4

Week 7 - February 23rd

Lectures

Week 8 - March 1st

Lectures

Project Milestone 3

Week 9 - March 8th

Lectures

Project Milestone 4

Spring Break - March 13th to 18th

Week 10 - March 22nd - Midterm Demos, Presentations

Midterm Presentations

Assignment 5

Week 11 - March 29th

Lectures

Project Milestone 5

Week 12 - April 5th

Lectures

Assignment 6

Week 13 - April 12th

Lectures

Project Milestone 6

Week 14 - April 19th

Lectures

Assignment 7

Week 15 - April 26th - Final Demos, Presentations

Lectures

Project Milestone 7

Finals Week - May 2nd - 6th

Student Teams

Grading

Resources

During the course, the instructors will provide references to journal and conference papers. A good understanding of the concepts discussed in these referenced papers will greatly help students absorb the course material.

Beginner Materials

Reference Papers

Open Collaboration