iRODS-Based Climate Data Services and Virtualization-as-a-Service in the NASA Center for Climate Simulation
Wednesday, September 12, 2012
Building 3 Auditorium - 11:20 AM
(Coffee and cookies at 10:30 AM)
Scientific data services are becoming an important part of the NASA Center for Climate Simulation's mission. Our technological response is built around the concepts of specialized virtual climate data servers, repetitive cloud provisioning, image-based deployment and distribution, and virtualization-as-a-service (VaaS). A virtual climate data server (vCDS) is an Open Archival Information System (OAIS) compliant, iRODS-based data server designed to support a particular type of scientific data collection. iRODS is data grid middleware that provides policy-based control over building, managing, querying, accessing, and preserving collections of large scientific data sets. We have deployed vCDS Version 1.0 in the Amazon EC2 cloud using S3 object storage and are using the system to deliver a subset of NASA's Intergovernmental Panel on Climate Change (IPCC) data products to the latest CentOS federated version of the Earth System Grid Federation (ESGF), which is also running in the Amazon cloud. vCDS-managed objects are exposed to ESGF through FUSE (Filesystem in Userspace), which presents a POSIX-compliant filesystem abstraction to applications, such as the ESGF server, that require such an interface. A vCDS manages data as a distinguished collection for a person, project, lab, or other logical unit. A vCDS can manage a collection across multiple storage resources, using rules and microservices to enforce collection policies. A vCDS can also federate with other vCDSs to manage multiple collections over multiple resources, thereby creating what can be thought of as an ecosystem of managed collections. With the vCDS approach, we are trying to enable full information lifecycle management of scientific data collections and to make tractable the task of providing diverse climate data services. In this presentation, we describe our approach, experiences, lessons learned, and plans for the future.
Dr. John Schnase is a Senior Computer Scientist in NASA Goddard Space Flight Center's Office of Computational and Information Science and Technology. His work focuses on the development of new information technologies and their transfer into practical use. John is currently helping develop scientific data services in the NASA Center for Climate Simulation (NCCS). He is also a Principal Investigator on activities that focus on the use of data grids, cloud computing, and MapReduce analytics in climate research and climate adaptation influences on post-wildfire ecosystem recovery. He is a Fellow of the American Association for the Advancement of Science (AAAS), a former member of the Biodiversity and Ecosystems Panel of the President's Committee of Advisors on Science and Technology (PCAST), and a member of the ABET Computing Accreditation Commission.
IS&T Colloquium Committee Host: John Donohue
Sign language interpreter upon request: 301-286-7040