B2STAGE is one of the core services of the EUDAT Data Infrastructure. It offers functionality for the easy transfer of data between EUDAT resources and external computational facilities, including those provided by PRACE or EGI. It allows users either to stage data outside EUDAT or to ingest computational results while maintaining the coherency of the associated PIDs. The basic component of the service extends iRODS, the core technology of the EUDAT infrastructure, to support the GridFTP protocol, the de facto standard for massive, high-performance data transfers. Thanks to this component, users, once authorized, are able to seamlessly access EUDAT storage resources and any plain GridFTP server.

Over the coming months, important enhancements will be developed to improve the user experience with the service and to foster interoperability with other e-Infrastructures. The most relevant of these will be the release of a RESTful HTTP interface compliant with the CDMI specification.  
During the session, the current status of B2STAGE and further developments will be presented, with particular emphasis on the usage experience of EUDAT users (features and limits).
The interaction with the attendees will be important to understand their concrete needs on the following aspects:
To what extent do users need to know the PIDs of ingested data (possibly billions)?
Would users like to describe their PIDs with some metaPID in order, for example, to copy all the corresponding PIDs outside EUDAT while identifying them by the metaPID only (so keeping the command simple)?
Which is the preferred transfer protocol?

Thursday 25th September 2014

by Giuseppe Fiameni, CINECA
B2STAGE is a reliable, efficient, lightweight and easy-to-use service that permits research data to be transferred between EUDAT storage resources and high-performance computing (HPC) workspaces. The service allows community users to easily ingest data sets onto EUDAT storage resources for long-term preservation via a programmatic interface, and to transfer large data collections from EUDAT storage resources to external HPC facilities for processing. The goal of this presentation is to present the characteristics of the service, the developments that have been made to offer a programmatic interface, and the future work towards a common service layer interface.
by Andrea Manzi, CERN

Andrea Manzi is a computer scientist who graduated from the University of Pisa in Italy. After working for 3 years as a research fellow at CNR (the Italian National Research Council) and 1 year in a private company (IONTrading) as a software developer, he started working at CERN in 2009 and currently holds a staff position in the IT Department. He was involved in the EU project iMarine until April 2014, in the roles of deputy technical director, WP leader and developer of solutions for data transfer. He is currently involved in the DPM project (grid storage), where he is the developer of an extension of the storage solution to Apache HDFS, and in the FTS project, where he is developing a Web Frontend for File Transfer. He is also a member of the WLCG collaboration with the role of Middleware Officer.

Collaboration with PRACE: gtransfer - A tool for WAN transfers
by Frank Scheiner
Frank Scheiner is the creator of gtransfer and gsatellite and has been working at HLRS since 2009. Frank is also a free software advocate and started the development of gtransfer in 2010 during the DEISA2 project to support a challenging user demand. Professionally he is mainly interested in enabling high-performance data transfers, especially with GridFTP. He was responsible for the GridFTP infrastructure in the DEISA2 (GridFTP subtask leader) and PRACE (Data management subtask leader) projects and developed software and procedures to simplify the setup and operation of GridFTP services. In his spare time he enjoys science fiction and maintaining his collection of vintage and extraordinary computers.
by Shaun de Witt, STFC
Shaun de Witt is an expert in federated data management within the Science and Technology Facilities Council (STFC) in the UK. He has previously worked on a number of federated and distributed data management projects, including the Worldwide LHC Computing Grid (WLCG) and the NASA Mission To Planet Earth program. He is currently leading the EUDAT work package investigating Scalability and Data Preservation.

Validation of the EUDAT Data Infrastructure and Preservation for the ALEPH experiment

by Marcello Maggi - INFN

The ALEPH experiment was one of the four experiments mounted on the Large Electron Positron (LEP) collider at CERN. It took data from 1989 to 2000, producing high-energy physics results published in NN papers in all the major journals of the field. ALEPH is one of the experiments that share their experience on data management, preservation and infrastructure under the DPHEP umbrella. The presentation will summarize the experience with the B2STAGE and B2FIND services and their integration with the ALEPH workflow to exploit the data stored under EUDAT.


Attachment Size
GiuseppeFiameni.pdf 1.24 MB
AndreaManzi.pdf 8.43 MB
ShaunDeWitt.pdf 3.23 MB