The ILL, with the support of the European Commission, is one of the first institutes in Europe to have developed and implemented a policy for the management of experimental data. In 2012, the ILL began to assign an identification number to the datasets produced after every single experiment. This number is known as the DOI, or Digital Object Identifier, and is a persistent link. For the referencing of their data, it is vital that users adopt the DOI.
The DOI is an identifier allowing data to be traced from their production right through to their publication. By inserting this DOI into all their publications and articles, users guarantee the traceability of all the details about their experiment. This includes: the request for beamtime; the experimental parameters and conditions; the instrumentation used; the data obtained; the analysis of this data; the names of the research team members.
Data DOIs promote experiments to peers and to potential funding bodies as well as to publishers and journals. Some of these already insist on access to this data before they will validate a publication. Inserting the DOI in an article actually speeds up the review process!
It also helps to demonstrate the reliability of the results, giving access to the experimental conditions, and makes it easier to understand the resulting findings.
Also, making data available will allow new research to be carried out on the same topic in the future, because the DOI is a permanent identifier.
The ILL was the first international scientific user facility to publish a “Scientific Data Policy” in November 2011, just before the opening of the December 2011 proposal round. The text came into force in October 2012, and prescribed a default non-disclosure period of three years during which access to data is restricted to the experimental team; in cases where no request for data has been made, this period would be extended to five years.
Following the publication of the policy, the ILL created an interdisciplinary working group, the DPP (Data Protection and Processing). One of its first missions was to drive the development of the software tools needed to put the policy into practice, with a focus on usability (especially during experiments) and security. A new data portal (data.ill.eu) that implemented the changes was launched in 2014. It enables visitors to search all textual metadata related to an experiment and quickly retrieve all data of interest related to search criteria, while also implementing access restriction for data not yet public.
When inserted in publications, ILL's data DOIs allow readers to obtain more information about the referenced experiments, access the ILL Data Portal and even request access to the experimental team if the data are not yet publicly available.
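Because a DOI is a persistent link, a reader can turn the identifier printed in an article into a web address through the public resolver at doi.org. The following is a minimal sketch of that step. The function name `doi_to_url` and the DOI used in the usage note are made up for illustration and are not real ILL identifiers.

```python
from urllib.parse import quote

# Public DOI resolver; it redirects to the landing page registered for the DOI.
DOI_RESOLVER = "https://doi.org/"

# Prefixes that authors commonly write in front of the bare identifier.
_KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI given in bare, 'doi:' or URL form.

    Hypothetical helper for illustration; it is not part of any ILL tool.
    """
    cleaned = doi.strip()
    lowered = cleaned.lower()
    for prefix in _KNOWN_PREFIXES:
        if lowered.startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    # Every DOI starts with "10." and separates prefix and suffix with "/".
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the "/" separator readable.
    return DOI_RESOLVER + quote(cleaned, safe="/")
```

For example, `doi_to_url("doi:10.1234/EXAMPLE-DATA.1-23-456")` returns `https://doi.org/10.1234/EXAMPLE-DATA.1-23-456`; the identifier here is a placeholder.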
The DPP continuously upgrades the data policy so as to reflect the evolution of the Data Management tools available at the ILL. The latest version of the ILL Data Policy was adopted in July 2017.
Central facilities for neutron scattering and synchrotron X-rays in Europe keep working together to develop and share infrastructure for the data they collect. Such co-operation will make it easier and more efficient for users to access and process their data, and provide more secure means of storage and retrieval. It will also increase the scientific value of the data by opening it up to a wider community for further analysis and fostering new collaborations between scientific groups.
In order to control access to the experimental data obtained at the ILL in a coherent and secure fashion, the ILL has recently developed a single portal for consulting, downloading and managing your data.
Here “data” is understood to mean raw data (i.e. numor files), processed data, and meta-data (e.g. log files or “logs”).
This webportal offers:
- Global text search on all documents related to ILL experiments (proposals, data files, reports …).
- Advanced search allowing filtering by (co-)proposer, instrument, numor, cycle, dates of experiment …
- Presentation of the list of experiments matching the search criteria (note that a single proposal could involve >1 instruments and thus >1 experiments).
- Data and information relating to these experiments can then be accessed or downloaded, depending on your access authorisation. As an ILL user you can access: data obtained from your own experiments, data from an experiment whose main proposer has specifically granted you access, or data which has been made public.
- For each experiment’s proposal a set of tabs points to data files and other information (link to the DOI, the proposal text, the logs …).
- Provided that you are the principal investigator (PI) (normally the main proposer), the “Members” tab allows you to grant another person access to the data, or even to make the data public.
- Once you have access authorization to a given proposal, the “Data folders” tab allows you to access the content of various sub-folders (pertaining to a particular experiment) containing either raw data or processed data or log files.
- Alternatively the “Data Ranges” tab allows an authorized user to select and download specific ranges of raw data files corresponding to e.g. different samples and/or temperatures.
- Be aware that a notification identifying the downloader will be sent to the PI of the proposal whose data have been downloaded.
- Also be aware that the step of making data public is understandably irreversible.
Note that other data access tools coexist with this portal:
- explorer.ill.eu (deprecated, will be phased out in 2017) is a browser-based tool for downloading files of any kind via your ILL computer account (ILL data, personal files …).
- IDA allows downloading of raw data files that predate the ILL Data Policy (i.e. before cycle 123 of Autumn 2012).
data.ill.eu, the ILL web portal, is the recommended solution for accessing, managing and downloading your experimental data. For large volumes, however, the HTTP protocol does not provide sufficient mechanisms to ensure smooth and reliable transfers, so you can opt for a solution dedicated to large data transfers.
Publications and DOI: If you publish results based on ILL data, whether your own data, data to which you were granted access, or data that were made public, the ILL expects you to cite the DOI using the specified format.
This service is intended for ILL users who would like to download large volumes of experimental data.
To use this service you need SFTP client software on your local computer. We recommend OpenSSH (the standard command-line client on Unix-based systems) or FileZilla (Windows, Linux, Mac OS X) for those who prefer a graphical environment. Any SFTP client should work out of the box, but make sure that it can resume failed transfers.
This service is hosted by dt.ill.fr (standard SSH port, i.e. TCP 22), and you need to authenticate using your ILL account. Once connected, you will find the usual "MyData" folder containing the proposal data folders organised by Year or Instrument.
> sftp your_ill_login@dt.ill.fr
sftp> cd MyData/byProposal/
exp_8-05-XXXX_in13 exp_8-05-XXXX_in5 exp_9-13-XXXX_figaro exp_TEST-XXX_d1b
sftp> cd exp_TEST-XXX_d1b
histo logfiles processed rawdata
sftp> get -ra rawdata
Fetching /net4/serdon/illdata/141/d1b/exp_TEST-2368/rawdata/ to c:Temp
/net4/serdon/illdata/141/d1b/exp_TEST-23999/rawdata/222998.nxs 100% 14KB 13.9KB/s 00:00
/net4/serdon/illdata/141/d1b/exp_TEST-23999/rawdata/222999.nxs 100% 14KB 13.9KB/s 00:00
The '-a' option is important because it allows failed transfers to be resumed.
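Resuming means that a partially downloaded file is not fetched again from the start: the client looks at how many bytes already exist locally and continues from that offset. This assumes that the partial local copy matches the beginning of the remote file. The sketch below illustrates the idea with two local files only. It is not how any SFTP client is implemented, and the function name `resume_copy` is hypothetical.

```python
import os


def resume_copy(src_path: str, dst_path: str, chunk_size: int = 64 * 1024) -> int:
    """Append to dst_path the bytes of src_path that it is still missing.

    Illustration only: assumes the existing content of dst_path is an exact
    prefix of src_path. Returns the number of bytes copied in this call.
    """
    already = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    total = os.path.getsize(src_path)
    if already > total:
        raise ValueError("destination is larger than source; refusing to resume")

    copied = 0
    # "ab" keeps the bytes already present and appends the missing ones.
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(already)  # skip the part that was transferred earlier
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

Calling `resume_copy` a second time on a complete file copies nothing, which is why an interrupted recursive download can simply be restarted with the same command.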
In case of difficulties, please contact data(at)ill.eu
Note: This service is also useful if you want to upload reduced or analysed data into the "processed" folder of your experiment.
Rsync, GridFTP, ...
Transferring very large data volumes (multi-TB datasets) over the internet can be tedious even in 2017, so please do not hesitate to contact us in case of difficulties with the standard services provided. More specific solutions can be offered on demand.