SLAC, April 14th, 2016
Introductory sessions from the
2016 DOE Data ID Service Workshop
Parallel event held in the Panofsky Auditorium (B053)
These introductory sessions are not part of the OSTI/SLAC Researchers Workshop, but they can help SLAC researchers better understand the role of OSTI and the resources it can provide to them.
8:30 am: Introductions and Logistics
8:45 am: The DOE Data Explorer (What it is, What you can discover) – OSTI
9:15 am: The DOE Data ID Service (What it is, How it Works) - OSTI
9:45 am: Today’s Policy Environment for Scientific Data - TBD
1:00 pm: Data Roundtable - All willing workshop participants
- What types of data do you currently work with (instruments, formats, etc.)?
- How do you currently manage your data?
- Do you have additional data management needs (data storage, federation of data repositories, metadata services, etc.)?
OSTI/SLAC Research Workshop
Exploring the Scientific and Technical Information and Data Needs Of Researchers and DOE
This workshop will explore how lab-based scientists use scientific and technical information (STI), data, and supplemental material in their research workflows. DOE’s Office of Scientific and Technical Information (OSTI) will describe its role and responsibilities in collecting, preserving, and disseminating STI, as well as specific OSTI tools and services. OSTI also seeks a better understanding of researchers’ STI and data needs so that its tools and services (and their STI content) become more useful and better integrated with those needs, while fulfilling DOE’s public access and dissemination mandates.
Access to DOE R&D Results for the Research Community – 10:30am PST, Trinity Conference Room (B053)
Group (15-25 participants) discussion with crosscutting attendance from the scientific disciplines at SLAC.
10:30 – 10:40 AM: Introductions and OSTI Overview Presentation – Brian Hitson, Director of OSTI
10:40 – 11:00 AM: Researchers work through the OSTI Tools HW (Please bring a computer to use during this session)
11:00 AM – 12:00 PM: Discussion about community and OSTI scientific and technical information tools – facilitated by Carly Robinson, OSTI
During discussion, facilitators should aim to answer the following questions:
Homework Related Questions
- The homework questions alluded to a number of existing and potential capabilities for OSTI tools:
  - the ability to associate research areas with DOE funding offices;
  - the ability to track outcomes of particular DOE funding offices over time;
  - the ability to determine DOE’s role in supporting key scientific initiatives or areas of research, including identifying performers, collaborations, and DOE funding offices;
  - the ability to find potential sources of DOE funding for a particular area of research;
  - the ability to comprehensively identify and explore DOE R&D outcomes …
- Are these reasonable? Are there other capabilities of interest to the research community? How well do OSTI tools enable these capabilities?
- Going through the pre-workshop homework questions:
- What tools and capabilities worked well? What could be improved and how?
- Did you have any frustrations? What were they?
- Was the metadata useful for advanced searching and completing the tasks?
- Does SciTech Connect meet researchers’ needs when they use it to answer science-related questions? What specifically could differentiate science-related searches from typical web searches?
- Did you use the advanced searches in any tools?
- Could you find relevant datasets in DOE Data Explorer?
- What is the difference between a data collection and a dataset?
Domain Area and OSTI Tools
- How do you find most of your research articles and other research-related information (e.g., INSPIRE, arXiv.org, Web of Science)?
- What are the key features of these tools that create value for the community? What are typical search terms (key words, authors, titles, etc.)? Do you typically use advanced search options?
- What types of STI do you typically use in your workflow? Publications, technical reports, other researchers’ data, software developed outside of the group?
- Do you text and data mine in the course of your research?
- What is the current level of awareness of OSTI products? How could awareness be raised?
- OSTI aims to be the definitive resource for DOE R&D results. How can OSTI improve access to DOE R&D information to be more useful to the research communities?
- Given OSTI’s unique position as the central clearinghouse for DOE publications, data, software, and patents, what additional services could OSTI provide that would benefit scientific discovery?
Data Management and Software – 2:30pm PST, Trinity Conference Room (B053)
Group (15-25 participants) discussion with attendance from the data science and software community at SLAC.
2:30 – 4:00 PM: Discussion about current data management, supplemental material, and software practices – facilitated by Carly Robinson, OSTI
During the discussion, the facilitator should aim to answer the following questions:
Data Management and Supplemental Information
- How do you manage your data currently?
- What kinds of data services are needed?
  - Now
  - 2 years
  - 5-10 years
- DOI registration and data citation are perhaps the best-understood area of public access to digital data (at least the appropriate questions to ask are understood). How can OSTI and the Lab STI Program improve and extend the reach of the DOE Data ID Service at the Lab? How can we assist researchers in improving the visibility of their data?
- What services are missing in the DOE community related to archival data management? How might OSTI and the Lab STI Program assist in these areas (identification, data collection/storage services, federation of data repositories)?
- What services surrounding supplementary information would be useful from OSTI and the Lab STI Program (e.g., treating figures, charts, and graphs as separate but related research objects, collecting supplemental information with accepted manuscripts)?
- What are typical supplemental information formats in your field (PDF, Excel, CSV, HDF5, etc.)?
- What metadata services do you use, and which do you still need?
Software
- How could we track software versions? Should we assign them DOIs?
- What metadata is relevant in the STI content associated with software?
- Can you snapshot the software and configuration that was used to create STI?
- How do you handle proprietary data and codes?
- What are effective ways to encourage scientists to contribute their software as shared STI?
- Should OSTI maintain a separate software dissemination product or seek to integrate software into an umbrella STI product?
- What is the best way to make software searchable? For that matter, what does it mean "to search" software?
- How can we better link authors, developers, data, and original research publications to software? For example, does everything get tagged with ORCID iDs?
- How can we better track software artifacts that were not directly published, such as documentation or websites?
- Should older versions of software be searchable through OSTI?
- Should OSTI facilitate connections between users of software packages and cloud service providers, and if so, how?
OSTI services
SciTech Connect – to search for technical reports, bibliographic citations, journal articles, conference papers, books, multimedia, and data information sponsored by DOE through a grant, contract, cooperative agreement, or similar type of funding mechanism from the 1940s to today.
http://www.osti.gov/scitech/
DOE PAGES Beta – to search for scholarly publications, including peer-reviewed journal articles and accepted manuscripts, resulting from DOE-funded research. PAGES is DOE’s new tool for implementing OSTP’s public access requirement. PAGES began ingesting DOE-funded accepted manuscripts on October 1st, 2014, so its coverage is still limited. DOE and OSTI continue working to communicate author submission requirements to our researchers.
http://www.osti.gov/pages/
DOE Data Explorer – to search collections of scientific research data and also retrieve individual datasets submitted by data centers, repositories, and other organizations within DOE. DOE Data Explorer is a tool where data owners can make their datasets discoverable and have digital object identifiers (DOIs) assigned to datasets.
http://www.osti.gov/dataexplorer/
Please use the tools listed above to explore the following questions. The questions are designed to deepen your engagement with and understanding of OSTI assets and tools, and to serve as example use cases for potential OSTI customers.
1) Can you find a technical report (a document written by a researcher detailing the results of a project and submitted to DOE) from the last two years in your specific field of study? Are the metadata for the report accurate and complete? Is the information in SciTech Connect sufficient to determine whether this report would be useful for your area of research?
2) Can you find an article that was published within the past year in your specific research field?
3) Can you find a dataset relevant to your field of study? Is there enough information in DOE Data Explorer for you to determine whether the dataset would be useful for your area of research? How would you look for publications and/or technical reports related to these data?
4) Explore a current open science question relevant to your field of research.
5) Consider the DOE office from which you receive the most funding. Are there other DOE offices funding related research?
6) Search for a piece of STI that you would reasonably expect to be hosted by OSTI.
7) What was the most significant result in DOE-funded High Energy Physics in 2012?
8) What are the most significant collaborations between DOE supported researchers and China?