Sharing, Reusing, and Repurposing Data

University of California, Los Angeles
From the SelectedWorks of Christine L. Borgman
May 21, 2013
Sharing, Reusing, and Repurposing Data
Christine L. Borgman, University of California, Los Angeles
Available at: http://works.bepress.com/borgman/344/

Sharing, Reusing, and Repurposing Data
Oxford e-Research Centre, 21st May 2013
Christine L. Borgman
Oliver Smithies Visiting Fellow and Lecturer, Balliol College, Oxford
Visiting Fellow, Oxford e-Research Centre
Visiting Fellow, Oxford Internet Institute
Professor and Presidential Chair in Information Studies, University of California, Los Angeles

The Conundrum of Sharing Research Data
If the rewards of the data deluge are to be reaped, then researchers who produce those data must share them, and do so in such a way that the data are interpretable and reusable by others.*
*Borgman, C.L. (2012). The Conundrum of Sharing Research Data. JASIST, 63(6): 1059-1078.

Overview
  Paradigm shift
  Arguments for sharing data
  Science friction, data friction
  Success factors for reusing and repurposing data

New problem solving methods
"Applied computer science is now playing the role that mathematics did from the 17th through the 20th centuries: providing an orderly, formal framework and exploratory apparatus for other sciences." G. Djorgovski
[Figure: timeline of problem-solving methods (Empirical, Theory, Simulation, Data); axis marks: <0, 1700, 1950, 1990]
Slide courtesy Ian Foster, 2009

The long tail of data
[Figure: volume of data plotted against number of researchers]
Slide: The Institute for Empowering Long Tail Research

Data sharing imperatives
  Research Councils of the UK
    Open access publishing requirements
    Provisions for access to data
  Wellcome Trust
    Open access publishing
    Data sharing requirements
  National Science Foundation
    Data sharing requirements
    Data management plans
  U.S. Federal policy, 2013
    Open access to publications
    Open access to data

What are data?
[Images: examples of data, including Marie Curie's notebook]
Image sources: aip.org; hudsonalpha.org; ncl.ucar.edu; http://www.census.gov/population/cen2000/map02.gif; http://onlineqda.hud.ac.uk/intro_qda/examples_of_qualitative_data.php

Pepe, A., Mayernik, M. S., Borgman, C. L. & Van de Sompel, H. (2010). From Artifacts to Aggregations: Modeling Scientific Life Cycles on the Semantic Web. Journal of the American Society for Information Science and Technology, 61(3): 567-582.

Why share research data? Rationales
1. To reproduce or to verify research
2. To make results of publicly funded research available to the public
3. To enable others to ask new questions of extant data
4. To advance the state of research and innovation
Borgman, C.L. (2012). The Conundrum of Sharing Research Data. JASIST, 63(6): 1059-1078.

1. Reproduce or verify research

Scientific Gold Standard
"Replication, the confirmation of results and conclusions from one study obtained independently in another, is considered the scientific gold standard."
Jasny, B. R., Chin, G., Chong, L. & Vignieri, S. (2011). Again, and again, and again. Science, 334(6060): 1225.

Reproducibility?
  Deductive sciences: check the proof
  Experimental sciences: redo the field work
  Computational sciences: start with the dataset, reconstruct the workflow
Victoria Stodden, Columbia
Ioannidis, J. P. A. & Khoury, M. J. (2011). Science, 334: 1230-1232. Published by AAAS.

Why share research data? Rationales
1. To reproduce or to verify research
2. To make results of publicly funded research available to the public
3. To enable others to ask new questions of extant data
4. To advance the state of research and innovation
Borgman, C.L. (2012). The Conundrum of Sharing Research Data. JASIST, 63(6): 1059-1078.
Figure by Jillian C. Wallis, UCLA

2. Public monies serve the public good

3. Others can ask new questions
Data discovery

4. Data curation advances research
[Images: WISE image; WorldWide Telescope]

Science friction, data friction*
  Data are unruly objects
  Data do not stand alone
  Data reuse is a function of distance from origin
  Intractable problems
*Edwards, P. N., Mayernik, M. S., Batcheller, A. L., Bowker, G. C., & Borgman, C. L. (2011). Science Friction: Data, Metadata, and Collaboration. Social Studies of Science, 41, 667-690. doi:10.1177/0306312711413314

Data are unruly objects*
  Poorly bounded
  Malleable, mutable, mobile (Latour)
  Dynamic, evolving
  Signal to noise varies by use
*Wynholds, L. A. (2010). Linking to Scientific Data: Identity Problems of Unruly and Poorly Bounded Digital Objects. Presented at the Digital Curation Conference, 15 June 2011. http://www.ijdc.net/index.php/ijdc/article/view/174

Data do not stand alone
Data are inseparable from:
  Code
  Technical standards
  Documentation
  Instrumentation
  Calibration
  Provenance
  Workflows
  Local practices
  Physical samples

Data reuse is a function of distance from origin
  Reuse by investigator
  Reuse by collaborators
  Reuse by colleagues
  Reuse by unaffiliated others
  Reuse at later times: months, years, decades, centuries

Intractable problems
  Confidentiality
  Anonymization
  Reidentification
  Intellectual property
  Economics

How to share data
  Curated data archive: NASA, UKDA, ICPSR
  Author curated data archive
  University data archive: ORA
  Personal website
  FTP site
  Email on request

Simple Rules for the Care and Feeding of Scientific Data*
1. Good science requires good data
2. Make your science inspectable by others
3. Conduct your science with provenance in mind
4. Do not reduce your data more than necessary
5. Make your data available
6. Make your workflows available
7. Publish all software, even small scripts
8. Foster a data community for your community
9. Describe how you want to be acknowledged
10. Attribute the sources of data that you use
*DRAFT: Radcliffe Seminar on Data Provenance, 9-10 May 2013, A. Goodman & X-L Meng

Conclusions
  Data reuse is part of open science / open scholarship
  Data sharing is a paradigm shift
  Data are not journal articles (yet)
  Data are messy
  Data sharing is a necessary but not sufficient condition for reuse
  Data reuse depends on
    Conditions of sharing
    Conditions of reuse
  Data friction is part of scholarship
  Better practices in managing data will increase the reuse of data

Acknowledgements
National Science Foundation
  CENS: Cooperative Agreement #CCR-0120778, D.L. Estrin, UCLA, PI.
  CENS Education Infrastructure: #ESI-0352572, W.A. Sandoval, PI; C.L. Borgman, Co-PI.
  Towards a Virtual Organization for Data Cyberinfrastructure, #OCI-0750529, C.L. Borgman, UCLA, PI; G. Bowker, Santa Clara University, Co-PI; T. Finholt, University of Michigan, Co-PI.
  Monitoring, Modeling & Memory: Dynamics of Data and Knowledge in Scientific Cyberinfrastructures: #0827322, P.N. Edwards, UM, PI; Co-PIs: C.L. Borgman, UCLA; G. Bowker, SCU; T. Finholt, UM; S. Jackson, UM; D. Ribes, Georgetown; S.L. Star, SCU.
  Data Conservancy: OCI0830976, Sayeed Choudhury, PI, Johns Hopkins University.
  Knowledge and Data Transfer: the Formation of a New Workforce. #1145888. C.L. Borgman, PI; S. Traweek, Co-PI.
Microsoft External Research: Tony Hey, Lee Dirks, Catherine van Ingen, Catherine Marshall
Sloan Foundation: The Transformation of Knowledge, Culture, and Practice in Data-Driven Science: A Knowledge Infrastructures Perspective. #20113194. C.L. Borgman, PI; S. Traweek, Co-PI. Joshua Greenberg, program director.
Project website: http://knowledgeinfrastructures.gseis.ucla.edu/index.html