Workshop Description
Open data is essential for fostering transparency, collaboration, and reproducibility in research. This workshop introduces participants to the principles of open data sharing and practical tools for depositing, managing, and sharing datasets on various open-access platforms. Participants will learn how to use platforms such as Zenodo, GitHub, Figshare, and Dryad to make their research data FAIR (Findable, Accessible, Interoperable, and Reusable).
This session is ideal for researchers, educators, and students looking to enhance their data-sharing practices. No prior experience with open data repositories is required, making it accessible to beginners.
Learning Outcomes
By the end of the workshop, participants will be able to:
- Understand the significance of open data in research.
- Identify appropriate open data repositories for different types of research data.
- Deposit datasets on platforms such as Zenodo, GitHub, Figshare, and Dryad.
- Apply FAIR principles to ensure data accessibility and usability.
- Use persistent identifiers (DOIs) to improve data citation and discoverability.
- Implement best practices for metadata and licensing in open data sharing.
Format
This is an interactive, hands-on workshop. Participants will follow along with demonstrations and practice key concepts through guided exercises.
Prerequisites
To ensure a smooth learning experience, participants are encouraged to create accounts on the following platforms before the workshop:
- Zenodo: Register
- GitHub: Sign up
- Figshare: Create an account
- Dryad: Join here
Date and Time
- Date: Monday, March 24, 2025
- Time: 10:00 AM - 11:00 AM ET
- Location: Virtual (Zoom link will be provided upon registration)
Instructor
- Name: Qiusheng Wu
- Affiliation: Department of Geography & Sustainability, University of Tennessee, Knoxville
- Email: qwu18@utk.edu
- Website: https://gishub.org
Who Should Attend?
This workshop is ideal for:
- Researchers looking to improve their open data-sharing workflow.
- Students and faculty interested in reproducible research.
- Anyone new to Zenodo, GitHub, Figshare, and Dryad who wants a hands-on introduction.
No prior experience with open data repositories is required.
Registration
To attend, please complete the registration form at this link. Once registered, you will receive a confirmation email with the Zoom link and preparation instructions.
1. Introduction to Open Data
Why Open Data?
- Reproducibility: Enables verification and reuse of research.
- Collaboration: Facilitates teamwork and interdisciplinary research.
- Transparency: Builds trust and ensures accountability in research.
- Data Citation: Provides credit to researchers for their datasets.
Tools for Open Data Sharing
- Zenodo: A general-purpose open repository developed by CERN.
- GitHub: A hosting platform for Git repositories that supports versioned dataset storage and collaboration.
- Figshare: A repository for datasets, figures, and supplementary materials.
- Dryad: A curated repository for data underlying scientific publications.
2. Hands-on: Depositing Data on Open Repositories
Hosting Data on GitHub
- Create a new repository.
- Add dataset files.
- Write a README.md file with dataset descriptions.
- Use GitHub releases to archive data snapshots.
- Generate a DOI by enabling the Zenodo-GitHub integration, which archives each release on Zenodo.
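When Zenodo archives a GitHub release, it can take its record metadata from a `.zenodo.json` file in the repository root. The following is a minimal sketch of generating that file in Python. The helper names (`build_zenodo_metadata`, `render_zenodo_json`) and all example values are hypothetical, and the field names and license identifier are assumed to match Zenodo's deposit metadata schema, so check them against the Zenodo documentation before relying on them.

```python
import json


def build_zenodo_metadata(title, description, creators, keywords, license_id="cc-by-4.0"):
    """Return a dict shaped like Zenodo deposit metadata (field names are assumptions)."""
    if not title or not creators:
        raise ValueError("title and at least one creator are required")
    return {
        "title": title,
        "description": description,
        "upload_type": "dataset",
        # Each creator is a (name, affiliation) pair, e.g. ("Doe, Jane", "Example University").
        "creators": [{"name": name, "affiliation": aff} for name, aff in creators],
        "keywords": list(keywords),
        # Assumed license identifier; confirm the accepted values with Zenodo.
        "license": license_id,
    }


def render_zenodo_json(metadata):
    """Serialize the metadata as the text content of a .zenodo.json file."""
    return json.dumps(metadata, indent=2) + "\n"
```

Writing the returned text to `.zenodo.json` at the top of the repository and committing it before creating a release should let the integration pick it up.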
Uploading Data to Zenodo
- Log in to Zenodo.
- Click on New Upload.
- Fill in metadata fields (title, description, keywords, funding, etc.).
- Upload dataset files.
- Choose a license (e.g., Creative Commons).
- Publish and obtain a DOI.
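The same steps can also be scripted against Zenodo's REST API. The sketch below only builds the HTTP requests (create a deposition, upload a file, set metadata, publish) and does not send them. It targets the Zenodo sandbox, and the function names and the `token` parameter are illustrative assumptions. The endpoint paths follow the Zenodo deposit API as I understand it, so confirm them against the current API documentation.

```python
import json
import urllib.parse
import urllib.request

# Sandbox instance for practice; the production API lives at https://zenodo.org/api.
API = "https://sandbox.zenodo.org/api"


def _json_request(method, url, token, payload=None):
    """Build an authenticated request; the body is JSON when a payload is given."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def create_deposition_request(token):
    """Step 1: create an empty deposition (the 'New Upload' click)."""
    return _json_request("POST", f"{API}/deposit/depositions", token, {})


def upload_file_request(bucket_url, filename, content, token):
    """Step 2: upload file bytes to the bucket URL returned in step 1."""
    url = f"{bucket_url}/{urllib.parse.quote(filename)}"
    headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/octet-stream"}
    return urllib.request.Request(url, data=content, headers=headers, method="PUT")


def set_metadata_request(deposition_id, metadata, token):
    """Step 3: attach title, description, creators, license, and so on."""
    url = f"{API}/deposit/depositions/{deposition_id}"
    return _json_request("PUT", url, token, {"metadata": metadata})


def publish_request(deposition_id, token):
    """Step 4: publish, after which the record receives its DOI."""
    url = f"{API}/deposit/depositions/{deposition_id}/actions/publish"
    return _json_request("POST", url, token)
```

Each request can be sent with `urllib.request.urlopen(req)`. The response to the first request is expected to carry the deposition `id` and a `links.bucket` URL, which feed the later steps.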
Using Figshare for Data Sharing
- Log in to Figshare and create a new dataset.
- Upload data files and add metadata.
- Assign a DOI and select an appropriate license.
- Publish and share dataset links.
Depositing Data on Dryad
- Submit datasets associated with published research.
- Fill out required metadata and dataset descriptions.
- Comply with journal and funding agency data-sharing mandates.
- Publish with a DOI for citation and tracking.
3. Best Practices for Open Data Sharing
Ensuring Data Quality and FAIR Principles
- Provide rich metadata for better discovery.
- Use open, non-proprietary formats (e.g., CSV, JSON, NetCDF) so the data stay readable without specialized software.
- Apply persistent identifiers (DOIs) to datasets.
- Document datasets with README files and usage instructions.
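One way to act on the metadata and documentation points above is to ship a column-level data dictionary alongside each CSV file. The sketch below is an illustration rather than a prescribed workflow: `data_dictionary` is a hypothetical helper, and the output layout is an assumption, not a standard schema.

```python
import csv
import io


def data_dictionary(csv_text, descriptions=None):
    """Build a simple column-level data dictionary from CSV text.

    `descriptions` maps column names to human-written explanations;
    columns without one are flagged so they are not forgotten.
    """
    descriptions = descriptions or {}
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = []
    for name in reader.fieldnames or []:
        # Ignore empty cells and cells missing from short rows.
        values = [row[name] for row in rows if row[name] not in ("", None)]
        columns.append({
            "name": name,
            "description": descriptions.get(name, "TODO: describe this column"),
            "non_empty_values": len(values),
            "example": values[0] if values else None,
        })
    return {"row_count": len(rows), "columns": columns}
```

The result is a plain dictionary, so it can be saved with `json.dumps` next to the data file or pasted into the README.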
Licensing and Attribution
- Use open licenses such as CC0 or CC-BY.
- Clearly state data ownership and reuse terms.
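To make attribution easy for reusers, a dataset README can include a ready-made citation. The sketch below formats one in a common "Creator (Year). Title. Publisher. DOI" pattern. The function name and the exact punctuation are assumptions, so adapt them to the citation style your repository or journal recommends.

```python
def format_citation(creators, year, title, publisher, doi, version=None):
    """Return a one-line dataset citation ending in a resolvable DOI link."""
    if not creators:
        raise ValueError("at least one creator is required")
    authors = "; ".join(creators)
    parts = [f"{authors} ({year}). {title}."]
    if version:
        parts.append(f"Version {version}.")
    parts.append(f"{publisher}.")
    # DOIs resolve through the doi.org proxy.
    parts.append(f"https://doi.org/{doi}")
    return " ".join(parts)
```

For example, with placeholder values, `format_citation(["Doe, J."], 2025, "Example dataset", "Zenodo", "10.5281/zenodo.0000000")` returns `Doe, J. (2025). Example dataset. Zenodo. https://doi.org/10.5281/zenodo.0000000`.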
Long-term Data Management
- Keep multiple backups in different locations.
- Maintain version control with Git and GitHub for ongoing updates.
- Follow institutional or funder data-sharing policies.
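Backups are only useful if the copies actually match the original. A minimal sketch of one way to check this, assuming SHA-256 checksums are acceptable for your purposes: build a manifest of file hashes for the original directory and for each backup, then compare them. The function names here are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file's path (relative to data_dir) to its SHA-256 checksum."""
    root = Path(data_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(original, backup):
    """Return relative paths that are missing from, or differ in, the backup."""
    return sorted(name for name, digest in original.items() if backup.get(name) != digest)
```

An empty list from `compare_manifests(build_manifest(original_dir), build_manifest(backup_dir))` means every original file is present and identical in the backup.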
More Open Data Repositories
For those interested in exploring additional open data repositories, here are some other popular platforms:
- Dataverse – Open-source repository software for sharing, preserving, and citing research data. (Dataverse Project)
- OSF (Open Science Framework) – A collaborative research management platform with data storage and version control. (OSF)
- Mendeley Data – A free platform for managing and sharing datasets. (Mendeley Data)
- re3data (Registry of Research Data Repositories) – A directory of open data repositories across disciplines. (re3data)
- EUDAT B2SHARE – A European-based service for long-term data storage and sharing. (EUDAT)
- Harvard Dataverse – A widely used Dataverse repository for sharing academic research data. (Harvard Dataverse)
Resources & Further Reading
This hands-on workshop will equip participants with the skills to deposit, manage, and share research data on open-access platforms effectively.