MADATA FAQ

What is MADATA and what is it good for?

MADATA is short for MAnnheim research DATA server. It is a repository where researchers of the University of Mannheim can archive and share their research data in compliance with the FAIR principles. It is suitable for

  • preserving/securing the data of your research results at an external instance,
  • making your data citable and findable for others, since all datasets accepted on MADATA will receive a Digital Object Identifier (DOI),
  • sharing your data with a desired access level (open, mediated, embargoed),
  • adhering to the requirements of research funding bodies like DFG, BMBF, EU etc., since you can archive your data for at least 10 years and publish/share them as needed,
  • supporting reproducibility of research findings and the Open Science idea.
In addition, MADATA is the University of Mannheim's data bibliography, which lists the research data created with the participation of the university's researchers.

What is research data?

Research data can take many forms, some of which may not be obviously research data at first glance. It is any information that is collected, observed, generated or created for the validation or falsification of research findings. Examples include data sets, software, code, methods, standard operating procedures, workflows, models, illustrations, tables, images and videos, interviews, questionnaires and documentation.

Who is in charge of MADATA?

The research data center (Forschungsdatenzentrum, FDZ) of the Mannheim University Library is in charge of MADATA. The research data consultants at the FDZ can support you in archiving your data on MADATA following the FAIR principles and best practices in research data management. Data archiving through bwDataArchive is done in cooperation with UNIT.

Who can archive/share on MADATA?

Researchers of the University of Mannheim who produce research data, i.e. PhD students, postdocs, professors, staff etc. Students can share data from their papers or theses if their supervisor supports this.

Does it cost anything to publish and store with MADATA?

Storage of research data under 1 TB is free. From a total size of 1 TB, research data archiving is charged. 1 TB costs €1,000 for a 10-year archiving period.

Why should I archive/share my research data?

In the light of the FAIR data debate, the sharing and archiving of research data is essential for a number of reasons. It promotes transparency and reproducibility, thereby enhancing the credibility of scientific results. It increases opportunities for collaboration, which promotes interdisciplinary research and accelerates scientific progress. Sharing data allows efficient use of resources by avoiding duplication of effort. Increased visibility and impact of research, as well as compliance with funding and journal requirements, are additional benefits. Ethically, data sharing promotes openness and accountability. Long-term preservation ensures accessibility for future researchers, and shared datasets contribute to education and global collaboration in the scientific community.

What should I archive/share?

This depends on what you are aiming at: the reproducibility of your research findings, or the full reusability of your data in other research contexts as well. In the first case, we recommend uploading:

  • final version of your data and code files, i.e. the dataset that was used to obtain the results
  • a readme file with general information on the project (see below)
  • Data Management Plan(s)
  • codebooks for structured data (link to codebook help page)
  • blank consent form
  • blank questionnaire
In the second case, i.e. if you strive for the full reusability of your data, you should additionally keep:
  • Raw data
  • Code used to pre-process the data
  • Approval by ethics committee
  • Data Management Plan

What should be in a ReadMe file?

A ReadMe file enables users to understand the structure and contents of the dataset (link to readme file guide). It should be provided in a non-proprietary file format (e.g. txt) and contain at least the following information:

  • Title of the dataset
  • Creators of the dataset
  • Year of creation/publication
  • DOI of the dataset
  • License
  • Project description
  • Information about data/code
Comprehensive template for Social Sciences: https://social-science-data-editors.github.io/template_README/
The ReadMe file should be open access to allow MADATA users to assess the usefulness of the data for their research.
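
The minimum fields listed above can be turned into a plain-text skeleton with a few lines of Python. This is only a sketch: the "Field: value" layout, the TODO marker and the file name README.txt are assumptions for illustration, not a format required by MADATA.

```python
from pathlib import Path

# Field names follow the minimum list above; the layout itself is an assumption.
README_FIELDS = [
    "Title of the dataset",
    "Creators of the dataset",
    "Year of creation/publication",
    "DOI of the dataset",
    "License",
    "Project description",
    "Information about data/code",
]


def render_readme(values: dict[str, str]) -> str:
    """Return plain-text ReadMe content; fields without a value are marked TODO."""
    lines = [f"{field}: {values.get(field, 'TODO')}" for field in README_FIELDS]
    return "\n".join(lines) + "\n"


def write_readme(folder: Path, values: dict[str, str]) -> Path:
    """Write README.txt (UTF-8 plain text) into the given folder and return its path."""
    target = folder / "README.txt"
    target.write_text(render_readme(values), encoding="utf-8")
    return target
```

Calling write_readme(Path("."), {"License": "CC BY 4.0"}) would create a README.txt in the current folder with the license filled in and all other fields still marked TODO.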

How can I optimize reproducibility?

Good coding practices and a folder structure that separates by function (e.g. input, output, code and data) are recommended to optimise reproducibility. Examples and recommendations can be found here and here.
Important points include, for example:

  • Code:
    • Document/comment your code so that you and others understand the code
    • Use relative paths in the code
  • Data:
    • Tabular data: create variable-level documentation such as a codebook, variable report or data dictionary
    • Digital objects (images, video, audio, text...): describe your objects using a pre-defined metadata scheme (ideally standard-compliant)
    • Document the data collection process and the tools and methods that were used in this process
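
Two of the points above, relative paths and variable-level documentation, can be sketched in Python as follows. The folder layout (a project folder with code/, data/ and output/ subfolders), the file name survey.csv and the codebook columns are hypothetical choices made for this example only.

```python
import csv
from pathlib import Path


def project_root(script_file: str) -> Path:
    """Project root, assumed here to be the parent of the folder holding the script."""
    return Path(script_file).resolve().parent.parent


def load_rows(csv_path: Path) -> list[dict[str, str]]:
    """Read a CSV file with a header row into a list of dictionaries."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def build_codebook(rows: list[dict[str, str]]) -> list[dict[str, object]]:
    """One entry per variable: name, count of non-empty values, an example value."""
    if not rows:
        return []
    codebook = []
    for name in rows[0].keys():
        values = [row[name] for row in rows if row.get(name) not in (None, "")]
        codebook.append({
            "variable": name,
            "n_non_missing": len(values),
            "example": values[0] if values else "",
            "description": "TODO",  # to be filled in by the researcher
        })
    return codebook


# Usage inside code/analysis.py (hypothetical file names):
# root = project_root(__file__)
# rows = load_rows(root / "data" / "survey.csv")
# codebook = build_codebook(rows)
```

Because every path is derived from the location of the script, the project folder can be moved or downloaded to another machine without editing the code.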

How is data storage guaranteed on MADATA?

To fulfil the requirements of good research practice and comply with the major research funding bodies, we keep the data for at least 10 years. After this time, we might get back to you and ask whether you still want the data to be stored and published. In any case, the metadata will be kept in order to make the whereabouts of the data traceable, in accordance with principle A2 of the FAIR principles ("Metadata should be accessible even when the data is no longer available").

Where will the data be preserved/archived?

MADATA is hosted on servers at the University of Mannheim. In addition, a backup of the contents of MADATA is stored at bwDataArchive, which is housed at a different geographical location at the Karlsruhe Institute of Technology (KIT).

What should I be aware of when uploading data and is versioning possible?

To ensure reproducibility and transparency, please upload only comprehensive datasets. This includes all formal data as well as other materials, such as codebooks or custom software, necessary to open and use the uploaded data.

Versioning is not yet possible. If the dataset has changed or a new version has been created, please add it as a new item to the repository. It is not possible to modify uploaded datasets. Each uploaded dataset is assigned a DOI (Digital Object Identifier) for permanent identification and citation. The stability of the DOI is guaranteed by the Mannheim University Library.

Please do not upload single files that are technically inaccessible or whose content is incomprehensible. The FDZ team will check whether the research data can be opened and will not publish data that cannot be opened.

Is (meta)data on MADATA FAIR?

MADATA helps make (meta)data FAIR (Findable, Accessible, Interoperable and Reusable) via:

  1. Findability: By assigning DOIs to datasets in cooperation with Da|ra and DataCite and including this DOI in the metadata. The metadata is not only registered and searchable via MADATA, but also through the Da|ra and DataCite systems and Google Dataset Search. MADATA as a repository is registered in Re3Data, the registry of research data repositories.
  2. Accessibility: Metadata of datasets on MADATA is openly accessible in any case. We also encourage data depositors to share the data itself as openly as possible. In this case, data is available via direct download. We are well aware that it might not always be possible to share data completely openly due to legal, license or confidentiality regulations. For these cases, we provide different access modes (moderated access through MADATA or through the safe room in the library), which are also compliant with the FAIR principles.
  3. Interoperability: MADATA uses the metadata schema of Da|ra. The metadata can be harvested via an OAI-PMH interface.
  4. Reusability: While the metadata for your dataset will always be open data (CC0 license), you can assign several standard (open) licenses to the data itself within MADATA, such as Creative Commons licenses. Apart from that, we recommend thorough data documentation (e.g. through codebooks, readme files etc.) and non-proprietary file formats in order to improve reusability.
Please contact us for more info on how to FAIRify your research data!
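
Harvesting via OAI-PMH, mentioned under Interoperability, follows a standard request pattern. The following Python sketch builds a ListRecords request and extracts record identifiers from a response. The base URL https://example.org/oai is a placeholder and not the MADATA endpoint, and oai_dc is used only because every OAI-PMH repository must offer it; the metadata prefixes actually available should be checked with the repository.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace defined by the OAI-PMH 2.0 protocol
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def record_identifiers(response_xml: str) -> list[str]:
    """Extract the identifier of every record header in an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    headers = root.findall(".//oai:header/oai:identifier", OAI_NS)
    return [element.text or "" for element in headers]
```

The URL returned by list_records_url can be fetched with any HTTP client, and the response body passed to record_identifiers. Large result sets are split into several responses linked by a resumptionToken, which this sketch does not handle.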

Does MADATA support anonymous data publications?

If you need to share data anonymously without revealing your identity as an author (e.g. in a peer review process), we recommend doing this via an external service, because MADATA would disclose at least your institutional affiliation. One option could be the Open Science Framework (https://help.osf.io/article/201-create-a-view-only-link-for-a-project).

What access and licensing options are available for data in MADATA?

As a member of the University of Mannheim, you are invited to publish your research data and to define the conditions under which this data is published.

For this purpose, the repository offers the use of Creative Commons (CC) licences as well as the possibility to restrict access to selected persons, e.g. members of the University of Mannheim. We would like to point out that the least restrictive licences, such as CC0, are the most conducive to scientific progress.

How can I add MADATA entries to my publications webpage?

The publications plugin for Typo3, which you can use in connection with MADOC, can also be used to display your MADATA entries on your institutional publications webpages.

What can I as a researcher do to make my data FAIR?

MADATA is a tool to make research data compliant with the FAIR principles to a certain extent.