
5 Assessment Report

In this section, we present an aggregated assessment of the maDMPs submitted to the aforementioned Zenodo community. Since the respective authors can be determined from the content of an maDMP, we want to clarify that this assessment is in no way intended to disparage either the effort that went into creating the documents or the authors themselves. We merely used the files as realistic test data, since they stem from experiments on diverse topics, in order to gauge the utility of the SPARQL queries developed during our project. The tables below summarize our attempts at evaluating the maDMPs.

The “Satisfaction Value” column is numeric, on a scale from zero to five. A value of five means that the respective criterion is completely fulfilled; a value of zero means either that the criterion is “not satisfied” or that the SPARQL queries could not extract the required information. Each maDMP is rated in seven categories, so the maximum sum is 35.

1.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 2 | Sufficient information about DMP. Information about project not included. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 2 | The size of the produced/used data is provided. However, for two out of four distributions, the description is missing. Furthermore, the file formats of the produced data are not specified (in contrast to the reused data). |
| 2 | Documentation and Data Quality | 2 | No information about metadata or versioning provided. Keywords are included for half of the defined datasets. Minimal information about naming conventions included, as well as some statements about quality assurance measures. |
| 3 | Storage and Backup During the Research Process | 2 | maDMP does not have host elements defined, therefore some information is missing (backup type and frequency, availability). Good description of access restrictions. For most datasets, clear indication whether personal/sensitive data is stored provided. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 5 | There is no information about potential preservation considerations. Regarding licenses, the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the missing host definition. Good description of access restrictions and sufficient declaration of ethical considerations. |
| 5 | Data Sharing and Long-Term Preservation | 3 | maDMP does not have host elements defined, therefore a lot of important information is missing (PID system, backup strategies, URLs etc.). There are preservation statements in the original JSON file, but they cannot be queried from the JSON-LD due to the reason explained above. Regarding licenses (license, embargo, openness, sensitivity), the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the missing host definition. |
| 6 | Data Management Responsibilities and Resources | 1 | Contact person is defined, but no contributors and their roles. Costs (resources, equipment, staff expenses etc.) are not specified in the maDMP. |
| | Sum | 17/35 | |

Due to the missing host definition, a lot of information could not be extracted with the queries. There is virtually no documentation of metadata. Information about the data management responsibilities is missing as well. Apart from those aspects, the maDMP is reasonably informative.

2.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 2 | Sufficient information about DMP. Information about project not included. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 4 | There is a clear description for each distribution. The file formats are specified (except for the source code). The size of the data is given as well (except for the source code). |
| 2 | Documentation and Data Quality | 0 | No keywords specified. Information about metadata, data quality assurance and versioning is missing. |
| 3 | Storage and Backup During the Research Process | 0 | maDMP does not have host elements defined, therefore some information is missing (backup type and frequency, availability). No description of security measures. For most datasets, no clear indication whether personal/sensitive data is stored provided. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 2 | There is no information about potential preservation considerations. Regarding licensing, the maDMP contains helpful data, but the SPARQL query is a little bit too strict and fails due to the missing host definition. No description of access restrictions. Sufficient declaration of ethical considerations. |
| 5 | Data Sharing and Long-Term Preservation | 2 | maDMP does not have host elements defined, therefore a lot of important information is missing (PID system, backup strategies, URLs etc.). There are no preservation statements, therefore no information about research uses and preservation details (which data is kept, how to select etc.). Regarding licenses (license, embargo, openness, sensitivity), the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the missing host definition. |
| 6 | Data Management Responsibilities and Resources | 1 | Contact person is defined, but no contributors and their roles. Costs (resources, equipment, staff expenses etc.) are not specified in the maDMP. |
| | Sum | 11/35 | |

A lot of (important) information is missing in this maDMP. Hence, one can conclude based on the assessment with our queries that the maDMP provides insufficient documentation and exhibits many aspects in which it can be improved.

3.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 5 | Extensive information about DMP and project. Funding information is missing, but this is to be expected since the project which was done in the course of this lecture is obviously not funded by anyone. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 4 | There is a clear description for each distribution. The file formats are specified (except for the source code). The size of the data is given as well (except for the source code). |
| 2 | Documentation and Data Quality | 0 | No keywords specified. Information about metadata and versioning is missing. Minimal information regarding data quality assurance provided. |
| 3 | Storage and Backup During the Research Process | 2 | Extensive description of where the data is stored; however, information about backups and security measures is missing. Clear indication whether personal/sensitive data is stored. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 3 | There is no information about potential preservation considerations. Useful information about licensing provided. No description of access restrictions. Sufficient declaration of ethical considerations. |
| 5 | Data Sharing and Long-Term Preservation | 4 | Substantial information about the data hosts (Zenodo); however, the corresponding queries are a little bit too strict and do not return anything. Good description of licensing/usage (sensitive data, embargo, openness). No explicit data preservation statement (missing data: retention period, data destruction, what data is kept). |
| 6 | Data Management Responsibilities and Resources | 4 | Clear information about creator, contributors and their roles. Costs (resources, equipment, staff expenses etc.) are not specified in the maDMP. |
| | Sum | 22/35 | |

The documentation of the metadata is not satisfactory. Furthermore, there is no information regarding security considerations, data preservation and backups. All in all, this maDMP provides sufficient documentation without being of exceptional quality.

4.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 5 | Extensive information about DMP and project. Funding information is missing, but this is to be expected since the project which was done in the course of this lecture is obviously not funded by anyone. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 5 | There is a clear description for each distribution. The file formats are specified as well as the size of the data. |
| 2 | Documentation and Data Quality | 5 | Significant keywords are provided as well as the metadata accompanying the data. Community metadata standards are used. Minimal information about versioning is available. Extensive description of data quality assurance measures. |
| 3 | Storage and Backup During the Research Process | 5 | Extensive description of where the data is stored, the respective backup modalities and access regulations. Clear indication whether personal/sensitive data is stored. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 5 | Extensive documentation of access restrictions, ethical considerations and licensing. Information about data preservation with regard to sensitive or personal data included. |
| 5 | Data Sharing and Long-Term Preservation | 5 | Substantial information about the data hosts (GitHub, Zenodo). Good description of licensing/usage (sensitive data, embargo, openness). There is a preservation statement in the original JSON file, but it cannot be queried from the JSON-LD due to the reason explained above. |
| 6 | Data Management Responsibilities and Resources | 4 | Clear information about creator, contributors and their roles. Costs (resources, equipment, staff expenses etc.) are not specified in the maDMP. |
| | Sum | 34/35 | |

This maDMP could be assessed quite well; based on the results one can argue that this maDMP is excellent. The only missing aspect concerns the costs of the project.

5.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 5 | The SPARQL queries are not able to collect the required data. Nevertheless, the original TTL file contains sufficient documentation of basic information (author etc.) and the project description. Funding information is missing, but this is to be expected since the project which was done in the course of this lecture is obviously not funded by anyone. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 3 | There is a clear description for each distribution; the file formats are defined as well. However, the type of the dataset is not specified and the size of the produced/used data is missing. |
| 2 | Documentation and Data Quality | 0 | Only few keywords specified. Information about metadata and versioning is missing. No information regarding data quality assurance provided. |
| 3 | Storage and Backup During the Research Process | 1 | Minimal description of where the data is stored and the respective backup modalities (which the SPARQL queries fail to detect). No description of security measures. No indication whether personal/sensitive data is stored. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 2 | No mention of data preservation considerations and no description of access control mechanisms. Regarding licenses (license, embargo, openness, sensitivity), the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the incomplete host definitions. Sufficient declaration of ethical considerations. |
| 5 | Data Sharing and Long-Term Preservation | 4 | Almost complete information about the data hosts (GitHub, Zenodo); information about the PID system is missing. There is no preservation statement, therefore no information about research uses and preservation details (which data is kept, how to select etc.). Regarding licenses (license, embargo, openness, sensitivity), the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the incomplete host definitions. |
| 6 | Data Management Responsibilities and Resources | 5 | Extensive information about creator, contributors and their roles. Needed resources are defined. Financial costs are not specified in the maDMP. |
| | Sum | 20/35 | |

The main issue here is the lack of metadata documentation. Aside from that, the maDMP fails to provide information about data storage and preservation, backups and security measures. The other aspects were elaborated sufficiently. In conclusion, this maDMP is of decent quality.

6.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 2 | Sufficient information about DMP. Information about project not included. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 4 | There is a clear description for each distribution. The file formats are specified (except for the source code). The size of the data is given as well (except for the source code). |
| 2 | Documentation and Data Quality | 0 | No keywords specified. Information about metadata and versioning is missing. Minimal information regarding data quality assurance provided. |
| 3 | Storage and Backup During the Research Process | 1 | maDMP does not have host elements defined, therefore some information is missing (backup type and frequency, availability). No information about security measures provided. Clear indication whether personal/sensitive data is stored. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 2 | No mention of data preservation considerations and no description of access control mechanisms. Regarding licenses (license, embargo, openness, sensitivity), the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the incomplete host definitions. Sufficient declaration of ethical considerations. |
| 5 | Data Sharing and Long-Term Preservation | 2 | maDMP does not have host elements defined, therefore a lot of important information is missing (PID system, backup strategies, URLs etc.). There is no preservation statement, therefore no information about research uses and preservation details (which data is kept, how to select etc.). Regarding licenses (license, embargo, openness, sensitivity), the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the missing host definition. |
| 6 | Data Management Responsibilities and Resources | 1 | Contact person is defined, but no contributors and their roles. Costs (resources, equipment, staff expenses etc.) are not specified in the maDMP. |
| | Sum | 12/35 | |

A lot of (important) information is missing here. Hence, one can conclude based on the assessment with our queries that the maDMP provides insufficient documentation and exhibits many aspects in which it can be improved.

7.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 5 | Extensive information about DMP and project. Funding information is missing, but this is to be expected since the project which was done in the course of this lecture is obviously not funded by anyone. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 5 | There is a clear description for each distribution. The file formats are specified (except for the source code). The size of the data is provided as well. |
| 2 | Documentation and Data Quality | 0 | No keywords specified. Specified metadata standard is not a standard. Minimal information regarding data quality assurance provided. |
| 3 | Storage and Backup During the Research Process | 2 | Extensive description of where the data is stored; however, information about backups and security measures is missing. Clear indication whether personal/sensitive data is stored. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 3 | There is no information about potential preservation considerations. Useful information about licensing provided. No description of access restrictions. Sufficient declaration of ethical considerations. |
| 5 | Data Sharing and Long-Term Preservation | 4 | Substantial information about the data host (Zenodo) and licensing/usage (sensitive data, embargo, openness). No explicit data preservation statement (missing data: retention period, data destruction, what data is kept). |
| 6 | Data Management Responsibilities and Resources | 4 | Clear information about creator, contributors and their roles. Costs (resources, equipment, staff expenses etc.) are not specified in the maDMP. |
| | Sum | 23/35 | |

The documentation of the metadata is not quite satisfactory. Furthermore, there is no information regarding security considerations, data preservation and backups. All in all, this maDMP provides sufficient documentation without being of exceptional quality.

8.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 5 | Extensive information about DMP and project. Funding information is missing, but this is to be expected since the project which was done in the course of this lecture is obviously not funded by anyone. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 3 | The file formats are defined. However, some distribution descriptions are missing as well as the size of some data. |
| 2 | Documentation and Data Quality | 2 | Significant keywords are specified. The specified metadata standards are not actual standards. No information regarding data quality assurance provided, except for minimal information about versioning. |
| 3 | Storage and Backup During the Research Process | 3 | Extensive description of where the data is stored; however, information about access restrictions is missing, as well as a description of the backup modalities for some hosts. Clear indication whether personal/sensitive data is stored for most specified datasets. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 3 | There is no information about potential preservation considerations and access restrictions. Extensive documentation regarding licensing. Sufficient declaration of ethical issues. |
| 5 | Data Sharing and Long-Term Preservation | 5 | Extensive information about the data hosts (GitHub, Zenodo) and licensing/usage (sensitive data, embargo, openness). There are preservation statements in the original JSON file, but they cannot be queried from the JSON-LD due to the reason explained above. |
| 6 | Data Management Responsibilities and Resources | 5 | Extensive information about creator, contributors and their roles. Extensive description of needed resources and costs. |
| | Sum | 26/35 | |

There was apparently some confusion regarding the metadata standards. Apart from that, there are only a few small issues. Overall, this maDMP is of mediocre quality.

9.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 4 | Extensive information about DMP. Description of the project is not included. Funding information is missing, but this is to be expected since the project which was done in the course of this lecture is obviously not funded by anyone. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 5 | There is a clear description for each distribution. The file formats are specified as well as the size of the data. |
| 2 | Documentation and Data Quality | 4 | Keywords are provided as well as the metadata accompanying the data. Community metadata standards are used. Minimal information about versioning is available. Description of data quality assurance measures missing. |
| 3 | Storage and Backup During the Research Process | 4 | Extensive description of where the data is stored and access restrictions; however, information about backups is missing. Clear indication whether personal/sensitive data is stored. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 4 | Extensive documentation of access restrictions, ethical considerations and licensing. No information about data preservation with regard to sensitive or personal data included. |
| 5 | Data Sharing and Long-Term Preservation | 4 | There is extensive information about the data hosts (GitHub, Zenodo, The World Bank) and licensing/usage (sensitive data, embargo, openness). No explicit data preservation statement (missing data: retention period, data destruction, what data is kept). No target audiences (foreseeable research uses). |
| 6 | Data Management Responsibilities and Resources | 5 | Clear information about creator, contributors and their roles. Costs (resources, equipment, staff expenses etc.) are also specified in the maDMP. |
| | Sum | 30/35 | |

According to the query results, a few requirements were not completely fulfilled, but this maDMP provides most of the information required by the evaluation rubric, which hints at good quality.

10.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 5 | Extensive information about DMP and project. Funding information is missing, but this is to be expected since the project which was done in the course of this lecture is obviously not funded by anyone. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 4 | The file formats are specified, but not in the IANA media type format. The size of the data is provided. However, the distribution descriptions are missing. |
| 2 | Documentation and Data Quality | 2 | Significant keywords are specified. No information about metadata or versioning provided. Extensive documentation of naming conventions included. |
| 3 | Storage and Backup During the Research Process | 1 | maDMP does not have host elements defined, therefore some information is missing (backup type and frequency, availability). No information about security measures provided. Clear indication whether personal/sensitive data is stored. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 2 | There is no information about potential preservation considerations and access control. Regarding licensing, the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the missing host definition. Sufficient declaration of ethical considerations. |
| 5 | Data Sharing and Long-Term Preservation | 3 | maDMP does not have host elements defined, therefore a lot of important information is missing (PID system, backup strategies, URLs etc.). There is a preservation statement in the original JSON file, but it cannot be queried from the JSON-LD due to the reason explained above. Regarding licenses (license, embargo, openness, sensitivity), the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the missing host definition. |
| 6 | Data Management Responsibilities and Resources | 4 | Creator/contact person is defined, but no contributors and their roles. Costs of storing and backing up the data are also specified in the maDMP. |
| | Sum | 21/35 | |

Due to the missing host definition, a lot of information could not be extracted with the queries. There is virtually no documentation of metadata. Apart from those aspects, the maDMP is reasonably informative.

11.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 2 | Sufficient information about DMP. Information about project not included. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 4 | There is a clear description for each distribution. The file formats are specified (except for the source code). The size of the data is given as well (except for the source code). |
| 2 | Documentation and Data Quality | 0 | No keywords specified. Original JSON file contains documentation_and_metadata element where some information about metadata is provided; this field is, however, not part of the RDA-DMP Common Standard and can therefore not be considered. No information about versioning. Minimal statement regarding data quality assurance. |
| 3 | Storage and Backup During the Research Process | 1 | maDMP does not have host elements defined, therefore some information is missing (backup type and frequency, availability). No information about security measures provided. Clear indication whether personal/sensitive data is stored. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 2 | There is no information about potential preservation considerations and access control. Regarding licensing, the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the missing host definition. Sufficient declaration of ethical considerations. |
| 5 | Data Sharing and Long-Term Preservation | 3 | maDMP does not have host elements defined, therefore a lot of important information is missing (PID system, backup strategies, URLs etc.). There is no preservation statement, therefore no information about research uses and preservation details (which data is kept, how to select etc.). Regarding licenses (license, embargo, openness, sensitivity), the maDMP does contain helpful data. However, the SPARQL query is a little bit too strict and fails due to the missing host definition. |
| 6 | Data Management Responsibilities and Resources | 4 | Clear information about creator, contributors and their roles. Costs (resources, equipment, staff expenses etc.) are not specified in the maDMP. |
| | Sum | 16/35 | |

Because the JSON-LD maDMP was surprisingly short, we inspected the source maDMP manually. This revealed many fields that are not part of the RDA-DMP Common Standard and are therefore not queryable with our approach. From this assessment, one can conclude that there is still room for improvement.
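The manual inspection described above could be partly automated. The following is a minimal sketch, not part of our tooling: it lists keys in an maDMP JSON document that are not among the `dmp` and `dataset` fields of the RDA-DMP Common Standard. The two key sets are transcribed from version 1.1 of the standard and should be checked against the specification; the function name, the sample document and the placement of `documentation_and_metadata` at dataset level are assumptions made for illustration.

```python
import json

# Field names of the `dmp` and `dataset` objects, transcribed from the
# RDA-DMP Common Standard v1.1 (assumption: verify against the specification).
DMP_KEYS = {
    "contact", "contributor", "cost", "created", "dataset", "description",
    "dmp_id", "ethical_issues_description", "ethical_issues_exist",
    "ethical_issues_report", "language", "modified", "project", "title",
}
DATASET_KEYS = {
    "data_quality_assurance", "dataset_id", "description", "distribution",
    "issued", "keyword", "language", "metadata", "personal_data",
    "preservation_statement", "security_and_privacy", "sensitive_data",
    "technical_resource", "title", "type",
}


def non_standard_fields(madmp):
    """Return dotted paths of dmp/dataset keys that are not in the key sets."""
    dmp = madmp.get("dmp", {})
    findings = [f"dmp.{key}" for key in sorted(dmp) if key not in DMP_KEYS]
    for index, dataset in enumerate(dmp.get("dataset", [])):
        findings += [
            f"dmp.dataset[{index}].{key}"
            for key in sorted(dataset)
            if key not in DATASET_KEYS
        ]
    return findings


# Illustrative document, not one of the assessed maDMPs.
example = json.loads("""
{
  "dmp": {
    "title": "Example DMP",
    "dataset": [
      {
        "title": "Raw data",
        "documentation_and_metadata": "README describes all columns"
      }
    ]
  }
}
""")

if __name__ == "__main__":
    print(non_standard_fields(example))
```

Only the two top levels are checked here; nested objects such as distributions or hosts would need their own key sets.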

12.jsonld

| # | Category | Satisfaction Value | Justification |
|---|---|---|---|
| 0 | General Information | 5 | Extensive information about DMP and project. Funding information is missing, but this is to be expected since the project which was done in the course of this lecture is obviously not funded by anyone. |
| 1 | Data Description and Collection or Re-Use of Existing Data | 5 | There is a clear description for each distribution. The file formats are specified as well as the size of the data. |
| 2 | Documentation and Data Quality | 5 | Significant keywords are provided as well as the metadata accompanying the data. Community metadata standards are used. Minimal information about versioning is available. Extensive description of data quality assurance measures and folder structures. |
| 3 | Storage and Backup During the Research Process | 5 | Extensive description of where the data is stored, the respective backup modalities and access regulations. Clear indication whether personal/sensitive data is stored. Data is stored at four locations. |
| 4 | Legal and Ethical Requirements, Code of Conduct | 5 | Extensive documentation regarding licensing. Good description of access restrictions and sufficient declaration of ethical considerations. |
| 5 | Data Sharing and Long-Term Preservation | 4 | Substantial information about the data hosts (GitHub, Zenodo) and licensing/usage (sensitive data, embargo, openness). No explicit data preservation statement (missing data: retention period, data destruction, what data is kept). No target audiences (foreseeable research uses). |
| 6 | Data Management Responsibilities and Resources | 4 | Clear information about creator, contributors and their roles. Costs (resources, equipment, staff expenses etc.) are not specified in the maDMP. |
| | Sum | 33/35 | |

Overall, this maDMP could be assessed reasonably well and turned out to be of excellent quality based on the evaluation with our queries. Missing aspects were mostly due to the maDMP schema; a preservation statement would have provided some more information.

Conclusion

The table below displays the average satisfaction value for each category defined in the rubric as well as the average sum.

| # | Category | Average Satisfaction Value |
|---|---|---|
| 0 | General Information | 3.9 |
| 1 | Data Description and Collection or Re-Use of Existing Data | 4.0 |
| 2 | Documentation and Data Quality | 1.7 |
| 3 | Storage and Backup During the Research Process | 2.3 |
| 4 | Legal and Ethical Requirements, Code of Conduct | 3.2 |
| 5 | Data Sharing and Long-Term Preservation | 3.6 |
| 6 | Data Management Responsibilities and Resources | 3.5 |
| | Sum | 22/35 |
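The averages can be recomputed from the individual tables. The sketch below is one way to do so in Python: the satisfaction values are copied from the twelve tables above, while the helper names and the rounding rule (half-up, to one decimal for categories and to an integer for the sum) are assumptions, since the rounding used for the table is not stated.

```python
from decimal import Decimal, ROUND_HALF_UP

# Satisfaction values per maDMP, copied from the individual tables above
# (one list per file, ordered by category 0..6).
SCORES = {
    "1.jsonld": [2, 2, 2, 2, 5, 3, 1],
    "2.jsonld": [2, 4, 0, 0, 2, 2, 1],
    "3.jsonld": [5, 4, 0, 2, 3, 4, 4],
    "4.jsonld": [5, 5, 5, 5, 5, 5, 4],
    "5.jsonld": [5, 3, 0, 1, 2, 4, 5],
    "6.jsonld": [2, 4, 0, 1, 2, 2, 1],
    "7.jsonld": [5, 5, 0, 2, 3, 4, 4],
    "8.jsonld": [5, 3, 2, 3, 3, 5, 5],
    "9.jsonld": [4, 5, 4, 4, 4, 4, 5],
    "10.jsonld": [5, 4, 2, 1, 2, 3, 4],
    "11.jsonld": [2, 4, 0, 1, 2, 3, 4],
    "12.jsonld": [5, 5, 5, 5, 5, 4, 4],
}


def half_up(value, step):
    """Round a Decimal to the given step (e.g. "0.1") using half-up rounding."""
    return value.quantize(Decimal(step), rounding=ROUND_HALF_UP)


def category_averages(scores):
    """Mean satisfaction value per category, rounded to one decimal."""
    count = Decimal(len(scores))
    columns = zip(*scores.values())  # transpose: one tuple per category
    return [half_up(Decimal(sum(column)) / count, "0.1") for column in columns]


def average_sum(scores):
    """Mean of the per-maDMP sums, rounded to an integer."""
    count = Decimal(len(scores))
    total = sum(sum(row) for row in scores.values())
    return half_up(Decimal(total) / count, "1")


if __name__ == "__main__":
    for category, average in enumerate(category_averages(SCORES)):
        print(category, average)
    print("Sum", f"{average_sum(SCORES)}/35")
```

`Decimal` is used instead of the built-in `round` because the latter rounds ties to even, so an exact tie such as 2.25 would become 2.2 rather than 2.3.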

As the table and the individual evaluations above show, the main issues in the input maDMPs are insufficient documentation of the metadata accompanying the used and produced data and a lack of information about the storage and backup of data (categories 2 and 3). In the other categories, most of the maDMPs provided a decent amount of information, with occasional shortcomings. One aspect worth mentioning is the missing definition of the host element, an issue that appeared in several maDMPs. Furthermore, costs were neglected in most maDMPs: according to the individual evaluations, only 8.jsonld, 9.jsonld and 10.jsonld specify costs, and the required resources are explicitly described only in 5.jsonld and 8.jsonld.

Nevertheless, based on the evaluation with our queries one can argue that all maDMPs are of good (or at least sufficient) quality.

With respect to the quality and usefulness of our queries, as already mentioned in the introduction to this section, some queries could be made more tolerant of optional schema elements that are not defined. This would improve the results when gauging maDMPs. Other than that, the queries proved quite useful in assessing the set of input maDMPs.
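To illustrate what such tolerance could look like, the sketch below contrasts a strict query with a variant that wraps the host pattern in SPARQL's `OPTIONAL`. It does not reproduce our actual queries: the `ex:` vocabulary (`ex:hasLicense`, `ex:hasHost`) is a hypothetical placeholder for the ontology terms, and `license_rows` is an illustrative helper that works with any graph object exposing a `query` method, such as an rdflib `Graph`.

```python
# NOTE: the `ex:` vocabulary is a hypothetical placeholder, not the
# project's actual ontology or queries.
PREFIX = "PREFIX ex: <http://example.org/madmp#>\n"

# Strict variant: the host triple is mandatory, so a distribution without
# a host yields no row at all, and its license information is lost.
STRICT_QUERY = PREFIX + """
SELECT ?distribution ?license ?host WHERE {
  ?distribution ex:hasLicense ?license ;
                ex:hasHost ?host .
}
"""

# Tolerant variant: the host pattern may be absent; the license is still
# returned and ?host is simply left unbound.
TOLERANT_QUERY = PREFIX + """
SELECT ?distribution ?license ?host WHERE {
  ?distribution ex:hasLicense ?license .
  OPTIONAL { ?distribution ex:hasHost ?host . }
}
"""


def license_rows(graph, tolerant=True):
    """Run one of the two queries on an object exposing `query(str)`,
    e.g. an rdflib Graph, and return the result rows as a list."""
    query = TOLERANT_QUERY if tolerant else STRICT_QUERY
    return list(graph.query(query))
```

With the tolerant variant, a reviewer would still see the license of a distribution whose host is not defined, and an unbound `?host` makes the gap itself visible.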

All in all, the SPARQL queries can certainly serve as a starting point for reviewers. However, it is worth noting that the queries are deliberately kept rather general so that they apply to the quite diverse input files. Hence, one might need to adjust them to fit one’s specific domain and requirements.