Within the ARP ecosystem, the ARP Data Repository assigns a permanent, unique Handle identifier to every dataset and file by default. This identifier is available in the header of the datasets and files.
In addition to the Handle identifier, the creator or editor of any dataset may also request a DOI-type identifier for their dataset. The DOI must be requested through ARP Support, which processes and fulfills all requests.
- A Brief Overview of Persistent Identifiers
Persistent identifiers (PIDs) are used for the long-term, global, and unambiguous identification of entities (datasets, files, research data, software, publications, individuals, institutions, funders, grants, etc.) and for linking them to one another. PIDs are typically generated as codes consisting of numbers and letters, accompanied by a link.
An important function of persistent identifiers is to consistently ensure access to specific information about a given entity, even if that entity ceases to exist at its original storage location.
The most widely used permanent unique identifiers for digital objects (data, datasets, publications) are Handle and DOI identifiers:
- Handle: https://www.handle.net/index.html (glossary entry: https://researchdata.hu/fogalomtar#h)
- DOI: https://www.doi.org/ (glossary entry: https://researchdata.hu/fogalomtar#d)
The Handle/DOI enables the unique identification of a dataset and ensures its long-term accessibility, findability, and reusability.
- Persistent Identifiers in the ARP Data Repository
By default, the ARP system assigns a Handle-type unique, persistent identifier to every dataset and file.
When a new dataset is created, the system immediately and automatically generates a new, unique Handle identifier for it, which is accessible in the dataset's header. Whenever a new file is placed within the dataset, the system also immediately and automatically assigns a new, unique Handle identifier to it. A file’s identifier is the dataset’s Handle identifier followed by an additional character string.
In the ARP Data Repository, all datasets have a uniform format for their Handle identifier: 21.15109/ARP/XXXXXX – where:
- The first string is fixed and represents the permanent identifier of the Handle system: 21.15109/
- The middle string is also fixed and represents the permanent identifier of the ARP Data Repository: ARP/
- The last string is variable and represents the dataset’s unique identifier, consisting of letters and numbers: XXXXXX
Example of a Handle in the ARP Data Repository:
- Dataset Handle identifier: 21.15109/ARP/TY567A
- File Handle identifier: 21.15109/ARP/TY567A/SII9XK
Every Handle identifier can be resolved using the https://hdl.handle.net/ URL prefix.
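The Handle format described above can be checked and turned into a resolvable link programmatically. The following is a minimal sketch, not an official ARP tool: the names `ArpHandle`, `parse_handle`, and `handle_url` are hypothetical, and the pattern assumes only that the variable parts consist of ASCII letters and digits (the length is not enforced, since the text specifies it only by example).

```python
import re
from typing import NamedTuple, Optional

HANDLE_RESOLVER = "https://hdl.handle.net/"

# Fixed prefix "21.15109/ARP/", then the dataset string, then an optional
# extra string for a file. Assumption: both strings are ASCII letters/digits.
_HANDLE_RE = re.compile(
    r"21\.15109/ARP/(?P<dataset>[A-Za-z0-9]+)(?:/(?P<file>[A-Za-z0-9]+))?"
)


class ArpHandle(NamedTuple):
    dataset_id: str
    file_id: Optional[str]  # None for a dataset-level Handle


def parse_handle(handle: str) -> ArpHandle:
    """Split an ARP Handle into its dataset part and optional file part."""
    match = _HANDLE_RE.fullmatch(handle.strip())
    if match is None:
        raise ValueError(f"not an ARP Handle identifier: {handle!r}")
    return ArpHandle(match.group("dataset"), match.group("file"))


def handle_url(handle: str) -> str:
    """Return the resolver link for a valid ARP Handle."""
    parse_handle(handle)  # raises ValueError if the format is wrong
    return HANDLE_RESOLVER + handle.strip()


if __name__ == "__main__":
    print(parse_handle("21.15109/ARP/TY567A/SII9XK"))
    print(handle_url("21.15109/ARP/TY567A"))
```

Applied to the example identifiers above, `parse_handle` separates the dataset string `TY567A` from the file string `SII9XK`, and `handle_url` prepends the resolver prefix.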
- DOI Identifiers Available in the ARP Data Repository
The creator or editor of any dataset may request a DOI identifier to identify their dataset.
While the system automatically assigns a Handle identifier, DOI identifiers must be requested through ARP Support. ARP Support processes and fulfills all requests.
In the ARP Data Repository, the standard format for the requested DOI identifier for each dataset is: 10.5158/ARP/XXXXXX – where:
- The first string is fixed and represents the DOI system’s permanent identifier: 10.5158/
- The middle string is also fixed and represents the ARP Data Repository’s permanent identifier: ARP/
- The last string is variable and represents the dataset’s unique identifier, consisting of letters and numbers: XXXXXX
The automatically assigned Handle and the DOI identifier that can be requested are character strings that can be mapped to one another. The DOI identifier can be easily derived from the Handle identifier: after the fixed initial (10.5158/) and fixed middle (ARP/) character strings of the DOI, the unique final character string of the Handle must be appended. Example:
- Handle identifier: 21.15109/ARP/ABC123
- DOI identifier: 10.5158/ARP/ABC123
For the dataset Handle used in the earlier example:
- Handle identifier: 21.15109/ARP/TY567A
- DOI identifier: 10.5158/ARP/TY567A
Every DOI identifier can be resolved using the URL prefix https://doi.org/.
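The Handle-to-DOI mapping described above is a simple prefix swap, which can be sketched in a few lines. This is an illustrative helper under stated assumptions, not part of the ARP system: the function names `doi_from_handle` and `doi_url` are hypothetical, and rejecting file-level Handles is an assumption based on the text describing DOIs for datasets only.

```python
HANDLE_PREFIX = "21.15109/ARP/"
DOI_PREFIX = "10.5158/ARP/"
DOI_RESOLVER = "https://doi.org/"


def doi_from_handle(dataset_handle: str) -> str:
    """Derive the DOI string that corresponds to a dataset Handle."""
    handle = dataset_handle.strip()
    if not handle.startswith(HANDLE_PREFIX):
        raise ValueError(f"not an ARP Handle identifier: {dataset_handle!r}")
    suffix = handle[len(HANDLE_PREFIX):]
    # Assumption: only dataset-level Handles map to a DOI, so a Handle
    # carrying an extra "/<file string>" part is rejected here.
    if not suffix or "/" in suffix:
        raise ValueError(f"expected a dataset Handle: {dataset_handle!r}")
    return DOI_PREFIX + suffix


def doi_url(doi: str) -> str:
    """Return the resolver link for a DOI string."""
    return DOI_RESOLVER + doi


if __name__ == "__main__":
    doi = doi_from_handle("21.15109/ARP/TY567A")
    print(doi)
    print(doi_url(doi))
```

Note that a derived DOI only resolves once ARP Support has activated it; the helper merely produces the string to request.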
- Practical Steps for Requesting a DOI Identifier
- Opening an existing dataset or creating a new one (at least in draft form) – the responsibility of the dataset creator/editor
- Assigning a DOI identifier based on the Handle identifier associated with the dataset – the responsibility of the dataset creator/editor
  - For details, see: “Requesting a DOI in the ARP Data Repository”
- Including the DOI identifier in the dataset’s metadata – the responsibility of the dataset creator/editor
  - Location: Edit Dataset / Metadata
    - Other Identifier field:
      - Agency: DOI
      - Identifier: the specified identifier
      - Save Changes
    - For draft datasets, after completing the edits: Publish Dataset (this may occur at a different time than the creation of the dataset)
    - For published datasets: republish
    - The DOI identifier will be visible and accessible in the dataset on the Metadata tab
- Requesting activation of the DOI identifier through ARP Support (by clicking the "Support" menu item or by sending an email to @email) – the responsibility of the dataset creator/editor
  - The request must include:
    - The URL or Handle link
    - The requested DOI
- Activating the DOI – the responsibility of ARP Support
- Sending a notification regarding the activated DOI – the responsibility of ARP Support
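The two items that an activation request must include (the Handle link and the requested DOI) can be assembled and cross-checked before contacting ARP Support. The sketch below is purely illustrative: `build_doi_request` is a hypothetical helper, the message wording is invented for the example, and ARP Support does not prescribe any particular format.

```python
HANDLE_RESOLVER = "https://hdl.handle.net/"
HANDLE_PREFIX = "21.15109/ARP/"
DOI_PREFIX = "10.5158/ARP/"


def build_doi_request(dataset_handle: str, requested_doi: str) -> str:
    """Compose the text of a DOI activation request (hypothetical wording)."""
    handle = dataset_handle.strip()
    doi = requested_doi.strip()
    if not handle.startswith(HANDLE_PREFIX):
        raise ValueError(f"not an ARP Handle identifier: {dataset_handle!r}")
    if not doi.startswith(DOI_PREFIX):
        raise ValueError(f"not an ARP DOI identifier: {requested_doi!r}")
    # The requested DOI must end with the same unique string as the Handle.
    if handle[len(HANDLE_PREFIX):] != doi[len(DOI_PREFIX):]:
        raise ValueError("requested DOI does not match the dataset Handle")
    return "\n".join(
        [
            "DOI activation request",
            f"Handle link: {HANDLE_RESOLVER}{handle}",
            f"Requested DOI: {doi}",
        ]
    )


if __name__ == "__main__":
    print(build_doi_request("21.15109/ARP/TY567A", "10.5158/ARP/TY567A"))
```

Checking that the unique final strings match catches the most likely mistake, a DOI typed for a different dataset, before the request is sent.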
Important Note 1: You can only request a DOI identifier for a dataset that has already been created (either published or as a draft). While creating a new dataset, neither the required Handle identifier nor the metadata field needed to enter the DOI identifier is available yet. To access these, you must save the newly created dataset, thereby creating a draft dataset.
Important Note 2: After submitting a request through ARP Support, the dataset creator/editor and an ARP Support staff member will coordinate the timing and implementation of the DOI activation. The DOI identifier can only be activated after the dataset has been made public. In the case of a previously published dataset, the dataset must be republished after the DOI identifier has been entered.