Prepare your metadata and artifacts
===================================
.. admonition:: Protocol reference
:class: note
This page helps you prepare a deposit without going into too much detail.
A :doc:`complete reference of the deposit protocol <../references/protocol>`
is also available and explains all the technical specifications.
A deposit consists of a metadata file and, optionally, one or more software
artefacts.
The metadata file
-----------------
This is the most important part of a deposit process, see
:doc:`../explanations/why-metadata`.
Here's a complete metadata file example for a metadata-only deposit on ``ORIGIN_URL``:
.. code-block:: xml
METADATA_URL
A required software name
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus aliquam tincidunt lacus, ut mollis tellus volutpat a. Mauris ut ornare mauris. Suspendisse elementum lacinia erat, at ornare lorem fringilla vel. Aliquam sagittis dictum cursus. Etiam ut porta libero, ut malesuada augue. In viverra felis justo, a ullamcorper sem consectetur sed. Sed in euismod nunc.
2022-11-17
2023-04-27
GNU Affero General Public License
digital geometry,image processing,geometry processing
https://example.com
c++
Linux, Mac OS X, Windows
GNU Affero General Public License
Hedy Lamarr
email@example.com
1.0.0
This file can be a bit daunting, so let's examine its content in detail.
XML Entry
~~~~~~~~~
Because we use the SWORD v2 standard to handle deposits, the metadata file format
is XML. It uses the following namespaces:
- `atom `_ (required)
- `Software Heritage deposit `_
(required)
- `CodeMeta `_ (recommended)
- `schema `_ (optional)
.. code-block:: xml
SWH deposit's own properties
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This namespace is specific to our implementation of the SWORD v2 protocol. It is used
to describe what *kind* of deposit you are making:
.. tab-set::
.. tab-item:: Initial deposit
This is the first time you're making a code deposit for ``ORIGIN_URL``.
.. code-block:: xml
.. tab-item:: New version deposit
You already made a code deposit for ``ORIGIN_URL`` and you want to send a new
version.
.. code-block:: xml
.. tab-item:: Metadata-only deposit
You don't have a software artefact to send, only metadata related to a ``SWHID`` or
an ``ORIGIN_URL``.
.. code-block:: xml
CodeMeta
~~~~~~~~
We're using `CodeMeta ` terms to describe the metadata.
For example:
.. code-block:: xml
A required software name
ORIGIN_URL
test
Some keywords, separated, by, commas
An optional description.
1.12
stable
ocaml
GNU Affero General Public License
Hedy Lamarr
email@example.com
.. list-table:: Required fields
:header-rows: 1
* - Name
- Description
* - codemeta:name
- The name of this software
* - codemeta:author
- The author(s) of this software
.. list-table:: Recommended fields
:header-rows: 1
* - Name
- Description
* - codemeta:version
- The version of the software, used to differentiate multiple deposits of the same
``ORIGIN_URL`` (see Versioning below)
* - codemeta:description
- Short or long description of the software
* - codemeta:license
- The license(s) of the software
See the `full CodeMeta terms list `_ for a complete
reference of the available properties.
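Before sending a deposit, it can help to check locally that the two required fields
are present. The following is a minimal sketch, not part of the deposit tooling: the
helper name ``missing_required_fields`` is hypothetical, and the CodeMeta namespace URI
is an assumption that should be checked against the protocol reference.

.. code-block:: python

    import xml.etree.ElementTree as ET

    # Namespace URIs: the Atom one is standard, the CodeMeta one is an
    # assumption to verify against the protocol reference.
    NAMESPACES = {
        "atom": "http://www.w3.org/2005/Atom",
        "codemeta": "https://doi.org/10.5063/SCHEMA/CODEMETA-2.0",
    }

    REQUIRED_FIELDS = ("codemeta:name", "codemeta:author")


    def missing_required_fields(xml_text):
        """Return the required CodeMeta fields absent from a metadata document.

        Only direct children of the root element are inspected.
        """
        root = ET.fromstring(xml_text)
        return [
            field
            for field in REQUIRED_FIELDS
            if root.find(field, NAMESPACES) is None
        ]


    if __name__ == "__main__":
        sample = (
            '<entry xmlns="http://www.w3.org/2005/Atom" '
            'xmlns:codemeta="https://doi.org/10.5063/SCHEMA/CODEMETA-2.0">'
            "<codemeta:name>A required software name</codemeta:name>"
            "</entry>"
        )
        print(missing_required_fields(sample))

An empty list means both required fields were found; this does not replace the
validation performed by the server.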
Versioning
~~~~~~~~~~
The ``codemeta:version`` property is used to differentiate multiple deposits of the same
``ORIGIN_URL``. Use cases:
- the software has been updated and you want to make a new deposit of it: increment
the ``codemeta:version`` property (if the property is missing, we will
use a version number reflecting the number of deposits made for this origin)
- a mistake was made in a previous deposit: make a new one using the same
``codemeta:version`` value. The new snapshot will only contain the latest deposit
with this version number
Here is `a snapshot view of an origin`_ listing all distinct versions deposited by HAL
for the origin ``https://hal.archives-ouvertes.fr/hal-04088473``.
.. _a snapshot view of an origin: https://archive.softwareheritage.org/browse/snapshot/f4680770f994ab60a835844168c8b68ee24ac0b8/releases/?origin_url=https://hal.archives-ouvertes.fr/hal-04088473&snapshot=f4680770f994ab60a835844168c8b68ee24ac0b8
Please note that using the same ``codemeta:version`` value for multiple deposits will
not delete the previous one(s) from the archive: they will still be accessible using
their SWHID, but they will not appear in future snapshots.
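The versioning rules above can be illustrated with a small model. This is only a
sketch of the behaviour described on this page, not the server implementation: the
function ``snapshot_versions`` and the deposit identifiers are hypothetical, and the
fallback label for a missing version is assumed to be the deposit's position in the
sequence of deposits for the origin.

.. code-block:: python

    def snapshot_versions(deposits):
        """Map each version label to the deposit a new snapshot would list.

        ``deposits`` is a chronological list of ``(deposit_id, version)``
        pairs for one origin; ``version`` is ``None`` when
        ``codemeta:version`` was omitted.
        """
        latest = {}
        for count, (deposit_id, version) in enumerate(deposits, start=1):
            # Assumption: a missing version falls back to the deposit count.
            label = version if version is not None else str(count)
            # A later deposit with the same label replaces the earlier one.
            latest[label] = deposit_id
        return latest


    if __name__ == "__main__":
        history = [
            ("deposit-a", "1.0.0"),
            ("deposit-b", "1.0.0"),  # fixes a mistake in deposit-a
            ("deposit-c", "1.1.0"),
        ]
        print(snapshot_versions(history))

In this model ``deposit-a`` disappears from the resulting mapping, which mirrors the
fact that a superseded deposit stays in the archive but is no longer listed in new
snapshots.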
Metadata provenance
~~~~~~~~~~~~~~~~~~~
To indicate where the metadata is coming from, deposit clients can use a
```` element in ```` whose content
is the object the metadata is coming from.
When the metadata comes from Wikidata, the provenance should be the page of
a Q-entity. When the metadata comes from a
curated repository like HAL, it should be the HAL project.
For example, to deposit metadata on GNU Hello:
.. code:: xml
https://www.wikidata.org/wiki/Q16988498
Software artefact
-----------------
Now that your metadata file is ready, you need to prepare your code artefact by
packaging the files in a supported archive format:
- ``zip``: common zip archive (no multi-disk zip files).
- ``tar``: tar archive without compression or compressed using ``gzip``, ``bzip2`` or
``lzma``
Our server will reject files larger than 100 MB, so if your artefact is larger than that
you will have to split it into multiple files.
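As an illustration, the packaging step and the size check can be scripted with the
Python standard library. This is a minimal sketch under stated assumptions: the helper
names are hypothetical, and the limit is assumed to mean 100 × 1024 × 1024 bytes, which
may differ from the exact value enforced by the server.

.. code-block:: python

    import os
    import tarfile

    # Assumed reading of the 100 MB limit; the server value may differ.
    MAX_ARCHIVE_SIZE = 100 * 1024 * 1024


    def make_archive(source_dir, archive_path):
        """Pack source_dir into a gzip-compressed tar archive; return its size."""
        top_level = os.path.basename(os.path.normpath(source_dir))
        with tarfile.open(archive_path, "w:gz") as archive:
            # Store files under a single top-level directory name.
            archive.add(source_dir, arcname=top_level)
        return os.path.getsize(archive_path)


    def fits_single_deposit(archive_path, limit=MAX_ARCHIVE_SIZE):
        """Tell whether the archive is small enough to upload as one file."""
        return os.path.getsize(archive_path) <= limit

If ``fits_single_deposit`` returns ``False``, the artefact has to be split into several
archives before it is sent.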
Tools
-----
To use the deposit services you will need to make API calls or use our command line
interface (CLI):
- software used to make API calls: `curl `_,
`httpie `_, etc.
- `swh-deposit `_ CLI: ``pip install swh-deposit``
Next step
---------
You are now ready to make your first deposit!
- If you have a single artefact to upload, follow :doc:`first deposit `
- If your artefacts are too large for a simple deposit, go to
:doc:`make a multi-step deposit `
- If you only have metadata to deposit, head to
:doc:`metadata-only deposit `