Prepare your metadata and artifacts#

Protocol reference

This page helps you prepare a deposit without going into too much detail. A complete reference of the deposit protocol is also available and explains all the technical specifications.

A deposit consists of a metadata file and, optionally, one or more software artefacts.

The metadata file#

The metadata file is the most important part of the deposit process (see Why do we need metadata?).

Here’s a complete metadata file example for a metadata-only deposit on ORIGIN_URL:

<?xml version="1.0" encoding="utf-8"?>

<!-- XML Entry -->
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:codemeta="https://w3id.org/codemeta/3.0"
       xmlns:schema="http://schema.org/"
       xmlns:swh="https://www.softwareheritage.org/schema/2018/deposit">

   <!-- SWH deposit's own properties -->
   <swh:deposit>
      <swh:reference>
         <swh:object swhid="SWHID_CONTEXT"/>
      </swh:reference>

      <!-- Metadata provenance -->
      <swh:metadata-provenance>
         <schema:url>METADATA_URL</schema:url>
      </swh:metadata-provenance>
   </swh:deposit>

   <!-- CodeMeta metadata -->
   <codemeta:name>A required software name</codemeta:name>
   <codemeta:description>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus aliquam tincidunt lacus, ut mollis tellus volutpat a. Mauris ut ornare mauris. Suspendisse elementum lacinia erat, at ornare lorem fringilla vel. Aliquam sagittis dictum cursus. Etiam ut porta libero, ut malesuada augue. In viverra felis justo, a ullamcorper sem consectetur sed. Sed in euismod nunc.</codemeta:description>
   <codemeta:dateCreated>2022-11-17</codemeta:dateCreated>
   <codemeta:datePublished>2023-04-27</codemeta:datePublished>
   <codemeta:license>
      <codemeta:name>GNU Affero General Public License</codemeta:name>
   </codemeta:license>
   <codemeta:keywords>digital geometry,image processing,geometry processing</codemeta:keywords>
   <codemeta:relatedLink>https://example.com</codemeta:relatedLink>
   <codemeta:programmingLanguage>c++</codemeta:programmingLanguage>
   <codemeta:operatingSystem>Linux, Mac OS X, Windows</codemeta:operatingSystem>
   <codemeta:author>
      <codemeta:name>Hedy Lamarr</codemeta:name>
      <codemeta:email>email@example.com</codemeta:email>
   </codemeta:author>

   <!-- Versioning info -->
   <codemeta:version>1.0.0</codemeta:version>
</entry>

This file can look a bit daunting, so let's examine its content in detail.

XML Entry#

Because we use the SWORD v2 standard to handle deposits, the metadata file is written in XML. It uses the following namespaces:

<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:codemeta="https://w3id.org/codemeta/3.0"
       xmlns:schema="http://schema.org/"
       xmlns:swh="https://www.softwareheritage.org/schema/2018/deposit">
   <!-- metadata -->
</entry>
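
Every prefixed element in the file must belong to a namespace declared on the entry element, so a quick well-formedness check before uploading can catch a forgotten xmlns declaration. Here is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name check_well_formed and the two sample documents are ours, for illustration only, and are not part of any Software Heritage tool.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text: str) -> tuple[bool, str]:
    """Return (True, "") if xml_text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # An undeclared prefix is reported by the parser as "unbound prefix".
        return False, str(exc)
    return True, ""


# Illustrative document: both prefixes in use are declared.
GOOD = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:codemeta="https://w3id.org/codemeta/3.0">
   <codemeta:name>A required software name</codemeta:name>
</entry>"""

# Same document, but the codemeta prefix is never declared.
BAD = """<entry xmlns="http://www.w3.org/2005/Atom">
   <codemeta:name>A required software name</codemeta:name>
</entry>"""

if __name__ == "__main__":
    print(check_well_formed(GOOD))
    print(check_well_formed(BAD))
```

This only checks that the file is well-formed XML; it says nothing about whether the server will accept the metadata.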

SWH deposit’s own properties#

This namespace is specific to our implementation of the SWORD v2 protocol. It is used to describe which kind of deposit you are making:

This is the first time you’re making a code deposit for ORIGIN_URL.

<swh:deposit>
   <swh:create_origin>
      <swh:origin url="ORIGIN_URL" />
   </swh:create_origin>
</swh:deposit>

You already made a code deposit for ORIGIN_URL and you want to send a new version.

<swh:deposit>
   <swh:add_to_origin>
      <swh:origin url="ORIGIN_URL" />
   </swh:add_to_origin>
</swh:deposit>

You don’t have a software artefact to send, only metadata related to a SWHID or an ORIGIN_URL.

<swh:deposit>
   <swh:reference>
      <swh:object swhid="SWHID_CONTEXT" />
      <!-- or -->
      <swh:object swhid="SWHID" />
      <!-- or -->
      <swh:origin url="ORIGIN_URL" />
   </swh:reference>
</swh:deposit>
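
If you generate metadata files from a script, the three cases above can be produced by one small helper. The following is a rough sketch with Python's standard xml.etree.ElementTree; the helper names swh and deposit_element, and the example URL, are hypothetical. It assumes that a target starting with "swh:" is a SWHID and that anything else is an origin URL.

```python
import xml.etree.ElementTree as ET

SWH_NS = "https://www.softwareheritage.org/schema/2018/deposit"
# Make serialised output use the "swh" prefix, as in the examples above.
ET.register_namespace("swh", SWH_NS)


def swh(tag: str) -> str:
    """Qualify a tag name with the swh namespace (ElementTree notation)."""
    return f"{{{SWH_NS}}}{tag}"


def deposit_element(kind: str, target: str) -> ET.Element:
    """Build <swh:deposit> for one of the three deposit kinds.

    kind: "create_origin", "add_to_origin" or "reference".
    target: an origin URL, or a SWHID (assumed to start with "swh:").
    """
    if kind not in ("create_origin", "add_to_origin", "reference"):
        raise ValueError(f"unknown deposit kind: {kind}")
    deposit = ET.Element(swh("deposit"))
    wrapper = ET.SubElement(deposit, swh(kind))
    if kind == "reference" and target.startswith("swh:"):
        # Metadata-only deposit pointing at an archived object.
        ET.SubElement(wrapper, swh("object"), {"swhid": target})
    else:
        # Code deposits, and metadata-only deposits on an origin.
        ET.SubElement(wrapper, swh("origin"), {"url": target})
    return deposit


if __name__ == "__main__":
    element = deposit_element("add_to_origin", "https://example.org/my-software")
    ET.indent(element, space="   ")  # requires Python 3.9 or later
    print(ET.tostring(element, encoding="unicode"))
```

The returned element still has to be placed inside the Atom entry next to your CodeMeta terms.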

CodeMeta#

We’re using CodeMeta <https://codemeta.github.io/> terms to describe the metadata. For example:

<codemeta:name>A required software name</codemeta:name>
<codemeta:url>ORIGIN_URL</codemeta:url>
<codemeta:applicationCategory>test</codemeta:applicationCategory>
<codemeta:keywords>Some keywords, separated, by, commas</codemeta:keywords>
<codemeta:description>An optional description.</codemeta:description>
<codemeta:version>1.12</codemeta:version>
<codemeta:developmentStatus>stable</codemeta:developmentStatus>
<codemeta:programmingLanguage>ocaml</codemeta:programmingLanguage>
<codemeta:license>
   <codemeta:name>GNU Affero General Public License</codemeta:name>
</codemeta:license>
<codemeta:author>
   <codemeta:name>Hedy Lamarr</codemeta:name>
   <codemeta:email>email@example.com</codemeta:email>
</codemeta:author>

Required fields#

Name                   Description
codemeta:name          The name of this software
codemeta:author        The author(s) of this software
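
Before sending a file, you may want to verify locally that both required fields are present. Below is a minimal sketch with Python's standard xml.etree.ElementTree; the function name missing_required_fields is hypothetical, and the check assumes the CodeMeta namespace URI used on this page, so a file declaring a different CodeMeta namespace would be reported as incomplete.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI shown in the examples on this page.
CODEMETA_NS = "https://w3id.org/codemeta/3.0"
REQUIRED = ("name", "author")


def missing_required_fields(xml_text: str) -> list[str]:
    """List required codemeta terms absent from the top level of the entry."""
    root = ET.fromstring(xml_text)
    return [
        f"codemeta:{term}"
        for term in REQUIRED
        # Only direct children count: a <codemeta:name> nested inside
        # <codemeta:author> or <codemeta:license> names a person or a
        # license, not the software.
        if root.find(f"{{{CODEMETA_NS}}}{term}") is None
    ]


# Illustrative entry with a name but no author.
SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:codemeta="https://w3id.org/codemeta/3.0">
   <codemeta:name>A required software name</codemeta:name>
</entry>"""

if __name__ == "__main__":
    print(missing_required_fields(SAMPLE))
```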

Recommended fields#

Name                   Description
codemeta:version       The version of the software, used to differentiate multiple deposits of the same ORIGIN_URL, see versioning below
codemeta:description   Short or long description of the software
codemeta:license       The license(s) of the software

See the full CodeMeta terms list for a complete reference of the available properties.

Versioning#

The codemeta:version property is used to differentiate multiple deposits of the same ORIGIN_URL. Use cases:

  • the software has been updated and you want to make a new deposit of it: increment the codemeta:version property (if the property is missing, we will use a version number reflecting the number of deposits made for this origin)

  • a mistake was made in a previous deposit: make a new one using the same codemeta:version value. The new snapshot will only contain the latest deposit with this version number

Here is a snapshot view of an origin listing all distinct versions deposited by HAL for the origin https://hal.archives-ouvertes.fr/hal-04088473

Please note that using the same codemeta:version value for multiple deposits will not delete the previous one(s) from the archive: they will still be accessible using their SWHID, but they will not appear in the future snapshots.
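
To make these rules concrete, here is a toy model in Python of which deposit a new snapshot would list for each version. It is only an illustration of the behaviour described above, not the archive's actual implementation; the function name visible_versions and the deposit labels are hypothetical, and the fallback for a missing version (the running count of deposits, including the current one) is our reading of the rule.

```python
def visible_versions(deposits):
    """Map each version to the deposit a new snapshot would list for it.

    deposits: iterable of (version_or_None, deposit_label), oldest first.
    """
    snapshot = {}
    for count, (version, label) in enumerate(deposits, start=1):
        if version is None:
            # Assumption: the fallback is the running deposit count.
            version = str(count)
        # A later deposit with the same version replaces the earlier one
        # in the snapshot; the earlier one stays archived under its SWHID.
        snapshot[version] = label
    return snapshot


if __name__ == "__main__":
    history = [
        ("1.0.0", "deposit-1"),
        ("1.1.0", "deposit-2"),
        ("1.0.0", "deposit-3"),  # corrects deposit-1
    ]
    print(visible_versions(history))
    # {'1.0.0': 'deposit-3', '1.1.0': 'deposit-2'}
```

In this model deposit-1 no longer appears, which mirrors the note above: it is not deleted, only absent from later snapshots.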

Metadata provenance#

To indicate where the metadata comes from, deposit clients can use a <swh:metadata-provenance> element in <swh:deposit>, whose content is the object the metadata comes from.

For example, when the metadata comes from Wikidata, the provenance should be the page of a Q-entity; when it comes from a curated repository like HAL, it should be the HAL project.

To deposit metadata on GNU Hello, for instance:

<swh:deposit>
   <swh:metadata-provenance>
      <schema:url>https://www.wikidata.org/wiki/Q16988498</schema:url>
   </swh:metadata-provenance>
</swh:deposit>

Software artefact#

Now that your metadata file is ready, you'll need to prepare your code artefact by packaging the files in a supported archive format:

  • zip: common zip archive (no multi-disk zip files).

  • tar: tar archive, either uncompressed or compressed using gzip, bzip2 or lzma.

Our server will reject files larger than 100 MB, so if your artefact is larger than that, you will have to split it into multiple files.
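
Packaging and the size check can be scripted. Here is a minimal sketch using Python's standard tarfile module to build a gzip-compressed tar archive and compare its size with the limit. The function names and the MAX_BYTES constant are hypothetical, and since this page does not say whether "100MB" is decimal or binary, the sketch assumes the stricter decimal reading.

```python
import os
import tarfile

# Assumption: "100MB" is read as 100,000,000 bytes, the stricter reading.
MAX_BYTES = 100 * 1000 * 1000


def make_tar_gz(source_dir: str, archive_path: str) -> int:
    """Pack source_dir into a gzip-compressed tar; return its size in bytes."""
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname stores paths relative to the directory name rather than
        # the full path on the local machine.
        archive.add(source_dir, arcname=top_level)
    return os.path.getsize(archive_path)


def fits_limit(size_bytes: int, limit: int = MAX_BYTES) -> bool:
    """True when a file of size_bytes is not larger than the limit."""
    return size_bytes <= limit


if __name__ == "__main__":
    size = make_tar_gz("my-software", "my-software.tar.gz")
    if not fits_limit(size):
        print(f"{size} bytes: too large, split the artefact into multiple files")
```

The usage lines expect a my-software directory in the current working directory; adapt the paths to your project.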

Tools#

To use the deposit services, you will need to make API calls or use our command line interface (CLI).

Next step#

You are now ready to make your first deposit!