How to deploy a mirror
This section describes how to deploy a mirror using the software stack provided by Software Heritage.
A mirror deployment consists of running several components of the Software Heritage stack:
- An instance of the storage (Software Heritage - Storage); 
- A backend database (PostgreSQL or Cassandra) for the storage; 
- An instance of the object storage (Software Heritage - Object storage); 
- A large storage system (ZFS or cloud storage) as the objstorage backend;
- An instance of the frontend (Software Heritage - Web applications); 
- An instance of the search engine backend (Software Heritage - Search service); 
- An Elasticsearch instance as the swh-search backend;
- The vault service and its support tooling (RabbitMQ, swh-scheduler, Software Heritage - Vault, …); 
- The replayer services:
  - the swh.storage.replay service (part of the Software Heritage - Storage package);
  - the swh.objstorage.replayer.replay service (from the Software Heritage - Object storage replayer package).
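The list above can be restated as a small checklist of components and the backends each one needs. The following Python sketch is purely illustrative: `MIRROR_COMPONENTS` and `missing_backends` are hypothetical names that do not belong to any Software Heritage package, and the labels only repeat what the list states.

```python
# Illustrative checklist: each mirror component and the backends the list
# above associates with it. Keys and labels are hypothetical, not real
# configuration identifiers.
MIRROR_COMPONENTS = {
    "storage": ["PostgreSQL or Cassandra"],
    "objstorage": ["ZFS or cloud storage"],
    "web": [],
    "search": ["Elasticsearch"],
    "vault": ["RabbitMQ", "swh-scheduler"],
}


def missing_backends(available):
    """Map each incomplete component to the backends it still lacks."""
    available = set(available)
    report = {}
    for name, backends in MIRROR_COMPONENTS.items():
        missing = [b for b in backends if b not in available]
        if missing:
            report[name] = missing
    return report
```

Calling `missing_backends` with the backends already provisioned in a test environment returns the components that cannot be started yet, which can help when planning the order of deployment.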
 
Each service is an HTTP-based RPC server, served by a gunicorn WSGI server.
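To make the "HTTP-based RPC behind a WSGI server" pattern concrete, here is a minimal sketch using only the Python standard library. It does not reproduce the actual Software Heritage RPC protocol or endpoints: the `/ping` route, the JSON payload, and the names `app` and `serve` are all assumptions chosen for illustration.

```python
import json
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Toy WSGI application: answers POST /ping with a JSON document."""
    is_ping = (
        environ.get("REQUEST_METHOD") == "POST"
        and environ.get("PATH_INFO") == "/ping"
    )
    if is_ping:
        status, payload = "200 OK", {"result": "pong"}
    else:
        status, payload = "404 Not Found", {"error": "unknown endpoint"}
    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def serve(host="127.0.0.1", port=8000):
    """Run the app with the stdlib development server (not for production)."""
    with make_server(host, port, app) as httpd:
        httpd.serve_forever()
```

A production WSGI server loads the same callable instead of `serve`. Assuming the file is saved as `rpc_sketch.py` (a hypothetical filename) and gunicorn is installed, `gunicorn rpc_sketch:app` would serve it.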
Note
Deploying each Software Heritage service individually is not recommended. Start instead from the example Docker-based deployment project linked below.
Docker-based deployment
This represents a lot of services to configure and orchestrate. To help you get started with configuring a mirror, a Docker Swarm-based deployment solution is provided as a working example of the mirror stack:
It is strongly recommended to start from there in a test environment before planning a production-like deployment.