swh.web.save_bulk.api_views module
- class swh.web.save_bulk.api_views.OriginsDataCSVParser
- Bases: BaseParser
- media_type = 'text/csv'
 
- swh.web.save_bulk.api_views.api_origin_save_bulk(request: Request) → Response
- POST /api/1/origin/save/bulk/
- Request the saving of multiple software origins into the archive.
- This endpoint lets you request the archival of multiple software origins through a POST request whose body contains a list of origin URLs and their visit types.
- The following visit types are supported: bzr, cvs, hg, git, svn and tarball-directory.
- The origins list data can be provided using the following content types:
- text/csv (default)
  When using CSV format, the first column must contain origin URLs and the second column the visit types.

      "https://git.example.org/user/project","git"
      "https://download.example.org/project/source.tar.gz","tarball-directory"

  To post the content of such a file to the endpoint, you can use the following curl command.

      $ curl -X POST -H "Authorization: Bearer ****" \
             -H "Content-Type: text/csv" \
             --data-binary @/path/to/origins.csv \
             https://archive.softwareheritage.org/api/1/origin/save/bulk/

- application/json
  When using JSON format, the following schema must be used.

      [
        {
          "origin_url": "https://git.example.org/user/project",
          "visit_type": "git"
        },
        {
          "origin_url": "https://download.example.org/project/source.tar.gz",
          "visit_type": "tarball-directory"
        }
      ]

  To post the content of such a file to the endpoint, you can use the following curl command.

      $ curl -X POST -H "Authorization: Bearer ****" \
             -H "Content-Type: application/json" \
             --data-binary @/path/to/origins.json \
             https://archive.softwareheritage.org/api/1/origin/save/bulk/

- application/yaml
  When using YAML format, the following schema must be used.

      - origin_url: https://git.example.org/user/project
        visit_type: git
      - origin_url: https://download.example.org/project/source.tar.gz
        visit_type: tarball-directory

  To post the content of such a file to the endpoint, you can use the following curl command.

      $ curl -X POST -H "Authorization: Bearer ****" \
             -H "Content-Type: application/yaml" \
             --data-binary @/path/to/origins.yaml \
             https://archive.softwareheritage.org/api/1/origin/save/bulk/

- Once received, origins data are checked for correctness by validating URLs and verifying that visit types are supported. A request is rejected if at least one origin is invalid. All origins with an invalid format are reported in the response to the rejected request.
- Warning: This endpoint is not publicly available; using it requires authentication and a special user permission.
- Request Headers:
- Accept – the requested response content type, either application/json (default) or application/yaml
- Content-Type – the content type of posted data, either text/csv (default), application/json or application/yaml
 
- Response Headers:
- Content-Type – depends on the Accept header of the request
 
- Response JSON Object:
- status (string) – either accepted or rejected
- reason (string) – details about why a request got rejected
- request_id (string) – request identifier (only when the request is accepted)
- rejected_origins (array) – list of rejected origins and details about the reasons (only when the request is rejected) 
 
- Status Codes:
- 200 OK – no error 
- 400 Bad Request – provided origins data are not valid 
- 401 Unauthorized – request is not authenticated 
- 403 Forbidden – user does not have permission to query the endpoint 
- 415 Unsupported Media Type – payload format is not supported 
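The request bodies described above can be produced programmatically. The following is a minimal sketch using only the Python standard library; the helper names (`check_visit_types`, `origins_to_csv`, `origins_to_json`) are hypothetical and not part of swh.web. It only prepares the payloads and does a client-side check of visit types. The server still performs its own validation, and sending the request requires a valid bearer token.

```python
import csv
import io
import json

# Visit types listed in the endpoint documentation above.
SUPPORTED_VISIT_TYPES = {"bzr", "cvs", "hg", "git", "svn", "tarball-directory"}


def check_visit_types(origins):
    """Return the (url, visit_type) pairs whose visit type is not supported."""
    return [(url, vt) for url, vt in origins if vt not in SUPPORTED_VISIT_TYPES]


def origins_to_csv(origins):
    """Serialize origins as CSV: URL in the first column, visit type in the second."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(origins)
    return buffer.getvalue()


def origins_to_json(origins):
    """Serialize origins following the JSON schema shown above."""
    payload = [{"origin_url": url, "visit_type": vt} for url, vt in origins]
    return json.dumps(payload, indent=2)


origins = [
    ("https://git.example.org/user/project", "git"),
    ("https://download.example.org/project/source.tar.gz", "tarball-directory"),
]

unsupported = check_visit_types(origins)  # empty when every visit type is supported
csv_body = origins_to_csv(origins)        # body for Content-Type: text/csv
json_body = origins_to_json(origins)      # body for Content-Type: application/json
```

Either body can then be written to a file and posted with the curl commands shown earlier, or sent with any HTTP client together with the matching Content-Type header.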
 
 
 
- swh.web.save_bulk.api_views.api_origin_save_bulk_request_info(request: Request, request_id: UUID)
- GET /api/1/origin/save/bulk/request/(request_id)/
- Get feedback about the loading statuses of origins submitted through a save bulk request.
- This endpoint lets you track the archival statuses of origins submitted through the POST /api/1/origin/save/bulk/ endpoint. Info about submitted origins is returned in a paginated way.
- Note: Only origin visits whose dates are later than the request date are reported by this endpoint.
- Warning: This endpoint is not publicly available; using it requires authentication and a special user permission. Staff users are also allowed to query it.
- Warning: Only the user who created a save bulk request, or a staff user, can get feedback about it.
- Parameters:
- request_id (string) – UUID identifier of a save bulk request 
 
- Query Parameters:
- page (number) – The submitted origins info page number to retrieve 
- per_page (number) – Number of submitted origins info per page, defaults to 1000, maximum is 10000
 
- Response JSON Array of Objects:
- origin_url (string) – URL of submitted origin 
- visit_type (string) – visit type for the origin 
- status (string) – submitted origin status, either pending, accepted or rejected
- last_scheduling_date (date) – ISO8601/RFC3339 representation of the last date (in UTC) when the origin was scheduled for loading into the archive, null if the origin got rejected
- last_visit_date (date) – ISO8601/RFC3339 representation of the last date (in UTC) when the origin was visited by Software Heritage, null if the origin got rejected or was not visited yet
- last_visit_status (string) – last visit status for the origin, either successful or failed, null if the origin got rejected or was not visited yet
- last_snapshot_swhid (string) – last produced snapshot SWHID associated to the visit, null if the origin got rejected or was not visited yet
- rejection_reason (string) – details about why the origin got rejected, if it was
- browse_url (string) – URL to browse the submitted origin if it got accepted and loaded into the archive, null if the origin got rejected or was not visited yet
 
- Request Headers:
- Accept – the requested response content type, either application/json (default) or application/yaml
 
- Response Headers:
- Content-Type – depends on the Accept header of the request
- Link – indicates that a subsequent result page is available and contains the URL pointing to it
 
- Status Codes:
- 200 OK – no error 
- 401 Unauthorized – request is not authenticated 
- 403 Forbidden – user does not have permission to query the endpoint or to get feedback about a request they did not submit
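Once a page of this response has been decoded, the per-origin fields documented above can be aggregated to follow the progress of a bulk request. The sketch below is illustrative only: the helper names and the sample entries are made up, and only a subset of the documented fields is used.

```python
from collections import Counter


def summarize_origins(entries):
    """Count entries per submission status and per last visit status."""
    by_status = Counter(entry["status"] for entry in entries)
    by_visit = Counter(
        entry["last_visit_status"]
        for entry in entries
        if entry.get("last_visit_status") is not None
    )
    return by_status, by_visit


def not_yet_visited(entries):
    """URLs of accepted origins for which no visit has been reported yet."""
    return [
        entry["origin_url"]
        for entry in entries
        if entry["status"] == "accepted" and entry.get("last_visit_date") is None
    ]


# Made-up sample page using the documented field names.
sample_page = [
    {
        "origin_url": "https://git.example.org/user/project",
        "status": "accepted",
        "last_visit_date": "2024-01-01T12:00:00+00:00",
        "last_visit_status": "successful",
    },
    {
        "origin_url": "https://download.example.org/project/source.tar.gz",
        "status": "accepted",
        "last_visit_date": None,
        "last_visit_status": None,
    },
    {
        "origin_url": "https://git.example.org/user/missing",
        "status": "rejected",
        "last_visit_date": None,
        "last_visit_status": None,
    },
]

by_status, by_visit = summarize_origins(sample_page)
waiting = not_yet_visited(sample_page)
```

Rejected entries are excluded from `not_yet_visited` because their visit fields are always null, so they would otherwise look like origins still waiting to be loaded.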
 
 
 
- swh.web.save_bulk.api_views.api_origin_save_bulk_requests(request: Request)
- GET /api/1/origin/save/bulk/requests/
- List previously submitted save bulk requests.
- This endpoint lets you list the save bulk requests submitted by your user account and get their info URLs (see GET /api/1/origin/save/bulk/request/(request_id)/). The list is paginated if the number of requests is large.
- Warning: This endpoint is not publicly available; using it requires authentication and a special user permission.
- Query Parameters:
- page (number) – The submitted requests page number to retrieve 
- per_page (number) – Number of submitted requests per page, defaults to 1000, maximum is 10000
 
- Response JSON Array of Objects:
- request_id (string) – UUID identifier of the request 
- request_date (date) – the date the request was submitted 
- request_info_url (string) – URL to get detailed info about the request 
 
- Request Headers:
- Accept – the requested response content type, either application/json (default) or application/yaml
 
- Response Headers:
- Content-Type – depends on the Accept header of the request
- Link – indicates that a subsequent result page is available and contains the URL pointing to it
 
- Status Codes:
- 200 OK – no error 
- 401 Unauthorized – request is not authenticated 
- 403 Forbidden – user does not have permission to query the endpoint
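Paginated results can be collected by following the Link response header until no further page is advertised. The sketch below assumes the header uses the common `<URL>; rel="next"` form, which the documentation above does not spell out. The `fetch` callable is a hypothetical stand-in for an authenticated HTTP GET returning the decoded items and the Link header value, and the request identifiers are placeholders.

```python
import re

# Assumed Link header shape (RFC 8288 style): <https://...>; rel="next"
_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="?next"?')


def next_page_url(link_header):
    """Extract the URL of the next page from a Link header, or None."""
    if not link_header:
        return None
    match = _NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def iter_paginated(first_url, fetch):
    """Yield items from every page, starting at first_url.

    fetch(url) must return a (items, link_header) tuple.
    """
    url = first_url
    while url:
        items, link_header = fetch(url)
        yield from items
        url = next_page_url(link_header)


# Fake two-page listing standing in for real HTTP responses.
BASE = "https://archive.softwareheritage.org/api/1/origin/save/bulk/requests/"
FAKE_PAGES = {
    BASE + "?page=1&per_page=1": (
        [{"request_id": "hypothetical-id-1"}],
        "<" + BASE + '?page=2&per_page=1>; rel="next"',
    ),
    BASE + "?page=2&per_page=1": ([{"request_id": "hypothetical-id-2"}], None),
}


def fake_fetch(url):
    return FAKE_PAGES[url]


all_requests = list(iter_paginated(BASE + "?page=1&per_page=1", fake_fetch))
```

Passing the fetch function in as a parameter keeps the pagination logic independent of the HTTP client and of how the bearer token is supplied.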