swh.lister.rpm.lister module
- swh.lister.rpm.lister.RPMPageType
- Each page is a list of packages for a given (release, component) pair from a Red Hat based distribution. 
- class swh.lister.rpm.lister.RPMSourceData
- Bases: TypedDict
- Dictionary holding relevant data for listing RPM source packages.
- See the content of the lister config directory for examples of RPM source data for well-known Red Hat based distributions.
- index_url_templates: List[str]
- List of URL templates used to discover source package metadata. The following variables can be substituted in them: base_url, release and edition; see string.Template for more details about the format. Each generated URL must target a directory containing a sub-directory named repodata, which holds the package metadata, in order to be successfully processed by the lister.
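As a minimal sketch of how such a template expands, the snippet below uses string.Template from the Python standard library. The template string and the base_url, release and edition values are made-up placeholders chosen for illustration; they are not taken from the lister's configuration.

```python
from string import Template

# Hypothetical template and values, chosen only for illustration.
index_url_template = Template("$base_url/releases/$release/$edition/source/tree/")

index_url = index_url_template.substitute(
    base_url="https://example.org/pub/distro",
    release="40",
    edition="Everything",
)

# The lister expects a "repodata" sub-directory under the generated URL.
repodata_url = index_url + "repodata/"
print(repodata_url)
```

Template.substitute raises KeyError when a variable used in the template is not supplied, so a missing value is caught when the URL is built rather than later as a malformed URL.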
 
- class swh.lister.rpm.lister.RPMListerState(package_versions: Dict[str, Set[str]] = <factory>)
- Bases: object
- State of the RPM lister.
- class swh.lister.rpm.lister.RPMLister(scheduler: SchedulerInterface, url: str, instance: str, rpm_src_data: List[RPMSourceData], incremental: bool = False, max_origins_per_page: int | None = None, max_pages: int | None = None, enable_origins: bool = True, credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None = None)
- Bases: Lister[RPMListerState, Tuple[str, str, Repo] | None]
- List source packages for a Red Hat based Linux distribution.
- The lister creates a snapshot for each package from all its available versions.
- In incremental mode, only packages whose snapshot has changed since the last listing operation are sent to the scheduler, which then creates loading tasks to archive the newly found source code.
- Parameters:
- scheduler – instance of SchedulerInterface 
- url – Red Hat based distribution info URL 
- instance – name of Red Hat based distribution 
- rpm_src_data – list of dictionaries holding data required to list RPM source packages, see examples in the config directory. 
- incremental – if True, only packages with new versions are sent to the scheduler when relisting
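The incremental selection can be pictured with the simplified sketch below. It assumes the comparison is made on per-package version sets, mirroring the package_versions field of RPMListerState, whereas the lister itself is described as comparing snapshots; the function name packages_to_send and the sample data are hypothetical.

```python
from typing import Dict, List, Set


def packages_to_send(
    previous: Dict[str, Set[str]],
    current: Dict[str, Set[str]],
) -> List[str]:
    """Return the names of packages whose version set differs from the previous listing."""
    return sorted(
        name
        for name, versions in current.items()
        # A package unknown to the previous listing yields None here,
        # which never equals a set, so new packages are always selected.
        if previous.get(name) != versions
    )


# Hypothetical listings: "bash" gained a version and "curl" is new.
previous = {"bash": {"5.2-1"}, "zlib": {"1.3-1"}}
current = {"bash": {"5.2-1", "5.2-2"}, "zlib": {"1.3-1"}, "curl": {"8.6-1"}}

changed = packages_to_send(previous, current)
print(changed)
```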
 
 - state_from_dict(d: Dict[str, Any]) → RPMListerState
- Convert the state stored in the scheduler backend (as a dict) to the concrete StateType for this lister.
 - state_to_dict(state: RPMListerState) → Dict[str, Any]
- Convert the StateType for this lister to its serialization as a dict for storage in the scheduler.
- Values must be JSON-compatible, as that is what the backend database expects.
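The JSON-compatibility requirement matters here because Python sets cannot be serialized to JSON directly. The sketch below shows one way such a round trip could look; SketchState and the two helper functions are hypothetical stand-ins written for this example, not the lister's actual implementation.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class SketchState:
    """Stand-in for RPMListerState, for illustration only."""

    package_versions: Dict[str, Set[str]] = field(default_factory=dict)


def sketch_state_to_dict(state: SketchState) -> Dict[str, Any]:
    # Sets are not JSON-serializable, so store each one as a sorted list.
    return {
        "package_versions": {
            name: sorted(versions)
            for name, versions in state.package_versions.items()
        }
    }


def sketch_state_from_dict(d: Dict[str, Any]) -> SketchState:
    # Rebuild the sets from the stored lists; a missing key gives an empty state.
    return SketchState(
        package_versions={
            name: set(versions)
            for name, versions in d.get("package_versions", {}).items()
        }
    )


state = SketchState(package_versions={"bash": {"5.2-2", "5.2-1"}})
serialized = json.dumps(sketch_state_to_dict(state))
restored = sketch_state_from_dict(json.loads(serialized))
```

Sorting the versions before storing them keeps the serialized form stable between runs, which makes stored states easier to compare.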
 - repo_request(index_url_template: Template, base_url: str, release: str, component: str) → Tuple[str, str, Repo] | None
- Return parsed packages for a given distribution release and component.
 - get_pages() → Iterator[Tuple[str, str, Repo] | None]
- Return an iterator on parsed RPM packages, one page per (release, component) pair.
 - get_origins_from_page(page: Tuple[str, str, Repo] | None) → Iterator[ListedOrigin]
- Convert a page of RPM package sources into an iterator of ListedOrigin.
 - finalize()
- Custom hook to finalize the lister state before returning from the main loop.
- This method must set updated if the lister has done some work.
- If relevant, this method can use get_state_from_scheduler() to merge the current lister state with the one from the scheduler backend, reducing the risk of race conditions when running concurrent listings.
- This method is called in a finally block, which means it will also run when the lister fails.
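One possible merge strategy for the state is sketched below: take the union of the version sets known locally and those stored in the scheduler backend. This is an assumption made for illustration, not a description of what the lister does; the function name merge_package_versions and the sample data are hypothetical.

```python
from typing import Dict, Set


def merge_package_versions(
    local: Dict[str, Set[str]],
    stored: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Union the version sets of two states without mutating either input."""
    merged = {name: set(versions) for name, versions in stored.items()}
    for name, versions in local.items():
        merged.setdefault(name, set()).update(versions)
    return merged


# Hypothetical states from two concurrent listings.
local_versions = {"bash": {"5.2-2"}, "curl": {"8.6-1"}}
stored_versions = {"bash": {"5.2-1"}, "zlib": {"1.3-1"}}

merged_versions = merge_package_versions(local_versions, stored_versions)
```

Because the union never discards a version, versions recorded by a concurrent listing are preserved when the merged state is written back.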