swh.lister.bioconductor.lister module
- class swh.lister.bioconductor.lister.BioconductorListerState(package_versions: Dict[str, Set[str]] = <factory>)
- Bases: object
- State of the Bioconductor lister.
- class swh.lister.bioconductor.lister.BioconductorLister(scheduler: SchedulerInterface, url: str = 'https://www.bioconductor.org', instance: str = 'bioconductor', credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None = None, releases: List[str] | None = None, categories: List[str] | None = None, incremental: bool = False, max_origins_per_page: int | None = None, max_pages: int | None = None, enable_origins: bool = True, record_batch_size: int = 1000)
- Bases: Lister[BioconductorListerState, Tuple[str, str, Dict[str, Any]] | None]
- List origins from Bioconductor, a collection of open source software for bioinformatics based on the R statistical programming language.
 - VISIT_TYPE = 'bioconductor'
 - INSTANCE = 'bioconductor'
 - BIOCONDUCTOR_HOMEPAGE = 'https://www.bioconductor.org'
 - state_from_dict(d: Dict[str, Any]) → BioconductorListerState
- Convert the state stored in the scheduler backend (as a dict), to the concrete StateType for this lister. 
 - state_to_dict(state: BioconductorListerState) → Dict[str, Any]
- Convert the StateType for this lister to its serialization as dict for storage in the scheduler.
- Values must be JSON-compatible as that’s what the backend database expects.
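The JSON-compatibility requirement matters here because the state holds sets, which JSON cannot represent. The following is a minimal sketch of one way such a round trip could work, not the lister's actual implementation: `SketchState`, `sketch_state_to_dict`, and `sketch_state_from_dict` are hypothetical stand-ins, and storing each set as a sorted list is an assumption.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class SketchState:
    """Hypothetical stand-in for BioconductorListerState."""

    package_versions: Dict[str, Set[str]] = field(default_factory=dict)


def sketch_state_to_dict(state: SketchState) -> Dict[str, Any]:
    # Sets are not JSON-serializable, so each one is stored as a sorted list
    # (assumption: the real lister may choose a different encoding).
    return {
        "package_versions": {
            name: sorted(versions)
            for name, versions in state.package_versions.items()
        }
    }


def sketch_state_from_dict(d: Dict[str, Any]) -> SketchState:
    # Rebuild the sets from the stored lists; a missing key yields empty state.
    return SketchState(
        package_versions={
            name: set(versions)
            for name, versions in d.get("package_versions", {}).items()
        }
    )


# Illustrative data only: the version strings are made up for the example.
state = SketchState({"limma": {"3.58.1", "3.58.0"}})
serialized = sketch_state_to_dict(state)
json.dumps(serialized)  # succeeds because only dicts, lists and strings remain
restored = sketch_state_from_dict(serialized)
```

Sorting the lists keeps the stored form deterministic, which makes it easier to compare two serialized states.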
 - get_pages() → Iterator[Tuple[str, str, Dict[str, Any]] | None]
- Return an iterator for each page. Every page is a (release, category) pair. 
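To illustrate the "one page per (release, category) pair" idea, here is a hedged sketch of such an iterator. The `fetch` callable is hypothetical, and treating `None` as "no metadata available for this pair" is an assumption made to match the `| None` in the return type; the release and category values in the usage are illustrative.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

Page = Optional[Tuple[str, str, Dict[str, Any]]]


def sketch_get_pages(
    releases: List[str],
    categories: List[str],
    fetch: Callable[[str, str], Optional[Dict[str, Any]]],
) -> Iterator[Page]:
    """Yield one page per (release, category) pair, in nested-loop order."""
    for release in releases:
        for category in categories:
            metadata = fetch(release, category)  # hypothetical fetcher
            if metadata is None:
                # Assumption: a pair without metadata is reported as None.
                yield None
            else:
                yield (release, category, metadata)


def fake_fetch(release: str, category: str) -> Optional[Dict[str, Any]]:
    # Stand-in for a network call; pretends "workflows" has no metadata.
    if category == "workflows":
        return None
    return {"examplepkg": {"release": release}}


pages = list(sketch_get_pages(["3.17", "3.18"], ["bioc", "workflows"], fake_fetch))
```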
 - get_origins_from_page(page: Tuple[str, str, Dict[str, Any]] | None) → Iterator[ListedOrigin]
- Convert a page of BioconductorLister PACKAGES/packages.json metadata into a list of ListedOrigins.
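A rough sketch of the page-to-origins conversion follows. Everything in it is an assumption for illustration: `SketchOrigin` is a hypothetical stand-in for `ListedOrigin`, the metadata is assumed to be a dict keyed by package name, and the URL pattern is a guess rather than the lister's documented behaviour.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, Optional, Tuple

Page = Optional[Tuple[str, str, Dict[str, Any]]]


@dataclass(frozen=True)
class SketchOrigin:
    """Hypothetical stand-in for ListedOrigin."""

    visit_type: str
    url: str


def sketch_origins_from_page(
    page: Page, base_url: str = "https://www.bioconductor.org"
) -> Iterator[SketchOrigin]:
    if page is None:
        # Nothing to convert for an empty page.
        return
    _release, _category, metadata = page
    # Assumption: metadata maps package names to per-package details.
    for package_name in sorted(metadata):
        yield SketchOrigin(
            visit_type="bioconductor",
            url=f"{base_url}/packages/{package_name}",  # assumed URL pattern
        )


origins = list(
    sketch_origins_from_page(("3.17", "bioc", {"pkgb": {}, "pkga": {}}))
)
```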
 - finalize() → None
- Custom hook to finalize the lister state before returning from the main loop.
- This method must set `updated` if the lister has done some work.
- If relevant, this method can use `get_state_from_scheduler()` to merge the current lister state with the one from the scheduler backend, reducing the risk of race conditions if we’re running concurrent listings.
- This method is called in a finally block, which means it will also run when the lister fails.
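The merge-on-finalize behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `SketchLister`, `merge_versions`, and `run` are hypothetical, the state is reduced to a plain package-to-versions mapping, and setting `updated` only when the merge adds something new is one possible reading of "has done some work".

```python
from typing import Callable, Dict, Set

Versions = Dict[str, Set[str]]


def merge_versions(local: Versions, remote: Versions) -> Versions:
    """Union two package -> versions mappings without mutating either."""
    merged = {name: set(versions) for name, versions in remote.items()}
    for name, versions in local.items():
        merged.setdefault(name, set()).update(versions)
    return merged


class SketchLister:
    """Hypothetical lister holding only the parts needed for finalize()."""

    def __init__(self, scheduler_state: Versions) -> None:
        self._scheduler_state = scheduler_state  # stand-in for the backend
        self.state: Versions = {}
        self.updated = False

    def get_state_from_scheduler(self) -> Versions:
        # Return a copy so callers cannot mutate the "backend" by accident.
        return {name: set(v) for name, v in self._scheduler_state.items()}

    def finalize(self) -> None:
        remote = self.get_state_from_scheduler()
        merged = merge_versions(self.state, remote)
        # Assumption: "work was done" means the merge adds something new.
        self.updated = merged != remote
        self.state = merged

    def run(self, work: Callable[["SketchLister"], None]) -> None:
        try:
            work(self)
        finally:
            # Runs even when work() raises, as the documentation describes.
            self.finalize()
```

Because the merge is a union, versions recorded by a concurrent listing are kept rather than overwritten, which is the race the documentation refers to.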