swh.loader.metadata.journal_client module
- class swh.loader.metadata.journal_client.JournalClient(scheduler: swh.scheduler.interface.SchedulerInterface, storage: swh.storage.interface.StorageInterface, metadata_fetcher_credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None, reload_after_days: int)
Bases: object
- scheduler: SchedulerInterface
- storage: StorageInterface
- statsd_timed(name: str, tags: Dict[str, Any] = {})
Wrapper for swh.core.statsd.Statsd.timed(), which uses the standard metric name and tags.
- statsd_timing(name: str, value: float, tags: Dict[str, Any] = {}) → None
Wrapper for swh.core.statsd.Statsd.timing(), which uses the standard metric name and tags for loaders.
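The wrapper pattern described for statsd_timed and statsd_timing can be sketched as follows. This is only an illustration under stated assumptions: RecordingStatsd is a hypothetical stand-in rather than swh.core.statsd.Statsd, and TimedClient, the METRIC name, and the "operation" tag key are invented for the example. The real wrappers may use different metric names, tags, and units.

```python
import time
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List, Optional, Tuple


class RecordingStatsd:
    """Hypothetical stand-in for a statsd client: records instead of sending."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, float, Dict[str, Any]]] = []

    def timing(self, metric: str, value: float, tags: Dict[str, Any]) -> None:
        self.sent.append((metric, value, tags))


class TimedClient:
    # Hypothetical metric name; the real one is not given in this reference.
    METRIC = "example_operation_duration"

    def __init__(self, statsd: RecordingStatsd) -> None:
        self.statsd = statsd

    def statsd_timing(
        self, name: str, value: float, tags: Optional[Dict[str, Any]] = None
    ) -> None:
        # Always report under one metric name; the operation becomes a tag.
        merged = {"operation": name, **(tags or {})}
        self.statsd.timing(self.METRIC, value, merged)

    @contextmanager
    def statsd_timed(
        self, name: str, tags: Optional[Dict[str, Any]] = None
    ) -> Iterator[None]:
        # Time the enclosed block and report it through statsd_timing,
        # even if the block raises.
        start = time.monotonic()
        try:
            yield
        finally:
            self.statsd_timing(name, time.monotonic() - start, tags)
```

With this shape, callers only name the operation being measured, and every measurement ends up under one consistently tagged metric.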
- process_journal_objects(messages: Dict[str, List[Dict]]) → None
Loads metadata for origins not recently loaded:
1. reads messages from the origin journal topic
2. queries the scheduler for a list of listers that produced this origin (to guess what type of forge it is)
3. if it is a forge we can get extrinsic metadata from, checks whether we got any recently, using the storage
4. if not, triggers a metadata load
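The steps above can be sketched as a small, dependency-free function. This is a minimal sketch and not the actual implementation: FETCHERS_BY_LISTER, origins_to_reload, and the two callables standing in for the scheduler and storage lookups are all hypothetical, and it assumes each origin message carries a "url" key.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Optional

# Hypothetical mapping from a lister name to a metadata fetcher name.
FETCHERS_BY_LISTER = {"github": "github-metadata", "gitea": "gitea-metadata"}


def origins_to_reload(
    messages: Dict[str, List[Dict]],
    listers_for_origin: Callable[[str], List[str]],  # stands in for the scheduler
    last_metadata_date: Callable[[str, str], Optional[datetime]],  # stands in for the storage
    reload_after_days: int,
    now: datetime,
) -> List[str]:
    """Return the origin URLs for which a metadata load should be triggered."""
    threshold = now - timedelta(days=reload_after_days)
    to_load: List[str] = []
    for origin in messages.get("origin", []):  # step 1: origin topic messages
        url = origin["url"]
        for lister in listers_for_origin(url):  # step 2: guess the forge type
            fetcher = FETCHERS_BY_LISTER.get(lister)
            if fetcher is None:
                continue  # not a forge we can get extrinsic metadata from
            last = last_metadata_date(url, fetcher)  # step 3: recent metadata?
            if last is None or last < threshold:
                to_load.append(url)  # step 4: needs a metadata load
                break
    return to_load
```

In this sketch an origin is selected when it has never been fetched or when its last fetch is older than reload_after_days; origins from unrecognized listers are skipped.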