swh.lister.rubygems package#
Submodules#
- swh.lister.rubygems.lister module
  - RubyGemsLister
    - RubyGemsLister.LISTER_NAME
    - RubyGemsLister.VISIT_TYPE
    - RubyGemsLister.INSTANCE
    - RubyGemsLister.RUBY_GEMS_POSTGRES_DUMP_BASE_URL
    - RubyGemsLister.RUBY_GEMS_POSTGRES_DUMP_LIST_URL
    - RubyGemsLister.RUBY_GEM_DOWNLOAD_URL_PATTERN
    - RubyGemsLister.RUBY_GEM_ORIGIN_URL_PATTERN
    - RubyGemsLister.RUBY_GEM_EXTRINSIC_METADATA_URL_PATTERN
    - RubyGemsLister.DB_NAME
    - RubyGemsLister.DUMP_SQL_PATH
    - RubyGemsLister.get_latest_dump_file()
    - RubyGemsLister.create_rubygems_db()
    - RubyGemsLister.populate_rubygems_db()
    - RubyGemsLister.get_pages()
    - RubyGemsLister.get_origins_from_page()
- swh.lister.rubygems.tasks module
Module contents#
RubyGems lister#
The RubyGems lister lists origins from RubyGems.org, the Ruby community’s gem hosting service.
As of September 2022, RubyGems.org lists 173384 package names.
Origins retrieving strategy#
To get the list of all package names, the lister calls an HTTP endpoint that returns the gem names as plain text.
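As a rough sketch of this step, the snippet below splits a plain-text listing into package names. The one-name-per-line layout and the helper name `parse_gem_names` are assumptions made for illustration only; the endpoint and its exact response format are not specified here.

```python
from typing import List


def parse_gem_names(text: str) -> List[str]:
    """Split a plain-text listing into package names.

    Assumption: one package name per line. Blank lines and surrounding
    whitespace are ignored.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical sample response, not real RubyGems.org output.
sample = "rails\nrake\n\nnokogiri\n"
names = parse_gem_names(sample)
```

Under that assumption, `names` holds the three non-empty entries from the sample text, in order.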
Page listing#
Each page corresponds to a single origin URL built from the following pattern:
https://rubygems.org/gems/{pkgname}
Origins from page#
The lister yields one origin URL per page.
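The mapping from a page to its origin can be sketched as below. This is a minimal illustration, assuming a page is simply one package name; the function name `origins_from_page` and the constant `ORIGIN_URL_PATTERN` are hypothetical and do not reproduce the actual `RubyGemsLister` implementation.

```python
from typing import Iterator

# URL pattern taken from the "Page listing" section.
ORIGIN_URL_PATTERN = "https://rubygems.org/gems/{pkgname}"


def origins_from_page(page: str) -> Iterator[str]:
    """Yield exactly one origin URL for a page.

    Assumption: `page` is a single package name.
    """
    yield ORIGIN_URL_PATTERN.format(pkgname=page)


urls = list(origins_from_page("rails"))
```

Writing the helper as a generator keeps the same shape if a page were ever to carry more than one origin.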
Running tests#
Activate the virtualenv, then run the following from within the swh-lister directory:
pytest -s -vv --log-cli-level=DEBUG swh/lister/rubygems/tests
Testing with Docker#
Change directory to swh/docker, then launch the Docker environment:
docker compose up -d
Then schedule a RubyGems listing task:
docker compose exec swh-scheduler swh scheduler task add -p oneshot list-rubygems
You can follow the lister's execution by displaying the logs of the swh-lister service:
docker compose logs -f swh-lister