swh.scanner.data module
- class swh.scanner.data.MerkleNodeInfo
- Bases: dict. Store additional information about Merkle DAG nodes, using SWHIDs as keys.
- swh.scanner.data.init_merkle_node_info(source_tree: Directory, data: MerkleNodeInfo, provenance: bool) -> None
- Populate the MerkleNodeInfo with the SWHIDs of the given source tree. The dictionary values are pre-filled with dictionaries holding the information about the nodes. The “known” key is always stored, as it is always fetched. The “provenance” key is stored only if the provenance parameter is True.
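
The pre-filling described above can be sketched with plain Python stand-ins. This is a minimal illustration, not the library's implementation: `NodeInfoSketch`, `prefill`, and the example SWHID strings are hypothetical, and the `None` placeholder values are an assumption (the description states only which keys are stored, not their initial values).

```python
from typing import Dict, Iterable, Optional


class NodeInfoSketch(dict):
    """Hypothetical stand-in for MerkleNodeInfo: a dict keyed by SWHID."""


def prefill(swhids: Iterable[str], data: NodeInfoSketch, provenance: bool) -> None:
    """Give every SWHID an entry holding the keys that will be filled later."""
    for swhid in swhids:
        # "known" is always stored; None is an assumed placeholder value.
        entry: Dict[str, Optional[object]] = {"known": None}
        if provenance:
            # "provenance" is stored only when explicitly requested.
            entry["provenance"] = None
        data[swhid] = entry


# Made-up identifiers, for illustration only.
example_swhids = [
    "swh:1:dir:" + "0" * 40,
    "swh:1:cnt:" + "1" * 40,
]

without_provenance = NodeInfoSketch()
prefill(example_swhids, without_provenance, provenance=False)

with_provenance = NodeInfoSketch()
prefill(example_swhids, with_provenance, provenance=True)
```

Because the container is an ordinary dict subclass, later steps can look up or update an entry by SWHID without any special accessor.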
- exception swh.scanner.data.NoProvenanceAPIAccess
- Bases: RuntimeError. Raised when the user does not have access to the Provenance API.
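
As a rough sketch of how a caller might react to such an exception, the following uses hypothetical stand-ins only. `NoAccessSketch`, `fetch_provenance`, and `scan` are not part of the library, and falling back to a scan without provenance is an assumed policy, not documented behaviour.

```python
class NoAccessSketch(RuntimeError):
    """Hypothetical stand-in for NoProvenanceAPIAccess."""


def fetch_provenance(has_access: bool) -> str:
    """Pretend to query a provenance service; raise when access is denied."""
    if not has_access:
        raise NoAccessSketch("no access to the provenance API")
    return "provenance data"


def scan(has_access: bool) -> str:
    """Assumed caller-side policy: continue without provenance on denial."""
    try:
        return fetch_provenance(has_access)
    except NoAccessSketch:
        return "scan completed without provenance"
```

Since the exception derives from RuntimeError, a broader `except RuntimeError` clause would also catch it; catching the specific class keeps unrelated runtime failures visible.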
- swh.scanner.data.add_provenance(source_tree: swh.model.from_disk.Directory, data: swh.scanner.data.MerkleNodeInfo, client: swh.web.client.client.WebAPIClient, update_progress: typing.Callable[[int, int], None] | None = <function _no_update_progress>)
- Store provenance information about software artifacts retrieved from the Software Heritage graph service. 
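
The `update_progress` parameter accepts a callable taking two integers and returning nothing. The sketch below shows that callback shape with a hypothetical worker, `process_all`. Reading the two integers as (items done, total items) is an assumption, since the signature does not say what they mean.

```python
from typing import Callable, List, Optional, Tuple


def _no_update(done: int, total: int) -> None:
    """Default callback that ignores progress updates."""


def process_all(
    items: List[str],
    update_progress: Optional[Callable[[int, int], None]] = _no_update,
) -> None:
    """Hypothetical worker that reports progress after each item."""
    if update_progress is None:
        update_progress = _no_update
    total = len(items)
    for done, _item in enumerate(items, start=1):
        # Assumed meaning of the arguments: (items done so far, total items).
        update_progress(done, total)


# Collect every progress report so it can be inspected afterwards.
reports: List[Tuple[int, int]] = []
process_all(
    ["a", "b", "c"],
    update_progress=lambda done, total: reports.append((done, total)),
)
```

A no-op default lets the worker call the callback unconditionally, so callers that do not care about progress need not pass anything.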
- swh.scanner.data.has_dirs(node: Directory) -> bool
- Check if the given directory has other directories inside. 
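
The check can be illustrated on a toy tree. `DirSketch`, `FileSketch`, and `has_dirs_sketch` are hypothetical stand-ins rather than the library's `Directory` class, and looking only at direct children is an assumption about what "inside" means.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class FileSketch:
    """Hypothetical stand-in for a file (content) node."""
    name: str


@dataclass
class DirSketch:
    """Hypothetical stand-in for a directory node with named children."""
    name: str
    children: Dict[str, Union["DirSketch", FileSketch]] = field(default_factory=dict)


def has_dirs_sketch(node: DirSketch) -> bool:
    """Return True if any direct child of the node is itself a directory."""
    return any(isinstance(child, DirSketch) for child in node.children.values())


# A small tree: project/ contains docs/ (a directory) and setup (a file).
leaf = DirSketch("docs", {"readme": FileSketch("readme")})
root = DirSketch("project", {"docs": leaf, "setup": FileSketch("setup")})
```

Here `has_dirs_sketch(root)` is true because `docs` is a directory, while `has_dirs_sketch(leaf)` is false because `docs` holds only a file.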
- swh.scanner.data.get_content_from(node_path: bytes, source_tree: Directory, nodes_data: MerkleNodeInfo) -> Dict[bytes, dict]
- Get content information from the given directory node. 
- swh.scanner.data.get_vcs_ignore_patterns(cwd: Path | None = None) -> List[bytes]
- Return a list of all patterns to ignore according to the VCS used for the project being scanned, if any.