swh.storage.algos.diff module
- swh.storage.algos.diff.diff_directories(storage: StorageInterface, from_dir: bytes | None, to_dir: bytes, track_renaming: bool = False) → List[Dict[str, Any]]
- Compute the differential between two directories, i.e. the list of file changes (insertion / deletion / modification / renaming) between them.
- Parameters:
- storage – instance of a swh storage (either local or remote; a local storage is recommended for optimal performance) 
- from_dir – the swh identifier of the directory to compare from 
- to_dir – the swh identifier of the directory to compare to 
- track_renaming – whether to track file renames 
 
- Returns:
- A list of dicts representing the changes between the two directories. Each dict contains the following entries:
- type: a string describing the type of change (insert / delete / modify / rename)
- from: a dict containing the directory entry metadata in the from directory (None in case of an insertion)
- from_path: byte string corresponding to the absolute path of the entry in the from directory (None in case of an insertion)
- to: a dict containing the directory entry metadata in the to directory (None in case of a deletion)
- to_path: byte string corresponding to the absolute path of the entry in the to directory (None in case of a deletion)
- The returned list is sorted in lexicographic depth-first order according to the value of the to_path field.
- Warning
- The algorithm used to track file renames is quite naive (it compares hashes between deleted and inserted files) and may miss some renames in edge cases.
 
- swh.storage.algos.diff.diff_revisions(storage: StorageInterface, from_rev: bytes | None, to_rev: bytes, track_renaming: bool = False) → List[Dict[str, Any]]
- Compute the differential between two revisions, i.e. the list of file changes between the two associated directories.
- Parameters:
- storage – instance of a swh storage (either local or remote; a local storage is recommended for optimal performance) 
- from_rev – the identifier of the revision to compare from 
- to_rev – the identifier of the revision to compare to 
- track_renaming – whether to track file renames 
 
- Returns:
- A list of dicts describing the introduced file changes (see swh.storage.algos.diff.diff_directories()).
- Warning
- The algorithm used to track file renames is quite naive (it compares hashes between deleted and inserted files) and may miss some renames in edge cases.
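To make the warning concrete, here is a minimal sketch of hash-based rename pairing. It is not the code used by swh.storage, only an illustration of the general idea stated above (matching deleted and inserted files by content hash). The function name detect_renames, the path-to-hash mappings and the hash values are all hypothetical.

```python
from typing import Dict, List, Tuple


def detect_renames(
    deleted: Dict[bytes, bytes],
    inserted: Dict[bytes, bytes],
) -> List[Tuple[bytes, bytes]]:
    """Pair deleted and inserted paths whose content hashes are equal.

    Both arguments map a path to a content hash (hypothetical inputs).
    """
    # Index deleted paths by hash; several paths may share one hash.
    by_hash: Dict[bytes, List[bytes]] = {}
    for path, content_hash in deleted.items():
        by_hash.setdefault(content_hash, []).append(path)

    renames: List[Tuple[bytes, bytes]] = []
    for to_path, content_hash in inserted.items():
        candidates = by_hash.get(content_hash)
        if candidates:
            # Take the first remaining candidate; with identical files
            # this choice is arbitrary, which is one of the edge cases.
            renames.append((candidates.pop(0), to_path))
    return renames


# a.txt was moved unchanged; b.txt was moved AND edited, so its hash differs.
deleted_files = {b"a.txt": b"hash-1", b"b.txt": b"hash-2"}
inserted_files = {b"moved/a.txt": b"hash-1", b"moved/b.txt": b"hash-3"}

renames = detect_renames(deleted_files, inserted_files)
print(renames)
```

In this sample only a.txt is paired. The move of b.txt is missed because its content changed, so a hash-only approach would report it as a separate deletion and insertion.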
- swh.storage.algos.diff.diff_revision(storage: StorageInterface, revision: bytes, track_renaming: bool = False) → List[Dict[str, Any]]
- Compute the differential between a revision and its first parent. If the revision has no parents, the directory to compare from is considered empty. In other words, compute the file changes introduced in a specific revision.
- Parameters:
- storage – instance of a swh storage (either local or remote; a local storage is recommended for optimal performance) 
- revision – the identifier of the revision from which to compute the introduced changes. 
- track_renaming – whether to track file renames 
 
- Returns:
- A list of dicts describing the introduced file changes (see swh.storage.algos.diff.diff_directories()).
- Warning
- The algorithm used to track file renames is quite naive (it compares hashes between deleted and inserted files) and may miss some renames in edge cases.
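The first-parent rule described for diff_revision can be sketched as a thin wrapper around a two-revision diff. This is an illustration under assumptions, not the module's implementation: diff_against_first_parent and fake_diff are hypothetical names, the parent list is passed in by hand, and treating a from revision of None as "compare against an empty directory" is inferred from the bytes | None annotation on from_rev.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence

# Stand-in for a two-revision diff taking (from_rev, to_rev); with the real
# library this could be e.g. functools.partial(diff_revisions, storage).
DiffFn = Callable[[Optional[bytes], bytes], List[Dict[str, Any]]]


def diff_against_first_parent(
    revision_id: bytes,
    parents: Sequence[bytes],
    diff_fn: DiffFn,
) -> List[Dict[str, Any]]:
    """Diff a revision against its first parent, or against nothing.

    Assumption: passing None as the from revision means "empty directory".
    """
    from_rev = parents[0] if parents else None
    return diff_fn(from_rev, revision_id)


# Fake diff function that only records how it was called.
calls: List[tuple] = []


def fake_diff(from_rev: Optional[bytes], to_rev: bytes) -> List[Dict[str, Any]]:
    calls.append((from_rev, to_rev))
    return []


diff_against_first_parent(b"merge-rev", [b"parent-1", b"parent-2"], fake_diff)
diff_against_first_parent(b"root-rev", [], fake_diff)
print(calls)
```

For a merge revision only the first parent is used, so changes brought in from the other parents show up as if they were introduced by the merge itself.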