swh.vault.to_disk module
- swh.vault.to_disk.get_filtered_file_content(storage: StorageInterface, file_data: Dict[str, Any], objstorage: ObjStorageInterface | None = None) → Dict[str, Any]
- Retrieve the file specified by file_data and apply filters for skipped and missing content.
- Parameters:
- storage – the storage from which to retrieve the objects 
- file_data – a file entry as returned by directory_ls() 
 
- Returns:
- The entry given in file_data with a new ‘content’ key that points to the file content in bytes. The content can be replaced by a specific message indicating that it could not be retrieved (either because of a privacy policy or because it was too large to archive).
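
The filtering behaviour described above can be illustrated with a small, self-contained sketch. It does not use the real swh storage: `filter_file_content`, the `blobs` dictionary, the `status` and `sha1` fields, and both placeholder messages are assumptions made for this example, and the actual function may look content up and word its messages differently.

```python
from typing import Any, Dict

# Hypothetical placeholder messages; the real module defines its own wording.
SKIPPED_MESSAGE = b"This content was not retrieved from the archive.\n"
HIDDEN_MESSAGE = b"This content is hidden.\n"


def filter_file_content(
    blobs: Dict[bytes, bytes], file_data: Dict[str, Any]
) -> Dict[str, Any]:
    """Return a copy of file_data with an added 'content' key (bytes).

    blobs is a stand-in for the storage: it maps an assumed 'sha1'
    field to the raw file bytes.
    """
    status = file_data.get("status")  # assumed field name
    if status == "visible":
        # Fall back to the placeholder if the blob is missing.
        content = blobs.get(file_data["sha1"], SKIPPED_MESSAGE)
    elif status == "hidden":
        content = HIDDEN_MESSAGE
    else:
        content = SKIPPED_MESSAGE
    # Build a new dict so the caller's entry is left untouched.
    return {**file_data, "content": content}


blobs = {b"\x01": b"hello\n"}
visible_entry = {"name": b"a.txt", "sha1": b"\x01", "status": "visible"}
absent_entry = {"name": b"big.bin", "sha1": b"\x02", "status": "absent"}

visible = filter_file_content(blobs, visible_entry)
absent = filter_file_content(blobs, absent_entry)
```

In this sketch the returned entry always carries a bytes value under 'content', so a caller can write it to disk without checking whether the original content was available.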
 
- class swh.vault.to_disk.DirectoryBuilder(storage: StorageInterface, root: bytes, dir_id: bytes, thread_pool_size: int = 10, objstorage: ObjStorageInterface | None = None)
- Bases: object
- Reconstructs the on-disk representation of a directory in the storage.
- Initialize the directory builder.
- Parameters:
- storage – the storage object 
- root – the path where the directory should be reconstructed 
- dir_id – the identifier of the directory in the storage
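
To make the idea of reconstructing a directory on disk concrete, here is a toy sketch that mirrors the three documented parameters. It is not the swh implementation: `ToyDirectoryBuilder`, its `build()` method, the in-memory `tree` mapping, and the entry layout are all hypothetical, and the sketch leaves out `thread_pool_size` and `objstorage`.

```python
import os
import tempfile
from typing import Dict, List, Tuple

# Assumed entry layout: (name, kind, target).
# kind "dir"  -> target is the id of a sub-directory in the tree
# kind "file" -> target is the file content in bytes
Entry = Tuple[bytes, str, bytes]


class ToyDirectoryBuilder:
    """Toy stand-in that writes an in-memory tree under a root path."""

    def __init__(self, tree: Dict[bytes, List[Entry]], root: bytes, dir_id: bytes):
        self.tree = tree      # stands in for the storage
        self.root = root      # path where the directory is reconstructed
        self.dir_id = dir_id  # identifier of the directory to rebuild

    def build(self) -> None:
        self._build(self.root, self.dir_id)

    def _build(self, path: bytes, dir_id: bytes) -> None:
        os.makedirs(path, exist_ok=True)
        for name, kind, target in self.tree[dir_id]:
            child = os.path.join(path, name)
            if kind == "dir":
                # Recurse into the sub-directory identified by target.
                self._build(child, target)
            else:
                with open(child, "wb") as f:
                    f.write(target)


tree = {
    b"root-id": [(b"README", "file", b"hello\n"), (b"src", "dir", b"src-id")],
    b"src-id": [(b"main.py", "file", b"print('hi')\n")],
}
root = os.fsencode(tempfile.mkdtemp())
ToyDirectoryBuilder(tree, root, b"root-id").build()
```

As in the documented signature, `root` is a bytes path, so the sketch keeps every path component in bytes rather than mixing `str` and `bytes`.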