swh.graph.luigi.topology module
- class swh.graph.luigi.topology.TopoSort(*args, **kwargs)
- Bases: Task
- Creates a file that contains all SWHIDs in topological order from a compressed graph.
 - local_graph_path = <luigi.parameter.PathParameter object>
 - topological_order_dir = <luigi.parameter.PathParameter object>
 - graph_name = <luigi.parameter.Parameter object>
 - object_types = <luigi.parameter.Parameter object>
 - direction = <luigi.parameter.ChoiceParameter object>
 - algorithm = <luigi.parameter.ChoiceParameter object>
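To illustrate what "topological order" means here, the following is a minimal sketch using Kahn's algorithm on a small in-memory graph. It is not the implementation used by TopoSort, which works on the compressed graph; the function name `topological_order`, the `successors` mapping, and the node labels are all hypothetical.

```python
from collections import deque


def topological_order(successors):
    """Return the nodes of a DAG so that every node precedes its successors.

    `successors` maps each node to the list of nodes it points to
    (a hypothetical in-memory stand-in for the compressed graph).
    """
    # Count incoming edges for every node that appears in the graph.
    indegree = {node: 0 for node in successors}
    for targets in successors.values():
        for target in targets:
            indegree[target] = indegree.get(target, 0) + 1

    # Nodes without incoming edges can be emitted immediately.
    ready = deque(sorted(node for node, deg in indegree.items() if deg == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in successors.get(node, ()):
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)

    # Any node left unvisited sits on a cycle, so no order exists.
    if len(order) != len(indegree):
        raise ValueError("graph contains a cycle")
    return order


# Hypothetical labels standing in for SWHIDs.
toy_graph = {"rev1": ["dir1"], "rev2": ["dir1"], "dir1": ["cnt1"], "cnt1": []}
print(topological_order(toy_graph))  # prints ['rev1', 'rev2', 'dir1', 'cnt1']
```

Reversing every edge before calling the function gives the order for the opposite traversal direction, which is presumably the kind of choice the `direction` parameter exposes.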
 
- class swh.graph.luigi.topology.ComputeGenerations(*args, **kwargs)
- Bases: Task
- Creates a file that contains all SWHIDs in topological order from a compressed graph.
 - local_graph_path = <luigi.parameter.PathParameter object>
 - topological_order_dir = <luigi.parameter.PathParameter object>
 - graph_name = <luigi.parameter.Parameter object>
 - object_types = <luigi.parameter.Parameter object>
 - direction = <luigi.parameter.ChoiceParameter object>
 
- class swh.graph.luigi.topology.UploadGenerationsToS3(*args, **kwargs)
- Bases: Task
- Uploads the output of ComputeGenerations to S3.
 - local_graph_path = <luigi.parameter.PathParameter object>
 - topological_order_dir = <luigi.parameter.PathParameter object>
 - dataset_name = <luigi.parameter.Parameter object>
 - graph_name = <luigi.parameter.Parameter object>
 - object_types = <luigi.parameter.Parameter object>
 - direction = <luigi.parameter.ChoiceParameter object>
 - requires() -> Task
- Returns an instance of ComputeGenerations.
 
- class swh.graph.luigi.topology.CountPaths(*args, **kwargs)
- Bases: Task
- Creates a file that lists:
  - the number of paths leading to each node, and starting from all leaves, and
  - the number of paths leading to each node, and starting from all other nodes
- Singleton paths are not counted.
 - local_graph_path = <luigi.parameter.PathParameter object>
 - topological_order_dir = <luigi.parameter.PathParameter object>
 - graph_name = <luigi.parameter.Parameter object>
 - object_types = <luigi.parameter.Parameter object>
 - direction = <luigi.parameter.ChoiceParameter object>
 - property resources
- Return the estimated RAM use of this task.
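The two counts listed for CountPaths can be sketched as a single pass over the nodes in topological order. This is a minimal illustration on an in-memory graph, not the task's actual implementation. The function `count_paths` and its inputs are hypothetical, and a "leaf" is assumed here to mean a node with no predecessors in the traversal direction.

```python
def count_paths(predecessors, order):
    """Count non-singleton paths ending at each node of a DAG.

    `predecessors` maps each node to the nodes with an edge into it, and
    `order` lists all nodes in topological order. Both are hypothetical
    in-memory inputs. A "leaf" is assumed to be a node without predecessors.
    """
    from_leaves = {}
    from_all = {}
    for node in order:
        leaves_total = 0
        all_total = 0
        for pred in predecessors.get(node, ()):
            is_leaf = not predecessors.get(pred)
            # Every path ending at `pred` extends to `node`, and the edge
            # pred -> node is itself one new path starting at `pred`.
            all_total += from_all[pred] + 1
            # The bare edge only counts as a leaf path when `pred` is a leaf.
            leaves_total += from_leaves[pred] + (1 if is_leaf else 0)
        from_leaves[node] = leaves_total
        from_all[node] = all_total
    return from_leaves, from_all


# Toy graph: a -> c, b -> c, c -> d (node labels are illustrative only).
toy_predecessors = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"]}
from_leaves, from_all = count_paths(toy_predecessors, ["a", "b", "c", "d"])
print(from_leaves)  # prints {'a': 0, 'b': 0, 'c': 2, 'd': 2}
print(from_all)     # prints {'a': 0, 'b': 0, 'c': 2, 'd': 3}
```

Node `d` is reached by two paths from leaves (`a -> c -> d` and `b -> c -> d`) and by three paths overall once `c -> d` is included. The leaves `a` and `b` get zero in both columns because singleton paths are not counted.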
 
- class swh.graph.luigi.topology.PathCountsParquetToS3(*args, **kwargs)
- Bases: _ParquetToS3ToAthenaTask
- Reads the CSV from CountPaths, converts it to ORC, uploads the ORC to S3, and creates an Athena table for it.
 - topological_order_dir = <luigi.parameter.PathParameter object>
 - object_types = <luigi.parameter.Parameter object>
 - direction = <luigi.parameter.ChoiceParameter object>
 - dataset_name = <luigi.parameter.Parameter object>
 - s3_athena_output_location = <swh.export.luigi.S3PathParameter object>
 - requires() -> CountPaths
- Returns the corresponding CountPaths instance.