swh.graph.luigi.topology module
- class swh.graph.luigi.topology.TopoSort(*args, **kwargs)
Bases: Task
Creates a file that contains all SWHIDs in topological order from a compressed graph.
- local_graph_path = <luigi.parameter.PathParameter object>
- topological_order_dir = <luigi.parameter.PathParameter object>
- graph_name = <luigi.parameter.Parameter object>
- object_types = <luigi.parameter.Parameter object>
- direction = <luigi.parameter.ChoiceParameter object>
- algorithm = <luigi.parameter.ChoiceParameter object>
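To illustrate what "topological order" means here, the following is a minimal sketch of Kahn's algorithm on a tiny in-memory graph. It is not the implementation used by TopoSort: the node names, the `successors` mapping and the `topological_order` helper are hypothetical, and the real task works on a compressed graph and writes SWHIDs to a file.

```python
from collections import deque


def topological_order(successors):
    """Return the nodes so that each node appears before all of its successors.

    `successors` maps each node to the list of nodes it points to
    (a hypothetical toy representation, not the compressed graph format).
    """
    # Count the incoming edges of every node.
    indegree = {node: 0 for node in successors}
    for targets in successors.values():
        for target in targets:
            indegree[target] = indegree.get(target, 0) + 1

    # Nodes without incoming edges can be emitted immediately.
    ready = deque(sorted(node for node, deg in indegree.items() if deg == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        # Emitting a node "removes" its outgoing edges.
        for target in successors.get(node, ()):
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)

    if len(order) != len(indegree):
        raise ValueError("the graph contains a cycle")
    return order


# Toy graph: a -> b, a -> c, b -> d, c -> d
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
order = topological_order(toy_graph)
```

In this sketch the edge direction is fixed by the input mapping; the task's `direction` parameter presumably selects which way edges are followed, but its accepted values are not listed on this page.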
- class swh.graph.luigi.topology.ComputeGenerations(*args, **kwargs)
Bases: Task
Creates a file that contains all SWHIDs in topological order from a compressed graph.
- local_graph_path = <luigi.parameter.PathParameter object>
- topological_order_dir = <luigi.parameter.PathParameter object>
- graph_name = <luigi.parameter.Parameter object>
- object_types = <luigi.parameter.Parameter object>
- direction = <luigi.parameter.ChoiceParameter object>
- class swh.graph.luigi.topology.UploadGenerationsToS3(*args, **kwargs)
Bases: Task
Uploads the output of ComputeGenerations to S3.
- local_graph_path = <luigi.parameter.PathParameter object>
- topological_order_dir = <luigi.parameter.PathParameter object>
- dataset_name = <luigi.parameter.Parameter object>
- graph_name = <luigi.parameter.Parameter object>
- object_types = <luigi.parameter.Parameter object>
- direction = <luigi.parameter.ChoiceParameter object>
- requires() -> Task
Returns an instance of ComputeGenerations.
- class swh.graph.luigi.topology.CountPaths(*args, **kwargs)
Bases: Task
Creates a file that lists:
  - the number of paths leading to each node, and starting from all leaves, and
  - the number of paths leading to each node, and starting from all other nodes
Singleton paths are not counted.
- local_graph_path = <luigi.parameter.PathParameter object>
- topological_order_dir = <luigi.parameter.PathParameter object>
- graph_name = <luigi.parameter.Parameter object>
- object_types = <luigi.parameter.Parameter object>
- direction = <luigi.parameter.ChoiceParameter object>
- property resources
Return the estimated RAM use of this task.
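The two counts described for CountPaths can be sketched with a dynamic program over a topological order. This is only an illustration under stated assumptions, not the task's actual code: the `count_paths` helper and the toy graph are hypothetical, "leaves" is taken to mean nodes with no incoming edge in the traversal direction, and the second count is interpreted as paths starting from any node (leaves included). If "all other nodes" means non-leaf nodes only, that count would be the difference between the two values computed below.

```python
from graphlib import TopologicalSorter


def count_paths(predecessors):
    """Count non-singleton paths ending at each node of a DAG.

    `predecessors` maps each node to the nodes with an edge pointing to it
    (a hypothetical toy representation). Returns two dicts:
    paths starting from a leaf, and paths starting from any node.
    """
    from_leaves = {}
    from_all = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(predecessors).static_order():
        preds = predecessors.get(node, ())
        # A path ending here extends either the single-node path at a
        # predecessor or a longer path that already ends at that predecessor.
        from_leaves[node] = sum(
            1 if not predecessors.get(p) else from_leaves[p] for p in preds
        )
        from_all[node] = sum(from_all[p] + 1 for p in preds)
    return from_leaves, from_all


# Toy DAG with edges a -> c, b -> c, c -> d, a -> d; "a" and "b" are leaves.
toy_predecessors = {"a": [], "b": [], "c": ["a", "b"], "d": ["c", "a"]}
from_leaves, from_all = count_paths(toy_predecessors)
```

For node "d" in the toy graph, the paths from leaves are a→c→d, b→c→d and a→d, and the paths from any node additionally include c→d. Leaves themselves get a count of zero because singleton paths are excluded.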
- class swh.graph.luigi.topology.PathCountsParquetToS3(*args, **kwargs)
Bases: _ParquetToS3ToAthenaTask
Reads the CSV from CountPaths, converts it to ORC, uploads the ORC to S3, and creates an Athena table for it.
- topological_order_dir = <luigi.parameter.PathParameter object>
- object_types = <luigi.parameter.Parameter object>
- direction = <luigi.parameter.ChoiceParameter object>
- dataset_name = <luigi.parameter.Parameter object>
- s3_athena_output_location = <swh.export.luigi.S3PathParameter object>
- requires() -> CountPaths
Returns the corresponding CountPaths instance.