Interface SupportsAggregation
All Superinterfaces:
AutoCloseable, Closeable, Collector, Operator, OperatorPipelineV3, Serializable, org.apache.spark.sql.api.java.UDF0<Iterator<org.apache.spark.sql.Row>>
A mix-in interface for Collector. Command output collectors can implement this interface to compute aggregates and return the aggregated results as a DataFrame.
When a collector supports aggregation, SeqsLab calls the instance method agg() to aggregate over the entire DataFrame.
SeqsLab invokes agg() after applying the Collector's call function.
Method Summary
Modifier and Type: org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>
Method: agg(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df)
Description: Aggregates on the entire DataFrame.

Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.collector.Collector
init, schema, setDirectoryStream, setInputStream

Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.Operator
getName, getOperatorContext

Methods inherited from interface org.apache.spark.sql.api.java.UDF0
call
Method Details
agg
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> agg(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df)
Aggregates on the entire DataFrame.
Parameters:
df - DataFrame returned from Collector call.
Returns:
A new DataFrame produced by aggregating the input DataFrame.
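Example (illustrative sketch only):
The order described above, call first and then agg over the whole result, can be modeled in plain Java without a Spark dependency. Everything in this sketch is a hypothetical stand-in and not part of the SeqsLab or Spark API: SketchRow and List<SketchRow> take the place of Row and Dataset<Row>, SketchCollector and SketchSupportsAggregation take the place of Collector and SupportsAggregation, and the instanceof check is only an assumption about how a runtime might detect the mix-in. A real implementation would presumably express the same per-sample sum with Dataset operations inside agg.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one DataFrame row with columns (sample, reads).
// The real interface works on org.apache.spark.sql.Dataset<Row>.
final class SketchRow {
    final String sample;
    final long reads;

    SketchRow(String sample, long reads) {
        this.sample = sample;
        this.reads = reads;
    }
}

// Hypothetical stand-in for Collector; only call() is modeled here.
interface SketchCollector {
    List<SketchRow> call();
}

// Hypothetical stand-in for the SupportsAggregation mix-in.
interface SketchSupportsAggregation extends SketchCollector {
    List<SketchRow> agg(List<SketchRow> df);
}

// A collector that emits per-record read counts and can sum them per sample.
final class ReadCountCollector implements SketchSupportsAggregation {
    @Override
    public List<SketchRow> call() {
        List<SketchRow> rows = new ArrayList<>();
        rows.add(new SketchRow("s1", 10));
        rows.add(new SketchRow("s2", 5));
        rows.add(new SketchRow("s1", 7));
        return rows;
    }

    @Override
    public List<SketchRow> agg(List<SketchRow> df) {
        // Sum reads per sample, keeping first-seen sample order.
        Map<String, Long> totals = new LinkedHashMap<>();
        for (SketchRow row : df) {
            totals.merge(row.sample, row.reads, Long::sum);
        }
        List<SketchRow> out = new ArrayList<>();
        for (Map.Entry<String, Long> entry : totals.entrySet()) {
            out.add(new SketchRow(entry.getKey(), entry.getValue()));
        }
        return out;
    }
}

final class AggregationFlowSketch {
    // Models the documented order: call() runs first, then agg() is applied
    // to the full result if the collector supports aggregation.
    static List<SketchRow> collect(SketchCollector collector) {
        List<SketchRow> df = collector.call();
        if (collector instanceof SketchSupportsAggregation) {
            df = ((SketchSupportsAggregation) collector).agg(df);
        }
        return df;
    }

    public static void main(String[] args) {
        for (SketchRow row : collect(new ReadCountCollector())) {
            System.out.println(row.sample + "\t" + row.reads);
        }
    }
}
```

A collector that implements only the plain stand-in interface passes through collect unchanged, which mirrors the mix-in being optional.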