Interface SupportsAggregation
All Superinterfaces:
AutoCloseable, Closeable, Collector, Operator, OperatorPipelineV3, Serializable, org.apache.spark.sql.api.java.UDF0<Iterator<org.apache.spark.sql.Row>>
A mix-in interface for Collector. Command output collectors can implement this interface to compute aggregates and return the aggregated results as a DataFrame.
When a collector supports aggregation, SeqsLab calls the instance method agg() to aggregate over the entire DataFrame. This call is made after the Collector's call function has been applied.
Method Summary
Modifier and Type: org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>
Method: agg(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df)
Description: Aggregates on the entire DataFrame.
Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.collector.Collector
init, schema, setDirectoryStream, setInputStream
Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.Operator
getName, getOperatorContext
Methods inherited from interface org.apache.spark.sql.api.java.UDF0
call
Method Details
agg
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> agg(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df)
Aggregates on the entire DataFrame.
Parameters:
df - DataFrame returned from the Collector's call function.
Returns:
A new DataFrame produced by aggregating the input DataFrame.
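The ordering described for this interface (the Collector's call function runs first, then agg() runs once over the whole result) can be modeled with a small sketch. This is not the SeqsLab or Spark API: plain Java collections stand in for Dataset<Row> so that the example runs without Spark on the classpath. The class name AggregationSketch, the column names sample, reads, and total_reads, the tab-separated input, and the idea that call is applied once per command output are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in model only: a "row" is a Map and a "DataFrame" is a List of rows.
// None of these names come from the SeqsLab or Spark APIs.
final class AggregationSketch {

    // Models the Collector's call function: parses one command output into rows.
    // Assumed input format: "<sample>\t<reads>" per line.
    static Iterator<Map<String, Object>> call(List<String> outputLines) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (String line : outputLines) {
            String[] parts = line.split("\t");
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("sample", parts[0]);
            row.put("reads", Long.parseLong(parts[1]));
            rows.add(row);
        }
        return rows.iterator();
    }

    // Models agg(df): receives the entire collected result and returns
    // a new result with one row per sample holding the summed read count.
    static List<Map<String, Object>> agg(List<Map<String, Object>> df) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Map<String, Object> row : df) {
            totals.merge((String) row.get("sample"), (Long) row.get("reads"), Long::sum);
        }
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("sample", e.getKey());
            row.put("total_reads", e.getValue());
            out.add(row);
        }
        return out;
    }

    // Models the ordering: call is applied to every output first,
    // then agg is invoked once on everything that was collected.
    static List<Map<String, Object>> run(List<List<String>> outputs) {
        List<Map<String, Object>> df = new ArrayList<>();
        for (List<String> output : outputs) {
            call(output).forEachRemaining(df::add);
        }
        return agg(df);
    }

    public static void main(String[] args) {
        List<List<String>> outputs = List.of(
            List.of("s1\t10", "s2\t5"),
            List.of("s1\t7"));
        System.out.println(run(outputs));
    }
}
```

In an actual implementation the body of agg would presumably be written with Spark Dataset operations instead, for example df.groupBy("sample").sum("reads"), and would return the resulting Dataset<Row>.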