Interface SupportsAggregation

All Superinterfaces:
AutoCloseable, Closeable, Collector, Operator, OperatorPipelineV3, Serializable, org.apache.spark.sql.api.java.UDF0<Iterator<org.apache.spark.sql.Row>>

@DeveloperApi @FeatureAfterCall public interface SupportsAggregation extends Collector
A mix-in interface for Collector. Command output collectors can implement this interface to compute aggregates and return the aggregated results as a DataFrame. When a collector supports aggregation, SeqsLab calls the instance method agg() to aggregate over the entire DataFrame. This method is invoked after the Collector's call function has been applied.
  • Method Summary

    Modifier and Type
    Method
    Description
    org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>
    agg(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df)
    Aggregates over the entire DataFrame.

    Methods inherited from interface java.io.Closeable

    close

    Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.collector.Collector

    init, schema, setDirectoryStream, setInputStream

    Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.Operator

    getName, getOperatorContext

    Methods inherited from interface org.apache.spark.sql.api.java.UDF0

    call
  • Method Details

    • agg

      org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> agg(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df)
      Aggregates over the entire DataFrame.
      Parameters:
      df - the DataFrame produced by the Collector's call.
      Returns:
      A new DataFrame produced by aggregating the input DataFrame.
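
Example (illustrative sketch):
The call-then-agg ordering described for this interface can be modeled in plain Java. This is a minimal sketch, not the SeqsLab implementation: SketchCollector, SketchSupportsAggregation, ReadCountCollector, run, and row are hypothetical names, and a List of Maps stands in for org.apache.spark.sql.Dataset<Row> so that the example runs without Spark or SeqsLab on the classpath. How SeqsLab actually builds the DataFrame passed to agg() is not specified on this page; the driver below is an assumption.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-ins: a "row" is a Map and a "DataFrame" is a List of rows.
// The real interface operates on org.apache.spark.sql.Dataset<Row> instead.
final class AggregationSketch {

    /** Hypothetical stand-in for Collector: call() yields the collected rows. */
    interface SketchCollector {
        Iterator<Map<String, Object>> call();
    }

    /** Hypothetical stand-in for SupportsAggregation: agg() receives the whole frame. */
    interface SketchSupportsAggregation extends SketchCollector {
        List<Map<String, Object>> agg(List<Map<String, Object>> df);
    }

    /** Assumed driver logic: apply call() first, then agg() only if the mix-in is present. */
    static List<Map<String, Object>> run(SketchCollector collector) {
        List<Map<String, Object>> df = new ArrayList<>();
        collector.call().forEachRemaining(df::add);
        if (collector instanceof SketchSupportsAggregation) {
            return ((SketchSupportsAggregation) collector).agg(df);
        }
        return df;
    }

    /** Example collector: emits (sample, reads) rows and sums reads per sample. */
    static final class ReadCountCollector implements SketchSupportsAggregation {
        @Override
        public Iterator<Map<String, Object>> call() {
            List<Map<String, Object>> rows = new ArrayList<>();
            rows.add(row("s1", 10L));
            rows.add(row("s2", 5L));
            rows.add(row("s1", 7L));
            return rows.iterator();
        }

        @Override
        public List<Map<String, Object>> agg(List<Map<String, Object>> df) {
            // Sum the "reads" column per "sample", keeping first-seen sample order.
            Map<String, Long> totals = new LinkedHashMap<>();
            for (Map<String, Object> r : df) {
                totals.merge((String) r.get("sample"), (Long) r.get("reads"), Long::sum);
            }
            List<Map<String, Object>> out = new ArrayList<>();
            totals.forEach((sample, reads) -> out.add(row(sample, reads)));
            return out;
        }
    }

    /** Builds one row with the two columns used in this sketch. */
    static Map<String, Object> row(String sample, long reads) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("sample", sample);
        r.put("reads", reads);
        return r;
    }
}
```

Calling AggregationSketch.run(new AggregationSketch.ReadCountCollector()) collects three rows and returns two aggregated rows, one per sample. In a real implementation the body of agg() would typically use Spark's own API on the incoming Dataset<Row>, for example df.groupBy("sample").agg(functions.sum("reads")), where the column names are again only illustrative.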