Interface SupportsOrdering

All Superinterfaces:
AutoCloseable, Closeable, Operator, OperatorPipelineV3, Serializable, Transformer, org.apache.spark.sql.api.java.UDF1<org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>,org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>>

@DeveloperApi @FeatureAfterCall public interface SupportsOrdering extends Transformer
A mix-in interface for Transformer. A Dataset transformer can implement this interface to declare the sort expressions that ordering-aware processing depends on. SeqsLab then ensures that the sort order within every partition of the DataFrame is preserved in the downstream localization process.
  • Method Summary

    Modifier and Type
    Method
    Description
    org.apache.spark.sql.Column[]
    getSortExprs(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df)
    Gets the sort expressions applied within each partition, e.g. df.col("seq_id").asc().

    Methods inherited from interface java.io.Closeable

    close

    Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.Operator

    getName, getOperatorContext

    Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.transformer.Transformer

    init, numPartitions

    Methods inherited from interface org.apache.spark.sql.api.java.UDF1

    call
  • Method Details

    • getSortExprs

      org.apache.spark.sql.Column[] getSortExprs(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df)
      Gets the sort expressions applied within each partition, e.g. df.col("seq_id").asc().
      Returns:
      An array of Column sort expressions
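
The phrase "within each partition" can be illustrated with a small plain-Java model. This is a minimal sketch that does not use Spark or the SeqsLab API: the Read class, the pos field, and the sortWithinPartitions helper are hypothetical stand-ins, and the comparator plays the role that an array such as { df.col("seq_id").asc(), df.col("pos").asc() } might play in a real getSortExprs implementation. How SeqsLab applies the returned expressions internally is not stated in this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PartitionOrderingSketch {

    // Hypothetical row type standing in for a DataFrame Row.
    public static final class Read {
        public final String seqId;
        public final int pos;

        public Read(String seqId, int pos) {
            this.seqId = seqId;
            this.pos = pos;
        }

        @Override
        public String toString() {
            return seqId + ":" + pos;
        }
    }

    // Plain-Java analogue of two ascending sort expressions:
    // first by seqId, then by pos.
    public static final Comparator<Read> SORT_EXPRS =
            Comparator.comparing((Read r) -> r.seqId).thenComparingInt(r -> r.pos);

    // Sorts every partition independently. Rows never move between
    // partitions, so no global order across partitions is implied.
    public static List<List<Read>> sortWithinPartitions(List<List<Read>> partitions) {
        List<List<Read>> result = new ArrayList<>();
        for (List<Read> partition : partitions) {
            List<Read> sorted = new ArrayList<>(partition);
            sorted.sort(SORT_EXPRS);
            result.add(sorted);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Read>> partitions = Arrays.asList(
                Arrays.asList(new Read("chr2", 50), new Read("chr1", 900), new Read("chr1", 10)),
                Arrays.asList(new Read("chr1", 500), new Read("chr1", 20)));

        // Print each partition after its local sort.
        for (List<Read> partition : sortWithinPartitions(partitions)) {
            System.out.println(partition);
        }
    }
}
```

In this model each partition ends up ordered by seqId and then pos, while the concatenation of the partitions is not globally sorted. That is the distinction between a per-partition sort and a full sort of the DataFrame.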