Interface SupportsReadPartitions

All Superinterfaces:
AutoCloseable, Closeable, Loader, Operator, OperatorPipelineV3, Serializable, org.apache.spark.sql.api.java.UDF0<Iterator<org.apache.spark.sql.Row>>

@DeveloperApi @FeatureBeforeCall public interface SupportsReadPartitions extends Loader
A mix-in interface for Loader. Dataset loaders can implement this interface to support concurrent, partition-aware dataset reading. SeqsLab runs the Loader in parallel and ensures that, during the localization process, the same data partition from each of multiple input data sources is handled in the same command execution.
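
The alignment described above can be illustrated with a minimal, self-contained sketch. It does not use the SeqsLab API: PartitionReader, InMemoryReader, read() and runAligned are hypothetical stand-ins, and the sketch assumes zero-based partition ids and that every source reports the same partition count. It only shows the idea of giving each partition id its own task and reading that partition from every source inside the task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PartitionAlignmentSketch {

    // Hypothetical stand-in mirroring the two methods of SupportsReadPartitions.
    // The real interface also inherits init, readSchema, call, close, and others.
    interface PartitionReader {
        int numPartitions();
        void setPartitionId(int partitionId);
        List<String> read(); // hypothetical; not part of the documented API
    }

    // Hypothetical in-memory source: one list of records per partition.
    static final class InMemoryReader implements PartitionReader {
        private final List<List<String>> partitions;
        private int partitionId = 0; // assumption: ids are zero-based

        InMemoryReader(List<List<String>> partitions) {
            this.partitions = partitions;
        }

        @Override public int numPartitions() { return partitions.size(); }
        @Override public void setPartitionId(int partitionId) { this.partitionId = partitionId; }
        @Override public List<String> read() { return partitions.get(partitionId); }
    }

    // One task per partition id; each task reads that partition from every source.
    static List<List<String>> runAligned(List<List<List<String>>> sources) {
        int n = new InMemoryReader(sources.get(0)).numPartitions();
        for (List<List<String>> src : sources) {
            if (new InMemoryReader(src).numPartitions() != n) {
                throw new IllegalArgumentException("sources disagree on partition count");
            }
        }
        return IntStream.range(0, n).parallel().mapToObj(pid -> {
            List<String> task = new ArrayList<>();
            for (List<List<String>> src : sources) {
                PartitionReader reader = new InMemoryReader(src); // one reader per task
                reader.setPartitionId(pid);
                task.addAll(reader.read());
            }
            return task;
        }).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> reads = List.of(List.of("r-chr1"), List.of("r-chr2"));
        List<List<String>> variants = List.of(List.of("v-chr1"), List.of("v-chr2"));
        // prints [[r-chr1, v-chr1], [r-chr2, v-chr2]]
        System.out.println(runAligned(List.of(reads, variants)));
    }
}
```

Each task creates its own reader instead of sharing one, because setPartitionId mutates the reader's state; whether SeqsLab instantiates loaders this way is not stated in this page.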
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    numPartitions()
    Get the number of partitions in the data source.
    void
    setPartitionId(int partitionId)
    Set the target partition that this loader is responsible for loading from the input data source.

    Methods inherited from interface java.io.Closeable

    close

    Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.loader.Loader

    init, readSchema

    Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.Operator

    getName, getOperatorContext

    Methods inherited from interface org.apache.spark.sql.api.java.UDF0

    call
  • Method Details

    • numPartitions

      int numPartitions()
      Get the number of partitions in the data source.
      Returns:
      Number of partitions
    • setPartitionId

      void setPartitionId(int partitionId)
      Set the target partition that this loader is responsible for loading from the input data source.
      Parameters:
      partitionId - Partition identifier as an integer
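
As a rough illustration of how the two methods might fit together, the following sketch splits an in-memory list into contiguous ranges. It is not an implementation of the real interface: RangePartitionSketch and readPartition() are hypothetical, and the zero-based ids and the IllegalArgumentException on an out-of-range id are assumptions that this page does not specify.

```java
import java.util.List;

public class RangePartitionSketch {
    private final List<Integer> records;
    private final int numPartitions;
    private int partitionId = 0; // assumption: ids run from 0 to numPartitions - 1

    public RangePartitionSketch(List<Integer> records, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.records = records;
        this.numPartitions = numPartitions;
    }

    // Same signature as SupportsReadPartitions.numPartitions().
    public int numPartitions() {
        return numPartitions;
    }

    // Same signature as SupportsReadPartitions.setPartitionId(int).
    // Rejecting out-of-range ids is an assumption, not documented behavior.
    public void setPartitionId(int partitionId) {
        if (partitionId < 0 || partitionId >= numPartitions) {
            throw new IllegalArgumentException("partitionId out of range: " + partitionId);
        }
        this.partitionId = partitionId;
    }

    // Hypothetical helper: returns the contiguous slice owned by the current partition.
    public List<Integer> readPartition() {
        int size = records.size();
        int start = (int) ((long) partitionId * size / numPartitions);
        int end = (int) ((long) (partitionId + 1) * size / numPartitions);
        return records.subList(start, end);
    }

    public static void main(String[] args) {
        RangePartitionSketch loader =
                new RangePartitionSketch(List.of(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), 3);
        for (int pid = 0; pid < loader.numPartitions(); pid++) {
            loader.setPartitionId(pid);
            System.out.println(pid + " -> " + loader.readPartition());
        }
    }
}
```

With 10 records and 3 partitions, the slices are [0, 1, 2], [3, 4, 5] and [6, 7, 8, 9]; every record belongs to exactly one partition, and the last partition absorbs the remainder.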