All Superinterfaces:
AutoCloseable, Closeable, Operator, OperatorPipelineV3, Serializable, org.apache.spark.sql.api.java.UDF2<org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>,Boolean,Void>
All Known Subinterfaces:
SupportsSaveToBLOB, SupportsSaveToHTTP, SupportsSaveToJDBC

@DeveloperApi public interface Writer extends Operator, org.apache.spark.sql.api.java.UDF2<org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>,Boolean,Void>
The operator responsible for delocalizing (saving) a command's output DataFrame to storage or a repository, for example a cloud file system, an HTTPS repository, or a JDBC database. A Writer is a user-defined function (UDF) that takes two arguments: the DataFrame to save and a boolean flag that is true when the input DataFrame has been aggregated. A Writer object is created by the Writer operator factory (an implementation of WriterSupport) when a pipeline task requests it, and it is lazily initialized once the DataFrame is ready to be delocalized. When writing completes, the close method is invoked to release resources. SeqsLab supports several data-writing features for managing and optimizing workloads; an operator declares the features it supports by implementing the corresponding mix-in interfaces.
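The lifecycle described above can be sketched roughly as follows. This is a minimal illustration only, not the SeqsLab implementation: `SketchWriter`, `SketchSupportsSaveToBlob`, and `RecordingWriter` are hypothetical stand-ins for `Writer`, a mix-in interface such as `SupportsSaveToBLOB`, and a concrete plugin, and a `List<String>` stands in for the Spark DataFrame. The sketch also assumes that `getDataSource` is read before `close` is invoked, which the description does not state explicitly.

```java
import java.util.ArrayList;
import java.util.List;

public class WriterLifecycleSketch {

    // Hypothetical stand-in for Writer. The real interface receives a Spark
    // Dataset<Row> and a Boolean, and its call method returns Void.
    interface SketchWriter extends AutoCloseable {
        SketchWriter init();
        void call(List<String> rows, boolean aggregated);
        String getDataSource();
        @Override
        void close();
    }

    // Hypothetical mix-in marker, playing the role of SupportsSaveToBLOB.
    interface SketchSupportsSaveToBlob {}

    // Toy writer that records lifecycle events instead of touching storage.
    static class RecordingWriter implements SketchWriter, SketchSupportsSaveToBlob {
        private final List<String> events;

        RecordingWriter(List<String> events) {
            this.events = events;
            events.add("created");
        }

        @Override
        public SketchWriter init() {
            events.add("init");
            return this; // init returns the object itself
        }

        @Override
        public void call(List<String> rows, boolean aggregated) {
            events.add("call rows=" + rows.size() + " aggregated=" + aggregated);
        }

        @Override
        public String getDataSource() {
            events.add("getDataSource");
            return "blob://example/output"; // made-up location
        }

        @Override
        public void close() {
            events.add("close");
        }
    }

    public static List<String> runLifecycle() {
        List<String> events = new ArrayList<>();

        // 1. The factory creates the writer when the pipeline task requests it.
        SketchWriter writer = new RecordingWriter(events);

        // 2. The host can detect supported features through mix-in interfaces.
        if (writer instanceof SketchSupportsSaveToBlob) {
            events.add("feature: blob");
        }

        // 3. Once the output is ready, the writer is initialized and invoked;
        //    try-with-resources calls close() when the block ends.
        List<String> rows = List.of("r1", "r2");
        try (SketchWriter w = writer.init()) {
            w.call(rows, false);
            w.getDataSource();
        }
        return events;
    }

    public static void main(String[] args) {
        runLifecycle().forEach(System.out::println);
    }
}
```

Recording events in a list keeps the ordering visible without requiring Spark or the SeqsLab plugin API on the classpath.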
  • Method Summary

    Modifier and Type    Method             Description
    DataSource           getDataSource()    Obtains the data access information after the output DataFrame is successfully saved.
    Writer               init()             Initializes this writer operator.

    Methods inherited from interface java.io.Closeable

    close

    Methods inherited from interface com.atgenomix.seqslab.piper.plugin.api.Operator

    getName, getOperatorContext

    Methods inherited from interface org.apache.spark.sql.api.java.UDF2

    call
  • Method Details

    • init

      Writer init()
      Initializes this writer operator.
      Returns:
      The object itself
    • getDataSource

      DataSource getDataSource()
      Obtains the data access information after the output DataFrame has been successfully saved. The returned object is registered in SeqsLab Data Hub and used as input for subsequent tasks.
      Returns:
      A DataSource object
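
From an implementer's point of view, the two methods above might be handled along these lines. This is a sketch under stated assumptions: `SketchWriter` and `SketchDataSource` are hypothetical stand-ins for `Writer` and `DataSource`, the `save` method stands in for the inherited `call`, and throwing `IllegalStateException` when `getDataSource` is called before a successful save is a design choice made for this example, not documented behavior.

```java
public class DataSourceAfterSaveSketch {

    // Hypothetical stand-in for DataSource: only carries a location string.
    static final class SketchDataSource {
        private final String uri;

        SketchDataSource(String uri) {
            this.uri = uri;
        }

        String uri() {
            return uri;
        }
    }

    // Hypothetical stand-in for a Writer implementation.
    static final class SketchWriter {
        private boolean initialized;
        private SketchDataSource saved;

        // Mirrors init(): prepares the writer and returns the object itself.
        SketchWriter init() {
            initialized = true;
            return this;
        }

        // Stand-in for call(): pretends to save and remembers where.
        void save(String uri) {
            if (!initialized) {
                throw new IllegalStateException("init() has not been called");
            }
            saved = new SketchDataSource(uri);
        }

        // Mirrors getDataSource(): only meaningful after a successful save.
        // Throwing here is an assumption of this sketch.
        SketchDataSource getDataSource() {
            if (saved == null) {
                throw new IllegalStateException("output has not been saved yet");
            }
            return saved;
        }
    }

    // Happy path: init, save, then read the data source.
    public static String demo() {
        SketchWriter writer = new SketchWriter().init();
        writer.save("jdbc:example://host/table"); // made-up location
        return writer.getDataSource().uri();
    }

    // Returns true when reading the data source before saving is rejected.
    public static boolean failsBeforeSave() {
        try {
            new SketchWriter().init().getDataSource();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
        System.out.println(failsBeforeSave());
    }
}
```

Because `init` returns the writer, creation and initialization can be chained in one expression, as `demo` does.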