Index
All Classes and Interfaces|All Packages|Serialized Form
A
- agg(Dataset<Row>) - Method in interface com.atgenomix.seqslab.piper.plugin.api.collector.SupportsAggregation
-
Aggregates on the entire DataFrame.
- arguments() - Element in annotation interface com.atgenomix.seqslab.piper.tags.Dimension
- ARRAY - Enum constant in enum class com.atgenomix.seqslab.piper.plugin.api.PiperValue.Type
B
- BOOLEAN - Enum constant in enum class com.atgenomix.seqslab.piper.plugin.api.PiperValue.Type
C
- Collector - Interface in com.atgenomix.seqslab.piper.plugin.api.collector
-
The operator responsible for retrieving command outputs from local file system and returning a DataFrame.
- CollectorSupport - Interface in com.atgenomix.seqslab.piper.plugin.api.collector
-
Factory object responsible for creating Collector operators.
- com.atgenomix.seqslab.piper.plugin.api - package com.atgenomix.seqslab.piper.plugin.api
- com.atgenomix.seqslab.piper.plugin.api.collector - package com.atgenomix.seqslab.piper.plugin.api.collector
- com.atgenomix.seqslab.piper.plugin.api.executor - package com.atgenomix.seqslab.piper.plugin.api.executor
- com.atgenomix.seqslab.piper.plugin.api.formatter - package com.atgenomix.seqslab.piper.plugin.api.formatter
- com.atgenomix.seqslab.piper.plugin.api.loader - package com.atgenomix.seqslab.piper.plugin.api.loader
- com.atgenomix.seqslab.piper.plugin.api.transformer - package com.atgenomix.seqslab.piper.plugin.api.transformer
- com.atgenomix.seqslab.piper.plugin.api.writer - package com.atgenomix.seqslab.piper.plugin.api.writer
- com.atgenomix.seqslab.piper.tags - package com.atgenomix.seqslab.piper.tags
- configurations() - Element in annotation interface com.atgenomix.seqslab.piper.tags.Dimension
- createCollector(PluginContext, OperatorContext) - Method in interface com.atgenomix.seqslab.piper.plugin.api.collector.CollectorSupport
-
Returns a collector operator with specific plugin context and operator context that is created and initialized by SeqsLab workflow engine.
- createExecutor(PluginContext, OperatorContext) - Method in interface com.atgenomix.seqslab.piper.plugin.api.executor.ExecutorSupport
-
Returns an executor operator with specific plugin context and operator context that is created and initialized by SeqsLab workflow engine.
- createFormatter(PluginContext, OperatorContext) - Method in interface com.atgenomix.seqslab.piper.plugin.api.formatter.FormatterSupport
-
Returns a formatter operator with specific plugin context and operator context that is created and initialized by SeqsLab workflow engine.
- createLoader(PluginContext, OperatorContext) - Method in interface com.atgenomix.seqslab.piper.plugin.api.loader.LoaderSupport
-
Returns a loader operator with specific plugin context and operator context that is created and initialized by SeqsLab workflow engine.
- createTransformer(PluginContext, OperatorContext) - Method in interface com.atgenomix.seqslab.piper.plugin.api.transformer.TransformerSupport
-
Returns a transformer operator with specific plugin context and operator context that is created and initialized by SeqsLab workflow engine.
- createWriter(PluginContext, OperatorContext) - Method in interface com.atgenomix.seqslab.piper.plugin.api.writer.WriterSupport
-
Returns a writer operator with specific plugin context and operator context that is created and initialized by SeqsLab workflow engine.
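The `create*` entries above all share one factory shape: a `*Support` object receives a plugin context and an operator context and returns the operator. The following is a minimal sketch of that shape only. Every type in it (`SketchPluginContext`, `SketchOperatorContext`, `SketchWriter`, `SketchWriterSupport`) is a hypothetical stand-in rather than the SeqsLab API, and the `"name"` property key is an assumption made for illustration.

```java
import java.util.Collections;
import java.util.Map;

final class FactorySketch {
    // Hypothetical stand-in for a plugin-level context.
    static final class SketchPluginContext {
        final Map<String, Object> properties;
        SketchPluginContext(Map<String, Object> properties) { this.properties = properties; }
    }

    // Hypothetical stand-in for a per-operator context.
    static final class SketchOperatorContext {
        final Map<String, Object> properties;
        SketchOperatorContext(Map<String, Object> properties) { this.properties = properties; }
    }

    // Hypothetical stand-in operator exposing only a name.
    interface SketchWriter {
        String getName();
    }

    // Factory with the same two-argument shape as the create* methods listed above.
    interface SketchWriterSupport {
        SketchWriter createWriter(SketchPluginContext plugin, SketchOperatorContext operator);
    }

    // Builds a writer whose name comes from the operator context ("name" is an assumed key).
    static final class NamedWriterSupport implements SketchWriterSupport {
        @Override
        public SketchWriter createWriter(SketchPluginContext plugin, SketchOperatorContext operator) {
            String resolved = String.valueOf(operator.properties.getOrDefault("name", "unnamed"));
            return () -> resolved;
        }
    }

    public static void main(String[] args) {
        SketchWriterSupport support = new NamedWriterSupport();
        SketchWriter writer = support.createWriter(
                new SketchPluginContext(Collections.<String, Object>emptyMap()),
                new SketchOperatorContext(Collections.<String, Object>singletonMap("name", "csv-writer")));
        System.out.println(writer.getName());
    }
}
```

Keeping construction behind a factory lets the engine decide when operators are created and which contexts they receive.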
D
- DATAFRAME - Enum constant in enum class com.atgenomix.seqslab.piper.plugin.api.PiperValue.Type
- DataSource - Class in com.atgenomix.seqslab.piper.plugin.api
-
The main interface responsible for representing an accessible data source in SeqsLab.
- DataSource() - Constructor for class com.atgenomix.seqslab.piper.plugin.api.DataSource
- DeveloperApi - Annotation Interface in com.atgenomix.seqslab.piper.tags
-
An API intended for developers.
- DICTIONARY - Enum constant in enum class com.atgenomix.seqslab.piper.plugin.api.PiperValue.Type
- Dimension - Annotation Interface in com.atgenomix.seqslab.piper.tags
-
Indicates the computing resource requirements, types of workload and pipeline structures that an operator supports.
- Dimensions - Annotation Interface in com.atgenomix.seqslab.piper.tags
-
Indicates the computing resource requirements, types of workload and pipeline structures that an operator supports.
- DOUBLE - Enum constant in enum class com.atgenomix.seqslab.piper.plugin.api.PiperValue.Type
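The pairing of `Dimension` with a `Dimensions` annotation that has a `value()` element matches Java's container pattern for repeatable annotations, although this index does not confirm that `Dimension` is declared `@Repeatable`. The sketch below shows that pattern with hypothetical stand-ins (`SketchDimension`, `SketchDimensions`). The element types and defaults are assumptions, and only three of the documented elements are modelled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class DimensionSketch {
    // Reads every SketchDimension on a type, whether declared once or repeated.
    static SketchDimension[] dimensionsOf(Class<?> type) {
        return type.getAnnotationsByType(SketchDimension.class);
    }

    public static void main(String[] args) {
        for (SketchDimension d : dimensionsOf(AnnotatedOperator.class)) {
            System.out.println(d.minMemoryPerCore() + " GB per core, GPU=" + d.runOnGPU());
        }
    }
}

// Hypothetical stand-in for Dimension; element types and defaults are assumed.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@Repeatable(SketchDimensions.class)
@interface SketchDimension {
    int minMemoryPerCore() default 1;      // GB per CPU core (assumed to be an int)
    int parallelizationFactor() default 1; // typical partition count (assumed to be an int)
    boolean runOnGPU() default false;
}

// Hypothetical stand-in for Dimensions, the container annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface SketchDimensions {
    SketchDimension[] value();
}

// An operator class declaring two alternative resource profiles.
@SketchDimension(minMemoryPerCore = 4)
@SketchDimension(minMemoryPerCore = 16, parallelizationFactor = 64, runOnGPU = true)
final class AnnotatedOperator { }
```

`getAnnotationsByType` looks through the container annotation, so callers do not need to handle the single and repeated cases separately.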
E
- Executor - Interface in com.atgenomix.seqslab.piper.plugin.api.executor
-
The operator responsible for preprocessing (localizing) the input DataFrame, either as a managed table for a Spark SQL command or as local files for shell script execution.
- ExecutorSupport - Interface in com.atgenomix.seqslab.piper.plugin.api.executor
-
Factory object responsible for creating Executor operators.
F
- FeatureAfterCall - Annotation Interface in com.atgenomix.seqslab.piper.tags
-
An annotation that indicates pipeline operator features to be checked and invoked after calling the operator function.
- FeatureBeforeCall - Annotation Interface in com.atgenomix.seqslab.piper.tags
-
An annotation that indicates pipeline operator features to be checked and invoked before calling the operator function.
- FILE - Enum constant in enum class com.atgenomix.seqslab.piper.plugin.api.PiperValue.Type
- Formatter - Interface in com.atgenomix.seqslab.piper.plugin.api.formatter
-
An operator responsible for formatting input datasets, such as converting schemas, adding or deleting columns, and encoding domain-specific objects.
- FormatterSupport - Interface in com.atgenomix.seqslab.piper.plugin.api.formatter
-
Factory object responsible for creating Formatter operators.
G
- get(String) - Method in class com.atgenomix.seqslab.piper.plugin.api.OperatorContext
-
Get the property value
- getAccessTier() - Method in class com.atgenomix.seqslab.piper.plugin.api.DataSource
-
Retrieve the access tier of the data source, e.g.
- getArray() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperValue
-
Retrieve the value as an array of PiperValue.
- getBoolean() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperValue
-
Retrieve the value as a Boolean.
- getConf() - Method in class com.atgenomix.seqslab.piper.plugin.api.PiperContext
-
Get the information of execution environment and configuration.
- getDataframe() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperValue
-
Retrieve the value as a Pair of URL and its Spark Dataframe.
- getDataset(String) - Method in class com.atgenomix.seqslab.piper.plugin.api.PiperContext
-
Obtain the dataset object value of an input variable identified by fully-qualified name.
- getDatasets() - Method in interface com.atgenomix.seqslab.piper.plugin.api.Pipeline
-
Get the settings of File inputs.
- getDataSource() - Method in interface com.atgenomix.seqslab.piper.plugin.api.writer.Writer
-
Obtains the data access information after the output DataFrame is successfully saved.
- getDictionary() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperValue
-
Retrieve the value as a list of NamedValue objects with key as a String.
- getDouble() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperValue
-
Retrieve the value as a Double.
- getFile() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperValue
-
Retrieve the value as a Pair of path and SeqsLab DRS object.
- getGroupExprs(Dataset<Row>) - Method in interface com.atgenomix.seqslab.piper.plugin.api.transformer.SupportsGroupWithinPartitions
-
Get the grouping expressions as a list of Column, e.g. df.col("group_id").
- getInteger() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperValue
-
Retrieve the value as an Integer.
- getName() - Method in interface com.atgenomix.seqslab.piper.plugin.api.Operator
-
Get the operator name that is used to uniquely specify the operator configuration of task operator pipelines.
- getOperatorContext() - Method in interface com.atgenomix.seqslab.piper.plugin.api.Operator
-
Get the OperatorContext containing a list of properties in the form of NamedValue objects associated with this operator object.
- getOptions() - Method in class com.atgenomix.seqslab.piper.plugin.api.DataSource
-
Retrieve the access options associated with this data source, e.g.
- getProperties() - Method in interface com.atgenomix.seqslab.piper.plugin.api.executor.SupportsTableLocalization
-
Retrieves one or more NamedValue properties to set into the delta table.
- getProperties() - Method in class com.atgenomix.seqslab.piper.plugin.api.OperatorContext
-
Get the properties
- getProperty(String) - Method in class com.atgenomix.seqslab.piper.plugin.api.PluginContext
-
Calls the Hashtable method get.
- getRegion() - Method in class com.atgenomix.seqslab.piper.plugin.api.DataSource
-
Retrieve the data center region, from which the data source will be accessed.
- getSortExprs(Dataset<Row>) - Method in interface com.atgenomix.seqslab.piper.plugin.api.transformer.SupportsOrdering
-
Get the sorting expressions within each partition, e.g. df.col("seq_id").asc().
- getString() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperValue
-
Retrieve the value as a String.
- getTask(String) - Method in class com.atgenomix.seqslab.piper.plugin.api.PiperContext
-
Obtain the pipeline task of an input or output variable identified by fully-qualified name.
- getTasks() - Method in interface com.atgenomix.seqslab.piper.plugin.api.Pipeline
-
Get the settings of a pipeline task.
- getType() - Method in class com.atgenomix.seqslab.piper.plugin.api.DataSource
-
Retrieve type of the access method to the data source, e.g.
- getType() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperValue
-
Get the concrete type of this PiperValue object.
- getUrl() - Method in class com.atgenomix.seqslab.piper.plugin.api.DataSource
-
Retrieve the access URL to the datasets.
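The `getProperty(String)` entry above, like `setProperty(String, Object)`, is described as a call to the corresponding `Hashtable` method. The sketch below illustrates what that delegation implies, using a hypothetical wrapper class (`PluginPropertiesSketch`) rather than the real `PluginContext`. The return types are assumptions. The behaviour shown is that of `java.util.Hashtable`: `get` returns `null` for a missing key, `put` returns the previous value, and `null` keys or values are rejected.

```java
import java.util.Hashtable;

final class PluginPropertiesSketch {
    // Backing table; Hashtable is synchronized and rejects null keys and values.
    private final Hashtable<String, Object> properties = new Hashtable<>();

    // Delegates to Hashtable.put and returns the previous value, or null if there was none.
    Object setProperty(String key, Object value) {
        return properties.put(key, value);
    }

    // Delegates to Hashtable.get and returns null when the key is absent.
    Object getProperty(String key) {
        return properties.get(key);
    }

    public static void main(String[] args) {
        PluginPropertiesSketch ctx = new PluginPropertiesSketch();
        ctx.setProperty("threads", 8);
        System.out.println(ctx.getProperty("threads"));
        System.out.println(ctx.getProperty("missing"));
    }
}
```

Because `Hashtable` throws `NullPointerException` on a `null` value, a caller that wants to clear a property cannot do so by setting it to `null` in this sketch.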
I
- init() - Method in interface com.atgenomix.seqslab.piper.plugin.api.executor.Executor
-
Initializes this executor operator.
- init() - Method in interface com.atgenomix.seqslab.piper.plugin.api.formatter.Formatter
-
Initializes this formatter operator.
- init() - Method in interface com.atgenomix.seqslab.piper.plugin.api.writer.Writer
-
Initializes this writer operator.
- init(boolean) - Method in interface com.atgenomix.seqslab.piper.plugin.api.collector.Collector
-
Initializes this collector operator.
- init(int, int) - Method in interface com.atgenomix.seqslab.piper.plugin.api.transformer.Transformer
-
Initializes this operator.
- init(DataSource) - Method in interface com.atgenomix.seqslab.piper.plugin.api.loader.Loader
-
Initializes this operator with a specific data source.
- init(PiperContext) - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperPlugin
-
Initialize the plugin.
- inputs - Variable in class com.atgenomix.seqslab.piper.plugin.api.OperatorContext
-
An immutable map that contains NamedValue task input variables and files.
- INTEGER - Enum constant in enum class com.atgenomix.seqslab.piper.plugin.api.PiperValue.Type
L
- Loader - Interface in com.atgenomix.seqslab.piper.plugin.api.loader
-
The operator responsible for loading (reading) a dataset into an in-memory DataFrame or copying it to the local host file system from a specific data source, e.g.
- LoaderSupport - Interface in com.atgenomix.seqslab.piper.plugin.api.loader
-
Factory object responsible for creating Loader operators.
M
- minMemoryPerCore() - Element in annotation interface com.atgenomix.seqslab.piper.tags.Dimension
-
Minimal required memory in GB per CPU core for the target workloads.
N
- numPartitions() - Method in interface com.atgenomix.seqslab.piper.plugin.api.collector.SupportsRepartitioning
-
Get the number of partitions in the collected dataframe.
- numPartitions() - Method in interface com.atgenomix.seqslab.piper.plugin.api.loader.SupportsReadPartitions
-
Get the number of partitions in the data source.
- numPartitions() - Method in interface com.atgenomix.seqslab.piper.plugin.api.transformer.Transformer
-
Get the number of partitions after repartition.
O
- Operator - Interface in com.atgenomix.seqslab.piper.plugin.api
-
An operator that processes Spark DataFrame and produces a new DataFrame.
- OperatorContext - Class in com.atgenomix.seqslab.piper.plugin.api
-
An object used in pipeline operations to represent a persistent set of properties, inputs, and outputs.
- OperatorContext(Map<String, Object>) - Constructor for class com.atgenomix.seqslab.piper.plugin.api.OperatorContext
-
Creates an operator context with the specified defaults.
- OperatorContext(Map<String, Object>, Map<String, PiperValue>) - Constructor for class com.atgenomix.seqslab.piper.plugin.api.OperatorContext
-
Creates an operator context with the specified defaults and task inputs.
- OperatorContext(Map<String, Object>, Map<String, PiperValue>, Map<String, PiperValue>) - Constructor for class com.atgenomix.seqslab.piper.plugin.api.OperatorContext
-
Creates an operator context with the specified defaults and task inputs and outputs.
- OperatorPipelineV3 - Interface in com.atgenomix.seqslab.piper.plugin.api
-
The base contract interface for SeqsLab operator pipeline v3.
- outputs - Variable in class com.atgenomix.seqslab.piper.plugin.api.OperatorContext
-
An immutable map that contains NamedValue task output variables and files.
P
- parallelizationFactor() - Element in annotation interface com.atgenomix.seqslab.piper.tags.Dimension
-
Typical number of dataframe partitions to scale the target workloads.
- Pipeline - Interface in com.atgenomix.seqslab.piper.plugin.api
-
An interface that represents the complete configuration of a workflow task, including input parameters and dataset connections as well as the localization, computation, and delocalization workload processes.
- PipelineTask - Interface in com.atgenomix.seqslab.piper.plugin.api
-
An interface that represents the localization, computation, and delocalization workload processes in a pipeline task for a given variable identified by its FQN (Fully Qualified Name).
- piper - Variable in class com.atgenomix.seqslab.piper.plugin.api.PluginContext
-
An object used in all operations of a workflow task execution.
- PiperContext - Class in com.atgenomix.seqslab.piper.plugin.api
-
An object used in all operations of a workflow task execution.
- PiperContext(SparkSession, Pipeline) - Constructor for class com.atgenomix.seqslab.piper.plugin.api.PiperContext
-
Creates a context object with the specified Spark session and pipeline configuration.
- PiperPlugin - Interface in com.atgenomix.seqslab.piper.plugin.api
-
A plugin that can be dynamically loaded into a SeqsLab piper application.
- PiperValue - Interface in com.atgenomix.seqslab.piper.plugin.api
-
An interface that represents values of primitive types, e.g.
- PiperValue.Type - Enum Class in com.atgenomix.seqslab.piper.plugin.api
-
Concrete types of all supported workflow parameters
- PluginContext - Class in com.atgenomix.seqslab.piper.plugin.api
-
An object returned when initializing this SeqsLab piper.
- PluginContext(PiperContext) - Constructor for class com.atgenomix.seqslab.piper.plugin.api.PluginContext
-
Creates a plugin context object with the SeqsLab piper context.
- properties - Variable in class com.atgenomix.seqslab.piper.plugin.api.OperatorContext
-
A property map that contains NamedValue objects including those specific to the operator and other properties set by upstream pipeline operators.
- properties - Variable in class com.atgenomix.seqslab.piper.plugin.api.PluginContext
-
A persistent set of plugin-specific properties.
R
- readSchema() - Method in interface com.atgenomix.seqslab.piper.plugin.api.loader.Loader
-
Returns the actual schema of this dataset loader, which may be different from the physical schema of the source storage, as column pruning or other optimizations may happen.
- registerCollectors() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperPlugin
-
Register dataset collecting operators published by the plugin with SeqsLab piper system.
- registerExecutors() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperPlugin
-
Register executing operators published by the plugin with SeqsLab piper system.
- registerFormatters() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperPlugin
-
Register format transformation operators published by the plugin with SeqsLab piper system.
- registerLoaders() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperPlugin
-
Register loader operators published by the plugin with SeqsLab piper system.
- registerTransformers() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperPlugin
-
Register partitioning operators published by the plugin with SeqsLab piper system.
- registerUDAFs() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperPlugin
-
Register UDAF as UserDefinedFunction published by the plugin using functions.udaf()
- registerUDFs() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperPlugin
-
Register UserDefinedFunction (UDF) published by the plugin with SeqsLab piper system.
- registerWriters() - Method in interface com.atgenomix.seqslab.piper.plugin.api.PiperPlugin
-
Register dataset saving operators published by the plugin with SeqsLab piper system.
- runOnGPU() - Element in annotation interface com.atgenomix.seqslab.piper.tags.Dimension
-
Indicates the operator implementation can run on GPU and be executed with the RAPIDS Accelerator.
S
- save(String) - Method in interface com.atgenomix.seqslab.piper.plugin.api.writer.SupportsSaveToBLOB
-
Sets the target storage path where the DataFrame will be saved.
- save(String, String, Properties) - Method in interface com.atgenomix.seqslab.piper.plugin.api.writer.SupportsSaveToJDBC
-
Sets the target database connection with which the DataFrame will be saved.
- save(String, Properties) - Method in interface com.atgenomix.seqslab.piper.plugin.api.writer.SupportsSaveToHTTP
-
Sets the target HTTP connection with which the DataFrame will be saved.
- schema() - Method in interface com.atgenomix.seqslab.piper.plugin.api.collector.Collector
-
Returns the actual schema of this collected dataset, which may be different from the physical schema of the command outputs, as column pruning or other optimizations may happen.
- select() - Method in interface com.atgenomix.seqslab.piper.plugin.api.formatter.Formatter
-
Returns a set of selected column names as an array.
- setConfiguration(Map<String, String>) - Method in interface com.atgenomix.seqslab.piper.plugin.api.loader.SupportsHadoopDFS
-
Sets Hadoop configuration
- setDirectoryStream(DirectoryStream<Path>) - Method in interface com.atgenomix.seqslab.piper.plugin.api.collector.Collector
-
When the task output to collect is in a directory, sets the object used to iterate over the entries in that output directory.
- setInputStream(FileInputStream) - Method in interface com.atgenomix.seqslab.piper.plugin.api.collector.Collector
-
Set an input stream object to obtain input bytes from a task output file.
- setLocalPath(String) - Method in interface com.atgenomix.seqslab.piper.plugin.api.executor.SupportsFileLocalization
-
Sets the local destination path which the datasets will be written to.
- setLocalPath(String) - Method in interface com.atgenomix.seqslab.piper.plugin.api.loader.SupportsCopyToLocal
-
Sets the local destination path where the datasets will be saved.
- setPartition(Iterator<Row>) - Method in interface com.atgenomix.seqslab.piper.plugin.api.loader.SupportsScanPartitions
-
Set the partition loaded by SeqsLab for applying Loader's call function.
- setPartitionId(int) - Method in interface com.atgenomix.seqslab.piper.plugin.api.loader.SupportsReadPartitions
-
Set the target partition that this loader is responsible for loading from the input data source.
- setProperties(Map<String, Object>) - Method in class com.atgenomix.seqslab.piper.plugin.api.OperatorContext
-
Set new properties
- setProperty(String, Object) - Method in class com.atgenomix.seqslab.piper.plugin.api.PluginContext
-
Calls the Hashtable method put.
- setTableName(String) - Method in interface com.atgenomix.seqslab.piper.plugin.api.executor.SupportsTableLocalization
-
Assign the table name in the metastore, using the input DataFrame's schema.
- spark - Variable in class com.atgenomix.seqslab.piper.plugin.api.PiperContext
-
An entry point for operators to work with DataFrame and Dataset.
- STRING - Enum constant in enum class com.atgenomix.seqslab.piper.plugin.api.PiperValue.Type
- SupportsAggregation - Interface in com.atgenomix.seqslab.piper.plugin.api.collector
-
A mix-in interface for Collector.
- SupportsCopyToLocal - Interface in com.atgenomix.seqslab.piper.plugin.api.loader
-
A mix-in interface for Loader.
- SupportsFileLocalization - Interface in com.atgenomix.seqslab.piper.plugin.api.executor
-
A mix-in interface for Executor.
- SupportsGroupWithinPartitions - Interface in com.atgenomix.seqslab.piper.plugin.api.transformer
-
A mix-in interface for Transformer.
- SupportsHadoopDFS - Interface in com.atgenomix.seqslab.piper.plugin.api.loader
-
A mix-in interface for Loader.
- SupportsOrdering - Interface in com.atgenomix.seqslab.piper.plugin.api.transformer
-
A mix-in interface for Transformer.
- SupportsReadPartitions - Interface in com.atgenomix.seqslab.piper.plugin.api.loader
-
A mix-in interface for Loader.
- SupportsRepartitioning - Interface in com.atgenomix.seqslab.piper.plugin.api.collector
-
A mix-in interface for Collector.
- SupportsSaveToBLOB - Interface in com.atgenomix.seqslab.piper.plugin.api.writer
-
The mix-in interface for Writer.
- SupportsSaveToHTTP - Interface in com.atgenomix.seqslab.piper.plugin.api.writer
-
The mix-in interface for Writer.
- SupportsSaveToJDBC - Interface in com.atgenomix.seqslab.piper.plugin.api.writer
-
The mix-in interface for Writer.
- SupportsScanPartitions - Interface in com.atgenomix.seqslab.piper.plugin.api.loader
-
A mix-in interface for Loader.
- SupportsTableLocalization - Interface in com.atgenomix.seqslab.piper.plugin.api.executor
-
A mix-in interface for Executor.
- SupportsUDF1<T1, R> - Interface in com.atgenomix.seqslab.piper.plugin.api.formatter
-
A mix-in interface for Formatter.
- SupportsUDF2<T1, T2, R> - Interface in com.atgenomix.seqslab.piper.plugin.api.formatter
-
A mix-in interface for Formatter.
- SupportsUDF3<T1, T2, T3, R> - Interface in com.atgenomix.seqslab.piper.plugin.api.formatter
-
A mix-in interface for Formatter.
- SupportsUDF4<T1, T2, T3, T4, R> - Interface in com.atgenomix.seqslab.piper.plugin.api.formatter
-
A mix-in interface for Formatter.
- SupportsUDF5<T1, T2, T3, T4, T5, R> - Interface in com.atgenomix.seqslab.piper.plugin.api.formatter
-
A mix-in interface for Formatter.
T
- Transformer - Interface in com.atgenomix.seqslab.piper.plugin.api.transformer
-
The operator responsible for repartitioning, and additionally sorting, DataFrames loaded by Loader to optimize downstream data processing.
- TransformerSupport - Interface in com.atgenomix.seqslab.piper.plugin.api.transformer
-
Factory object responsible for creating Transformer operators.
V
- value() - Element in annotation interface com.atgenomix.seqslab.piper.tags.Dimensions
- valueOf(String) - Static method in enum class com.atgenomix.seqslab.piper.plugin.api.PiperValue.Type
-
Returns the enum constant of this class with the specified name.
- values() - Static method in enum class com.atgenomix.seqslab.piper.plugin.api.PiperValue.Type
-
Returns an array containing the constants of this enum class, in the order they are declared.
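The `valueOf(String)` and `values()` entries above are the standard static methods that every Java enum provides. The sketch below demonstrates them on a local stand-in enum that lists the eight `PiperValue.Type` constants named in this index. The declaration order is an assumption (alphabetical here), and `parse` is a hypothetical helper, not part of the API.

```java
import java.util.Arrays;
import java.util.Optional;

final class PiperTypeSketch {
    // Stand-in enum; constant names are taken from this index, their order is assumed.
    enum Type { ARRAY, BOOLEAN, DATAFRAME, DICTIONARY, DOUBLE, FILE, INTEGER, STRING }

    // valueOf is case-sensitive and throws IllegalArgumentException for unknown names,
    // so this helper converts that failure into an empty Optional.
    static Optional<Type> parse(String name) {
        try {
            return Optional.of(Type.valueOf(name));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(Type.values()));
        System.out.println(parse("FILE").isPresent());
        System.out.println(parse("file").isPresent());
    }
}
```

Note that `values()` returns a fresh array on each call, so callers may modify the result without affecting the enum.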
W
- withColumn() - Method in interface com.atgenomix.seqslab.piper.plugin.api.formatter.Formatter
-
Returns a pair of the output column name and its data type. A new column name creates a column, and an existing column name updates that column, after this Formatter's user-defined function is called.
- workloads() - Element in annotation interface com.atgenomix.seqslab.piper.tags.Dimension
-
List of supported workloads, e.g.
- Writer - Interface in com.atgenomix.seqslab.piper.plugin.api.writer
-
The operator responsible for delocalizing/saving command output DataFrame to storage or repositories, e.g.
- WriterSupport - Interface in com.atgenomix.seqslab.piper.plugin.api.writer
-
Factory object responsible for creating Writer operators.