Class Cluster

  • All Implemented Interfaces:
    Refiner, Refiner

    public class Cluster
    extends Bundle
    A cluster refiner that builds upon the Bundle refiner. It clusters the stage-in jobs and stage-out jobs per level of the workflow. The differences from the Bundle refiner are:
            - stage-in is also clustered/bundled per level. In Bundle it was done for the
              whole workflow.
            - the keys that control the clustering (formerly called bundling) are
              cluster.stagein and cluster.stageout
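
The per-level clustering described above can be pictured with a small sketch. This is not the Pegasus implementation: the class PerLevelClusteringSketch, the StageIn type and the round-robin assignment are all assumptions made for illustration. It only shows the idea of capping the number of stage-in jobs per level rather than per workflow.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and the round-robin policy are assumptions.
final class PerLevelClusteringSketch {

    /** A file that has to be staged in for a job at some workflow level. */
    static final class StageIn {
        final String file;
        final int level;

        StageIn(String file, int level) {
            this.file = file;
            this.level = level;
        }
    }

    /**
     * Groups stage-ins by level, then spreads the files of each level
     * round-robin over at most `factor` transfer jobs.
     */
    static Map<Integer, List<List<String>>> cluster(List<StageIn> stageIns, int factor) {
        if (factor < 1) {
            throw new IllegalArgumentException("factor must be >= 1");
        }
        Map<Integer, List<List<String>>> perLevel = new TreeMap<>();
        Map<Integer, Integer> seen = new HashMap<>(); // files seen so far per level
        for (StageIn s : stageIns) {
            List<List<String>> jobs = perLevel.computeIfAbsent(s.level, l -> new ArrayList<>());
            int index = seen.merge(s.level, 1, Integer::sum) - 1;
            if (jobs.size() < factor) {
                jobs.add(new ArrayList<>()); // open a new transfer job for this level
            }
            jobs.get(index % factor).add(s.file);
        }
        return perLevel;
    }
}
```

With a factor of 2, three files at level 1 end up in two transfer jobs for that level, and a single file at level 2 gets its own job, independent of level 1.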
     
    In order to use the transfer refiner implemented by this class,
            - the property pegasus.transfer.refiner must be set to the value Cluster
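
As a minimal illustration of that setting, the sketch below builds a java.util.Properties object carrying the key and value named above. The class RefinerSelectionSketch is hypothetical, and how the planner actually loads its properties is not described on this page; in a properties file the equivalent entry would presumably be the single line pegasus.transfer.refiner = Cluster.

```java
import java.util.Properties;

// Hypothetical helper; only the key and value come from the documentation above.
final class RefinerSelectionSketch {

    /** Returns properties that select the Cluster transfer refiner. */
    static Properties withClusterRefiner() {
        Properties props = new Properties();
        props.setProperty("pegasus.transfer.refiner", "Cluster");
        return props;
    }
}
```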
     
    Version:
    $Revision$
    Author:
    Karan Vahi
    • Field Detail

      • DESCRIPTION

        public static final java.lang.String DESCRIPTION
        A short description of the transfer refinement.
        See Also:
        Constant Field Values
      • DEFAULT_LOCAL_STAGE_IN_CLUSTER_FACTOR

        public static final java.lang.String DEFAULT_LOCAL_STAGE_IN_CLUSTER_FACTOR
        The default clustering factor that identifies the number of transfer jobs that are created per execution pool for staging in data for the workflow.
        See Also:
        Constant Field Values
      • DEFAULT_REMOTE_STAGE_IN_CLUSTER_FACTOR

        public static final java.lang.String DEFAULT_REMOTE_STAGE_IN_CLUSTER_FACTOR
        The default clustering factor that identifies the number of transfer jobs that are created per execution pool for staging in data for the workflow.
        See Also:
        Constant Field Values
      • DEFAULT_LOCAL_STAGE_OUT_CLUSTER_FACTOR

        public static final java.lang.String DEFAULT_LOCAL_STAGE_OUT_CLUSTER_FACTOR
        The default bundling factor that identifies the number of transfer jobs that are created per execution pool for staging out data for the workflow.
        See Also:
        Constant Field Values
      • DEFAULT_REMOTE_STAGE_OUT_CLUSTER_FACTOR

        public static final java.lang.String DEFAULT_REMOTE_STAGE_OUT_CLUSTER_FACTOR
        The default bundling factor that identifies the number of transfer jobs that are created per execution pool for staging out data for the workflow.
        See Also:
        Constant Field Values
      • mStageInLocalMapPerLevel

        protected java.util.Map<java.lang.String,​Bundle.PoolTransfer> mStageInLocalMapPerLevel
        A map indexed by site name, that contains the pointer to the stage in PoolTransfer objects for that site. This is per level of the workflow.
      • mStageInRemoteMapPerLevel

        protected java.util.Map<java.lang.String,​Bundle.PoolTransfer> mStageInRemoteMapPerLevel
        A map indexed by site name, that contains the pointer to the symlink stage in PoolTransfer objects for that site. This is per level of the workflow.
      • mCurrentSILevel

        private int mCurrentSILevel
        The current level of the jobs being traversed.
      • mSyncJobMap

        private java.util.Map<java.lang.String,​Job> mSyncJobMap
        Maps the site name to the current synch job
    • Constructor Detail

      • Cluster

        public Cluster​(ADag dag,
                       PegasusBag bag)
        The overloaded constructor.
        Parameters:
        dag - the workflow to which transfer nodes need to be added.
        bag - the bag of initialization objects
    • Method Detail

      • initializeBundleValues

        protected void initializeBundleValues()
        Initializes the bundle value variables that are responsible for determining the bundle values.
        Overrides:
        initializeBundleValues in class Bundle
      • addStageInXFERNodes

        public void addStageInXFERNodes​(Job job,
                                        java.util.Collection<FileTransfer> files,
                                        java.util.Collection<FileTransfer> symlinkFiles)
        Adds the stage in transfer nodes which transfer the input files for a job, from the location returned from the replica catalog to the job's execution pool.
        Specified by:
        addStageInXFERNodes in interface Refiner
        Overrides:
        addStageInXFERNodes in class Bundle
        Parameters:
        job - Job object corresponding to the node to which the files are to be transferred.
        files - Collection of FileTransfer objects containing the information about source and destination URLs.
        symlinkFiles - Collection of FileTransfer objects containing source and destination file URLs for symbolic linking on the compute site.
      • addStageInXFERNodes

        public void addStageInXFERNodes​(Job job,
                                        boolean localTransfer,
                                        java.util.Collection files,
                                        int jobType,
                                        java.util.Map<java.lang.String,​Bundle.PoolTransfer> stageInMap,
                                        Bundle.BundleValue cValue,
                                        Implementation implementation)
        Adds the stage in transfer nodes which transfer the input files for a job, from the location returned from the replica catalog to the job's execution pool.
        Overrides:
        addStageInXFERNodes in class Bundle
        Parameters:
        job - Job object corresponding to the node to which the files are to be transferred.
        files - Collection of FileTransfer objects containing the information about source and destination URLs.
        jobType - the type of transfer job being created
        stageInMap - Map indexed by site name that gives all the transfers for that site.
        cValue - used to determine the bundling factor to employ for a job.
        implementation - the transfer implementation to use.
        localTransfer - boolean indicating whether transfer has to happen on local site.
      • resetStageInMaps

        protected void resetStageInMaps()
        Resets the stage in map and adds the stage in jobs for each site per level.
      • resetStageInMap

        protected java.util.Map<java.lang.String,​Bundle.PoolTransfer> resetStageInMap​(java.util.Map<java.lang.String,​Bundle.PoolTransfer> stageInMap,
                                                                                            Implementation implementation,
                                                                                            java.util.Map<java.lang.String,​Job> transientSynchJobMap,
                                                                                            int jobType,
                                                                                            boolean createChildSyncJob,
                                                                                            boolean localTransfer)
        Resets the stage in map and adds the stage in jobs for each site per level.
        Parameters:
        stageInMap - the stagein map per level to be reset
        implementation - the transfer implementation corresponding to the map being reset
        transientSynchJobMap - transient map to keep track of synch jobs per site.
        jobType - the type of Transfer Job created
        createChildSyncJob - boolean indicating whether to create a child synch job or not.
        Returns:
        the reset map
      • getComputeJobBundleValue

        protected java.lang.String getComputeJobBundleValue​(Job job)
        Returns the bundle value associated with a compute job as a String.
        Overrides:
        getComputeJobBundleValue in class Bundle
        Parameters:
        job -
        Returns:
        value as String or NULL
      • getStageInPoolTransfer

        protected Bundle.PoolTransfer getStageInPoolTransfer​(java.util.Map<java.lang.String,​Bundle.PoolTransfer> stageInMap,
                                                             java.lang.String site,
                                                             int num,
                                                             boolean localTransfer)
        Returns the appropriate stagein pool transfer for a particular site.
        Parameters:
        stageInMap - map that indexes site to PoolTransfer objects
        site - the site for which the PoolTransfer is required.
        num - the number of stage in jobs required for that Pool.
        localTransfer - whether the transfer needs to run on local site or not.
        Returns:
        the PoolTransfer
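
A lookup of this kind is commonly written as a get-or-create on the site-indexed map. The sketch below assumes that behaviour; the page does not state what happens when no entry exists for the site. PoolTransferStub is a hypothetical stand-in for Bundle.PoolTransfer, and its fields are invented for the example.

```java
import java.util.Map;

// Hypothetical sketch; the create-on-miss behaviour is an assumption.
final class StageInLookupSketch {

    /** Stand-in for Bundle.PoolTransfer, holding only what the example needs. */
    static final class PoolTransferStub {
        final String site;
        final int numJobs;
        final boolean localTransfer;

        PoolTransferStub(String site, int numJobs, boolean localTransfer) {
            this.site = site;
            this.numJobs = numJobs;
            this.localTransfer = localTransfer;
        }
    }

    /** Returns the entry for the site, creating it with `num` jobs on first use. */
    static PoolTransferStub getOrCreate(Map<String, PoolTransferStub> stageInMap,
                                        String site, int num, boolean localTransfer) {
        PoolTransferStub pt = stageInMap.get(site);
        if (pt == null) {
            pt = new PoolTransferStub(site, num, localTransfer);
            stageInMap.put(site, pt);
        }
        return pt;
    }
}
```

Under this assumption, a second call for the same site returns the object created by the first call, so the num argument only matters on first use.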
      • done

        public void done()
        Signals that the traversal of the workflow is done. At this point the transfer nodes are actually constructed by traversing the transfer containers, and the stdin of the transfer jobs is written.
        Specified by:
        done in interface Refiner
        Overrides:
        done in class Bundle
      • getDescription

        public java.lang.String getDescription()
        Returns a textual description of the transfer mode.
        Specified by:
        getDescription in interface Refiner
        Overrides:
        getDescription in class Bundle
        Returns:
        a short textual description
      • getSISiteBundleValue

        protected int getSISiteBundleValue​(java.lang.String site,
                                           java.lang.String deflt)
        Determines the bundle factor for a particular site on the basis of the stage-in bundle value associated with the underlying transfer transformation in the transformation catalog. If the key is not found, the default value is returned. If the default value is null, the global default is returned.
        Parameters:
        site - the site at which the value is desired.
        deflt - the default value.
        Returns:
        the bundle factor.
        See Also:
        DEFAULT_LOCAL_STAGE_IN_CLUSTER_FACTOR
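
The fallback order described for this method (site value, then the supplied default, then a global default) can be sketched as follows. The map standing in for the transformation catalog lookup, the class name and the GLOBAL_DEFAULT value are all assumptions; the real constant values are listed under Constant Field Values, not here.

```java
import java.util.Map;

// Hypothetical sketch of the three-step fallback; not the Pegasus code.
final class BundleFactorFallbackSketch {

    /** Placeholder global default; the actual value is not given on this page. */
    static final String GLOBAL_DEFAULT = "1";

    /**
     * @param siteValues stand-in for the per-site value found via the transformation catalog
     * @param site       the site at which the value is desired
     * @param deflt      the default value, may be null
     */
    static int resolve(Map<String, String> siteValues, String site, String deflt) {
        String value = siteValues.get(site);
        if (value == null) {
            // key not found: use the supplied default, or the global one if that is null
            value = (deflt != null) ? deflt : GLOBAL_DEFAULT;
        }
        return Integer.parseInt(value);
    }
}
```

The defaults are kept as strings and parsed only at the end, mirroring the String constants and int return type in the signatures above.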
      • getSyncJobBetweenLevelsName

        protected java.lang.String getSyncJobBetweenLevelsName​(java.lang.String site,
                                                               int level)
        Returns the name of the job that acts as a synchronization node in between stage in jobs of different levels.
        Parameters:
        site - the site of the transfer job.
        level - the level of the job
        Returns:
        name of the sync job
      • createSyncJobBetweenLevels

        private Job createSyncJobBetweenLevels​(java.lang.String name)
        It creates a NoOP synch job that runs on the submit host.
        Parameters:
        name - the name of the job
        Returns:
        the noop job.
      • constructCondorKey

        protected void constructCondorKey​(Job job,
                                          java.lang.String key,
                                          java.lang.String value)
        Constructs a condor variable in the condor profile namespace associated with the job. Overrides any preexisting key values.
        Parameters:
        job - contains the job description.
        key - the key of the profile.
        value - the associated value.
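
The "overrides any preexisting key values" behaviour matches plain map-put semantics, which the sketch below illustrates. JobStub and its condorProfile map are hypothetical stand-ins for the Job class and its condor profile namespace, and the key used in the usage note is only an example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; JobStub is not the Pegasus Job class.
final class CondorKeySketch {

    /** Stand-in for a job carrying a condor profile namespace. */
    static final class JobStub {
        final Map<String, String> condorProfile = new HashMap<>();
    }

    /** Sets the key in the job's condor profile, replacing any earlier value. */
    static void constructCondorKey(JobStub job, String key, String value) {
        job.condorProfile.put(key, value);
    }
}
```

Calling the method twice with the same key, for example "priority" with "5" and then "10", leaves a single entry whose value is "10".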
      • getSyncJob

        public Job getSyncJob​(java.lang.String site)
        Returns the current synch job for a site.
        Parameters:
        site -
        Returns:
        synch job if exists else null