LCOV - code coverage report
Current view: top level - utils/pilgrim - decompmodule.F90 (source / functions) Hit Total Coverage
Test: coverage.info Lines: 212 430 49.3 %
Date: 2025-03-14 01:23:43 Functions: 8 18 44.4 %

          Line data    Source code
       1             : !-------------------------------------------------------------------------
       2             : !         NASA/GSFC, Data Assimilation Office, Code 910.3, GEOS/DAS
       3             : !-------------------------------------------------------------------------
       4             :       MODULE decompmodule
       5             : !BOP
       6             : !
       7             : ! !MODULE: decompmodule
       8             : !
       9             : ! !USES:
      10             : #if defined( STAND_ALONE )
      11             : # define iulog 6
      12             : #else
      13             :       use cam_logfile, only: iulog
      14             : #endif
      15             : #include "debug.h"
      16             : 
      17             :       IMPLICIT NONE
      18             : 
      19             : !
      20             : ! !DESCRIPTION:
      21             : !
      22             : !      This module provides the DecompType and its create and destroy 
      23             : !      routines.
      24             : !      \begin{center}
      25             : !      \begin{tabular}{|l|l|} \hline \hline
      26             : !         DecompType        & Type to describe a decomposition \\ \hline
      27             : !         DecompDefined     & True iff given decomposition is defined\\ \hline
      28             : !         DecompFree        & Destroy a decomposition \\ \hline
      29             : !         DecompCopy        & Copy decomposition to newly created one\\ \hline 
      30             : !         DecompPermute     & Permute decomposition \\ \hline 
      31             : !         DecompRegular1D   & Create a 1-D decomposition \\ \hline 
      32             : !         DecompRegular2D   & Create a 2-D decomposition \\ \hline 
      33             : !         DecompRegular3D   & Create a 3-D decomposition \\ \hline 
      34             : !         DecompRegular4D   & Create a 4-D decomposition \\ \hline 
      35             : !         DecompCreateIrr   & Create an irregular 1-D decomposition \\ \hline
      36             : !         DecompCreateTags  & Create a decomposition from Pe and Tags \\ \hline
      37             : !         DecompGlobalToLocal& Map a global index to a local one \\ \hline
      38             : !         DecompLocalToGlobal& Map a local index to a global one \\
      39             : !         \hline  \hline
      40             : !      \end{tabular}
      41             : !      \end{center}
      42             : !
      43             : !      The decomposition type contains the sizes of the global array,
      44             : !      the number of entries on each PE, and for each PE a list
      45             : !      of "runs", i.e., the starting and finishing global indices
      46             : !      or "tags" whose inclusive array section resides on that PE.
      47             : !      Clearly this method of decomposition is only efficient if
      48             : !      there are long runs, i.e., long array sections which are 
      49             : !      mapped to one PE.  A random decomposition will cause poor
      50             : !      results.
      51             : !
      52             : !      The decomposition is thus very efficient for 1-D, 2-D or 3-D
      53             : !      block distributions (particularly for 1-D distributions, where
      54             : !      there is one "run" per processor).  Problems may occur for
      55             : !      an irregular decomposition (which is by definition 1-D).  If
      56             : !      there is little correspondence between the global indices of the 
      57             : !      entries and the actual decomposition (e.g., the tags are
      58             : !      assigned randomly), then there will be many runs, most
      59             : !      containing only one tag, and the resulting instance of
      60             : !      DecompType will be very large.  Fortunately, most applications
      61             : !      assign tags to entries in some sort of contiguous fashion, 
      62             : !      which is then quite appropriate for this data structure.
      63             : !
      64             : !      All numbering of multi-dimensional arrays is ROW-MAJOR, that
      65             : !      is, first in the X direction and then in the Y (and then,
      66             : !      if appropriate, in Z).  This is true for both the 2-D and
      67             : !      3-D data sets as also the Cartesian description of the PEs.
      68             : !
      69             : !      There is one glaring feature of DecompType.  It is
      70             : !      supposed to be a `one-size-fits-all' description of the
      71             : !      decomposition (with the exception of the random indexing
      72             : !      mentioned above).  Unfortunately, to describe 2-D and 3-D
      73             : !      regions, it is necessary to carry additional dimension
      74             : !      information in order have complete information for the 
      75             : !      mapping.  This means that 2-D and 3-D decompositions 
      76             : !      inherently carry more information than a 1-D decomposition.
      77             : !      Thus it {\it is} possible to use a decomposition created
      78             : !      with the Regular2D or Regular3D routines to describe the
      79             : !      corresponding decomposition when the 2-D or 3-D array is
      80             : !      viewed as a 1-D array, but it is clearly {\it not}
      81             : !      possible to use a decomposition created with Regular1D
      82             : !      to describe the decomposition of a 2-D or 3-D array
      83             : !      --- the appropriate information just is not there.
      84             : !
      85             : ! !REVISION HISTORY:
      86             : !   97.07.22   Sawyer     Creation
      87             : !   97.09.01   Sawyer     Release date
      88             : !   97.11.06   Sawyer     Addition of row and column communicators
      89             : !   97.01.24   Sawyer     Added support for non-MPI derived types solution
      90             : !   97.01.29   Sawyer     Minor revisions for production service
      91             : !   98.01.30   Sawyer     Added DecompCopy
      92             : !   98.02.04   Sawyer     Removed Comm, CommRow and CommCol from DecompType
      93             : !   98.03.13   Sawyer     Removed DecompTypeOld, brushed up for walkthrough
      94             : !   98.03.19   Sawyer     Minor corrections after walkthrough
      95             : !   98.05.02   Sawyer     Added DecompPermute
      96             : !   98.05.11   Sawyer     Removed Permutation from all but DecompPermute
      97             : !   99.01.19   Sawyer     Minor cleaning
      98             : !   00.07.07   Sawyer     Removed DimSizes; decomp is now 1D only
      99             : !   00.11.12   Sawyer     Added DecompCreateTags and DecompInfo
     100             : !   01.02.03   Sawyer     Updated for free format; corrected DecompCreateTags
     101             : !   01.03.20   Sawyer     Added DecompRegular3DOrder
     102             : !   02.12.04   Sawyer     Added DecompDefined, optimized DecompGlobalToLocal
     103             : !   02.12.06   Sawyer     Bug in new DecompGlobalToLocal (remove out of bounds check)
     104             : !   02.12.08   Sawyer     Another bug: calculate the Offsets field correctly
     105             : !   02.12.23   Sawyer     Added DecompRegular4D
     106             : !
     107             : ! !PUBLIC TYPES:
     108             :       PUBLIC DecompType, DecompCreate, DecompFree
     109             :       PUBLIC DecompCopy, DecompPermute, DecompDefined
     110             :       PUBLIC DecompGlobalToLocal, DecompLocalToGlobal, DecompInfo
     111             : 
     112             :       INTERFACE     DecompCreate
     113             :         MODULE PROCEDURE DecompRegular1D
     114             :         MODULE PROCEDURE DecompRegular2D
     115             :         MODULE PROCEDURE DecompRegular3D
     116             :         MODULE PROCEDURE DecompRegular3DOrder
     117             :         MODULE PROCEDURE DecompRegular4D
     118             :         MODULE PROCEDURE DecompCreateIrr
     119             :         MODULE PROCEDURE DecompCreateTags
     120             :       END INTERFACE
     121             : 
     122             :       INTERFACE     DecompGlobalToLocal
     123             :         MODULE PROCEDURE DecompG2L
     124             :         MODULE PROCEDURE DecompG2LVector
     125             :       END INTERFACE
     126             :  
     127             :       INTERFACE     DecompLocalToGlobal
     128             :         MODULE PROCEDURE DecompL2G
     129             :         MODULE PROCEDURE DecompL2GVector
     130             :       END INTERFACE
     131             :  
     132             : ! Decomposition info
     133             : 
     134             :        TYPE Lists
     135             :          INTEGER, POINTER     :: StartTags(:) => null() ! Start of tag run
     136             :          INTEGER, POINTER     :: EndTags(:)   => null() ! End of tag run
     137             :          INTEGER, POINTER     :: Offsets(:)   => null() ! Local offsets for efficiency
     138             :        END TYPE Lists
     139             : 
     140             :        TYPE DecompType
     141             :          LOGICAL              :: Defined      ! Is it defined?
     142             :          INTEGER              :: GlobalSize   ! Size in each dimension
     143             :          INTEGER, POINTER     :: NumEntries(:) => null() ! Number of entries per PE
     144             :          TYPE(Lists), POINTER :: Head(:)       => null() ! Array of pointers 
     145             :        END TYPE DecompType
     146             : 
     147             : !EOP
     148             :       CONTAINS
     149             : 
     150             : !-----------------------------------------------------------------------
     151             : !BOP
     152             : ! !IROUTINE: DecompDefined --- Is the decomp type defined?
     153             : !
     154             : ! !INTERFACE:
     155           0 :       LOGICAL FUNCTION DecompDefined ( Decomp )
     156             : ! !USES:
     157             :       IMPLICIT NONE
     158             : 
     159             : ! !INPUT PARAMETERS:
     160             :       TYPE(DecompType), INTENT( IN ):: Decomp  ! Decomp information
     161             : 
     162             : !
     163             : ! !DESCRIPTION:
     164             : !     Returns true if Decomp has been created but not yet destroyed
     165             : !
     166             : ! !REVISION HISTORY:
     167             : !   02.12.04   Sawyer     Creation from GhostDefined
     168             : !
     169             : !EOP
     170             : !-----------------------------------------------------------------------
     171             : !BOC
     172             : !
     173             : !
     174             :       CPP_ENTER_PROCEDURE( "DECOMPDEFINED" )
     175           0 :       DecompDefined = Decomp%Defined
     176             :       CPP_LEAVE_PROCEDURE( "DECOMPDEFINED" )
     177             : 
     178             :       RETURN
     179             : !EOC
     180             :       END FUNCTION DecompDefined
     181             : !-----------------------------------------------------------------------
     182             : 
     183             : 
     184             : !---------------------------------------------------------------------
     185             : !BOP
     186             : ! !IROUTINE: DecompFree --- Free a decomposition
     187             : !
     188             : ! !INTERFACE:
     189       43008 :       SUBROUTINE DecompFree ( Decomp )
     190             : ! !USES:
     191             :       IMPLICIT NONE
     192             : 
     193             : ! !INPUT/OUTPUT PARAMETERS:
     194             :       TYPE(DecompType), INTENT( INOUT ):: Decomp  ! Decomp information
     195             : !
     196             : ! !DESCRIPTION:
     197             : !     Free the decomposition -- deallocate the data structures.
     198             : !
     199             : ! !SYSTEM ROUTINES:
     200             : !     ASSOCIATED, DEALLOCATE
     201             : !
     202             : ! !REVISION HISTORY:
     203             : !   98.01.30   Sawyer     Creation
     204             : !
     205             : !EOP
     206             : !------------------------------------------------------------------------
     207             : !BOC
     208             : !
     209             : ! !LOCAL VARIABLES:
     210             :       INTEGER  :: I, NPEs
     211             : !
     212             :       CPP_ENTER_PROCEDURE( "DECOMPFREE" )
     213             : 
     214       43008 :       IF ( ASSOCIATED( Decomp%NumEntries ) )                             &
     215       43008 :                      DEALLOCATE( Decomp%NumEntries )
     216       43008 :       IF ( ASSOCIATED( Decomp%Head ) ) THEN
     217       43008 :         NPEs = SIZE( Decomp%Head )
     218    28588032 :         DO I = 1, NPEs
     219             : !
     220             : ! Copy the number of entries on each PE
     221             : !
     222    28545024 :           IF ( ASSOCIATED( Decomp%Head(I)%StartTags ) )                  &
     223    28545024 :                      DEALLOCATE( Decomp%Head(I)%StartTags )
     224    28545024 :           IF ( ASSOCIATED( Decomp%Head(I)%EndTags ) )                    &
     225    28545024 :                      DEALLOCATE( Decomp%Head(I)%EndTags )
     226    28545024 :           IF ( ASSOCIATED( Decomp%Head(I)%Offsets ) )                    &
     227    28588032 :                      DEALLOCATE( Decomp%Head(I)%Offsets )
     228             :         ENDDO
     229       43008 :         DEALLOCATE( Decomp%Head )
     230             :       ENDIF
     231       43008 :       Decomp%Defined = .FALSE.
     232             : 
     233             :       CPP_LEAVE_PROCEDURE( "DECOMPFREE" )
     234       43008 :       RETURN
     235             : !EOC
     236             :       END SUBROUTINE DecompFree
     237             : !------------------------------------------------------------------------
     238             : 
     239             : 
     240             : !------------------------------------------------------------------------
     241             : !BOP
     242             : ! !IROUTINE: DecompCopy --- Copy one decomposition to another
     243             : !
     244             : ! !INTERFACE:
     245        7680 :       SUBROUTINE DecompCopy ( DecompIn, DecompOut )
     246             : ! !USES:
     247             :       IMPLICIT NONE
     248             : !
     249             : ! !INPUT PARAMETERS:
     250             :       TYPE(DecompType), INTENT( IN )   :: DecompIn  ! Decomp information
     251             : !
     252             : ! !OUTPUT PARAMETERS:
     253             :       TYPE(DecompType), INTENT( OUT )  :: DecompOut ! Decomp information
     254             : !
     255             : ! !DESCRIPTION:
     256             : !
     257             : !   Creates an output decomposition and copies the DecompIn input values 
     258             : !
     259             : ! !SYSTEM ROUTINES:
     260             : !     ALLOCATE
     261             : !
     262             : ! !REVISION HISTORY:
     263             : !   98.01.30   Sawyer     Creation
     264             : !
     265             : !EOP
     266             : !------------------------------------------------------------------------
     267             : !BOC
     268             : !
     269             : ! !LOCAL VARIABLES:
     270             :       INTEGER  :: I, J, NDims, NPEs, NRuns
     271             : !
     272             :       CPP_ENTER_PROCEDURE( "DECOMPCOPY" )
     273             : !
     274             : ! Copy the size of the global array
     275             : !
     276        7680 :       DecompOut%GlobalSize = DecompIn%GlobalSize
     277             : 
     278             : !
     279             : ! Allocate the number of entries and list head arrays
     280             : !
     281        7680 :       NPEs = SIZE( DecompIn%NumEntries )
     282             :       CPP_ASSERT_F90( SIZE( DecompIn%Head ) .EQ. NPEs )
     283       23040 :       ALLOCATE( DecompOut%NumEntries( NPEs ) )
     284     5921280 :       ALLOCATE( DecompOut%Head( NPEs ) )
     285             : 
     286     5905920 :       DO I = 1, NPEs
     287             : !
     288             : ! Copy the number of entries on each PE
     289             : !
     290     5898240 :         DecompOut%NumEntries( I ) = DecompIn%NumEntries( I )
     291     5898240 :         NRuns = SIZE( DecompIn%Head( I )%StartTags )
     292             :         CPP_ASSERT_F90( SIZE( DecompIn%Head( I )%EndTags ) .EQ. NRuns )
     293             : !
     294             : ! Allocate and copy the array of runs
     295             : !
     296    17694720 :         ALLOCATE( DecompOut%Head(I)%StartTags( NRuns ) )
     297    11796480 :         ALLOCATE( DecompOut%Head(I)%EndTags( NRuns ) )
     298    11796480 :         ALLOCATE( DecompOut%Head(I)%Offsets( NRuns ) )
     299   109714944 :         DO J = 1, NRuns
     300   103809024 :           DecompOut%Head(I)%StartTags(J) = DecompIn%Head(I)%StartTags(J)
     301   103809024 :           DecompOut%Head(I)%EndTags(J) = DecompIn%Head(I)%EndTags(J)
     302   109707264 :           DecompOut%Head(I)%Offsets(J) = DecompIn%Head(I)%Offsets(J)
     303             :         ENDDO
     304             :       ENDDO
     305        7680 :       DecompOut%Defined = .TRUE.
     306             : 
     307             :       CPP_LEAVE_PROCEDURE( "DECOMPCOPY" )
     308        7680 :       RETURN
     309             : !EOC
     310             :       END SUBROUTINE DecompCopy
     311             : !------------------------------------------------------------------------
     312             : 
     313             : 
     314             : !------------------------------------------------------------------------
     315             : !BOP
     316             : ! !IROUTINE: DecompPermute --- Permute one decomposition to another
     317             : !
     318             : ! !INTERFACE:
     319           0 :       SUBROUTINE DecompPermute ( Permutation, Decomp )
     320             : ! !USES:
     321             :       IMPLICIT NONE
     322             : !
     323             : ! !INPUT PARAMETERS:
     324             :       INTEGER :: Permutation( : )                  ! Permutation
     325             : 
     326             : ! !INPUT/OUTPUT PARAMETERS:
     327             :       TYPE(DecompType), INTENT( INOUT ) :: Decomp  ! Decomp information
     328             : !
     329             : !
     330             : ! !DESCRIPTION:
     331             : !
     332             : !   Permutes the PE assignment of a given decomposition. Confusion will
     333             : !   always arise about whether this is a forward or backward
     334             : !   transformation.  Picture it this way: draw the array and slice it
     335             : !   up as indicated by the distribution.  The resulting boxes are of
     336             : !   course indexed by natural numbering 1, 2, 3, 4, ...  (these are
     337             : !   the virtual one-based PEs). Now write the true PE numbering
     338             : !   (one-based) as you would like it.  The resulting array is Perm.
     339             : !
     340             : !
     341             : ! !SYSTEM ROUTINES:
     342             : !     ALLOCATE
     343             : !
     344             : ! !REVISION HISTORY:
     345             : !   98.05.02   Sawyer     Creation
     346             : !
     347             : !EOP
     348             : !---------------------------------------------------------------------
     349             : !BOC
     350             : !
     351             : ! !LOCAL VARIABLES:
     352           0 :       INTEGER, POINTER     :: NumEntries(:)! Number of entries
     353           0 :       TYPE(Lists), POINTER :: Head(:)      ! Array of pointers 
     354             :       INTEGER              :: I, J, NPEs, NRuns, TruePE
     355             : !
     356             :       CPP_ENTER_PROCEDURE( "DECOMPPERMUTE" )
     357             : !
     358             : ! Allocate the number of entries and list head arrays
     359             : !
     360           0 :       NPEs = SIZE( Decomp%NumEntries )
     361           0 :       ALLOCATE( NumEntries( NPEs ) )
     362           0 :       DO I = 1, NPEs
     363           0 :         TruePE = Permutation( I )
     364           0 :         NumEntries( TruePE ) = Decomp%NumEntries( I )
     365             :       ENDDO
     366             : !
     367             : ! Deallocate old NumEntries and put the new pointer in its place
     368             : !
     369           0 :       DEALLOCATE( Decomp%NumEntries )
     370           0 :       Decomp%NumEntries => NumEntries
     371           0 :       NULLIFY( NumEntries )
     372             : 
     373             : !
     374             : ! Allocate and set the permuted Lists called with pointer Head
     375             : !
     376           0 :       ALLOCATE( Head( NPEs ) )
     377           0 :       DO I = 1, NPEs
     378           0 :         TruePE = Permutation( I )
     379           0 :         NRuns = SIZE( Decomp%Head(I)%StartTags )
     380             :         CPP_ASSERT_F90( SIZE( Decomp%Head(I)%EndTags ) .EQ. NRuns )
     381             : !
     382             : ! Allocate and permute the array of runs
     383             : !
     384           0 :         ALLOCATE( Head(TruePE)%StartTags(NRuns) )
     385           0 :         ALLOCATE( Head(TruePE)%EndTags(NRuns) )
     386           0 :         ALLOCATE( Head(TruePE)%Offsets(NRuns) )
     387           0 :         DO J = 1, NRuns
     388           0 :           Head(TruePE)%StartTags(J) = Decomp%Head(I)%StartTags(J)
     389           0 :           Head(TruePE)%EndTags(J)   = Decomp%Head(I)%EndTags(J)
     390           0 :           Head(TruePE)%Offsets(J)   = Decomp%Head(I)%Offsets(J)
     391             :         ENDDO
     392             :       ENDDO
     393             : !
     394             : ! Deallocate the arrays of starting and ending tags
     395             : !
     396           0 :       DO I = 1, NPEs
     397           0 :         DEALLOCATE( Decomp%Head(I)%StartTags )
     398           0 :         DEALLOCATE( Decomp%Head(I)%EndTags )
     399           0 :         DEALLOCATE( Decomp%Head(I)%Offsets )
     400             :       ENDDO
     401             : !
     402             : ! Deallocate the heads to the Lists
     403             : !
     404           0 :       DEALLOCATE( Decomp%Head )
     405             : 
     406             : !
     407             : ! Link the new head to that in the decomposition
     408             : !
     409           0 :       Decomp%Head => Head
     410             :       
     411           0 :       NULLIFY( Head )
     412             : 
     413             :       CPP_LEAVE_PROCEDURE( "DECOMPPERMUTE" )
     414           0 :       RETURN
     415             : !EOC
     416           0 :       END SUBROUTINE DecompPermute
     417             : !------------------------------------------------------------------------
     418             : 
     419             : 
     420             : !------------------------------------------------------------------------
     421             : !BOP
     422             : ! !IROUTINE: DecompRegular1D --- Create a decomposition for a 1-D grid
     423             : !
     424             : ! !INTERFACE:
     425           0 :       SUBROUTINE DecompRegular1D ( NPEs, Dist, Decomp )
     426             : ! !USES:
     427             :       IMPLICIT NONE
     428             : !
     429             : ! !INPUT PARAMETERS:
     430             :       INTEGER, INTENT( IN )            :: NPEs     ! Number of PEs
     431             :       INTEGER, INTENT( IN )            :: Dist(:)  ! Distribution in X
     432             : !
     433             : ! !OUTPUT PARAMETERS:
     434             :       TYPE(DecompType), INTENT( OUT )  :: Decomp   ! Decomp information
     435             : !
     436             : ! !DESCRIPTION:
     437             : !     Creates a variable block decomposition for a regular 1-D grid
     438             : !     (this is also known as a "block-general" distribution).  The
     439             : !     decomposition is given through the Dist distribution 
     440             : !     which contains the number of entries on each PE.  
     441             : !
     442             : ! !SYSTEM ROUTINES:
     443             : !     ALLOCATE
     444             : !
     445             : ! !REVISION HISTORY:
     446             : !   98.01.19   Sawyer     Creation
     447             : !   98.01.22   Sawyer     Corrections, TESTED
     448             : !   98.05.11   Sawyer     Removed Perm from arglist -- see DecompPermute
     449             : !   00.07.07   Sawyer     Removed use of DimSizes(:) array
     450             : !
     451             : !EOP
     452             : !------------------------------------------------------------------------
     453             : !BOC
     454             : !
     455             : ! !LOCAL VARIABLES:
     456             :       INTEGER  :: I, Counter
     457             : !
     458             :       CPP_ENTER_PROCEDURE( "DECOMPREGULAR1D" )
     459             : !
     460             :       CPP_ASSERT_F90( NPEs .EQ. SIZE( Dist ) )
     461             : !
     462             : ! The head contains NPEs pointers to the tag lists.
     463             : !
     464           0 :       Decomp%GlobalSize = SUM(Dist)
     465           0 :       ALLOCATE( Decomp%NumEntries( NPEs ) )
     466           0 :       ALLOCATE( Decomp%Head( NPEs ) )
     467           0 :       Counter = 0
     468           0 :       DO I = 1, NPEs
     469           0 :         Decomp%NumEntries(I) = Dist(I)
     470             : !
     471             : ! Since this is a regular distribution there is only one run of tags per PE.
     472             : !
     473           0 :         NULLIFY( Decomp%Head(I)%StartTags )
     474           0 :         NULLIFY( Decomp%Head(I)%EndTags )
     475           0 :         NULLIFY( Decomp%Head(I)%Offsets )
     476           0 :         ALLOCATE( Decomp%Head(I)%StartTags(1) )
     477           0 :         ALLOCATE( Decomp%Head(I)%EndTags(1) )
     478           0 :         ALLOCATE( Decomp%Head(I)%Offsets(1) )
     479             : !
     480             : ! The starting and ending tags are immediately determined from
     481             : ! the decomposition arrays  
     482             : !
     483           0 :         Decomp%Head(I)%StartTags(1) = Counter+1
     484           0 :         Counter = Counter + Dist(I)
     485           0 :         Decomp%Head(I)%EndTags(1) = Counter
     486           0 :         Decomp%Head(I)%Offsets(1) = 0   ! Offset in local segment
     487             :       ENDDO
     488             : 
     489           0 :       Decomp%Defined = .TRUE.
     490             : 
     491             :       CPP_LEAVE_PROCEDURE( "DECOMPREGULAR1D" )
     492           0 :       RETURN
     493             : !EOC
     494             :       END SUBROUTINE DecompRegular1D
     495             : !------------------------------------------------------------------------
     496             : 
     497             : 
     498             : !------------------------------------------------------------------------
     499             : !BOP
     500             : ! !IROUTINE: DecompRegular2D --- Create a decomposition for a 2-D grid
     501             : !
     502             : ! !INTERFACE:
     503        6144 :       SUBROUTINE DecompRegular2D( NPEsX, NPEsY, Xdist, Ydist, Decomp )
     504             : ! !USES:
     505             :       IMPLICIT NONE
     506             : !
     507             : ! !INPUT PARAMETERS:
     508             :       INTEGER, INTENT( IN )            :: NPEsX    ! Number of PEs in X
     509             :       INTEGER, INTENT( IN )            :: NPEsY    ! Number of PEs in Y
     510             :       INTEGER, INTENT( IN )            :: Xdist(:) ! Distribution in X
     511             :       INTEGER, INTENT( IN )            :: Ydist(:) ! Distribution in Y
     512             : !
     513             : ! !OUTPUT PARAMETERS:
     514             :       TYPE(DecompType), INTENT( OUT )  :: Decomp  ! Decomp information
     515             : !
     516             : !
     517             : ! !DESCRIPTION:
     518             : !     Creates a variable block-block decomposition for a regular 
     519             : !     2-D grid.  The decomposition is given through the Xdist and 
     520             : !     Ydist distributions, which contain the number of entries on 
     521             : !     each PE in that dimension.  This routine thus defines
     522             : !     a rectangular "checkerboard" distribution.
     523             : !
     524             : ! !SYSTEM ROUTINES:
     525             : !     ALLOCATE
     526             : !
     527             : ! !REVISION HISTORY:
     528             : !   98.01.19   Sawyer     Creation
     529             : !   98.01.22   Sawyer     Corrections, TESTED
     530             : !   98.05.11   Sawyer     Removed Perm from arglist -- see DecompPermute
     531             : !   00.07.07   Sawyer     Removed use of DimSizes(:) array
     532             : !
     533             : ! !BUGS:
     534             : !     This routine makes the assumption that the sum of the
     535             : !     distribution in each dimension adds up to the total 
     536             : !     number of entries in that dimension.  It will cause
     537             : !     problems if the actual local arrays are over- or 
     538             : !     under-allocated.  For example, if the local array is
     539             : !     allocated statically for the maximum size of the
     540             : !     array on any processor, problems will occur on those
     541             : !     PEs which have less than the maximum.
     542             : !
     543             : !EOP
     544             : !------------------------------------------------------------------------
     545             : !BOC
     546             : !
     547             : ! !LOCAL VARIABLES:
     548             :       INTEGER :: TruePE, I, J, K, Counter1, Counter2, SizeX, SizeY
     549             : !
     550             :       CPP_ENTER_PROCEDURE( "DECOMPREGULAR2D" )
     551             : !
     552             : ! Some sanity checks
     553             : !
     554             :       CPP_ASSERT_F90( NPEsX .EQ. SIZE( Xdist ) )
     555             :       CPP_ASSERT_F90( NPEsY .EQ. SIZE( Ydist ) )
     556             : !
     557             : ! The head contains NPEs pointers to the tag lists.
     558             : !
     559       62976 :       SizeX = SUM(Xdist)
     560      302592 :       SizeY = SUM(Ydist)
     561        6144 :       Decomp%GlobalSize = SizeX * SizeY
     562       18432 :       ALLOCATE( Decomp%NumEntries( NPEsX*NPEsY ) )
     563     2494464 :       ALLOCATE( Decomp%Head( NPEsX*NPEsY ) )
     564        6144 :       Counter1 = 0
     565      302592 :       DO J = 1, NPEsY
     566     2772480 :         DO I = 1, NPEsX
     567             : !
     568             : ! WARNING!!!!  The definition of the PE is Row-major ordering
     569             : !
     570     2476032 :           TruePE = ( J-1 ) * NPEsX + I 
     571             : 
     572             : !
     573             : ! The number of entries is the product of the local X, Y, Z allotment
     574             : !
     575     2476032 :           Decomp%NumEntries(TruePE) = Xdist(I)*Ydist(J)
     576             : !
     577             : ! For each Y there is a separate run
     578             : !
     579     2476032 :           NULLIFY( Decomp%Head(TruePE)%StartTags )
     580     2476032 :           NULLIFY( Decomp%Head(TruePE)%EndTags )
     581     2476032 :           NULLIFY( Decomp%Head(TruePE)%Offsets )
     582     6266880 :           ALLOCATE( Decomp%Head(TruePE)%StartTags(Ydist(J)) )
     583     6266880 :           ALLOCATE( Decomp%Head(TruePE)%EndTags(Ydist(J)) )
     584     6266880 :           ALLOCATE( Decomp%Head(TruePE)%Offsets(Ydist(J)) )
     585     2476032 :           Counter2 = Counter1
     586     9904128 :           DO K = 1, Ydist(J)
     587             : !
     588             : ! Since this is a regular distribution the definition of
     589             : ! tags is dictated by Xdist(I), and appears Ydist(J) times
     590             : !
     591             : !
     592     7428096 :             Decomp%Head(TruePE)%StartTags(K) = Counter2 + 1
     593     7428096 :             Decomp%Head(TruePE)%EndTags(K) = Counter2 + Xdist(I)
     594     9904128 :             Counter2 = Counter2 + SizeX
     595             :           ENDDO
     596     2772480 :           Counter1 = Counter1 + Xdist(I)
     597             :         ENDDO
     598             : !
     599             : ! Align the counter such that it points to the start of the next 
     600             : ! block.  (Ydist(J)-1) since already one layer has been added in.
     601             : ! Implicit assumption that SizeX = SUM( Xdist )
     602             : !
     603      302592 :         Counter1 = Counter1 + SizeX*(Ydist(J)-1)
     604             :       ENDDO
     605             : !
     606             : ! Calculate offsets
     607             : !
     608     2482176 :       DO I=1, NPEsX*NPEsY
     609     2482176 :         IF ( SIZE(Decomp%Head(I)%StartTags) > 0 ) THEN
     610     1314816 :           Decomp%Head(I)%Offsets(1) = 0
     611     7428096 :           DO J=2, SIZE(Decomp%Head(I)%StartTags)
     612     6113280 :             Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) +      &
     613    13541376 :             Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
     614             :           ENDDO
     615             :         ENDIF
     616             :       ENDDO
     617             : 
     618        6144 :       Decomp%Defined = .TRUE.
     619             : 
     620             :       CPP_LEAVE_PROCEDURE( "DECOMPREGULAR2D" )
     621        6144 :       RETURN
     622             : !EOC
     623             :       END SUBROUTINE DecompRegular2D
     624             : !------------------------------------------------------------------------
     625             : 
     626             : 
     627             : !------------------------------------------------------------------------
     628             : !BOP
     629             : ! !IROUTINE: DecompRegular3D --- Create a decomposition for a 3-D grid
     630             : !
     631             : ! !INTERFACE:
     632       27648 :       SUBROUTINE DecompRegular3D ( NPEsX, NPEsY, NPEsZ,                  &
     633       13824 :                                    Xdist, Ydist, Zdist, Decomp )
     634             : ! !USES:
     635             :       IMPLICIT NONE
     636             : !
     637             : ! !INPUT PARAMETERS:
     638             :       INTEGER, INTENT( IN )            :: NPEsX    ! Number of PEs in X
     639             :       INTEGER, INTENT( IN )            :: NPEsY    ! Number of PEs in Y
     640             :       INTEGER, INTENT( IN )            :: NPEsZ    ! Number of PEs in Z
     641             :       INTEGER, INTENT( IN )            :: Xdist(:) ! Distribution in X
     642             :       INTEGER, INTENT( IN )            :: Ydist(:) ! Distribution in Y
     643             :       INTEGER, INTENT( IN )            :: Zdist(:) ! Distribution in Z
     644             : !
     645             : ! !OUTPUT PARAMETERS:
     646             :       TYPE(DecompType), INTENT( OUT )  :: Decomp  ! Decomp information
     647             : !
     648             : !
     649             : ! !DESCRIPTION:
     650             : !     Creates a decomposition for a regular 3-D grid.  The
     651             : !     decomposition is given through the Xdist, Ydist, and Zdist
     652             : !     distributions, which contain the number of entries on 
     653             : !     each PE in that dimension.    This routine thus defines
     654             : !     a parallelopiped (SOMA-block) distribution.
     655             : !
     656             : ! !SYSTEM ROUTINES:
     657             : !     ALLOCATE
     658             : !
     659             : ! !REVISION HISTORY:
     660             : !   98.01.19   Sawyer     Creation
     661             : !   98.05.11   Sawyer     Removed Perm from arglist -- see DecompPermute
     662             : !   00.07.07   Sawyer     Removed use of Sizes(:) array
     663             : !
     664             : ! !BUGS:
     665             : !     This routine makes the assumption that the sum of the
     666             : !     distribution in each dimension adds up to the total 
     667             : !     number of entries in that dimension.  It will cause
     668             : !     problems if the actual local arrays are over- or 
     669             : !     under-allocated.  For example, if the local array is
     670             : !     allocated statically for the maximum size of the
     671             : !     array on any processor, problems will occur on those
     672             : !     PEs which have less than the maximum.
     673             : !
     674             : !EOP
     675             : !------------------------------------------------------------------------
     676             : !BOC
     677             : !
     678             : ! !LOCAL VARIABLES:
     679             :       INTEGER  :: TruePE, Counter1, Counter2, Counter3
     680             :       INTEGER  :: I, J, K, L, M, N, SizeX, SizeY, SizeZ
     681             : !
     682             :       CPP_ENTER_PROCEDURE( "DECOMPREGULAR3D" )
     683             : !
     684             : ! Some sanity checks
     685             : !
     686             : !
     687             :       CPP_ASSERT_F90( NPEsX .EQ. SIZE( Xdist ) )
     688             :       CPP_ASSERT_F90( NPEsY .EQ. SIZE( Ydist ) )
     689             :       CPP_ASSERT_F90( NPEsZ .EQ. SIZE( Zdist ) )
     690             : !
     691             : ! The head contains NPEs pointers to the tag lists.
     692             : !
     693       78336 :       SizeX = SUM(Xdist)
     694      801792 :       SizeY = SUM(Ydist)
     695       95232 :       SizeZ = SUM(Zdist)
     696       13824 :       Decomp%GlobalSize = SizeX * SizeY * SizeZ
     697       41472 :       ALLOCATE( Decomp%NumEntries( NPEsX*NPEsY*NPEsZ ) )
     698     7334400 :       ALLOCATE( Decomp%Head( NPEsX*NPEsY*NPEsZ ) )
     699       13824 :       Counter1 = 0
     700       95232 :       DO K = 1, NPEsZ
     701     4130304 :         DO J = 1, NPEsY
     702    11341824 :           DO I = 1, NPEsX
     703             : !
     704             : ! WARNING!!!!  The definition of the PE is Row-major ordering
     705             : !
     706     7292928 :             TruePE = (K-1)*NPEsX*NPEsY + (J-1)*NPEsX + I 
     707     7292928 :             NULLIFY( Decomp%Head(TruePE)%StartTags )
     708     7292928 :             NULLIFY( Decomp%Head(TruePE)%EndTags )
     709     7292928 :             NULLIFY( Decomp%Head(TruePE)%Offsets )
     710             : !
     711             : ! The number of entries is the product of the local X, Y, Z allotment
     712             : !
     713     7292928 :             Decomp%NumEntries(TruePE) = Xdist(I)*Ydist(J)*Zdist(K)
     714             : !
     715             : ! For each Z there are Y separate runs
     716             : !
     717    21878784 :             ALLOCATE( Decomp%Head(TruePE)%StartTags(Ydist(J)*Zdist(K)) )
     718    21878784 :             ALLOCATE( Decomp%Head(TruePE)%EndTags(Ydist(J)*Zdist(K)) )
     719    21878784 :             ALLOCATE( Decomp%Head(TruePE)%Offsets(Ydist(J)*Zdist(K)) )
     720     7292928 :             Counter2 = Counter1
     721     7292928 :             L = 0
     722   204073984 :             DO N = 1, Zdist(K)
     723   196781056 :               Counter3 = Counter2
     724   787124224 :               DO M = 1, Ydist(J)
     725             : !
     726             : !     Since this is a regular distribution the definition of
     727             : !     tags is dictated by Xdist(I), and appears Ydist(J) times
     728             : !
     729             : !
     730   590343168 :                 L = L + 1
     731   590343168 :                 Decomp%Head(TruePE)%StartTags(L) = Counter3 + 1
     732   590343168 :                 Decomp%Head(TruePE)%EndTags(L) = Counter3 + Xdist(I)
     733   787124224 :                 Counter3 = Counter3 + SizeX
     734             :               ENDDO
     735   204073984 :               Counter2 = Counter2 + SizeX*SizeY
     736             :             ENDDO
     737    11341824 :             Counter1 = Counter1 + Xdist(I)
     738             :           ENDDO
     739             : !
     740             : ! Align the counter such that it points to the start of the next 
     741             : ! block.  (Ydist(J)-1) since already one X layer has been added in.
     742             : ! Implicit assumption that SizeX = SUM( Xdist )
     743             : !
     744     4130304 :           Counter1 = Counter1 + SizeX*(Ydist(J)-1)
     745             :         ENDDO
     746             : !
     747             : ! Align the counter such that it points to the start of the next 
     748             : ! block.  (Zdist(K)-1) since already one X-Y layer has been added in.
     749             : ! Implicit assumption that SizeY = SUM( Ydist )
     750             : !
     751       95232 :         Counter1 = Counter1 + SizeX*SizeY*(Zdist(K)-1)
     752             :       ENDDO
     753             : !
     754             : ! Calculate offsets
     755             : !
     756     7306752 :       DO I=1, NPEsX*NPEsY*NPEsZ
     757     7306752 :         IF ( SIZE(Decomp%Head(I)%StartTags) > 0 ) THEN
     758     7292928 :           Decomp%Head(I)%Offsets(1) = 0
     759   590343168 :           DO J=2, SIZE(Decomp%Head(I)%StartTags)
     760   583050240 :             Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) +      &
     761  1173393408 :             Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
     762             :           ENDDO
     763             :         ENDIF
     764             :       ENDDO
     765             : 
     766       13824 :       Decomp%Defined = .TRUE.
     767             : 
     768             :       CPP_LEAVE_PROCEDURE( "DECOMPREGULAR3D" )
     769       13824 :       RETURN
     770             : !EOC
     771             :       END SUBROUTINE DecompRegular3D
     772             : !------------------------------------------------------------------------
     773             : 
     774             : 
     775             : !------------------------------------------------------------------------
     776             : !BOP
     777             : ! !IROUTINE: DecompRegular3Dorder --- Create a decomposition for a 3-D grid
     778             : !
     779             : ! !INTERFACE:
     780           0 :       SUBROUTINE DecompRegular3Dorder( Order, NPEsX, NPEsY, NPEsZ,       &
     781        7680 :                                        Xdist, Ydist, Zdist, Decomp )
     782             : ! !USES:
     783             :       IMPLICIT NONE
     784             : !
     785             : ! !INPUT PARAMETERS:
     786             :       CHARACTER(3), INTENT( IN )       :: Order    ! Dim. ordering
     787             :       INTEGER, INTENT( IN )            :: NPEsX    ! Number of PEs in X
     788             :       INTEGER, INTENT( IN )            :: NPEsY    ! Number of PEs in Y
     789             :       INTEGER, INTENT( IN )            :: NPEsZ    ! Number of PEs in Z
     790             :       INTEGER, INTENT( IN )            :: Xdist(:) ! Distribution in X
     791             :       INTEGER, INTENT( IN )            :: Ydist(:) ! Distribution in Y
     792             :       INTEGER, INTENT( IN )            :: Zdist(:) ! Distribution in Z
     793             : !
     794             : ! !OUTPUT PARAMETERS:
     795             :       TYPE(DecompType), INTENT( OUT )  :: Decomp  ! Decomp information
     796             : !
     797             : ! !DESCRIPTION:
     798             : !     Creates a variable block-block-block decomposition for a regular 
     799             : !     3-D grid, where the ordering of the PEs can be explicitly given
     800             : !     (see next paragraph). The decomposition is given through the 
     801             : !     Xdist, Ydist, and Zdist distributions, which contain the number 
     802             : !     of entries on each PE in that dimension.  This routine thus defines
     803             : !     a parallelopiped (SOMA-block) distribution.
     804             : !
     805             : !     With the string argument Order, the order of counting in the
     806             : !     3d PE space can be specified.  There are six possible values:
     807             : !     "xyz", "xzy", "yxz", "yzx", "zxy", and "zyx".  
     808             : !
     809             : !     The same as DecompRegular3Dorder could also be achieved by
     810             : !     using DecompRegular3D and then permuting the PE ownership
     811             : !     with DecompPermute.
     812             : !
     813             : ! !SYSTEM ROUTINES:
     814             : !     ALLOCATE
     815             : !
     816             : ! !REVISION HISTORY:
     817             : !   01.03.20   Sawyer     Creation from DecompRegular3Dzy, added ordering
     818             : !
     819             : ! !BUGS:
     820             : !   Not yet tested
     821             : !
     822             : !EOP
     823             : !---------------------------------------------------------------------
     824             : !BOC
     825             : ! !LOCAL VARIABLES:
     826             :       INTEGER  :: TruePE, Counter1, Counter2, Counter3
     827             :       INTEGER  :: I, J, K, L, M, N, SizeX, SizeY, SizeZ
     828             :       INTEGER  :: Imult, Jmult, Kmult
     829             : !
     830             :       CPP_ENTER_PROCEDURE( "DECOMPREGULAR3DORDER" )
     831             : !
     832             : ! Some sanity checks
     833             : !
     834             : !
     835             :       CPP_ASSERT_F90( NPEsX .EQ. SIZE( Xdist ) )
     836             :       CPP_ASSERT_F90( NPEsY .EQ. SIZE( Ydist ) )
     837             :       CPP_ASSERT_F90( NPEsZ .EQ. SIZE( Zdist ) )
     838             : 
     839        7680 :       IF ( Order=="xyz" ) THEN
     840             : ! Looks like:      TruePE = (K-1)*NPEsX*NPEsY + (J-1)*NPEsX + (I-1) + 1
     841           0 :         Imult = 1
     842           0 :         Jmult = NPEsX
     843           0 :         Kmult = NPEsX*NPEsY
     844        7680 :       ELSE IF ( Order=="xzy" ) THEN
     845             : ! Looks like:      TruePE = (J-1)*NPEsX*NPEsZ + (K-1)*NPEsX + (I-1) + 1
     846        7680 :         Imult = 1
     847        7680 :         Jmult = NPEsX*NPEsZ
     848        7680 :         Kmult = NPEsX
     849           0 :       ELSE IF ( Order=="yxz" ) THEN
     850             : ! Looks like:      TruePE = (K-1)*NPEsY*NPEsX + (I-1)*NPEsY + (J-1) + 1
     851           0 :         Imult = NPEsY
     852           0 :         Jmult = 1
     853           0 :         Kmult = NPEsX*NPEsY
     854           0 :       ELSE IF ( Order=="yzx" ) THEN
     855             : ! Looks like:      TruePE = (I-1)*NPEsY*NPEsZ + (K-1)*NPEsY + (J-1) + 1
     856           0 :         Imult = NPEsY*NPEsZ
     857           0 :         Jmult = 1
     858           0 :         Kmult = NPEsY
     859           0 :       ELSE IF ( Order=="zxy" ) THEN
     860             : ! Looks like:      TruePE = (J-1)*NPEsX*NPEsZ + (I-1)*NPEsZ + (K-1) + 1
     861           0 :         Imult = NPEsZ
     862           0 :         Jmult = NPEsX*NPEsZ
     863           0 :         Kmult = 1
     864           0 :       ELSE IF ( Order=="zyx" ) THEN
     865             : ! Looks like:      TruePE = (I-1)*NPEsY*NPEsZ + (J-1)*NPEsZ + (K-1) + 1
     866           0 :         Imult = NPEsY*NPEsZ
     867           0 :         Jmult = NPEsZ
     868           0 :         Kmult = 1
     869             :       ELSE 
     870             : ! Looks like:      TruePE = (K-1)*NPEsX*NPEsY + (J-1)*NPEsX + (I-1) + 1
     871           0 :         write(iulog,*) "Warning: DecompCreate3Dorder", Order, "not supported"
     872           0 :         write(iulog,*) "         Continuing with XYZ ordering"
     873           0 :         Imult = 1
     874           0 :         Jmult = NPEsX
     875           0 :         Kmult = NPEsX*NPEsY
     876             :       ENDIF
     877             : 
     878             : !
     879             : ! The head contains NPEs pointers to the tag lists.
     880             : !
     881       49152 :       SizeX = SUM(Xdist)
     882       66048 :       SizeY = SUM(Ydist)
     883      402432 :       SizeZ = SUM(Zdist)
     884        7680 :       Decomp%GlobalSize = SizeX * SizeY * SizeZ
     885       23040 :       ALLOCATE( Decomp%NumEntries( NPEsX*NPEsY*NPEsZ ) )
     886     4760064 :       ALLOCATE( Decomp%Head( NPEsX*NPEsY*NPEsZ ) )
     887        7680 :       Counter1 = 0
     888             : 
     889      402432 :       DO K = 1, NPEsZ
     890     2969088 :         DO J = 1, NPEsY
     891     7311360 :           DO I = 1, NPEsX
     892             : !
     893             : ! WARNING!!!!  The definition of the PE is Row-major ordering
     894             : !
     895             :             
     896     4737024 :             TruePE = (I-1)*Imult + (J-1)*Jmult + (K-1)*Kmult + 1 
     897             : !
     898             : ! The number of entries is the product of the local X, Y, Z allotment
     899             : !
     900     4737024 :             Decomp%NumEntries(TruePE) = Xdist(I)*Ydist(J)*Zdist(K)
     901             : !
     902             : ! For each Z there are Y separate runs
     903             : !
     904    14211072 :             ALLOCATE( Decomp%Head(TruePE)%StartTags(Ydist(J)*Zdist(K)) )
     905    14211072 :             ALLOCATE( Decomp%Head(TruePE)%EndTags(Ydist(J)*Zdist(K)) )
     906    14211072 :             ALLOCATE( Decomp%Head(TruePE)%Offsets(Ydist(J)*Zdist(K)) )
     907     4737024 :             Counter2 = Counter1
     908     4737024 :             L = 0
     909    18948096 :             DO N = 1, Zdist(K)
     910    14211072 :               Counter3 = Counter2
     911   555111936 :               DO M = 1, Ydist(J)
     912             : !
     913             : !     Since this is a regular distribution the definition of
     914             : !     tags is dictated by Xdist(I), and appears Ydist(J) times
     915             : !
     916             : !
     917   540900864 :                 L = L + 1
     918   540900864 :                 Decomp%Head(TruePE)%StartTags(L) = Counter3 + 1
     919   540900864 :                 Decomp%Head(TruePE)%EndTags(L) = Counter3 + Xdist(I)
     920   555111936 :                 Counter3 = Counter3 + SizeX
     921             :               ENDDO
     922    18948096 :               Counter2 = Counter2 + SizeX*SizeY
     923             :             ENDDO
     924     7311360 :             Counter1 = Counter1 + Xdist(I)
     925             :           ENDDO
     926             : !
     927             : ! Align the counter such that it points to the start of the next 
     928             : ! block.  (Ydist(J)-1) since already one X layer has been added in.
     929             : ! Implicit assumption that SizeX = SUM( Xdist )
     930             : !
     931     2969088 :           Counter1 = Counter1 + SizeX*(Ydist(J)-1)
     932             :         ENDDO
     933             : !
     934             : ! Align the counter such that it points to the start of the next 
     935             : ! block.  (Zdist(K)-1) since already one X-Y layer has been added in.
     936             : ! Implicit assumption that SizeY = SUM( Ydist )
     937             : !
     938      402432 :         Counter1 = Counter1 + SizeX*SizeY*(Zdist(K)-1)
     939             :       ENDDO
     940             : !
     941             : ! Calculate offsets
     942             : !
     943     4744704 :       DO I=1, NPEsX*NPEsY*NPEsZ
     944     4744704 :         IF ( SIZE(Decomp%Head(I)%StartTags) > 0 ) THEN
     945     4737024 :           Decomp%Head(I)%Offsets(1) = 0
     946   540900864 :           DO J=2, SIZE(Decomp%Head(I)%StartTags)
     947   536163840 :             Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) +      &
     948  1077064704 :             Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
     949             :           ENDDO
     950             :         ENDIF
     951             :       ENDDO
     952             : 
     953        7680 :       Decomp%Defined = .TRUE.
     954             : 
     955             :       CPP_LEAVE_PROCEDURE( "DECOMPREGULAR3DORDER" )
     956        7680 :       RETURN
     957             : !EOC
     958             :       END SUBROUTINE DecompRegular3DOrder
     959             : !------------------------------------------------------------------------
     960             : 
     961             : 
     962             : !------------------------------------------------------------------------
     963             : !BOP
     964             : ! !IROUTINE: DecompRegular4D --- Create a decomposition for a 3-D grid
     965             : !
     966             : ! !INTERFACE:
     967           0 :       SUBROUTINE DecompRegular4D ( NPEsX, NPEsY, NPEsZ, NPEsT,          &
     968           0 :                                    Xdist, Ydist, Zdist, Tdist, Decomp )
     969             : ! !USES:
     970             :       IMPLICIT NONE
     971             : !
     972             : ! !INPUT PARAMETERS:
     973             :       INTEGER, INTENT( IN )            :: NPEsX    ! Number of PEs in X
     974             :       INTEGER, INTENT( IN )            :: NPEsY    ! Number of PEs in Y
     975             :       INTEGER, INTENT( IN )            :: NPEsZ    ! Number of PEs in Z
     976             :       INTEGER, INTENT( IN )            :: NPEsT    ! Number of PEs in T
     977             :       INTEGER, INTENT( IN )            :: Xdist(:) ! Distribution in X
     978             :       INTEGER, INTENT( IN )            :: Ydist(:) ! Distribution in Y
     979             :       INTEGER, INTENT( IN )            :: Zdist(:) ! Distribution in Z
     980             :       INTEGER, INTENT( IN )            :: Tdist(:) ! Distribution in T
     981             : !
     982             : ! !OUTPUT PARAMETERS:
     983             :       TYPE(DecompType), INTENT( OUT )  :: Decomp  ! Decomp information
     984             : !
     985             : !
     986             : ! !DESCRIPTION:
     987             : !     Creates a decomposition for a regular 4-D grid.  The
     988             : !     decomposition is given through the Xdist, Ydist, Zdist, Tdist
     989             : !     distributions, which contain the number of entries on 
     990             : !     each PE in that dimension.    This routine thus defines
     991             : !     a parallelopiped (SOMA-block) distribution.
     992             : !
     993             : ! !SYSTEM ROUTINES:
     994             : !     ALLOCATE
     995             : !
     996             : ! !REVISION HISTORY:
     997             : !   02.12.23   Sawyer     Creation from DecompRegular4D
     998             : !
     999             : ! !BUGS:
    1000             : !     This routine makes the assumption that the sum of the
    1001             : !     distribution in each dimension adds up to the total 
    1002             : !     number of entries in that dimension.  It will cause
    1003             : !     problems if the actual local arrays are over- or 
    1004             : !     under-allocated.  For example, if the local array is
    1005             : !     allocated statically for the maximum size of the
    1006             : !     array on any processor, problems will occur on those
    1007             : !     PEs which have less than the maximum.
    1008             : !
    1009             : !EOP
    1010             : !------------------------------------------------------------------------
    1011             : !BOC
    1012             : !
    1013             : ! !LOCAL VARIABLES:
    1014             :       INTEGER  :: TruePE, Counter1, Counter2, Counter3, Counter4
    1015             :       INTEGER  :: I, J, K, L, M, N, P, T, SizeX, SizeY, SizeZ, SizeT
    1016             : !
    1017             :       CPP_ENTER_PROCEDURE( "DECOMPREGULAR4D" )
    1018             : !
    1019             : ! Some sanity checks
    1020             : !
    1021             : !
    1022             :       CPP_ASSERT_F90( NPEsX .EQ. SIZE( Xdist ) )
    1023             :       CPP_ASSERT_F90( NPEsY .EQ. SIZE( Ydist ) )
    1024             :       CPP_ASSERT_F90( NPEsZ .EQ. SIZE( Zdist ) )
    1025             :       CPP_ASSERT_F90( NPEsT .EQ. SIZE( Tdist ) )
    1026             : !
    1027             : ! The head contains NPEs pointers to the tag lists.
    1028             : !
    1029           0 :       SizeX = SUM(Xdist)
    1030           0 :       SizeY = SUM(Ydist)
    1031           0 :       SizeZ = SUM(Zdist)
    1032           0 :       SizeT = SUM(Tdist)
    1033           0 :       Decomp%GlobalSize = SizeX * SizeY * SizeZ * SizeT
    1034           0 :       ALLOCATE( Decomp%NumEntries( NPEsX*NPEsY*NPEsZ*NPEsT ) )
    1035           0 :       ALLOCATE( Decomp%Head( NPEsX*NPEsY*NPEsZ*NPEsT ) )
    1036           0 :       Counter1 = 0
    1037           0 :       DO T = 1, NPEsT
    1038           0 :         DO K = 1, NPEsZ
    1039           0 :           DO J = 1, NPEsY
    1040           0 :             DO I = 1, NPEsX
    1041             : !
    1042             : ! WARNING!!!!  The definition of the PE is Row-major ordering
    1043             : !
    1044             :               TruePE = (T-1)*NPEsX*NPEsY*NPEsZ +                      &
    1045           0 :                        (K-1)*NPEsX*NPEsY + (J-1)*NPEsX + I 
    1046           0 :               NULLIFY( Decomp%Head(TruePE)%StartTags )
    1047           0 :               NULLIFY( Decomp%Head(TruePE)%EndTags )
    1048           0 :               NULLIFY( Decomp%Head(TruePE)%Offsets )
    1049             : !
    1050             : ! The number of entries is the product of the local X, Y, Z allotment
    1051             : !
    1052           0 :               Decomp%NumEntries(TruePE) =                             &
    1053           0 :                      Xdist(I)*Ydist(J)*Zdist(K)*Tdist(T)
    1054             : !
    1055             : ! For each Z there are Y separate runs
    1056             : !
    1057           0 :               ALLOCATE( Decomp%Head(TruePE)%StartTags(Ydist(J)*Zdist(K)*Tdist(T)) )
    1058           0 :               ALLOCATE( Decomp%Head(TruePE)%EndTags(Ydist(J)*Zdist(K)*Tdist(T)) )
    1059           0 :               ALLOCATE( Decomp%Head(TruePE)%Offsets(Ydist(J)*Zdist(K)*Tdist(T)) )
    1060           0 :               Counter2 = Counter1     ! Base address
    1061           0 :               L = 0
    1062           0 :               DO P = 1, Tdist(T)
    1063           0 :                 Counter3 = Counter2
    1064           0 :                 DO N = 1, Zdist(K)
    1065           0 :                   Counter4 = Counter3
    1066           0 :                   DO M = 1, Ydist(J)
    1067             : !
    1068             : !     Since this is a regular distribution the definition of
    1069             : !     tags is dictated by Xdist(I), and appears Ydist(J) times
    1070             : !
    1071             : !
    1072           0 :                     L = L + 1
    1073           0 :                     Decomp%Head(TruePE)%StartTags(L) = Counter4 + 1
    1074           0 :                     Decomp%Head(TruePE)%EndTags(L) = Counter4 + Xdist(I)
    1075           0 :                     Counter4 = Counter4 + SizeX
    1076             :                   ENDDO
    1077           0 :                   Counter3 = Counter3 + SizeX*SizeY
    1078             :                 ENDDO
    1079           0 :                 Counter2 = Counter2 + SizeX*SizeY*SizeZ
    1080             :               ENDDO
    1081           0 :               Counter1 = Counter1 + Xdist(I)   ! Increment base address
    1082             :             ENDDO
    1083             : !
    1084             : ! Align the counter such that it points to the start of the next 
    1085             : ! block.  (Ydist(J)-1) since already one X layer has been added in.
    1086             : ! Implicit assumption that SizeX = SUM( Xdist )
    1087             : !
    1088           0 :             Counter1 = Counter1 + SizeX*(Ydist(J)-1)
    1089             :           ENDDO
    1090             : !
    1091             : ! Align the counter such that it points to the start of the next 
    1092             : ! block.  (Zdist(K)-1) since already one X-Y layer has been added in.
    1093             : ! Implicit assumption that SizeY = SUM( Ydist )
    1094             : !
    1095           0 :           Counter1 = Counter1 + SizeX*SizeY*(Zdist(K)-1)
    1096             :         ENDDO
    1097           0 :         Counter1 = Counter1 + SizeX*SizeY*SizeZ*(Tdist(T)-1)
    1098             :       ENDDO
    1099             : !
    1100             : ! Calculate offsets
    1101             : !
    1102           0 :       DO I=1, NPEsX*NPEsY*NPEsZ*NPEsT
    1103           0 :         IF ( SIZE(Decomp%Head(I)%StartTags) > 0 ) THEN
    1104           0 :           Decomp%Head(I)%Offsets(1) = 0
    1105           0 :           DO J=2, SIZE(Decomp%Head(I)%StartTags)
    1106           0 :             Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) +      &
    1107           0 :             Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
    1108             :           ENDDO
    1109             :         ENDIF
    1110             :       ENDDO
    1111             : 
    1112           0 :       Decomp%Defined = .TRUE.
    1113             : 
    1114             :       CPP_LEAVE_PROCEDURE( "DECOMPREGULAR4D" )
    1115           0 :       RETURN
    1116             : !EOC
    1117             :       END SUBROUTINE DecompRegular4D
    1118             : !------------------------------------------------------------------------
    1119             : 
    1120             : 
    1121             : !------------------------------------------------------------------------
    1122             : !BOP
    1123             : ! !IROUTINE: DecompCreateIrr --- Decomposition for an irregular mesh
    1124             : !
    1125             : ! !INTERFACE:
    1126           0 :       SUBROUTINE DecompCreateIrr( NPEs, Pe, TotalPts, Decomp )
    1127             : ! !USES:
    1128             :       IMPLICIT NONE
    1129             : !
    1130             : ! !INPUT PARAMETERS:
    1131             :       INTEGER, INTENT( IN )            :: NPEs     ! Number of PEs
    1132             :       INTEGER, INTENT( IN )            :: Pe(:)    ! Processor location
    1133             :       INTEGER, INTENT( IN )            :: TotalPts ! Number of points
    1134             : !
    1135             : ! !OUTPUT PARAMETERS:
    1136             :       TYPE(DecompType), INTENT( OUT )  :: Decomp  ! Decomp information
    1137             : !
    1138             : !
    1139             : ! !DESCRIPTION:
    1140             : !     Creates a decomposition for a irregular 1-D mesh.  The
    1141             : !     decomposition is given through the number of points and
    1142             : !     an array containing the PE which each point is mapped to.
    1143             : !     This mapping naturally assumes that the local numbering
    1144             : !     is incrementally increasing as points are mapped to PEs.
    1145             : !     This assumption is sufficient for most applications, but
    1146             : !     another irregular mapping routine is available for more
    1147             : !     complex cases.
    1148             : !
    1149             : ! !SYSTEM ROUTINES:
    1150             : !     ALLOCATE
    1151             : !
    1152             : ! !REVISION HISTORY:
    1153             : !   98.01.19   Sawyer     Creation, with requirements from Jay Larson
    1154             : !   98.11.02   Sawyer     Rewritten to requirements for Andrea Molod
    1155             : !   00.07.07   Sawyer     Removed use of DimSizes(:) array
    1156             : !   00.11.12   Sawyer     Changed argument order for overloading
    1157             : !
    1158             : !EOP
    1159             : !------------------------------------------------------------------------
    1160             : !BOC
    1161             : !
    1162             : ! !LOCAL VARIABLES:
    1163             :       INTEGER  :: I, J, PEhold
    1164           0 :       INTEGER  :: Counter( NPEs )
    1165             : !
    1166             :       CPP_ENTER_PROCEDURE( "DECOMPCREATEIRR" )
    1167             : !
    1168             :       CPP_ASSERT_F90( TotalPts .LE. SIZE( PE ) )
    1169             : 
    1170             : !
    1171             : ! The head contains NPEs pointers to the tag lists.
    1172             : !
    1173           0 :       Decomp%GlobalSize = TotalPts
    1174           0 :       ALLOCATE( Decomp%NumEntries( NPEs ) )
    1175           0 :       ALLOCATE( Decomp%Head( NPEs ) )
    1176             : !
    1177             : ! Perform over all points in the mapping
    1178             : !
    1179           0 :       PEhold= -1
    1180           0 :       Counter = 0
    1181           0 :       Decomp%NumEntries = 0
    1182           0 :       DO I=1, TotalPts
    1183             :         CPP_ASSERT_F90( ( PE( I ) .LT. NPEs .AND. PE( I ) .GE. 0 ) )
    1184           0 :         IF ( PE( I ) .NE. PEhold ) THEN
    1185           0 :           PEhold = PE( I )
    1186           0 :           Counter( PEhold+1 ) = Counter( PEhold+1 ) + 1
    1187             :         ENDIF
    1188           0 :         Decomp%NumEntries(PEHold+1) = Decomp%NumEntries(PEHold+1) + 1
    1189             :       ENDDO
    1190           0 :       DO I=1, NPEs
    1191             : !
    1192             : ! Now the amount of space to allocate is known.  It is acceptable
    1193             : ! to in allocated an array of size 0 (F90 Handbook, Section 6.5.1)
    1194             : !
    1195           0 :         ALLOCATE( Decomp%Head(I)%StartTags(Counter(I)) )
    1196           0 :         ALLOCATE( Decomp%Head(I)%EndTags(Counter(I)) )
    1197           0 :         ALLOCATE( Decomp%Head(I)%Offsets(Counter(I)) )
    1198             :       ENDDO
    1199             : !
    1200             : ! Perform over all points in the mapping
    1201             : !
    1202           0 :       PEhold= -1
    1203           0 :       Counter = 0
    1204           0 :       DO I=1, TotalPts
    1205           0 :         IF ( PE( I ) .NE. PEhold ) THEN
    1206             : !
    1207             : ! If not first entry, close up shop on previous run
    1208             : !
    1209           0 :           IF ( I .GT. 1 ) THEN
    1210           0 :             Decomp%Head(PEhold+1)%EndTags(Counter(PEhold+1)) = I-1
    1211             :           ENDIF
    1212           0 :           PEhold = PE( I )
    1213           0 :           Counter( PEhold+1 ) = Counter( PEhold+1 ) + 1
    1214           0 :           Decomp%Head(PEhold+1)%StartTags(Counter(PEhold+1)) = I
    1215             :         ENDIF
    1216             :       ENDDO
    1217             : !
    1218             : ! Clean up shop for the final run
    1219             : !
    1220           0 :       IF ( TotalPts .GT. 0 ) THEN
    1221           0 :         Decomp%Head(PEhold+1)%EndTags(Counter(PEhold+1)) = TotalPts
    1222             :       ENDIF
    1223             : 
    1224             : !
    1225             : ! Calculate offsets
    1226             : !
    1227           0 :       DO I=1, NPEs
    1228           0 :         IF ( Counter(I) > 0 ) THEN
    1229           0 :           Decomp%Head(I)%Offsets(1) = 0
    1230           0 :           DO J=2, Counter(I)
    1231           0 :             Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) +    &
    1232           0 :             Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
    1233             :           ENDDO
    1234             :         ENDIF
    1235             :       ENDDO
    1236           0 :       Decomp%Defined = .TRUE.
    1237             : 
    1238             :       CPP_LEAVE_PROCEDURE( "DECOMPCREATEIRR" )
    1239           0 :       RETURN
    1240             : !EOC
    1241             :       END SUBROUTINE DecompCreateIrr
    1242             : !------------------------------------------------------------------------
    1243             : 
    1244             : 
    1245             : !------------------------------------------------------------------------
    1246             : !BOP
    1247             : ! !IROUTINE: DecompCreateTags --- Decomposition from Pe and Tags
    1248             : !
    1249             : ! !INTERFACE:
    1250       15360 :       SUBROUTINE DecompCreateTags(Npes, Pe, TotalPts, Tags, Decomp )
    1251             : ! !USES:
    1252             :       IMPLICIT NONE
    1253             : !
    1254             : ! !INPUT PARAMETERS:
    1255             :       INTEGER, INTENT( IN )            :: NPEs     ! Number of PEs
    1256             :       INTEGER, INTENT( IN )            :: Pe(:)    ! Processor location
    1257             :       INTEGER, INTENT( IN )            :: TotalPts ! Number of points
    1258             :       INTEGER, INTENT( IN )            :: Tags(:)  ! Global index
    1259             : !
    1260             : ! !OUTPUT PARAMETERS:
    1261             :       TYPE(DecompType), INTENT( OUT )  :: Decomp   ! Decomp information
    1262             : !
    1263             : !
    1264             : ! !DESCRIPTION:
    1265             : !     Creates a decomposition for a irregular mesh from the 
    1266             : !     Pe ownership and the Tags.  This is a simple extension of 
    1267             : !     DecompCreateIrr (previously DecompIrregular1D) but is
    1268             : !     much more dangerous, since the user can define the Tags
    1269             : !     (global indices) arbitrarily.
    1270             : !
    1271             : ! !SYSTEM ROUTINES:
    1272             : !     ALLOCATE
    1273             : !
    1274             : ! !REVISION HISTORY:
    1275             : !   00.11.12   Sawyer     Creation from DecompCreateIrr
    1276             : !
    1277             : !EOP
    1278             : !------------------------------------------------------------------------
    1279             : !BOC
    1280             : !
    1281             : ! !LOCAL VARIABLES:
    1282             :       INTEGER  :: I, J, PEhold, LastTag
    1283       30720 :       INTEGER  :: Counter( NPEs )
    1284             : !
    1285             :       CPP_ENTER_PROCEDURE( "DECOMPCREATETAGS" )
    1286             : !
    1287             :       CPP_ASSERT_F90( TotalPts .LE. SIZE( PE ) )
    1288             :       CPP_ASSERT_F90( TotalPts .LE. SIZE( Tags ) )
    1289             : 
    1290             : !
    1291             : ! The head contains NPEs pointers to the tag lists.
    1292             : !
    1293       15360 :       Decomp%GlobalSize = TotalPts
    1294       46080 :       ALLOCATE( Decomp%NumEntries( NPEs ) )
    1295    11842560 :       ALLOCATE( Decomp%Head( NPEs ) )
    1296             : !
    1297             : ! Perform over all points in the mapping
    1298             : !
    1299       15360 :       PEhold  = -1
    1300       15360 :       LastTag = -999999999
    1301    11811840 :       Counter = 0
    1302    11811840 :       Decomp%NumEntries = 0
    1303   136707072 :       DO I=1, TotalPts
    1304             :         CPP_ASSERT_F90( PE( I ) .LT. NPEs .AND. PE( I ) .GE. 0 )
    1305   136691712 :         IF ( LastTag==0 .OR. Tags(I)/=LastTag+1 .OR. PE(I)/=PEhold ) THEN
    1306     1552056 :           PEhold = PE( I )
    1307     1552056 :           Counter( PEhold+1 ) = Counter( PEhold+1 ) + 1
    1308             :         ENDIF
    1309   136691712 :         Decomp%NumEntries(PEHold+1) = Decomp%NumEntries(PEHold+1) + 1
    1310   136707072 :         LastTag = Tags(I)
    1311             :       ENDDO
    1312             : 
    1313    11811840 :       DO I=1, NPEs
    1314             : !
    1315             : ! Now the amount of space to allocate is known.  It is acceptable
    1316             : ! to in allocated an array of size 0 (F90 Handbook, Section 6.5.1)
    1317             : !
    1318    23612672 :         ALLOCATE( Decomp%Head(I)%StartTags(Counter(I)) )
    1319    23612672 :         ALLOCATE( Decomp%Head(I)%EndTags(Counter(I)) )
    1320    23628032 :         ALLOCATE( Decomp%Head(I)%Offsets(Counter(I)) )
    1321             :       ENDDO
    1322             : 
    1323             : !
    1324             : ! Perform over all points in the domain
    1325             : !
    1326    11811840 :       PEhold  = -1
    1327    11811840 :       LastTag = -999999999
    1328    11811840 :       Counter = 0
    1329   136707072 :       DO I=1, TotalPts
    1330   136691712 :         IF ( LastTag==0 .OR. Tags(I)/=LastTag+1 .OR. PE(I)/=PEhold ) THEN
    1331             : !
    1332             : ! If not first entry, close up shop on previous run
    1333             : !
    1334     1552056 :           IF ( I .GT. 1 ) THEN
    1335     1536952 :             Decomp%Head(PEhold+1)%EndTags(Counter(PEhold+1)) = LastTag
    1336             :           ENDIF
    1337     1552056 :           PEhold = PE( I )
    1338     1552056 :           Counter( PEhold+1 ) = Counter( PEhold+1 ) + 1
    1339     1552056 :           Decomp%Head(PEhold+1)%StartTags(Counter(PEhold+1)) = Tags(I)
    1340             :         ENDIF
    1341   136707072 :         LastTag = Tags(I)
    1342             :       ENDDO
    1343             : !
    1344             : ! Clean up shop for the final run
    1345             : !
    1346       15360 :       IF ( TotalPts .GT. 0 ) THEN
    1347       15104 :         Decomp%Head(PEhold+1)%EndTags(Counter(PEhold+1)) =Tags(TotalPts)
    1348             :       ENDIF
    1349             : 
    1350             : !
    1351             : ! Calculate offsets
    1352             : !
    1353    11811840 :       DO I=1, NPEs
    1354    11811840 :         IF ( Counter(I) > 0 ) THEN
    1355       19712 :           Decomp%Head(I)%Offsets(1) = 0
    1356     1552056 :           DO J=2, Counter(I)
    1357     1532344 :             Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) +      &
    1358     3084400 :             Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
    1359             :           ENDDO
    1360             :         ENDIF
    1361             :       ENDDO
    1362       15360 :       Decomp%Defined = .TRUE.
    1363             : 
    1364             :       CPP_LEAVE_PROCEDURE( "DECOMPCREATETAGS" )
    1365       15360 :       RETURN
    1366             : !EOC
    1367             :       END SUBROUTINE DecompCreateTags
    1368             : !------------------------------------------------------------------------
    1369             : 
    1370             : 
    1371             : !------------------------------------------------------------------------
    1372             : !BOP
    1373             : ! !IROUTINE: DecompG2L --- Map global index to local and PE
    1374             : !
    1375             : ! !INTERFACE:
    1376   424958976 :       SUBROUTINE DecompG2L ( Decomp, Global, Local, Pe )
    1377             : ! !USES:
    1378             :       IMPLICIT NONE
    1379             : !
    1380             : ! !INPUT PARAMETERS:
    1381             :       TYPE(DecompType), INTENT( IN )   :: Decomp  ! Decomp information
    1382             :       INTEGER, INTENT( IN )            :: Global  ! Global index
    1383             : !
    1384             : ! !OUTPUT PARAMETERS:
    1385             : 
    1386             :       INTEGER, INTENT( OUT )  :: Local            ! Local index
    1387             :       INTEGER, INTENT( OUT )  :: Pe               ! Pe location
    1388             : !
    1389             : !
    1390             : ! !DESCRIPTION:
    1391             : !     Given a decomposition and a global index, this routine returns
    1392             : !     the local index and PE location of that global tag.  If the
    1393             : !     global index is not found, Local = 0, Pe = -1 is returned.
    1394             : !
    1395             : !     Note that this routine is not efficient by any stretch of the 
    1396             : !     imagination --- only one index can be converted at a time.
    1397             : !     In addition, a search procedure must be performed, whose 
    1398             : !     efficiency is inversely proportional to the size of the decomposition
    1399             : !     (in particular, to the number of "runs").  Conceptually this
    1400             : !     mapping should be used only once in the program for
    1401             : !     initialization, and subsequently all calculations should take
    1402             : !     place using local indices.
    1403             : !
    1404             : ! !SYSTEM ROUTINES:
    1405             : !     SIZE
    1406             : !
    1407             : ! !REVISION HISTORY:
    1408             : !   98.03.20   Sawyer     Creation
    1409             : !   01.03.17   Sawyer     Test for Global==0 (undefined element)
    1410             : !   02.11.22   Sawyer     Optimized by caching previously used block
    1411             : !EOP
    1412             : !------------------------------------------------------------------------
    1413             : !BOC
    1414             : !
    1415             : ! !LOCAL VARIABLES:
    1416             :       INTEGER, SAVE  :: Ipe   =  0   ! Initial process ID
    1417             :       INTEGER, SAVE  :: J     =  0   ! Initial DO loop value
    1418             :       INTEGER        :: Ipeold, Jold, PEsize, Jsize
    1419             : !
    1420             :       CPP_ENTER_PROCEDURE( "DECOMPG2L" )
    1421             : 
    1422             : !
    1423             : ! Search over all the PEs
    1424             : !
    1425   424958976 :       Pe    = -1
    1426   424958976 :       Local = 0
    1427   424958976 :       IF ( Global == 0 ) RETURN          ! quick return
    1428   424958976 :       PEsize = SIZE( Decomp%Head )
    1429   424958976 :       IF ( Ipe >= PEsize ) Ipe = 0
    1430   424958976 :       Ipeold= Ipe
    1431             : PEs:  DO                                 ! Loop over all PEs starting 
    1432   995747326 :         Jsize = SIZE( Decomp%Head(Ipe+1)%StartTags )
    1433   995747326 :         IF ( J >= Jsize ) J = 0
    1434   995747326 :         Jold = J                         ! from the PE used previously
    1435 >10856*10^7 : Blocks: DO WHILE (Jsize > 0)             ! Loop through data segments
    1436 >21713*10^7 :           IF ( Global >= Decomp%Head(Ipe+1)%StartTags(J+1) .AND.          &
    1437 >10856*10^7 :                Global <= Decomp%Head(Ipe+1)%EndTags(J+1) ) THEN
    1438           0 :             Local = Decomp%Head(Ipe+1)%Offsets(J+1) + Global -            &
    1439   424958976 :                     Decomp%Head(Ipe+1)%StartTags(J+1) + 1
    1440   424958976 :             Pe = Ipe
    1441   424958976 :             EXIT PEs                     ! Global tag has been found
    1442             :           ELSE
    1443 >10814*10^7 :             J = MOD(J+1,Jsize)           ! Increment the block index
    1444             :           ENDIF
    1445 >10814*10^7 :           IF ( J == Jold ) EXIT Blocks   ! Global tag not on this PE
    1446             :         ENDDO Blocks
    1447   570788350 :         Ipe = MOD(Ipe+1,PEsize)          ! Increment the pe number
    1448   570788350 :         J   = 0
    1449   570788350 :         IF ( Ipe == Ipeold ) EXIT PEs    ! Global tag not found on any PE
    1450             :       ENDDO PEs
    1451             : 
    1452             :       CPP_ASSERT_F90( Local .LE. Decomp%NumEntries(Pe+1) ) 
    1453             : 
    1454             :       CPP_LEAVE_PROCEDURE( "DECOMPG2L" )
    1455             :       RETURN
    1456             : !
    1457             : !EOC
    1458             :       END SUBROUTINE DecompG2L
    1459             : !------------------------------------------------------------------------
    1460             : 
    1461             : 
    1462             : !------------------------------------------------------------------------
    1463             : !BOP
    1464             : ! !IROUTINE: DecompG2LVector --- Map global index to local and PE
    1465             : !
    1466             : ! !INTERFACE:
    1467           0 :       SUBROUTINE DecompG2LVector ( Decomp, N, Global, Local, Pe )
    1468             : ! !USES:
    1469             :       IMPLICIT NONE
    1470             : !
    1471             : ! !INPUT PARAMETERS:
    1472             :       TYPE(DecompType), INTENT( IN ):: Decomp     ! Decomp information
    1473             :       INTEGER, INTENT( IN )         :: N          ! Number of indices
    1474             :       INTEGER, INTENT( IN )         :: Global(:)  ! Global index
    1475             : !
    1476             : ! !OUTPUT PARAMETERS:
    1477             : 
    1478             :       INTEGER, INTENT( OUT )  :: Local(:)         ! Local index
    1479             :       INTEGER, INTENT( OUT )  :: Pe(:)            ! Pe location
    1480             : !
    1481             : !
    1482             : ! !DESCRIPTION:
    1483             : !     Given a decomposition and a global index, this routine returns
    1484             : !     the local index and PE location of that global tag.  If the
    1485             : !     global index is not found, Local = 0, Pe = -1 is returned.
    1486             : !
    1487             : !     Note that this routine is not efficient by any stretch of the 
    1488             : !     imagination --- only one index can be converted at a time.
    1489             : !     In addition, a search procedure must be performed, whose 
    1490             : !     efficiency is inversely proportional to the size of the decomposition
    1491             : !     (in particular, to the number of "runs").  Conceptually this
    1492             : !     mapping should be used only once in the program for
    1493             : !     initialization, and subsequently all calculations should take
    1494             : !     place using local indices.
    1495             : !
    1496             : ! !SYSTEM ROUTINES:
    1497             : !     SIZE
    1498             : !
    1499             : ! !REVISION HISTORY:
    1500             : !   02.11.09   Sawyer     Creation from decompglobaltolocal
    1501             : !   02.11.22   Sawyer     Optimized by caching previously used block
    1502             : !
    1503             : !EOP
    1504             : !------------------------------------------------------------------------
    1505             : !BOC
    1506             : !
    1507             : ! !LOCAL VARIABLES:
    1508             :       INTEGER, SAVE :: J    = 0   ! Initial value
    1509             :       INTEGER, SAVE :: Ipe  = 0   ! Initial value
    1510             :       INTEGER  :: I, Ipeold, Jold, PEsize, Jsize
    1511             : !
    1512             :       CPP_ENTER_PROCEDURE( "DECOMPG2LVECTOR" )
    1513             : 
    1514           0 :       PEsize = SIZE( Decomp%Head )
    1515             : !
    1516             : ! Search over all the PEs
    1517             : !
    1518           0 :       DO I=1, N
    1519           0 :         Pe(I) = -1
    1520           0 :         Local(I) = 0
    1521           0 :         IF ( Global(I) == 0 ) CYCLE
    1522           0 :         IF ( Ipe >= PEsize ) Ipe = 0
    1523           0 :         Ipeold= Ipe
    1524           0 : PEs:    DO WHILE ( PEsize > 0 )            ! Loop over all PEs starting 
    1525           0 :           Jsize = SIZE( Decomp%Head(Ipe+1)%StartTags )
    1526           0 :           IF ( J >= Jsize ) J = 0
    1527           0 :           Jold = J                         ! from the PE used previously
    1528           0 : Blocks: DO WHILE (Jsize > 0)               ! Loop through data segments
    1529           0 :             IF ( Global(I) >= Decomp%Head(Ipe+1)%StartTags(J+1) .AND.        &
    1530           0 :                Global(I) <= Decomp%Head(Ipe+1)%EndTags(J+1) ) THEN
    1531           0 :               Local(I) = Decomp%Head(Ipe+1)%Offsets(J+1) + Global(I) -       &
    1532           0 :                          Decomp%Head(Ipe+1)%StartTags(J+1) + 1
    1533           0 :               Pe(I) = Ipe
    1534           0 :               EXIT PEs                     ! Global tag has been found
    1535             :             ELSE
    1536           0 :               J = MOD(J+1,Jsize)           ! Increment the block index
    1537             :             ENDIF
    1538           0 :             IF ( J == Jold ) EXIT Blocks   ! Global tag not on this PE
    1539             :           ENDDO Blocks
    1540           0 :           Ipe = MOD(Ipe+1,PEsize)          ! Increment the pe number
    1541           0 :           J   = 0
    1542           0 :           IF ( Ipe == Ipeold ) EXIT PEs    ! Global tag not found on any PE
    1543             :         ENDDO PEs
    1544             : 
    1545             :       CPP_ASSERT_F90( Local(I) .LE. Decomp%NumEntries(Pe(I)+1) ) 
    1546             : 
    1547             :       ENDDO
    1548             :       CPP_LEAVE_PROCEDURE( "DECOMPG2LVECTOR" )
    1549           0 :       RETURN
    1550             : !
    1551             : !EOC
    1552             :     END SUBROUTINE DecompG2LVector
    1553             : !------------------------------------------------------------------------
    1554             : 
    1555             : 
    1556             : !------------------------------------------------------------------------
    1557             : !BOP
    1558             : ! !IROUTINE: DecompL2G --- Map global index to local and PE
    1559             : !
    1560             : ! !INTERFACE:
    1561           0 :       SUBROUTINE DecompL2G ( Decomp, Local, Pe, Global )
    1562             : ! !USES:
    1563             :       IMPLICIT NONE
    1564             : !
    1565             : ! !INPUT PARAMETERS:
    1566             :       TYPE(DecompType), INTENT( IN )   :: Decomp  ! Decomp information
    1567             :       INTEGER, INTENT( IN )            :: Local   ! Local index
    1568             :       INTEGER, INTENT( IN )            :: Pe      ! Pe location
    1569             : !
    1570             : ! !OUTPUT PARAMETERS:
    1571             :       INTEGER, INTENT( OUT )           :: Global  ! Global index
    1572             : !
    1573             : !
    1574             : ! !DESCRIPTION:
    1575             : !     Given a decomposition and a local-pe index pair, this routine 
    1576             : !     returns  the 2-D global index. If the local index is not found, 
    1577             : !     0 is returned. 
    1578             : !
    1579             : !     Note that this routine is not efficient by any stretch of the 
    1580             : !     imagination --- only one index can be converted at a time.
    1581             : !     In addition, a search procedure must be performed, whose 
    1582             : !     efficiency is inversely proportional to the size of the 
    1583             : !     decomposition (in particular, to the number of "runs").  
    1584             : !     Conceptually this mapping should be used only once in the 
    1585             : !     program for initialization, and subsequently all calculations 
    1586             : !     should take place using local indices.
    1587             : !
    1588             : ! !SYSTEM ROUTINES:
    1589             : !     SIZE
    1590             : !
    1591             : ! !REVISION HISTORY:
    1592             : !   98.03.20   Sawyer     Creation
    1593             : !
    1594             : !EOP
    1595             : !------------------------------------------------------------------------
    1596             : !BOC
    1597             : !
    1598             : ! !LOCAL VARIABLES:
    1599             :       INTEGER  :: J, Counter
    1600             :       LOGICAL  :: Found
    1601             : !
    1602             :       CPP_ENTER_PROCEDURE( "DECOMPL2G" )
    1603             :       CPP_ASSERT_F90( Pe .GE. 0 )
    1604             :       CPP_ASSERT_F90( Pe .LT. SIZE(Decomp%Head) )
    1605             :       CPP_ASSERT_F90( Local .GT. 0 )
    1606             :       CPP_ASSERT_F90( Local .LE. Decomp%NumEntries(Pe+1) )
    1607             : 
    1608           0 :       Counter = 0
    1609           0 :       Found   = .FALSE.
    1610           0 :       J = 0
    1611           0 :       DO WHILE ( .NOT. Found )
    1612           0 :         J = J+1
    1613           0 :         Counter = Counter + Decomp%Head(Pe+1)%EndTags(J) -               &
    1614           0 :                             Decomp%Head(Pe+1)%StartTags(J) + 1
    1615           0 :         IF ( Local .LE.  Counter ) THEN
    1616           0 :           Found = .TRUE.
    1617             : !
    1618             : ! The following calculation is not immediately obvious.  Think about it
    1619             : !
    1620           0 :           Global = Local - Counter + Decomp%Head(Pe+1)%EndTags(J)
    1621           0 :           Found = .TRUE.
    1622           0 :         ELSEIF ( J .GE. SIZE( Decomp%Head(Pe+1)%StartTags ) ) THEN
    1623             : !
    1624             : ! Emergency brake
    1625             : !
    1626           0 :           Found = .TRUE.
    1627           0 :           Global = 0
    1628             :         ENDIF
    1629             :       ENDDO
    1630             : 
    1631             :       CPP_LEAVE_PROCEDURE( "DECOMPL2G" )
    1632           0 :       RETURN
    1633             : !
    1634             : !EOC
    1635             :       END SUBROUTINE DecompL2G
    1636             : !------------------------------------------------------------------------
    1637             : 
    1638             : 
    1639             : !------------------------------------------------------------------------
    1640             : !BOP
    1641             : ! !IROUTINE: DecompL2GVector --- Map global index to local and PE
    1642             : !
    1643             : ! !INTERFACE:
    1644           0 :       SUBROUTINE DecompL2GVector ( Decomp, N, Local, Pe, Global )
    1645             : ! !USES:
    1646             :       IMPLICIT NONE
    1647             : !
    1648             : ! !INPUT PARAMETERS:
    1649             :       TYPE(DecompType), INTENT( IN )   :: Decomp  ! Decomp information
    1650             :       INTEGER, INTENT( IN )            :: N       ! Number of indices
    1651             :       INTEGER, INTENT( IN )            :: Local(:)! Local index
    1652             :       INTEGER, INTENT( IN )            :: Pe(:)   ! Pe location
    1653             : !
    1654             : ! !OUTPUT PARAMETERS:
    1655             :       INTEGER, INTENT( OUT )           :: Global(:) ! Global index
    1656             : !
    1657             : !
    1658             : ! !DESCRIPTION:
    1659             : !     Given a decomposition and a local-pe index pair, this routine 
    1660             : !     returns  the 2-D global index. If the local index is not found, 
    1661             : !     0 is returned. 
    1662             : !
    1663             : !     Note that this routine is not efficient by any stretch of the 
    1664             : !     imagination --- only one index can be converted at a time.
    1665             : !     In addition, a search procedure must be performed, whose 
    1666             : !     efficiency is inversely proportional to the size of the 
    1667             : !     decomposition (in particular, to the number of "runs").  
    1668             : !     Conceptually this mapping should be used only once in the 
    1669             : !     program for initialization, and subsequently all calculations 
    1670             : !     should take place using local indices.
    1671             : !
    1672             : ! !SYSTEM ROUTINES:
    1673             : !     SIZE
    1674             : !
    1675             : ! !REVISION HISTORY:
    1676             : !   02.11.09   Sawyer     Creation from decomplocaltoglobal
    1677             : !
    1678             : !EOP
    1679             : !------------------------------------------------------------------------
    1680             : !BOC
    1681             : !
    1682             : ! !LOCAL VARIABLES:
    1683             :     INTEGER  :: I, J, Counter
    1684             :     LOGICAL  :: Found
    1685             : !
    1686             :     CPP_ENTER_PROCEDURE( "DECOMPL2GVECTOR" )
    1687           0 :     DO I=1,N
    1688             :       CPP_ASSERT_F90( Pe(I) .GE. 0 )
    1689             :       CPP_ASSERT_F90( Pe(I) .LT. SIZE(Decomp%Head) )
    1690             :       CPP_ASSERT_F90( Local(I) .GT. 0 )
    1691             :       CPP_ASSERT_F90( Local(I) .LE. Decomp%NumEntries(Pe(I)+1) )
    1692             : 
    1693             :       Counter = 0
    1694             :       Found   = .FALSE.
    1695             :       J = 0
    1696           0 :       DO WHILE ( .NOT. Found )
    1697           0 :         J = J+1
    1698           0 :         Counter = Counter + Decomp%Head(Pe(I)+1)%EndTags(J) -               &
    1699           0 :                             Decomp%Head(Pe(I)+1)%StartTags(J) + 1
    1700           0 :         IF ( Local(I) .LE.  Counter ) THEN
    1701           0 :           Found = .TRUE.
    1702             : !
    1703             : ! The following calculation is not immediately obvious.  Think about it
    1704             : !
    1705           0 :           Global(I) = Local(I) - Counter + Decomp%Head(Pe(I)+1)%EndTags(J)
    1706           0 :           Found = .TRUE.
    1707           0 :         ELSEIF ( J .GE. SIZE( Decomp%Head(Pe(I)+1)%StartTags ) ) THEN
    1708             : !
    1709             : ! Emergency brake
    1710             : !
    1711           0 :           Found = .TRUE.
    1712           0 :           Global(I) = 0
    1713             :         ENDIF
    1714             :       ENDDO
    1715             :     ENDDO
    1716             : 
    1717             :     CPP_LEAVE_PROCEDURE( "DECOMPL2GVECTOR" )
    1718           0 :     RETURN
    1719             : !
    1720             : !EOC
    1721             :     END SUBROUTINE DecompL2GVector
    1722             : !------------------------------------------------------------------------
    1723             : 
    1724             : 
    1725             : !------------------------------------------------------------------------
    1726             : !BOP
    1727             : ! !IROUTINE: DecompInfo --- Information about decomposition
    1728             : !
    1729             : ! !INTERFACE:
    1730      204288 :       SUBROUTINE DecompInfo( Decomp, Npes, TotalPts )
    1731             : ! !USES:
    1732             :       IMPLICIT NONE
    1733             : 
    1734             : ! !INPUT PARAMETERS:
    1735             :       TYPE(DecompType), INTENT( IN ):: Decomp   ! Decomp information
    1736             : 
    1737             : ! !OUTPUT PARAMETERS:
    1738             :       INTEGER, INTENT( OUT )        :: Npes     ! Npes in decomposition
    1739             :       INTEGER, INTENT( OUT )        :: TotalPts ! Total points in domain
    1740             : !
    1741             : !
    1742             : ! !DESCRIPTION:
    1743             : !     Return information about the decomposition: the number of
    1744             : !     PEs over which the domain is decomposed, and the size of
    1745             : !     the domain.
    1746             : !
    1747             : ! !REVISION HISTORY:
    1748             : !   00.11.12   Sawyer     Creation
    1749             : !
    1750             : !EOP
    1751             : !---------------------------------------------------------------------
    1752             : !BOC
    1753             : !
    1754             : !
    1755             :       CPP_ENTER_PROCEDURE( "DECOMPINFO" )
    1756             : 
    1757      204288 :       Npes = SIZE( Decomp%Head )
    1758      204288 :       TotalPts = Decomp%GlobalSize
    1759             : 
    1760             :       CPP_LEAVE_PROCEDURE( "DECOMPINFO" )
    1761      204288 :       RETURN
    1762             : !EOC
    1763             :       END SUBROUTINE DecompInfo
    1764             : !------------------------------------------------------------------------
    1765             : 
    1766           0 :       END MODULE decompmodule
    1767             : 

Generated by: LCOV version 1.14