Line data Source code
1 : !-------------------------------------------------------------------------
2 : ! NASA/GSFC, Data Assimilation Office, Code 910.3, GEOS/DAS
3 : !-------------------------------------------------------------------------
4 : MODULE decompmodule
5 : !BOP
6 : !
7 : ! !MODULE: decompmodule
8 : !
9 : ! !USES:
10 : #if defined( STAND_ALONE )
11 : # define iulog 6
12 : #else
13 : use cam_logfile, only: iulog
14 : #endif
15 : #include "debug.h"
16 :
17 : IMPLICIT NONE
18 :
19 : !
20 : ! !DESCRIPTION:
21 : !
22 : ! This module provides the DecompType and its create and destroy
23 : ! routines.
24 : ! \begin{center}
25 : ! \begin{tabular}{|l|l|} \hline \hline
26 : ! DecompType & Type to describe a decomposition \\ \hline
27 : ! DecompDefined & True iff given decomposition is defined\\ \hline
28 : ! DecompFree & Destroy a decomposition \\ \hline
29 : ! DecompCopy & Copy decomposition to newly created one\\ \hline
30 : ! DecompPermute & Permute decomposition \\ \hline
31 : ! DecompRegular1D & Create a 1-D decomposition \\ \hline
32 : ! DecompRegular2D & Create a 2-D decomposition \\ \hline
33 : ! DecompRegular3D & Create a 3-D decomposition \\ \hline
34 : ! DecompRegular4D & Create a 4-D decomposition \\ \hline
35 : ! DecompCreateIrr & Create an irregular 1-D decomposition \\ \hline
36 : ! DecompCreateTags & Create a decomposition from Pe and Tags \\ \hline
37 : ! DecompGlobalToLocal& Map a global index to a local one \\ \hline
38 : ! DecompLocalToGlobal& Map a local index to a global one \\
39 : ! \hline \hline
40 : ! \end{tabular}
41 : ! \end{center}
42 : !
43 : ! The decomposition type contains the sizes of the global array,
44 : ! the number of entries on each PE, and for each PE a list
45 : ! of "runs", i.e., the starting and finishing global indices
46 : ! or "tags" whose inclusive array section resides on that PE.
47 : ! Clearly this method of decomposition is only efficient if
48 : ! there are long runs, i.e., long array sections which are
49 : ! mapped to one PE. A random decomposition will cause poor
50 : ! results.
51 : !
52 : ! The decomposition is thus very efficient for 1-D, 2-D or 3-D
53 : ! block distributions (particularly for 1-D distributions, where
54 : ! there is one "run" per processor). Problems may occur for
55 : ! an irregular decomposition (which is by definition 1-D). If
56 : ! there is little correspondence between the global indices of the
57 : ! entries and the actual decomposition (e.g., the tags are
58 : ! assigned randomly), then there will be many runs, most
59 : ! containing only one tag, and the resulting instance of
60 : ! DecompType will be very large. Fortunately, most applications
61 : ! assign tags to entries in some sort of contiguous fashion,
62 : ! which is then quite appropriate for this data structure.
63 : !
64 : ! All numbering of multi-dimensional arrays is ROW-MAJOR, that
65 : ! is, first in the X direction and then in the Y (and then,
66 : ! if appropriate, in Z). This is true for both the 2-D and
67 : ! 3-D data sets as also the Cartesian description of the PEs.
68 : !
69 : ! There is one glaring feature of DecompType. It is
70 : ! supposed to be a `one-size-fits-all' description of the
71 : ! decomposition (with the exception of the random indexing
72 : ! mentioned above). Unfortunately, to describe 2-D and 3-D
73 : ! regions, it is necessary to carry additional dimension
74 : ! information in order have complete information for the
75 : ! mapping. This means that 2-D and 3-D decompositions
76 : ! inherently carry more information than a 1-D decomposition.
77 : ! Thus it {\it is} possible to use a decomposition created
78 : ! with the Regular2D or Regular3D routines to describe the
79 : ! corresponding decomposition when the 2-D or 3-D array is
80 : ! viewed as a 1-D array, but it is clearly {\it not}
81 : ! possible to use a decomposition created with Regular1D
82 : ! to describe the decomposition of a 2-D or 3-D array
83 : ! --- the appropriate information just is not there.
84 : !
85 : ! !REVISION HISTORY:
86 : ! 97.07.22 Sawyer Creation
87 : ! 97.09.01 Sawyer Release date
88 : ! 97.11.06 Sawyer Addition of row and column communicators
89 : ! 97.01.24 Sawyer Added support for non-MPI derived types solution
90 : ! 97.01.29 Sawyer Minor revisions for production service
91 : ! 98.01.30 Sawyer Added DecompCopy
92 : ! 98.02.04 Sawyer Removed Comm, CommRow and CommCol from DecompType
93 : ! 98.03.13 Sawyer Removed DecompTypeOld, brushed up for walkthrough
94 : ! 98.03.19 Sawyer Minor corrections after walkthrough
95 : ! 98.05.02 Sawyer Added DecompPermute
96 : ! 98.05.11 Sawyer Removed Permutation from all but DecompPermute
97 : ! 99.01.19 Sawyer Minor cleaning
98 : ! 00.07.07 Sawyer Removed DimSizes; decomp is now 1D only
99 : ! 00.11.12 Sawyer Added DecompCreateTags and DecompInfo
100 : ! 01.02.03 Sawyer Updated for free format; corrected DecompCreateTags
101 : ! 01.03.20 Sawyer Added DecompRegular3DOrder
102 : ! 02.12.04 Sawyer Added DecompDefined, optimized DecompGlobalToLocal
103 : ! 02.12.06 Sawyer Bug in new DecompGlobalToLocal (remove out of bounds check)
104 : ! 02.12.08 Sawyer Another bug: calculate the Offsets field correctly
105 : ! 02.12.23 Sawyer Added DecompRegular4D
106 : !
107 : ! !PUBLIC TYPES:
108 : PUBLIC DecompType, DecompCreate, DecompFree
109 : PUBLIC DecompCopy, DecompPermute, DecompDefined
110 : PUBLIC DecompGlobalToLocal, DecompLocalToGlobal, DecompInfo
111 :
112 : INTERFACE DecompCreate
113 : MODULE PROCEDURE DecompRegular1D
114 : MODULE PROCEDURE DecompRegular2D
115 : MODULE PROCEDURE DecompRegular3D
116 : MODULE PROCEDURE DecompRegular3DOrder
117 : MODULE PROCEDURE DecompRegular4D
118 : MODULE PROCEDURE DecompCreateIrr
119 : MODULE PROCEDURE DecompCreateTags
120 : END INTERFACE
121 :
122 : INTERFACE DecompGlobalToLocal
123 : MODULE PROCEDURE DecompG2L
124 : MODULE PROCEDURE DecompG2LVector
125 : END INTERFACE
126 :
127 : INTERFACE DecompLocalToGlobal
128 : MODULE PROCEDURE DecompL2G
129 : MODULE PROCEDURE DecompL2GVector
130 : END INTERFACE
131 :
132 : ! Decomposition info
133 :
134 : TYPE Lists
135 : INTEGER, POINTER :: StartTags(:) => null() ! Start of tag run
136 : INTEGER, POINTER :: EndTags(:) => null() ! End of tag run
137 : INTEGER, POINTER :: Offsets(:) => null() ! Local offsets for efficiency
138 : END TYPE Lists
139 :
140 : TYPE DecompType
141 : LOGICAL :: Defined ! Is it defined?
142 : INTEGER :: GlobalSize ! Size in each dimension
143 : INTEGER, POINTER :: NumEntries(:) => null() ! Number of entries per PE
144 : TYPE(Lists), POINTER :: Head(:) => null() ! Array of pointers
145 : END TYPE DecompType
146 :
147 : !EOP
148 : CONTAINS
149 :
150 : !-----------------------------------------------------------------------
151 : !BOP
152 : ! !IROUTINE: DecompDefined --- Is the decomp type defined?
153 : !
154 : ! !INTERFACE:
155 0 : LOGICAL FUNCTION DecompDefined ( Decomp )
156 : ! !USES:
157 : IMPLICIT NONE
158 :
159 : ! !INPUT PARAMETERS:
160 : TYPE(DecompType), INTENT( IN ):: Decomp ! Decomp information
161 :
162 : !
163 : ! !DESCRIPTION:
164 : ! Returns true if Decomp has been created but not yet destroyed
165 : !
166 : ! !REVISION HISTORY:
167 : ! 02.12.04 Sawyer Creation from GhostDefined
168 : !
169 : !EOP
170 : !-----------------------------------------------------------------------
171 : !BOC
172 : !
173 : !
174 : CPP_ENTER_PROCEDURE( "DECOMPDEFINED" )
175 0 : DecompDefined = Decomp%Defined
176 : CPP_LEAVE_PROCEDURE( "DECOMPDEFINED" )
177 :
178 : RETURN
179 : !EOC
180 : END FUNCTION DecompDefined
181 : !-----------------------------------------------------------------------
182 :
183 :
184 : !---------------------------------------------------------------------
185 : !BOP
186 : ! !IROUTINE: DecompFree --- Free a decomposition
187 : !
188 : ! !INTERFACE:
189 43008 : SUBROUTINE DecompFree ( Decomp )
190 : ! !USES:
191 : IMPLICIT NONE
192 :
193 : ! !INPUT/OUTPUT PARAMETERS:
194 : TYPE(DecompType), INTENT( INOUT ):: Decomp ! Decomp information
195 : !
196 : ! !DESCRIPTION:
197 : ! Free the decomposition -- deallocate the data structures.
198 : !
199 : ! !SYSTEM ROUTINES:
200 : ! ASSOCIATED, DEALLOCATE
201 : !
202 : ! !REVISION HISTORY:
203 : ! 98.01.30 Sawyer Creation
204 : !
205 : !EOP
206 : !------------------------------------------------------------------------
207 : !BOC
208 : !
209 : ! !LOCAL VARIABLES:
210 : INTEGER :: I, NPEs
211 : !
212 : CPP_ENTER_PROCEDURE( "DECOMPFREE" )
213 :
214 43008 : IF ( ASSOCIATED( Decomp%NumEntries ) ) &
215 43008 : DEALLOCATE( Decomp%NumEntries )
216 43008 : IF ( ASSOCIATED( Decomp%Head ) ) THEN
217 43008 : NPEs = SIZE( Decomp%Head )
218 28588032 : DO I = 1, NPEs
219 : !
220 : ! Copy the number of entries on each PE
221 : !
222 28545024 : IF ( ASSOCIATED( Decomp%Head(I)%StartTags ) ) &
223 28545024 : DEALLOCATE( Decomp%Head(I)%StartTags )
224 28545024 : IF ( ASSOCIATED( Decomp%Head(I)%EndTags ) ) &
225 28545024 : DEALLOCATE( Decomp%Head(I)%EndTags )
226 28545024 : IF ( ASSOCIATED( Decomp%Head(I)%Offsets ) ) &
227 28588032 : DEALLOCATE( Decomp%Head(I)%Offsets )
228 : ENDDO
229 43008 : DEALLOCATE( Decomp%Head )
230 : ENDIF
231 43008 : Decomp%Defined = .FALSE.
232 :
233 : CPP_LEAVE_PROCEDURE( "DECOMPFREE" )
234 43008 : RETURN
235 : !EOC
236 : END SUBROUTINE DecompFree
237 : !------------------------------------------------------------------------
238 :
239 :
240 : !------------------------------------------------------------------------
241 : !BOP
242 : ! !IROUTINE: DecompCopy --- Copy one decomposition to another
243 : !
244 : ! !INTERFACE:
245 7680 : SUBROUTINE DecompCopy ( DecompIn, DecompOut )
246 : ! !USES:
247 : IMPLICIT NONE
248 : !
249 : ! !INPUT PARAMETERS:
250 : TYPE(DecompType), INTENT( IN ) :: DecompIn ! Decomp information
251 : !
252 : ! !OUTPUT PARAMETERS:
253 : TYPE(DecompType), INTENT( OUT ) :: DecompOut ! Decomp information
254 : !
255 : ! !DESCRIPTION:
256 : !
257 : ! Creates an output decomposition and copies the DecompIn input values
258 : !
259 : ! !SYSTEM ROUTINES:
260 : ! ALLOCATE
261 : !
262 : ! !REVISION HISTORY:
263 : ! 98.01.30 Sawyer Creation
264 : !
265 : !EOP
266 : !------------------------------------------------------------------------
267 : !BOC
268 : !
269 : ! !LOCAL VARIABLES:
270 : INTEGER :: I, J, NDims, NPEs, NRuns
271 : !
272 : CPP_ENTER_PROCEDURE( "DECOMPCOPY" )
273 : !
274 : ! Copy the size of the global array
275 : !
276 7680 : DecompOut%GlobalSize = DecompIn%GlobalSize
277 :
278 : !
279 : ! Allocate the number of entries and list head arrays
280 : !
281 7680 : NPEs = SIZE( DecompIn%NumEntries )
282 : CPP_ASSERT_F90( SIZE( DecompIn%Head ) .EQ. NPEs )
283 23040 : ALLOCATE( DecompOut%NumEntries( NPEs ) )
284 5921280 : ALLOCATE( DecompOut%Head( NPEs ) )
285 :
286 5905920 : DO I = 1, NPEs
287 : !
288 : ! Copy the number of entries on each PE
289 : !
290 5898240 : DecompOut%NumEntries( I ) = DecompIn%NumEntries( I )
291 5898240 : NRuns = SIZE( DecompIn%Head( I )%StartTags )
292 : CPP_ASSERT_F90( SIZE( DecompIn%Head( I )%EndTags ) .EQ. NRuns )
293 : !
294 : ! Allocate and copy the array of runs
295 : !
296 17694720 : ALLOCATE( DecompOut%Head(I)%StartTags( NRuns ) )
297 11796480 : ALLOCATE( DecompOut%Head(I)%EndTags( NRuns ) )
298 11796480 : ALLOCATE( DecompOut%Head(I)%Offsets( NRuns ) )
299 109714944 : DO J = 1, NRuns
300 103809024 : DecompOut%Head(I)%StartTags(J) = DecompIn%Head(I)%StartTags(J)
301 103809024 : DecompOut%Head(I)%EndTags(J) = DecompIn%Head(I)%EndTags(J)
302 109707264 : DecompOut%Head(I)%Offsets(J) = DecompIn%Head(I)%Offsets(J)
303 : ENDDO
304 : ENDDO
305 7680 : DecompOut%Defined = .TRUE.
306 :
307 : CPP_LEAVE_PROCEDURE( "DECOMPCOPY" )
308 7680 : RETURN
309 : !EOC
310 : END SUBROUTINE DecompCopy
311 : !------------------------------------------------------------------------
312 :
313 :
314 : !------------------------------------------------------------------------
315 : !BOP
316 : ! !IROUTINE: DecompPermute --- Permute one decomposition to another
317 : !
318 : ! !INTERFACE:
319 0 : SUBROUTINE DecompPermute ( Permutation, Decomp )
320 : ! !USES:
321 : IMPLICIT NONE
322 : !
323 : ! !INPUT PARAMETERS:
324 : INTEGER :: Permutation( : ) ! Permutation
325 :
326 : ! !INPUT/OUTPUT PARAMETERS:
327 : TYPE(DecompType), INTENT( INOUT ) :: Decomp ! Decomp information
328 : !
329 : !
330 : ! !DESCRIPTION:
331 : !
332 : ! Permutes the PE assignment of a given decomposition. Confusion will
333 : ! always arise about whether this is a forward or backward
334 : ! transformation. Picture it this way: draw the array and slice it
335 : ! up as indicated by the distribution. The resulting boxes are of
336 : ! course indexed by natural numbering 1, 2, 3, 4, ... (these are
337 : ! the virtual one-based PEs). Now write the true PE numbering
338 : ! (one-based) as you would like it. The resulting array is Perm.
339 : !
340 : !
341 : ! !SYSTEM ROUTINES:
342 : ! ALLOCATE
343 : !
344 : ! !REVISION HISTORY:
345 : ! 98.05.02 Sawyer Creation
346 : !
347 : !EOP
348 : !---------------------------------------------------------------------
349 : !BOC
350 : !
351 : ! !LOCAL VARIABLES:
352 0 : INTEGER, POINTER :: NumEntries(:)! Number of entries
353 0 : TYPE(Lists), POINTER :: Head(:) ! Array of pointers
354 : INTEGER :: I, J, NPEs, NRuns, TruePE
355 : !
356 : CPP_ENTER_PROCEDURE( "DECOMPPERMUTE" )
357 : !
358 : ! Allocate the number of entries and list head arrays
359 : !
360 0 : NPEs = SIZE( Decomp%NumEntries )
361 0 : ALLOCATE( NumEntries( NPEs ) )
362 0 : DO I = 1, NPEs
363 0 : TruePE = Permutation( I )
364 0 : NumEntries( TruePE ) = Decomp%NumEntries( I )
365 : ENDDO
366 : !
367 : ! Deallocate old NumEntries and put the new pointer in its place
368 : !
369 0 : DEALLOCATE( Decomp%NumEntries )
370 0 : Decomp%NumEntries => NumEntries
371 0 : NULLIFY( NumEntries )
372 :
373 : !
374 : ! Allocate and set the permuted Lists called with pointer Head
375 : !
376 0 : ALLOCATE( Head( NPEs ) )
377 0 : DO I = 1, NPEs
378 0 : TruePE = Permutation( I )
379 0 : NRuns = SIZE( Decomp%Head(I)%StartTags )
380 : CPP_ASSERT_F90( SIZE( Decomp%Head(I)%EndTags ) .EQ. NRuns )
381 : !
382 : ! Allocate and permute the array of runs
383 : !
384 0 : ALLOCATE( Head(TruePE)%StartTags(NRuns) )
385 0 : ALLOCATE( Head(TruePE)%EndTags(NRuns) )
386 0 : ALLOCATE( Head(TruePE)%Offsets(NRuns) )
387 0 : DO J = 1, NRuns
388 0 : Head(TruePE)%StartTags(J) = Decomp%Head(I)%StartTags(J)
389 0 : Head(TruePE)%EndTags(J) = Decomp%Head(I)%EndTags(J)
390 0 : Head(TruePE)%Offsets(J) = Decomp%Head(I)%Offsets(J)
391 : ENDDO
392 : ENDDO
393 : !
394 : ! Deallocate the arrays of starting and ending tags
395 : !
396 0 : DO I = 1, NPEs
397 0 : DEALLOCATE( Decomp%Head(I)%StartTags )
398 0 : DEALLOCATE( Decomp%Head(I)%EndTags )
399 0 : DEALLOCATE( Decomp%Head(I)%Offsets )
400 : ENDDO
401 : !
402 : ! Deallocate the heads to the Lists
403 : !
404 0 : DEALLOCATE( Decomp%Head )
405 :
406 : !
407 : ! Link the new head to that in the decomposition
408 : !
409 0 : Decomp%Head => Head
410 :
411 0 : NULLIFY( Head )
412 :
413 : CPP_LEAVE_PROCEDURE( "DECOMPPERMUTE" )
414 0 : RETURN
415 : !EOC
416 0 : END SUBROUTINE DecompPermute
417 : !------------------------------------------------------------------------
418 :
419 :
420 : !------------------------------------------------------------------------
421 : !BOP
422 : ! !IROUTINE: DecompRegular1D --- Create a decomposition for a 1-D grid
423 : !
424 : ! !INTERFACE:
425 0 : SUBROUTINE DecompRegular1D ( NPEs, Dist, Decomp )
426 : ! !USES:
427 : IMPLICIT NONE
428 : !
429 : ! !INPUT PARAMETERS:
430 : INTEGER, INTENT( IN ) :: NPEs ! Number of PEs
431 : INTEGER, INTENT( IN ) :: Dist(:) ! Distribution in X
432 : !
433 : ! !OUTPUT PARAMETERS:
434 : TYPE(DecompType), INTENT( OUT ) :: Decomp ! Decomp information
435 : !
436 : ! !DESCRIPTION:
437 : ! Creates a variable block decomposition for a regular 1-D grid
438 : ! (this is also known as a "block-general" distribution). The
439 : ! decomposition is given through the Dist distribution
440 : ! which contains the number of entries on each PE.
441 : !
442 : ! !SYSTEM ROUTINES:
443 : ! ALLOCATE
444 : !
445 : ! !REVISION HISTORY:
446 : ! 98.01.19 Sawyer Creation
447 : ! 98.01.22 Sawyer Corrections, TESTED
448 : ! 98.05.11 Sawyer Removed Perm from arglist -- see DecompPermute
449 : ! 00.07.07 Sawyer Removed use of DimSizes(:) array
450 : !
451 : !EOP
452 : !------------------------------------------------------------------------
453 : !BOC
454 : !
455 : ! !LOCAL VARIABLES:
456 : INTEGER :: I, Counter
457 : !
458 : CPP_ENTER_PROCEDURE( "DECOMPREGULAR1D" )
459 : !
460 : CPP_ASSERT_F90( NPEs .EQ. SIZE( Dist ) )
461 : !
462 : ! The head contains NPEs pointers to the tag lists.
463 : !
464 0 : Decomp%GlobalSize = SUM(Dist)
465 0 : ALLOCATE( Decomp%NumEntries( NPEs ) )
466 0 : ALLOCATE( Decomp%Head( NPEs ) )
467 0 : Counter = 0
468 0 : DO I = 1, NPEs
469 0 : Decomp%NumEntries(I) = Dist(I)
470 : !
471 : ! Since this is a regular distribution there is only one run of tags per PE.
472 : !
473 0 : NULLIFY( Decomp%Head(I)%StartTags )
474 0 : NULLIFY( Decomp%Head(I)%EndTags )
475 0 : NULLIFY( Decomp%Head(I)%Offsets )
476 0 : ALLOCATE( Decomp%Head(I)%StartTags(1) )
477 0 : ALLOCATE( Decomp%Head(I)%EndTags(1) )
478 0 : ALLOCATE( Decomp%Head(I)%Offsets(1) )
479 : !
480 : ! The starting and ending tags are immediately determined from
481 : ! the decomposition arrays
482 : !
483 0 : Decomp%Head(I)%StartTags(1) = Counter+1
484 0 : Counter = Counter + Dist(I)
485 0 : Decomp%Head(I)%EndTags(1) = Counter
486 0 : Decomp%Head(I)%Offsets(1) = 0 ! Offset in local segment
487 : ENDDO
488 :
489 0 : Decomp%Defined = .TRUE.
490 :
491 : CPP_LEAVE_PROCEDURE( "DECOMPREGULAR1D" )
492 0 : RETURN
493 : !EOC
494 : END SUBROUTINE DecompRegular1D
495 : !------------------------------------------------------------------------
496 :
497 :
498 : !------------------------------------------------------------------------
499 : !BOP
500 : ! !IROUTINE: DecompRegular2D --- Create a decomposition for a 2-D grid
501 : !
502 : ! !INTERFACE:
503 6144 : SUBROUTINE DecompRegular2D( NPEsX, NPEsY, Xdist, Ydist, Decomp )
504 : ! !USES:
505 : IMPLICIT NONE
506 : !
507 : ! !INPUT PARAMETERS:
508 : INTEGER, INTENT( IN ) :: NPEsX ! Number of PEs in X
509 : INTEGER, INTENT( IN ) :: NPEsY ! Number of PEs in Y
510 : INTEGER, INTENT( IN ) :: Xdist(:) ! Distribution in X
511 : INTEGER, INTENT( IN ) :: Ydist(:) ! Distribution in Y
512 : !
513 : ! !OUTPUT PARAMETERS:
514 : TYPE(DecompType), INTENT( OUT ) :: Decomp ! Decomp information
515 : !
516 : !
517 : ! !DESCRIPTION:
518 : ! Creates a variable block-block decomposition for a regular
519 : ! 2-D grid. The decomposition is given through the Xdist and
520 : ! Ydist distributions, which contain the number of entries on
521 : ! each PE in that dimension. This routine thus defines
522 : ! a rectangular "checkerboard" distribution.
523 : !
524 : ! !SYSTEM ROUTINES:
525 : ! ALLOCATE
526 : !
527 : ! !REVISION HISTORY:
528 : ! 98.01.19 Sawyer Creation
529 : ! 98.01.22 Sawyer Corrections, TESTED
530 : ! 98.05.11 Sawyer Removed Perm from arglist -- see DecompPermute
531 : ! 00.07.07 Sawyer Removed use of DimSizes(:) array
532 : !
533 : ! !BUGS:
534 : ! This routine makes the assumption that the sum of the
535 : ! distribution in each dimension adds up to the total
536 : ! number of entries in that dimension. It will cause
537 : ! problems if the actual local arrays are over- or
538 : ! under-allocated. For example, if the local array is
539 : ! allocated statically for the maximum size of the
540 : ! array on any processor, problems will occur on those
541 : ! PEs which have less than the maximum.
542 : !
543 : !EOP
544 : !------------------------------------------------------------------------
545 : !BOC
546 : !
547 : ! !LOCAL VARIABLES:
548 : INTEGER :: TruePE, I, J, K, Counter1, Counter2, SizeX, SizeY
549 : !
550 : CPP_ENTER_PROCEDURE( "DECOMPREGULAR2D" )
551 : !
552 : ! Some sanity checks
553 : !
554 : CPP_ASSERT_F90( NPEsX .EQ. SIZE( Xdist ) )
555 : CPP_ASSERT_F90( NPEsY .EQ. SIZE( Ydist ) )
556 : !
557 : ! The head contains NPEs pointers to the tag lists.
558 : !
559 62976 : SizeX = SUM(Xdist)
560 302592 : SizeY = SUM(Ydist)
561 6144 : Decomp%GlobalSize = SizeX * SizeY
562 18432 : ALLOCATE( Decomp%NumEntries( NPEsX*NPEsY ) )
563 2494464 : ALLOCATE( Decomp%Head( NPEsX*NPEsY ) )
564 6144 : Counter1 = 0
565 302592 : DO J = 1, NPEsY
566 2772480 : DO I = 1, NPEsX
567 : !
568 : ! WARNING!!!! The definition of the PE is Row-major ordering
569 : !
570 2476032 : TruePE = ( J-1 ) * NPEsX + I
571 :
572 : !
573 : ! The number of entries is the product of the local X, Y, Z allotment
574 : !
575 2476032 : Decomp%NumEntries(TruePE) = Xdist(I)*Ydist(J)
576 : !
577 : ! For each Y there is a separate run
578 : !
579 2476032 : NULLIFY( Decomp%Head(TruePE)%StartTags )
580 2476032 : NULLIFY( Decomp%Head(TruePE)%EndTags )
581 2476032 : NULLIFY( Decomp%Head(TruePE)%Offsets )
582 6266880 : ALLOCATE( Decomp%Head(TruePE)%StartTags(Ydist(J)) )
583 6266880 : ALLOCATE( Decomp%Head(TruePE)%EndTags(Ydist(J)) )
584 6266880 : ALLOCATE( Decomp%Head(TruePE)%Offsets(Ydist(J)) )
585 2476032 : Counter2 = Counter1
586 9904128 : DO K = 1, Ydist(J)
587 : !
588 : ! Since this is a regular distribution the definition of
589 : ! tags is dictated by Xdist(I), and appears Ydist(J) times
590 : !
591 : !
592 7428096 : Decomp%Head(TruePE)%StartTags(K) = Counter2 + 1
593 7428096 : Decomp%Head(TruePE)%EndTags(K) = Counter2 + Xdist(I)
594 9904128 : Counter2 = Counter2 + SizeX
595 : ENDDO
596 2772480 : Counter1 = Counter1 + Xdist(I)
597 : ENDDO
598 : !
599 : ! Align the counter such that it points to the start of the next
600 : ! block. (Ydist(J)-1) since already one layer has been added in.
601 : ! Implicit assumption that SizeX = SUM( Xdist )
602 : !
603 302592 : Counter1 = Counter1 + SizeX*(Ydist(J)-1)
604 : ENDDO
605 : !
606 : ! Calculate offsets
607 : !
608 2482176 : DO I=1, NPEsX*NPEsY
609 2482176 : IF ( SIZE(Decomp%Head(I)%StartTags) > 0 ) THEN
610 1314816 : Decomp%Head(I)%Offsets(1) = 0
611 7428096 : DO J=2, SIZE(Decomp%Head(I)%StartTags)
612 6113280 : Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) + &
613 13541376 : Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
614 : ENDDO
615 : ENDIF
616 : ENDDO
617 :
618 6144 : Decomp%Defined = .TRUE.
619 :
620 : CPP_LEAVE_PROCEDURE( "DECOMPREGULAR2D" )
621 6144 : RETURN
622 : !EOC
623 : END SUBROUTINE DecompRegular2D
624 : !------------------------------------------------------------------------
625 :
626 :
627 : !------------------------------------------------------------------------
628 : !BOP
629 : ! !IROUTINE: DecompRegular3D --- Create a decomposition for a 3-D grid
630 : !
631 : ! !INTERFACE:
632 27648 : SUBROUTINE DecompRegular3D ( NPEsX, NPEsY, NPEsZ, &
633 13824 : Xdist, Ydist, Zdist, Decomp )
634 : ! !USES:
635 : IMPLICIT NONE
636 : !
637 : ! !INPUT PARAMETERS:
638 : INTEGER, INTENT( IN ) :: NPEsX ! Number of PEs in X
639 : INTEGER, INTENT( IN ) :: NPEsY ! Number of PEs in Y
640 : INTEGER, INTENT( IN ) :: NPEsZ ! Number of PEs in Z
641 : INTEGER, INTENT( IN ) :: Xdist(:) ! Distribution in X
642 : INTEGER, INTENT( IN ) :: Ydist(:) ! Distribution in Y
643 : INTEGER, INTENT( IN ) :: Zdist(:) ! Distribution in Z
644 : !
645 : ! !OUTPUT PARAMETERS:
646 : TYPE(DecompType), INTENT( OUT ) :: Decomp ! Decomp information
647 : !
648 : !
649 : ! !DESCRIPTION:
650 : ! Creates a decomposition for a regular 3-D grid. The
651 : ! decomposition is given through the Xdist, Ydist, and Zdist
652 : ! distributions, which contain the number of entries on
653 : ! each PE in that dimension. This routine thus defines
654 : ! a parallelopiped (SOMA-block) distribution.
655 : !
656 : ! !SYSTEM ROUTINES:
657 : ! ALLOCATE
658 : !
659 : ! !REVISION HISTORY:
660 : ! 98.01.19 Sawyer Creation
661 : ! 98.05.11 Sawyer Removed Perm from arglist -- see DecompPermute
662 : ! 00.07.07 Sawyer Removed use of Sizes(:) array
663 : !
664 : ! !BUGS:
665 : ! This routine makes the assumption that the sum of the
666 : ! distribution in each dimension adds up to the total
667 : ! number of entries in that dimension. It will cause
668 : ! problems if the actual local arrays are over- or
669 : ! under-allocated. For example, if the local array is
670 : ! allocated statically for the maximum size of the
671 : ! array on any processor, problems will occur on those
672 : ! PEs which have less than the maximum.
673 : !
674 : !EOP
675 : !------------------------------------------------------------------------
676 : !BOC
677 : !
678 : ! !LOCAL VARIABLES:
679 : INTEGER :: TruePE, Counter1, Counter2, Counter3
680 : INTEGER :: I, J, K, L, M, N, SizeX, SizeY, SizeZ
681 : !
682 : CPP_ENTER_PROCEDURE( "DECOMPREGULAR3D" )
683 : !
684 : ! Some sanity checks
685 : !
686 : !
687 : CPP_ASSERT_F90( NPEsX .EQ. SIZE( Xdist ) )
688 : CPP_ASSERT_F90( NPEsY .EQ. SIZE( Ydist ) )
689 : CPP_ASSERT_F90( NPEsZ .EQ. SIZE( Zdist ) )
690 : !
691 : ! The head contains NPEs pointers to the tag lists.
692 : !
693 78336 : SizeX = SUM(Xdist)
694 801792 : SizeY = SUM(Ydist)
695 95232 : SizeZ = SUM(Zdist)
696 13824 : Decomp%GlobalSize = SizeX * SizeY * SizeZ
697 41472 : ALLOCATE( Decomp%NumEntries( NPEsX*NPEsY*NPEsZ ) )
698 7334400 : ALLOCATE( Decomp%Head( NPEsX*NPEsY*NPEsZ ) )
699 13824 : Counter1 = 0
700 95232 : DO K = 1, NPEsZ
701 4130304 : DO J = 1, NPEsY
702 11341824 : DO I = 1, NPEsX
703 : !
704 : ! WARNING!!!! The definition of the PE is Row-major ordering
705 : !
706 7292928 : TruePE = (K-1)*NPEsX*NPEsY + (J-1)*NPEsX + I
707 7292928 : NULLIFY( Decomp%Head(TruePE)%StartTags )
708 7292928 : NULLIFY( Decomp%Head(TruePE)%EndTags )
709 7292928 : NULLIFY( Decomp%Head(TruePE)%Offsets )
710 : !
711 : ! The number of entries is the product of the local X, Y, Z allotment
712 : !
713 7292928 : Decomp%NumEntries(TruePE) = Xdist(I)*Ydist(J)*Zdist(K)
714 : !
715 : ! For each Z there are Y separate runs
716 : !
717 21878784 : ALLOCATE( Decomp%Head(TruePE)%StartTags(Ydist(J)*Zdist(K)) )
718 21878784 : ALLOCATE( Decomp%Head(TruePE)%EndTags(Ydist(J)*Zdist(K)) )
719 21878784 : ALLOCATE( Decomp%Head(TruePE)%Offsets(Ydist(J)*Zdist(K)) )
720 7292928 : Counter2 = Counter1
721 7292928 : L = 0
722 204073984 : DO N = 1, Zdist(K)
723 196781056 : Counter3 = Counter2
724 787124224 : DO M = 1, Ydist(J)
725 : !
726 : ! Since this is a regular distribution the definition of
727 : ! tags is dictated by Xdist(I), and appears Ydist(J) times
728 : !
729 : !
730 590343168 : L = L + 1
731 590343168 : Decomp%Head(TruePE)%StartTags(L) = Counter3 + 1
732 590343168 : Decomp%Head(TruePE)%EndTags(L) = Counter3 + Xdist(I)
733 787124224 : Counter3 = Counter3 + SizeX
734 : ENDDO
735 204073984 : Counter2 = Counter2 + SizeX*SizeY
736 : ENDDO
737 11341824 : Counter1 = Counter1 + Xdist(I)
738 : ENDDO
739 : !
740 : ! Align the counter such that it points to the start of the next
741 : ! block. (Ydist(J)-1) since already one X layer has been added in.
742 : ! Implicit assumption that SizeX = SUM( Xdist )
743 : !
744 4130304 : Counter1 = Counter1 + SizeX*(Ydist(J)-1)
745 : ENDDO
746 : !
747 : ! Align the counter such that it points to the start of the next
748 : ! block. (Zdist(K)-1) since already one X-Y layer has been added in.
749 : ! Implicit assumption that SizeY = SUM( Ydist )
750 : !
751 95232 : Counter1 = Counter1 + SizeX*SizeY*(Zdist(K)-1)
752 : ENDDO
753 : !
754 : ! Calculate offsets
755 : !
756 7306752 : DO I=1, NPEsX*NPEsY*NPEsZ
757 7306752 : IF ( SIZE(Decomp%Head(I)%StartTags) > 0 ) THEN
758 7292928 : Decomp%Head(I)%Offsets(1) = 0
759 590343168 : DO J=2, SIZE(Decomp%Head(I)%StartTags)
760 583050240 : Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) + &
761 1173393408 : Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
762 : ENDDO
763 : ENDIF
764 : ENDDO
765 :
766 13824 : Decomp%Defined = .TRUE.
767 :
768 : CPP_LEAVE_PROCEDURE( "DECOMPREGULAR3D" )
769 13824 : RETURN
770 : !EOC
771 : END SUBROUTINE DecompRegular3D
772 : !------------------------------------------------------------------------
773 :
774 :
775 : !------------------------------------------------------------------------
776 : !BOP
777 : ! !IROUTINE: DecompRegular3Dorder --- Create a decomposition for a 3-D grid
778 : !
779 : ! !INTERFACE:
780 0 : SUBROUTINE DecompRegular3Dorder( Order, NPEsX, NPEsY, NPEsZ, &
781 7680 : Xdist, Ydist, Zdist, Decomp )
782 : ! !USES:
783 : IMPLICIT NONE
784 : !
785 : ! !INPUT PARAMETERS:
786 : CHARACTER(3), INTENT( IN ) :: Order ! Dim. ordering
787 : INTEGER, INTENT( IN ) :: NPEsX ! Number of PEs in X
788 : INTEGER, INTENT( IN ) :: NPEsY ! Number of PEs in Y
789 : INTEGER, INTENT( IN ) :: NPEsZ ! Number of PEs in Z
790 : INTEGER, INTENT( IN ) :: Xdist(:) ! Distribution in X
791 : INTEGER, INTENT( IN ) :: Ydist(:) ! Distribution in Y
792 : INTEGER, INTENT( IN ) :: Zdist(:) ! Distribution in Z
793 : !
794 : ! !OUTPUT PARAMETERS:
795 : TYPE(DecompType), INTENT( OUT ) :: Decomp ! Decomp information
796 : !
797 : ! !DESCRIPTION:
798 : ! Creates a variable block-block-block decomposition for a regular
799 : ! 3-D grid, where the ordering of the PEs can be explicitly given
800 : ! (see next paragraph). The decomposition is given through the
801 : ! Xdist, Ydist, and Zdist distributions, which contain the number
802 : ! of entries on each PE in that dimension. This routine thus defines
803 : ! a parallelopiped (SOMA-block) distribution.
804 : !
805 : ! With the string argument Order, the order of counting in the
806 : ! 3d PE space can be specified. There are six possible values:
807 : ! "xyz", "xzy", "yxz", "yzx", "zxy", and "zyx".
808 : !
809 : ! The same as DecompRegular3Dorder could also be achieved by
810 : ! using DecompRegular3D and then permuting the PE ownership
811 : ! with DecompPermute.
812 : !
813 : ! !SYSTEM ROUTINES:
814 : ! ALLOCATE
815 : !
816 : ! !REVISION HISTORY:
817 : ! 01.03.20 Sawyer Creation from DecompRegular3Dzy, added ordering
818 : !
819 : ! !BUGS:
820 : ! Not yet tested
821 : !
822 : !EOP
823 : !---------------------------------------------------------------------
824 : !BOC
825 : ! !LOCAL VARIABLES:
826 : INTEGER :: TruePE, Counter1, Counter2, Counter3
827 : INTEGER :: I, J, K, L, M, N, SizeX, SizeY, SizeZ
828 : INTEGER :: Imult, Jmult, Kmult
829 : !
830 : CPP_ENTER_PROCEDURE( "DECOMPREGULAR3DORDER" )
831 : !
832 : ! Some sanity checks
833 : !
834 : !
835 : CPP_ASSERT_F90( NPEsX .EQ. SIZE( Xdist ) )
836 : CPP_ASSERT_F90( NPEsY .EQ. SIZE( Ydist ) )
837 : CPP_ASSERT_F90( NPEsZ .EQ. SIZE( Zdist ) )
838 :
839 7680 : IF ( Order=="xyz" ) THEN
840 : ! Looks like: TruePE = (K-1)*NPEsX*NPEsY + (J-1)*NPEsX + (I-1) + 1
841 0 : Imult = 1
842 0 : Jmult = NPEsX
843 0 : Kmult = NPEsX*NPEsY
844 7680 : ELSE IF ( Order=="xzy" ) THEN
845 : ! Looks like: TruePE = (J-1)*NPEsX*NPEsZ + (K-1)*NPEsX + (I-1) + 1
846 7680 : Imult = 1
847 7680 : Jmult = NPEsX*NPEsZ
848 7680 : Kmult = NPEsX
849 0 : ELSE IF ( Order=="yxz" ) THEN
850 : ! Looks like: TruePE = (K-1)*NPEsY*NPEsX + (I-1)*NPEsY + (J-1) + 1
851 0 : Imult = NPEsY
852 0 : Jmult = 1
853 0 : Kmult = NPEsX*NPEsY
854 0 : ELSE IF ( Order=="yzx" ) THEN
855 : ! Looks like: TruePE = (I-1)*NPEsY*NPEsZ + (K-1)*NPEsY + (J-1) + 1
856 0 : Imult = NPEsY*NPEsZ
857 0 : Jmult = 1
858 0 : Kmult = NPEsY
859 0 : ELSE IF ( Order=="zxy" ) THEN
860 : ! Looks like: TruePE = (J-1)*NPEsX*NPEsZ + (I-1)*NPEsZ + (K-1) + 1
861 0 : Imult = NPEsZ
862 0 : Jmult = NPEsX*NPEsZ
863 0 : Kmult = 1
864 0 : ELSE IF ( Order=="zyx" ) THEN
865 : ! Looks like: TruePE = (I-1)*NPEsY*NPEsZ + (J-1)*NPEsZ + (K-1) + 1
866 0 : Imult = NPEsY*NPEsZ
867 0 : Jmult = NPEsZ
868 0 : Kmult = 1
869 : ELSE
870 : ! Looks like: TruePE = (K-1)*NPEsX*NPEsY + (J-1)*NPEsX + (I-1) + 1
871 0 : write(iulog,*) "Warning: DecompCreate3Dorder", Order, "not supported"
872 0 : write(iulog,*) " Continuing with XYZ ordering"
873 0 : Imult = 1
874 0 : Jmult = NPEsX
875 0 : Kmult = NPEsX*NPEsY
876 : ENDIF
877 :
878 : !
879 : ! The head contains NPEs pointers to the tag lists.
880 : !
881 49152 : SizeX = SUM(Xdist)
882 66048 : SizeY = SUM(Ydist)
883 402432 : SizeZ = SUM(Zdist)
884 7680 : Decomp%GlobalSize = SizeX * SizeY * SizeZ
885 23040 : ALLOCATE( Decomp%NumEntries( NPEsX*NPEsY*NPEsZ ) )
886 4760064 : ALLOCATE( Decomp%Head( NPEsX*NPEsY*NPEsZ ) )
887 7680 : Counter1 = 0
888 :
889 402432 : DO K = 1, NPEsZ
890 2969088 : DO J = 1, NPEsY
891 7311360 : DO I = 1, NPEsX
892 : !
893 : ! WARNING!!!! The definition of the PE is Row-major ordering
894 : !
895 :
896 4737024 : TruePE = (I-1)*Imult + (J-1)*Jmult + (K-1)*Kmult + 1
897 : !
898 : ! The number of entries is the product of the local X, Y, Z allotment
899 : !
900 4737024 : Decomp%NumEntries(TruePE) = Xdist(I)*Ydist(J)*Zdist(K)
901 : !
902 : ! For each Z there are Y separate runs
903 : !
904 14211072 : ALLOCATE( Decomp%Head(TruePE)%StartTags(Ydist(J)*Zdist(K)) )
905 14211072 : ALLOCATE( Decomp%Head(TruePE)%EndTags(Ydist(J)*Zdist(K)) )
906 14211072 : ALLOCATE( Decomp%Head(TruePE)%Offsets(Ydist(J)*Zdist(K)) )
907 4737024 : Counter2 = Counter1
908 4737024 : L = 0
909 18948096 : DO N = 1, Zdist(K)
910 14211072 : Counter3 = Counter2
911 555111936 : DO M = 1, Ydist(J)
912 : !
913 : ! Since this is a regular distribution the definition of
914 : ! tags is dictated by Xdist(I), and appears Ydist(J) times
915 : !
916 : !
917 540900864 : L = L + 1
918 540900864 : Decomp%Head(TruePE)%StartTags(L) = Counter3 + 1
919 540900864 : Decomp%Head(TruePE)%EndTags(L) = Counter3 + Xdist(I)
920 555111936 : Counter3 = Counter3 + SizeX
921 : ENDDO
922 18948096 : Counter2 = Counter2 + SizeX*SizeY
923 : ENDDO
924 7311360 : Counter1 = Counter1 + Xdist(I)
925 : ENDDO
926 : !
927 : ! Align the counter such that it points to the start of the next
928 : ! block. (Ydist(J)-1) since already one X layer has been added in.
929 : ! Implicit assumption that SizeX = SUM( Xdist )
930 : !
931 2969088 : Counter1 = Counter1 + SizeX*(Ydist(J)-1)
932 : ENDDO
933 : !
934 : ! Align the counter such that it points to the start of the next
935 : ! block. (Zdist(K)-1) since already one X-Y layer has been added in.
936 : ! Implicit assumption that SizeY = SUM( Ydist )
937 : !
938 402432 : Counter1 = Counter1 + SizeX*SizeY*(Zdist(K)-1)
939 : ENDDO
940 : !
941 : ! Calculate offsets
942 : !
943 4744704 : DO I=1, NPEsX*NPEsY*NPEsZ
944 4744704 : IF ( SIZE(Decomp%Head(I)%StartTags) > 0 ) THEN
945 4737024 : Decomp%Head(I)%Offsets(1) = 0
946 540900864 : DO J=2, SIZE(Decomp%Head(I)%StartTags)
947 536163840 : Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) + &
948 1077064704 : Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
949 : ENDDO
950 : ENDIF
951 : ENDDO
952 :
953 7680 : Decomp%Defined = .TRUE.
954 :
955 : CPP_LEAVE_PROCEDURE( "DECOMPREGULAR3DORDER" )
956 7680 : RETURN
957 : !EOC
958 : END SUBROUTINE DecompRegular3DOrder
959 : !------------------------------------------------------------------------
960 :
961 :
962 : !------------------------------------------------------------------------
963 : !BOP
964 : ! !IROUTINE: DecompRegular4D --- Create a decomposition for a 3-D grid
965 : !
966 : ! !INTERFACE:
967 0 : SUBROUTINE DecompRegular4D ( NPEsX, NPEsY, NPEsZ, NPEsT, &
968 0 : Xdist, Ydist, Zdist, Tdist, Decomp )
969 : ! !USES:
970 : IMPLICIT NONE
971 : !
972 : ! !INPUT PARAMETERS:
973 : INTEGER, INTENT( IN ) :: NPEsX ! Number of PEs in X
974 : INTEGER, INTENT( IN ) :: NPEsY ! Number of PEs in Y
975 : INTEGER, INTENT( IN ) :: NPEsZ ! Number of PEs in Z
976 : INTEGER, INTENT( IN ) :: NPEsT ! Number of PEs in T
977 : INTEGER, INTENT( IN ) :: Xdist(:) ! Distribution in X
978 : INTEGER, INTENT( IN ) :: Ydist(:) ! Distribution in Y
979 : INTEGER, INTENT( IN ) :: Zdist(:) ! Distribution in Z
980 : INTEGER, INTENT( IN ) :: Tdist(:) ! Distribution in T
981 : !
982 : ! !OUTPUT PARAMETERS:
983 : TYPE(DecompType), INTENT( OUT ) :: Decomp ! Decomp information
984 : !
985 : !
986 : ! !DESCRIPTION:
987 : ! Creates a decomposition for a regular 4-D grid. The
988 : ! decomposition is given through the Xdist, Ydist, Zdist, Tdist
989 : ! distributions, which contain the number of entries on
990 : ! each PE in that dimension. This routine thus defines
991 : ! a parallelopiped (SOMA-block) distribution.
992 : !
993 : ! !SYSTEM ROUTINES:
994 : ! ALLOCATE
995 : !
996 : ! !REVISION HISTORY:
997 : ! 02.12.23 Sawyer Creation from DecompRegular4D
998 : !
999 : ! !BUGS:
1000 : ! This routine makes the assumption that the sum of the
1001 : ! distribution in each dimension adds up to the total
1002 : ! number of entries in that dimension. It will cause
1003 : ! problems if the actual local arrays are over- or
1004 : ! under-allocated. For example, if the local array is
1005 : ! allocated statically for the maximum size of the
1006 : ! array on any processor, problems will occur on those
1007 : ! PEs which have less than the maximum.
1008 : !
1009 : !EOP
1010 : !------------------------------------------------------------------------
1011 : !BOC
1012 : !
1013 : ! !LOCAL VARIABLES:
1014 : INTEGER :: TruePE, Counter1, Counter2, Counter3, Counter4
1015 : INTEGER :: I, J, K, L, M, N, P, T, SizeX, SizeY, SizeZ, SizeT
1016 : !
1017 : CPP_ENTER_PROCEDURE( "DECOMPREGULAR4D" )
1018 : !
1019 : ! Some sanity checks
1020 : !
1021 : !
1022 : CPP_ASSERT_F90( NPEsX .EQ. SIZE( Xdist ) )
1023 : CPP_ASSERT_F90( NPEsY .EQ. SIZE( Ydist ) )
1024 : CPP_ASSERT_F90( NPEsZ .EQ. SIZE( Zdist ) )
1025 : CPP_ASSERT_F90( NPEsT .EQ. SIZE( Tdist ) )
1026 : !
1027 : ! The head contains NPEs pointers to the tag lists.
1028 : !
1029 0 : SizeX = SUM(Xdist)
1030 0 : SizeY = SUM(Ydist)
1031 0 : SizeZ = SUM(Zdist)
1032 0 : SizeT = SUM(Tdist)
1033 0 : Decomp%GlobalSize = SizeX * SizeY * SizeZ * SizeT
1034 0 : ALLOCATE( Decomp%NumEntries( NPEsX*NPEsY*NPEsZ*NPEsT ) )
1035 0 : ALLOCATE( Decomp%Head( NPEsX*NPEsY*NPEsZ*NPEsT ) )
1036 0 : Counter1 = 0
1037 0 : DO T = 1, NPEsT
1038 0 : DO K = 1, NPEsZ
1039 0 : DO J = 1, NPEsY
1040 0 : DO I = 1, NPEsX
1041 : !
1042 : ! WARNING!!!! The definition of the PE is Row-major ordering
1043 : !
1044 : TruePE = (T-1)*NPEsX*NPEsY*NPEsZ + &
1045 0 : (K-1)*NPEsX*NPEsY + (J-1)*NPEsX + I
1046 0 : NULLIFY( Decomp%Head(TruePE)%StartTags )
1047 0 : NULLIFY( Decomp%Head(TruePE)%EndTags )
1048 0 : NULLIFY( Decomp%Head(TruePE)%Offsets )
1049 : !
1050 : ! The number of entries is the product of the local X, Y, Z allotment
1051 : !
1052 0 : Decomp%NumEntries(TruePE) = &
1053 0 : Xdist(I)*Ydist(J)*Zdist(K)*Tdist(T)
1054 : !
1055 : ! For each Z there are Y separate runs
1056 : !
1057 0 : ALLOCATE( Decomp%Head(TruePE)%StartTags(Ydist(J)*Zdist(K)*Tdist(T)) )
1058 0 : ALLOCATE( Decomp%Head(TruePE)%EndTags(Ydist(J)*Zdist(K)*Tdist(T)) )
1059 0 : ALLOCATE( Decomp%Head(TruePE)%Offsets(Ydist(J)*Zdist(K)*Tdist(T)) )
1060 0 : Counter2 = Counter1 ! Base address
1061 0 : L = 0
1062 0 : DO P = 1, Tdist(T)
1063 0 : Counter3 = Counter2
1064 0 : DO N = 1, Zdist(K)
1065 0 : Counter4 = Counter3
1066 0 : DO M = 1, Ydist(J)
1067 : !
1068 : ! Since this is a regular distribution the definition of
1069 : ! tags is dictated by Xdist(I), and appears Ydist(J) times
1070 : !
1071 : !
1072 0 : L = L + 1
1073 0 : Decomp%Head(TruePE)%StartTags(L) = Counter4 + 1
1074 0 : Decomp%Head(TruePE)%EndTags(L) = Counter4 + Xdist(I)
1075 0 : Counter4 = Counter4 + SizeX
1076 : ENDDO
1077 0 : Counter3 = Counter3 + SizeX*SizeY
1078 : ENDDO
1079 0 : Counter2 = Counter2 + SizeX*SizeY*SizeZ
1080 : ENDDO
1081 0 : Counter1 = Counter1 + Xdist(I) ! Increment base address
1082 : ENDDO
1083 : !
1084 : ! Align the counter such that it points to the start of the next
1085 : ! block. (Ydist(J)-1) since already one X layer has been added in.
1086 : ! Implicit assumption that SizeX = SUM( Xdist )
1087 : !
1088 0 : Counter1 = Counter1 + SizeX*(Ydist(J)-1)
1089 : ENDDO
1090 : !
1091 : ! Align the counter such that it points to the start of the next
1092 : ! block. (Zdist(K)-1) since already one X-Y layer has been added in.
1093 : ! Implicit assumption that SizeY = SUM( Ydist )
1094 : !
1095 0 : Counter1 = Counter1 + SizeX*SizeY*(Zdist(K)-1)
1096 : ENDDO
1097 0 : Counter1 = Counter1 + SizeX*SizeY*SizeZ*(Tdist(T)-1)
1098 : ENDDO
1099 : !
1100 : ! Calculate offsets
1101 : !
1102 0 : DO I=1, NPEsX*NPEsY*NPEsZ*NPEsT
1103 0 : IF ( SIZE(Decomp%Head(I)%StartTags) > 0 ) THEN
1104 0 : Decomp%Head(I)%Offsets(1) = 0
1105 0 : DO J=2, SIZE(Decomp%Head(I)%StartTags)
1106 0 : Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) + &
1107 0 : Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
1108 : ENDDO
1109 : ENDIF
1110 : ENDDO
1111 :
1112 0 : Decomp%Defined = .TRUE.
1113 :
1114 : CPP_LEAVE_PROCEDURE( "DECOMPREGULAR4D" )
1115 0 : RETURN
1116 : !EOC
1117 : END SUBROUTINE DecompRegular4D
1118 : !------------------------------------------------------------------------
1119 :
1120 :
1121 : !------------------------------------------------------------------------
1122 : !BOP
1123 : ! !IROUTINE: DecompCreateIrr --- Decomposition for an irregular mesh
1124 : !
1125 : ! !INTERFACE:
1126 0 : SUBROUTINE DecompCreateIrr( NPEs, Pe, TotalPts, Decomp )
1127 : ! !USES:
1128 : IMPLICIT NONE
1129 : !
1130 : ! !INPUT PARAMETERS:
1131 : INTEGER, INTENT( IN ) :: NPEs ! Number of PEs
1132 : INTEGER, INTENT( IN ) :: Pe(:) ! Processor location
1133 : INTEGER, INTENT( IN ) :: TotalPts ! Number of points
1134 : !
1135 : ! !OUTPUT PARAMETERS:
1136 : TYPE(DecompType), INTENT( OUT ) :: Decomp ! Decomp information
1137 : !
1138 : !
1139 : ! !DESCRIPTION:
1140 : ! Creates a decomposition for a irregular 1-D mesh. The
1141 : ! decomposition is given through the number of points and
1142 : ! an array containing the PE which each point is mapped to.
1143 : ! This mapping naturally assumes that the local numbering
1144 : ! is incrementally increasing as points are mapped to PEs.
1145 : ! This assumption is sufficient for most applications, but
1146 : ! another irregular mapping routine is available for more
1147 : ! complex cases.
1148 : !
1149 : ! !SYSTEM ROUTINES:
1150 : ! ALLOCATE
1151 : !
1152 : ! !REVISION HISTORY:
1153 : ! 98.01.19 Sawyer Creation, with requirements from Jay Larson
1154 : ! 98.11.02 Sawyer Rewritten to requirements for Andrea Molod
1155 : ! 00.07.07 Sawyer Removed use of DimSizes(:) array
1156 : ! 00.11.12 Sawyer Changed argument order for overloading
1157 : !
1158 : !EOP
1159 : !------------------------------------------------------------------------
1160 : !BOC
1161 : !
1162 : ! !LOCAL VARIABLES:
1163 : INTEGER :: I, J, PEhold
1164 0 : INTEGER :: Counter( NPEs )
1165 : !
1166 : CPP_ENTER_PROCEDURE( "DECOMPCREATEIRR" )
1167 : !
1168 : CPP_ASSERT_F90( TotalPts .LE. SIZE( PE ) )
1169 :
1170 : !
1171 : ! The head contains NPEs pointers to the tag lists.
1172 : !
1173 0 : Decomp%GlobalSize = TotalPts
1174 0 : ALLOCATE( Decomp%NumEntries( NPEs ) )
1175 0 : ALLOCATE( Decomp%Head( NPEs ) )
1176 : !
1177 : ! Perform over all points in the mapping
1178 : !
1179 0 : PEhold= -1
1180 0 : Counter = 0
1181 0 : Decomp%NumEntries = 0
1182 0 : DO I=1, TotalPts
1183 : CPP_ASSERT_F90( ( PE( I ) .LT. NPEs .AND. PE( I ) .GE. 0 ) )
1184 0 : IF ( PE( I ) .NE. PEhold ) THEN
1185 0 : PEhold = PE( I )
1186 0 : Counter( PEhold+1 ) = Counter( PEhold+1 ) + 1
1187 : ENDIF
1188 0 : Decomp%NumEntries(PEHold+1) = Decomp%NumEntries(PEHold+1) + 1
1189 : ENDDO
1190 0 : DO I=1, NPEs
1191 : !
1192 : ! Now the amount of space to allocate is known. It is acceptable
1193 : ! to in allocated an array of size 0 (F90 Handbook, Section 6.5.1)
1194 : !
1195 0 : ALLOCATE( Decomp%Head(I)%StartTags(Counter(I)) )
1196 0 : ALLOCATE( Decomp%Head(I)%EndTags(Counter(I)) )
1197 0 : ALLOCATE( Decomp%Head(I)%Offsets(Counter(I)) )
1198 : ENDDO
1199 : !
1200 : ! Perform over all points in the mapping
1201 : !
1202 0 : PEhold= -1
1203 0 : Counter = 0
1204 0 : DO I=1, TotalPts
1205 0 : IF ( PE( I ) .NE. PEhold ) THEN
1206 : !
1207 : ! If not first entry, close up shop on previous run
1208 : !
1209 0 : IF ( I .GT. 1 ) THEN
1210 0 : Decomp%Head(PEhold+1)%EndTags(Counter(PEhold+1)) = I-1
1211 : ENDIF
1212 0 : PEhold = PE( I )
1213 0 : Counter( PEhold+1 ) = Counter( PEhold+1 ) + 1
1214 0 : Decomp%Head(PEhold+1)%StartTags(Counter(PEhold+1)) = I
1215 : ENDIF
1216 : ENDDO
1217 : !
1218 : ! Clean up shop for the final run
1219 : !
1220 0 : IF ( TotalPts .GT. 0 ) THEN
1221 0 : Decomp%Head(PEhold+1)%EndTags(Counter(PEhold+1)) = TotalPts
1222 : ENDIF
1223 :
1224 : !
1225 : ! Calculate offsets
1226 : !
1227 0 : DO I=1, NPEs
1228 0 : IF ( Counter(I) > 0 ) THEN
1229 0 : Decomp%Head(I)%Offsets(1) = 0
1230 0 : DO J=2, Counter(I)
1231 0 : Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) + &
1232 0 : Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
1233 : ENDDO
1234 : ENDIF
1235 : ENDDO
1236 0 : Decomp%Defined = .TRUE.
1237 :
1238 : CPP_LEAVE_PROCEDURE( "DECOMPCREATEIRR" )
1239 0 : RETURN
1240 : !EOC
1241 : END SUBROUTINE DecompCreateIrr
1242 : !------------------------------------------------------------------------
1243 :
1244 :
1245 : !------------------------------------------------------------------------
1246 : !BOP
1247 : ! !IROUTINE: DecompCreateTags --- Decomposition from Pe and Tags
1248 : !
1249 : ! !INTERFACE:
1250 15360 : SUBROUTINE DecompCreateTags(Npes, Pe, TotalPts, Tags, Decomp )
1251 : ! !USES:
1252 : IMPLICIT NONE
1253 : !
1254 : ! !INPUT PARAMETERS:
1255 : INTEGER, INTENT( IN ) :: NPEs ! Number of PEs
1256 : INTEGER, INTENT( IN ) :: Pe(:) ! Processor location
1257 : INTEGER, INTENT( IN ) :: TotalPts ! Number of points
1258 : INTEGER, INTENT( IN ) :: Tags(:) ! Global index
1259 : !
1260 : ! !OUTPUT PARAMETERS:
1261 : TYPE(DecompType), INTENT( OUT ) :: Decomp ! Decomp information
1262 : !
1263 : !
1264 : ! !DESCRIPTION:
1265 : ! Creates a decomposition for a irregular mesh from the
1266 : ! Pe ownership and the Tags. This is a simple extension of
1267 : ! DecompCreateIrr (previously DecompIrregular1D) but is
1268 : ! much more dangerous, since the user can define the Tags
1269 : ! (global indices) arbitrarily.
1270 : !
1271 : ! !SYSTEM ROUTINES:
1272 : ! ALLOCATE
1273 : !
1274 : ! !REVISION HISTORY:
1275 : ! 00.11.12 Sawyer Creation from DecompCreateIrr
1276 : !
1277 : !EOP
1278 : !------------------------------------------------------------------------
1279 : !BOC
1280 : !
1281 : ! !LOCAL VARIABLES:
1282 : INTEGER :: I, J, PEhold, LastTag
1283 30720 : INTEGER :: Counter( NPEs )
1284 : !
1285 : CPP_ENTER_PROCEDURE( "DECOMPCREATETAGS" )
1286 : !
1287 : CPP_ASSERT_F90( TotalPts .LE. SIZE( PE ) )
1288 : CPP_ASSERT_F90( TotalPts .LE. SIZE( Tags ) )
1289 :
1290 : !
1291 : ! The head contains NPEs pointers to the tag lists.
1292 : !
1293 15360 : Decomp%GlobalSize = TotalPts
1294 46080 : ALLOCATE( Decomp%NumEntries( NPEs ) )
1295 11842560 : ALLOCATE( Decomp%Head( NPEs ) )
1296 : !
1297 : ! Perform over all points in the mapping
1298 : !
1299 15360 : PEhold = -1
1300 15360 : LastTag = -999999999
1301 11811840 : Counter = 0
1302 11811840 : Decomp%NumEntries = 0
1303 136707072 : DO I=1, TotalPts
1304 : CPP_ASSERT_F90( PE( I ) .LT. NPEs .AND. PE( I ) .GE. 0 )
1305 136691712 : IF ( LastTag==0 .OR. Tags(I)/=LastTag+1 .OR. PE(I)/=PEhold ) THEN
1306 1552056 : PEhold = PE( I )
1307 1552056 : Counter( PEhold+1 ) = Counter( PEhold+1 ) + 1
1308 : ENDIF
1309 136691712 : Decomp%NumEntries(PEHold+1) = Decomp%NumEntries(PEHold+1) + 1
1310 136707072 : LastTag = Tags(I)
1311 : ENDDO
1312 :
1313 11811840 : DO I=1, NPEs
1314 : !
1315 : ! Now the amount of space to allocate is known. It is acceptable
1316 : ! to in allocated an array of size 0 (F90 Handbook, Section 6.5.1)
1317 : !
1318 23612672 : ALLOCATE( Decomp%Head(I)%StartTags(Counter(I)) )
1319 23612672 : ALLOCATE( Decomp%Head(I)%EndTags(Counter(I)) )
1320 23628032 : ALLOCATE( Decomp%Head(I)%Offsets(Counter(I)) )
1321 : ENDDO
1322 :
1323 : !
1324 : ! Perform over all points in the domain
1325 : !
1326 11811840 : PEhold = -1
1327 11811840 : LastTag = -999999999
1328 11811840 : Counter = 0
1329 136707072 : DO I=1, TotalPts
1330 136691712 : IF ( LastTag==0 .OR. Tags(I)/=LastTag+1 .OR. PE(I)/=PEhold ) THEN
1331 : !
1332 : ! If not first entry, close up shop on previous run
1333 : !
1334 1552056 : IF ( I .GT. 1 ) THEN
1335 1536952 : Decomp%Head(PEhold+1)%EndTags(Counter(PEhold+1)) = LastTag
1336 : ENDIF
1337 1552056 : PEhold = PE( I )
1338 1552056 : Counter( PEhold+1 ) = Counter( PEhold+1 ) + 1
1339 1552056 : Decomp%Head(PEhold+1)%StartTags(Counter(PEhold+1)) = Tags(I)
1340 : ENDIF
1341 136707072 : LastTag = Tags(I)
1342 : ENDDO
1343 : !
1344 : ! Clean up shop for the final run
1345 : !
1346 15360 : IF ( TotalPts .GT. 0 ) THEN
1347 15104 : Decomp%Head(PEhold+1)%EndTags(Counter(PEhold+1)) =Tags(TotalPts)
1348 : ENDIF
1349 :
1350 : !
1351 : ! Calculate offsets
1352 : !
1353 11811840 : DO I=1, NPEs
1354 11811840 : IF ( Counter(I) > 0 ) THEN
1355 19712 : Decomp%Head(I)%Offsets(1) = 0
1356 1552056 : DO J=2, Counter(I)
1357 1532344 : Decomp%Head(I)%Offsets(J) = Decomp%Head(I)%Offsets(J-1) + &
1358 3084400 : Decomp%Head(I)%EndTags(J-1) - Decomp%Head(I)%StartTags(J-1) + 1
1359 : ENDDO
1360 : ENDIF
1361 : ENDDO
1362 15360 : Decomp%Defined = .TRUE.
1363 :
1364 : CPP_LEAVE_PROCEDURE( "DECOMPCREATETAGS" )
1365 15360 : RETURN
1366 : !EOC
1367 : END SUBROUTINE DecompCreateTags
1368 : !------------------------------------------------------------------------
1369 :
1370 :
1371 : !------------------------------------------------------------------------
1372 : !BOP
1373 : ! !IROUTINE: DecompG2L --- Map global index to local and PE
1374 : !
1375 : ! !INTERFACE:
1376 424958976 : SUBROUTINE DecompG2L ( Decomp, Global, Local, Pe )
1377 : ! !USES:
1378 : IMPLICIT NONE
1379 : !
1380 : ! !INPUT PARAMETERS:
1381 : TYPE(DecompType), INTENT( IN ) :: Decomp ! Decomp information
1382 : INTEGER, INTENT( IN ) :: Global ! Global index
1383 : !
1384 : ! !OUTPUT PARAMETERS:
1385 :
1386 : INTEGER, INTENT( OUT ) :: Local ! Local index
1387 : INTEGER, INTENT( OUT ) :: Pe ! Pe location
1388 : !
1389 : !
1390 : ! !DESCRIPTION:
1391 : ! Given a decomposition and a global index, this routine returns
1392 : ! the local index and PE location of that global tag. If the
1393 : ! global index is not found, Local = 0, Pe = -1 is returned.
1394 : !
1395 : ! Note that this routine is not efficient by any stretch of the
1396 : ! imagination --- only one index can be converted at a time.
1397 : ! In addition, a search procedure must be performed, whose
1398 : ! efficiency is inversely proportional to the size of the decomposition
1399 : ! (in particular, to the number of "runs"). Conceptually this
1400 : ! mapping should be used only once in the program for
1401 : ! initialization, and subsequently all calculations should take
1402 : ! place using local indices.
1403 : !
1404 : ! !SYSTEM ROUTINES:
1405 : ! SIZE
1406 : !
1407 : ! !REVISION HISTORY:
1408 : ! 98.03.20 Sawyer Creation
1409 : ! 01.03.17 Sawyer Test for Global==0 (undefined element)
1410 : ! 02.11.22 Sawyer Optimized by caching previously used block
1411 : !EOP
1412 : !------------------------------------------------------------------------
1413 : !BOC
1414 : !
1415 : ! !LOCAL VARIABLES:
1416 : INTEGER, SAVE :: Ipe = 0 ! Initial process ID
1417 : INTEGER, SAVE :: J = 0 ! Initial DO loop value
1418 : INTEGER :: Ipeold, Jold, PEsize, Jsize
1419 : !
1420 : CPP_ENTER_PROCEDURE( "DECOMPG2L" )
1421 :
1422 : !
1423 : ! Search over all the PEs
1424 : !
1425 424958976 : Pe = -1
1426 424958976 : Local = 0
1427 424958976 : IF ( Global == 0 ) RETURN ! quick return
1428 424958976 : PEsize = SIZE( Decomp%Head )
1429 424958976 : IF ( Ipe >= PEsize ) Ipe = 0
1430 424958976 : Ipeold= Ipe
1431 : PEs: DO ! Loop over all PEs starting
1432 995747326 : Jsize = SIZE( Decomp%Head(Ipe+1)%StartTags )
1433 995747326 : IF ( J >= Jsize ) J = 0
1434 995747326 : Jold = J ! from the PE used previously
1435 >10856*10^7 : Blocks: DO WHILE (Jsize > 0) ! Loop through data segments
1436 >21713*10^7 : IF ( Global >= Decomp%Head(Ipe+1)%StartTags(J+1) .AND. &
1437 >10856*10^7 : Global <= Decomp%Head(Ipe+1)%EndTags(J+1) ) THEN
1438 0 : Local = Decomp%Head(Ipe+1)%Offsets(J+1) + Global - &
1439 424958976 : Decomp%Head(Ipe+1)%StartTags(J+1) + 1
1440 424958976 : Pe = Ipe
1441 424958976 : EXIT PEs ! Global tag has been found
1442 : ELSE
1443 >10814*10^7 : J = MOD(J+1,Jsize) ! Increment the block index
1444 : ENDIF
1445 >10814*10^7 : IF ( J == Jold ) EXIT Blocks ! Global tag not on this PE
1446 : ENDDO Blocks
1447 570788350 : Ipe = MOD(Ipe+1,PEsize) ! Increment the pe number
1448 570788350 : J = 0
1449 570788350 : IF ( Ipe == Ipeold ) EXIT PEs ! Global tag not found on any PE
1450 : ENDDO PEs
1451 :
1452 : CPP_ASSERT_F90( Local .LE. Decomp%NumEntries(Pe+1) )
1453 :
1454 : CPP_LEAVE_PROCEDURE( "DECOMPG2L" )
1455 : RETURN
1456 : !
1457 : !EOC
1458 : END SUBROUTINE DecompG2L
1459 : !------------------------------------------------------------------------
1460 :
1461 :
1462 : !------------------------------------------------------------------------
1463 : !BOP
1464 : ! !IROUTINE: DecompG2LVector --- Map global index to local and PE
1465 : !
1466 : ! !INTERFACE:
1467 0 : SUBROUTINE DecompG2LVector ( Decomp, N, Global, Local, Pe )
1468 : ! !USES:
1469 : IMPLICIT NONE
1470 : !
1471 : ! !INPUT PARAMETERS:
1472 : TYPE(DecompType), INTENT( IN ):: Decomp ! Decomp information
1473 : INTEGER, INTENT( IN ) :: N ! Number of indices
1474 : INTEGER, INTENT( IN ) :: Global(:) ! Global index
1475 : !
1476 : ! !OUTPUT PARAMETERS:
1477 :
1478 : INTEGER, INTENT( OUT ) :: Local(:) ! Local index
1479 : INTEGER, INTENT( OUT ) :: Pe(:) ! Pe location
1480 : !
1481 : !
1482 : ! !DESCRIPTION:
1483 : ! Given a decomposition and a global index, this routine returns
1484 : ! the local index and PE location of that global tag. If the
1485 : ! global index is not found, Local = 0, Pe = -1 is returned.
1486 : !
1487 : ! Note that this routine is not efficient by any stretch of the
1488 : ! imagination --- only one index can be converted at a time.
1489 : ! In addition, a search procedure must be performed, whose
1490 : ! efficiency is inversely proportional to the size of the decomposition
1491 : ! (in particular, to the number of "runs"). Conceptually this
1492 : ! mapping should be used only once in the program for
1493 : ! initialization, and subsequently all calculations should take
1494 : ! place using local indices.
1495 : !
1496 : ! !SYSTEM ROUTINES:
1497 : ! SIZE
1498 : !
1499 : ! !REVISION HISTORY:
1500 : ! 02.11.09 Sawyer Creation from decompglobaltolocal
1501 : ! 02.11.22 Sawyer Optimized by caching previously used block
1502 : !
1503 : !EOP
1504 : !------------------------------------------------------------------------
1505 : !BOC
1506 : !
1507 : ! !LOCAL VARIABLES:
1508 : INTEGER, SAVE :: J = 0 ! Initial value
1509 : INTEGER, SAVE :: Ipe = 0 ! Initial value
1510 : INTEGER :: I, Ipeold, Jold, PEsize, Jsize
1511 : !
1512 : CPP_ENTER_PROCEDURE( "DECOMPG2LVECTOR" )
1513 :
1514 0 : PEsize = SIZE( Decomp%Head )
1515 : !
1516 : ! Search over all the PEs
1517 : !
1518 0 : DO I=1, N
1519 0 : Pe(I) = -1
1520 0 : Local(I) = 0
1521 0 : IF ( Global(I) == 0 ) CYCLE
1522 0 : IF ( Ipe >= PEsize ) Ipe = 0
1523 0 : Ipeold= Ipe
1524 0 : PEs: DO WHILE ( PEsize > 0 ) ! Loop over all PEs starting
1525 0 : Jsize = SIZE( Decomp%Head(Ipe+1)%StartTags )
1526 0 : IF ( J >= Jsize ) J = 0
1527 0 : Jold = J ! from the PE used previously
1528 0 : Blocks: DO WHILE (Jsize > 0) ! Loop through data segments
1529 0 : IF ( Global(I) >= Decomp%Head(Ipe+1)%StartTags(J+1) .AND. &
1530 0 : Global(I) <= Decomp%Head(Ipe+1)%EndTags(J+1) ) THEN
1531 0 : Local(I) = Decomp%Head(Ipe+1)%Offsets(J+1) + Global(I) - &
1532 0 : Decomp%Head(Ipe+1)%StartTags(J+1) + 1
1533 0 : Pe(I) = Ipe
1534 0 : EXIT PEs ! Global tag has been found
1535 : ELSE
1536 0 : J = MOD(J+1,Jsize) ! Increment the block index
1537 : ENDIF
1538 0 : IF ( J == Jold ) EXIT Blocks ! Global tag not on this PE
1539 : ENDDO Blocks
1540 0 : Ipe = MOD(Ipe+1,PEsize) ! Increment the pe number
1541 0 : J = 0
1542 0 : IF ( Ipe == Ipeold ) EXIT PEs ! Global tag not found on any PE
1543 : ENDDO PEs
1544 :
1545 : CPP_ASSERT_F90( Local(I) .LE. Decomp%NumEntries(Pe(I)+1) )
1546 :
1547 : ENDDO
1548 : CPP_LEAVE_PROCEDURE( "DECOMPG2LVECTOR" )
1549 0 : RETURN
1550 : !
1551 : !EOC
1552 : END SUBROUTINE DecompG2LVector
1553 : !------------------------------------------------------------------------
1554 :
1555 :
1556 : !------------------------------------------------------------------------
1557 : !BOP
1558 : ! !IROUTINE: DecompL2G --- Map global index to local and PE
1559 : !
1560 : ! !INTERFACE:
1561 0 : SUBROUTINE DecompL2G ( Decomp, Local, Pe, Global )
1562 : ! !USES:
1563 : IMPLICIT NONE
1564 : !
1565 : ! !INPUT PARAMETERS:
1566 : TYPE(DecompType), INTENT( IN ) :: Decomp ! Decomp information
1567 : INTEGER, INTENT( IN ) :: Local ! Local index
1568 : INTEGER, INTENT( IN ) :: Pe ! Pe location
1569 : !
1570 : ! !OUTPUT PARAMETERS:
1571 : INTEGER, INTENT( OUT ) :: Global ! Global index
1572 : !
1573 : !
1574 : ! !DESCRIPTION:
1575 : ! Given a decomposition and a local-pe index pair, this routine
1576 : ! returns the 2-D global index. If the local index is not found,
1577 : ! 0 is returned.
1578 : !
1579 : ! Note that this routine is not efficient by any stretch of the
1580 : ! imagination --- only one index can be converted at a time.
1581 : ! In addition, a search procedure must be performed, whose
1582 : ! efficiency is inversely proportional to the size of the
1583 : ! decomposition (in particular, to the number of "runs").
1584 : ! Conceptually this mapping should be used only once in the
1585 : ! program for initialization, and subsequently all calculations
1586 : ! should take place using local indices.
1587 : !
1588 : ! !SYSTEM ROUTINES:
1589 : ! SIZE
1590 : !
1591 : ! !REVISION HISTORY:
1592 : ! 98.03.20 Sawyer Creation
1593 : !
1594 : !EOP
1595 : !------------------------------------------------------------------------
1596 : !BOC
1597 : !
1598 : ! !LOCAL VARIABLES:
1599 : INTEGER :: J, Counter
1600 : LOGICAL :: Found
1601 : !
1602 : CPP_ENTER_PROCEDURE( "DECOMPL2G" )
1603 : CPP_ASSERT_F90( Pe .GE. 0 )
1604 : CPP_ASSERT_F90( Pe .LT. SIZE(Decomp%Head) )
1605 : CPP_ASSERT_F90( Local .GT. 0 )
1606 : CPP_ASSERT_F90( Local .LE. Decomp%NumEntries(Pe+1) )
1607 :
1608 0 : Counter = 0
1609 0 : Found = .FALSE.
1610 0 : J = 0
1611 0 : DO WHILE ( .NOT. Found )
1612 0 : J = J+1
1613 0 : Counter = Counter + Decomp%Head(Pe+1)%EndTags(J) - &
1614 0 : Decomp%Head(Pe+1)%StartTags(J) + 1
1615 0 : IF ( Local .LE. Counter ) THEN
1616 0 : Found = .TRUE.
1617 : !
1618 : ! The following calculation is not immediately obvious. Think about it
1619 : !
1620 0 : Global = Local - Counter + Decomp%Head(Pe+1)%EndTags(J)
1621 0 : Found = .TRUE.
1622 0 : ELSEIF ( J .GE. SIZE( Decomp%Head(Pe+1)%StartTags ) ) THEN
1623 : !
1624 : ! Emergency brake
1625 : !
1626 0 : Found = .TRUE.
1627 0 : Global = 0
1628 : ENDIF
1629 : ENDDO
1630 :
1631 : CPP_LEAVE_PROCEDURE( "DECOMPL2G" )
1632 0 : RETURN
1633 : !
1634 : !EOC
1635 : END SUBROUTINE DecompL2G
1636 : !------------------------------------------------------------------------
1637 :
1638 :
1639 : !------------------------------------------------------------------------
1640 : !BOP
1641 : ! !IROUTINE: DecompL2GVector --- Map global index to local and PE
1642 : !
1643 : ! !INTERFACE:
1644 0 : SUBROUTINE DecompL2GVector ( Decomp, N, Local, Pe, Global )
1645 : ! !USES:
1646 : IMPLICIT NONE
1647 : !
1648 : ! !INPUT PARAMETERS:
1649 : TYPE(DecompType), INTENT( IN ) :: Decomp ! Decomp information
1650 : INTEGER, INTENT( IN ) :: N ! Number of indices
1651 : INTEGER, INTENT( IN ) :: Local(:)! Local index
1652 : INTEGER, INTENT( IN ) :: Pe(:) ! Pe location
1653 : !
1654 : ! !OUTPUT PARAMETERS:
1655 : INTEGER, INTENT( OUT ) :: Global(:) ! Global index
1656 : !
1657 : !
1658 : ! !DESCRIPTION:
1659 : ! Given a decomposition and a local-pe index pair, this routine
1660 : ! returns the 2-D global index. If the local index is not found,
1661 : ! 0 is returned.
1662 : !
1663 : ! Note that this routine is not efficient by any stretch of the
1664 : ! imagination --- only one index can be converted at a time.
1665 : ! In addition, a search procedure must be performed, whose
1666 : ! efficiency is inversely proportional to the size of the
1667 : ! decomposition (in particular, to the number of "runs").
1668 : ! Conceptually this mapping should be used only once in the
1669 : ! program for initialization, and subsequently all calculations
1670 : ! should take place using local indices.
1671 : !
1672 : ! !SYSTEM ROUTINES:
1673 : ! SIZE
1674 : !
1675 : ! !REVISION HISTORY:
1676 : ! 02.11.09 Sawyer Creation from decomplocaltoglobal
1677 : !
1678 : !EOP
1679 : !------------------------------------------------------------------------
1680 : !BOC
1681 : !
1682 : ! !LOCAL VARIABLES:
1683 : INTEGER :: I, J, Counter
1684 : LOGICAL :: Found
1685 : !
1686 : CPP_ENTER_PROCEDURE( "DECOMPL2GVECTOR" )
1687 0 : DO I=1,N
1688 : CPP_ASSERT_F90( Pe(I) .GE. 0 )
1689 : CPP_ASSERT_F90( Pe(I) .LT. SIZE(Decomp%Head) )
1690 : CPP_ASSERT_F90( Local(I) .GT. 0 )
1691 : CPP_ASSERT_F90( Local(I) .LE. Decomp%NumEntries(Pe(I)+1) )
1692 :
1693 : Counter = 0
1694 : Found = .FALSE.
1695 : J = 0
1696 0 : DO WHILE ( .NOT. Found )
1697 0 : J = J+1
1698 0 : Counter = Counter + Decomp%Head(Pe(I)+1)%EndTags(J) - &
1699 0 : Decomp%Head(Pe(I)+1)%StartTags(J) + 1
1700 0 : IF ( Local(I) .LE. Counter ) THEN
1701 0 : Found = .TRUE.
1702 : !
1703 : ! The following calculation is not immediately obvious. Think about it
1704 : !
1705 0 : Global(I) = Local(I) - Counter + Decomp%Head(Pe(I)+1)%EndTags(J)
1706 0 : Found = .TRUE.
1707 0 : ELSEIF ( J .GE. SIZE( Decomp%Head(Pe(I)+1)%StartTags ) ) THEN
1708 : !
1709 : ! Emergency brake
1710 : !
1711 0 : Found = .TRUE.
1712 0 : Global(I) = 0
1713 : ENDIF
1714 : ENDDO
1715 : ENDDO
1716 :
1717 : CPP_LEAVE_PROCEDURE( "DECOMPL2GVECTOR" )
1718 0 : RETURN
1719 : !
1720 : !EOC
1721 : END SUBROUTINE DecompL2GVector
1722 : !------------------------------------------------------------------------
1723 :
1724 :
1725 : !------------------------------------------------------------------------
1726 : !BOP
1727 : ! !IROUTINE: DecompInfo --- Information about decomposition
1728 : !
1729 : ! !INTERFACE:
1730 204288 : SUBROUTINE DecompInfo( Decomp, Npes, TotalPts )
1731 : ! !USES:
1732 : IMPLICIT NONE
1733 :
1734 : ! !INPUT PARAMETERS:
1735 : TYPE(DecompType), INTENT( IN ):: Decomp ! Decomp information
1736 :
1737 : ! !OUTPUT PARAMETERS:
1738 : INTEGER, INTENT( OUT ) :: Npes ! Npes in decomposition
1739 : INTEGER, INTENT( OUT ) :: TotalPts ! Total points in domain
1740 : !
1741 : !
1742 : ! !DESCRIPTION:
1743 : ! Return information about the decomposition: the number of
1744 : ! PEs over which the domain is decomposed, and the size of
1745 : ! the domain.
1746 : !
1747 : ! !REVISION HISTORY:
1748 : ! 00.11.12 Sawyer Creation
1749 : !
1750 : !EOP
1751 : !---------------------------------------------------------------------
1752 : !BOC
1753 : !
1754 : !
1755 : CPP_ENTER_PROCEDURE( "DECOMPINFO" )
1756 :
1757 204288 : Npes = SIZE( Decomp%Head )
1758 204288 : TotalPts = Decomp%GlobalSize
1759 :
1760 : CPP_LEAVE_PROCEDURE( "DECOMPINFO" )
1761 204288 : RETURN
1762 : !EOC
1763 : END SUBROUTINE DecompInfo
1764 : !------------------------------------------------------------------------
1765 :
1766 0 : END MODULE decompmodule
1767 :
|