[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference turris::fortran

Title:Digital Fortran
Notice:Read notes 1.* for important information
Moderator:QUARK::LIONEL
Created:Thu Jun 01 1995
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1333
Total number of notes:6734

1277.0. "prefetching" by RTOMS::PARETIJ () Wed Apr 30 1997 17:16

unix 4.0b, f77 v4.1
--------------------
Does the fortran compiler allow a prefetch intrinsic ?

- I have an application (fem ) that contains indirect addressing
which I would want to modify as follows :

      Do 10 I=1,M
       KNOI = LELEM(1,I)+1
       KNOJ = LELEM(2,I)+1
       KNOK = LELEM(3,I)+1
c
       joe_i=lelem(1,i+1)+1
       joe_j=lelem(2,i+1)+1
       joe_k=lelem(3,i+1)+1
c
       prefetch(XKNO(1,joe_i))
       prefetch(XKNO(2,joe_j))
       prefetch(XKNO(3,joe_k))
c
       XI   = XKNO(1,KNOK)-XKNO(1,KNOJ)
       YI   = XKNO(2,KNOJ)-XKNO(2,KNOK)
       XJ   = XKNO(1,KNOI)-XKNO(1,KNOK)
       YJ   = XKNO(2,KNOK)-XKNO(2,KNOI)
       XK   = XKNO(1,KNOJ)-XKNO(1,KNOI)
       YK   = XKNO(2,KNOI)-XKNO(2,KNOJ)

...
(is there any documentation on this ? )
/Thanks
Joseph
T.RTitleUserPersonal
Name
DateLines
1277.1We're glad you asked that...TLE::EKLUNDAlways smiling on the inside!Wed Apr 30 1997 17:2913
    	Try:
    
    	CALL DFOR$PREFETCH (address_expression)
    	CALL DFOR$PREFETCH_MODIFY (address_expression)
    	CALL DFOR$PREFETCH_EVICT_NEXT (address_expression)
    
    in f77 only (edit 4-25 for the first, 4-99 for the last two).  There
    is really no formal documentation (yet).  Let us know if you get a
    big benefit!
    
    Cheers!
    Dave Eklund
    
1277.2Great! Uhhh, what's "evict next"HYDRA::NEWMANChuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26Wed Apr 30 1997 18:0320
1277.3TLE::EKLUNDAlways smiling on the inside!Wed Apr 30 1997 19:3311
    	You are correct, we have not yet released a compiler with 4-99
    in it.  I added the intrinsics to the compiler, but do NOT know
    the fine distinctions between the instructions which will get
    generated.  You might also take a look at notes 625.*, 825.* and
    1048.* (among others found with "search pretetch") for a bit more
    information.  For the time being, you will need to content yourself
    with DFOR$PREFETCH until the others show up.
    
    Cheers!
    Dave Eklund
    
1277.410% improvementRTOMS::PARETIJWed Apr 30 1997 22:04885
Dave,
many thanks ...

with this code i get some 10% better performance, but as you
may notice from the comment lines, more "call prefetch"
may slow it down . Is this reasonable ?
Thanks for any additional advice.
cheers
-Joseph


      PROGRAM BUB
C--------------------------------------------------------------------------
C     XKNO   - Knotenkoordinaten
C     TIEFE  - Tiefe am Knoten
C     LELEM  - Knotennummern eines Elementes
C     KEKEN  - Kennung des Elementes fuer geschlossenen Rand
C     F      - Flaeche eines Elementes
C     IKNO   - Nachbarelemente eines Knotens
C     INACH  - Nachbarelemente eines Elementes
C--------------------------------------------------------------------------
      PARAMETER  (NKNOT=40000)
      PARAMETER  (NELEM=80000)
      PARAMETER  (NRKNO=5000)

      LOGICAL LINEAR
      PARAMETER  (LINEAR=.false.)

      DIMENSION XKNO(2,NKNOT),TIEFE(NKNOT),LELEM(3,NELEM),KEKEN(NELEM)
     &     ,UBUBL(NELEM),VBUBL(NELEM)
     &     ,UBUBLA(NELEM),VBUBLA(NELEM)
     &     ,KNOKEN(NKNOT),FM(NKNOT),RLV(NKNOT),F(NELEM)
     &     ,SNORML(2,NKNOT),KNOR(3,NRKNO),Rwert(NRKNO),Rwpol(NRKNO)
     &     ,QOUTx(NKNOT),QOUTy(NKNOT),Hout(NKNOT)
     &     ,uneu(NKNOT),vneu(NKNOT),hneu(NKNOT)
     &     ,hnull(NKNOT),HALT(NKNOT)
     &     ,REIB(NELEM)
     &     ,XRAND(NKNOT),YRAND(NKNOT)
     &     ,QXOUT(NKNOT),QYOUT(NKNOT),QHOUT(NKNOT)
     &     ,INACH(3,NELEM),IKNO(6*NKNOT),IANFA(NKNOT)
     &     , ELETIE(NELEM),ELEH(NELEM)
      CHARACTER IUEB1*80,FORMS*40,NAME*80
      LOGICAL  FILTERT

      DELTAH=0.02
      FILTERT=.false.
      REIBK=0.003
      IAUFS=0
      IVOLU=0
      IB=255

      Write(*,*) 'Einlesen der Systemdaten'
      OPEN(11,FILE='sysdat.bin',STATUS='unknown',
     &       FORM='unformatted',ERR=999)
      Rewind(11)

      READ (11) KENN,NRAND
      READ (11) N,M,N2,N3,M3,IW,IE,II,RHOLUF,G,CORIO,II,II,
     &              IUEB1,IUEB1,IUEB1,IUEB1,IUEB1,FORMS
      READ (11) NEA,NEE,MEA,MEE,NZA,NZE,MZA,MZE
      LFW = IW*M
      LEW = IE*M
      IAPSEQ = 1
      CALL RDSEQ(0,LELEM,DUMMY,M3,11,IAPSEQ)
      CALL RDSEQ(0,KEKEN,DUMMY,M,11,IAPSEQ)
      CALL RDSEQ(1,IDUMMY,XKNO,N2,11,IAPSEQ)
      CALL RDSEQ(1,IDUMMY,TIEFE,N,11,IAPSEQ)
      Close(11)

      Write(*,*) 'Einlesen der Koeffizienten'
      OPEN(11,FILE='coeff.dat',STATUS='old',ERR=999)
      G=9.81
      Read(11,*) TSTART
      Read(11,*) TENDE
      Read(11,*) DT
      Read(11,*) TAPEI,Tmitl,iergz,ihmit
      Read(11,*) Nlay
      Read(11,*) 
      Read(11,*) REIBK
      Read(11,*) AUST
      Read(11,*) AUSTW
      Read(11,*) FILT
      Read(11,*) FILTERT
      Read(11,*) ANFWRT
      Read(11,'(A)') NAME
      Read(11,*) IAUFS
      Read(11,*) Corio
      CORIO=CORIO*DT
      REIBA = REIBK/2.*DT
      G=G/2.*DT
      AUSTF=AUST*DT*1.5
      ianf=0
      iend=0
      Do 76 I=1,80
        If(NAME(I:I).eq.' '.and.Ianf.ne.0)  Iend=I
        If(NAME(I:I).ne.' '.and.Ianf.eq.0)  Ianf=I
        If(Ianf.ne.0) NAME(I-Ianf+1:I-Ianf+1)=NAME(I:I)
        If(Iend.ne.0) NAME(I-Ianf+1:I-Ianf+1)=CHAR(0)
  76  Continue
      Do 75 I=Iend,80
        NAME(I:I)=CHAR(0)
  75  Continue

      Write(*,*) Tstart,Tende,DT
      CLOSE(11)
      TAPE=TAPEI
      TAPE=0.
      if(iergz) then
      imitl= int(Tmitl/(2.*DT))
      if(imitl.gt.int(TAPEI/(2.*DT)-0.6)) imitl=int(TAPEI/(2*DT)-0.6)
      TAPEMA=0.
      TAPEME=imitl*DT
      endif

      Do 2 I=1,N
       uneu(I)=0.0
       vneu(I)=0.0
       hnull(I)=0.0
       hneu(I) =0.0
       hnull(I)= ANFWRT 
       hneu(I) = ANFWRT 
       if(Tiefe(I).lt.-ANFWRT) then
       hneu(I)  = -Tiefe(I)-0.01
       hnull(I) = -Tiefe(I)-0.01
       endif
       FM(I) =0.
       XRAND(I)=0.
       YRAND(I)=0.
       QXOUT(I)=0.
       QYOUT(I)=0.
       QHOUT(I)=0.
       IANFA(I)=0
      Do 2 K=0,5
       IKNO(6*I+K)=0
    2 Continue

      DO 1 I=1,M
       UBUBL(I)=0.
       VBUBL(I)=0.
       UBUBLA(I)=0.
       VBUBLA(I)=0.
         KNOI  = LELEM(1,I)+1
         KNOJ  = LELEM(2,I)+1
         KNOK  = LELEM(3,I)+1
         XI = XKNO(1,KNOI)
         YI = XKNO(2,KNOI)
         XJ = XKNO(1,KNOJ)
         YJ = XKNO(2,KNOJ)
         XK = XKNO(1,KNOK)
         YK = XKNO(2,KNOK)
         F(I)=0.5*(XI*(YJ-YK)+XJ*(YK-YI)+XK*(YI-YJ))
         FM(KNOI)=FM(KNOI)+F(I)
         FM(KNOJ)=FM(KNOJ)+F(I)
         FM(KNOK)=FM(KNOK)+F(I)
         If(F(I).le.0.00001) Write(*,*)'Warning Area of Element '
     &                               ,I,' <=10**-5'
       XI   = XKNO(1,KNOK)-XKNO(1,KNOJ)
       YI   = XKNO(2,KNOJ)-XKNO(2,KNOK)
       XJ   = XKNO(1,KNOI)-XKNO(1,KNOK)
       YJ   = XKNO(2,KNOK)-XKNO(2,KNOI)
       XK   = XKNO(1,KNOJ)-XKNO(1,KNOI)
       YK   = XKNO(2,KNOI)-XKNO(2,KNOJ)
       XRAND(KNOI)=XRAND(KNOI)+XI
       XRAND(KNOJ)=XRAND(KNOJ)+XJ
       XRAND(KNOK)=XRAND(KNOK)+XK
       YRAND(KNOI)=YRAND(KNOI)+YI
       YRAND(KNOJ)=YRAND(KNOJ)+YJ
       YRAND(KNOK)=YRAND(KNOK)+YK
       IANFA(KNOI)=IANFA(KNOI)+1
       IANFA(KNOJ)=IANFA(KNOJ)+1
       IANFA(KNOK)=IANFA(KNOK)+1
       ELEH(I)=0.
       ELETIE(I)=(Tiefe(KNOI)+Tiefe(KNOJ)+Tiefe(KNOK))/3.
       if(ELEH(I).lt.-ELETIE(I)+0.1) ELEH(I)=-ELETIE(I)+0.1
    1 CONTINUE

      J=IANFA(1)
      IANFA(1)=1
      Do 3 I=2,N
       INACH(1,I)=0
       K=IANFA(I)
       IANFA(I)=IANFA(I-1)+J
       J=K
    3 CONTINUE
      IANFA(N+1)=IANFA(N)+J
      if(IANFA(N+1).gt.6*NKNOT)STOP'Speicher zu klein 1'
      Do 5 I =1,M
        KNOI  = LELEM(1,I)+1
        KNOJ  = LELEM(2,I)+1
        KNOK  = LELEM(3,I)+1
        IKNO(IANFA(KNOI)+INACH(1,KNOI))=I
	INACH(1,KNOI)=INACH(1,KNOI)+1
        IKNO(IANFA(KNOJ)+INACH(1,KNOJ))=I
	INACH(1,KNOJ)=INACH(1,KNOJ)+1
        IKNO(IANFA(KNOK)+INACH(1,KNOK))=I
	INACH(1,KNOK)=INACH(1,KNOK)+1
    5 CONTINUE
      Do 561 I=1,M
      Do 561 J=1,3
	INACH(J,I)=-999
  561 CONTINUE
      Do 555 I=1,M
        KNOI  = LELEM(1,I)+1
        KNOJ  = LELEM(2,I)+1
        KNOK  = LELEM(3,I)+1
	do 556 J=IANFA(KNOJ),IANFA(KNOJ+1)-1
	 IEL=IKNO(J)
	 if(LELEM(1,IEL)+1.eq.KNOK.or.
     &      LELEM(2,IEL)+1.eq.KNOK.or.
     &      LELEM(3,IEL)+1.eq.KNOK) then
	    if(I.ne.IEL) INACH(1,I)=IEL
         endif
  556   Continue
	do 557 J=IANFA(KNOI),IANFA(KNOI+1)-1
	 IEL=IKNO(J)
	 if(LELEM(1,IEL)+1.eq.KNOK.or.
     &      LELEM(2,IEL)+1.eq.KNOK.or.
     &      LELEM(3,IEL)+1.eq.KNOK) then
            if(I.NE.IEL) INACH(2,I)=IEL
         endif
  557   Continue
	do 558 J=IANFA(KNOI),IANFA(KNOI+1)-1
	 IEL=IKNO(J)
	 if(LELEM(1,IEL)+1.eq.KNOJ.or.
     &      LELEM(2,IEL)+1.eq.KNOJ.or.
     &      LELEM(3,IEL)+1.eq.KNOJ) then
            if(I.NE.IEL) INACH(3,I)=IEL
         endif
  558   Continue
  555 Continue

      Do 563 I=1,M
      KEK=KEKEN(I)
      if(KEK.ge.4) then
       KEK=KEK-4
      elseif(INACH(3,I).eq.-999) then
       INACH(3,I)=-99
      endif
      if(KEK.ge.2) then
       KEK=KEK-2
      elseif(INACH(2,I).eq.-999) then
       INACH(2,I)=-99
      endif
      if(KEK.ge.1) then
       KEK=KEK-1
      elseif(INACH(1,I).eq.-999) then
       INACH(1,I)=-99
      endif
  563 Continue

      if(REIBK.gt.0.) then
      Z0 = 1./exp( 0.4 /SQRT(REIBK))
      Write(*,*) ' Rauhigkeitslaenge: ',Z0
      rloge = alog(2.7182)
      Do 23 I=1,M
       KNOI  = LELEM(1,I)+1
       KNOJ  = LELEM(2,I)+1
       KNOK  = LELEM(3,I)+1
       tief=(tiefe(KNOI)+tiefe(KNOJ)+tiefe(KNOK))/3.
       ZLAY=Tief
       if(zlay.lt.0.0) zlay=0.0
       if(zlay.gt.Z0*2.7183) then
       REIB(I) = (0.4/(alog(Zlay/Z0)/rloge + Z0/Zlay -1.))**2
       else
       REIB(I) = (0.4/0.36788)**2
      endif
      REIB(I)=REIBK
      REIB(I)=REIB(I)/2.*DT
  23  Continue
      else
      Z0 = 0.
      Write(*,*) ' Rauhigkeitslaenge: ',Z0
      Do 24 I=1,M
      REIB(I)=0.
  24  Continue
      endif

      OPEN(10,FILE=NAME,STATUS='unknown'
     &                    ,FORM='unformatted',ERR=999)
      REWIND(10)
      if(iergz) then
      OPEN(15,FILE='ergnet.bin',STATUS='unknown'
     &                    ,FORM='unformatted',ERR=999)
      REWIND(15)
      endif

      IF(iaufs.gt.0) then
        Do 7 I=1,iaufs-1
         Call RDERG(Zeit,QOUTx(1),QOUTy(1),HNULL(1),N,10,IB)
             Write(*,*) ' einlesen ',I,Zeit
    7   CONTINUE
         Call RDERG(Zeit,QOUTx(1),QOUTy(1),HNULL(1),N,10,IB)
             Write(*,*) ' einlesen ',Iaufs,Zeit
        Do 8 I=1,N
         AH=Tiefe(I)+HNULL(I)
         if(AH.gt.0.0001) then
         UNEU(I)=Qoutx(I)/AH
         VNEU(I)=Qouty(I)/AH
         else
         UNEU(I)= 0.
         VNEU(I)= 0.
         endif
    8 Continue
      TSTART=Zeit
      TAPE=Zeit+TAPEI
      ISATZ=0
      if(iergz) then
      TAPEMA=TAPE-imitl*DT
      TAPEME=TAPE+imitl*DT
      Call RDERG(Zeit1,QOUTx(1),QOUTy(1),Hout(1),N,15,IB)
      Write(*,*)'Eingelesen "ergnet" ',Zeit1
      if(Zeit1.lt.Zeit) then
   34 Call RDERG(Zeit1,QOUTx(1),QOUTy(1),Hout(1),N,15,IB)
      Write(*,*)'Eingelesen "ergnet" ',Zeit1
      if(Zeit1.lt.Zeit) goto 34
      elseif(Zeit1.gt.Zeit) then
      Write(*,*)'Zeiten entsprechen sich nicht',Zeit,Zeit1
      if(Zeit1.lt.Zeit+Tapei) then
      Write(*,*)'"ergnet" wird nicht ordentlich gemittelt'
      TAPEMA=Zeit
      TAPEME=Zeit+imitl*DT
      endif
      endif
      endif
      Do 11 I=1,M
        KNOI = LELEM(1,I)+1
        KNOJ = LELEM(2,I)+1
        KNOK = LELEM(3,I)+1
	UBUBL(I)=(UNEU(KNOI)+UNEU(KNOJ)+UNEU(KNOK))/3.
	VBUBL(I)=(VNEU(KNOI)+VNEU(KNOJ)+VNEU(KNOK))/3.
	ELEH(I)=(HNULL(KNOI)+HNULL(KNOJ)+HNULL(KNOK))/3.
        if(ELEH(I).lt.-ELETIE(I)+0.1) ELEH(I)=-ELETIE(I)+0.1
	UBUBLA(I)=UBUBL(I)
	VBUBLA(I)=VBUBL(I)
   11 Continue
      Else
      Zeit=TSTART
      TAPE=Zeit+TAPEI
      ISATZ=0
      CALL WRERG(Zeit,QOUTx(1),QOUTy(1),Hnull(1),N,10,IB)
      if(iergz) CALL WRERG(Zeit,QOUTx(1),QOUTy(1),Hout(1),N,15,IB)
      Endif

      OPEN(12,FILE='rndbed.bin',STATUS='old',
     &                         FORM='unformatted',ERR=999)
      REWIND(12)
      READ  (12) JJ,IRand
      READ  (12) IMIN,DELTAT
      IF(JJ.GT.0) Write(*,*) ' JJ ne 0 '
      READ (12) ((KNOR(I,K),I=1,3),K=1,IRand)


      NR = N2 - M - 2
      CALL NOREFL(NR,N,M,lelem,xkno,keken,knoken,snorml)

      Write(*,*)
      Write(*,'(30X,1A14)') ' in Berechnung'
      Write(*,*)
      Write(*,*)
      Write(*,'(1X,2A1,1A3)') ' ',char(27),'[3A'

      NZeit=INT((TENDE-TSTART)/DT)+1
      Do 99 IZeit=1,NZEIT

C      Zeit=TSTART+DT*FLOAT(IZeit-1)
      Zeit=Zeit+DT
C      Write(*,*) ' Zeitpunkt ',Zeit,' Satz ',ISATZ
      Write(*,'(1X,2A1,1A3,1A10,1F8.1,1A6,1I4)') ' ',char(27),'[1A',
     & 'Zeitpunkt ',Zeit,' Satz ',ISATZ

      DO 4 I=1,N
      HNEU(I)=0.0
c      RLV(I)= FM(I)
      RLV(I)= 0.0
    4 Continue

      Do 12 I=1,M
       KNOI = LELEM(1,I)+1
       KNOJ = LELEM(2,I)+1
       KNOK = LELEM(3,I)+1
       joe_i=lelem(1,i+1)+1
       joe_j=lelem(2,i+1)+1
       joe_k=lelem(3,i+1)+1
c       joe_inach1=inach(1,i+1)
c       joe_inach2=inach(2,i+1)
c       joe_inach3=inach(3,i+1)
c
       CALL DFOR$prefetch(XKNO(1,joe_i))
       CALL DFOR$prefetch(XKNO(2,joe_j))
       CALL DFOR$prefetch(XKNO(3,joe_k))

       FE   = F(I)
       AHI = ELETIE(I)+ELEH(I)
      UBA=UBUBLA(I)
      VBA=VBUBLA(I)
      KNNA=INACH(1,I)
      if(KNNA.gt.0) then
       XI = XKNO(1,KNOK)-XKNO(1,KNOJ)
       YI = XKNO(2,KNOJ)-XKNO(2,KNOK)
       T = (YI*UBA+XI*VBA)
      if(T.lt.0.) then
      AHN = ELETIE(KNNA)+ELEH(KNNA)
      AHK=AHN
      if(AHI.lt.AHK) AHK=AHI
      T=T*DT*AHK
      UT = T * UBA
      VT = T * VBA
      UBUBL(I)   = UBUBL(I)+UT/FE/AHI
      VBUBL(I)   = VBUBL(I)+VT/FE/AHI
      UBUBL(KNNA)= UBUBL(KNNA)-UT/F(KNNA)/AHN
      VBUBL(KNNA)= VBUBL(KNNA)-VT/F(KNNA)/AHN
      endif
      elseif(KNNA.eq.-99) then
       XI = XKNO(1,KNOK)-XKNO(1,KNOJ)
       YI = XKNO(2,KNOJ)-XKNO(2,KNOK)
       T = (YI*UBA+XI*VBA)*DT
       if(T.lt.0.) then
       UT = T * UBA
       VT = T * VBA
       UBUBL(I)= UBUBL(I)+UT/FE
       VBUBL(I)= VBUBL(I)+VT/FE
       endif
      endif
      KNNA=INACH(2,I)
      if(KNNA.gt.0) then
       XJ = XKNO(1,KNOI)-XKNO(1,KNOK)
       YJ = XKNO(2,KNOK)-XKNO(2,KNOI)
       T = (YJ*UBA+XJ*VBA)
      if(T.lt.0.) then
      AHN = ELETIE(KNNA)+ELEH(KNNA)
      AHK=AHN
      if(AHI.lt.AHK) AHK=AHI
      T=T*DT*AHK
      UT = T * UBA
      VT = T * VBA
      UBUBL(I)   = UBUBL(I)+UT/FE/AHI
      VBUBL(I)   = VBUBL(I)+VT/FE/AHI
      UBUBL(KNNA)= UBUBL(KNNA)-UT/F(KNNA)/AHN
      VBUBL(KNNA)= VBUBL(KNNA)-VT/F(KNNA)/AHN
      endif
      elseif(KNNA.eq.-99) then
       XJ = XKNO(1,KNOI)-XKNO(1,KNOK)
       YJ = XKNO(2,KNOK)-XKNO(2,KNOI)
       T = (YJ*UBA+XJ*VBA)*DT
       if(T.lt.0.) then
       UT = T * UBA
       VT = T * VBA
       UBUBL(I)= UBUBL(I)+UT/FE
       VBUBL(I)= VBUBL(I)+VT/FE
       endif
      endif
      KNNA=INACH(3,I)
      if(KNNA.gt.0) then
       XK = XKNO(1,KNOJ)-XKNO(1,KNOI)
       YK = XKNO(2,KNOI)-XKNO(2,KNOJ)
       T = (YK*UBA+XK*VBA)
      if(T.lt.0.) then
      AHN = ELETIE(KNNA)+ELEH(KNNA)
      AHK=AHN
      if(AHI.lt.AHK) AHK=AHI
      T=T*DT*AHK
      UT = T * UBA
      VT = T * VBA
      UBUBL(I)   = UBUBL(I)+UT/FE/AHI
      VBUBL(I)   = VBUBL(I)+VT/FE/AHI
      UBUBL(KNNA)= UBUBL(KNNA)-UT/F(KNNA)/AHN
      VBUBL(KNNA)= VBUBL(KNNA)-VT/F(KNNA)/AHN
      endif
      elseif(KNNA.eq.-99) then
       XK = XKNO(1,KNOJ)-XKNO(1,KNOI)
       YK = XKNO(2,KNOI)-XKNO(2,KNOJ)
       T = (YK*UBA+XK*VBA)*DT
       if(T.lt.0.) then
       UT = T * UBA
       VT = T * VBA
       UBUBL(I)= UBUBL(I)+UT/FE
       VBUBL(I)= VBUBL(I)+VT/FE
       endif
      endif
   12 CONTINUE

      Do 10 I=1,M
       KNOI = LELEM(1,I)+1
       KNOJ = LELEM(2,I)+1
       KNOK = LELEM(3,I)+1
c
       joe_i=lelem(1,i+1)+1
       joe_j=lelem(2,i+1)+1
       joe_k=lelem(3,i+1)+1
       joe_inach1=inach(1,i+1)
       joe_inach2=inach(2,i+1)
       joe_inach3=inach(3,i+1)
c
       CALL DFOR$prefetch(XKNO(1,joe_i))
       CALL DFOR$prefetch(XKNO(2,joe_j))
       CALL DFOR$prefetch(XKNO(3,joe_k))
c       call dfor$prefetch(tiefe(joe_i))
c       call dfor$prefetch(tiefe(joe_j))
c       call dfor$prefetch(tiefe(joe_k))
c       call dfor$prefetch(Hnull(joe_i))
c       call dfor$prefetch(Hnull(joe_j))
c       call dfor$prefetch(Hnull(joe_k))
       call dfor$prefetch(ububla(joe_inach1))
       call dfor$prefetch(ububla(joe_inach2))
       call dfor$prefetch(ububla(joe_inach3))
c       
       XI   = XKNO(1,KNOK)-XKNO(1,KNOJ)
       YI   = XKNO(2,KNOJ)-XKNO(2,KNOK)
       XJ   = XKNO(1,KNOI)-XKNO(1,KNOK)
       YJ   = XKNO(2,KNOK)-XKNO(2,KNOI)
       XK   = XKNO(1,KNOJ)-XKNO(1,KNOI)
       YK   = XKNO(2,KNOI)-XKNO(2,KNOJ)
       FE   = F(I)
       AHI  = Tiefe(KNOI)+Hnull(KNOI)
       AHJ  = Tiefe(KNOJ)+Hnull(KNOJ)
       AHK  = Tiefe(KNOK)+Hnull(KNOK)
       AHM = (AHI+AHJ+AHK)/3.
       if(LINEAR) THEN
       AHI  = Tiefe(KNOI)
       AHJ  = Tiefe(KNOJ)
       AHK  = Tiefe(KNOK)
       AHM = (AHI+AHJ+AHK)/3.
       ENDIF

       itr = 0
       itro = 0
       if(AHI.le.0.) itro = itro+1
       if(AHJ.le.0.) itro = itro+2
       if(AHK.le.0.) itro = itro+4

c       if(itro.ne.0)then
       if(AHM.le.0.00)then
       AHM=0.0
       UBUBL(I)=0.
       VBUBL(I)=0.
       DIVI = 0.
       DIVJ = 0.
       DIVK = 0.
       goto 33
       endif

       UBA = UBUBL(I)
       VBA = VBUBL(I)

       AHMR=AHM
       if(AHMR.lt.0.001) AHMR=0.001
       AHMR = REIB(I)*SQRT(UBA*UBA+VBA*VBA)/AHMR
       if(AHMR.gt.1.) AHMR=1.
       AHMR=(1.-AHMR)/(1.+AHMR)
       UBUBL(I) = UBUBL(I)*AHMR
       VBUBL(I) = VBUBL(I)*AHMR

      if(INACH(1,I).gt.0) then
      UAT= UBUBLA(INACH(1,I)) - UBUBLA(I)
      VAT= VBUBLA(INACH(1,I)) - VBUBLA(I)
      FG = (XI*XI+YI*YI)*AUSTF/(FE+F(INACH(1,I)))/FE
      UBUBL(I)=UBUBL(I) + UAT*FG
      VBUBL(I)=VBUBL(I) + VAT*FG
      Endif
      if(INACH(2,I).gt.0) then
      UAT= UBUBLA(INACH(2,I)) - UBUBLA(I)
      VAT= VBUBLA(INACH(2,I)) - VBUBLA(I)
      FG = (XJ*XJ+YJ*YJ)*AUSTF/(FE+F(INACH(2,I)))/FE
      UBUBL(I)=UBUBL(I) + UAT*FG
      VBUBL(I)=VBUBL(I) + VAT*FG
      Endif
      if(INACH(3,I).gt.0) then
      UAT= UBUBLA(INACH(3,I)) - UBUBLA(I)
      VAT= VBUBLA(INACH(3,I)) - VBUBLA(I)
      FG = (XK*XK+YK*YK)*AUSTF/(FE+F(INACH(3,I)))/FE
      UBUBL(I)=UBUBL(I) + UAT*FG
      VBUBL(I)=VBUBL(I) + VAT*FG
      Endif


       UBUBL(I) = UBUBL(I) + (
     & -G/FE*(HNULL(KNOI)*YI+HNULL(KNOJ)*YJ+HNULL(KNOK)*YK)
     & + CORIO * VBA
     & )

       VBUBL(I) = VBUBL(I) + (
     & -G/FE*(HNULL(KNOI)*XI+HNULL(KNOJ)*XJ+HNULL(KNOK)*XK)
     & - CORIO * UBA
     & )

       AHM=AHM*1.5*DT
       DIVI = 
     &  AHM * (
     &  +  UBUBL(I) * YI 
     &  +  VBUBL(I) * XI 
     & )
       DIVJ = 
     &  AHM * (
     &  +  UBUBL(I) * YJ 
     &  +  VBUBL(I) * XJ 
     & )
       DIVK = 
     &  AHM * (
     &  +  UBUBL(I) * YK
     &  +  VBUBL(I) * XK
     & )

       goto(700,701,702,706,704,706,706,707) itro+1
  701 Continue
       if(DIVI.lt.0.) then
       DIVJ=DIVJ+DIVI*0.5
       DIVK=DIVK+DIVI*0.5
       DIVI=0.
       itr=1
       endif
       goto 700
  702 Continue
       if(DIVJ.lt.0.) then
       DIVI=DIVI+DIVJ*0.5
       DIVK=DIVK+DIVJ*0.5
       DIVJ=0.
       itr=1
       endif
       goto 700
  704 Continue
       if(DIVK.lt.0.) then
       DIVI=DIVI+DIVK*0.5
       DIVJ=DIVJ+DIVK*0.5
       DIVK=0.
       itr=1
       endif
       goto 700
  706 Continue
       DIVI=0.
       DIVJ=0.
       DIVK=0.
       itr=1
  707 Continue
       itr=1
  700 Continue
       if(itr.ne.0) then
C       if(itro.ne.0) then
       UBUBL(I)=0.
       VBUBL(I)=0.
       DIVI=0.
       DIVJ=0.
       DIVK=0.
       endif

   33 CONTINUE

       Hneu(KNOI) = Hneu(KNOI) + Hnull(KNOI)*FE +DIVI
       Hneu(KNOJ) = Hneu(KNOJ) + Hnull(KNOJ)*FE +DIVJ
       Hneu(KNOK) = Hneu(KNOK) + Hnull(KNOK)*FE +DIVK

   10 CONTINUE
	
      IF(IRand.GT.0) Then
      IF(ZEIT.GE.Rwpol(2)) Then
      JJ = IRand*4+2
   51 READ(12) (Rwpol(J), J=1,JJ)
      IF(ZEIT.GE.Rwpol(2)) Goto 51
      Endif
      IF(ZEIT.LT.Rwpol(1)) Write(*,*) 'Fehler in Datei Randbed.bin '
      DO 53  I=1,IRand
       RWERT(I) = 0.
       I4 = 4*(I-1)+2
       RZ = ZEIT-Rwpol(1)
       RWert(I) = Rwpol(I4+1)+RZ*(Rwpol(I4+2)+RZ*
     &                 (Rwpol(I4+3)+RZ*Rwpol(I4+4)))
   53 Continue
      Endif

      Do 25 KSCH=1,1
      DO 56 I=1,IRand
      DO 56 K=1,3
      KN = KNOR(K,I)+1
      IF(KN.EQ.0) GOTO 56
      IF(K.EQ.3) THEN
      HNEU(KN)   =  RWert(I)*FM(KN)
      Endif
      IF(K.EQ.1) Then
      HNEU(KN) = HNEU(KN)-RWert(I)*YRAND(KN)*DT*1.5
      ELSEIF(K.EQ.2) THEN
      HNEU(KN) = HNEU(KN)-RWert(I)*XRAND(KN)*DT*1.5
      Endif
   56 Continue

      Do 21 I=1,N
      HALT(I)=HNEU(I)/FM(I)
      HNEU(I)=0.0
      UNEU(I)=0.
      VNEU(I)=0.
   21 Continue

      Do 20 I=1,M
       KNOI = LELEM(1,I)+1
       KNOJ = LELEM(2,I)+1
       KNOK = LELEM(3,I)+1
       joe_i=lelem(1,i+1)+1
       joe_j=lelem(2,i+1)+1
       joe_k=lelem(3,i+1)+1
c       joe_inach1=inach(1,i+1)
c       joe_inach2=inach(2,i+1)
c       joe_inach3=inach(3,i+1)
c
       CALL DFOR$prefetch(XKNO(1,joe_i))
       CALL DFOR$prefetch(XKNO(2,joe_j))
       CALL DFOR$prefetch(XKNO(3,joe_k))

       XI   = XKNO(1,KNOK)-XKNO(1,KNOJ)
       YI   = XKNO(2,KNOJ)-XKNO(2,KNOK)
       XJ   = XKNO(1,KNOI)-XKNO(1,KNOK)
       YJ   = XKNO(2,KNOK)-XKNO(2,KNOI)
       XK   = XKNO(1,KNOJ)-XKNO(1,KNOI)
       YK   = XKNO(2,KNOI)-XKNO(2,KNOJ)
       FE   = F(I)
       AHI  = Tiefe(KNOI)
       AHJ  = Tiefe(KNOJ)
       AHK  = Tiefe(KNOK)
       AHM  = (AHI+AHJ+AHK)/3.
       AHI  = Tiefe(KNOI)+(Hnull(KNOI)+HALT(KNOI))/2.
       AHJ  = Tiefe(KNOJ)+(Hnull(KNOJ)+HALT(KNOJ))/2.
       AHK  = Tiefe(KNOK)+(Hnull(KNOK)+HALT(KNOK))/2.
       if(.not.LINEAR) THEN
       AHM  = (AHI+AHJ+AHK)/3.
       ENDIF
       HI  = (-Hnull(KNOI)+HALT(KNOI))
       HJ  = (-Hnull(KNOJ)+HALT(KNOJ))
       HK  = (-Hnull(KNOK)+HALT(KNOK))

C Elementdiagnose
       itro = 0
       if(AHI.le.0.) itro = itro+2
       if(AHJ.le.0.) itro = itro+3
       if(AHK.le.0.) itro = itro+4

       if(AHM.le.0.00)then
       AHM=0.0
       UBUBL(I)=0.
       VBUBL(I)=0.
       DIVI = 0.
       DIVJ = 0.
       DIVK = 0.
       goto 22
       endif

       UBUBL(I) = UBUBL(I) - (
     & G*(HI*YI+HJ*YJ+HK*YK)/FE/5.
     & )

       VBUBL(I) = VBUBL(I) - (
     & G*(HI*XI+HJ*XJ+HK*XK)/FE/5.
     & )

       AHM=AHM*1.5*DT
       DIVI = 
     &  AHM * (
     &  +  UBUBL(I) * YI 
     &  +  VBUBL(I) * XI 
     & )
       DIVJ = 
     &  AHM * (
     &  +  UBUBL(I) * YJ 
     &  +  VBUBL(I) * XJ 
     & )
       DIVK = 
     &  AHM * (
     &  +  UBUBL(I) * YK
     &  +  VBUBL(I) * XK
     & )

       Uneu(KNOI) = Uneu(KNOI)+UBUBL(I)*FE
       Uneu(KNOJ) = Uneu(KNOJ)+UBUBL(I)*FE
       Uneu(KNOK) = Uneu(KNOK)+UBUBL(I)*FE
       Vneu(KNOI) = Vneu(KNOI)+VBUBL(I)*FE
       Vneu(KNOJ) = Vneu(KNOJ)+VBUBL(I)*FE
       Vneu(KNOK) = Vneu(KNOK)+VBUBL(I)*FE
       RLV(KNOI)=RLV(KNOI)+FE
       RLV(KNOj)=RLV(KNOj)+FE
       RLV(KNOk)=RLV(KNOk)+FE
       UBUBLA(I)=UBUBL(I)
       VBUBLA(I)=VBUBL(I)

   22 Continue
       Hneu(KNOI) = Hneu(KNOI) + Hnull(KNOI)*FE +DIVI
       Hneu(KNOJ) = Hneu(KNOJ) + Hnull(KNOJ)*FE +DIVJ
       Hneu(KNOK) = Hneu(KNOK) + Hnull(KNOK)*FE +DIVK

   20 Continue

   25 Continue

      DO 57 I=1,IRand
      DO 57 K=1,3
      KN = KNOR(K,I)+1
      IF(KN.EQ.0) GOTO 57
      IF(K.EQ.3) THEN
      HNEU(KN)   =  RWert(I)*FM(KN)
      Endif

      IF(K.EQ.1) Then
      HNEU(KN) = HNEU(KN)-RWert(I)*YRAND(KN)*DT*1.5
      AHI=HNEU(KN)/FM(KN)+Tiefe(KN)
      IF(AHI.gt.0.05) then
      UNEU(KN)   =  RWert(I)/AHI*RLV(KN)
      Endif
      ELSEIF(K.EQ.2) THEN
      HNEU(KN) = HNEU(KN)-RWert(I)*XRAND(KN)*DT*1.5
      AHI=HNEU(KN)/FM(KN)+Tiefe(KN)
      IF(AHI.gt.0.05) then
      VNEU(KN)   =  RWert(I)/AHI*RLV(KN)
      Endif
      Endif
   57 Continue

	DO 50 I=1,M
        KNOI = LELEM(1,I)+1
        KNOJ = LELEM(2,I)+1
        KNOK = LELEM(3,I)+1
	ELEH(I)=(HNULL(KNOI)+HNULL(KNOJ)+HNULL(KNOK))/3.
	if(ELEH(I).lt.-ELETIE(I)+0.1) ELEH(I)=-ELETIE(I)+0.1
   50 Continue

      Do 30 I=1,N
       HNULL(I)   = HNEU(I)/FM(I)
       if(RLV(I).gt.1e-10) then
       Uneu(I) = Uneu(I)/RLV(I)
       Vneu(I) = Vneu(I)/RLV(I)
       else
       Uneu(I) = 0.
       Vneu(I) = 0.
       endif
   30 Continue
      IF(Zeit.ge.TAPE) then
      TAPE=TAPE+TAPEI
      Do 61 I=1,N
       AHI=Tiefe(I)+HNULL(I)
       QOUTx(I)=UNeu(I)*AHI
       QOUTy(I)=VNeu(I)*AHI
       Hout(I) =HNULL(I)
   61 Continue
      ISATZ=ISATZ+1
      CALL WRERG(Zeit,QOUTx(1),QOUTy(1),Hout(1),N,10,IB)
      write(*,*)' joe QOUTx(1112),QOUTy(7151),Hout(8100)',QOUTx(1112),QOUTy(7151),Hout(8100)
      endif

      if(iergz) then
      If(Zeit.ge.TAPEMA) then
      If(Zeit.le.TAPEME) then
      Do 31 I=1,N
       QXOUT(I)=QXOUT(I)+UNEU(I)
       QYOUT(I)=QYOUT(I)+VNEU(I)
       QHOUT(I)=QHOUT(I)+HNULL(I)
   31 Continue
      IVOLU=IVOLU+1
      else
      Do 62 I=1,N
	if(ihmit) then
         Hout(I)=QHOUT(I)/IVOLU
	else
         Hout(I)=HNULL(I)
	endif
       AHI=Tiefe(I)+Hout(I)
       if(AHI.lt.0.) AHI=0.
       QOUTx(I)=QXOUT(I)/IVOLU*AHI
       QOUTy(I)=QYOUT(I)/IVOLU*AHI
	QXOUT(I)=0.
	QYOUT(I)=0.
	QHOUT(I)=0.
   62 Continue
      Zeit1=(TAPEMA+TAPEME)/2.
      CALL WRERG(Zeit1,QOUTx(1),QOUTy(1),Hout(1),N,15,IB)
      Write(*,*)' Tapen auf ergnet Zeitraum: ',TAPEMA,TAPEME,IVOLU
      Write(*,'(1X,2A1,1A3)') ' ',char(27),'[2A'
      TAPEMA=TAPE-imitl*DT
      TAPEME=TAPE+imitl*DT
      IVOLU=0
      Endif
      Endif
      endif

   99 CONTINUE
      CLOSE(10)
      if(iergz) CLOSE(15)

      STOP
  999 Write(*,*) ' Fehler beim Oeffnen der Datei '
      END

1277.5Thanks for the .4 exampleSTEVEN::hobbsSteven HobbsThu May 01 1997 00:0917
Note that the Digital Fortran compiler can automatically generate its
own prefetch instructions for some loops.  There will be more of this
optimization in future releases.  If you put a call to
DFOR$PREFETCH_xxx in a loop then the optimizer will assume that your
prefetching is better targeted for your application behavior than the
compiler generated prefetching.  In this situation the optimizer will
not do any prefetching of its own in that loop.  It is important that
use that appropriate -tune xxx (/optimize=tune=xxx) option for your
target system.  Different Alpha chips use different prefetch
instructions.

The optimizer will still do automatic prefetching in any loop that
does not contain a call to DFOR$PREFETCH_XXX.

We appreciate examples like .4 because studying these programs will
help us better tune the automatic prefetching optimizations in Digital
Fortran.
1277.610% here, 10% there ...TLE::EKLUNDAlways smiling on the inside!Thu May 01 1997 13:1223
    	I'm pretty sure that I cannot give you good advice in
    this area!  We put in the prefetch intrinsics so that people
    like yourself might experiment with critical loops where
    the compiler might not be generating the very best code.
    I think that the hope is that using switches like -tune and
    -arch and -O5 and -fast and -unroll ... should give you enough
    to get really good performance.  Adding prefetch calls should
    be a last resort for those who really need a little more
    performance, and are willing to experiment long enough to see
    what is possible.  These intrinsics provide just one additional
    tool to help tune critical loops - without the intrinsics, such
    tuning would be very tedious indeed, if not impossible.
                        
    	As to whether adding more calls may slow things down - sure,
    after all it's more instructions.  Only a very careful analysis
    would determine WHY it might be slower.
    
    	Thanks again for the feedback.  And by the way, a 10% improvement
    overnight is not bad at all...!
    
    Cheers!
    Dave Eklund