[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference turris::c_plus_plus

Title:C++
Notice:Read 1.* and use keywords (e.g. SHOW KEY/FULL KIT_CXX_VAX_VMS)
Moderator:DECCXX::AMARTIN
Created:Fri Nov 06 1987
Last Modified:Thu Jun 05 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:3604
Total number of notes:18242

3451.0. "possible optimization bug on UNIX" by HYDRA::DORHAMER () Fri Feb 14 1997 18:36

The following problem was reported by a software partner who is porting
to Digital UNIX v4.0 with DEC C++ v5.5.  Their code gets the expected answers 
when they compile with -g, but gets incorrect answers when compiling without 
-g.

In the working version, compiled with "-g", the lists of int and float
contain the appropriate values:

        % cxx -g -I. -o works bug.cc libL.a
        % works
         1 2 -3 5
         123456789012 20015998343868 5714878441043
         1 -5 3.2 -5.9 3e+07 3.2e+07
         1 -5 3.2 -5.9 3e+07 3.2e+07

In the failing version, compiled without "-g", the lists of int and
float all contain 0.

        % cxx    -I. -o fails bug.cc libL.a
	% fails
         0 0 0 0
         123456789012 20015998343868 5714878441043
         0 0 0 0 0 0
         1 -5 3.2 -5.9 3e+07 3.2e+07

In both versions, the lists of long and double behave correctly.
The code makes use of the LEDA library, which they obtained off the net.

Here is the source code.  The entire source with the header files, library
and makefile can be found in the tar file 8878.tar in the anonymous ftp area on 
fluid.mro.dec.com.


#include <LEDA/list.h>

main ()
{
  list<int> i32List;
  i32List.append (1);
  i32List.append (2);
  i32List.append (-3);
  i32List.append (5);
  int i32;
  forall (i32, i32List) {
    cout << " " << i32;
  }
  cout << endl;

  list<long> i64List;
  i64List.append (123456789012L);
  i64List.append (0x123456789abcL);
  i64List.append (0123123123123123L);
  long i64;
  forall (i64, i64List) {
    cout << " " << i64;
  }
  cout << endl;

  list<float> f32List;
  f32List.append (1);
  f32List.append (-5);
  f32List.append (3.2);
  f32List.append (-5.9);
  f32List.append (3e7);
  f32List.append (3.2e7);
  float f32;
  forall (f32, f32List) {
    cout << " " << f32;
  }
  cout << endl;

  list<double> f64List;
  f64List.append (1);
  f64List.append (-5);
  f64List.append (3.2);
  f64List.append (-5.9);
  f64List.append (3e7);
  f64List.append (3.2e7);
  double f64;
  forall (f64, f64List) {
    cout << " " << f64;
  }
  cout << endl;
}
    
T.RTitleUserPersonal
Name
DateLines
3451.1User error, I think: illegal aliasingWIDTH::MDAVISMark Davis - compiler maniacTue Feb 18 1997 16:46135
Here is a cut down example:

#include <stdio.h>

typedef void *GenPtr;

inline void Avoid_Unused_Warning(GenPtr) {}

inline void* operator new(size_t,void* p,int) { return p; }

inline GenPtr leda_copy(const int& x){
 GenPtr p=0;
if (sizeof(int) <= sizeof(GenPtr)) // COPY_INintO_WORD(int,x,p);
           Avoid_Unused_Warning(new((int*)&p,0) int(x));
 if (sizeof(int) >  sizeof(GenPtr)) p = new int(x);
 return p;
}

struct list {
GenPtr empty, item;

  list(): empty((GenPtr)999), item((GenPtr)1000){}
void append(const int& x) { item = (leda_copy(x));}
};

void main(){
list L;
L.append(123456789);
printf("%ld \n",L.item);
}

The problem is trying to store the int value into storage declared
to be GenPtr p in leda_copy.  The "new((int*)&p,0) int(x) " returns
the address of p as the result of new, and then stores "x" into it.
The type of &p is "pointer to Genptr", the type of new() is 
"pointer to int", and the store is done through the "pointer to int".
C++ aliasing rules say that storing through "pointer to type A" does
NOT affect memory pointed to by pointers of type "pointer to type B"
(except for "pointer to char").

At optimization levels higher than -O0 (implied by -g), all of the
routines are inlined, so the compiler can "see" exactly what is going
on.  To the compiler, the program looks like:

1.	p = 0;
2.	*((int *)&p) = x;
3.	L.item = p;
4.	printf(...,L.item);

The NO aliasing rule implies that the store at 2. cannot modify
the value of p, because p is of type GenPtr, and the object
being stored into is of type int.  Therefore the compiler looks
around and notices that the last value assigned to p was the 0 from
stmt 1., so the generated code looks like:

1.	p = 0;
2.	*((int *)&p) = x;
3.	L.item = 0;
4.	printf(...,0);

and so the value stored, and the value printed, is Zero.  The reason
the program in .0 prints "unexpected" values is because the value
originally stored into the list_items is Zero; it is not that the 
iterators or access routines perform incorrectly.


Soo, I believe the program as written uses illegal aliasing, and 
therefore the compiler is intitled to produce undefined results,
including storing zeros into your list_items.  However, you STILL
want to be able to store your small objects into the pointer, instead
of allocating new storage to place the small value into.  I have 2
workarounds, and there are no doubt others.

I. Omit the assignment of 0 to p.  This makes the sequence look
like:

2.	*((int *)&p) = x;
3.	L.item = p;
4.	printf(...,L.item);

and since the compiler has no literal value that has been assigned
to p, it actually has to load a value out of the storage for p
in order to do stmt 3.  Of course, the compiler might decide to
execute 3. before 2., since they are "independent" according to 
the aliasing rule, which would produce total garbage.  (Note, if
you don't initialize p to 0, there will be garbage in the part of
p that is NOT occupied by x.)

II.  Change Avoid_Unused_Warning so that it is defined in a different
file (so it cannot be inlined), and modify it so that it ALSO takes
the address of p as an argument.  So the program looks like:

void Avoid_Unused_Warning(GenPtr,GenPtr*);  // fn prototype
...
if (sizeof(int) <= sizeof(GenPtr)) // COPY_INintO_WORD(int,x,p);
           Avoid_Unused_Warning(new((int*)&p,0) int(x),&p);
                                                      ^^^

and have a new file with only:

typedef void *GenPtr;

void Avoid_Unused_Warning(GenPtr a,GenPtr* p){}


The effect of this is
1. the compiler has to store x into new(...) before calling
Avoid_Unused_Warning
2. Since &p is passed as an arg, AND the compiler can't see the
routine, it has to assume the value of p might be changed.

So at 
3.	L.item = p;

the compiler has to load the value out of p.  This has the advantage
of always working, at the disadvantage of actually having to make
the subroutine call.


Oh, the same game can be played with "new(...)", I had thought that it
was templatized, but it isn't.  So you can use a prototype:

void* operator new(size_t,void* p,int);

in your main file, and then put the definition into another file:

#include <stdio.h>
void* operator new(size_t  i,void* p,int x) { return p; }


Since new(...&p...) already passes the address of p to an external
routine, the compiler has to assume p has been changed.

Mark Davis
c/c++ team
3451.2workaround #3: declare p to be volatileWIDTH::MDAVISMark Davis - compiler maniacTue Feb 18 1997 19:294
Randy Meyers' suggestion is to use volatile, which will force the
compiler to perform every load and store to p:

 GenPtr volatile p=0;
3451.3workaround #4 and #5DECCXL::OUELLETTEWed Feb 19 1997 20:2424
I think you could also change GenPtr to be a union (and change all of the
uses to match).

typedef union { int i; void *p; } GenPtr;

While the ANSI C standard states that a valid program will only fetch the
union field most recently stored, contrary usage is common enough that no
compiler takes advantage of this rule (and is unlikely to ever do so by
default).  It'd be a whole lot lighter of a hammer than the volatile fix.

Were the cxx compiler to have a -noansi_alias switch, you could use that to
make your much like K&R C code compile.  I'm surprised that the cxx driver
doesn't have the switch.  But the compiler does.  You can bypass the driver
by adding...

	-Wf,-noansi_alias

to your cxx command.

I would think that either of these choices would fix your problem,
but I've tried neither to verify this assertion.  Since Mark has your
example nicely cut down, I may try it with him tomorrow.

R.
3451.4Further clarification.WIDTH::MDAVISMark Davis - compiler maniacFri Feb 21 1997 14:03114
In case I wasn't clear enought in .1, there IS a user error in the program
because it relies on illegal aliasing.  Here's the Draft Standard wording
from section 3.10:
<quote>
15If  a program attempts to access the stored value of an object through
  an lvalue of other than one of the following  types  the  behavior  is
  undefined24):

  --the dynamic type of the object,

  --a cv-qualified version of the dynamic type of the object,

  --a type that is the signed or  unsigned  type  corresponding  to  the
    dynamic type of the object,

  --a  type  that  is the signed or unsigned type corresponding to a cv-
    qualified version of the dynamic type of the object,

  --an aggregate or union type that includes one of  the  aforementioned
    types  among its members (including, recursively, a member of a sub-
    aggregate or contained union),

  --a type that is a (possibly cv-qualified)  base  class  type  of  the
    dynamic type of the object,

  --a char or unsigned char type.

  _________________________
  24) The intent of this list is to specify those circumstances in which
  an object may or may not be aliased.
<unquote>

In other words, if you want to change the value of a GenPtr, the store has
to be done to something of type GenPtr or char; storing to something of
type int produces undefined results.

The "workarounds" in .1 and .2 still produce "undefined" results according
to the standard; they happen to hide enough from the compiler so it does
what we want.  [Digression: .3 workaround:

typedef union { int i; void *p; } GenPtr;

does not work in the original case, where leda_copy is templatized on  
T (=int here) : GenPtr has to be void*, a "universal pointer" in the
underlying, non-template base class.
	It also does not work to use a local union{GenPtr pp;int ii}p;
in leda_copy: the store of "x" will be through an int*, which does not
modify a GenPtr.
end digression]


I think I have a "defined" way of performing the "magic" of storing an
int (or a T) into a GenPtr: by using memmove after doing the "new".
The memmove copies chars via char* s, which seems to obey the last
clause about aliasing.

#include <string.h>		// decl for memmove
#pragma intrinsic(memmove)	// compiler will expand it inline
...


inline GenPtr leda_copy(const int& x){
 GenPtr p=0;
if (sizeof(int) <= sizeof(GenPtr)) 
  { char buf[2*sizeof(int)]; 
  memmove(&p,new((int*)&buf,0) int(x),sizeof(int));}
 if (sizeof(int) >  sizeof(GenPtr)) p = new int(x);
 return p;
}


The effect is to do a "new" into a buffer that is large enough for
the int (including alignment, if necessary, though the compiler will
always align buf on a quadword boundary....); and then bit-copy
those bits into p.  This "defined" workaround does generate the
expected answer.

Note:
1. we can't just copy x, because if it is a class, it might have
	special copy-constructor semantics.
2. we can't just declare "int y=x;" and memmove y to p, because if
	x is a class, y will be destructed when leda_copy is exited,
	with maybe the wrong effects for the bit-copy of y in p.
3. it's OK to bit-copy the new-ed object into p, because the value in
	p is going to be bit-copied all over the place. (I.e., the hack
	of putting an object INSIDE of p will not work for a class type
	that has a self-referential field - though these will probably
	be larger than sizeof(GenPtr), so this is not much of a 
	drawback).


In order to fix the undefined behavior in the original program, you
need to modify LEDA/param_types.h:

inline void* operator new(size_t, void* where, int) { return where; }


/* New code starts here */
#if defined(__DECCXX)

#include <string.h>
#pragma intrinsic(memmove)

#define COPY_INTO_WORD(T,x,p) { char buf[2*sizeof(T)]; \
           memmove(&p,new((T*)&buf,0) T(x),sizeof(T));}
#define CONSTRUCT_WORD(T,p) { char buf[2*sizeof(T)]; \
           memmove(&p,new((T*)&buf,0) T,sizeof(T));}
#else
/* original code */
#define COPY_INTO_WORD(T,x,p) Avoid_Unused_Warning(new((T*)&p,0) T(x))
#define CONSTRUCT_WORD(T,p)   Avoid_Unused_Warning(new((T*)&p,0) T)

#endif    /* new endif */

3451.5documentation on reinterpret_cast?HYDRA::DORHAMERFri Feb 28 1997 12:3833
    The software partner would like a pointer to documentation that
    explains how the compiler handles reinterpret_cast, so they can make
    sure that their code is correct.  Attached is the note from the
    partner.
    
    Thanks,
    Karen
    
    
        Thanks very much, i've forwarded your analysis and fix to the leda
        group.
    
        > In case I wasn't clear enought in .1, there IS a user error in the 
    program
        > because it relies on illegal aliasing.  Here's the Draft Standard
    wording
        > from section 3.10:
        > <quote>
        > 15If  a program attempts to access the stored value of an object through
        >   an lvalue of other than one of the following  types  the behavior is
        >   undefined24):
    
        I had been reading this section and discussing the problem, trying to
        verify your analysis. There seems to be general agreement that the compiler
        is in "spec" though some don't particularly like the handling. Its OK
        with me, i'd rather have the performance that will usually result.
        However, could you point me to the best spot in the docs to read
        about the cxx compiler's handling of reinterpret_cast. The lowest
        levels of our software uses considerable "undefined" or vendor
        defined casting, and though things seem fine now, i don't want to
        be suprised with a new compiler release.....
    
        - - -ernie
3451.6reinterpret_cast is not in current DEC C++ compilers (v5.*)WIDTH::MDAVISMark Davis - compiler maniacFri Feb 28 1997 15:2627
reinterpret_cast, dynamic_cast, static_cast, and const_cast are new
features of ANSI C++, and do not appear in our current V5.* cxx
compilers.  So the easy answer to the question is that there is
no documentation because we don't do it :)

The new ANSI-compliant C++ compiler we are working on, V6.*, does
support these new ansi features, and all the others.  The standard
(5.2.10 Reinterpret cast) sure does contain a lot of 
"the result of such a xxx conversion is unspecified", i.e., "Behavior ...
that depends on the implementation."

The only behavior guaranteed in the standard for most cases is that
for "T1 x"

	x == reinterpret_cast<T1>(reinterpret_cast<T2>(x))

reinterpret_cast<T&>(A) seems to be just a formalization of the old
	*(T*)(void *)(&A) hack .

I assume our "official" behavior specification will be similar to
the current meaning of *(T*)(void *)(&A): that you refer to the same 
storage as A .

HOWEVER: the alias rule against storing to an object via one type and 
loading via another ("unrelated") type will still be assumed.

Mark
3451.7Thanks :-) HYDRA::DORHAMERMon Mar 03 1997 20:168
    Thanks for all your help.  I'm glad to see that while developing a
    world class compiler, you folks are maintaining your sense of humor
    (just noticed that someone changed the title on the base note :-)
    
    I have forwarded all the info to the software partner and they do
    appreciate all of the help, feedback and work arounds.
    
    Karen
3451.8De nadaDECCXX::AMARTINAlan H. MartinMon Mar 03 1997 23:0814
Re .7:

>    Thanks for all your help.  I'm glad to see that while developing a
>    world class compiler, you folks are maintaining your sense of humor
>    (just noticed that someone changed the title on the base note :-)

Actually, it's a technical term.  An AltaVista search on ``type punning''
uncovers a few uses of it out on the web, such as in N.M Maclaren's _User
Experiences with the X Windowing System_, at URL:
http://www-h.eng.cam.ac.uk/help/tpl/graphics/X/comments.html .

I'd have used a more popular phrase if I could think of one that made the title
fit...
				/AHM