Language Specification of the COM Bridge
1 About this Document
This is a draft concerning how COM and UNO types are mapped to types of another environment. The term mapping refers to how COM type library information is represented by UNO type library information and vice versa. So technically, this document is about converting type library data.
2 COM - UNO
2.1 Restrictions and Conventions
COM components are usually distributed with a type library rather than IDL files. The TLB (type library) does not contain all the information that is provided by IDL files. These are:
To generate UNO type library information, one would use the COM TLB and hence have to deal with ambiguities caused by insufficient information. These ambiguities mainly affect pointer parameters. To solve this problem, all of those types are mapped to one special UNO type that contains additional information, which is to be supplied by the programmer.
COM uses Unicode strings: LPCOLESTR, LPOLESTR, OLECHAR*, LPCWSTR, LPWSTR, and BSTR; however, the TLB can contain ASCII strings, and COM programmers might choose to use arrays, such as char[], small[], or short[], as strings. This specification only takes those strings into account which are declared as strings by the TLB, such as:
BSTR
types declared with the string attribute in the MIDL file
COM allows the partial transmission of array data. The COM to UNO mapping does not take this into account; instead, arrays are always transmitted as a whole.
The bridge uses the UNO type library to marshal calls from UNO to COM. Therefore, the used COM interfaces must be represented in that library. To resolve ambiguities and to make programming easier, the UNO TLB contains custom type information, such as pointer parameters, SAFEARRAYs (mapped to any), etc.
MIDL | UNO IDL |
---|---|
(unsigned) boolean | boolean |
byte | unsigned byte |
signed char | byte |
(unsigned) char | unsigned byte |
double | double |
float | float |
(signed) hyper | hyper |
unsigned hyper | unsigned hyper |
signed __int32 | long |
unsigned __int32 | unsigned long |
(signed) __int3264 | unsigned long, unsigned hyper (dependent on platform) |
signed __int64 | hyper |
unsigned __int64 | unsigned hyper |
(signed) long | long |
unsigned long | unsigned long |
(signed) short | short |
unsigned short | unsigned short |
(signed) small | byte |
unsigned small | unsigned byte |
void | [1] |
(unsigned) wchar_t | char |
[1] void is allowed when used with iid_is. void* represents an interface pointer. There is no similar UNO IDL type. See chapter 2.3.
void* is used along with the iid_is attribute in cases where interface pointers are passed whose type is not known until runtime, for example:
HRESULT IActiveScript::GetScriptSite( [in] REFIID riid, [out, iid_is(riid)] void **ppvObject); |
When this method is called from UNO, the caller knows the interface which is to be returned. In this case, the bridge knows the interface by the parameter riid, but it must be told what parameter holds this information. Unfortunately, the COM type library does not contain information concerning iid_is, hence it is up to the UNO programmer to provide this information.
Assuming the bridge knows what parameter contains the IID, there is still the problem of mapping the IID to the UNO interface name. There is currently no way to obtain interface information with just an ID.
If one generates UNO type infos from a COM TLB, then the void* must be replaced by an UNO type. Because void* implies that iid_is is used, the COM2UNO type converter could create a function like this:
// UNO IDL - identifier in, interface out
void GetScriptSite( [in] uik, [out] COM_UNKNOWN_TYPE);
...
struct COM_UNKNOWN_TYPE
{
    short posId; // identifies the parameter that contains the uik
    any value;
};
In MIDL, it is also possible to tag in parameters with the iid_is attribute; then the bridge would not need to know the parameter with the IID because the type is implicitly given.
// UNO IDL - IID in, interface in
void func( [in] uik, [in] any);
In this example, the any contains the interface along with its type description. The IID can also be an out parameter:
// MIDL
HRESULT func([out] IID* refid, [out, iid_is( refid)]void**);
In this case, the bridge also needs to know what parameter contains the IID, so it can map the interface after return.
There is still the possible case where both parameters are in/out parameters.
//MIDL
HRESULT func([in,out] IID* refid, [in,out, iid_is(refid)]void**);
This function can be treated like the case where both parameters are out parameters.
Here is a summary of reasonable combinations:
IID | Interface | Information necessary | Possible UNO Type |
---|---|---|---|
in | out | IID parameter | struct |
in | in | no | any |
out | out | IID parameter | struct |
in,out | in,out | IID parameter | struct |
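The summary table can be read as a small decision rule. The following C++ sketch is only an illustration of that rule; the names Direction, IidMapping, and mapIidPair are hypothetical and are not part of the bridge or of any UNO API.

```cpp
#include <cassert>
#include <string>

// Direction of a parameter as declared in MIDL.
enum class Direction { In, Out, InOut };

// Hypothetical result type: what a COM2UNO converter would have to emit.
struct IidMapping {
    bool needsIidPosition;  // must the programmer name the IID parameter?
    std::string unoType;    // "any" or "struct", as in the summary table
};

// Applies the summary table to one (IID, interface) parameter pair.
// Only the combinations listed in the table are considered reasonable.
inline IidMapping mapIidPair(Direction iid, Direction iface) {
    if (iid == Direction::In && iface == Direction::In) {
        // The any carries the interface together with its type description.
        return {false, "any"};
    }
    // Remaining listed combinations: the bridge maps the interface after the
    // call and must be told which parameter holds the IID.
    return {true, "struct"};
}
```

Only the in/in combination avoids the extra position information, because the any already describes the interface type.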
Unfortunately, the iid_is attribute is not only used with void pointers, but also with well known types which are base interfaces, for example, IUnknown. These examples can be found in system IDL files:
HRESULT ITypeFactory::CreateFromTypeInfo(
    [in] ITypeInfo *pTypeInfo,
    [in] REFIID riid,
    [out, iid_is(riid)] IUnknown **ppv );
HRESULT IOleInPlaceActiveObject::RemoteResizeBorder(
    [in] LPCRECT prcBorder,
    [in] REFIID riid,
    [in, unique, iid_is(riid)] IOleInPlaceUIWindow *pUIWindow,
    [in] BOOL fFrameWindow );
Because of this, pointers are ambiguous: they could be normal pointers or those attributed with iid_is. Therefore, one has to use an UNO type that can contain both normal pointers and void*. The information on what parameter contains the type has to be merged into the struct used for pointers. See paragraph 2.16.
When a COM interface containing iid_is attributes is mapped to UNO, and the interface is implemented in an event sink object that gets called from COM, the bridge cannot convert the pointer to the corresponding UNO type. See paragraphs 2.16 and 2.17. Only if the pointer is not void and it is an in parameter could the bridge create the proxy on the UNO side.
COM uses a variety of arrays, such as fixed, varying, conformant, and conformant varying arrays, as well as SAFEARRAYs and sized pointers.
Fixed arrays have a predefined size and are used as in C.
[ /*Attributes. */ ] interface MyInterface { const long ARRAY_SIZE = 1000; void ARemoteFunc(char achArray[ARRAY_SIZE]); /* Other interface functions. */ } |
Varying arrays have a fixed size; however, depending on additional arguments and attributes, only a part of the array is transmitted.
[ /*Attributes*/ ] interface MyInterface { const long ARRAY_SIZE = 1000; ARemoteFunc( [in] long lFirstElement, [in] long lBlockSize, [in, first_is(lFirstElement), length_is(lBlockSize)] char achArray[ARRAY_SIZE] ); /* Other interface functions */ }; |
Conformant arrays can vary in size which is specified by a separate argument.
[ /*Attributes are defined here. */ ] interface MyInterface { ARemoteProc( long lArraySize, [size_is(lArraySize)] char achArray[*] ); /* Other interface procedures are defined here. */ }; |
HRESULT Func1( [in] long n, [size_is(n)] long * plong);
/* Specifies a pointer to an n-sized block of longs */
HRESULT Func2( [in] long n, [size_is( , n)] long ** pplong);
/* Specifies a pointer to a pointer to an n-sized block of longs */
HRESULT Func3( [in] long n, [size_is(n ,)] long ** pplong);
/* Specifies an n-sized block of pointers to longs */
HRESULT Func4( [in] long m, [in] long n, [size_is(m,n)] long ** pplong);
/* Specifies a pointer to an m-sized block of pointers, each of which points to an n-sized block of longs. */
HRESULT Func5( [out] long * pSize, [out, size_is( , *pSize)] my_type ** ppMyType);
/* Specifies a pointer to a sized pointer, which points to a block of my_types, whose size is unknown when the stub calls the server. */
Arrays can be both conformant and varying.
OLE Automation provides the SAFEARRAY type.
COM arrays can have multiple dimensions, for example:
HRESULT Proc2( [in] short m, [in, size_is(m)] short b[][20]); // If m = 10, b[10][20]
The array attributes which specify a portion of an array that is to be transmitted cause the MIDL compiler to generate a TLB with a parameter description that contains the VARTYPE VT_USERDEFINED. Those attributes are length_is, first_is, and last_is. The parameter description does not give any information about the fact that the parameter is an array, nor does it indicate the element type. If those attributes are used along with an array "[]", then the TLB does not even contain the correct information. Instead, it contains a type with the name of the interface itself. But this is not an issue because the system interfaces use those attributes only with sized pointers.
If the array has an attribute which specifies the size of the array, then the parameter description in the TLB contains the VARTYPE VT_PTR and the respective element type. Those attributes are size_is, min_is, and max_is. The type information does not give any clue about whether the parameter is an array.
Arrays can also be represented by pointers as in C, and the MIDL array attributes can be applied as well. The parameter description contains VT_PTR and the element types, but does not indicate an array.
When a SAFEARRAY is used, then the TLB contains that information. Unfortunately, it does not tell about the dimensions being used. That prevents a tool from generating types like sequence<sequence<xxx>> when a SAFEARRAY has two dimensions.
Multidimensional arrays whose dimensions all have a fixed size are contained in the TLB as multidimensional arrays with the respective dimensions and sizes. If the size of a dimension has been omitted in MIDL then the TLB contains a pointer instead, for example:
//MIDL
HRESULT func( [in]long _size, [in, size_is(_size)]long ar[][10]);
// TLB generated function
HRESULT func( long _size, long ** ar);
For fixed arrays, the type description contains VT_CARRAY, the element type, the count of dimensions, and their boundaries.
It would be desirable to map MIDL arrays to UNO IDL sequences, because the sequence already contains the length; therefore, an additional parameter or member that is referenced by the size_is attribute could be saved. Also, it could be specified that the whole array is always transmitted, which would make parameters redundant which are referenced with the length_is, first_is, and last_is attributes, e.g., a COM interface defines a function:
HRESULT func( [in]short _length, [in]short _first, [in]short _last, [in, size_is(_length), first_is( _first), last_is( _last)] char ar[]); |
void func( [in] sequence<sal_Int8> ar) raises (SomeException); |
If an array is not fixed then the parameter or member is described as pointer by the TLB. Because of the ambiguities caused by that, pointers are mapped to a special UNO type that can contain all those ambiguous types. See chapter Mapping for Pointers for a description of that type. The function in the example above would be mapped to this UNO function:
void func( [in]short _length, [in]short _first, [in]short _last, [in] COM_Ptr ar) raises ( SomeException); |
Although the TLB identifies SAFEARRAYs, it does not tell how many dimensions they have and what size they are; therefore, they cannot be mapped to sequences. Instead, they are mapped to anys. The bridge can infer from the type description provided by the any that it has to convert the data to a SAFEARRAY.
Fixed arrays are identified as such by the COM TLB. They are mapped to UNO arrays of a fixed size.
Arrays | Possible Mappings |
---|---|
pointer | COM_Ptr |
fixed array | array |
SAFEARRAY | any |
Examples:
//MIDL
HRESULT func( [in] long ar[10], [in]short ar2[10][10]);
//UNO
void func( [in] long ar[10], [in]short ar2[10][10]);

//MIDL
HRESULT func( [in] long _size, [out, size_is( _size)]long* ar);
//UNO
void func( [in]long _size, [out] COM_Ptr ar);

//MIDL
HRESULT func( [in] SAFEARRAY(long) ar);
//UNO
void func( [in] any ar);
Strings can occur in various ways in COM interfaces. This specification only takes those strings into account that are identified as strings by the COM TLB. That excludes strings which are actually arrays as in counted arrays:
//Counted Strings
HRESULT func( [in, length_is(count), size_is(STRSIZE)] char ar[], [in] long count);

/* counted string */
typedef struct
{
    unsigned short size;
    unsigned short length;
    [size_is(size), length_is(length)] char string[*];
} COUNTED_STRING;

/* counted string with a fixed maximum length */
typedef struct
{
    unsigned short length;
    [length_is(length)] char string[100];
} COUNTED_STRING2;
Strings are identified as such by the TLB if they are BSTRs or they have the string attribute set in the IDL description.
Strings can also be specified with the string attribute. When applied to one-dimensional arrays of char, byte, or wchar_t, these arrays are treated as strings. The proxy code generated by the MIDL compiler determines the length of the array automatically. When using C, the arrays have to be terminated by a null.
ASCII and Unicode strings are mapped to UNO strings only when they were declared with the string attribute. That is, the COM TLB contains the information that they are in fact strings. BSTRs are automatically contained in the TLB and can therefore be mapped.
MIDL | UNO IDL |
---|---|
ascii string ( char*, byte*, small*) | string |
wide character string (wchar_t*) | string |
BSTR | string |
Typedefs are very similar to C typedefs. The MIDL compiler generates headers which use the defined types and includes the typedef statement.
The generated TLB only contains information about typedefs if the typedef'ed type itself can be represented by an ITypeInfo interface. What types those are is stipulated by TYPEATTR::typekind.
typedef enum tagTYPEKIND { TKIND_ENUM = 0 , TKIND_RECORD , TKIND_MODULE , TKIND_INTERFACE , TKIND_DISPATCH , TKIND_COCLASS , TKIND_ALIAS , TKIND_UNION , TKIND_MAX } TYPEKIND; |
A type is a typedef whenever the member TYPEATTR::typekind has the value TKIND_ALIAS. Then the TYPEATTR struct and ITypeInfo interface provide a means to obtain a type description of the original type.
Whenever the COM TLB contains a typedef (TKIND_ALIAS), then it is an alias for types which are usually not typedef'ed in UNO IDL. This is because MIDL uses C syntax and hence typedefs for structs, enums, and unions. Those typedefs are not mapped.
MIDL knows two union types: encapsulated unions and nonencapsulated unions. The MIDL compiler generates a struct from an encapsulated union that contains a discriminant and the union. Nonencapsulated unions need an additional parameter or member that contains the discriminant. The discriminant is referred to by the switch_is keyword that is attributed to the union.
//Microsoft IDL
//encapsulated union
typedef union My_UNION switch (long l) Type
{
    case 1: char a;
    case 2: short b;
    case 4: long c;
    default: foo obj;
} My_UNION;

// nonencapsulated union
typedef [switch_type(short)] union _UNION_TYPE
{
    [case(24)] float _a;
    [case(25)] double _b;
    [default] ;
} UNION_TYPE;

typedef struct _SOME_TYPE
{
    [switch_is(aNumber)] UNION_TYPE w;
    short aNumber;
} SOME_TYPE;
The COM TLB does not contain the switch_is attribute.
Encapsulated unions are contained in the COM TLB, as a struct with a member that acts as discriminant and another member that is a union. Because a union always needs a discriminant so that MIDL can produce proper marshaling code, one can infer that a struct with a union and only one additional member is an encapsulated union.
// UNO representation of the MIDL example above:
//UNO
union My_UNION switch (long)
{
    case 1: char a;
    case 2: short b;
    case 4: long c;
    default: foo obj;
};
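The inference rule for encapsulated unions (a struct with a union and only one additional member) could be sketched as follows. This is a minimal illustration in C++ over a deliberately simplified record description; MemberKind, RecordDesc, and looksLikeEncapsulatedUnion are hypothetical names, and a real converter would read this information from ITypeInfo instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified classification of the members of a TLB record.
enum class MemberKind { Scalar, Union, Other };

// Hypothetical stand-in for the description of a TKIND_RECORD type.
struct RecordDesc {
    std::vector<MemberKind> members;
};

// A record is treated as an encapsulated union if it has exactly two
// members and exactly one of them is a union; the remaining member is
// then taken to be the discriminant.
inline bool looksLikeEncapsulatedUnion(const RecordDesc& r) {
    if (r.members.size() != 2) {
        return false;
    }
    std::size_t unions = 0;
    for (MemberKind k : r.members) {
        if (k == MemberKind::Union) {
            ++unions;
        }
    }
    return unions == 1;
}
```

The check is a heuristic: an ordinary struct that happens to contain one union and one other member would be classified the same way.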
For nonencapsulated unions, the TLB does not identify the discriminant (it lacks the switch_is attribute). Therefore, the COM2UNO converter cannot omit the discriminant member or parameter. The union itself is mapped to an any. The bridge knows when a parameter or member is a union (custom type information), and can map the any appropriately.
Example:
//MIDL
typedef [switch_type(long)] union _A_UNION
{
    [case(1)] float a;
    [case(2)] double b;
} A_UNION;
HRESULT func( [in]long _type, [in, switch_is( _type)] A_UNION u);
// UNO
void func( [in] long _type, [in] any u);
Mapping:
MIDL | UNO IDL |
---|---|
encapsulated union | union |
nonencapsulated union | any |
The COM TLB does not contain information about const values. Instead,
those values are directly used wherever applied in MIDL, for example, a fixed
array. Therefore, there is no mapping required.
2.9 Enumerators
Enumerations in MIDL are much like UNO enumerations. Because MIDL uses a C-like syntax, the enums are typedef'ed.
typedef enum{...} ENUM_TYPE; or typedef enum _tagXXX{...} ENUM_TYPE; |
Mapping:
//MIDL
typedef enum _MyEnum{a,b,c} MyEnum;
typedef enum {a,b,c} MyEnum;
are mapped to
//UNO IDL
enum MyEnum { a, b, c };
MIDL structures can be mapped to UNO IDL structures. The members are mapped according to the rules for the mapping of those members.
Structs are often declared within a typedef statement.
typedef struct{ ....} MYSTRUCT; typedef struct _MYSTRUCT{...} MYSTRUCT; |
typedef
if the first statement is used.
Mapping:
// Microsoft IDL
typedef struct { T0 m0; T1 m1; ... TN mN; } STRUCTURE;
typedef struct _tagSTRUCTURE { T0 m0; T1 m1; ... TN mN; } STRUCTURE;
struct STRUCTURE { T0 m0; T1 m1; ... TN mN; };
//UNO
struct STRUCTURE { T0 m0; T1 m1; ... TN mN; };
MIDL uses three different pointer types: reference pointers, unique pointers, and full pointers, which are used for in, out, and in/out parameters. See the chapter about parameters for more details.
Because of the ambiguity between pointers and arrays, in parameters cannot be simplified for UNO. Say there is a COM function:
HRESULT func( [in] long * l); |
void func( [in] long l); |
void func( [in] COM_Ptr l);
The COM_Ptr type contains flags which indicate the meaning of the data, for example, a long** could mean:
long**
long* []
long [][]
long*
long []
One certainly does not have to supply all the information, because the bridge could use some logic and inference rules to take some of the load off the programmer.
As mentioned, COM uses three different pointer types with different characteristics. The ptr pointer (ptr is the attribute used in MIDL) is the one that comes closest to the normal C pointer, in that it can be null, it can change its value during the call, it can have aliases, etc. The latter is the most important for the bridge because it has to ensure this behavior even after the conversion of parameters. Let us assume there is a COM function:
HRESULT func( [in] BSTR aString, [in,ptr] wchar_t* pCurrentPosition);
The pCurrentPosition pointer would point to a position within the string. The corresponding UNO function would look like:
void func( [in] string aString, [in] COM_Ptr pCurrentPosition);
Because the pointer is ambiguous, the COM_Ptr type has been used. The bridge converts the string into a BSTR and the COM_Ptr value into a pointer. If the bridge does not know about the correlation between those parameters, then the converted pointer would not point into the string any more. That is because the BSTR is created by the system, so that the pointer to the string data changes completely.
The bridge therefore needs to be informed about a possible correlation between parameters. The programmer could put additional data into the COM_Ptr in terms of which other parameters are affected. This has the drawback that the programmer has to care about another detail, and that the programmer-supplied information is not provided whenever a COM interface is implemented in UNO and gets called from COM. Therefore, the better solution is that the bridge examines the parameters and finds those which are referenced by the pointer. During the parameter conversion, the bridge can take this into account.
MIDL | UNO IDL |
---|---|
boolean | boolean |
unsigned char | unsigned char |
double | double |
float | float |
int | long, hyper (system dependent) |
long | long |
short | short |
BSTR | string |
CY | hyper |
DATE | double |
SCODE | long |
enum | long, hyper (system dependent) |
IDispatch* | COM_IDispatch |
IUnknown* | COM_IUnknown |
SAFEARRAY(type) | any |
VARIANT | any |
VARIANT_BOOL | boolean |
The GUID has a similar construct in UNO where it is named Uik and has a slightly different layout. In both environments, the IDs are unequivocal and their creation is even based on the same tool. The mapping is shown below:
typedef struct _GUID
{
    DWORD Data1;
    WORD Data2;
    WORD Data3;
    BYTE Data4[8];
} GUID;

struct Uik
{
    unsigned long Data1;
    unsigned short Data2;
    unsigned short Data3;
    unsigned long Data4;
    unsigned long Data5;
};
GUID::Data1 ... GUID::Data3 have direct counterparts: Uik::Data1 ... Uik::Data3. The first four elements of GUID::Data4 are mapped to Uik::Data4, and the remaining four bytes, GUID::Data4[4] - GUID::Data4[7], are mapped to Uik::Data5.
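The field-by-field mapping can be sketched in portable C++. The Guid and Uik structs below are stand-ins for the layouts shown above, and guidToUik and packBytes are hypothetical helper names. Note the assumption: this specification does not state in which byte order the four bytes are packed into Uik::Data4 and Uik::Data5, so the sketch simply places the first byte in the most significant position.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-ins for the two layouts (no Windows or UNO headers needed).
struct Guid {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[8];
};

struct Uik {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint32_t Data4;
    std::uint32_t Data5;
};

// Packs four bytes into one 32-bit value, first byte most significant.
// ASSUMPTION: the byte order is not defined by the specification.
inline std::uint32_t packBytes(const std::uint8_t* b) {
    return (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16) |
           (std::uint32_t(b[2]) << 8)  |  std::uint32_t(b[3]);
}

// Maps a GUID to a Uik as described in the text above.
inline Uik guidToUik(const Guid& g) {
    Uik u;
    u.Data1 = g.Data1;
    u.Data2 = g.Data2;
    u.Data3 = g.Data3;
    u.Data4 = packBytes(&g.Data4[0]);  // GUID::Data4[0] .. GUID::Data4[3]
    u.Data5 = packBytes(&g.Data4[4]);  // GUID::Data4[4] .. GUID::Data4[7]
    return u;
}
```

Whatever byte order is chosen, the reverse conversion has to use the same one, otherwise a round trip would not reproduce the original GUID.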
Mapped COM interfaces are decorated with a COM_ prefix, for example, ISomeInterface -> COM_ISomeInterface. Function and parameter names remain the same.
Mapped COM functions have the same name and parameter names; however, the return type changes, and the parameter types may change as well.
COM functions return an HRESULT. This type contains error and success codes for a variety of scenarios. Because an HRESULT contains not only error codes, one cannot map it to exceptions.
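To illustrate why an HRESULT cannot simply be turned into an exception, the following C++ sketch uses portable stand-ins for the Windows definitions. The values of S_OK, S_FALSE, and E_FAIL are the ones from winerror.h; the names HResult, kS_OK, kS_FALSE, kE_FAIL, and succeeded are hypothetical and only used here.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for HRESULT, which is a 32-bit signed value.
using HResult = std::int32_t;

constexpr HResult kS_OK    = 0;                                  // S_OK
constexpr HResult kS_FALSE = 1;                                  // S_FALSE
constexpr HResult kE_FAIL  = static_cast<HResult>(0x80004005u);  // E_FAIL

// Same test as the SUCCEEDED macro: the most significant bit is the
// severity bit, so every non-negative value is a success code.
constexpr bool succeeded(HResult hr) {
    return hr >= 0;
}
```

Both kS_OK and kS_FALSE are success codes, yet they are distinct values. A mapping that only raised an exception on failure would lose the difference between them.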
Often a parameter has the attribute retval, which means that the parameter is the actual return value. If the function is used by Visual Basic or one uses some kind of wrapper, as provided by the #import statement, then that special parameter acts, in fact, as the return value.
The parameter types are mapped to UNO types according to the specification.
COM interface functions are called with the __stdcall calling convention. The mapped UNO function, however, is called with the __cdecl calling convention. The bridge is in charge of realizing the "mapping" of the calling convention at runtime.
A mapped COM interface must have all the characteristics of an UNO interface. This means that it has to inherit from XInterface, because XInterface provides the same functionality as IUnknown: it acts as a substitute for IUnknown.
// COM
interface IUnknown
{
    typedef [unique] IUnknown *LPUNKNOWN;
    HRESULT QueryInterface( [in] REFIID riid, [out, iid_is(riid)] void **ppvObject);
    ULONG AddRef();
    ULONG Release();
}
When XInterface::queryInterface is called, then the bridge converts the Type parameter into the IID. The IIDs are kept in the UNO type library as part of the interface description. That IID is basically the same ID as the one obtained from the COM TLB. On return, the out parameter ppvObject is mapped to the corresponding UNO interface. If the HRESULT is anything other than S_OK, then XInterface::queryInterface raises a com::sun::star::uno::RuntimeException.
IUnknown::AddRef and IUnknown::Release are mapped to XInterface::acquire and XInterface::release. When acquire or release are called, the bridge ignores the return values of the corresponding AddRef and Release functions.
UNO as well as COM interfaces support single inheritance. Whenever a COM interface inherits an interface, the mapped UNO interface does the same. The inherited interface must be mapped as well.
IDispatch is a very special interface in that it realizes dynamic invocation, in other words, scripting. UNO offers a similar interface, XInvocation. One possible mapping would be to map IDispatch directly to XInvocation, as is already done by the OLE bridge. But then the question comes up what to do about dual interfaces and whether they have to inherit XInvocation. Seen from the UNO perspective, this is not necessary because UNO provides an invocation service that automatically creates an invocation object from any given UNO object.
If mapped COM interfaces are used in event sink implementations, that is, they get called from COM, then they must have a mapping of IDispatch, because many COM components require their source interfaces (the callback interfaces) to be dispatch interfaces.
The table summarizes the importance of a mapping of IDispatch:
Scenario | Mapping required |
---|---|
pure Automation component | yes |
dual interfaces (no source) | no |
source interface | yes |
This matter is still worth discussing. For now, IDispatch is mapped as it is and according to this specification.
2.15 In, Out, In / Out Parameters
This chapter shows what information is needed for the bridge to do a proper conversion of all the different parameter types.
COM in parameters can be recognized by the PARAMFLAG_FIN flag in the TLB. The major difference to UNO in parameters is that pointers can be used. Those pointers can act as sized pointers, in other words, they are arrays. In parameters which are not pointers can be mapped as specified for the respective COM types.
The bridge converts UNO parameters and creates pointers to the converted values, as necessary. The level of indirection can be obtained from the COM TLB (custom type information). Because of the ambiguity between pointers and arrays, one has to pass a special UNO construct that can act as those types. That construct could be, for example, an any, so the type (array or not) is implicitly given. There is no need to pass information about whether the value is a pointer, because the bridge knows that from the COM TLB. The bridge might still need additional information, for example, whether the pointer is a ptr pointer. Then one needs to define a struct:
struct COM_Type
{
    long flags; // contains pointer type (ref, unique, ptr)
    any value;
};
Pointer parameters are therefore mapped to a special UNO type (COM_Ptr), because of the ambiguities inherent to the COM TLB.
In/out parameters have the flags PARAMFLAG_FIN and PARAMFLAG_FOUT set. For conversions, the bridge has to convert parameters in both directions. Hence, the considerations for in as well as for out parameters apply. Unlike pure out parameters, in/out parameters can be caller allocated, even if the pointer has two or more levels.
HRESULT func8([in]long _size, [in, out, size_is(_size)]long ** _ar); |
Level of Indirection | Caller Allocated | Callee Allocated | Additional Info necessary |
---|---|---|---|
1 | yes | no | No. (A sequence must have allocated memory). |
2 and more | yes | yes | Position of size parameter when type is array; callee allocated, caller frees memory |
The datatype for in/out parameters could look like this:
// example for an eligible structure
struct COM_Type
{
    long flags;   // for example, FREE_MEM
    long posSize; // identifies the parameter that contains the size of
                  // the array, default is 0
    any value;
};

Example:
1. Caller allocated
// MIDL
HRESULT func( [in]long _size, [in,out]long** _ar )
// COM
long arLong[10];
long* arPLong[10];
for( int i= 0; i < 10; i++)
{
    arPLong[i]= &arLong[i];
    arLong[i]= 0;
}
hr= object->func( 10, (long**)arPLong);
// UNO
COM_Type outparam1;
outparam1.flags= 0;   // means caller allocated
outparam1.posSize= 1; // left-most parameter is #1
Sequence<sal_Int32> ar( 10); //reallocated by bridge
// filling the sequence with long values, the bridge has to create an array of long*[10]
...
outparam1.value= makeAny( ar);
object->func( 10, outparam1);
The COM specification says: out parameters are allocated by the callee and freed by the caller. The bridge then acts as the caller and thus has to free the results. This statement is quite confusing, because lots of COM interfaces, as can be found in the include directory of Visual Studio, pass arrays as out parameters and have the function fill in the data. That filling in could hardly be called allocating data. An example:
HRESULT IDispatch::GetIDsOfNames(
    [in] REFIID riid,
    [in, size_is(cNames)] LPOLESTR * rgszNames,
    [in] UINT cNames,
    [in] LCID lcid,
    [out, size_is(cNames)] DISPID * rgDispId);
Out parameters are identified in the TLB by the PARAMFLAG_FOUT flag. That flag allows us to produce appropriate out parameters in UNO. Depending on the specification of the COM interfaces, the bridge has to consider releasing the out parameters. The bridge either allocates memory for the type and passes a pointer to the COM function, or it allocates memory for a pointer to the type and passes a pointer to that pointer to the function. If a high level pointer is required, then the bridge always allocates memory for the type in memory and passes a pointer.
Another issue is arrays, which are allocated by the callee and whose size is passed as an out parameter; that is, the caller does not know the size until return of the function:
HRESULT func( [out]long* _size, [out, size_is( ,*_size)]long** ar); |
If a pointer of level two or higher is used, then the bridge cannot recognize whether it has to supply the memory or only a pointer; therefore, we need an UNO construct that is used as an out parameter and contains the information.
We could use a structure that contains a member which indicates who is responsible for freeing the memory.
struct COM_Type
{
    long flags;   // for example, FREE_MEM
    long posSize; // identifies the parameter that contains the size of
                  // the array, default is 0
    any value;
};
COM_Type::value contains the actual out parameter, which is provided by the programmer and is assigned to the out value by the bridge.
In the case an array is passed by reference, then the sequence must have the proper length. This length is used by the bridge to determine the size of the memory which has to be copied from the out parameter into the sequence.
Examples:
1. out parameter is *
1a)
// MIDL
HRESULT func( [out]long* l);
// COM usage
long l;
object->func( &l);
// UNO usage
COM_Type outparam;
sal_Int32 lvalue= 0;
outparam.posSize= 0;
outparam.value= makeAny( lvalue);
object->func( outparam);
long (*)[], caller allocated. The bridge needs to know the size; therefore, the Sequence's length must be set.
//MIDL
HRESULT func( [in]long _size, [out, size_is(_size)]long * ar);
//COM
long ar[10];
object->func( 10, (long*) ar);
// UNO
COM_Type outparam;
Sequence<sal_Int32> seq(10);
outparam.posSize= 0;
outparam.value= makeAny( seq);
object->func( 10, outparam);
long, callee allocated, and the caller (the bridge) must free the memory.
// MIDL
HRESULT func( [out]long** l);
// COM
long* pLong= NULL;
object->func( &pLong);
CoTaskMemFree( pLong);
// UNO
COM_Type outparam;
outparam.flags= FREE_MEM;
outparam.posSize= 0;
sal_Int32 lvalue= 0;
outparam.value= makeAny( lvalue);
object->func( outparam);
long [], callee allocated, and the caller must free the memory; the size of the array is not known before the call.
//MIDL
HRESULT func( [out]long* _size, [out, size_is( , *_size)]long** par);
//COM
long _size=0;
long * ar= NULL;
object->func( &_size, &ar);
CoTaskMemFree( ar);
//UNO
COM_Type outparam1;
outparam1.flags=0;
outparam1.posSize=0;
sal_Int32 lsize= 0;
outparam1.value= makeAny( lsize);
COM_Type outparam2;
outparam2.flags= FREE_MEM;
outparam2.posSize= 1; // left-most parameter is #1
Sequence<sal_Int32> ar; //reallocated by bridge
outparam2.value= makeAny( ar);
object->func( outparam1, outparam2);
long[][], caller allocated multidimensional array.
//MIDL
HRESULT func( [in] long _size, [out, size_is(_size)]long ar[][10]);
//COM
long ar[5][10];
object->func( 5, (long**)ar);
//UNO
COM_Type outparam1;
outparam1.flags=0;
outparam1.posSize=0;
Sequence< Sequence<sal_Int32> > ar;
.... // reallocate all Sequences to have the proper size
outparam1.value= makeAny( ar);
object->func( 5, outparam1);
The table shows when a value might be caller or callee allocated dependent on
the level of indirection of the pointer, described in the COM TLB.
Level of Indirection | Caller Allocated | Callee Allocated | Additional Info necessary |
---|---|---|---|
1 | yes | no | No. (A sequence must have allocated memory). |
2 and more | yes | yes | Position of size parameter when type is array and callee-allocated |
This mapping combines the mapping of pointers which can be used as in, out, and in/out parameters. The UNO type that is mapped to, must contain the following information:
whether the pointer is a full pointer (ptr attribute).
the position of the parameter that contains the IID (iid_is) of the interface.
// UNO IDL
struct COM_Ptr
{
    long flags;
    short attrPos; // position of size_is or iid_is attribute
    any value;
};
constants COM_Ptr_Flags
{
    const long PTR_POINTER= 0x1;
    const long FREE_MEM= 0x2;
    const long IID= 0x4;
    const long SIZE= 0x8;
};
COM_Ptr is used for C arrays of varying length and for pointers.
The COM_Ptr_Flags have the following meanings:
PTR_POINTER: COM_Ptr::value contains a pointer to data that can also be used by other parameters, or those parameters point to the same data. The bridge must then examine the parameters to determine whether there is any correlation.
FREE_MEM: If a pointer is an in/out or out parameter and the level of indirection is higher than one, this flag means that the bridge (the caller) must free the memory that has been allocated by the callee.
IID: The pointer is a void* or other interface pointer, and can be an in, out, or in/out parameter.
SIZE: The pointer is an array that is allocated by the callee, and the size parameter is set by the callee.
The members of COM_Ptr:
flags: contains one flag or a combination of flags from the COM_Ptr_Flags group.
attrPos: contains the position of a parameter that carries information concerning the COM_Ptr type. Such parameters are referenced in MIDL by the size_is and iid_is attributes. The leftmost parameter is the first parameter. When size_is is used, the COM_Ptr acts as an in/out or out parameter whose array is allocated by the callee.
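How a bridge might read these fields can be sketched in portable C++. The flag values are taken from the COM_Ptr_Flags group above; the struct ComPtrInfo and the helpers mustFreeMemory and relatedParamIndex are hypothetical names used only for illustration, and the any member is left out.

```cpp
#include <cassert>
#include <cstdint>

// Flag values copied from the COM_Ptr_Flags constants group.
namespace com_ptr_flags {
constexpr std::int32_t PTR_POINTER = 0x1;
constexpr std::int32_t FREE_MEM    = 0x2;
constexpr std::int32_t IID         = 0x4;
constexpr std::int32_t SIZE        = 0x8;
}

// Simplified stand-in for the UNO IDL struct COM_Ptr (assumption: the
// "any value" member is omitted because the flag handling does not need it).
struct ComPtrInfo {
    std::int32_t flags;
    std::int16_t attrPos;  // 1-based position of the size_is / iid_is parameter
};

// Hypothetical helper: must the bridge free memory allocated by the callee?
inline bool mustFreeMemory(const ComPtrInfo& p) {
    return (p.flags & com_ptr_flags::FREE_MEM) != 0;
}

// Hypothetical helper: zero-based index of the parameter that carries the
// size or IID, or -1 when the flags do not refer to another parameter.
inline int relatedParamIndex(const ComPtrInfo& p) {
    const std::int32_t mask = com_ptr_flags::SIZE | com_ptr_flags::IID;
    if ((p.flags & mask) == 0 || p.attrPos < 1) {
        return -1;
    }
    return p.attrPos - 1;  // the leftmost parameter is parameter 1
}
```

A callee-allocated array whose size is reported in the first parameter would carry FREE_MEM and SIZE with attrPos set to 1.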
2.17 Restrictions with COM interfaces implemented in UNO
COM functions can have parameters that are only useful together with another parameter; that is, they carry information about that parameter which is only known at run time. An example is a C array whose size is not known at compile time. To map such functions to UNO, one has to employ special UNO types that carry additional information.
//MIDL HRESULT func( [out]long *_size, [out, size_is( , *_size)]long** _p); |
sequence
provides its length itself). The corresponding UNO function would look like
void func( [out] COM_Ptr _size, [out] COM_Ptr _p); |
The COM_Ptr type is a struct with additional information for the bridge. _p would contain the
information that parameter one represents the size of the array. The point is
that the UNO programmer has to supply this information, and there is no other way for
the bridge to obtain it (unless one parsed the MIDL file).
Now suppose that a COM object is being accessed by an UNO client and that
the object supports events. In other words, the client can register an event
sink with the object. The sink object needs to implement the COM interface
that is described by the COM TLB (attribute source in the coclass
section). The client provides the sink implementation, which implements the COM
interface as a mapped UNO interface. The sink object, therefore, is pure UNO. After
registration, the COM object can fire events, which result in calls to the UNO
event sink interface. The bridge then has to convert an array (long**)
to a COM_Ptr type, but it does not know the size of the array,
because it cannot know which parameter contains the size.
COM event interfaces rarely use out parameters, but it is possible.
The following types do not work with event interfaces:
MIDL | UNO |
---|---|
void** | COM_Ptr |
type** ( callee allocated array, out parameter) | COM_Ptr
no |
3 UNO - COM |
A COM programmer usually needs header files with type declarations to program
a component. The header files are generated either by the MIDL compiler or through
the #import statement. To get headers from UNO types, one could
either create them directly from the UNO type library or first create a
COM TLB. A COM TLB would allow the UNO components to be used from several
programming languages, including the .NET languages; therefore, the first approach
of this specification is to define mappings from the UNO TLB to the COM TLB.
UNO IDL | MIDL |
---|---|
char | wchar_t |
boolean | boolean |
byte | char |
short | short |
unsigned short | unsigned short |
long | long |
unsigned long | unsigned long |
hyper | hyper |
unsigned hyper | unsigned hyper |
float | float |
double | double |
3.3 Constants
UNO constants cannot be mapped to COM constants by generating a COM TLB.
MIDL allows one to declare static or const members of interfaces, and the generated
header, in fact, contains a static variable or a define, but the TLB does not contain
that information. The only way to declare constants would be to use a module,
but that does not correspond to the use of modules along with COM components.
Moreover, the #import statement apparently only evaluates the coclass
entry, which cannot contain a reference to the module.
UNO enumerations are mapped to MIDL enumerations. The namespace is merged into the enumeration name.
//UNO module com { module sun { module star { module text { enum WrapTextMode { NONE, THROUGHT, PARALLEL, DYNAMIC, LEFT, RIGHT }; }; }; }; }; // MIDL typedef enum _com_sun_star_text_WrapTextMode { NONE, THROUGHT, PARALLEL, DYNAMIC, LEFT, RIGHT }com_sun_star_text_WrapTextMode; |
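The name merging shown in the enum example can be sketched as a small string transformation. The helper names midlTypeName and midlTagName are hypothetical, and the dotted input form (com.sun.star.text.WrapTextMode) is an assumption about how the tool would receive the fully qualified UNO name.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: merge a dotted UNO type name into a flat MIDL name by
// replacing every namespace separator with an underscore.
inline std::string midlTypeName(const std::string& unoName) {
    std::string out = unoName;
    for (char& c : out) {
        if (c == '.') {
            c = '_';
        }
    }
    return out;
}

// Hypothetical helper: tag name as used in
// "typedef enum _<name> { ... } <name>;".
inline std::string midlTagName(const std::string& unoName) {
    return "_" + midlTypeName(unoName);
}
```

For com.sun.star.text.WrapTextMode this yields the typedef name com_sun_star_text_WrapTextMode and the tag _com_sun_star_text_WrapTextMode, matching the MIDL in the example.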
An UNO string is mapped to a BSTR.
3.6 Struct
Structs are mapped to MIDL structs, according to the specification of each element. The name of the MIDL struct contains the namespace.
//MIDL typedef struct _com_sun_star_uno_SomeStruct { long a; double b; } com_sun_star_uno_SomeStruct; |
UNO unions are mapped to encapsulated unions. ( UNO unions are currently not specified, but might be in the near future.)
//UNO module com { module sun { module star { module uno { union SomeUnion switch (long) { case 1: long a; case 2: double b; default: byte[8]; }; }; }; }; }; //MIDL typedef union _com_sun_star_uno_SomeUnion switch (long) { case 1: long a; case 2: double b; default: char[8]; } com_sun_star_uno_SomeUnion; |
Sequences are mapped to SAFEARRAYs. A sequence that contains another sequence is mapped to a SAFEARRAY with two
dimensions; a sequence that contains sequences which in turn contain sequences
is mapped to a SAFEARRAY of three dimensions; and so on.
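The rule that the nesting depth of a sequence determines the number of SAFEARRAY dimensions can be expressed as a compile-time computation. In this sketch std::vector stands in for the UNO sequence type, and the trait name SafeArrayDims is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: std::vector stands in for the UNO sequence template.
// Base case: a non-sequence element type contributes no dimension.
template <typename T>
struct SafeArrayDims {
    static constexpr std::size_t value = 0;
};

// Recursive case: each level of sequence nesting adds one dimension.
template <typename T>
struct SafeArrayDims<std::vector<T>> {
    static constexpr std::size_t value = 1 + SafeArrayDims<T>::value;
};

using LongSeq  = std::vector<long>;
using LongSeq2 = std::vector<LongSeq>;
using LongSeq3 = std::vector<LongSeq2>;

static_assert(SafeArrayDims<LongSeq>::value == 1, "plain sequence");
static_assert(SafeArrayDims<LongSeq2>::value == 2, "sequence of sequences");
```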
3.9 Arrays
Arrays are mapped to SAFEARRAYs. Multidimensional arrays are mapped
to SAFEARRAYs with as many dimensions.
3.10 Any
Anys are mapped to VARIANTs.
3.11 Uik
Uiks are mapped to GUIDs.
struct Uik { unsigned long Data1; unsigned short Data2; unsigned short Data3; unsigned long Data4; unsigned long Data5; }; typedef struct _GUID { DWORD Data1; WORD Data2; WORD Data3; BYTE Data4[8]; } GUID; |
The members Uik::Data1 ... Uik::Data3 are mapped
to GUID::Data1 ... GUID::Data3, and the members Uik::Data4
and Uik::Data5 are mapped to GUID::Data4.
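A minimal sketch of this member mapping follows. The structs use fixed-width integers as portable stand-ins for the declarations above, the names Guid and uikToGuid are hypothetical, and the byte order is an assumption: Data4 and Data5 are written most significant byte first, so that the textual form of the Uik and of the resulting GUID agree. The text above does not specify the byte order.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-ins for the Uik and GUID structs (assumption: fixed-width
// integers replace unsigned long / DWORD / WORD / BYTE).
struct Uik {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint32_t Data4;
    std::uint32_t Data5;
};

struct Guid {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[8];
};

// Hypothetical conversion. Data1..Data3 are copied unchanged; Data4 and Data5
// fill the eight bytes of Guid::Data4, most significant byte first
// (an assumption, see the lead-in).
inline Guid uikToGuid(const Uik& uik) {
    Guid g{};
    g.Data1 = uik.Data1;
    g.Data2 = uik.Data2;
    g.Data3 = uik.Data3;
    for (int i = 0; i < 4; ++i) {
        const int shift = 8 * (3 - i);
        g.Data4[i]     = static_cast<std::uint8_t>(uik.Data4 >> shift);
        g.Data4[i + 4] = static_cast<std::uint8_t>(uik.Data5 >> shift);
    }
    return g;
}
```

Under this assumption, the XInterface Uik E227A391-33D6-11D1-AABE00A0-249D5590 produces the Data4 bytes AA BE 00 A0 24 9D 55 90.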
Only typedefs for
can be mapped to COM TLB typedefs. All other typedefs must be converted so that the new type is substituted by its original type.
In fact, structs, enumerations, and unions should always be typedef'ed in the
COM TLB, because the #import statement creates C code as well.
COM offers two ways of reporting errors: HRESULTs or error handling
interfaces. One is free to define one's own HRESULTs, but that bears the
risk of clashes when code from different contributors is mixed.
The error handling interfaces instead offer interface-based error information.
To support this error reporting mechanism, the bridge must provide implementations
of ISupportErrorInfo and IErrorInfo. Whenever a mapped
UNO interface is queried for ISupportErrorInfo, the bridge
has to return that interface.
3.14 Interfaces
A mapped UNO interface keeps its name except for the leading "X",
which is replaced by an "I". The namespace is not reflected in the
name. That poses no risk of ambiguity because COM interfaces are identified
by their GUIDs. The GUID is created from the Uik
belonging to the interface.
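The naming rule can be sketched as follows. The helper name comInterfaceName is hypothetical, and the dotted fully qualified input form and the namespace used in the usage note are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: derive the COM interface name from a (possibly fully
// qualified) UNO interface name.
inline std::string comInterfaceName(const std::string& unoName) {
    // The namespace is not reflected in the COM name, so drop it.
    const std::string::size_type dot = unoName.rfind('.');
    std::string name =
        (dot == std::string::npos) ? unoName : unoName.substr(dot + 1);

    // Replace the leading "X" of the UNO interface with "I".
    if (!name.empty() && name[0] == 'X') {
        name[0] = 'I';
    }
    return name;
}
```

For example, com.sun.star.uno.XAnUnoInterface would become IAnUnoInterface. Two UNO interfaces with the same local name in different namespaces would receive the same COM name and be told apart only by their GUIDs.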
The return value is mapped to an out parameter, and exceptions are mapped to
error codes or, if no error code is specified, to E_FAIL.
//UNO long func( [in]short a, [out] string s ) raises ( SomeException); //MIDL HRESULT func( [in]short a, [out] BSTR* s, [out, retval] long* ret); |
The retval attribute indicates that the parameter is the return
value. The parameters are mapped according to their respective specification.
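The following sketch shows the shape of such a mapped call: the UNO return value travels through the retval out parameter, and an exception becomes an error code. It is a simplified, portable illustration, not the bridge's implementation. HResult, kOk, kFail, unoFunc, and comFunc are hypothetical stand-ins, the constants repeat the well-known values of S_OK and E_FAIL, and the string out parameter of the example is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Local stand-ins so the sketch builds without the Windows SDK (assumption).
using HResult = std::int32_t;
constexpr HResult kOk   = 0;                                  // S_OK
constexpr HResult kFail = static_cast<HResult>(0x80004005u);  // E_FAIL

// Stand-in for the UNO exception declared in the raises clause.
struct SomeException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Stand-in for the UNO side: long func([in] short a) raises (SomeException).
inline std::int32_t unoFunc(std::int16_t a) {
    if (a < 0) {
        throw SomeException("negative argument");
    }
    return a * 2;
}

// COM-facing shape: HRESULT func([in] short a, [out, retval] long* ret).
inline HResult comFunc(std::int16_t a, std::int32_t* ret) {
    if (ret == nullptr) {
        return kFail;  // a real implementation would likely report E_POINTER
    }
    try {
        *ret = unoFunc(a);  // the return value becomes the retval out parameter
        return kOk;
    } catch (const std::exception&) {
        return kFail;       // no specific error code specified: use E_FAIL
    }
}
```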
At run time, the bridge has to map the calling convention from __stdcall
to __cdecl when a call from COM to UNO is made.
A mapped UNO interface inherits IUnknown, as does every other COM interface. XInterface
is not mapped; instead, IUnknown takes over all the functionality that XInterface
provides.
// UNO [ uik(E227A391-33D6-11D1-AABE00A0-249D5590), ident( "XInterface", 1.0 ) ] interface XInterface { any queryInterface( [in] type aType ); [oneway] void acquire(); [oneway] void release(); }; |
The bridge maps IUnknown::QueryInterface to XInterface::queryInterface;
IUnknown::AddRef to XInterface::acquire; and IUnknown::Release
to XInterface::release. While the latter two mappings are straightforward,
the mapping of QueryInterface requires a conversion of the COM IID
(GUID) to a Type. Although every UNO interface has an Uik
which is equal to the IID of the mapped UNO interface, there is currently
no way to obtain type information with just the Uik. The tool that
creates the COM TLBs from the UNO type library needs to create additional information
that the bridge can use to map GUIDs to interface names, which
can then be used to obtain type information.
On return, the return value is mapped to the interface that has been queried
for. If the any contains a type of TypeClass_VOID, then QueryInterface
returns E_NOINTERFACE.
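The lookup chain described here (IID to interface name, interface name to queryInterface, void result to E_NOINTERFACE) can be sketched in portable C++. All names are hypothetical: IidTable models the additional information generated alongside the COM TLB, UnoObject is a stand-in for a UNO object, a null pointer models an any of TypeClass_VOID, and the IID is passed as text for simplicity. The constants repeat the well-known HRESULT values.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using HResult = std::int32_t;
constexpr HResult kOk          = 0;                                  // S_OK
constexpr HResult kNoInterface = static_cast<HResult>(0x80004002u);  // E_NOINTERFACE
constexpr HResult kPointer     = static_cast<HResult>(0x80004003u);  // E_POINTER

// Hypothetical table generated with the COM TLB: IID (as text) -> UNO name.
using IidTable = std::map<std::string, std::string>;

// Stand-in for a UNO object; nullptr models an any of TypeClass_VOID.
struct UnoObject {
    std::map<std::string, void*> interfaces;

    void* queryInterface(const std::string& typeName) const {
        const auto it = interfaces.find(typeName);
        return it == interfaces.end() ? nullptr : it->second;
    }
};

// Sketch of the bridge side of IUnknown::QueryInterface.
inline HResult bridgeQueryInterface(const UnoObject& obj, const IidTable& table,
                                    const std::string& iid, void** out) {
    if (out == nullptr) {
        return kPointer;
    }
    *out = nullptr;
    const auto entry = table.find(iid);
    if (entry == table.end()) {
        return kNoInterface;  // no type information for this IID
    }
    void* p = obj.queryInterface(entry->second);
    if (p == nullptr) {
        return kNoInterface;  // the UNO object returned a void any
    }
    *out = p;
    return kOk;
}
```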
COM interfaces support single inheritance. When an UNO interface that inherits another interface is mapped, the mapped interface does the same.
//UNO interface XAnUnoInterface: XOtherUnoInterface { ... }; //MIDL interface IAnUnoInterface: IOtherUnoInterface { ... }; |
XInvocation is special in that it is used for scripting. There
could be a direct mapping to IDispatch, as it is realized by the
OLE bridge. This matter needs further evaluation; for now, the interface
is mapped as it is.
3.15 In, Out, In / Out Parameters
In parameters are mapped so that they are passed by value.
Out and In/Out parameters are passed by reference, for example:
//UNO void func( [in] char a, [in,out] double b, [out] short c); //MIDL HRESULT func([in] wchar_t a, [in,out] double* b, [out] short* c); |
Example:
//UNO XSomeInterface func( [in,out]string s, [out] sequence |
Author: Joachim Lingner ($Date: 2002/01/30 09:08:36 $) |