CORBA, Part II: Advanced Features
In Part I of this series on CORBA, we presented a brief look at CORBA, what it is and how it is used. In Part II we are going to look at two advanced features of CORBA. The first deals with the advanced data structures supported by the CORBA IDL; in particular, we are going to look at sequences and unions. The second deals with the CORBA services included as part of the CORBA standard; in particular, we are going to look at the CORBA Naming Service and how it can be used to locate CORBA objects.
Anyone who has programmed using the C or C++ languages knows that pointers and, in the case of C++, references, are valuable tools for manipulating and processing data. Because CORBA objects are global objects and may exist in different address spaces, data pointers cannot be passed between objects. For example, a linked list structure in one object cannot be passed to another object as part of a method invocation.
Even though this lack of support for data pointers would seem to be a severe restriction, in practice it rarely turns out to be one. For example, the linked-list data structure can be converted to an array, which is supported by CORBA, with the link pointers being converted into array indices. Although arrays can stand in for structures such as linked lists, they are restrictive in the sense that the size of the array must be known a priori. This is necessary because the underlying transport must know how many bytes to send when an array is transferred between objects.
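The pointer-to-index conversion can be sketched in plain C++. This is only an illustration and involves no CORBA types: Node, FlatNode and flatten() are hypothetical names, and using -1 as the end-of-list marker is an assumption made for the example.

```cpp
#include <vector>

// Hypothetical in-memory list node. The 'next' pointer is only
// meaningful inside the address space that created it.
struct Node {
    int   value;
    Node* next;
};

// Pointer-free form of the same node: 'next' is an index into the
// array that holds the list, and -1 (an assumption) marks the end.
struct FlatNode {
    int  value;
    long next;
};

// Walk the list once and replace each link pointer with the index
// of the element that follows it in the output array.
std::vector<FlatNode> flatten(const Node* head) {
    std::vector<FlatNode> flat;
    for (const Node* n = head; n != nullptr; n = n->next) {
        long nextIndex = -1;
        if (n->next != nullptr)
            nextIndex = static_cast<long>(flat.size()) + 1;
        flat.push_back(FlatNode{n->value, nextIndex});
    }
    return flat;
}
```

Flattening a three-node list yields three entries whose next fields are 1, 2 and -1, so the receiver can rebuild or traverse the list without ever seeing a pointer.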
In order to deal with this type of situation, CORBA provides the sequence data type. One can view the sequence data type simply as a variable length one-dimensional array. As we can see in the example, one can create either a bounded or unbounded sequence, the difference being whether the maximum length of the sequence is specified.
The IDL code in Listing 1 is an updated version of our original IDL. It now includes a new interface method, invokeOperation(), that has three arguments, two of which are sequences. The IDL begins by defining a new enumerated data type called DataType. Enumerations in CORBA are the same as enumerations in C/C++.
Listing 1. Updated IDL Code
enum DataType {
    DT_SHORT, DT_LONG, DT_USHORT, DT_ULONG, DT_FLOAT,
    DT_DOUBLE, DT_CHAR, DT_STRING, DT_BOOLEAN
};

union DataValue switch (DataType) {
    case DT_SHORT:   short          shortData;
    case DT_LONG:    long           longData;
    case DT_USHORT:  unsigned short ushortData;
    case DT_ULONG:   unsigned long  ulongData;
    case DT_FLOAT:   float          floatData;
    case DT_DOUBLE:  double         doubleData;
    case DT_CHAR:    char           charData;
    case DT_STRING:  string         stringData;
    case DT_BOOLEAN: boolean        booleanData;
};

typedef sequence<DataValue> Results;
typedef sequence<DataValue> Arguments;

interface CPULoad {
    void getLoadAvgs(out float oneMinAvg,
                     out float fiveMinAvg,
                     out float tenMinAvg);
    void invokeOperation(in string operation,
                         in Arguments args,
                         out Results res);
};
Following the DataType specification is the DataValue union, a union of all possible data types (as specified by DataType). Unions in CORBA differ from unions in C/C++ in that they are discriminated: the union carries a discriminator value that records which one of its fields is currently active. As can be seen, the CORBA IDL uses a switch-like syntax to specify the discriminator's type, and each case label ties one discriminator value to one field. For DataValue, the discriminator's type is DataType, and it is used to select the field that corresponds to one of the possible values of DataType.
Following the union specification, we create two new data types, Arguments and Results, both of which are a sequence of DataValues. These new types actually are unbounded sequences, because we have not specified the maximum length of the sequence. If we had wanted to create a bounded sequence (one whose length may vary but never exceed a fixed maximum), we would have specified the maximum length following the sequence's data type, for example, sequence<DataValue, 100>.
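The difference between the two kinds of sequence can be illustrated with a toy C++ class. This is not the class an IDL compiler generates: ToySequence is a hypothetical name, a Bound of 0 is assumed to mean "unbounded", and throwing when the bound is exceeded is a choice made for this sketch rather than behavior taken from the CORBA C++ mapping.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy stand-in for an IDL sequence. Bound == 0 means "unbounded"
// (an assumption of this sketch).
template <typename T, std::size_t Bound = 0>
class ToySequence {
public:
    // Current number of valid elements.
    std::size_t length() const { return items_.size(); }

    // Grow or shrink the sequence. A bounded sequence refuses to
    // grow past its declared maximum.
    void length(std::size_t n) {
        if (Bound != 0 && n > Bound)
            throw std::length_error("sequence bound exceeded");
        items_.resize(n);
    }

    T&       operator[](std::size_t i)       { return items_.at(i); }
    const T& operator[](std::size_t i) const { return items_.at(i); }

private:
    std::vector<T> items_;
};
```

With this sketch, ToySequence<float> accepts any length, while ToySequence<float, 100> accepts lengths from 0 to 100 and rejects 101.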
Finally, we use these two new data types to extend the CPULoad interface from the Part I article to include a new method, invokeOperation(). The goal of this method is to act as a generic interface method for the CPULoad interface. Clients can invoke a variety of different operations by specifying the name of the operation they wish to invoke (operation), along with a sequence of arguments for that specific operation (args). Upon completion, the results of the operation are returned (res) as another sequence of DataValues.
Notice here that generic methods, such as invokeOperation(), have a major drawback in that they circumvent compile-time data type checking. However, situations arise in which such methods can be useful. For example, the code shown in these examples was derived from work I did that involved the use of a central dispatcher required to dispatch jobs to different servers based upon the operation specified, as well as other aspects, such as queue lengths. Using a generic method such as invokeOperation() allowed the dispatcher to be written such that it needed to know only which operations a server could handle, not the operation's calling conventions.
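A dispatcher of that kind might choose a server roughly as follows. This is a hypothetical sketch, not the original dispatcher: ServerInfo and pickServer() are invented names, and "shortest queue wins" is assumed as the selection rule. The point it illustrates is that the dispatcher needs only operation names, never argument types.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical view of one server as the dispatcher sees it.
struct ServerInfo {
    std::string           name;
    std::set<std::string> operations;   // operation names the server accepts
    std::size_t           queueLength;  // jobs currently waiting
};

// Return the name of the least-loaded server that handles 'operation',
// or an empty string if no server handles it.
std::string pickServer(const std::vector<ServerInfo>& servers,
                       const std::string& operation) {
    const ServerInfo* best = nullptr;
    for (const ServerInfo& s : servers) {
        if (s.operations.count(operation) == 0)
            continue;                    // server cannot run this operation
        if (best == nullptr || s.queueLength < best->queueLength)
            best = &s;
    }
    return best ? best->name : std::string();
}
```

Once a server has been chosen, the dispatcher would forward the operation name and the untouched argument sequence through a call such as invokeOperation().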
Now, let's look at how invokeOperation() is implemented. Listing 2 contains the source code for invokeOperation(). The first action taken by this method is to get the current load values for the system. This is accomplished by invoking getLoadAvgs(); see Part I.
Listing 2. invokeOperation()
void CPULoad_impl::invokeOperation(const char* operation,
                                   const Arguments& args,
                                   Results_out results)
    throw(CORBA::SystemException)
{
    (void)args;  // none of the current operations takes arguments

    static const string getOneMinAvg("getOneMinAvg");
    static const string getFiveMinAvg("getFiveMinAvg");
    static const string getTenMinAvg("getTenMinAvg");
    static const string getAllAvgs("getAllAvgs");

    // Get the current load values (see Part I).
    CORBA::Float oneMinAvg, fiveMinAvg, tenMinAvg;
    getLoadAvgs(oneMinAvg, fiveMinAvg, tenMinAvg);

    // Create the result sequence with an initial maximum length of 3.
    results = new Results(3);

    if (operation == getOneMinAvg) {
        results->length(1);
        (*results)[0].floatData(oneMinAvg);
    } else if (operation == getFiveMinAvg) {
        results->length(1);
        (*results)[0].floatData(fiveMinAvg);
    } else if (operation == getTenMinAvg) {
        results->length(1);
        (*results)[0].floatData(tenMinAvg);
    } else if (operation == getAllAvgs) {
        results->length(3);
        (*results)[0].floatData(oneMinAvg);
        (*results)[1].floatData(fiveMinAvg);
        (*results)[2].floatData(tenMinAvg);
    } else {
        cerr << "Unknown operation specified: " << operation << endl;
        // Throw a system exception; the exception specification above
        // permits nothing else.
        throw CORBA::BAD_PARAM();
    }
}
The second action taken is the creation of the Results sequence. The pointer to the new object is assigned to the variable results. The value specified when creating the Results object is the starting maximum length of the sequence. This value provides the constructor for the new object with additional information; therefore, the constructor can allocate efficiently the memory required to support a sequence of that length.
Once the current load has been obtained, invokeOperation() can create and return the proper results, depending on the operation specified. As the code shows, invokeOperation() does this by comparing the operation to a sequence of strings and, depending on which string is specified, taking the proper action. The requested load information is returned to the client in the results argument.
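The selection logic itself does not depend on CORBA, so it can be separated from the ORB and exercised on its own. The following sketch assumes a hypothetical helper, selectAverages(), that returns a plain std::vector<float> rather than a Results sequence; it mirrors the string comparisons in Listing 2 but is not part of the original code.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// ORB-free version of the selection logic: map an operation name to
// the load averages that should be returned for it.
std::vector<float> selectAverages(const std::string& operation,
                                  float oneMinAvg,
                                  float fiveMinAvg,
                                  float tenMinAvg) {
    if (operation == "getOneMinAvg")  return {oneMinAvg};
    if (operation == "getFiveMinAvg") return {fiveMinAvg};
    if (operation == "getTenMinAvg")  return {tenMinAvg};
    if (operation == "getAllAvgs")    return {oneMinAvg, fiveMinAvg, tenMinAvg};
    throw std::invalid_argument("Unknown operation specified: " + operation);
}
```

A servant could call such a helper and then copy the returned values into the Results sequence, keeping the string handling testable without a running ORB.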
The Results_out data type behaves like a reference to a pointer to an object of type Results. For those readers not familiar with C++ references, a reference basically is a compile-time name that refers to the same data that another name refers to. If a function's parameter is specified as a reference, then that parameter refers to the same data that the argument passed into that function refers to. This is similar to specifying in C that a parameter is a pointer and then dereferencing that pointer within the function. The main difference is there is no need to dereference a reference parameter.
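The two styles can be compared side by side in plain C++. The functions below are hypothetical and use std::vector<float> in place of Results; they show only the language mechanism, not the actual definition of a CORBA _out type.

```cpp
#include <vector>

// C++ style: 'out' is a reference to the caller's pointer, so
// assigning to it changes the pointer the caller sees.
void makeAverages(std::vector<float>*& out) {
    out = new std::vector<float>(3, 0.0f);
    (*out)[0] = 0.5f;   // parentheses: dereference first, then index
}

// C style: the caller passes the address of its pointer, and the
// function must dereference that address explicitly.
void makeAveragesC(std::vector<float>** out) {
    *out = new std::vector<float>(3, 0.0f);
    (**out)[0] = 0.5f;
}
```

A caller writes makeAverages(p) in the first case and makeAveragesC(&p) in the second; in both cases p ends up pointing at the newly allocated vector, which the caller must later delete.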
Regardless of which operation was specified (except for the case in which an unknown operation was specified), two actions are taken. The first action sets the actual length of the sequence, while the second action initializes the sequence.
The length of the sequence is set using the sequence's length() method. One must be careful in setting this value as it indicates the actual length of the sequence, that is, the number of valid entries in the sequence. This number is used by the underlying transport layers to determine the actual number of bytes that need to be transferred when the sequence is passed between objects.
Once the length is specified, the sequence is initialized by using the sequence's [] operator. The syntax used for initializing can be somewhat tricky to understand if one forgets that results actually is a pointer to an object. Thus, before the [] operator can be invoked, we need to dereference the pointer using *. Given the precedence rules, we force the dereferencing by surrounding the *results with parentheses. Once the pointer has been dereferenced, we can invoke the [] operator to select the specific sequence element.
Keep in mind that each element in the sequence is a union object. In order to set the value of a union object, one must activate the field to be set and then provide the value used to set that field. In order to activate a field, one sets the object's discriminator using one of the object's discriminator methods. A union object has one such method for each field specified. For example, for the DataValue's floatData field, a floatData() method was created. Invoking this method sets the object's discriminator to the correct value, in this case, DT_FLOAT.
The discriminator methods are overloaded methods. If they are invoked with a value, then the object's specified field is set to that value. If they are invoked without a value, then the current value is returned. Given that we are setting the object's value, we invoke floatData() with the value we want to set the object to, namely the current processor load.
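The overloaded set/get pattern can be mimicked with a small hand-written class. ToyDataValue below is a hypothetical, heavily reduced stand-in for the generated DataValue class: it covers only two of the nine fields, it initializes the discriminator in its constructor, and it throws when an inactive field is read. The last two behaviors are assumptions made for this sketch and are not taken from the CORBA C++ mapping.

```cpp
#include <stdexcept>

enum DataType { DT_LONG, DT_FLOAT };  // reduced to two cases for brevity

class ToyDataValue {
public:
    ToyDataValue() : disc_(DT_LONG) { u_.longData = 0; }

    // Read the discriminator to learn which field is active.
    DataType _d() const { return disc_; }

    // Called with a value: activate the field and store the value.
    void floatData(float v) { disc_ = DT_FLOAT; u_.floatData = v; }
    void longData(long v)   { disc_ = DT_LONG;  u_.longData = v; }

    // Called without a value: return the field's current value.
    float floatData() const {
        if (disc_ != DT_FLOAT) throw std::logic_error("floatData not active");
        return u_.floatData;
    }
    long longData() const {
        if (disc_ != DT_LONG) throw std::logic_error("longData not active");
        return u_.longData;
    }

private:
    DataType disc_;
    union { long longData; float floatData; } u_;
};
```

Calling floatData(0.75f) on such an object makes _d() return DT_FLOAT, after which floatData() returns 0.75f.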
Now that we have looked at how invokeOperation() functions, we can look at how the client invokes it. Listing 3 contains the source code for the getLoadData() function that is part of the client application. It is in this function that invokeOperation() is called and the results are processed. The first action this function takes is the creation of two _var variables, args and results. Given that invokeOperation() does not expect any arguments, we are free to pass it a 0-length sequence, which is what we get by creating args using the default Arguments constructor.
Listing 3. getLoadData()
void getLoadData(CPULoad_var& cpuLoad,
                 const string command,
                 unsigned int length,
                 float averages[])
{
    // An empty argument list and a placeholder for the results.
    Arguments_var args = new Arguments;
    Results_var results;

    cpuLoad->invokeOperation(command.c_str(), args.in(), results.out());

    // Check that the expected number of values came back.
    if (results->length() != length) {
        ostringstream message;
        message << "Invalid results: Length != " << length
                << " (length = " << results->length() << ")";
        throw runtime_error(message.str());
    }

    // Check that every element carries a float.
    for (unsigned int i = 0; i < length; ++i)
        if ((*results)[i]._d() != DT_FLOAT)
            throw runtime_error("Invalid data type for results");

    // Copy the values out of the CORBA sequence.
    for (unsigned int i = 0; i < length; ++i)
        averages[i] = (*results)[i].floatData();
}
When we execute invokeOperation(), we pass to it the string specifying the operation we want executed along with the two sequences, args and results. For those not familiar with the C++ string class, the c_str() method simply returns a const char pointer to the string's characters. We use the in() method with args because we are providing the arguments to invokeOperation(); we use the out() method with results because we are retrieving the results from invokeOperation().
Calling invokeOperation() as shown causes the desired load information to be stored in the results sequence. In order to be sure the correct action was taken, we check that the results sequence contains the correct type of data. The first check we perform is to make sure the length of the sequence is correct. Next, we check that each element in the sequence is of the right type. Remember, results is a sequence of union objects. Thus, we use the discriminator access method, _d(), to determine the element's type. As you can see, we check to see if the element is of type DT_FLOAT.
Finally, we return the load data to the calling function. We have chosen to copy the values into a float array, as opposed to returning the sequence, in order to demonstrate that it is easy to isolate the CORBA-specific components of an application. One easily could wrap all of the CORBA-related functionality within a set of wrapper functions, as we did with getLoadData(), and the remaining application code would never need to know that it was a CORBA-based client.
We conclude our look at some of the advanced CORBA data structures at this point and turn our attention to the world of CORBA services. Readers are encouraged to look further into the CORBA data structures in order to appreciate fully the CORBA data structure model.
Up to this point, we have been looking at what generally is known as Core CORBA. In addition to this Core functionality, CORBA also supports a wide range of services. These services include those targeted at a specific industry as well as those more general in nature. For the remainder of this article, we are going to look at one of the fundamental general services, namely the Naming Service.
In the example we have created (see Part I for the details), the server object announced its existence by generating a stringified version of its CORBA reference and placing that string in the file ior.dat. This setup was sufficient for our simple example. However, in a system that consists of many CORBA objects (think in terms of hundreds), generating a file for each object becomes a major management nightmare. In order to deal with this problem, ORB vendors provided a variety of solutions. Eventually, a standard service, the Naming Service, was specified to deal with this problem.
The Naming Service uses a simple directed graph for translating a name to a value. This graph is similar to the Linux filesystem; it also has a hierarchy composed of contexts (directories) and application objects (files). Unlike the Linux filesystem, though, the Naming Service doesn't have a single, unique root node. Instead, the Naming Service can support any number of root contexts, which are known as orphan contexts.
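The directory analogy can be made concrete with a toy model in plain C++. ToyContext is a hypothetical class, not the CosNaming API: object references are represented by plain strings, contexts and objects are kept in separate tables for simplicity, and error handling is reduced to std::runtime_error. It is meant only to show how a multi-component name is walked one context at a time.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy naming context: child contexts act like directories and
// bound objects act like files.
class ToyContext {
public:
    // Create a child context under 'id' and return it.
    ToyContext& bindNewContext(const std::string& id) {
        std::unique_ptr<ToyContext>& slot = children_[id];
        if (slot) throw std::runtime_error("already bound: " + id);
        slot.reset(new ToyContext);
        return *slot;
    }

    // Bind an object reference (a string here) under 'id'.
    void bind(const std::string& id, const std::string& ref) {
        if (!objects_.insert(std::make_pair(id, ref)).second)
            throw std::runtime_error("already bound: " + id);
    }

    // Resolve a compound name: every component except the last names
    // a context, and the last names the bound object.
    const std::string& resolve(const std::vector<std::string>& name) const {
        if (name.empty()) throw std::runtime_error("empty name");
        const ToyContext* ctx = this;
        for (std::size_t i = 0; i + 1 < name.size(); ++i) {
            auto child = ctx->children_.find(name[i]);
            if (child == ctx->children_.end())
                throw std::runtime_error("not found: " + name[i]);
            ctx = child->second.get();
        }
        auto obj = ctx->objects_.find(name.back());
        if (obj == ctx->objects_.end())
            throw std::runtime_error("not found: " + name.back());
        return obj->second;
    }

private:
    std::map<std::string, std::unique_ptr<ToyContext>> children_;
    std::map<std::string, std::string> objects_;
};
```

In this model, creating a SKY context under a root, binding cpuLoad inside it, and then resolving the two-component name SKY, cpuLoad from the root returns the bound string.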
Listing 4 shows the steps an object has to take in order to use the Naming Service to announce its existence. Once the object exists, the ORB's resolve_initial_references() method is invoked using the string NameService. This method returns a generic object reference if an initial naming service has been specified; we discuss how to specify a naming service shortly. This generic object reference must be narrowed down to the specific object reference required, namely a reference to a NamingContext object.
Listing 4. An Object Announcing Its Existence
// Get reference to the initial name context
obj = orb->resolve_initial_references("NameService");
if (CORBA::is_nil(obj.in())) {
    cerr << "Nil name service reference" << endl;
    throw 0;
}

CosNaming::NamingContext_var inc =
    CosNaming::NamingContext::_narrow(obj.in());
if (CORBA::is_nil(inc.in())) {  // test the narrowed reference, not obj
    cerr << "Cannot narrow name service reference" << endl;
    throw 0;
}

// Create the SKY name context
CosNaming::Name name;
name.length(1);
name[0].id = CORBA::string_dup("SKY");
// Hold the returned reference in a _var so that it is released automatically
CosNaming::NamingContext_var skyNC = inc->bind_new_context(name);

// Publish our reference under cpuLoad
name[0].id = CORBA::string_dup("cpuLoad");
skyNC->bind(name, cpuLoadRef.in());
Once the generic reference has been narrowed, the context in which the name exists needs to be created. The context is created using the bind_new_context() method. This method is invoked with SKY as the name of the new context; it returns a reference to the SKY name context. Once the context has been created, the object's name can be added and its reference bound, using the context's bind() method. This method takes the name we want to use, in this case cpuLoad, and the value we want to bind to it, in this case the object's reference. With this information, it creates a new entry in the specified context with the value bound to that name. At this point, any client can use the Naming Service to get a reference to the cpuLoad object.
Listing 5 shows the steps a client takes to retrieve a reference to the desired object. As with the server, the first thing a client has to do is get a specific reference to the Naming Service. This is done by invoking the ORB's resolve_initial_references() method and then narrowing the returned reference. Next, the Naming Service's resolve() method is invoked in order to get the reference that is bound to the name. This method requires the full name, including all contexts, of the object to be resolved. Once the name has been resolved, the returned reference must be narrowed to the specific desired object reference, namely CPULoad.
Listing 5. Retrieving a Reference to an Object
// Get reference to the initial name context
CORBA::Object_var obj = orb->resolve_initial_references("NameService");
if (CORBA::is_nil(obj.in())) {
    cerr << "Nil name service reference" << endl;
    throw 0;
}

CosNaming::NamingContext_var inc =
    CosNaming::NamingContext::_narrow(obj.in());
if (CORBA::is_nil(inc.in())) {  // test the narrowed reference, not obj
    cerr << "Cannot narrow name service reference" << endl;
    throw 0;
}

// Get reference to cpuLoad object
CosNaming::Name name;
name.length(2);
name[0].id = CORBA::string_dup("SKY");
name[1].id = CORBA::string_dup("cpuLoad");
obj = inc->resolve(name);
if (CORBA::is_nil(obj.in())) {
    cerr << "Nil binding in name service" << endl;
    throw 0;
}

CPULoad_var cpuLoad = CPULoad::_narrow(obj.in());
if (CORBA::is_nil(cpuLoad.in())) {  // test the narrowed reference, not obj
    cerr << "Cannot narrow cpuLoad reference" << endl;
    throw 0;
}
At this point, the client has access to the cpuLoad object and can invoke any of its interface methods.
The final aspect of the Naming Service we need to look at is specifying the initial naming context. This needs to be done in order for resolve_initial_references() to return the proper reference. The CORBA standard does not specify how this is to be accomplished and leaves it up to the vendors to specify. Because of this decision, the initial naming context can be specified in several different ways. In this example, we are going to fall back to the simplest version, namely, putting the reference to the naming service that will be the initial naming context in a file, just as we originally put the reference to the CPULoad object in the file ior.dat.
TAO provides a simple mechanism for doing this: the Naming_Service executable supplied with TAO accepts an -o option. When the option is specified, the Naming_Service executable writes the reference to the naming service object to the specified file. This file then can be read by any program that needs to reach the Naming Service. The usual way to do this is to pass the option -ORBInitRef when executing the program and let the program pass the main argument vector to ORB_init(), as we did in our examples. Listing 6 contains a simple shell script that demonstrates how to use these options.
Listing 6. Using the Options
#!/bin/sh

aceRoot=/home/Work/Sky/Packages/ACE-TAO/ACE_wrappers
taoRoot=$aceRoot/TAO
nameService=$taoRoot/orbsvcs/Naming_Service/Naming_Service

echo Starting name services
$nameService -o nsIor.dat &
sleep 5

echo Starting server
./server -ORBInitRef NameService=file://nsIor.dat &
sleep 5

echo Starting client
./client -ORBInitRef NameService=file://nsIor.dat

killall server
killall Naming_Service
In Part I of this article, we provided a brief look at CORBA and how it extends the object-oriented programming paradigm to a global space. In Part II, we looked at some of the advanced data structures supported in CORBA and introduced the readers to the CORBA services. CORBA has much more to it, and readers are encouraged to explore it more fully. For those readers who program multiprocess applications, even single processor applications, CORBA should be in their programming toolboxes.
Dr. Gerry Pocock has been working with multiprocessor systems since his undergraduate days at the Laboratory for Laser Energetics at the University of Rochester. Since then, he obtained his PhD from UMass Amherst, worked as a professor at UMass Lowell, worked as a consultant for various companies including Intel and held several full time positions. He is currently Chief Software Architect for Sky Computers, Chelmsford MA.