JPA Transitive Persistence for the Play Framework

In play framework 1.x the JPA persistence model has been changed dramatically. JPA as a standard is all about transitive persistence. That means that, within a JPA transaction, any change to any object managed by the EntityManager is propagated to the database: the transitive closure of the loaded, reachable object graph is traversed (taking the cascade annotations into account), compared to the loaded state, and the full delta is then committed to the DB. It all happens implicitly. The ease of this automatism is an achievement that took the Java Enterprise world quite some time.

Play, on the other hand, now tries to resemble the logic found in web frameworks like Ruby on Rails, where saving is an explicit operation. You need to call save() on every object whose changes you want to propagate to the database. Play implements a cascade when the corresponding JPA annotation is present, so you don’t need to call save() on all of them, but even this solution is incomplete, as I described in “Beware of this: play framework’s cascaded save() only works on loaded collections”.

To be honest, I don’t like play’s (or RoR’s) approach. The reasoning behind it is that you can make changes to as many objects as you want, and if an error occurs somewhere further down, you do not need to undo the changes you made before the error occurred. The same can be accomplished by simply not committing the running JPA transaction (i.e., rolling it back). In a J2EE environment, you just don’t want to control the transactions yourself. The JPA hack in play makes it necessary to do exactly that. In a non-trivial environment, where you need to manage more complex object models, you don’t want to mess with the transaction that demarcates the unit of work! Some logic might apply changes that cause further changes in the reachable object graph. The programmer of that logic doesn’t necessarily need to know that the initial change may ripple through to other objects. We call that encapsulation, and play’s approach breaks it.

Long story short, in order to circumvent that, I created a module that brings the original JPA behavior of transitive persistence back to play. The API is exactly the same as in original play, except that the model classes must inherit from a different base class. Instead of calling save() on every changed object, it is only necessary to call persist() on newly created objects. Every other change is handled by the transitive persistence logic of JPA, just as it was originally intended.

The module is not publicly available, but I’ve been using it successfully in all my play projects. I just wanted to let you know about it, and of course I will make it available (on GitHub) if people ask for it. So, if you want the real JPA back in play, drop me a note and I will publish it.

NSInvocation in Objective-C++

UPDATE: This article was initially about some difficulties I had calling Objective-C++ code from C++ dynamically using NSInvocation, which failed at first. I switched to calling the IMP pointer directly instead, passing C++ object pointers as arguments. Thanks to the comments I received from bob, obviously a very experienced Objective-C++ developer, I was able to change the code and make it work with NSInvocation as well.

Here is a C++ class that represents the details of a notifier/observer association called Observation:

    class Observation: public Object {
    public:

        void setObjcObserver(NSObject *theObserver) { _nsObserver = theObserver; }
        void setObjcSlot(SEL theSlot) { _nsSlot = theSlot; }
                
        void notify(bool complete);
        
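        // SEL and plain C++ pointers are not Objective-C objects, so they are
        // initialized with NULL; nil is reserved for the object pointer.
        Observation() : _notifier(NULL), _nsObserver(nil), _nsSlot(NULL) {}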
        
    private:

        Object *_notifier;
        NSObject *_nsObserver;
        SEL _nsSlot;
    };

First Attempt Using NSInvocation Failed

    void Observation::notify(bool complete) {

        NSMethodSignature *aSignature = [[_nsObserver class]
                        instanceMethodSignatureForSelector:_nsSlot];

        assert(aSignature);
        assert([_nsObserver respondsToSelector:_nsSlot]);

        NSInvocation *anInvocation = [NSInvocation invocationWithMethodSignature:aSignature];
        [anInvocation setSelector:_nsSlot];
        [anInvocation setArgument:this atIndex:2];
        [anInvocation setArgument:&complete atIndex:3];
            
        [anInvocation invokeWithTarget:_nsObserver];
    }

This code compiled fine, but it crashed as soon as I tried to access members of the received Observation object in Objective-C.

IMP Based Solution

My approach to solving the issue is to call the IMP of the method in the observer object directly. For this, I slightly changed the C++ Observation class to cache the IMP pointer:

    class Observation: public Object {
    public:

        void setObjcObserver(NSObject *theObserver) { _nsObserver = theObserver; }
        void setObjcSlot(SEL theSlot) { _nsSlot = theSlot; }
                
        void notify(bool complete);
        
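        // SEL, function pointers and plain C++ pointers start out as NULL;
        // nil is reserved for the Objective-C object pointer.
        Observation() : _notifier(NULL), _nsObserver(nil), _nsSlot(NULL), _nsSlotImpl(NULL) {}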
        
    private:

        Object *_notifier;
        NSObject *_nsObserver;
        SEL _nsSlot;

        typedef void (*slotImpl_t)(__strong id, SEL, lang::Observation *, BOOL);
        
        // used to cache the IMP of the objcSlot
        slotImpl_t _nsSlotImpl;

    };

Here, I simply add a typedef of the desired observer method signature called “slotImpl_t” and a member to cache the IMP function pointer. The cache is not strictly necessary, but in my case the observer IMP will not change, so I cache it to save the look-up when the code is executed frequently. The notify() method now looks different:

    void Observation::notify(bool complete) {
        if(!_nsSlotImpl) {
            _nsSlotImpl = (slotImpl_t)[_nsObserver methodForSelector:_nsSlot];
            assert(_nsSlotImpl);
        }
        _nsSlotImpl(_nsObserver, _nsSlot, this, complete);
    }

This works like a charm now. The address of the Observation object at the receiving end is correct, as it should be, even in the case of multiple inheritance combined with virtual inheritance.

Correct Solution with NSInvocation

It turned out that I got it wrong in my original NSInvocation solution. NSInvocation needs the address of the this pointer and not the this pointer itself. To make matters worse, when simply writing

        [anInvocation setArgument:&this atIndex:2];

the compiler complains with the error:

Address expression must be an lvalue or a function designator

So, instead of &this, the argument has to be the address of an lvalue, which essentially requires copying the this pointer into a temporary variable, ‘temp’ in this case. Here is the complete and working code:

    void Observation::notify(bool complete) {

        NSMethodSignature *aSignature = [[_nsObserver class]
                        instanceMethodSignatureForSelector:_nsSlot];

        assert(aSignature);
        assert([_nsObserver respondsToSelector:_nsSlot]);

        NSInvocation *anInvocation = [NSInvocation invocationWithMethodSignature:aSignature];
        [anInvocation setSelector:_nsSlot];
        Observation *temp = this;
        [anInvocation setArgument:&temp atIndex:2];
        [anInvocation setArgument:&complete atIndex:3];
            
        [anInvocation invokeWithTarget:_nsObserver];
    }
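
The difference between passing this and &temp can be reproduced in plain C++. The following is only a sketch, not NSInvocation itself: ArgumentSlot, sentinel and the fill/check functions are hypothetical names, and the one thing carried over is the assumption that the argument setter copies the argument's bytes out of the buffer it is handed.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for one argument slot of an invocation. Like
// -[NSInvocation setArgument:atIndex:], it copies the argument's bytes out
// of the buffer it is handed; here the argument is always pointer-sized.
struct ArgumentSlot {
    void *stored;
    ArgumentSlot() : stored(0) {}
    void setArgument(const void *location) {
        std::memcpy(&stored, location, sizeof stored);
    }
};

static int sentinel = 0;  // hypothetical marker to recognize copied bytes

struct Observation {
    void *firstMember;  // occupies the first bytes of the object
    Observation() : firstMember(&sentinel) {}

    // Mirrors the failed attempt: the buffer is the object itself, so the
    // slot receives the object's first bytes instead of its address.
    void fillWrong(ArgumentSlot &slot) { slot.setArgument(this); }

    // Mirrors the fix: the buffer is a variable that holds the pointer.
    void fillRight(ArgumentSlot &slot) {
        Observation *temp = this;
        slot.setArgument(&temp);
    }
};

// Runs both variants and checks what ended up in the slot.
void checkArgumentPassing() {
    Observation o;
    ArgumentSlot wrong, right;
    o.fillWrong(wrong);
    o.fillRight(right);
    assert(wrong.stored == &sentinel);  // garbage from the receiver's view
    assert(right.stored == &o);         // the address that was intended
}
```

Calling checkArgumentPassing() passes both assertions: the wrong variant hands over whatever the object happens to start with, which is why the receiving Objective-C method crashed when it treated those bytes as an Observation pointer.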

Conclusion

There is expert knowledge about Objective-C++ available on the planet, but it’s not easy to get access to it. Sometimes it helps to blog about a problem, and the solution will come and find you ;) Again, bob, thank you very much.

Java Like Programming in C++

C++ is a nice and very flexible language, but this comes at the cost that it forces you to think about many programming details before you can even think about solving your actual problem. Examples would be:

  • object oriented or generic programming
  • when to use references, value types and pointers
  • memory management rules
  • casting rules, const correctness, virtual methods
  • STL? boost?

This list can become quite endless. Being in the same boat as every other C++ programmer but having grasped some of the look&feel of other languages like Java and Objective-C (in its Cocoa incarnation) or even Qt (as a good C++ OO style example), I am continuously and unconsciously thinking about the pros and cons of each of them and recently I thought: wouldn’t it be nice to be able to program in C++ but make it look like Java?

Sure it would! Even if it were just for fun…

So, I came up with a working code sample which looks quite like Java but is actually C++. I’ll show you the code sample first before I elaborate on any details. First come some classes which you can consider the “framework”:

#include <iostream>
#include <sstream>
#include <string>   // std::string is used by String::Impl
#include <assert.h>
#include <vector>


size_t globalRefCount = 0;

class Object;
class String;
std::ostream &operator<<(std::ostream&, Object&);
std::ostream &operator<<(std::ostream &, const String&);



class Object {    
protected:
    
    struct Impl {
        size_t _refCount;
        Impl() : _refCount(0) {}
        virtual ~Impl() {}
    } *data;
    
    inline void retain() {
        if(data) {
            data->_refCount++;
            globalRefCount++;
        }
    }
    inline void release() {
        if(data) {
            data->_refCount--;
            globalRefCount--;
            if(data->_refCount == 0) {
                delete data;
                data = 0;
            }
        }
    }
    
    Object(): data(new Impl) {
        retain();
    }
    Object(Impl *imp) : data(imp) {
        retain();
    }
    Object(const Object& other): data(0) {
        operator=(other);
    }
    
public:
    virtual ~Object() {
        release();
    }
    virtual const char *type() const { return "Object"; }
    
    void operator=(const Object& other) {
        if(data!=other.data) {
            release();
            data = other.data;
            retain();
        }
    }
    
    // make a heap clone of this object for usage in containers
    virtual Object *clone() const {
        return new Object(*this);
    }
    
    virtual String toString() const;
        
};




class String : public Object {
protected:
    struct Impl: public Object::Impl {
        std::string str;
    };
public:
    String(): Object(new Impl) {}
    String(const char *s): Object(new Impl) {
        static_cast<Impl*>(data)->str = s;
    }
    String(const String &other): Object(other) {}
    String operator+(const String& s) const {
        String result;
        static_cast<Impl*>(result.data)->str = static_cast<Impl*>(data)->str;
        static_cast<Impl*>(result.data)->str += static_cast<Impl*>(s.data)->str;
        return result;
    }
    String operator+(const Object& s) const {
        String result;
        static_cast<Impl*>(result.data)->str = static_cast<Impl*>(data)->str;
        static_cast<Impl*>(result.data)->str += static_cast<Impl*>(s.toString().data)->str;
        return result;
    }
    String operator+(long l) const {
        std::ostringstream oss;
        oss << static_cast<Impl*>(data)->str << l;
        String result;
        static_cast<Impl*>(result.data)->str = oss.str();
        return result;
    }
    const char *c_str() const {
        return static_cast<Impl*>(data)->str.c_str();
    }
    bool operator==(const String &other) const {
        return static_cast<Impl*>(data)->str == static_cast<Impl*>(other.data)->str;
    }
    
    // must have's
    const char *type() const { return "String"; }
    Object *clone() const { return new String(*this); }
    
    String toString() const {
        return *this;
    }
};

std::ostream &operator<<(std::ostream &os, const String& s) {
    os << s.c_str();
    return os;
}

String Object::toString() const {
    std::ostringstream os;
    os << this->type() << "@" << (void *)this << "[" << (data ? data->_refCount : 0) << "]";
    return String(os.str().c_str());
}

std::ostream &operator<<(std::ostream &os, Object& o) {
    os << o.toString().c_str();
    return os;
}




class ClassCastException: public Object {
    struct Impl: public Object::Impl {
        String message;
    };
public:
    ClassCastException() : Object(new Impl) {}
    ClassCastException(const String& msg) : Object(new Impl) {
        static_cast<Impl*>(data)->message = msg;
    }
    const char *type() const { return "ClassCastException"; }
    Object *clone() const { return new ClassCastException(*this); }
    String message() const {
        return static_cast<Impl*>(data)->message;
    }
    String toString() const {
        return message();
    }

};



class ArrayList: public Object {
    struct Impl: public Object::Impl {
        std::vector<Object*> _data;
        // The shared Impl owns the cloned elements. Deleting them here, and
        // not in ~ArrayList(), avoids a double delete when several ArrayList
        // proxies share the same Impl.
        ~Impl() {
            for (std::vector<Object*>::iterator it = _data.begin(); it!=_data.end(); ++it) {
                delete *it;
            }
        }
    };
    
public:
    ArrayList(): Object(new Impl) {}
    void add(const Object& element) {
        static_cast<Impl*>(data)->_data.push_back(element.clone());
    }
    size_t size() const {
        return static_cast<Impl*>(data)->_data.size();
    }
    
    Object &at(size_t index) const {
        return *static_cast<Impl*>(data)->_data.at(index);
    }
    template<class T> const T &at(size_t index) const {
        Object *o = static_cast<Impl*>(data)->_data.at(index);
        T *t = dynamic_cast<T*>(o);
        if(t) return *t;
        throw ClassCastException(o->type());
    }
    
    const char *type() const { return "ArrayList"; }
    Object *clone() const { return new ArrayList(*this); }

    String toString() const {
        std::ostringstream oss;
        oss << Object::toString() << "(";
        for (size_t i = 0; i<size(); i++) {
            oss << at(i).toString();
            if(i+1<size()) {
                oss << ",";
            }
        }
        oss << ")";
        return oss.str().c_str();
    }

};


class OutputStream: public Object {
    struct Impl: public Object::Impl {
        std::ostream &stream;
        Impl(std::ostream &os): stream(os) {}
    };
    OutputStream() {}
public:
    
    OutputStream(std::ostream& os): Object(new Impl(os)) {}
    OutputStream(const OutputStream &other): Object(other) {}

    void println(const Object &object) {
        static_cast<Impl*>(data)->stream << object.toString() << std::endl;
    }
    
    const char *type() const { return "OutputStream"; }
    Object *clone() const { return new OutputStream(*this); }

};

Now let’s see how to use it in client code. I supply only a main() here but I have some anonymous blocks to show the effects of scoping:


struct system {
    OutputStream out;
    system(): out(std::cout) {}
};

struct system System;


int main (int argc, const char * argv[]) {
    
    assert(globalRefCount==1); // 1 is for System.out
    
    {
        String s;
        assert(globalRefCount==2);
    }
    assert(globalRefCount==1);
    
    
    {
        String s = "Connecting...";
        System.out.println(String("s = ") + s);
        
        String dots = "the dots";
        String t = s + " " + dots + ".";
        System.out.println(String("t = ") + t);
        
        ArrayList l;
        l.add(s);
        assert(l.size() == 1);
        assert(globalRefCount==6);
        
        l.add(t);
        l.add(ArrayList());
        System.out.println(String("l = ") + l);
        assert(globalRefCount==8);
        
        try {
            System.out.println(String("l[0] as String = ") + l.at<String>(0)); // ok
            System.out.println(String("l[1] as String = ") + l.at<String>(1)); // ok
            System.out.println(String("l[2] as String = ") + l.at<String>(2)); // this throws!
            assert(false);
        } catch(const ClassCastException &e) { // catch by reference to avoid an extra copy
            System.out.println(String("ClassCastException: ") + e);
            l.add(e);
        }
        
        
        // adding to the list in other scope will keep the object valid
        {
            String other = "Created in other scope";
            l.add(other);
        }
        System.out.println(String("l[3] as String = ") + l.at(3));
        assert(l.size()==5);
        assert(l.at<String>(4) == "Created in other scope");
        System.out.println(String("l = ") + l);
    }
    assert(globalRefCount == 1);
    
    return 0;
}

Pretty much like Java, isn’t it?

I have written this just as a proof of concept. As such, it is fully working and I like it so far. It could serve as a good starting point for a complete implementation. Here’s the output when executed:

s = Connecting...
t = Connecting... the dots.
l = ArrayList@0x7fff5fbff8c0[1](Connecting...,Connecting... the dots.,ArrayList@0x100100b00[1]())
l[0] as String = Connecting...
l[1] as String = Connecting... the dots.
l[2] as String = ClassCastException: ClassCastException@0x100100cb8[1]
l[3] as String = ClassCastException@0x100100b20[1]
l = ArrayList@0x7fff5fbff8c0[1](Connecting...,Connecting... the dots.,ArrayList@0x100100b00[1](),ClassCastException@0x100100b20[1],Created in other scope)
Program ended with exit code: 0

Fundamental Design

Java is (with the exception of primitive types like int, double etc. and their array forms) an object-oriented language. Everything in Java or Objective-C derives from a common, well-known base class. So there is a base class Object in this example as well, along with an equivalent of the platform type String and a sample collection called ArrayList, which holds just Objects.

One thing is very important: there is no need for pointers, as memory management is part of the solution! For instance, the ClassCastException that is thrown can be added to the ArrayList without having to worry about leaking memory afterwards. It’s not complete yet, though, as the notion of “weak” pointers is still missing (think: ARC), but the strong part is fully working here.

The whole idea of the implementation is based on the well-known PIMPL idiom, throws the idea of the smart_ptr into the mix, and then goes even further. Firstly, all behavior and data are strictly separated, not just the private members. Each conceptual class like String, ArrayList or ClassCastException has a functional class that implements behavior only and no data at all; it acts like a fully functional proxy to the data. This makes it possible to clone (copy-assign) these proxy objects very cheaply, because they consist of only two pointers (data and the vtable pointer). The actual data is implemented in the nested class “Impl”. There is one specialized Impl class for each conceptual class (1:1 mapping). The conceptual classes and the Impl classes thus form two parallel type hierarchies. As a picture says more than 1000 words, here it is:

In the base Impl (Object::Impl) a smart_ptr like reference counting is implemented (_refCount). I have explicitly added two methods Object::retain() and Object::release() in the code to express the similarity to Objective-C’s NSObject, but this is all handled internally during copy construction or assignment.
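
The sharing behaviour described above can be condensed into a few lines. This is a minimal sketch under simplifying assumptions, not the code from the listing: Text, sharesDataWith() and refCountAfterCopy() are hypothetical names, there is no common Object base class, and the counter starts at 1 instead of being raised by an explicit retain() in the constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical proxy class: behavior lives in Text, data lives in Text::Impl.
class Text {
    struct Impl {
        std::size_t refCount;
        std::string str;
        explicit Impl(const char *s) : refCount(1), str(s) {}
    } *data;

    void retain() { ++data->refCount; }
    void release() { if (--data->refCount == 0) delete data; }

public:
    Text(const char *s) : data(new Impl(s)) {}
    // Copying a proxy copies one pointer and bumps the shared counter.
    Text(const Text &other) : data(other.data) { retain(); }
    Text &operator=(const Text &other) {
        if (data != other.data) {
            release();
            data = other.data;
            retain();
        }
        return *this;
    }
    ~Text() { release(); }

    std::size_t refCount() const { return data->refCount; }
    bool sharesDataWith(const Text &other) const { return data == other.data; }
    const std::string &str() const { return data->str; }
};

// Two proxies, one Impl: returns the counter while both are alive.
std::size_t refCountAfterCopy() {
    Text a("shared");
    Text b = a;
    assert(a.sharesDataWith(b));
    return a.refCount();
}
```

refCountAfterCopy() returns 2, and the Impl is deleted exactly once, when the last of the two proxies goes out of scope.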

Conclusion

I still have to decide whether an approach like this is generally feasible. What I like is being able to clone good concepts and class library designs from other languages like Java or Objective-C into C++ and continue coding without having to worry about the aforementioned detailed C++ design decisions that trouble me every day.

Of course, I would have to implement ARC-style memory management completely before it can be used; otherwise cyclical references would leak. Also, I’d like to mention that I’m fully aware that a coding style like this leads to immediate code bloat. But so does the PIMPL idiom. To mitigate that, I have a flexible code generator on my side which lets me do most of the actual coding in UML as opposed to hand-crafting it, where I would definitely think twice or even more before traveling down this road…
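
Why cyclical references leak under pure reference counting, and how a weak reference avoids it, can be sketched with the standard library’s smart pointers. Note the assumptions: this uses std::shared_ptr and std::weak_ptr (C++11) instead of the Object class from this post, and Node, liveNodes and the two functions are hypothetical names made up for the illustration.

```cpp
#include <cassert>
#include <memory>

static int liveNodes = 0;  // hypothetical counter to observe destruction

struct Node {
    std::shared_ptr<Node> strongPeer;  // owning ("strong") reference
    std::weak_ptr<Node> weakPeer;      // non-owning ("weak") reference
    Node() { ++liveNodes; }
    ~Node() { --liveNodes; }
};

// Two nodes that own each other: dropping the outside references does not
// destroy them. Returns how many nodes survived.
int liveAfterStrongCycle() {
    const int before = liveNodes;
    std::shared_ptr<Node> a = std::make_shared<Node>();
    std::shared_ptr<Node> b = std::make_shared<Node>();
    a->strongPeer = b;
    b->strongPeer = a;
    Node *rawA = a.get();
    a.reset();
    b.reset();
    const int survivors = liveNodes - before;
    // Break the cycle by hand so that the demo itself does not leak.
    rawA->strongPeer.reset();
    return survivors;
}

// Same shape, but the back reference is weak: nothing survives.
int liveAfterWeakBackReference() {
    const int before = liveNodes;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strongPeer = b;
        b->weakPeer = a;
    }
    return liveNodes - before;
}
```

With strong references in both directions, both nodes survive until the cycle is broken manually; with a weak back reference, both are destroyed as soon as the enclosing scope ends.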

View the full source: https://gist.github.com/2279561

Multiple Inheritance in Objective-C / Core Data vs C++

Multiple inheritance is hard. In fact, it is so hard that only very few programming languages support it. Objective-C, for instance, limits its support for multiple inheritance to conformance to @protocols. Behavior can only be inherited from a single base class.

If you need multiple inheritance in Objective-C you have several options to choose from. Most of the time when you are looking for answers to the question of how to do multiple inheritance in Objective-C the right way, you will be pointed into one of two directions: #1 don’t use it because it implies a flawed design. #2 do it using composition (via delegates).

Option #1 I do not like at all. There are cases where MI is very well suited, and just because Objective-C doesn’t support it doesn’t mean it is bad. It just means that the language designers considered it way too complicated to implement for the benefit of the few cases where it makes sense.

For instance, my current use case with a strong need for multiple inheritance support is the UML specification. UML makes heavy use of multiple inheritance, and if you study the UML model you will find that its abstractions make a lot of sense because they eliminate redundancy and the need to explain what’s going on. All those abstractions are basically orthogonal classifications which can be combined in a subclass to express things very precisely and in a type-safe manner.

So, if you are forced to deal with multiple inheritance in your program you can do so with option #2 in Objective-C. However, in my opinion, this has limitations. I will give you an example: Imagine a model like this:

Let’s say we map this to the following physical implementation in Objective-C. Here, the greenish elements are Objective-C @protocol and the yellowish elements are Objective-C @class:

It follows the often heard recommendation to map inheritance using composition. Here, the Class part of an AssociationClass is mapped to a delegate called “theClassImpl”, whereas the Association base class is mapped to plain Objective-C inheritance.

Suppose now we want to map this structure to CoreData. We need to model entities (NSEntityDescription) with properties (NSPropertyDescription). CoreData does not work on top of @protocols but on actual classes. Therefore, we have one physical implementation of the association between Class and Property (owningClass-properties).

But here comes the big BUT: this can only work if we have full control over the OR mapping! CoreData, on the other hand, does not rely on interfaces but on the actual implementations. That means we must publish the otherwise internal composition mapping of theClassImpl to CoreData. If we then have a client of Class (for instance: Property::owningClass), it will not be possible to downcast such a Class obtained from the persistence layer into an AssociationClass. Instead, it would be necessary to navigate backwards from the Class to the actual AssociationClass. But this kind of “alternative” cast cannot be implemented transparently using Objective-C language constructs. An [aClass isKindOfClass:[AssociationClassImpl class]] would yield a technical “NO”, and it’s not possible to extend the language to make it yield “YES”.
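
The failing cast can be mimicked in C++ by deliberately applying the same single-inheritance-plus-composition mapping there. This is only an analogy to the Objective-C situation, sketched under assumptions: ClassImpl, the owner back reference and the two cast functions are hypothetical names, and dynamic_cast plays the role of isKindOfClass:.

```cpp
#include <cassert>

struct Element { virtual ~Element() {} };
struct Association : Element {};

struct AssociationClassImpl;

// The Class part, as a persistence layer would hand it out.
struct ClassImpl : Element {
    AssociationClassImpl *owner;  // hypothetical back reference
    ClassImpl() : owner(0) {}
};

// MI mapped to SI plus composition: inherits Association, delegates the
// Class part to the member theClassImpl.
struct AssociationClassImpl : Association {
    ClassImpl theClassImpl;
    AssociationClassImpl() { theClassImpl.owner = this; }
};

// The language-level cast knows nothing about the composition.
AssociationClassImpl *castByLanguage(ClassImpl *c) {
    return dynamic_cast<AssociationClassImpl *>(c);
}

// The "alternative" cast: navigate backwards through the delegate.
AssociationClassImpl *castByNavigation(ClassImpl *c) {
    return c ? c->owner : 0;
}

// Shows that only the navigation finds the AssociationClassImpl.
void checkCasts() {
    AssociationClassImpl ac;
    assert(castByLanguage(&ac.theClassImpl) == 0);
    assert(castByNavigation(&ac.theClassImpl) == &ac);
}
```

Every client that receives a ClassImpl would have to know about castByNavigation(), which is exactly the kind of internal knowledge the mapping was supposed to hide.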

Such an MI -> SI mapping scenario can only work if every consumer relies solely on the interfaces and makes no assumptions about internal structures. This would imply that the ORM uses factories instead of instantiating from its meta information. In CoreData, this is not the case.

This is why you can pretty much ignore advice on how to map multiple inheritance at the language level if you don’t also consider the APIs you’re dealing with, because in reality those APIs will quickly render the easy-sounding solution useless. In my case, I was forced to implement the model part of the system in C++ because C++ has awesomely good support for multiple inheritance. The problems of mapping multiple inheritance onto the microprocessor architecture were solved by Bjarne Stroustrup when the feature was added to C++ in the late 1980s. Read here why and how: http://drdobbs.com/184402074

Here’s how the dreaded diamond from the example above would be implemented in C++:

#include <vector>

class Element {
};

class Association: public virtual Element {
};

class Class: public virtual Element {
public:
    std::vector<class Property*> properties;
};

class Property: public Element {
public:
    Class *owningClass;
};

class AssociationClass: public Association, public Class {
};

A straight 1:1 mapping of the concept to the language. Here, class “AssociationClass” fully inherits Class::properties without the need to implement anything special. It just works. Compared to Objective-C, however, the C++ implementation lacks support for Core Data. But so does multiple inheritance in Objective-C with CoreData! So no real difference here.
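
To show that the downcast which failed with the composition mapping works here, the hierarchy can be exercised as follows. This is a sketch with two additions that are not part of the listing above: a virtual destructor in Element (an assumption, needed so that dynamic_cast can be used) and the hypothetical helper functions ownerIsAssociationClass() and hasSingleElementBase().

```cpp
#include <cassert>
#include <vector>

class Property;

class Element {
public:
    virtual ~Element() {}  // assumption: makes the hierarchy polymorphic
};

class Association : public virtual Element {};

class Class : public virtual Element {
public:
    std::vector<Property *> properties;
};

class Property : public Element {
public:
    Class *owningClass;
    Property() : owningClass(0) {}
};

class AssociationClass : public Association, public Class {};

// A client that only holds a Class* can still find out whether it is
// really an AssociationClass.
bool ownerIsAssociationClass(const Property &p) {
    return dynamic_cast<AssociationClass *>(p.owningClass) != 0;
}

// Virtual inheritance leaves exactly one Element subobject in the diamond.
bool hasSingleElementBase(AssociationClass &ac) {
    Element *viaAssociation = static_cast<Association *>(&ac);
    Element *viaClass = static_cast<Class *>(&ac);
    return viaAssociation == viaClass;
}
```

A Property whose owningClass points at an AssociationClass makes ownerIsAssociationClass() return true, while one owned by a plain Class makes it return false.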

Conclusion

Multiple inheritance with CoreData is close to impossible except for very simple cases. With C++, despite all its ugliness and controversy, at least you get multiple inheritance in the language, usually in an implementation quality that spares you from wasting your time working around the whole concept.