JPA Transitive Persistence for the Play Framework

In play framework 1.x the JPA persistence model has been changed in a dramatic way. JPA as a standard is all about transitive persistence: within a JPA transaction, any change to any object managed by the EntityManager is propagated to the database. The transitive hull of the loaded, reachable object graph is traversed (honoring the Cascade annotations), compared to the loaded state, and the resulting delta is committed to the DB, all implicitly. The ease of this automatism is an achievement that took the Java enterprise world quite some time.
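To make the automatism concrete, here is a toy simulation of the snapshot-and-diff dirty checking a JPA provider performs at commit. This is an illustration of the concept only; the class names are made up and no real JPA API is used:

```java
import java.util.*;

class DirtyCheckingDemo {
    static class Entity {
        final long id;
        String name;
        Entity(long id, String name) { this.id = id; this.name = name; }
    }

    static class ToyPersistenceContext {
        private final Map<Entity, String> snapshots = new HashMap<>();

        // simulate loading an entity from the database: the context
        // remembers the state the entity had at load time
        Entity load(long id, String name) {
            Entity e = new Entity(id, name);
            snapshots.put(e, e.name);
            return e;
        }

        // at commit, every managed entity is compared to its snapshot and
        // the delta is turned into SQL; no explicit save() call is needed
        List<String> commit() {
            List<String> statements = new ArrayList<>();
            for (Map.Entry<Entity, String> entry : snapshots.entrySet()) {
                Entity e = entry.getKey();
                if (!e.name.equals(entry.getValue())) {
                    statements.add("UPDATE entity SET name='" + e.name + "' WHERE id=" + e.id);
                }
            }
            return statements;
        }
    }
}
```

A real provider does this for every managed entity in the persistence context and derives the full set of UPDATE, INSERT and DELETE statements from the delta.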

Play, on the other hand, resembles the logic found in web frameworks like Ruby on Rails, where saving is an explicit operation: you need to call save() on every object whose changes you want to propagate to the database. Play does cascade the save when the corresponding JPA annotation is present, so you don’t need to call save() on all of them, but even this solution is incomplete, as I describe in Beware of this: play framework’s cascaded save() only works on loaded collections.

To be honest, I don’t like play’s (or RoR’s) approach. The logic behind it is that you can make changes to as many objects as you want, and if an error occurs somewhere down the line, you will not need to undo the changes you made before the error occurred. But the same can be accomplished by simply not committing the running JPA transaction (i.e., rolling it back). In a J2EE environment, you just don’t want to control the transactions yourself; the JPA hack in play makes it necessary to do exactly that. In a non-trivial environment, where you need to manage more complex object models, you don’t want to mess with the transaction that demarcates the unit of work! Some logic might apply changes which in turn cause changes in the reachable object graph, and the programmer of that logic does not necessarily know that the initial change will ripple through to other objects. We call that encapsulation, and play’s approach breaks it.

Long story short: to circumvent this, I created a module that brings the original JPA behavior of transitive persistence back to play. The API is exactly the same as in original play, with the exception that the model classes must inherit from a different base class. Instead of calling save() on every changed object, it is only necessary to call persist() on newly created objects. Every other change is handled by JPA’s transitive persistence logic, just as originally intended.

The module is not publicly available yet, but I have been using it successfully in all my play projects. I just wanted to let you know about it, and of course I will make it available (on GitHub) if people ask for it. So, if you want the real JPA back in play, drop me a note and I will publish it.

Java Like Programming in C++

C++ is a nice and very flexible language, but this flexibility comes at a cost: it forces you to think about many programming details before you can even think about solving your actual problem. Examples would be:

  • object-oriented or generic programming
  • when to use references, value types and pointers
  • memory management rules
  • casting rules, const correctness, virtual methods
  • STL? boost?

This list could go on endlessly. Being in the same boat as every other C++ programmer, but having grasped some of the look & feel of other languages like Java and Objective-C (in its Cocoa incarnation) or even Qt (as a good example of C++ OO style), I am continuously and unconsciously weighing the pros and cons of each of them, and recently I thought: wouldn’t it be nice to be able to program in C++ but make it look like Java?

Sure it would! Even if it were only just for fun…

So, I came up with a working code sample which looks quite like Java but is actually C++. I will show the code first and elaborate on the details afterwards. First come some classes which you can consider the “framework”:

#include <iostream>
#include <sstream>
#include <assert.h>
#include <vector>


size_t globalRefCount = 0;

class Object;
class String;
std::ostream &operator<<(std::ostream&, Object&);
std::ostream &operator<<(std::ostream &, const String&);



class Object {    
protected:
    
    struct Impl {
        size_t _refCount;
        Impl() : _refCount(0) {}
        virtual ~Impl() {}
    } *data;
    
    inline void retain() {
        if(data) {
            data->_refCount++;
            globalRefCount++;
        }
    }
    inline void release() {
        if(data) {
            data->_refCount--;
            globalRefCount--;
            if(data->_refCount == 0) {
                delete data;
                data = 0;
            }
        }
    }
    
    Object(): data(new Impl) {
        retain();
    }
    Object(Impl *imp) : data(imp) {
        retain();
    }
    Object(const Object& other): data(0) {
        operator=(other);
    }
    
public:
    virtual ~Object() {
        release();
    }
    virtual const char *type() const { return "Object"; }
    
    void operator=(const Object& other) {
        if(data!=other.data) {
            release();
            data = other.data;
            retain();
        }
    }
    
    // make a heap clone of this object for usage in containers
    virtual Object *clone() const {
        return new Object(*this);
    }
    
    virtual String toString() const;
        
};




class String : public Object {
protected:
    struct Impl: public Object::Impl {
        std::string str;
    };
public:
    String(): Object(new Impl) {}
    String(const char *s): Object(new Impl) {
        static_cast<Impl*>(data)->str = s;
    }
    String(const String &other): Object(other) {}
    String operator+(const String& s) const {
        String result;
        static_cast<Impl*>(result.data)->str = static_cast<Impl*>(data)->str;
        static_cast<Impl*>(result.data)->str += static_cast<Impl*>(s.data)->str;
        return result;
    }
    String operator+(const Object& s) const {
        String result;
        static_cast<Impl*>(result.data)->str = static_cast<Impl*>(data)->str;
        static_cast<Impl*>(result.data)->str += static_cast<Impl*>(s.toString().data)->str;
        return result;
    }
    String operator+(long l) const {
        std::ostringstream oss;
        oss << static_cast<Impl*>(data)->str << l;
        String result;
        static_cast<Impl*>(result.data)->str = oss.str();
        return result;
    }
    const char *c_str() const {
        return static_cast<Impl*>(data)->str.c_str();
    }
    bool operator==(const String &other) const {
        return static_cast<Impl*>(data)->str == static_cast<Impl*>(other.data)->str;
    }
    
    // must have's
    const char *type() const { return "String"; }
    Object *clone() const { return new String(*this); }
    
    String toString() const {
        return *this;
    }
};

std::ostream &operator<<(std::ostream &os, const String& s) {
    os << s.c_str();
    return os;
}

String Object::toString() const {
    std::ostringstream os;
    os << this->type() << "@" << (void *)this << "[" << (data ? data->_refCount : 0) << "]";
    return String(os.str().c_str());
}

std::ostream &operator<<(std::ostream &os, Object& o) {
    os << o.toString().c_str();
    return os;
}




class ClassCastException: public Object {
    struct Impl: public Object::Impl {
        String message;
    };
public:
    ClassCastException() : Object(new Impl) {}
    ClassCastException(const String& msg) : Object(new Impl) {
        static_cast<Impl*>(data)->message = msg;
    }
    const char *type() const { return "ClassCastException"; }
    Object *clone() const { return new ClassCastException(*this); }
    String message() const {
        return static_cast<Impl*>(data)->message;
    }
    String toString() const {
        return message();
    }

};



class ArrayList: public Object {
    struct Impl: public Object::Impl {
        std::vector<Object*> _data;
        // delete the owned element clones only when the last proxy releases
        // this Impl; deleting them in ~ArrayList would leave dangling
        // pointers in any other proxy still sharing the same Impl
        ~Impl() {
            for (std::vector<Object*>::iterator it = _data.begin(); it!=_data.end(); it++) {
                delete *it;
            }
        }
    };
    
public:
    ArrayList(): Object(new Impl) {}
    void add(const Object& element) {
        static_cast<Impl*>(data)->_data.push_back(element.clone());
    }
    size_t size() const {
        return static_cast<Impl*>(data)->_data.size();
    }
    
    Object &at(size_t index) const {
        return *static_cast<Impl*>(data)->_data.at(index);
    }
    template<class T> const T &at(size_t index) const {
        Object *o = static_cast<Impl*>(data)->_data.at(index);
        T *t = dynamic_cast<T*>(o);
        if(t) return *t;
        throw ClassCastException(o->type());
    }
    
    const char *type() const { return "ArrayList"; }
    Object *clone() const { return new ArrayList(*this); }

    String toString() const {
        std::ostringstream oss;
        oss << Object::toString() << "(";
        for (size_t i = 0; i<size(); i++) {
            oss << at(i).toString();
            if(i+1<size()) {
                oss << ",";
            }
        }
        oss << ")";
        return oss.str().c_str();
    }

};


class OutputStream: public Object {
    struct Impl: public Object::Impl {
        std::ostream &stream;
        Impl(std::ostream &os): stream(os) {}
    };
    OutputStream() {}
public:
    
    OutputStream(std::ostream& os): Object(new Impl(os)) {}
    OutputStream(const OutputStream &other): Object(other) {}

    void println(const Object &object) {
        static_cast<Impl*>(data)->stream << object.toString() << std::endl;
    }
    
    const char *type() const { return "OutputStream"; }
    Object *clone() const { return new OutputStream(*this); }

};

Now let’s see how to use it in client code. I supply only a main() here, but use some anonymous blocks to show the effects of scoping:


struct system {
    OutputStream out;
    system(): out(std::cout) {}
};

struct system System;


int main (int argc, const char * argv[]) {
    
    assert(globalRefCount==1); // 1 is for System.out
    
    {
        String s;
        assert(globalRefCount==2);
    }
    assert(globalRefCount==1);
    
    
    {
        String s = "Connecting...";
        System.out.println(String("s = ") + s);
        
        String dots = "the dots";
        String t = s + " " + dots + ".";
        System.out.println(String("t = ") + t);
        
        ArrayList l;
        l.add(s);
        assert(l.size() == 1);
        assert(globalRefCount==6);
        
        l.add(t);
        l.add(ArrayList());
        System.out.println(String("l = ") + l);
        assert(globalRefCount==8);
        
        try {
            System.out.println(String("l[0] as String = ") + l.at<String>(0)); // ok
            System.out.println(String("l[1] as String = ") + l.at<String>(1)); // ok
            System.out.println(String("l[2] as String = ") + l.at<String>(2)); // this throws!
            assert(false);
        } catch(ClassCastException e) {
            System.out.println(String("ClassCastException: ") + e);
            l.add(e);
        }
        
        
        // adding to the list in other scope will keep the object valid
        {
            String other = "Created in other scope";
            l.add(other);
        }
        System.out.println(String("l[3] as String = ") + l.at(3));
        assert(l.size()==5);
        assert(l.at<String>(4) == "Created in other scope");
        System.out.println(String("l = ") + l);
    }
    assert(globalRefCount == 1);
    
    return 0;
}

Pretty much like Java, isn’t it?

I have written this just as a proof of concept. As such, it is fully working and I like it so far. It could serve as a good starting point for a complete implementation. Here’s the output when executed:

s = Connecting...
t = Connecting... the dots.
l = ArrayList@0x7fff5fbff8c0[1](Connecting...,Connecting... the dots.,ArrayList@0x100100b00[1]())
l[0] as String = Connecting...
l[1] as String = Connecting... the dots.
l[2] as String = ClassCastException: ClassCastException@0x100100cb8[1]
l[3] as String = ClassCastException@0x100100b20[1]
l = ArrayList@0x7fff5fbff8c0[1](Connecting...,Connecting... the dots.,ArrayList@0x100100b00[1](),ClassCastException@0x100100b20[1],Created in other scope)
Program ended with exit code: 0

Fundamental Design

Java is (with the exception of primitive types like int, double etc. and their array forms) an object-oriented language. Everything in Java or Objective-C derives from a common, well-known base class. So there is a base class Object in this example as well, along with an equivalent of the platform type String and a sample collection called ArrayList which holds plain Objects.

One thing is very important: there is no need for pointers, because memory management is part of the solution! For instance, the ClassCastException that is thrown can be added to the ArrayList without having to worry about leaking memory afterwards. It is not complete yet, though, as the notion of “weak” references is still missing (think: ARC), but the strong-reference part is fully working here.

The implementation is based on the well-known PIMPL idiom, mixes in the idea of a reference-counting smart pointer, and then goes even further. Firstly, behavior and data are strictly separated, not just the private members. Each conceptual class like String, ArrayList or ClassCastException is a functional class that implements behavior only and holds no data at all; it acts as a fully functional proxy to the data. This makes it possible to clone (copy-assign) these proxy objects very cheaply, because they consist of only two pointers (data and the vtable pointer). The actual data lives in the nested class “Impl”. There is one specialized Impl class for each conceptual class (a 1:1 mapping), so the conceptual classes and the Impl classes span two parallel type hierarchies. As a picture says more than a thousand words, here it is:

In the base Impl (Object::Impl), smart-pointer-like reference counting is implemented (_refCount). I have explicitly added the two methods Object::retain() and Object::release() to the code to express the similarity to Objective-C’s NSObject, but this is all handled internally during copy construction and assignment.

Conclusion

I still have to decide whether an approach like this is generally feasible. What I like is being able to clone good concepts and class-library designs from other languages like Java or Objective-C into C++ and continue coding without having to worry about the aforementioned C++ design decisions that trouble me every day.

Of course, I would have to implement ARC-style memory management completely before this can be used, otherwise cyclic references would leak. I am also fully aware that a coding style like this leads to immediate code bloat; but so does the PIMPL idiom. To mitigate that, I have a flexible code generator on my side which lets me do most of the actual coding in UML instead of hand-crafting it. Without it, I would definitely think twice or more before traveling down this road…

View the full source: https://gist.github.com/2279561

Play module “associations” 1.0 released

Today I have released version 1.0 of the play associations module. I would like to elaborate a bit more on the rationale for this module.

Imagine the following simple model:

public class Forum {
    @OneToMany(cascade=CascadeType.ALL, mappedBy="forum")
    public List<Post> posts;
} 
public class Post {
    @ManyToOne
    public Forum forum;
}

Before the Module

In model management there are three phases: creation, manipulation, and deletion. Under normal conditions (that is, play without the associations module) these would be implemented like this:

1. Object creation and association creation

Forum forum = new Forum();
Post post = new Post();
forum.posts.add(post);
post.forum = forum; // don't forget this
forum.save(); // using CascadeType.ALL etc. this will cascade

2. Manipulation of existing objects

Post post = ...
Forum forum = post.forum;
forum.posts.remove(post); // don't forget this
Forum forum2 = new Forum();
forum2.posts.add(post);
post.forum = forum2; // don't forget this
forum.save(); // cascades
forum2.save(); // cascades

3. Deletion of objects

// this can be accomplished using
public class Forum {
  @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
  public List<Post> posts;
}

Post post = ...
Forum forum = post.forum;
forum.posts.remove(post);
post.forum = null; // don't forget this
forum.save();

With the Module

Now fast-forward to an implementation *with* the associations module: all of this gets much easier and more intuitive to write.

1. Object creation and association creation

Forum forum = new Forum();
Post post = new Post();
forum.posts.add(post);
forum.save(); // using CascadeType.ALL etc. this will cascade

2. Manipulation of existing objects

Post post = ...
Forum forum = post.forum; // remember the old forum
Forum forum2 = new Forum();
forum2.posts.add(post); // the module rewires post.forum and unlinks the old forum
forum.save(); // cascades
forum2.save(); // cascades

3. Deletion of objects

Post post = ...
Forum forum = post.forum;
post.forum = null; // the module removes the post from forum.posts
forum.save();

IMO this is much more intuitive. If you add a post to a forum, you’d expect the previous forum to no longer reference it, wouldn’t you? And if you make a change on one side, you’d expect the corresponding change to be made on the other side as well, wouldn’t you?

And this is exactly what this module provides, complete management of all JPA two-sided associations.
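For illustration, the rewiring such a module performs can be sketched in plain Java. The real module injects this logic into your entities via bytecode enhancement; the class and method names here are hypothetical:

```java
import java.util.*;

class AssociationSketch {
    static class Forum {
        final List<Post> posts = new ArrayList<>();
    }

    static class Post {
        private Forum forum;

        Forum getForum() { return forum; }

        // changing one side of the association keeps the other side in sync
        void setForum(Forum newForum) {
            if (forum == newForum) return;
            if (forum != null) forum.posts.remove(this);   // unlink from the old forum
            forum = newForum;
            if (newForum != null && !newForum.posts.contains(this)) {
                newForum.posts.add(this);                   // link to the new forum
            }
        }
    }
}
```

Moving a post from forum1 to forum2 then reduces to a single post.setForum(forum2); the module applies the same idea to @OneToOne and @ManyToMany associations as well.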

Transparent Bi-directional Associations in Play

Do you find writing code like this cumbersome and error prone?

forum1.posts.remove(post);
post.forum = forum2;
forum2.posts.add(post);

Wouldn’t it be much easier if you could just write

post.forum = forum2;

and the system handles all the wiring and rewiring for you? With the benefit of eliminating errors like “detached entity passed to persist” and such?

I have created a play module that does just that. Whenever you invoke an operation that changes part of an association, from whichever side, the module completes the operation on all other affected parts. This includes not only the new target and its opposite reference; it also unlinks the target object from its currently associated object. All hassle-free and safe. It works on @OneToOne, @OneToMany and @ManyToMany associations.

There are no dependencies introduced into your code. The module enhancer works on all properties of your @Entity classes that have a “mappedBy” attribute on the @OneToOne, @OneToMany or @ManyToMany annotation. You do not need to declare anything else; the presence of the module is sufficient.

You can check it out at https://github.com/pareis/associations and, as soon as it is available, in the play modules repository.

Beware of this: play framework’s cascaded save() only works on loaded collections

I am currently developing a web application using the really excellent play framework. If you are a web developer with some Java background I strongly encourage you to give it a try.

However, play has taken a slightly different, or better, an additional approach to controlling object synchronization with the underlying database. Whereas JPA manages a loaded object graph and automatically persists any changes upon transaction completion, play has changed this behavior (“Explicit save”) to give the beginner more control over what is going on with the objects. This extension turns the implicit save mechanism into an explicit one, where the developer is in control and calls save() on the changed objects. This save() will in turn cascade (if, for instance, the CascadeType.ALL annotation option is present) to the reachable objects.

At first glance this might seem clever because you get control back, but I have mixed feelings about it, because there are cases where it just does not work as easily as expected.

As an example, imagine a 3-level parent-child object model.

@Entity
public class Item extends Model {

    @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "item")
    @OrderBy("createdAt")
    public List<Position> positions = new ArrayList<Position>();
}

@Entity
public class Position extends Model {

    public Date createdAt = new Date();
    public int quantity;

    @ManyToOne public Item item;

    @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "position")
    @OrderBy("createdAt")
    public List<Order> orders = new ArrayList<Order>();
}

@Entity @Table(name = "Ordr")
public class Order extends Model {

    public Date createdAt = new Date();
    public int quantity;

    @ManyToOne public Position position;
}

So far nothing complicated. But now imagine you are loading objects (for performance reasons) from the lowest level (Order) and manipulating the parent object (i.e., the Position).

public void dostuff(Item item) {

    // load orders according to some important criteria
    List<Order> orders = Order.find("position.item = ? order by createdAt", item).fetch();

    // manipulate the parent of the order (the Position)
    Order order = extractImportantOne(orders);
    order.position.quantity += 200;

    item.save();
}

Intuitively I would assume that the manipulated Position object is updated in the database through the Item.positions cascade settings. But this is not the case here! The reason is that the collection Item.positions has not been populated from the database. Thus, play sees an uninitialized PersistentCollection and simply skips the cascade.
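The skip can be pictured with a small plain-Java simulation of a lazily loaded collection (hypothetical names; Hibernate’s actual PersistentCollection is of course more involved):

```java
import java.util.*;

class LazyCascadeSketch {
    // stand-in for a lazily loaded JPA collection: it knows whether it
    // was ever populated from the database
    static class LazyList<T> {
        private List<T> contents;                    // null until first access

        boolean isInitialized() { return contents != null; }

        // touching the collection triggers the simulated database load
        List<T> get() {
            if (contents == null) contents = new ArrayList<T>();
            return contents;
        }
    }

    // a play-style cascaded save only descends into initialized
    // collections; returns how many children would be saved
    static int cascadeSave(LazyList<String> children) {
        if (!children.isInitialized()) return 0;
        return children.get().size();
    }
}
```

Calling item.positions.size() before save() corresponds to get() here: it initializes the collection, so a subsequent cascade no longer skips it.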

One solution to the problem would be to touch the cascading collection:

 item.positions.size();
 item.save();

This will make sure the ‘positions’ collection is considered during the following call to save(). With pure JPA, on the other hand, such a trick would never be necessary, and even the call to save() could be avoided, because JPA knows all loaded and changed objects and automatically updates the database.

This is why I have very mixed feelings about play’s extended persistence API. It works well in simple cases but for more complex models it might be better to switch back to plain JPA mode.