Monday, January 15, 2007
Hibernate - Difference between session's get() and load()
Being an avid Hibernate fan, I have always defended it in my organization when people throw undue criticism at it in order to protect themselves. In one such debate, a colleague pointed out a pattern in our code base that introduced needless performance degradation, and condemned Hibernate for it. I was glad he brought that up, for two reasons. First, it really was a problem and called for immediate attention. Second, once again the problem was not with Hibernate, but with us.
If you look closely at domain-driven applications, you will notice that a few core objects are directly referenced by most other objects. Let me clarify what I mean with an example. In an auction application, an Auction is held for an Item, a Bid is placed for an Item, and a Buyer buys an Item. The common object referenced here is, well, an Item. This implies that whenever you create a new Auction or Bid, you must supply a reference to an Item. The most obvious way to achieve this is to get a persistent instance of Item from the database using the session.get() method. This works, but it has its limitations.
Session session = << Get session from SessionFactory >>
Long itemId = << Get the item id from request >>
// get() returns null when no Item exists for the given id
Item item = (Item) session.get(Item.class, itemId);
if (item != null) {
    Bid bid = new Bid();
    bid.setItem(item);
    session.saveOrUpdate(bid);
} else {
    log.error("Bid placed for an unavailable item");
    // Handle the error condition appropriately
}
Think about it... How many times will a bid be placed for an Item? Many. Every time a Bid is placed, is it wise to hit the database and retrieve the corresponding Item just to supply it as a reference? I guess not. That is where session.load() comes in. With everything else in the above scenario unchanged, if you use session.load() instead of get(), Hibernate will not hit the database. Instead it will return a proxy, or the actual instance if it is already present in the current session, and that can serve as the reference.
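To make the round-trip difference concrete, here is a minimal toy model in plain Java. It is not Hibernate code: FakeDatabase, ItemRef and ToySession are hypothetical names invented for this sketch, and the real proxy machinery is far more involved. It only illustrates the idea that an eager lookup queries immediately, while a lazy reference that already knows its id does not.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: NOT Hibernate's API. It mimics how get() and load() differ in round trips. */
public class GetVsLoadSketch {

    /** Stand-in for the items table; counts every simulated SELECT. */
    static class FakeDatabase {
        final Map<Long, String> rows = new HashMap<>();
        int selectCount = 0;

        String selectItemName(long id) {
            selectCount++;
            return rows.get(id); // null when the row does not exist
        }
    }

    /** Lazy reference: knows the id up front, fetches the rest on first property access. */
    static class ItemRef {
        private final long id;
        private final FakeDatabase db;
        private String name;
        private boolean initialized;

        ItemRef(long id, FakeDatabase db) {
            this.id = id;
            this.db = db;
        }

        long getId() {
            return id; // never touches the database
        }

        String getName() {
            if (!initialized) {
                name = db.selectItemName(id);
                if (name == null) {
                    throw new IllegalStateException("No row for item " + id);
                }
                initialized = true;
            }
            return name;
        }
    }

    /** Toy session with a first-level cache keyed by id. */
    static class ToySession {
        private final FakeDatabase db;
        private final Map<Long, ItemRef> cache = new HashMap<>();

        ToySession(FakeDatabase db) {
            this.db = db;
        }

        /** get()-like lookup: resolves the row immediately, returns null if it is missing. */
        ItemRef get(long id) {
            ItemRef cached = cache.get(id);
            if (cached != null) {
                return cached;
            }
            ItemRef ref = new ItemRef(id, db);
            try {
                ref.getName(); // forces the simulated SELECT
            } catch (IllegalStateException e) {
                return null;
            }
            cache.put(id, ref);
            return ref;
        }

        /** load()-like lookup: hands back an uninitialized reference without querying. */
        ItemRef load(long id) {
            return cache.computeIfAbsent(id, key -> new ItemRef(key, db));
        }
    }

    /** Number of simulated SELECTs issued while obtaining one reference and reading its id. */
    static int selectsFor(boolean useLoad) {
        FakeDatabase db = new FakeDatabase();
        db.rows.put(1L, "Antique clock");
        ToySession session = new ToySession(db);
        ItemRef item = useLoad ? session.load(1L) : session.get(1L);
        item.getId(); // the id is all a Bid needs for its foreign key
        return db.selectCount;
    }

    public static void main(String[] args) {
        System.out.println("get():  " + selectsFor(false) + " select(s)");
        System.out.println("load(): " + selectsFor(true) + " select(s)");
    }
}
```

In this model the get()-style path issues one simulated SELECT and the load()-style path issues none, which is the saving described above. Whether the real Hibernate behaves this way for a given entity depends on its mapping and version.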
What does this buy you? At least two advantages. First, you save a trip to the database. Second, the error handling code becomes more elegant. Take a look at the code snippet below. Here we don't handle error conditions using null checks. Instead we use exceptions, which seems appropriate in this scenario, doesn't it?
Session session = << Get session from SessionFactory >>
Long itemId = << Get the item id from request >>
try {
    // load() returns Object, so cast it just as with get()
    Item item = (Item) session.load(Item.class, itemId);
    Bid bid = new Bid();
    bid.setItem(item);
    session.saveOrUpdate(bid);
} catch (ObjectNotFoundException e) {
    log.error("Bid placed for an unavailable item");
    // Handle the error condition appropriately
}
From the above piece of code, it is obvious that an ObjectNotFoundException may be thrown if the actual Item for the given item id cannot be found. What I am not clear about - and neither is the Hibernate documentation - is which method is more likely to cause this exception, and why. session.load() could throw it, and so could saveOrUpdate(), for the same reason: the item for the given id/proxy is not available.
I would love to hear from people who have traveled this path and have an answer.
Also, it would be wonderful if you could point out other differences between session.get() and load() that I may have missed.
Posted by Ganeshji Marwaha at 11:30 PM
33 Comments:
Right, this is the correct way to use load(). But note that in some cases the exception might not occur until transaction commit time. And in this case it might appear as a constraint violation, not an ONFE.
By
Anonymous, at
2:00 PM
Being a beginner, it was really informative for me to learn the difference.
By
Anonymous, at
8:31 PM
Great post. It's those little things in rich tools like Hibernate that sharing of experiences like this is very helpful. Thanks!
By
marcus, at
11:44 PM
Hibernate 3 API says:
You should not use this method to determine if an instance exists (use get() instead). Use this only to retrieve an instance that you assume exists, where non-existence would be an actual error.
How does this affect your solution, and what's the reasoning behind that statement? Any ideas?
By
Anonymous, at
12:54 AM
Nice post - just so much to learn about Hibernate. Couldn't you also use Hibernate's 2nd level caching to help the performance and reduce trips to the database?
By
Anonymous, at
7:10 AM
Are there issues we should be worrying about when caching is enabled?
By
Anonymous, at
7:11 AM
As far as I know, both calls will NOT cause a database hit if the object exists in the session. And I think load will throw an ObjectNotFoundException if the object does not exist in the database. Thus it will cause a database hit if the object does NOT exist within the session.
By
Anonymous, at
7:53 AM
If you are sure the item is persisted in the DB, use load(); if you need to check whether the item is persisted, use get(). If you use load() and then check whether the returned object is persisted, you may go wrong, because load() always returns a proxy, but get() does not.
By
Anonymous, at
8:00 AM
Am I missing something? In the latest API docs for the "get" methods in Hibernate 3.1/3.2 (which you link to), it quite clearly states:
"If the instance, or a proxy for the instance, is already associated with the session, return that instance or proxy"
This means that the only difference between the two methods is that the load method will throw an exception if the persistent entity does not exist in the database. The functionality you describe did exist in Hibernate 2, however.
By
Simon Knott, at
8:01 AM
As far as I know, if the object exists in the session, both methods will not query the database but use the cached object. And as far as I understand load, it has to query the database if the object does not exist within the session, as it will throw an ObjectNotFoundException if the object does not exist. The javadoc isn't quite clear there, but if I remember correctly that is the expected behaviour.
By
Anonymous, at
8:02 AM
I can see that this is your 2nd post but I hope that you'll present "us" with many more as cool as this one :-)
Any RSS feed?
Cheers,
PP
By
Paulo Pires, at
8:41 AM
This is a very good article and it triggered my curiosity. I have faced this issue a few times, though not for performance reasons. This is how I see it.
The get method returns a persistent entity, or null if the object does not exist. If the object is already in the session, it returns that instance or proxy.
The load method:
- load(Class, Serializable): you should use find to check the existence of the entity. Don't assume that the object exists. This method retrieves the persistent object. If the object is already in the session, it returns the instance or the proxy.
- load(Object, Serializable): reads the persistent state for the given id into a transient object (an object not associated with the session).
As far as performance goes, I benchmarked both and did not see any difference.
By
Anonymous, at
9:19 AM
Honestly, I didn't anticipate this many responses. In fact, I would be understating it if I said "you made my day". As I promised in my first post, my intent is to share and learn at the same time. I will do my best in the upcoming entries to live up to it. Thanks to all my new friends, who had the patience to listen to me.
After studying the adept responses above, I was able to do more informed research on the topic. Here, assuming we use the load() method, let's pinpoint when a proxy will be returned, and when an ObjectNotFoundException will be thrown.
Scenario #1: Item entity is configured such that hibernate is allowed to proxy it.
In this case, if the Item instance exists in the session cache, it will be returned, otherwise a proxy will be returned. If an actual instance was returned in the first place, we will be fine when we access Item's properties. If a proxy was returned to begin with, then an ObjectNotFoundException will be thrown if we access the properties of the Item proxy, and the actual Item doesn't exist in the database.
Note: You may access the identifier property without raising the exception.
Scenario #2: Item entity is configured such that hibernate is not allowed to proxy it.
In this case, hibernate will try to return the actual instance always. So, if an instance exists in the session cache, it will be returned. If not, since hibernate is not allowed to proxy it, it will make a trip to the database to find the instance. If present, the instance will be added to the session cache and returned. Otherwise, an ObjectNotFoundException will be thrown.
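The two scenarios can be sketched with a small toy model in plain Java. Everything here is hypothetical and made up for illustration (MissingRowException stands in for ObjectNotFoundException, rowExists() for the database, and the session cache is left out); it only shows at which step a missing row gets noticed.

```java
/** Toy model only, not Hibernate code: shows WHEN a missing row is noticed in each scenario. */
public class LoadScenariosSketch {

    /** Hypothetical stand-in for org.hibernate.ObjectNotFoundException. */
    static class MissingRowException extends RuntimeException {
        MissingRowException(long id) {
            super("No row with id " + id);
        }
    }

    /** Pretend database: only the item with id 1 exists. */
    static boolean rowExists(long id) {
        return id == 1L;
    }

    static class Item {
        private final long id;
        private boolean initialized;

        Item(long id, boolean initialized) {
            this.id = id;
            this.initialized = initialized;
        }

        /** Reading the identifier never needs the database. */
        long getId() {
            return id;
        }

        /** Any other property forces initialization, which is where a proxy can fail. */
        String getName() {
            if (!initialized) {
                if (!rowExists(id)) {
                    throw new MissingRowException(id);
                }
                initialized = true;
            }
            return "item-" + id;
        }
    }

    static Item load(long id, boolean proxyAllowed) {
        if (proxyAllowed) {
            return new Item(id, false); // Scenario #1: uninitialized proxy, no lookup yet
        }
        if (!rowExists(id)) {
            throw new MissingRowException(id); // Scenario #2: lookup happens right away
        }
        return new Item(id, true);
    }

    /** Reports the first step at which a missing row is detected: "load", "property access" or "none". */
    static String failurePoint(long id, boolean proxyAllowed) {
        Item item;
        try {
            item = load(id, proxyAllowed);
        } catch (MissingRowException e) {
            return "load";
        }
        item.getId(); // safe in both scenarios
        try {
            item.getName();
        } catch (MissingRowException e) {
            return "property access";
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println("Proxy allowed, missing row:     " + failurePoint(99L, true));
        System.out.println("Proxy not allowed, missing row: " + failurePoint(99L, false));
        System.out.println("Proxy allowed, existing row:    " + failurePoint(1L, true));
    }
}
```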
Hope this clarifies.
By
Ganeshji Marwaha, at
12:03 PM
I can see that this is your 2nd post but I hope that you'll present "us" with many more as cool as this one :-)
Any RSS feed?
Cheers,
PP
Here you go. http://gmarwaha.blogspot.com/atom.xml
The right nav-bar on the main page has a link to this feed, but I guess I didn't add it to the individual post entries when I was updating my Blogger template.
By
Ganeshji Marwaha, at
12:07 PM
Nice post. Encourages me to learn more about Hibernate
By
Anonymous, at
2:51 PM
how about the penalty you take for switching from a simple if/else check, to a try/catch which is a more expensive operation
By
Anonymous, at
3:15 PM
I agree that try-catch has a slight penalty compared to a null check. Still, there are 3 reasons why you will want to use try-catch in this scenario.
1. In most cases, get() will return an object (the primary reason for using load() here), which forces us to hit the database twice - once for get() and again for saveOrUpdate(). This is far more expensive than using load() with try-catch.
2. Shouldn't we prefer an elegant approach over a minor increase in performance?
3. The exception is caught and handled early, which more than halves the penalty introduced.
Either way, I appreciate the different perspective you brought.
By
Ganeshji Marwaha, at
5:28 PM
Have you tried to use load() twice in one session for a given ID? Then you'll see why this is not such a good idea, as you'll get an exception stating that the object with the given identifier cannot be loaded as it is already associated with the current session.
You can either evict it from the session before, or use get (which has my preference). I would also use either lazy loading or caching for this performance optimization, and certainly not load().
By
Anonymous, at
12:40 AM
I don't think using load() is a good idea for this performance optimization. Have you ever tried to use load() twice in the same session for a given ID? Then you'll get an exception stating that the object with the given ID has already been associated with the session and can thus not be loaded anymore.
I would use caching or lazy loading for this performance optimization, and certainly would not use load().
Please note that get() will only hit the DB if:
- the object is not associated with the session
- is not present in the 2nd level cache
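The lookup order described in this list can be sketched as a toy model in plain Java. This is not Hibernate's implementation: the class name and its three maps are hypothetical stand-ins for the session cache, the 2nd level cache and the database, and real cache configuration is ignored.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: the tiers a get()-style lookup consults before it reaches the database. */
public class GetLookupOrderSketch {
    private final Map<Long, String> sessionCache = new HashMap<>();
    private final Map<Long, String> secondLevelCache = new HashMap<>();
    private final Map<Long, String> database = new HashMap<>();
    private int databaseHits = 0;

    /** Returns the item name for the id, or null if no tier knows about it. */
    String get(long id) {
        String value = sessionCache.get(id);
        if (value != null) {
            return value; // already associated with the session
        }
        value = secondLevelCache.get(id);
        if (value == null) {
            databaseHits++; // only now does the lookup reach the database
            value = database.get(id);
            if (value == null) {
                return null;
            }
            secondLevelCache.put(id, value);
        }
        sessionCache.put(id, value);
        return value;
    }

    /** Two lookups of the same id in one session: only the first reaches the database. */
    static int hitsForTwoGetsInOneSession() {
        GetLookupOrderSketch sketch = new GetLookupOrderSketch();
        sketch.database.put(1L, "Antique clock");
        sketch.get(1L);
        sketch.get(1L);
        return sketch.databaseHits;
    }

    /** A lookup served by a warm 2nd level cache never reaches the database. */
    static int hitsWithWarmSecondLevelCache() {
        GetLookupOrderSketch sketch = new GetLookupOrderSketch();
        sketch.database.put(1L, "Antique clock");
        sketch.secondLevelCache.put(1L, "Antique clock");
        sketch.get(1L);
        return sketch.databaseHits;
    }

    public static void main(String[] args) {
        System.out.println("Two gets, cold caches: " + hitsForTwoGetsInOneSession() + " database hit(s)");
        System.out.println("One get, warm L2:      " + hitsWithWarmSecondLevelCache() + " database hit(s)");
    }
}
```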
By
Anonymous, at
12:53 AM
I am in accord with some of your opinions but beg to differ on the rest.
Second level cache is undeniably an option, but if I were you, I would consider it the last resort. I completely agree that get() will hit the DB only when the instance is not in the session cache. I thought I had mentioned it in my post, but I realized that I hadn't after I read your comment. Good catch, thanks. But to be honest, this optimization technique is for when it is not.
Calling load() twice within a session (for the same id) does not throw any exceptions. You might want to double check, and we can talk about this later if you prefer.
How exactly can we use lazy loading to achieve performance optimization in this scenario? Assuming the instance is not in the session, a call to get() will result in a DB trip regardless of whether you have configured it to be lazily loaded or not. So I wouldn't bet on this. Again, feel free to correct me if I am wrong.
By
Ganeshji Marwaha, at
1:23 AM
It's very simple:
use session.load when you are sure that the object exists in your application. session.load always returns a proxy
use session.get when the object might not exist. If the object doesn't exist, it returns null
By
Anonymous, at
6:09 AM
On your comment..
"Second because, once again the problem was not with hibernate, but us."
One thing you should realize is that with hibernate the problem will always be with us. That's because it is pretty complicated to use. It may be rich in functionality but it is overly complex. A good tool/library should always have a very fast/easy learning curve. What is the point of good library if only very few people can use it right? Is it the fault of the people using it or the library? Something to ponder over...
cheers
By
Anonymous, at
8:23 AM
Please note the difference between 'void load(..)' and 'Object load(..)'. I checked the Hibernate sources, and for the 'Object load(..)' I found the following (confirming my statement about loading an Object with the same ID twice in a session) in org.hibernate.event.def.DefaultLoadEventListener:
/**
 * Perfoms the load of an entity.
 *
 * @return The loaded entity.
 * @throws HibernateException
 */
protected Object load(
        final LoadEvent event,
        final EntityPersister persister,
        final EntityKey keyToLoad,
        final LoadEventListener.LoadType options)
        throws HibernateException {
    if ( event.getInstanceToLoad() != null ) {
        if ( event.getSession().getPersistenceContext().getEntry( event.getInstanceToLoad() ) != null ) {
            throw new PersistentObjectException(
                    "attempted to load into an instance that was already associated with the session: " +
                    MessageHelper.infoString( persister, event.getEntityId(), event.getSession().getFactory() )
            );
        }
        persister.setIdentifier( event.getInstanceToLoad(), event.getEntityId(), event.getSession().getEntityMode() );
    }
Sorry for the formatting, but I'm not allowed to use the appropriate HTML-tags...
You're right that 'void load(..)' does not throw such an exception, when the instance is already associated with the session.
By
Anonymous, at
2:15 AM
Real good discussion. Nice stuff Ganesh.
By
Anonymous, at
2:39 AM
The fact that calling Object load() twice in a single session throws an Exception (NonUniqueObjectException) is not a failing of Hibernate at all - in fact, it's unavoidable.
When Hibernate loads an Object with load() (or with get() for that matter), the Object is always persistent - i.e., associated with the session. (Contrast this with merge(), where the object passed in remains transient or detached.) Therefore, the Object you pass in to load needs to return as a persistent Object. Those are the semantics of the method. Hibernate obviously can't have two separate persistent Java objects for one database record - if it did, how would it know which one's state to save?
If there were a get() method to load state into an existing Object, it would do precisely the same thing, but having such a method wouldn't be logical - since get() can't load null state into the existing Object, and returning null is pretty much get()'s only purpose.
The difference with the Object-returning method is that, if there is an existing persistent Object already associated with the session, it just returns that Object - and therefore doesn't have the duplicate reference problem. (Hibernate refers to these internally as reload and load events respectively.)
It was mentioned above that Hibernate is too complicated. Perhaps the real problem is that it looks so much easier than it is. A lot of people get confused about things like the one I just mentioned because they think of Hibernate as a simple wrapper around a database, whereas it's actually a fundamental shift in thinking about persistence. Anyway, good article - if only for the discussion it provoked.
By
James, at
12:38 PM
Nice read!!! Especially for a Hibernate beginner like me. I have a question, and I am aware that it is not in line with the topic being discussed. It is about the Hibernate session cache. We have a scenario which requires us to run insert / update statements manually, and those values should be reflected in the application without restarting the app server.
1) I tried clearing the Hibernate session using evict / evictentity, but it does not seem to help.
2) I wrote a snippet which prints the session objects first, then clears the caches (using evict) and then prints all the objects in the session again. I expected the print statement after clearing the cache to print nothing, but it prints all the objects again... Why??
I would appreciate it if some of you experts could throw some light on this and educate me.
Thanks!!!
By
Anonymous, at
8:33 AM
Fantastic! Great discussion. It's difficult to find so many rich opinions in one place. Congratz!
By
Morlocks, at
8:18 PM
Nice article, but first: the times on each comment are stupid (there is no date, only the time, thanks to blogger.com!)
Old discussion, but I hope someone sees this...
I can't catch the ObjectNotFoundException on Hibernate 3.2.2 (it's considered unrecoverable), which imho is not a smart idea. I'm the programmer, *I* decide what is unrecoverable! Tools and libraries like Hibernate should give certain freedoms to their users.
I don't want to stop using lazy initialization by switching to get() instead of load().
Maybe (probably, actually) it's an architecture problem in my application, but I get many ObjectNotFoundExceptions flooding my log with useless information, because Hibernate internally catches that exception and prints the call stack.
SVR
By
SVR, at
12:58 PM
Have a look at : http://bean2coffee.blogspot.com/2009/05/retrieving-persistent-entity-using.html
It shows a nice summary table of the differences between get vs load with quick examples
By
Razib Shahriar, at
3:48 PM
Thanks a lot! You made my day!
By
Manuel Darveau, at
2:05 PM