Meetu Maltiar's Blog

Meetu's thoughts on technology and software development

Improve Performance By Using Keys Only Query on Google App Engine



It has been about two months since we started porting an existing application to Google App Engine. The application used Hibernate for persistence, the Wicket framework for the web layer and Spring as an IoC container. After checking the "Will it play in App Engine" compatibility list, we had to replace Hibernate with either JPA or JDO. We chose JPA purely because we had more experience with it than with JDO. Wicket is semi-compatible, and we made it work with the three standard workarounds.

Changing the persistence layer from Hibernate to JPA was the most challenging task. Not only did we have to break the relationships between entities because of the Datastore’s notion of entity groups, but we also had to make several performance optimizations later. In this post we look at how to use a keys-only query in JPA to improve performance.

We had trouble mapping associations in JPA on Google App Engine, and we have blogged about the case for unowned keys and about managing the multiple-parent problem. Understanding entity groups and transactions is extremely important when implementing persistence on Google App Engine, and especially so when porting an existing application to it.

Some parts of our application turned out to be expensive, either in CPU usage or in Datastore calls. We were able to improve their performance using keys-only queries.

If you use the Datastore low-level API, you are already familiar with the com.google.appengine.api.datastore.Query class, which has a setKeysOnly() method. If we call this method before running the query, the Datastore returns com.google.appengine.api.datastore.Entity instances that have their keys filled in but none of their properties. Keys-only queries are fast because the Datastore skips the extra scan that turns keys into full-fledged entities.
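To make the saving concrete, here is a small in-memory sketch of the idea. It is a toy model and not the App Engine API: the KeysOnlyModel class, its methods and the use of plain long values for keys and dates are all assumptions made for illustration. It shows an index scan that yields keys, followed by a per-key lookup that a keys-only query would skip.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy in-memory model of a keys-only query (hypothetical, NOT the App Engine API).
 * An "index" maps a property value to keys; a separate "table" maps keys to entities.
 */
public class KeysOnlyModel {

    /** Hypothetical stand-in for a stored entity. */
    public static final class Entry {
        public final long key;
        public final float hours;
        public final long entryDate; // epoch millis, kept as a long for simplicity

        Entry(long key, float hours, long entryDate) {
            this.key = key;
            this.hours = hours;
            this.entryDate = entryDate;
        }
    }

    // Index: entryDate -> keys of entities with that date, sorted by date.
    private final TreeMap<Long, List<Long>> dateIndex = new TreeMap<>();
    // Entity table: key -> full entity.
    private final Map<Long, Entry> entities = new HashMap<>();
    // Counts how many times the entity table was consulted.
    private int entityLookups = 0;

    public void put(long key, float hours, long entryDate) {
        dateIndex.computeIfAbsent(entryDate, d -> new ArrayList<>()).add(key);
        entities.put(key, new Entry(key, hours, entryDate));
    }

    /** Keys-only query: scans the index and never touches the entity table. */
    public List<Long> queryKeysOnly(long from, long to) {
        List<Long> keys = new ArrayList<>();
        for (List<Long> bucket : dateIndex.subMap(from, true, to, true).values()) {
            keys.addAll(bucket);
        }
        return keys;
    }

    /** Full query: the same index scan, plus one entity lookup per matching key. */
    public List<Entry> queryEntities(long from, long to) {
        List<Entry> result = new ArrayList<>();
        for (long key : queryKeysOnly(from, to)) {
            entityLookups++;
            result.add(entities.get(key));
        }
        return result;
    }

    public int entityLookups() {
        return entityLookups;
    }

    public static void main(String[] args) {
        KeysOnlyModel store = new KeysOnlyModel();
        store.put(1L, 8.0f, 100L);
        store.put(2L, 6.5f, 200L);
        store.put(3L, 7.0f, 300L);

        System.out.println("keys only: " + store.queryKeysOnly(100L, 200L));
        System.out.println("lookups after keys-only query: " + store.entityLookups());
        System.out.println("entities: " + store.queryEntities(100L, 200L).size());
        System.out.println("lookups after full query: " + store.entityLookups());
    }
}
```

In this model the keys-only query leaves the lookup counter untouched, while the full query pays one lookup per result. The real Datastore is far more involved, but the shape of the saving is the same.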

JPA has no keys-only method for the Datastore, but it does let us query for an entire object or for some subset of its fields. If we construct this subset to contain only the primary key of our object, the App Engine implementation of JPA issues a keys-only query.

Suppose we have a TimesheetEntry entity in our application.

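import java.util.Date;

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

import com.google.appengine.api.datastore.Key;

@Entity
public class TimesheetEntry {

	// Datastore key, used as the JPA primary key
	@Id
	@GeneratedValue(strategy = GenerationType.IDENTITY)
	private Key id;

	private Float hours;

	private Date entryDate;

	// getters and setters
}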

Now let’s implement a DAO method that returns the keys of the TimesheetEntry objects with an entry date between a start date and an end date. Our JpaTimesheetDao implements a method getTimesheetEntriesInRange(). Let’s look at the code listing.

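import java.util.Date;
import java.util.List;
import com.google.appengine.api.datastore.Key;

public class JpaTimesheetDao implements TimesheetDAO {

  // getEntityManager() is not shown in this listing; it is assumed to be provided elsewhere in the DAO.
  @SuppressWarnings("unchecked")
  public List<Key> getTimesheetEntriesInRange(Date startDate, Date endDate) {
    // Selecting only the primary key lets App Engine's JPA implementation issue a keys-only query.
    return getEntityManager()
        .createQuery("SELECT id FROM TimesheetEntry entry WHERE entry.entryDate >= :startDate AND entry.entryDate <= :endDate")
        .setParameter("startDate", startDate)
        .setParameter("endDate", endDate)
        .getResultList();
  }
}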

If you look closely at the query, you will notice that we select only the “id” field of the TimesheetEntry entity. Had we selected any other field, the optimization would be lost because we would end up fetching entire entities. We used this simple performance optimization often in our application.

Written by Meetu Maltiar

September 11, 2010 at 13:25

Posted in Cloud, Java
