Published

Thu 26 February 2015

Apache Chemistry and CMIS

In this post, I will explain how you can use the Apache Chemistry APIs to query enterprise content management (ECM) systems. The Apache Chemistry project provides client libraries that make it easy to integrate with any content management product that implements the CMIS standard. As you might be aware, there are multiple standards, such as JCR and CMIS, for interacting with content repositories. Although support for SQL-92 is now available with JCR 2, I prefer the conciseness and wider adoption of the CMIS standard, and the Apache Chemistry APIs make it really easy to interact with a content repository.

I am sharing an example that you can readily test without any special environment setup. Alfresco is one of the vendors that provides an ECM product that supports the CMIS standard, and it offers a public repository for you to play with. I will cover this example piece by piece.

Connecting to the repository:

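import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.chemistry.opencmis.client.api.Repository;
import org.apache.chemistry.opencmis.client.api.Session;
import org.apache.chemistry.opencmis.client.api.SessionFactory;
import org.apache.chemistry.opencmis.client.runtime.SessionFactoryImpl;
import org.apache.chemistry.opencmis.commons.SessionParameter;
import org.apache.chemistry.opencmis.commons.enums.BindingType;

SessionFactory sessionFactory = SessionFactoryImpl.newInstance();
Map<String, String> parameter = new HashMap<String, String>();
//the credentials
parameter.put(SessionParameter.USER, "admin");
parameter.put(SessionParameter.PASSWORD, "admin");
//the binding type to use
parameter.put(SessionParameter.BINDING_TYPE, BindingType.ATOMPUB.value());
//the endpoint
parameter.put(SessionParameter.ATOMPUB_URL, "http://cmis.alfresco.com/s/cmis");
//fetch the list of repositories
List<Repository> repositories = sessionFactory.getRepositories(parameter);
//establish a session with the first repository in the list
Session session = repositories.get(0).createSession();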

To connect to the repository, you need a few basic parameters: the username, the password, the endpoint URL (in this case the REST AtomPub service), and a binding type that specifies the kind of endpoint. The valid binding types are WEBSERVICES, ATOMPUB, BROWSER, LOCAL and CUSTOM. With this information in hand, we still need a repository id to connect to. In this case, I am using the first repository from the list of repositories to establish the session. Now, let’s create a query statement to search for documents.

Querying the repository:

//query builder for convenience
QueryStatement qs = session.createQueryStatement("SELECT D.*, O.* FROM cmis:document AS D JOIN cm:ownable AS O ON D.cmis:objectId = O.cmis:objectId " +
        " where " +
        " D.cmis:name in (?)" +
        " and " +
        " D.cmis:creationDate > TIMESTAMP ? " +
        " order by cmis:creationDate desc");
//array for the in argument
String[] documentNames = new String[]{"Project Objectives.ppt", "Project Overview.ppt"};
qs.setString(1, documentNames);
Calendar now = Calendar.getInstance();
//subtract 5 years to match documents created in the last 5 years
now.add(Calendar.YEAR, -5);
qs.setDateTime(2, now);
//get the first 50 records only
ItemIterable<QueryResult> results = session.query(qs.toQueryString(), false).getPage(50);

Here I have used the createQueryStatement method to build the query for convenience. You could also pass a query string directly, but that is not recommended, because the placeholders take care of quoting the values for you. The query is essentially a join between the cmis:document type and the cm:ownable aspect. The sample shows how to pass a date (the setDateTime call) and an array for the in clause (the setString call) as parameters. The session.query call assigns the matches to an ItemIterable, where each QueryResult is a record containing the selected columns.
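To make the placeholder idea more concrete, here is a minimal sketch of roughly what happens when an array is bound to the ? of an in clause. PlaceholderSketch, quote and bindStrings are hypothetical names I made up for illustration; this is not the OpenCMIS implementation, and the real library may differ in its details.

```java
import java.util.StringJoiner;

// Hypothetical helper, NOT part of OpenCMIS: a simplified illustration of
// how a '?' placeholder can be expanded into a list of quoted values.
class PlaceholderSketch {

    // Wrap a value in single quotes, backslash-escaping quotes and backslashes.
    static String quote(String value) {
        StringBuilder sb = new StringBuilder("'");
        for (char c : value.toCharArray()) {
            if (c == '\'' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('\'').toString();
    }

    // Replace the first '?' in the statement with a comma-separated list of quoted values.
    static String bindStrings(String statement, String... values) {
        int pos = statement.indexOf('?');
        if (pos < 0) {
            throw new IllegalArgumentException("statement has no placeholder");
        }
        StringJoiner list = new StringJoiner(", ");
        for (String value : values) {
            list.add(quote(value));
        }
        return statement.substring(0, pos) + list + statement.substring(pos + 1);
    }

    public static void main(String[] args) {
        String sql = "SELECT * FROM cmis:document WHERE cmis:name IN (?)";
        System.out.println(bindStrings(sql, "Project Objectives.ppt", "Project Overview.ppt"));
    }
}
```

The point of the sketch is that a document name containing a quote cannot break out of the string literal, which is why I prefer placeholders over concatenating the query by hand.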

Iterating the results:

for (QueryResult record : results) {
    Object documentName = record.getPropertyByQueryName("D.cmis:name").getFirstValue();
    logger.info("D.cmis:name: " + documentName);

    Object documentReference = record.getPropertyByQueryName("D.cmis:objectId").getFirstValue();
    logger.info("--------------------------------------");
    logger.info("Content URL: http://cmis.alfresco.com/service/cmis/content?conn=default&id=" + documentReference);
}

As explained above, we get an iterable result set to iterate over the individual records. Because an attribute may be multi-valued, I am using the getFirstValue method of the PropertyData interface to fetch the first value from the record. Note the last logger.info call, as it contains the actual URL of the resource, which is just a base URL to which the object id of the matched document is appended.
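If you want to reuse the URL outside a log statement, a small helper along these lines could build it. This is only a sketch: ContentUrlSketch and contentUrl are hypothetical names, and encoding the id is my assumption that object ids may contain reserved characters such as ':' or '/'. The sample above appends the id as-is.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of OpenCMIS: builds the content URL
// from the base URL used in the sample and an object id.
class ContentUrlSketch {

    // Base URL taken from the log statement in the sample.
    static final String BASE =
            "http://cmis.alfresco.com/service/cmis/content?conn=default&id=";

    // Append the URL-encoded object id to the base URL.
    static String contentUrl(Object objectId) {
        try {
            return BASE + URLEncoder.encode(String.valueOf(objectId), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The id below is a made-up placeholder, not a real document id.
        System.out.println(contentUrl("workspace://SpacesStore/example-id"));
    }
}
```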

Closing the connection? According to the Chemistry Javadoc, there is no need to close a session, as it is purely a client-side concept. That makes sense, as we are not holding a connection here.

Viewing the results: To view the actual documents, just open the URLs generated by the log statement in a browser.

Building the code: Add the following dependency to your Maven pom.xml to build the sample.

<dependency>
    <groupId>org.apache.chemistry.opencmis</groupId>
    <artifactId>chemistry-opencmis-client-impl</artifactId>
    <version>0.12.0</version>
</dependency>

Wrapping up: I have covered just one example of using the CMIS Query API with Apache Chemistry to query for documents. Kindly refer to the documentation links provided in the references section for other usages. Below is the gist that contains the entire sample code.


References:
