Enterprise Content Management on the Java Platform
A peek into Java standard APIs for accessing a content repository
Apr. 7, 2005 12:00 AM
Java Web applications have long needed a standards-based API for Enterprise Content Management (ECM). ECM is an essential requirement for Web applications on the Internet, intranets, and extranets. ECM vendors offer proprietary APIs in various languages, and this has kept ECM architectures from being interoperable. JSR-170 defines a new set of APIs to standardize the interface to ECM products. It aims to make the ECM product pluggable, much as the JDBC API lets application code be independent of database products. JSR-170 has been actively supported by several ECM vendors and approved for public review. Its adoption is predicated on enterprises demanding it from ECM vendors, and it remains to be seen whether these vendors will forgo their unfair advantage. In this article, we explain the lifecycle management services associated with "content" to a Java developer building enterprise applications, focusing on the new and emerging JSR-170.

ECM is about managing the lifecycle of content in an enterprise, and managing that lifecycle requires a robust architecture. The lifecycle of content, as shown in Figure 1, begins when it is authored with some metadata. It's formally represented in digital form and uploaded to a server using Web protocols. It's then processed, which typically consists of sorting, classifying, and storing it in a form that's easy to query and search later. Content is served to an authenticated and authorized user either in isolation or merged and aggregated with other content. Not all users are interested in the same kind of content, so content has to be customized to suit individual user preferences, display device characteristics, and localization and internationalization requirements. In Figure 1, users from various domains want to access content through various devices, so the same content has to be rendered on various client devices.
Compounding that, given the many types of content and associated standards, a single general-purpose Content Management System (CMS) is seldom sufficient for an enterprise. Enterprise architectures deploy more than one ECM product, each with its own APIs for lifecycle management services, which increases complexity for developers. A unified and standardized API such as JSR-170 can simplify the task of managing content across various vendor products and frameworks.

Enterprise Content Management

Once submitted, content often has to be translated for an international audience. It may also have to be transformed based on visual formatting requirements specified in templates and style sheets. Once the content is transformed, it has to be assigned to the appropriate placeholders for dynamic rendering.

The scope of ECM can also extend to content delivery. Delivery involves the assembly of dynamic content. It also requires the construction of an index and the ability to search the site for all of its content. There may be personalization requirements, and consumers may have a preference about how they want the content to be structured. Consumers may have various authorization privileges based on their roles. Many of these content delivery requirements also apply to portal architectures, which use ECM solutions as a back-end service. The scope of this article is restricted to the Java interfaces dealing with content repositories.

Java Content Repository Model

The repository as exposed through level 1 of the JCR is a tree structure very much like the Unix file system. It comprises nodes that can have zero or more child nodes. Nodes can also have zero or more child properties. The repository should support CRUD (Create/Read/Update/Delete) operations on the nodes, and provide for assigning node types and the means to search the repository. It should be possible to retrieve and traverse nodes and properties. A path syntax has been defined to navigate the tree.
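The tree model described above can be pictured with a small in-memory sketch. This is not the javax.jcr API: the ContentNode class and its method names are hypothetical, chosen only to illustrate nodes with child nodes, string-valued properties, and slash-separated path navigation. Unlike a real repository, this sketch does not model node types, search, or persistence.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model of a content tree; NOT the javax.jcr API.
class ContentNode {
    private final String name;
    private final ContentNode parent;
    private final Map<String, ContentNode> children = new LinkedHashMap<>();
    private final Map<String, String> properties = new LinkedHashMap<>();

    ContentNode(String name, ContentNode parent) {
        this.name = name;
        this.parent = parent;
    }

    // Create: add a child node under this node and return it.
    ContentNode addNode(String childName) {
        ContentNode child = new ContentNode(childName, this);
        children.put(childName, child);
        return child;
    }

    // Create/Update: set a string-valued property on this node.
    void setProperty(String key, String value) {
        properties.put(key, value);
    }

    // Read: return the property value, or null if it is not set.
    String getProperty(String key) {
        return properties.get(key);
    }

    // Delete: remove a child (and its subtree); true if it existed.
    boolean removeNode(String childName) {
        return children.remove(childName) != null;
    }

    // Resolve a slash-separated relative path such as "articles/jsr170".
    // Returns null when any segment is missing.
    ContentNode getNode(String relPath) {
        ContentNode current = this;
        for (String segment : relPath.split("/")) {
            if (segment.isEmpty()) {
                continue; // tolerate leading or doubled slashes
            }
            current = current.children.get(segment);
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    // Build the absolute path by walking up to the root, Unix-style.
    String getPath() {
        if (parent == null) {
            return "/";
        }
        String parentPath = parent.getPath();
        return parentPath.equals("/") ? "/" + name : parentPath + "/" + name;
    }
}

class JcrTreeSketch {
    public static void main(String[] args) {
        ContentNode root = new ContentNode("", null);
        ContentNode article = root.addNode("articles").addNode("jsr170");
        article.setProperty("title", "ECM on the Java Platform");

        ContentNode found = root.getNode("articles/jsr170");
        System.out.println(found.getPath() + " -> " + found.getProperty("title"));
    }
}
```

Keeping children in a map keyed by name makes path resolution a simple walk, one lookup per path segment, which is the same intuition as resolving a path in a Unix file system.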
The repository has three layers of isolation. javax.jcr.Repository is an interface; an object implementing it represents a persistent data store. javax.jcr.Workspace is also an interface; objects implementing it serve as a private view whose activities are visible only to users in that workspace. Changes made to this view have to be committed with an explicit checkin operation. The third layer of isolation is between the workspace and the nodes (objects) in memory.

A repository is similar to the well-known Concurrent Versions System (CVS), but there are some subtle differences. JCR doesn't distinguish, and rightfully so, between content and its metadata; it's up to the application to define its preferred conventions. JCR can be implemented on top of a file system, WebDAV, an RDBMS, etc.

Figure 2 shows a high-level JCR architecture. An ECM application that's protected through JAAS retrieves a handle to a JCR Repository object using the Java Naming and Directory Interface (JNDI). It populates a Credential object by pulling attributes from JAAS and invokes the Repository object's login method. In return it receives a ticket, which is like a session. Using the ticket, it retrieves one or more workspaces. The workspace provides APIs to navigate the node tree and modify the nodes and their properties.

JCR provides APIs to copy and move nodes around. It also provides APIs to import nodes from and export nodes to external systems. A node can be serialized to an XML document. Likewise, an XML document conforming to some schema can be imported and attached to an existing tree. In a nutshell, JCR is similar to a Java DOM (Document Object Model) API with an ECM-friendly syntax.

The motivation for having two levels of API in this JSR is to let the industry adopt this complex set of APIs in phases. A JCR repository is viewed as a collection of workspaces, each of which organizes its information in a graph (or tree) structure, as shown in Figure 3.
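To make the idea of serializing a node to XML concrete, here is a minimal sketch that uses only the standard Java DOM and transformer classes. It does not use the JCR export API: the SimpleNode class is a hypothetical stand-in for a repository node, and the mapping (node name to element name, properties to attributes, child nodes to child elements) is an assumption made for illustration, not the format the specification defines.

```java
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class XmlExportSketch {

    // Hypothetical stand-in for a repository node; NOT a javax.jcr type.
    // Node and property names must be valid XML names in this sketch.
    static final class SimpleNode {
        final String name;
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<SimpleNode> children = new ArrayList<>();

        SimpleNode(String name) {
            this.name = name;
        }
    }

    // Recursively map a node to an element: properties become attributes,
    // child nodes become child elements.
    static Element toElement(Document doc, SimpleNode node) {
        Element element = doc.createElement(node.name);
        for (Map.Entry<String, String> property : node.properties.entrySet()) {
            element.setAttribute(property.getKey(), property.getValue());
        }
        for (SimpleNode child : node.children) {
            element.appendChild(toElement(doc, child));
        }
        return element;
    }

    // Serialize the subtree rooted at the given node to an XML string.
    static String export(SimpleNode root) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            doc.appendChild(toElement(doc, root));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrap checked parser/transformer exceptions to keep callers simple.
            throw new IllegalStateException("XML export failed", e);
        }
    }

    public static void main(String[] args) {
        SimpleNode articles = new SimpleNode("articles");
        SimpleNode article = new SimpleNode("jsr170");
        article.properties.put("title", "ECM on the Java Platform");
        articles.children.add(article);

        System.out.println(export(articles));
    }
}
```

Importing would run the same mapping in reverse: parse the document, walk its elements, and attach the resulting subtree under an existing node.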
Level 1 of the API defines a standard way to acquire a handle to a workspace in a repository, to authenticate to the workspace, and to access or manipulate data in a workspace at the content-element level.