|
Comments
|
Today's Top SOA Links
Enterprise Best Practices for JDBC Programming
Improving maintainability and code quality
By: Derek Ashmore
Oct. 6, 2004 12:00 AM
While many new database persistence methods for Java programmers have been developed in recent years (e.g., entity beans, JDO, Hibernate, and many others), most database access code is still native JDBC. This statement doesn't express a preference, just an observation. Reasons for JDBC's popularity include: (1) it was first, (2) it works, and (3) most developers already know it. I first documented "best practices" for using the JDBC libraries for JDJ in April 2000 (Vol. 5, issue 4). For the purposes of this article, the "best practices" goals for JDBC programming are maintainability, portability, and performance. Maintainability refers to the ease with which developers can understand, debug, and modify JDBC code that they didn't write. Portability refers to the ease with which JDBC code can be used with alternate databases. It turns out that JDBC does not make database programming as platform independent as I would like. In addition, I consider portability a noble goal even if you have no current plans to switch database vendors. Who knows how long your code will be around and what kinds of changes will have to be made to it? Performance refers to optimizing the time and/or memory needed to run JDBC code. Best Practices for JDBC Programming Utilize Connection Pooling Techniques Listing 1 assumes the presence of a J2EE implementation. Applications without access to J2EE constructs can implement connection pooling features using one of many open source products. I recommend Commons-DBCP from Apache-Jakarta (http://jakarta.apache.org/commons/dbcp/). There is no need for developers to write their own connection pooling packages these days. Connection pooling enhances performance by reducing the number of physical database connects and disconnects. Furthermore, it's common for connection pooling to have testing features that asynchronously test connections before your application needs them. In this way, pools provide applications with resilience to database outages. It's very important that all connections created are closed. Note that with connection pooling, issuing a close() merely returns the connection to the pool. Usually, it doesn't result in a database disconnect. Be Diligent About Closing All JDBC Objects Resource leaks caused by not closing JDBC objects are particularly aggravating because they may not surface until the code is run under load. In development, you may not generate enough resource leakage to cause a problem. Furthermore, some JDBC vendors override finalize() to release database resources after objects are garbage collected. If leaked objects are properly closed via finalize(), it's much harder to see the leak in development as the garbage collector corrects the leak. Under high load, JDBC objects may not be garbage collected soon enough to avoid exceeding database resources. I guard against JDBC resource leaks by creating and closing JDBC objects within the same method. For example, the method that creates a connection will also be the method to close it. Furthermore, I close these objects in a finally block to ensure that they get closed in error conditions as well. The reason many developers don't close JDBC objects in a finally block is that it's programmatically inconvenient. The fact that JDBC objects throw a checked exception on close() will cause you to encapsulate the closes in try/catch logic nested within the finally block. An example of how to effectively close JDBC objects is presented in Listing 2. To make closing a JDBC object more palatable, I usually create generic "close" utilities, like those I've included in the open source project CementJ (http://sourceforge.net/projects/cementj/), which make this close logic a one liner. For example, using a generic close utility, the close of the PreparedStatement and ResultSet objects in Listing 1 are one line: DatabaseUtility.close(results, pStmt); CementJ will check for nulls before attempting to close. It will also log exceptions received on the close, but not throw an exception (as there is typically nothing to be done anyway). Using a generic close utility, Listing 2 can be rewritten so that it's considerably shorter and less complex; Listing 2a illustrates this. For those who wish to apply this concept but don't want to add an additional product dependency, Listing 3 illustrates an implementation for a generic close utility. Check Stored Procedure Use There are really two things to consider with stored procedures: portability and performance. From a portability standpoint, stored procedures written in a proprietary language (such as Oracle's PL/SQL or DB2's SQL Procedure Language) make it much more difficult to migrate the application to another database should that become a business priority. These proprietary languages have unique features that might be difficult to replicate in other proprietary languages. If you do write stored procedures, put them in a nonproprietary language. Some database vendors are supporting stored procedures written in Java and other third-generation languages. This makes them easier to move. As to performance ramifications, the specifics can differ between databases. Stored procedures don't always perform better than embedded SQL. The soundest advice on this point is to comparatively measure stored procedure performance and embedded SQL for your database. As a general rule, CPU-intensive operations are bad candidates for stored procedures. For most types of databases, most of the performance gain comes from reducing the number of network transmissions, not from being parsed and physically stored in the database. When considering whether or not to make something a stored procedure, I ask the question: How many network transmissions will be saved by making this process a stored procedure? If the answer is zero, performance will most likely not be improved. Utilize Generated Key Value Retention Feature The JDBC 3.0 specifies that the JDBC driver should return generated values with the insert operation. This means that you don't have to code and execute an additional select statement to retrieve the key value. Use of this feature can streamline code, but may or may not reduce the overhead of issuing an additional select depending upon the specific JDBC driver implementation. A word of caution, some database vendors do not yet support this feature of the JDBC specification. To utilize the generated key feature using JDBC, execute a SQL statement that performs an insert. After execution, issue a getGeneratedKeys() from the Statement or PreparedStatement. The return will be a ResultSet with all generated keys (see Listing 4). Separate JDBC Code from Business Logic For convenience, I separate DAO objects in the package hierarchy (e.g., com.acme.appname.data or com.acme.appname.dao). Some developers also add a DAO suffix to data access object names; for example, a customer DAO might be named CustomerDAO. The primary reasons to separate data access from the rest of the application is that it's easier to switch data sources and it's easier to share DAOs between functions or applications. Medium- to large-sized businesses in particular are likely to have multiple applications using the same data access logic. For example, it's common for an application to need customer information in several different functions. The act of logging into a Web site is different than e-mailing a purchase receipt. However, both these processes need customer information. Separating the access makes the access easier to reuse. Since JDBC code is usually in separate packages, it's easier to locate should you wish to change database vendors or support multiple database vendors. As data access code is easier to find, the separation allows you to more easily determine the impact of database structure changes as well. Consider Query Fetch Size for Large Result Sets To set the query fetch size, set the fetchSize field on the Statement (or PreparedStatement or CallableStatement) before execution. Listing 5 provides an example of setting the query fetch size. (Listings 5-6 can be downloaded from www.sys-con.com/java/sourcec.cfm.) As a general rule, setting the query fetch size is only effective for large result sets. If you set the fetch size much larger than the number of rows retrieved, it's possible that you'll get a performance decrease, not increase. Furthermore, the benefit you get from increasing the fetch size diminishes the higher it's set. I typically set this value to 100 for large result sets. The performance improvement gained from setting the query fetch size varies widely depending upon the database platform and JDBC driver being used. I've seen performance improvements as large as 50%. Performance increases vary depending upon the speed of the network. Generally, the slower the network, the more performance increases can be gained by manipulating the fetch size. Consider Update Batching For example, consider a Web site for online sales. When customers create orders, they often order multiple items. When the order is recorded, usually the items on the order are recorded at the same time. Update batching allows the multiple inserts for the order to be transmitted to the database at once. Update batching is supported for SQL issued via PreparedStatement, CallableStatement, and Statement objects. An example of update batching is presented in Listing 6. As with manipulating the query fetch size, the amount of performance improvement with batching statements varies between database vendors. I've seen performance improvements as large as 92% from batching statements. Also, the slower the network, the greater the opportunity for performance improvement. Changes on the Horizon RowSets can be connected or disconnected. Connected RowSets maintain an underlying connection to the database. Disconnected RowSets allow query results to be serialized and transmitted between JVMs, possibly on different servers. An interesting feature of disconnected RowSets is that you can update the RowSet and post those changes to the originating database at a later time. Another interesting feature is that RowSets can be serialized as XML documents. As this is a new feature, providing "best practices" regarding the use of RowSets is premature. As with support for the generated keys feature, I would expect support for the RowSet feature to differ among the database vendors. Summary References SIDBBAR Core Best Practices Reader Feedback: Page 1 of 1
Your Feedback
Subscribe to the World's Most Powerful Newsletters
Subscribe to Our Rss Feeds & Get Your SYS-CON News Live!
|
SYS-CON Featured Whitepapers
Most Read This Week |
|||||||||||||||||||||||||||||||||