|
Comments
|
Today's Top SOA Links
Features How to Communicate with a DBMS
Yes, size is important
By: Chris Pollach
Aug. 10, 2009 02:45 PM
Yes size really does matter when you are talking about a client/server application, web server, or application server communicating with a database management system. Size is often overlooked when designing very large application systems that can potentially generate an excessive amount of concurrent users, heavy database transaction load, complex queries, large data stream content or some type of mixture of the above. Recently, I had the opportunity to put this aspect under the magnifying glass when tuning a client's ASE (Application Server Enterprise) database that was operating at over 1 million input / output requests per hour. The ASE server was receiving requests from PowerBuilder native windows applications, JSP and ASP.NET web pages, and EAServer (Sybase's application server) clusters. The transaction mixture was diverse, from small single row queries to complex medium result set queries and large Binary (BLOB) requests. These query transactions often progressed into an impressive amount of small to medium updates as well back to the DBMS. Add to this web activity of over 10,000 users or 500,000 sessions per hour, EAServer logging messages into its Message Queue table in ASE, and PDF files being saved and retrieved into Blob columns and you can imagine the workload on the DBMS. In this article I will outline the approach I took, the test findings and the final results of the modifications to the various application designs and the The Approach In the generic n-tier system diagram shown in Figure 1, the "destination" machine would represent the DBMS, Host3 could represent an Application or Web Server, Host 1 and 2 could be a firewall (black box and white box), and the Source machine could be an Internet- or intranet-based application. In all of the cases, whether the application is talking directly to the DBMS server or through a Web service or application business component - typically a DBMS transaction is spawned from one of these environments and the network layer is traversed. The network activity is in the form of an information package or packet that holds either an inbound request or an outbound answer. Packets are governed by the Network Operating System (NOS) and must follow the configuration restrictions imposed on them as they travel through the various topographies. Like a chain, the packets response time is only as good as its weakest link in the network path to and from the server. So before you get too focused on a possible specific packet performance bottle neck I would suggest that you converse with your Network Administrator. Ask them about the lay-of-the-land so to speak on the general topography before you get too submersed in a specific area - you might find that an old router or slow bridge could be the root of your problems. Thus, a simple network hardware change could alleviate your performance bottleneck without any application or server changes. If on the other hand, the network topography is sound, then it may be plausible that the network packet interactions may be a factor in your performance woes. To aid in tuning this area, it is important to remember that the NOS will send another packet for every request or result packet issued. It does this like an email request where the sender can request a receipt confirmation by the receiver. However, the NOS does not make this confirmation an optional item; it makes it mandatory. In Ethernet topologies, it is also possible that the packet does not even make it to the receiver as a "collision" is encountered. The packet may then have to be retransmitted because of the collision or time-out situation as the original packet may never have arrived. Based on the above scenarios, the frequency of packet time-outs and the response time for the acknowledgment packets is something that you and your network administrator need to analyze by looking at the big picture of what might be happening to your DBMS response times. Time-out situations should be addressed first as to why the NOS is spending al lot of time in this area instead of delivering the original packet effectively in the first transmission go-round. Packet Size Logistics In the packet diagram in Figure 2, we can see that in a modern packet structure there is an imposed overhead of a key routing information known as the Packet Header, the data being transmitted, and wasted space that is padded out to properly form the overall packet size. What we as a developer, DBA, or networking person can do is to try and balance the data usage so that we minimize wasted space and excessive acknowledgment packets. The above packet scenario also points out that shipping varying amounts of data may invoke steady two-packet transmissions at a 512 byte size but average out to one packet using a 1,204 byte format. Increasing the packet to 2K or higher can lead to an extra packet of basically wasted space being transmitted. Thus the ideal situation is to know your average data lengths and then set the packet size to optimize this average - especially for 80% of the common SQL transactions that your applications utilize in their daily tasks. For the best performance, choose a server packet size that works efficiently with the underlying packet size on your network. The goals are:
For example, if your network packet size carries 1500 bytes of data, setting Adaptive Server's packet size to 1024 (512*2) will probably achieve better performance than setting it to 1536 (512*3). Sybase Engineering "I started using Sybase SQL Server (back when it was actually called that) a long time ago. I'm a big fan of adjusting the packet size. The packet size you end up with should be balanced by taking into account the way the application uses the database. Applications that deal with the database by sending one or just a few rows at a time tend to do better with smaller packet sizes. PB applications often pull back many rows for each query so a larger packet size will often result in better performance. There is a point where going to a larger packet size results in decreased performance. Keep in mind that that actual packet size on the wire cannot exceed a certain amount. I don't remember the exact amount but it is less than 2K. So using a 2K ASE packet size may result in the creation of two packets on the network, not just one. Using a larger packet size within ASE will reduce the CPU that ASE needs simply because it is accumulating data into a larger buffer and flushing out fewer of those to the network. (That is a bit of a simplification.) As I recall the packet size setting must be adjusted on both the server and the client. A client can have a larger packet size setting than the client but not the other way around. Additional memory is necessary on the server when using larger packet sizes. For busy networks, a larger packet size often works better. The more packets you have to send the more chance you have of collisions on the network." The 50,000 Foot View Spotlight is basically a dashboard application that connects directly to your DBMS, operating system, etc., using the appropriate connectivity. In the case of ASE, this would be the MDA tables where the performance information is captured. This then allows Spotlight to introspect the target environment and report its findings using a "dashboard" interface. The dashboard graphically maps o t the architecture and then animates and colorizes these areas and interactions between them for you to see visually. In the Spotlight interface in Figure 3 I have outlined in red squares the two key pieces of information that I am interested in for possible packet interaction problems. These are the Problem Users (connections) and the amount of packet activity that the server is having. The problem programs are already at a "concern" level - so Spotlight has colorized the entry in yellow. When the packets received or transmitted section is having problems, the animated pipes will turn to yellow, then orange and then into red as problems develop in this area. Once you notice packetization is an escalating problem area, click the mouse on the packet pipeline and Spotlight will display a pop-up information dialogue and then allow you to "drill down" into this area for specific details (see Figure 4). To do this, just click on the "show me" drilldown and Spotlight will then allow you to look at the packet area being monitored in much greater detail. Once into the specific suspect area, you can select the problem user / connection, then highlight an application, select the Network tab page, and request the monitored information on packets. This then brings up a detail dialog outlining the current application's condition and below this the packet activity by sent and received granularity. This high-level drilldown feature can easily allow the organization to discern what applications might benefit from a detailed investigation and tuning exercise on packet usage. This drilldown feature now gets you a view from the 5,000-foot level for each application that creates a lot of packet load on the DBMS server and thus through the network (see Figure 5). Of course, this quick 50,000-foot to 5,000-foot view for problem areas can also be extended to Input/Output, caching, data buffers, etc., parts of the DBMS operations that allow the operations people to quickly pinpoint an area to monitor further or may require a detailed performance review. Spotlight will even make suggestions as to what to change in the NOS or DBMS as far as configuration parameters are concerned to help expedite a fix for the observed problem area. That is a really nice feature that I have used to tune an ASE configuration many times. For more information on the "Spotlight" product from Quest Software, please visit their website. Ground Level - The Application For PowerBuilder-based application systems, the ability to drill down to see exactly what is happening from the Packet to data buffer transfers is relatively easy to do using three built-in features: The SQL Trace feature, the Profiling Tool, and the DataWindow's SQL related events. These features enable the PowerBuilder developer to easily get "under the covers" to access more detailed information on what is happening between the application, its hand shake, and the actual DB driver (client) interface. Each of these investigation features will give you some more information on what might be happening, such as where excessive IO or CPU is being consumed that might be packet related when performing intense DML requests. As a former DBA, I also found that the SQL "projection" construction technique used could either hamper or aide in the total amount of CPU consumed in the packaging and un-packaging of the data to and from the network packet layer. For example, let's look at a fictitious table called TABLE-A that has 10 columns - I will call them COL-1 through COL-10. They were declared by the DBA in the DDL in that specific order. Most DBMS systems will then physically store the data in the same order as the column definitions in the schema. Now at execution time a DML projection is encountered that states: Select COL-10, COL-1, COL-9, COL-5 .... From TABLE-A. In this case the data is stored in COL-1 through COL-10 physical sequence, so the data has to be moved from the DBMS work buffer to the Packet buffer as one move statement for each column. This overhead may also be incurred at the client side as the data is extracted from the network Packet to the local client's data buffer. In the above SQL example, let's see what would happen if the application developer or DBA create an SQL projection statement that follows the order of the column names as declared in the Schema. So the projection statement would then become: Select COL-1, COL-2, COL-3, COL-4 .... From TABLE-A. This format then allows the DBMS server to move all the data from the work buffer to the packet buffer using one move statement. Actually in the generated pseudo code, this becomes an MVC (WorkArea, BufferArea, TotalLength) command - or basically, from this address to that address move all the data in one consecutive block. This significantly reduces the CPU time to package and un-package the data. This benefit is also witnessed on the client side as the local "C" code in the client can also optimize data movement in this way as well. SQL Trace To activate the SQL Trace, all the application developer needs to do is assign the keyword "TRACE " to the current value in the SQLCA.DBMS property of the Transaction Object within your application (see Figure 6). For example, if is currently assigned the value SQLCA.DBMS = "ASE" - then changing this to SQLCA.DBMS = "TRACE " + SQLCA.DBMS will set the property to "TRACE ASE". When the DB hand-shake driver in PB encounters this setting in the CONNECT SQL statement, it will start the SQL Trace file. You may also start an SQL Trace by adding a parameter to the DBPARM field of the SQLCA structure. The format of the command is "PBTRACE=1" and will have the same effect as using the TRACE keyword. (You have to use the TRACE keyword even when you use the DBPARM value. The DBPARM value just allows you to control when tracing is done.) Either technique will activate the internal trace mechanism invoking the PBTRAnnn.DLL to assist you in this endeavor. Note that in production, you need to deploy this extra tracing DLL for this feature to work. Once the application has executed and closed, the SQL Trace file will have a detailed account of the DML statements that were used and more importantly data lengths and timings that handled each row as the DB drivers packaged or unpackaged information to and from the network packets plus the overhead of round-trip requests to the DBMS for data services (see Figure 7). The important information from my perspective is to get the average data length and pinpoint excessive time against problematic DML statements. These measurements can indicate excessive data movement or long request times for specific result sets. The TRACE.log file is a small ASCII file - so it is also easy to write a PowerBuilder application that can parse the trace information and summarize the transactional information for you. Application Profiler In the case of profiling from the PB development environment, once you have set the Profiling Options in the Systems Options dialogue - just run the application as you would normally to gather the performance intelligence information for the various SQL activities. Once you close the application and return to the PowerBuilder development environment, you can launch the Profile Analysis tool that comes with PowerBuilder. This can be done by using the "File=>New" menu and selecting the "Tool" tab page in the resulting dialog. In that tab page should be options for launching three profile analysis tools called "Profiling Class View", "Profiling Routine View" and "Profiling Trace View". These utilities will also allow you to graph the results of the Profile Trace - in this article we are focused on the SQL activity of a given application. Running this profile analysis for the default packet size connection will give you a baseline measurement. Then, rerun the same test changing the packet size on the next and subsequent tests using the identical functionality. The comparison of these profiles should show you the IO and CPU performance improvement or degradation for each packet size compared to the base line. In Figure 9 and Figure 10, we can see the graphed difference of a 1,024 byte packet size (Figure 9) versus a 2,048 byte packet (Figure 10). The dark blue line indicates the overhead of the DB driver, which is apparent in the 1,204 byte test but drops to zero (negligible) in the 2,048 byte test. The green line indicates the application's overhead to process the data and climbs in the larger packet size test but the involuntary wait time (depicted in red) is reduced. Interestingly enough, the CPU was higher on the client side to handle the larger packet size overhead but the overall SQL performance throughput dropped from an average of 979ms to 911ms - a gain of 7% better performance over all. Examples of profile graphs of 1,024 vs 2,048 byte packet size tests DataWindow Events Once we have the problem application set up to analyze a set of DML statements that represent the various normal SQL processings, we can run this controlled test multiple times. For each test iteration, we can alter the packet size and capture the performance of each of the DataWindow's events for each packet size. Comparing this against the base line we can then compute the various gains or losses. To save yourself some time in enabling this performance gathering feature you can just utilize the (Software Tool & Die) STD Foundation Classes for PocketBuilder, InfoMaker, EAServer or PowerBuilder as the DataWindow event performance capture facility has already been implemented within the framework. You can download the appropriate product version located on the Sybase website's CodeXchange area or visit the new Foundation Classes home. For the PowerBuilder 11.5 version of the FCs, the new Transaction Object events have also been implemented to capture SQL and timing information. Changing the Packet Size Application / Web Server Again, even though the PowerBuilder components are housed within a server, the SQL trace, Profiler commands and DataWindow Events pertaining to SQL activity are still relevant here to help you capture performance information. The key is to keep the testing model static and controlled while varying only the packet size to ensure a proper comparison as you progress through the regression test suite. Test Results The processor time I found was directly related to the packaging and un-packaging of the data into and out of the network packets. By varying the packet size, you can directly influence the overhead on the workstation and the server surrounding this activity. Setting a reasonable packet size can drop the processor overhead from a modest 20% to savings topping over 40%. This drop in overhead was especially noticed on the DBMS server side as the concurrent session load increased. Freeing processing cycles here benefited all applications by facilitating better concurrent activity and faster overall response time at higher load levels. From the network packet transport layer, I was able to observe modest to significant improvements in performance as the packet sizes changed due to transmission efficiency, data packaging effectiveness, and acknowledgement packet overhead (see Figure 14). These factors can certainly influence the overall throughput in communicating to your DBMS server and back again to the application. In my specific application and DBMS profiling, the overall optimum size of the network packets was best optimized at the 7168 byte level. Performance gains at this size were very noticeable in the various client / server, service object and web applications compared to the default 512 byte level that these systems were originally using. The next best gains occurred at the 7,168, 3,564 and 2,048 byte levels. However, it is important to note that my database transaction mix was a combination of many small resultset queries combined with sporadic large queries or Blob (Binary Large Object) requests. Recommendations
In the end, I ended up changing the default Transaction Object (SQLCA) to run with a 2,048 byte packet size for all the small to medium queries - then creating a second Transaction Object set to the 7,168 byte packet size for Blob-related activity (see Figure 15). These changes improved my EAServer environment performance by reducing the CPU load and response times by 21%. I was able to change the client/server based applications as well to standardize on a 4,096 packet size that improved the average transaction response time with the DBMS by 11%. Conclusion I hope that this article has inspired you to look into this area further. I also wanted to help remove some of the mystery about network operations that we often take for granted. Now that you know how you can effect some more control over the network realm, all you need to do is to educate your management as to the possible benefits of this type of endeavor. I am also glad to report that no packets or animals were harmed in the making of this article. Reader Feedback: Page 1 of 1
Subscribe to the World's Most Powerful Newsletters
Subscribe to Our Rss Feeds & Get Your SYS-CON News Live!
|
SYS-CON Featured Whitepapers
Most Read This Week |
|||||||||||||||||||||||||||