Friday, January 20, 2012

ADO.NET Connection Pooling

Connection pooling is a technique for sharing a set of predefined connection objects among clients. Each object in the pool is instantiated upon startup and remains idle waiting for a request to serve. Just like firing up the pizza oven, the creation of the connection object is the most expensive part; pooling a set of similar objects reduces the total cost and guarantees better performance to the application.

ADO.NET and .NET data providers make connection pooling mostly transparent to developers. However, the benefits of connection pooling can be dramatically cut down, or even cancelled out, by poorly chosen connection string parameters or poor Data Access Layer (DAL) design. 

In this article, I'll review the characteristics of the main connection classes in ADO.NET and examine some DAL programming patterns to ensure a good balance of pooling and security. 

Connection Model 

The connection represents an open and unique link to a data source. In a distributed system, this often involves a network connection. Depending on the underlying data source, the programming interface of the various connection objects 

maydiffer quite a bit. A connection object is specific to a particular type of data source, such as SQL Server and Oracle. Connection objects can't be used interchangeably across different data sources, but all share a common set of methods andproperties grouped in the IDbConnection interface. 

In ADO.NET, connection objects are implemented within data providers as sealed classes (that is, they are not further inheritable). This means that the behavior of a connection class can never be modified or overridden, just configured through properties and attributes. 

In ADO.NET, all connection classes support connection pooling, although each class may implement it differently. Connection pooling is implicit, meaning that you don't need to enable it because the provider manages this automatically. 

The connection object is much simpler in ADO.NET than it was in earlier versions like ADO 2.x. It drops some of the functions like the capability of executing commands (method Execute) and reading schema information (method OpenSchema). 

ADO.NET Connection Classes 

ADO.NET comes with four data providers: SQL Server 7.0 and beyond, Oracle, ODBC data sources, and OLE DB data sources. Each of these has its own connection class with a 
common layer of functions plus specific features. The SqlConnection class represents a connection to a SQL Server database, version 7.0 or higher. In ADO.NET 2.0, this class also represents a connection to the newest version-SQL Server 2005. 

The OracleConnection class connects to an Oracle database, while OdbcConnection represents an open connection to a data source set through an ODBC driver. Finally, OledbConnection establishes a link with a data source via an OLE DB provider. 

All connection classes use the ConnectionString property to specify the data source of choice and configure the runtime environment. The connection string can be set either through the aforementioned string property or passed on to the class 

constructor. Each connection class supplies a bunch of read-only properties to extract details out of the connection string. The Database property gets the name of the database to be used after the connection is opened. The DataSource property 

indicates the database server through any of the following: file name, server name, or network path. It is important to notice that these detail properties are read-only; their value can only be set through the connection string. 

The property ConnectionTimeout gets the timeout value for the physical connection to take place. The default timeout is 15 seconds. A value of 0 means that the application never times out and will indefinitely wait for the connection to open; 

while acceptable, this setting should be avoided in practice to prevent the application from hanging. The connection timeout can be set through the ConnectionString property. The Oracle connection class doesn't support any timeout and always 

returns 0-meaning that no maximum wait time is set on an attempt of connection before an error is generated. 

Likewise, the Oracle connection class doesn't support the ChangeDatabase method, which other classes use to dynamically change the database for open connections. 

In addition to the properties and methods defined by the IDbConnection interface-all classes must implement it-each connection class can provide custom members exposing functionalities specific to the underlying DBMS. The SQL Server provider has three properties: Packet Size, Workstation Id, and ServerVersion. 

ServerVersion gets the major.minor.build version number of the current instance of SQL Server. WorkstationId contains the network name of the client accessing SQL Server. PacketSize gets the size in bytes of the network packets used to 

communicate with an instance of SQL Server. The default size is 8192 bytes. A different value can be set through the connection string using the Packet Size attribute. Feasible values range from 512 to 32767 bytes. 

The SQL Server .NET data provider manages pooling using internal classes, as does the Oracle data provider. As you'd expect, OLE DB and ODBC data providers implement connection pooling using the ODBC and OLE DB infrastructure. 

Managing a Connection 

Each connection pool is associated with one distinct connection string; the connection string is parsed using an exact-match algorithm. This means that an additional white space, or even a different order of the same attributes, leads to distinct pools. The following code creates two distinct pools. 

Upon creation, the pool is filled until the maximum size is reached. Connections are created and opened sequentially. To open a connection, you use the Open method. It takes no arguments and draws a connection object from the pool that matches the connection string, if any. (If not, a new pool is created.) The Open method looks for a usable connection. A connection is reckoned usable if it is currently unused, has a matching transaction context or is not associated with any transaction context, and has a valid link to the server. 

If no usable connection is found, the request is queued and served as soon as a valid connection is returned to the pool. Connections return to the pool only if explicitly closed. To close a connection, you can either use the Close or Dispose 

method. (Dispose ends up calling Close to cancel pending commands and roll transactions back.) The Close method releases the connection object to the pool. If connection pooling is disabled, any call to Close terminates the connection instead. 

Connections are removed from the pool in case of a timeout or if a severe error occurs. You should never rely on the garbage collector to return a connection to the pool. If the connection object goes out of scope, the underlying connection to the server is not automatically closed, meaning that any connection must always be explicitly closed Returning a connection to the pool makes the object reusable for another call; the underlying connection to the database is not severed, though. As mentioned, this happens only if a severe error occurs. With SQL Server, if an exception is thrown while a command is being executed, the connection remains open only if the severity level is low (less than 20). Beyond level 20, the connection gets automatically closed. Next, the user can reopen the connection and continue with the application. 

As mentioned, connections are assigned also based on the transaction context. The context of the requesting thread and the assigned connection must match. Each pool is subdivided 
nto connections with no associated transaction context, and into various groups each filled with connections with a particular transaction context. When a connection is closed, it returns into the appropriate group based on its transaction 

context. This mechanism allows you to close the connection even though a distributed transaction is still pending. Later, you retrieve the connection with the proper transaction context and can commit or rollback the distributed transaction. 

Golden rules for effectively using connections and pooling can be summarized as follows: 

•   Enable connection pooling 
•   Store connection strings securely 
•   Open connections late and close them early 
•   Design your data access code thinking in terms of user roles, not user accounts 


Thanks,
Anil

No comments: