Efficiently Using ADO.NET - Part II
Maximizing the Power of Expressions
In Part I of this article, I showed you a few ways that you can maximize the power of ADO.NET objects like the DataView to simplify your code, negate the need to constantly query your database and provide functionality that may be otherwise difficult to achieve. In this article, I want to build upon that base, and further explore some advanced uses of Expressions.
The first example I showed you of using expressions entailed creating a separate DataColumn and setting its expression to concatenate a few existing fields. While the expression column is well suited to this task, it only scratches the surface of its power.
Expressions can be applied in multiple scenarios and can be applied to a DataView's rowfilter for example. You may recall from SQL that the value Null doesn't equal anything, not even itself. As a matter of fact, many professors will not allow you to define Null as equal to nothing because Null is nothing and Null doesn't = Null. Anyway, in SQL if you used the following statement in a table where every records "Last_Name" column was Null, you'd return no records:
SELECT * FROM myTable
WHERE Last_Name = NULL |
If you wanted to get this to work, you'd need to rewrite it using this syntax:
SELECT * FROM myTable
WHERE Last_Name IS NULL |
This is the ANSI SQL standard way of checking for null, although lame 'database' systems like Access (Although I would take issue with including Access in a list of real databases) freestyle with the syntax and use ISNULL. Anyway, IS NULL isn't supported in the Expression Syntax but the ISNULL function (just like T-SQL) is supported. So, if you want to check for null, the best way to do it is to use ISNULL to assign Null values a specific constant, and then set your .RowFilter either equal to that constant or not equal to that constant. Here's an implementation I used to filter by employee type. Since this particular filter only included employees that are still employed (which we infer by a Null Termination_Date), here's how I did it:
Private Sub LoadList()
Try
lstEmployees.BeginUpdate()
If optActive.Checked Then
dv.RowFilter = "IsNull(Termination_Date, '01/01/1900') = '01/01/1900'"
ElseIf optFormer.Checked Then
dv.RowFilter = "IsNull(Termination_Date, '01/01/1900') <> '01/01/1900'"
ElseIf optTemp.Checked Then
dv.RowFilter = "Emp_Type = 'Temporary' AND IsNull(Termination_Date, '01/01/1900') = '01/01/1900'"
ElseIf optContractor.Checked Then
dv.RowFilter = "Emp_Type = 'Contractor' AND IsNull(Termination_Date, '01/01/1900') = '01/01/1900'"
End If
lstEmployees.SelectedIndex = -1
Catch aex As ArgumentException
Debug.Assert(False, aex.ToString)
Catch nex As NullReferenceException
Debug.Assert(False, nex.ToString)
Finally
lstEmployees.EndUpdate()
End Try
End Sub |
I have four option buttons and corresponding to Acitve employees, Former employees (specified by the Not Equal), Temporary and Contractor. If I want to show all current employees, I will use IsNull to change all null Termination_Date 's to '01/01/1900' (Since we don't currently employ anyone older than 100 years old) and set the .Rowfilter of the dataview to match it if I want current employees, or to not match it if I want former employees. As you can see, you can link conditions with an "AND" modifier to include multiple conditions similar to the way SQL works. You may notice the .BeginUpdate and .EndUpdate calls on lstEmployees (which is a listbox control). This is simply done to improve the refresh speed of the combobox and to minimize flicker. Note that the ListBox is bound to the dataview (dv) who's filter we are manipulating. As such, you don't need to call .Clear or .Add on the Listbox's Item(s) collection to modify what you see in it.
Aggregates:
As you can see, using Expressions as described above give you one more mechanism to avoid making unnecessary trips to the database or pulling over fields from the DB which depend on functions and which can complicate your Update/Insert/Delete logic. Another area where you can accomplish this is with aggregates. Let's say that we have an Employees table with EmployeeID (an Integer) as its Primary Key. Then you have a Leave Table which records days used by employees and has a caculated field which returns an integer value representing the number of days an employee was out. Lets call this column DaysUsed and assume that the table relates back to Employees on EmployeeID.
So a typical situation might ential having the Employee's name, Last_Name and First_Name (which I combined in the last article with an expression column, setting the combobox's ValueMember to EmployeeID). Hence, when you click on an Employee in the Combobox, the SelectedItem value will be the three fields concatenated, but the SelectedValue will be the EmployeeID. Lets say that you had a TextBox control with a Label above it indicating "Total Leave Used". When the user changes employees in the listbox, one way to get the total leave might be to use an Aggregate Query with EmployeeID as a Parameter and fire an ExecuteScalar method of the command object each time. Your SQL Statement might be something like:
CREATE PROCEDURE usp_GetTotalLeave
@EmployeeID AS INTEGER
AS
SELECT SUM(DaysUsed)
FROM LeaveTable
Where EmployeeID = @EmployeeID |
Then use something like:
Public Function TotalLeaveUsed() As Integer
Dim cn As New SqlConnection("ConnectionString")
Dim cmd As New SqlCommand("usp_GetTotalLeave", cn)
cmd.Parameters.Clear
cmd.Parameters.Add("@EmployeeID", CType(lstEmployees.SelectedValue, Integer))
cmd.CommandType = CommandType.StoredProcedure
If cn.State <> ConnectionState.Open Then cn.Open()
Dim i As Integer = cmd.ExecuteScalar
If cn.State <> ConnectionState.Closed Then cn.Close()
End Function |
(I'm leaving out exception handling for the sake of brevity, but the overall concept is still the same).
Then in the listbox's SelectedIndexChanged event handler, you'd do something like:
tbTotalLeaveUsed.Text = CType(TotalLeaveUsed, String)
Once again, every time the user clicked on the Listbox erroneously, we'd waste a trip to the db. Moreover, even if they didn't, we'd create a new Command and Connection object at a minimum and require a trip to the database each time the user clicked around. This is as wasteful as anything I can think of. Also, if we temporarily lost our connection to the database or the network, every time we clicked on the Listbox, we'd generate an exception and the application would effectively be unusable until we got out connection back. As such, operating in a OffLine mode would be impossible. So to use this approach, we'd use more client side resources, more server side resources, more network resources and limit our functionality. We'd also have a LOT more code just to accomplish what could be done with one line (I'm going to use two for the sake of illustration, but you could easily combine this into one:
Dim x As Integer = CType(Ds1.Tables("LeaveTable").Compute("SUM(TotalLeaveUsed)", "EmployeeID = " & lstEmployees.SelectedValue.ToString), Integer)
tbLeaveUsed.Text = x.ToString |
So what we did was call the Compute method of the DataTable (which was LeaveTable in our example). To use it, you just call the aggregate function (SUM, COUNT, AVG etc) and a Filter expression. So essentially the above statement would read "Give me the Sum of all of the TotalLeaveUsed Values in LeaveTable (where LeaveTable is a DataTable in Dataset Ds1) where the EmployeeID Field equals the employee number we got from the Listbox". You could opt not to use DisplayMember and Valuemember and then parse out the number, or use the oh so ugly Val function (I actually did this when I was first learning .NET, but that's because I didn't know any better, do yourself and everyone else a favor and use ValueMember/SelectedValue) but there's no need to. You don't even need to base the Filter on a control although in most practical situations you'd want to. Also, I'd like to point out that I have a DataRelation object to LeaveTable, a BindingContext that restricts the records that's displayed by it and a grid that only shows the LeaveTable records for the given employee. That grid is bound and the fact that I call Compute on it has no bearing and doesn't interfere with the working or visual display of the grid. As such, this can be used without any risk of interfering with other functionality.
Now other than making a trip back to the db, you could do this another way: You could walk through the datatable row by row and then use a counter variable to add each TotalLeaveUsed field matching your EmployeeID. At a minimum, you'd need a For loop (either For Each of For x Loop) or while loop, a counter variable, and you'd need to evaluate each row in the datatable. On a fast machine with lots of RAM this may not be an issue. However, if you have any experience with Database administration and performance tuning, you ultimately want to get away from Full Table Scans. If you use this approach, essentially you are performing a full table scan on your own. Granted it's still a LOT better than requerying the database, but it's more code, harder to read, will require more safety checks (you'll need to check for Nulls or risk a null reference exception when you reference the field) and essentially just be reinventing the wheel only to make it slower and uglier. The bigger the table, the longer it will take to evaluate and you may run into the situation where code that performs well on development machines is very slow on the customers machine. That's a lot of downside just to keep a bad habit, don't you think?
Select:
If you wanted to find a row or rows that matched a condition, we've already looked at using the DataView's rowfilter to accomplish it. However, there are other methods you can use (like Find, FindRows and Select) which will get you to the same place. Moreover, you may or may not have a DataView in the first place, and there's no need to create one just to filter some rows.
So, let's say we wanted to see all of the records in LeaveTable for a given employee. Currently the BindingContext takes care of filtering this for us, so when we click on a different record, only the related records show up in the Leave Grid. However, this took a DataRelation object and a BindingContext. What if we just needed to see these records programmatically, without the user clicking around.
Using the Select statement (which is basically Compute for non aggregated values) it's quite easy:
Dim LeaveRows() As DataRow
LeaveRows = Ds1.Tables("LeaveTable").Select("EmployeeID = '" & CType(lstEmployees.SelectedValue, String) & "'") |
To walk through the array, you can use a simple For each loop and then just specify the Column Index or ColumnName after referencing the row:
For Each OneRow As DataRow In LeaveRows
'Column 0 is EmployeeID, you can use whatever one(s) you want
Debug.WriteLine(CType(OneRow(0), String))
Next |
The Select method returns an array of DataRows matching your selection statement. It's important to note that you do need to use single quotations to wrap your expression in. Depending on the criteria, there may or may not be any rows returned. However, you can walk through the DataRows array and do whatever you want, bind it to a control, use it to perform calculations or anything else. The ultimate point is that if you wanted to just look at a subset of records, you DON'T Need to requery the database. The reasons you don't want to do this are the same as above, but hopefully I don't need to make that point any further.
One last thing is I'd like to cover is checking the RowState. In the last article, I touched upon this, but you can accomplish the same with a Datatable and the .GetChanges method. If you call GetChanges, you need to put the changes in another Datatable. So what you now have is a Datatable with only the changes. However, only rows with RowState marked as Modified, Deleted or Inserted will appear. If you call Remove on a Row, it's no longer in the Rows collection and will not show here. It's important to remember that "Deleting" a row and "Removing" a Row are not the same thing although that's an easy assumption to make. However, if you have valid Delete logic in your DataAdapter, rows marked Deleted will disappear from your database after calling DataAdapter.Update, rows that are Removed won't. Anyway, there are two overloads to get changes. The first takes no parameters and will return all changed rows. The second takes a RowFilter argument and will return only the rows with the specified rowstate. There are many reasons you'd want this functionality, but mainly it's to let users "See" what they've done before submitting changes permanently.
So, the get the changes of a DataTable you'd do the following:
Dim dtMyChanges As DataTable = Ds1.Tables("LeaveTable").GetChanges()
'If 10 rows were changed in LeaveTable, I'd have 10 rows in dtMyChanges. |
I can pass this dataTable to my dataadapter for an update for instance, or if a user didn't want to make submit some changes, I could pop up a grid with dtMyChanges as the datasource, then Remove any rows they didn't want to submit. Here I'd use Remove b/c I don't want the changes submitted.
Similarly, I could use the overloaded constructor to get only rows that have been modified:
| Dim dtModifiedOnly As DataTable = Ds1.Tables("LeaveTable").GetChanges(DataRowState.Modified) |
Then I could pass this table to my update or do whatever else with it. In this case, I'd only submit modified rows, not inserts or deletions. This might be good to use in a scenario where you want your changes reflected immediately, (so you'd use this frequently calling update) but you want to wait until the session end to deal with Deleted or Inserted records. You may never need this functionality, but it's really handy if you need it. Moreover, just about everywhere I've worked has required some special treatment for deleted rows, either for audit trails or so mistakes can be undone. Many even required deletions be entered into a separate table, and by using DataRowState.Deleted , you could pass this to a different routine that uses it to insert records into a "DeletedRecords" table.
Hopefully by now you are beginning to see how powerful ADO.NET really is and how you can increase performance and save scarce resources by programming it the way it's intended to be used. It's a big deviation from former data access models most of us are used to and bad habits are hard to break. I'd guess that there are more people out there trying to shove old ADO methods into ADO.NET than using it correctly. I know that I used just about every one of the 'wrong' ways of data access when I first started. And they are writing a lot more code than they need to, debugging a lot more code than they need to, and writing apps that either eat more memory than they need to or run a lot slower than they need to (and in most cases, probably both), writing less secure apps than and in general causing a lot more unnecessary work than they should.
In the next piece, I'm going to walk you through creating an application that will function in an Offline mode (in case your database goes down for a while or you are on a mobile device that doesn't always have connectivity). We've already laid much of the groundwork for this, but I'd like to put a few finishing touches on it. From there, I'm going to conclude the series with some tips and tricks to fill in missing functionality in ADO.NET.