LINQ (Language INtegrated Query) is a SQL-like query feature built into Visual Basic (C# has its own version that's similar) that lets you perform queries on data contained in the program.
You can group LINQ features into three main categories:
- LINQ to Objects - Provides features for selecting data out of objects such as arrays and lists.
- LINQ to XML - Provides features for reading and writing XML data.
- LINQ to SQL - Provides features for querying SQL Server databases by using Visual Basic query syntax rather than database objects.
LINQ to Objects is easiest to understand because it doesn't require any knowledge of XML or relational databases so I'll focus on describing its features. The others are similar, ... sort of...
LINQ to Objects gives you the ability to query data stored in program objects, in particular objects that implement IEnumerable such as arrays and lists. The result of the query is a new IEnumerable that you can iterate through to process the results.
For example, suppose you have a List(Of Customer) named all_customers. Then the following Visual Basic code finds Customers with AccountBalance less than zero and returns a new IEnumerable that holds objects containing the customers' first and last names. The code then loops through the result displaying the customers' names in the Immediate window.
Dim query = From cust In all_customers _
            Where cust.AccountBalance < 0 _
            Select cust.FirstName, cust.LastName
For Each c In query
    Debug.WriteLine(c.FirstName & " " & c.LastName)
Next c
You can find lots of information about LINQ on the Web (see the Links section at the end of this article) so I won't go into the syntax here.
When Visual Basic encounters this type of query, it doesn't actually execute it. There is no query engine in the same sense that relational databases have engines to execute SQL. Instead Visual Basic converts the query into a series of function calls to do all of the work. Almost all of the other features added to Visual Basic 2008 were added to support this conversion.
To write the functional equivalent of this query, Visual Basic invokes the new Where method provided by the all_customers list. Here's where it gets strange because all_customers is a List(Of Customer) and the List class doesn't have a Where method. To make this call possible, Visual Basic must add a Where method to the List class.
Rather than simply surgically implanting a new Where method in List, Visual Basic 2008 supports something called extension methods.
An extension method is a method added to a class from outside of the class. Instead of surgically implanting the new method inside List, the LINQ library grafts it on the outside.
You can also write extension methods. To do so, you decorate the method with the Extension attribute. The method's first parameter gives the class that the method extends. For example, the following code adds a MatchesRegexp method to the String class.
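Imports System.Runtime.CompilerServices
Imports System.Text.RegularExpressions

Module StringExtensions
    ' The Extension attribute marks this as an extension method.
    ' The first parameter's type (String) is the class being extended.
    <Extension()> _
    Public Function MatchesRegexp(ByVal the_string As String, _
            ByVal regular_expression As String) As Boolean
        Dim reg_exp As New Regex(regular_expression)
        Return reg_exp.IsMatch(the_string)
    End Function
End Module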
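Here is a minimal sketch of how calling code might use such a method. The Program module and the ZIP code pattern are my own illustration, not part of the original example. It shows that an extension method can be called either with the "grafted on" syntax or as an ordinary module function.

```vb
Imports System
Imports System.Runtime.CompilerServices
Imports System.Text.RegularExpressions

Module StringExtensions
    <Extension()> _
    Public Function MatchesRegexp(ByVal the_string As String, _
            ByVal regular_expression As String) As Boolean
        Return Regex.IsMatch(the_string, regular_expression)
    End Function
End Module

' Hypothetical calling code, for illustration only.
Module Program
    Sub Main()
        Dim zip As String = "80301"

        ' Extension-method syntax: looks like a String method.
        Console.WriteLine(zip.MatchesRegexp("^\d{5}$"))    ' True

        ' The same call written as a normal module function.
        Console.WriteLine(StringExtensions.MatchesRegexp(zip, "^\d{5}$"))    ' True
    End Sub
End Module
```

The second call is a useful reminder that the method really lives in StringExtensions, not in String.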
This is a fairly cool and powerful new tool because it lets you add new features to existing classes such as String without opening up those classes.
Unfortunately it also has the potential to make the code much harder to understand. If you add a whole bunch of new features to the String class, other developers must remember that you wrote them and that they did not come in the String class itself. If there is a problem, they need to remember where to look for help. For example, if String.Contains is acting up, you're in trouble because that's part of the String class. If String.MatchesRegexp is broken, you need to recall that someone on the project wrote this method and that you might be able to fix it.
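Note that there's also already a way to add new features to a class: subclassing. For a class that allows inheritance, you could derive a new class and add whatever new features you need. The fact that your code uses the derived class is a reminder that you added code to it. (This doesn't work for String itself because String is NotInheritable, so a RegexpString class would have to wrap a String rather than inherit from it.)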
Subclassing probably would not have worked well for LINQ, however, because Microsoft wants you to be able to use LINQ with arrays, lists, and other IEnumerable objects. Forcing you to use new LinqArrays, LinqLists, and so forth whenever you want to use LINQ would be a major hassle. So we get extension methods.
I urge caution, however. Don't go overboard using extension methods for every little thing. They can be very powerful but they do add a new source for confusion.
Recall again the previous LINQ query:
Dim query = From cust In all_customers _
            Where cust.AccountBalance < 0 _
            Select cust.FirstName, cust.LastName
For Each c In query
    Debug.WriteLine(c.FirstName & " " & c.LastName)
Next c
What data type does the query return? It returns some sort of IEnumerable thing but what's inside it? In this example, the Select clause picks out the customers' FirstName and LastName values. So what are the objects stored inside the IEnumerable?
Similarly what data type is the looping variable c? It is the same as the type of object contained in the IEnumerable but what is that? (I'll get to why its data type is not declared shortly.)
Microsoft could have made these things generic Objects. Or they could have made them some sort of collection like a DataRow that contains generic Objects. Unfortunately generic Objects are less efficient than strongly typed objects.
To try to improve performance, Visual Basic 2008 creates a new data type for these objects. Exactly what the type looks like is strange and mysterious, and you really don't want to have to decipher it in your code.
To work around this puzzle, Visual Basic 2008 allows anonymous types. An anonymous type is one that is created by Visual Basic for this very purpose. You never need to know its name so it's called anonymous.
To allow the program to use these kinds of anonymous types efficiently, Visual Basic 2008 can infer a variable's data type by looking at the value to which it is assigned. This new feature is called inferred data types and is only possible if you turn on Option Infer (it is on by default).
For example, the following statement defines variable txt. Because it is assigned a String value, Visual Basic infers that it must be a String.
Dim txt = "Inferred String"
In the previous LINQ query, the variable named query and the looping variable c both have inferred data types.
Inferred data types are handy for LINQ but they also provide a huge opportunity for creating sloppy, confusing, and inefficient code. Explicitly declaring variable data types makes the code easier to read and understand, and it prevents nasty surprises. Look at the following code and try to guess what data types each of the variables have.
Dim a = "ABC" ' Easy: String
Dim b = 12
Dim c = 1.2
Dim d = b
Dim e = b + c
Dim f = a + b
Dim g = b ^ b
Give up? Here are the answers:
- a is a String. That's easy.
- b is an Integer. Less easy but makes sense.
- c is a Double. Why isn't it a Single? Or Decimal? Just because. You're going to have to remember this rule.
- d is easy if you remember what b is: Integer.
- e is a Double. You can figure this out if you know the promotion rules and you know that c is a Double.
- f causes an error because "ABC" cannot be converted into a Double. Why Double? Who said anything about a Double?
- g is a Double. You can figure this out if you know that b is an Integer and that ^ always returns a Double.
Sorry but not all of this is obvious.
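If you want to check these answers yourself, the following sketch prints the inferred type of each variable. The module name is my own, and I left out f because that line fails (with Option Strict On it doesn't even compile).

```vb
Option Strict On
Option Infer On

Imports System

Module InferenceDemo
    Sub Main()
        Dim a = "ABC"
        Dim b = 12
        Dim c = 1.2
        Dim d = b
        Dim e = b + c
        Dim g = b ^ b

        ' GetType reports the type the compiler inferred.
        Console.WriteLine(a.GetType().FullName)    ' System.String
        Console.WriteLine(b.GetType().FullName)    ' System.Int32
        Console.WriteLine(c.GetType().FullName)    ' System.Double
        Console.WriteLine(d.GetType().FullName)    ' System.Int32
        Console.WriteLine(e.GetType().FullName)    ' System.Double
        Console.WriteLine(g.GetType().FullName)    ' System.Double
    End Sub
End Module
```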
I don't like Option Infer because it makes the code less transparent. If you turn Option Infer off, however, then to get the most out of LINQ you also need to turn Option Strict off. With Option Infer off and Option Strict on (which is a good practice), the following code fails and says "Option Strict On requires all variable declarations to have an 'As' clause."
Dim query = From ...
My advice is to keep Option Infer Off and Option Strict On as much as possible. Change these settings only when you are working with LINQ and keep LINQ modules as small as you can to give these options as little chance to hurt you as possible.
In order to make building and initializing objects of anonymous types easier, Visual Basic now provides a With keyword that lets you initialize a new object's public variables. For example, the following code creates a new Customer object and sets its FirstName and LastName properties.
Dim cust As New Customer _
With {.FirstName = "Rod", .LastName = "Stephens"}
I often write constructors to make building and initializing new objects easier but the new With keyword makes a lot of this unnecessary (although it's somewhat less compact than using a constructor because a constructor doesn't require you to give the properties' names).
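The same With syntax also builds an anonymous type if you leave out the class name. The following is a small sketch, assuming Option Infer is on (it has to be, because there is no type name you could write in an As clause). The module and variable names are mine.

```vb
Option Strict On
Option Infer On

Imports System

Module AnonymousTypeDemo
    Sub Main()
        ' No class name after New, so the compiler generates an anonymous type.
        Dim person = New With {.FirstName = "Rod", .LastName = "Stephens"}

        ' The members are strongly typed Strings.
        Console.WriteLine(person.FirstName & " " & person.LastName)

        ' Prints the compiler-generated type name, which you should never rely on.
        Console.WriteLine(person.GetType().Name)
    End Sub
End Module
```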
Recall again the previous LINQ query:
Dim query = From cust In all_customers _
            Where cust.AccountBalance < 0 _
            Select cust.FirstName, cust.LastName
For Each c In query
    Debug.WriteLine(c.FirstName & " " & c.LastName)
Next c
Visual Basic translates this code into a call to the all_customers list's Where method. The Where method takes as a parameter a function that it can call to determine whether a particular Customer object in the list should be included in the result. Visual Basic could build this function and pass a delegate to it into the Where routine. To make things easier on the LINQ developers, Visual Basic now provides lambda functions or inline functions.
An inline function is one that is specified directly in the code that uses it. It is not given a name or a separate existence of its own. It is used and then forgotten so other parts of the program cannot use it.
For example, the following code calls the all_customers list's Where method (yes, you can call these functions directly rather than letting Visual Basic build them from a LINQ query) passing it an inline function.
Dim query = all_customers.Where(Function(c As Customer) c.AccountBalance < 0)
The inline function takes a Customer as a parameter and returns True if the Customer's balance is less than zero. The query returns an IEnumerable containing the Customer objects that have balances less than zero.
You can also assign an inline function to a variable and then pass the variable into the Where method as in this example:
Dim owes_money = Function(c As Customer) c.AccountBalance < 0
Dim query = all_customers.Where(owes_money)
Inline functions are occasionally helpful, particularly for very simple functions. It is easier to pass an inline function into another routine as in these examples rather than writing a separate function, declaring a delegate variable, setting the variable equal to the function's address, and then passing the variable in to the routine.
But exercise some caution when using inline functions. If the function is complicated or confusing, it may be better to make it a separate function. That will also make setting breakpoints in the function and debugging it a lot easier.
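To tie the pieces together, here is a rough sketch that runs the LINQ query and an approximately equivalent pair of method calls side by side. The Customer class and the sample data are assumptions I made up for the example, and the method-call version is only my approximation of what the compiler generates, not its exact output.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical Customer class; the real one is not shown in this article.
Public Class Customer
    Public FirstName As String
    Public LastName As String
    Public AccountBalance As Decimal
End Class

Module QueryTranslationDemo
    Sub Main()
        Dim all_customers As New List(Of Customer)
        all_customers.Add(New Customer With _
            {.FirstName = "Ann", .LastName = "Archer", .AccountBalance = 100D})
        all_customers.Add(New Customer With _
            {.FirstName = "Ben", .LastName = "Best", .AccountBalance = -24.54D})

        ' Query syntax.
        Dim query1 = From cust In all_customers _
                     Where cust.AccountBalance < 0 _
                     Select cust.FirstName, cust.LastName

        ' Roughly the same thing written as extension method calls
        ' that take inline functions and build an anonymous type.
        Dim debtors = all_customers.Where( _
            Function(c As Customer) c.AccountBalance < 0)
        Dim query2 = debtors.Select( _
            Function(c As Customer) New With {c.FirstName, c.LastName})

        For Each c In query1
            Console.WriteLine(c.FirstName & " " & c.LastName)    ' Ben Best
        Next c
        For Each c In query2
            Console.WriteLine(c.FirstName & " " & c.LastName)    ' Ben Best
        Next c
    End Sub
End Module
```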
I'm not sure exactly how partial methods fit in but I think they were added to make LINQ easier. A partial method is basically an empty method declaration. The method must be a subroutine, must be Private, must include the Partial keyword, and must have an empty method body. For example:
Partial Private Sub LogMessage(ByVal msg As String)
End Sub
Later the code can define the method's body. The definition must match the previously declared signature but cannot include the Partial keyword.
The only thing this really does is define the method's signature. If the program never defines the method's body, then code calling the method does nothing.
This feature is similar to defining a method in a class that doesn't do anything but then allows you to override it in a subclass. (Originally you were also supposed to be able to provide a default implementation if no other method body was defined but Microsoft hasn't had time to do that so it's been dropped from Visual Basic 2008.)
Partial methods are really intended to support automatically generated code so you're not really supposed to use them. I don't really see a great benefit anyway.
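For what it's worth, here is a small sketch of how the two halves fit together. The Invoice class and its Submit method are hypothetical names I picked for illustration. The first half plays the role of generated code and the second half is the hand-written part.

```vb
Imports System

' Part 1: imagine this half came from a code generator.
Partial Public Class Invoice
    ' Declaration only: Partial, Private, a Sub, and an empty body.
    Partial Private Sub LogMessage(ByVal msg As String)
    End Sub

    Public Sub Submit()
        ' Does nothing unless a body is supplied elsewhere.
        LogMessage("Invoice submitted")
    End Sub
End Class

' Part 2: hand-written code that supplies the body.
Partial Public Class Invoice
    ' Same signature, but no Partial keyword on the method.
    Private Sub LogMessage(ByVal msg As String)
        Console.WriteLine("LOG: " & msg)
    End Sub
End Class

Module PartialMethodDemo
    Sub Main()
        Dim inv As New Invoice()
        inv.Submit()    ' LOG: Invoice submitted
    End Sub
End Module
```

If you delete the second half, the program still compiles and Submit simply does nothing.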
To support LINQ to XML, the System.Xml.Linq namespace defines new XML classes. These are separate classes from those in the System.Xml namespace but they serve very similar purposes. The following code defines old- and new-style XML element variables.
Dim old_element As System.Xml.XmlElement
Dim new_element As System.Xml.Linq.XElement
The new classes have several useful features. One of the most obvious is that Visual Basic 2008 allows you to initialize XML objects by including actual XML code within the Visual Basic code. The following code initializes an XElement variable.
Dim x_all As XElement = _
    <AllCustomers>
        <PositiveBalances>
            <Customer FirstName="Dan" LastName="Dump">117.95</Customer>
            <Customer FirstName="Ann" LastName="Archer">100.00</Customer>
            <Customer FirstName="Carly" LastName="Cant">62.40</Customer>
        </PositiveBalances>
        <NegativeBalances>
            <Customer FirstName="Ben" LastName="Best">-24.54</Customer>
            <Customer FirstName="Frank" LastName="Fix">-150.90</Customer>
            <Customer FirstName="Edna" LastName="Ever">-192.75</Customer>
        </NegativeBalances>
    </AllCustomers>
These objects also allow LINQ to search an XML hierarchy easily. The following code searches the x_all XElement for nodes named Customer and selects those with value less than -50. (The ellipsis after x_all is part of the XML query and means "search this node's descendants.")
Dim neg_desc2 = From cust In x_all...<Customer> _
                Where CDec(cust.Value) < -50
See the links at the end of this article for more information about LINQ to XML.
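Here is a compact, self-contained sketch of the same kind of search, including a loop that displays the results. The shortened XML data, the module name, and the use of Decimal.Parse with the invariant culture (so the decimal points parse the same way on any machine) are my own choices for the example.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Globalization
Imports System.Linq
Imports System.Xml.Linq

Module XmlQueryDemo
    Sub Main()
        Dim x_all As XElement = _
            <AllCustomers>
                <Customer FirstName="Ann" LastName="Archer">100.00</Customer>
                <Customer FirstName="Ben" LastName="Best">-24.54</Customer>
                <Customer FirstName="Frank" LastName="Fix">-150.90</Customer>
            </AllCustomers>

        ' ...<Customer> searches all descendants named Customer.
        Dim big_debts = From cust In x_all...<Customer> _
                        Where Decimal.Parse(cust.Value, _
                            CultureInfo.InvariantCulture) < -50 _
                        Select cust

        ' .@Name reads an attribute of the element.
        For Each cust In big_debts
            Console.WriteLine(cust.@FirstName & " " & _
                cust.@LastName & ": " & cust.Value)    ' Frank Fix: -150.90
        Next cust
    End Sub
End Module
```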
Although LINQ is the biggest new feature in Visual Basic 2008 and it drives many of the other new features (extension methods, anonymous types, lambda or inline functions, partial methods), there are a few other new goodies.
Relaxed delegates are, I suspect, another feature added to make building LINQ easier. They allow the program to automatically convert the parameters of delegates, such as event handlers, from one data type to another when that is possible. For example, a Button's Click event handler has a parameter named sender that is declared as a generic Object, but you know that it is a Button. Relaxed delegates let you change the event handler's declaration to make this parameter a Button.
Public Sub Button1_Click( _
        ByVal sender As Button, _
        ByVal e As System.EventArgs) _
        Handles Button1.Click
    ...
End Sub
This is handy if you want to hook up the event handler to more than one button. You can use the sender parameter to work with the button that was clicked without needing to convert it from a generic object. This code crashes, however, if you hook the event handler up to something other than a button (an unusual occurrence).
You can also omit the routine's parameters entirely if you don't need them. Often event handlers handle only a single event so you don't need the parameters. The following code is also valid.
Public Sub Button1_Click() _
        Handles Button1.Click
    ...
End Sub
This is much less cluttered if you don't need the parameters.
I will probably remove parameters where I can (and I remember). I will probably also change parameter data types where it makes sense.
The IIf function evaluates both of its return values even if it doesn't need to. For example, in the following code IIf calls both FunctionA and FunctionB even though the first parameter is False so IIf will return the value of FunctionB.
Dim value = IIf(False, FunctionA(), FunctionB())
The new If function does the same thing as IIf except it performs short-circuit evaluation. It only evaluates the parameter that it will actually return. In the following code, If only evaluates FunctionB.
Dim value = If(False, FunctionA(), FunctionB())
Many of us have been asking Microsoft to change the behavior of IIf to this for years. Microsoft has hesitated because if FunctionA and FunctionB have side effects, you might need to call them both. Functions with side effects are bad programming style, however, so you shouldn't be doing this anyway.
The If function is a good tool and you should use it whenever possible. If you have existing code where the two functions have side effects, rewrite them so they don't and then use If. (I just wish they had picked a different name because If already means something. Perhaps IIfElse? Personally I would just switch to short-circuit evaluation or make it a project-level option and "encourage" people to rewrite their code without side effects.)
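The following sketch makes the difference visible by giving the two functions a deliberate side effect (they print a message when called). The function bodies and module name are just assumptions for the demonstration.

```vb
Option Strict On

Imports System
Imports Microsoft.VisualBasic

Module IfDemo
    Function FunctionA() As String
        Console.WriteLine("FunctionA called")
        Return "A"
    End Function

    Function FunctionB() As String
        Console.WriteLine("FunctionB called")
        Return "B"
    End Function

    Sub Main()
        ' IIf is an ordinary function, so both arguments are evaluated
        ' before it runs. It returns Object, hence the CStr.
        Dim v1 As String = CStr(IIf(False, FunctionA(), FunctionB()))
        Console.WriteLine("IIf returned " & v1)

        ' If evaluates only the argument it is going to return.
        Dim v2 As String = If(False, FunctionA(), FunctionB())
        Console.WriteLine("If returned " & v2)
    End Sub
End Module
```

The IIf line prints both "called" messages while the If line prints only "FunctionB called". As a bonus, If returns a String here rather than a generic Object, so no conversion is needed.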
You can now declare a variable of any value type (such as Integer or Date) as nullable, meaning it can take the value Nothing. (The value "null" is C#'s version of Nothing. I guess it would have been awkward to call these "Nothingable types" in Visual Basic. ;-)
The following three declarations all make nullable Integer variables.
Dim a As Integer?
Dim b? As Integer
Dim c As Nullable(Of Integer)
You can set a nullable variable equal to Nothing and can use Is to see if it is Nothing.
Visual Basic also uses "null propagation" so calculations that include a null variable give a null result. For example, if x is Nothing then the expression x + 12 is also Nothing.
Nullable types are useful when you need them so use them if appropriate. There's no reason to make most variables nullable, however.
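Here is a short sketch of nullable variables in action, including null propagation. The variable names are mine, and I declared the types explicitly so the example works with Option Infer off.

```vb
Option Strict On

Imports System

Module NullableDemo
    Sub Main()
        Dim x As Integer? = Nothing
        Dim y As Integer? = 30

        ' Two ways to test for Nothing.
        Console.WriteLine(x Is Nothing)    ' True
        Console.WriteLine(y.HasValue)      ' True

        ' Null propagation: Nothing + 12 is still Nothing.
        Dim sum_x As Integer? = x + 12
        Dim sum_y As Integer? = y + 12
        Console.WriteLine(sum_x.HasValue)  ' False
        Console.WriteLine(sum_y.Value)     ' 42

        ' Supply a fallback when the variable holds Nothing.
        Console.WriteLine(x.GetValueOrDefault(-1))    ' -1
    End Sub
End Module
```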
Each version of Visual Basic comes with improved IntelliSense. Unfortunately this time there's an "improvement" that I dislike. In Visual Basic 2005, as you type more characters in (for example) a method name, IntelliSense scrolls to the matching entry but displays all available method names. In the new version, it does not show any entries that do not match what you have typed so far. This works well if you are typing the correct thing but it's a pain when you want to browse around.
It's also a pain if you type what you think is enough and press Tab, and then discover that you didn't type enough characters. You now need to backspace all the way back to the point where the choice you made and the one you wanted diverged. In Visual Basic 2005, you could just use the arrow keys to move up or down a couple spots to find the method you really wanted.
I complained about this and they said they were going to change it. Possibly when you press backspace or Ctrl-Space or something, it will reopen the whole list. Possibly they will also "forget" and not change anything.
This was definitely a case of "if it ain't broke, don't fix it" but they "fixed" it.
In another violation of "if it ain't broke, don't fix it," Visual Basic now tries to align multiple lines so parameters line up with each other across multiple lines. For example, the following code shows how a call to DrawRectangle gets formatted if you put each parameter on a separate line.
g.DrawRectangle( _
                Pens.Red, _
                10, _
                100, _
                100, _
                10)
In this example it's not too bad but in a longer series of calls, pushing all of the parameters far to the right makes the code very hard to read. Here is the indentation I prefer.
g.DrawRectangle( _
    Pens.Red, _
    10, _
    100, _
    100, _
    10)
This format not only fits better in books and magazine articles, but it also fits better on the screen, even if you have a pretty big monitor.
Several of those testing the Beta have complained and there's a small chance that Microsoft will change this. But as one person said:
Of course, they have already rented the hotel rooms and bought the food for the launch events so I don't (realistically) expect them to fix anything we find in beta2.
Unfortunately I haven't had a chance to take a good look at the latest version of the WPF designer and that's going to be very important. The last version was pretty bad and lots of things were easier to build by writing XAML code rather than using the designer. If Microsoft wants developers to use WPF, they need this to be a very powerful tool. I'll update this section when I've looked over the new designer.
I also haven't had time to look at the latest implementations of such tools as WF and WCF. It's not clear I will have time to look at these in much depth for a while. If you have information about their effectiveness, please