Understanding the .NET Language Integrated Query (LINQ)

Introduction

Language Integrated Query (LINQ), pronounced “link”, was introduced in .NET Framework 3.5 to provide query capabilities through a standardized query syntax built into the .NET programming languages (such as C# and VB.NET). The standard query operators are provided via the System.Linq namespace.

A query is an expression that retrieves data from a data source. Traditionally, queries were expressed as plain strings (e.g., SQL for relational databases), with no compile-time type checking or IntelliSense support. Developers also had to learn a different query language for each type of data source (SQL databases, XML documents, ADO.NET DataSets, and so on).

LINQ provides a unified query syntax for different data sources by working with objects. For example, we can retrieve and save data in different databases (MS SQL, MySQL, Oracle, etc.) with the same code. Using the same basic coding patterns, we can query and transform data in any source for which a LINQ provider is available. In addition, we can perform many operations, such as filtering, ordering, and grouping.

In this article, we will learn about the LINQ architecture and technologies, query syntaxes, execution types, and query operations. In addition, we will see some code examples to become familiar with the LINQ concepts.

LINQ and Generic Types (C#)

Generics let us design classes and methods that work for a general type (T). The type parameter is declared together with the class or method, and the concrete type argument is supplied when the class is instantiated or the method is called. In this way, we can use the same generic class or method for different types without the cost of boxing operations or the risk of runtime casts.

A generic type is declared by specifying a type parameter in angle brackets after the class or method name, e.g. MyClassName<T>, where T is a type parameter. The MyClassName class will provide generalized solutions for any T. The most common use of generics is to create collection classes.

LINQ queries are based on generic types. So, when creating an instance of a generic collection class, such as List<T> or Dictionary<TKey, TValue>, we replace the type parameters with the types of our objects. For example, we could keep a list of string values (List<string>), a list of custom User objects (List<User>), or a dictionary of integer keys with string values (Dictionary<int, string>).
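
The following is a minimal sketch of these ideas, written with C# top-level statements. The LastOrFallback helper and the sample values are hypothetical and exist only for illustration; they are not part of LINQ or of the original example.

```csharp
using System;
using System.Collections.Generic;

// Generic collections: the type arguments replace T (or TKey/TValue).
var names = new List<string> { "Ada", "Linus" };
var statusTexts = new Dictionary<int, string> { [200] = "OK", [404] = "Not Found" };

// A generic method (hypothetical helper): T is declared once and is
// supplied, or inferred by the compiler, at each call site.
static T LastOrFallback<T>(IReadOnlyList<T> items, T fallback) =>
    items.Count > 0 ? items[items.Count - 1] : fallback;

string lastName = LastOrFallback(names, "(none)");     // T is string
int lastNumber = LastOrFallback(new List<int>(), -1);  // T is int: no boxing, no casts

Console.WriteLine(lastName);          // Linus
Console.WriteLine(lastNumber);        // -1
Console.WriteLine(statusTexts[404]);  // Not Found
```

Because the compiler knows the element type, assigning the result of LastOrFallback(names, ...) to an int would be rejected at compile time rather than failing at runtime.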

If you have already used LINQ, you have probably seen the IEnumerable<T> interface. It enables generic collection classes to be enumerated using the foreach statement. A generic collection is a collection with a general element type (T). Non-generic collection classes, such as ArrayList, implement the non-generic IEnumerable interface to be enumerated.
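
As a rough illustration of the difference, the sketch below enumerates a generic collection through IEnumerable<T> and then uses the OfType<T>() operator to turn a non-generic ArrayList into a typed sequence. The sample values are made up for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// A generic collection is enumerated through IEnumerable<T>.
IEnumerable<string> cities = new List<string> { "Athens", "Patras" };
foreach (string city in cities)
{
    Console.WriteLine(city);
}

// ArrayList implements only the non-generic IEnumerable, so its elements are objects.
ArrayList mixed = new ArrayList { 1, "two", 3 };

// OfType<int>() keeps the elements that are ints and yields an IEnumerable<int>.
List<int> numbers = mixed.OfType<int>().ToList();

Console.WriteLine(string.Join(", ", numbers));  // 1, 3
```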

LINQ Architecture and Technologies

As we have already seen, we can write LINQ queries against any source for which a LINQ provider is available, such as in-memory data structures, XML documents, SQL databases, and DataSet objects. These sources expose their data through the IEnumerable<T> interface (or an interface that derives from it, such as IQueryable<T>). In this way, we always view the data as an enumerable collection, whether we are querying or updating it.

In the following figure, we can see the LINQ architecture and the available LINQ technologies. As we can see, the LINQ technologies are the following:

LINQ to Objects: Using LINQ queries with any IEnumerable or IEnumerable<T> collection directly, without using an intermediate LINQ provider or API such as LINQ to SQL, LINQ to XML, etc. Practically, we query any enumerable collections such as List<T>, Array, or Dictionary<TKey, TValue>.

LINQ to XML: Provides an in-memory XML programming interface that leverages the LINQ framework to make XML queries easier to write, in a style similar to SQL.

ADO.NET LINQ Technologies: ADO.NET provides consistent access to data sources (such as SQL Server, data sources exposed through OLE DB and ODBC, etc.) to separate the data access from data manipulation.

LINQ to DataSet: Performs queries over data cached in a DataSet object. In this scenario, the retrieved data is stored in a DataSet object.

LINQ to SQL: Use the LINQ programming model directly over the existing database schema and auto-generate the .NET model classes representing data. LINQ to SQL is used when we do not require mapping to conceptual models (i.e., when one-to-one mapping of the data to model classes is accepted).

LINQ to Entities: Supports conceptual models (i.e., models that are not the same as the logical models of the database). The conceptual data models (mapped database models) are used to model the data and interact with it as objects. In this way, we can formulate database queries in the same programming language in which we build the business logic.

Figure 1. – The LINQ architecture and the available LINQ technologies (Source).
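
To make the idea of "one query pattern, several sources" more concrete, the sketch below runs essentially the same query with LINQ to Objects (over a Dictionary<TKey, TValue>) and with LINQ to XML (over an XElement). The stock data and the XML layout are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// LINQ to Objects: query an in-memory dictionary directly.
var stock = new Dictionary<string, int> { ["apples"] = 3, ["pears"] = 0, ["plums"] = 7 };
var inStock = (from item in stock
               where item.Value > 0
               orderby item.Key
               select item.Key).ToList();

// LINQ to XML: the same query shape over an in-memory XML tree.
XElement catalog = XElement.Parse(
    "<catalog>" +
    "<item name=\"apples\" qty=\"3\"/>" +
    "<item name=\"pears\" qty=\"0\"/>" +
    "<item name=\"plums\" qty=\"7\"/>" +
    "</catalog>");
var inStockXml = (from item in catalog.Elements("item")
                  where (int?)item.Attribute("qty") > 0
                  orderby item.Attribute("name")?.Value
                  select item.Attribute("name")?.Value).ToList();

Console.WriteLine(string.Join(", ", inStock));     // apples, plums
Console.WriteLine(string.Join(", ", inStockXml));  // apples, plums
```

Only the data source and the way a value is read from each element differ; the from, where, orderby, and select clauses stay the same.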

LINQ Syntax

LINQ provides two ways to write queries: the Query Syntax and the Method Syntax. In the following sections, we will see the syntax of both.

Query Syntax

The LINQ Query Syntax has some similarities with the SQL query syntax, as we see in the following syntax statement. The result of a query expression is a query object (not the actual results), which is usually a collection of type IEnumerable<T>.

// LINQ Query Syntax (general form)

from <range variable> in <source collection>
<query operator> <conditional expression>
<select or group operator> <result formation>

In Figure 2, we can see a simple LINQ query syntax example. The from clause specifies the data source (numbers) and the num range variable (i.e., the current element in each iteration). The where clause applies the filter (here, keeping num only when it is an even number), and the select clause specifies the shape of the returned elements (here, the even numbers themselves).

Figure 2. – LINQ query syntax example.
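
A query of the kind described above could be written roughly as follows. This is a sketch, not a copy of the figure; the contents of the numbers array are an assumption chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Data source (sample values assumed for this sketch)
int[] numbers = { 5, 10, 8, 3, 6, 12 };

// Query syntax: from (source and range variable), where (filter), select (result shape)
IEnumerable<int> evenQuery =
    from num in numbers
    where num % 2 == 0
    select num;

Console.WriteLine(string.Join(", ", evenQuery));  // 10, 8, 6, 12
```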

In general, the query specifies what information to retrieve from the data source or sources. Optionally, a query also determines how that information should be sorted, grouped, and shaped before it is returned.

Note: Query syntax does not cover all LINQ query operators. Operators that have no query keyword in C# (for example, Count or Distinct) are available only through Method syntax.
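
One common way to work around this limitation in C# is to wrap the query expression in parentheses and continue with method calls. The sketch below assumes a small sample array; the variable names are illustrative only.

```csharp
using System;
using System.Linq;

// Sample values assumed for this sketch (10 appears twice on purpose)
int[] numbers = { 5, 10, 8, 3, 6, 12, 10 };

// Distinct and Count have no C# query keywords, so they are
// chained as method calls onto the parenthesized query expression.
int distinctEvenCount =
    (from num in numbers
     where num % 2 == 0
     select num)
    .Distinct()
    .Count();

Console.WriteLine(distinctEvenCount);  // 4 (10, 8, 6, 12)
```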

Method Syntax

Query syntax and Method syntax are semantically identical. However, many people find query syntax simpler and easier to read since it doesn’t use lambda expressions. In Figure 3, we can see the previous query syntax example rewritten in the semantically equivalent Method syntax.

The compiler translates query syntax into method calls (method syntax) at compile time, before the code reaches the .NET common language runtime (CLR). Thus, in terms of runtime performance, both LINQ syntaxes are the same.

Figure 3. – LINQ Method syntax example.
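
In Method syntax, an even-number filter of the kind discussed for the query syntax might look like the sketch below. The sample data is assumed, and the query syntax version is repeated only to check that both forms produce the same results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Data source (sample values assumed for this sketch)
int[] numbers = { 5, 10, 8, 3, 6, 12 };

// Method syntax: the filter is passed to Where as a lambda expression.
IEnumerable<int> evenQuery = numbers.Where(num => num % 2 == 0);

// The same query in query syntax, for comparison.
IEnumerable<int> sameQuery =
    from num in numbers
    where num % 2 == 0
    select num;

bool identicalResults = evenQuery.SequenceEqual(sameQuery);

Console.WriteLine(string.Join(", ", evenQuery));  // 10, 8, 6, 12
Console.WriteLine(identicalResults);              // True
```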

Query Execution

In the previous sections, we saw how to use Query and Method syntax to create our query object. It is essential to notice that the query object doesn’t contain the results (i.e., the query result data). Instead, it holds the information required to produce the results when the query is executed. Consequently, the same query object can be executed multiple times.

There are two ways to execute a LINQ query object: deferred execution and forced execution.

Deferred execution: The query runs only when its results are enumerated, typically in a foreach statement, which executes the query and iterates over the results.

Forced execution: The query runs immediately when we retrieve its results into a single collection object using the ToList() or ToArray() methods. Operators that need to iterate over the results to return a single value, such as Count(), Max(), and Average(), also force the query execution.

Let’s assume we have the Customer[] customers array from a related service. We have created the following query object to retrieve the customers who live in Athens.

using System;
using System.Collections.Generic;
using System.Linq;

// Data source
Customer[] customers = CustomerService.GetAllCustomers();

// Create the query object (via Query Syntax)
IEnumerable<Customer> customerQuery =
    from customer in customers
    where customer.City == "Athens"
    select customer;

In the following example, we can see how to execute the query object using the two execution methods (Deferred and Forced).

// Deferred: query execution using the foreach statement
foreach (Customer customer in customerQuery)
{
    Console.WriteLine($"{customer.Lastname}, {customer.Firstname}");
}

// Forced: query execution using the ToList method
List<Customer> customerResults = customerQuery.ToList();
foreach (Customer customer in customerResults)
{
    Console.WriteLine($"{customer.Lastname}, {customer.Firstname}");
}
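
The practical difference between the two execution types shows up when the data source changes after the query object has been created. The sketch below uses a plain list of city names (an assumption made to keep the example self-contained) instead of the Customer objects above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Data source (sample values assumed for this sketch)
var cities = new List<string> { "Athens", "Patras", "Athens" };

// Creating the query object does not execute anything yet.
IEnumerable<string> athensQuery = cities.Where(city => city == "Athens");

// Forced execution: ToList() runs the query now and stores a snapshot.
List<string> snapshot = athensQuery.ToList();

// Change the data source after the query object was created.
cities.Add("Athens");

// Count() executes the query again, so it sees the new element.
int liveCount = athensQuery.Count();
int snapshotCount = snapshot.Count;

Console.WriteLine(liveCount);      // 3
Console.WriteLine(snapshotCount);  // 2
```

The snapshot keeps the results as they were when ToList() was called, while the query object re-reads the source every time it is executed.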

Basic LINQ Query Operations

In the following table, we can see the majority of the LINQ query operators grouped into categories. For information regarding each query operator’s result type and execution type (Deferred or Forced), refer to the official .NET documentation of the standard query operators.

LINQ Operator Category: LINQ Query Operators

Filtering Data: Where, OfType
Sorting Data: OrderBy, OrderByDescending, ThenBy, ThenByDescending, Reverse
Projection Operations: Select, SelectMany
Quantifier Operations: All, Any, Contains
Element Operations: ElementAt, ElementAtOrDefault, First, FirstOrDefault, Last, LastOrDefault, Single, SingleOrDefault
Partitioning Data: Skip, SkipWhile, Take, TakeWhile
Join Operations: Join, GroupJoin
Grouping Data: GroupBy, ToLookup
Aggregation Operations: Aggregate, Average, Count, LongCount, Max or MaxBy, Min or MinBy, Sum
Generation Operations: DefaultIfEmpty, Empty, Range, Repeat
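
As a brief illustration, the sketch below combines operators from several of these categories in Method syntax. The orders array, its tuple fields, and all sample values are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Data source (hypothetical sample data)
var orders = new (string Customer, string City, decimal Total)[]
{
    ("Maria", "Athens", 120m),
    ("Nikos", "Patras", 80m),
    ("Eleni", "Athens", 45m),
    ("Kostas", "Patras", 200m),
};

// Filtering, sorting, and projection
List<string> bigSpenders = orders
    .Where(o => o.Total >= 100m)
    .OrderByDescending(o => o.Total)
    .Select(o => o.Customer)
    .ToList();

// Grouping and aggregation
Dictionary<string, decimal> totalsByCity = orders
    .GroupBy(o => o.City)
    .ToDictionary(g => g.Key, g => g.Sum(o => o.Total));

// Quantifier, element, and partitioning operations
bool anyFromPatras = orders.Any(o => o.City == "Patras");
string firstAthens = orders.First(o => o.City == "Athens").Customer;
List<string> secondPage = orders.Skip(2).Take(2).Select(o => o.Customer).ToList();

Console.WriteLine(string.Join(", ", bigSpenders));  // Kostas, Maria
Console.WriteLine(totalsByCity["Athens"]);          // 165
Console.WriteLine(totalsByCity["Patras"]);          // 280
Console.WriteLine(anyFromPatras);                   // True
Console.WriteLine(firstAthens);                     // Maria
Console.WriteLine(string.Join(", ", secondPage));   // Eleni, Kostas
```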

Summary

Language Integrated Query (LINQ) provides a unified query syntax for different data sources (e.g., SQL databases, XML documents, ADO.NET DataSets, and in-memory objects). In addition, it supports various query operations, such as filtering, ordering, and grouping.

LINQ queries are based on generic types, so in generic collections such as List<T>, we replace the T parameter with the type of our objects. LINQ sources implement the IEnumerable interface so that they can be enumerated. The available LINQ technologies include LINQ to Objects, XML, DataSet, SQL, and Entities.

Advantages

Provides a unified query syntax for different data sources.
Offers type checking at compile time and IntelliSense support.
Queries can be reused easily.
Easier debugging through the .NET debugger.
Supports various query operations, such as filtering, ordering, and grouping.

Disadvantages

The project must be recompiled and redeployed for every change in the queries.
Complex SQL queries can be hard to express in LINQ, and the generated SQL may be less efficient than hand-written SQL.
We cannot take advantage of the execution plan caching provided by SQL stored procedures.

LINQ provides powerful query capabilities that any .NET developer should know.
