On my current project, we’re using Linq to SQL as our data layer. Without going into unnecessary detail about the whys and hows of our architecture, I’d like to describe a couple problems we encountered, and how we solved them.
Problem #1: Mocking a DataContext for testing purposes:
This one’s been making the rounds lately, with comments from Stuart Clark (via Andrew Tokeley), and a team member of mine, Kris Scott. Their blog entries will tell you far more about the problem and solution than I really need to go into here, but you should read them, as this is pretty exciting stuff.
Problem #2: Performing a SQL “Like” query using Linq syntax:
Linq has the amazing ability to turn regular .Net code into SQL queries for the most common cases. There are a few, however, that got missed. One of these edge cases is the standard “Like” query. Sure, you can use the BeginsWith, EndsWith, or Contains keywords, but these only handle a few specific usages of “Like”. What about when you’ve given the user the ability to specify wildcards as part of a query? You don’t know whether the wildcard will be at the beginning or end. In fact, you don’t even know whether there will be multiple wildcards included. Perhaps I’m performing a customer search, and I remember only part of the customer’s name. I know it ended in “ald”, but I’m not sure if it was Greenwald or Grunvald. In SQL I would just search for “G%ald”, and while I’d pick up a few “Gerald”s along the way, I should find what I’m looking for. In this particular case I could say that it BeginsWith “G” and EndsWith “ald”, but that’s just this case. The point is, that I don’t know what the user is going to search for, and unless I feel like re-implementing a SQL expression parser, that means that I’m simply not going to be able to provide wildcard abilities to the users. Yeah, they’ll totally accept that.
Solution #1: Use the SqlMethods class’ Like method.
The team that implemented Linq must have faced this same problem because they created the SqlMethods class to take care of this for them. To take advantage of this class, you just use it in a lambda expression like this.
list = list.Where(o => SqlMethods.Like(o.LastName, pattern))
Simple, right? Well, technically, the SqlMethods class wasn’t really meant for public consumption. The documentation even says “This API supports the .NET Framework infrastructure and is not intended to be used directly from your code.” But it is public and documented, so Microsoft will probably leave it alone. Seems safe enough to me, right?
Problem #3: Testing it:
So, we’ve got our Like search working, life is good, and no-one goes hungry in developer-land. Until you try to test the thing. The problem is that the SqlMethods can only be used by Linq to Sql. I suppose the “Sql” in the name of the class should have been a dead giveaway, right? If you try to run a query which uses SqlMethods.Like against an IList, for example, you’ll get an exception and crash. Does this mean we’re doomed to integration testing every “Like” query against an actual database? Let’s hope not. On my current project, the tiny minority of “integration” tests take longer to execute than the remaining unit tests combined. On top of this, database integration tests are prone to timeout and deadlocking problems when you have 12 developers writing and testing the same application at the same time. Also, it’s hard to maintain a test-driven, agile environment when running the test suite can take nearly ten minutes.
We could write our tests so that we only performed limited searches with no wildcards, but this isn’t really testing the real-world usage of our code. We could also write our searches so that by throwing a switch behind the scenes, it would replace the calls to SqlMethods with something that works on in-memory lists, such as a RegEx match. This is fragile, however, because you have to keep two code branches in sync, one of which is only used for unit testing anyway. What we need is a way to do this automatically, to take a query and change it depending on the environment it’s running in. Then, once we’re satisfied that this translation works (by unit testing it, naturally), we can take it for granted, plug it in everywhere, and never be bothered by this problem again.
Since the problem with mocking DataContexts has now been solved, I started re-examining this issue. In concept, it’s very simple. Walk the expression tree used by a query, and swap out the calls to SqlMethod.Like with something that works on in-memory lists. Simple, right? Well, it would be if Microsoft hadn’t gone to great lengths to keep you from doing exactly what we’re about to do. First of all, Linq expression trees are immutable, so you can’t change anything about them. Secondly, even if you managed to change the expression tree, the “expression” property of IQueryable is a get-only property. This second problem pales in comparison to the first since we can use the magic of reflection to set even the most unsettable of fields.
The answer to the first, harder question came from an MSDN post entitled “How to: Modify Expression Trees” (http://msdn.microsoft.com/en-us/library/bb546136.aspx). The technique is simple: You use an expression tree “Visitor” (see http://msdn.microsoft.com/en-us/library/bb882521.aspx) to make a copy of the original expression tree, changing certain pieces as you go. The example in the article changed all the Ands (&&) to Ors(||) in a simple query, so why not change all the calls from SqlMethods.Like to something else?
I inherited a class from ExpressionVisitor and called it SqlMethodExpressionFixer (What, you got a better name?), and overrode the VisitMethodCall method. This will allow me to intercept all the MethodCall expressions in the tree, examine them to see if they’re calls to SqlMethods.Like, and substitute calls to my own Like method instead which just so happens to take the exact same parameters as SqlMethods.Like (Clever, that bit, eh?). Here is the SqlMethodExpressionFixer class:
public class SqlMethodExpressionFixer : ExpressionVisitor { public Expression Modify(Expression expression) { return Visit(expression); } protected override Expression VisitMethodCall(MethodCallExpression expression) { Expression result; MethodInfo mi = expression.Method; if ((mi.DeclaringType == typeof(System.Data.Linq.SqlClient.SqlMethods)) && (mi.Name == "Like")) { MethodInfo likeMethodInfo = GetType().GetMethod("Like"); IEnumerable<Expression> args = VisitExpressionList(expression.Arguments); result = Expression.Call(likeMethodInfo, args.ToArray()); } else { result = base.VisitMethodCall(expression); } return result; } public static bool Like(string matchExpression, string pattern) { pattern = Regex.Escape(pattern); pattern = pattern.Replace("%", ".*?").Replace("_", "."); pattern = pattern.Replace(@"\[", "[").Replace(@"\]", "]").Replace(@"\^", "^"); return Regex.IsMatch(matchExpression, pattern); } }
Now that we have a modified copy of the original expression, all that’s left is to substitute it for the original expression whenever we want to run a query with a “Like” in it against an in-memory list. Fortunately, that’s pretty easy to do as well. Given an IQueryable instance, we can figure out whether it’s a database or in-memory source simply by examining its base type. If it’s EnumerableQuery, we’re dealing with an in-memory data source such as a List, and we’ll substitute the modified expression via reflection. If it’s anything else, such as a System.Data.Linq.DataQuery, we’ll leave it alone. Add the following method to the ExpressionFixer to do that for you:
public void FixQuery<T>(IQueryable<T> input) { Type inputType = input.GetType(); if (inputType.BaseType.FullName == "System.Linq.EnumerableQuery") { SqlMethodExpressionFixer fixer = new SqlMethodExpressionFixer(); System.Linq.Expressions.Expression expression = fixer.Modify(input.Expression); FieldInfo fi = inputType.GetField("expression", BindingFlags.NonPublic | BindingFlags.Instance); fi.SetValue(input, expression); } }
Just call the FixQuery method, passing in the query you want to run in-memory, and it will be modified in-place. I don’t know about you, but this is pretty freakin’ awesome if you ask me. We can now fully unit-test our Linq-based data layer without having to hit the database at all.
Update: Here are a couple of unit tests to illustrate the usage. These are written using Eric Hexter’s excellent Should assertion library, which you should check out if you’re not already using it. Replace the reference to Core.Linq.Expressions with a reference to wherever you put SqlMethodExpressionFixer. There are two tests here. One illustrates that trying to enumerate _filtered normally throws an exception. The second uses ExpressionFixer to fix up the query to make it work against the in-memory _list, and then enumerates it, proving that the fixer works.
using System; using System.Collections.Generic; using System.Data.Linq.SqlClient; using System.Linq; using Core.Linq.Expressions; using NUnit.Framework; using Should; namespace Core.Tests.Linq.Expressions { [TestFixture] public class ExpressionFixerTests { private IQueryable<string> _list; private IQueryable<string> _filtered; [SetUp] public void SetUp() { _list = new List<string> { "abcd", "bcde", "cdef" }.AsQueryable(); _filtered = _list.Where(x => SqlMethods.Like(x, "%bc%")); } [Test] public void Queries_with_SqlMethods_calls_fail_against_in_memory_lists() { Action action = () => _filtered.ToList(); action.ShouldThrow<NotSupportedException>(); } [Test] public void ExpressionFixer_replaces_SqlMethods_Like_with_RegEx_Like() { var fixer = new SqlMethodExpressionFixer(); fixer.FixQuery(_filtered); var results = _filtered.ToList(); results.Count.ShouldEqual(2); } } }
I’m having trouble putting this all together in my code. Could you provide an example, please?
I updated the article to include my unit tests. Hopefully that illustrates the usage. Let me know if you have any questions.