I found an interesting bug this morning, and figured I’d better write about it so that maybe I’ll remember to avoid it in the future. We have a Linq query that involves a couple Joins. The main table we’re interested in is already provided for us by a "FilteredNotes" property like this:
IQueryable<Note> notes = DataContext.Notes; if (!canViewAllNoteTypes) { if (canViewLegalNoteTypes) return DataContext.Notes.Where(n => legalCategoryIds.Contains(n.CategoryId)); else return new List<Note>().AsQueryable(); } ...
The results of this pre-filtered set of data are then joined to two other tables to provide some additional filtering like so:
IQueryable<NoteSummary> results = from d in this.FilteredNotes join c in this.DataContext.Claims on d.ParentId equals c.ClaimId join e in this.DataContext.Enrollees on c.EnrolleeId equals e.EnrolleeId where d.ResponsiblePersonId == userId && d.CompletedDate == null && d.ScheduledDate >= startDate && d.ScheduledDate <= endDate orderby d.ScheduledDate ascending select new NoteSummary { ...
}; return results.ToList();
This code has been working great, until I started testing a new security role type. When I am under the guise of this new type, the query would run for a long time, and then explode with an OutOfMemoryException. The problem is actually really simple once you see it. In that first section of code, the author wanted to return an empty list to join against, since the user wasn’t authorized to view any of the categories of notes that we’re interested in here. The trouble is the way the empty list was created. Creating a new list with nothing in it, and then calling AsQueryable() generated an empty IQueryable, but it did it in memory. Since this is the main list to which we’re joining the others, Linq wants to pull them ALL into memory and do the join there. Normally you might not notice, but since this database has millions of rows in each of the tables we’re joining to, it tends to make things bog down a bit.
The solution is to create our empty list, but to do it in a different way. We want an IQueryable of Notes that will become part of a single Linq query sent to the database. We can solve this by changing the FilteredNotes property to retrieve the IQueryable<Note> in a way that defers its execution, but still guarantees it will be blank:
IQueryable<Note> notes = DataContext.Notes; if (!canViewAllNoteTypes) { if (canViewLegalNoteTypes) return DataContext.Notes.Where(n => legalCategoryIds.Contains(n.CategoryId)); else return DataContext.Notes.Where(n => false); } ...
There, by changing the condition to be simple "false", the retrieval of the Notes can now be part of one big query, but we know that it’ll be empty when that happens. Problem solved.