Repositories are dead. Long live repositories.

Note: Here’s one from the backlog that I hope to get back to one day. I’m currently working on a GraphQL-based project, which makes this all superfluous, and I’ve had to shelve my work on this, but I want to put it out there anyway before I forget. Next time I’m working on a more traditional REST API, I hope to dust this off and really live in it.

Personally, I’ve been a holdout on “Team Repository” for perhaps a little too long now. I like the way repositories let me define a query once, in one place, and then share it with the rest of the codebase. I like that I can right-click on a query method and say “Find Usages”. I’m not so much a fan of them being the ultimate cutoff where the query gets materialized, because that leads down a path of a hundred queries that vary only slightly from each other. I feel the time has come to design something better, and it’s absolutely dead simple.

Why Repositories?

Like everything else we do, we’re constantly trying to DRY out our code and increase reuse. As developers, we also like to create the “one true answer”, to invent the new shiny thing, to go down in history as the inventor of the last, best solution to all of our problems. Or maybe we just want to make our day-to-day development easier by solving a problem once and for all. This is the promise of repositories. I define my GetUserByEmailAddress query once, in one place, and then the rest of my code can simply reuse that golden implementation and never think about it again. We want to kill the problem and walk away.

Repositories let us do this. They let us define what operations the rest of the code is allowed to do, exactly which parameters are required to do that work, and they prevent other code from going rogue and inventing its own ad-hoc queries. They are supposed to keep us safe. But they grow out of control, and eventually we end up with gigantic repositories with hundreds of methods, most of which have only one caller, or we end up with a single method on that repository that hands back an IQueryable, and everything else in the application ends up calling that. Neither is an ideal outcome.
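A rough sketch of those two outcomes, not taken from any real codebase. Apart from GetUserByEmailAddress, every name here (the User properties, the interfaces, the in-memory stand-in) is made up for illustration, and a plain list stands in for the database so it runs on its own.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity for illustration.
public class User
{
    public int Id { get; set; }
    public string EmailAddress { get; set; } = "";
    public bool IsActive { get; set; }
}

// Outcome 1: one method per slightly different query, most with a single caller.
public interface IUserRepository
{
    User GetUserByEmailAddress(string emailAddress);
    User GetActiveUserByEmailAddress(string emailAddress);
    User GetUserByEmailAddressIgnoringCase(string emailAddress);
}

// Outcome 2: a single escape hatch that every caller builds its own query on.
public interface IOpenUserRepository
{
    IQueryable<User> Query();
}

// In-memory stand-in so the sketch runs without a database.
public class InMemoryUserRepository : IOpenUserRepository
{
    private readonly List<User> _users;

    public InMemoryUserRepository(IEnumerable<User> users) => _users = users.ToList();

    public IQueryable<User> Query() => _users.AsQueryable();
}

public static class SomeCaller
{
    // The ad-hoc query now lives in the caller, which is what the repository was meant to prevent.
    public static bool EmailIsTaken(IOpenUserRepository repo, string emailAddress) =>
        repo.Query().Any(u => u.EmailAddress == emailAddress);
}
```

In the first shape the interface keeps growing; in the second, the repository no longer says anything about which queries exist.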

Why NOT repositories?

There has been a movement afoot for several years now to do away with repositories entirely, and while part of me screamed “Nooo! Not my repos! I love my repos! I just got them how I like them!”, I could see where they were coming from. Entity Framework’s DbSet<T> really IS a repository. It can load, save, and update items. If you define a repository class to sit on top of it, you’re really just wrapping one repo inside another.

In a typical layered application, there is usually some kind of service or “logic” layer that serves as the brains of the outfit. Sometimes this is implemented in the objects themselves in an old-school “object-oriented” way. That way lie the “Active Record” lands, and I personally don’t go over there if I can help it. I’m in the smart services camp until someone talks me out of it. As such, I like my service layer to be the one making the decisions, calling the shots, and generally being trusted to know what it’s doing.

A traditional repository takes some of that responsibility away from the service, and normally we’d say “Separation of concerns! Single responsibility principle!” and shout down the anarchist rebels who want to blend our carefully separated layers together… the heathens. I’d ask you to consider for a moment though that it’s the service’s job to ask for what it wants. It’s merely the repository’s job to go get it, while hiding the fact that there’s a database there behind the scenes. Instead, we’ve given the repos the power to tell the rest of the application what is possible.

So what’s this big idea then?

Prepare to have your mind blown, folks, because this is going to shake the very foundations of your world. Are you ready for it? Here it comes… DbSet Extension Methods.

I know, right? Boom. Drop the mic and walk off the stage.
What? You don’t see it?
Okay, fine.

If a DbSet<User> is already a kind of repository, then all it’s missing is the “GetById” or “GetByEmailAddress” methods.

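using System.Linq;
using Microsoft.EntityFrameworkCore;

public static class UserDbSetExtensions
{
    // Returns an unmaterialized query; the caller decides how to execute it
    // (SingleOrDefault, Any, ToList, and so on).
    public static IQueryable<User> GetByEmailAddress(this DbSet<User> dbSet, string emailAddress)
    {
        return dbSet.Where(x => x.EmailAddress == emailAddress);
    }
}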

These are the reusable bits of code that we wanted all along, right? So why not just tack them onto the existing DbSet<T> implementations and call it a day? We get the best of both worlds. I have a convenient place to define frequently-used queries or operations, and my services still have direct access to the DbSet if needed.
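Here is roughly how I picture a service using it. This is a minimal sketch, assuming EF Core is referenced; AppDbContext, UserService, the IsActive property, and the service method names are all hypothetical and only there to make the example complete.

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity and context, shown only so the sketch is complete.
public class User
{
    public int Id { get; set; }
    public string EmailAddress { get; set; } = "";
    public bool IsActive { get; set; }
}

public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }

    public DbSet<User> Users => Set<User>();
}

public static class UserDbSetExtensions
{
    // The shared query, defined once.
    public static IQueryable<User> GetByEmailAddress(this DbSet<User> dbSet, string emailAddress)
    {
        return dbSet.Where(x => x.EmailAddress == emailAddress);
    }
}

public class UserService
{
    private readonly AppDbContext _db;

    public UserService(AppDbContext db) => _db = db;

    // Reuses the shared query; the service decides how to materialize it.
    public Task<User?> FindByEmailAsync(string emailAddress) =>
        _db.Users.GetByEmailAddress(emailAddress).SingleOrDefaultAsync();

    // Same shared query, but here it only runs as an existence check.
    public Task<bool> EmailIsTakenAsync(string emailAddress) =>
        _db.Users.GetByEmailAddress(emailAddress).AnyAsync();

    // No shared query exists for this one, so the service uses the DbSet directly.
    public Task<int> CountActiveUsersAsync() =>
        _db.Users.CountAsync(u => u.IsActive);
}
```

Because the extension hands back an IQueryable<User>, the shared query is no longer the point where materialization is forced, and “Find Usages” on GetByEmailAddress still works.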

I considered whether maybe inheriting from DbSet<T> and making a concrete UserDbSet would be the way to go, but there are casting problems within Entity Framework itself, or at least with EF Core the last time I checked. There will be code out there in libraries you want to use that might assume something will be an actual DbSet<User> and won’t know what a UserDbSet is. As long as the inheritance is correct, it should still work, but I found this issue on EF Core’s GitHub page, and that killed the idea for me immediately. Maybe it will be possible someday, but that was just too much uncertainty for me at this point.

In the meantime, adding well-known operations or queries as extension methods will have to do. Generally speaking, extension methods are for extending things you don’t own, but in this case we’re prevented from writing a proper descendant class, so they are the next best option.

You could build these extension methods off of IQueryable<T> instead of DbSet<T>, which would allow you to stack them up to build queries out of components like Legos, but I don’t think that would be the best or most efficient way to do this. This approach was meant to be a more direct replacement for repository methods, and they don’t work that way. You can’t chain traditional repo methods, so doing that here wouldn’t really fit that pattern anyway. What we’re after is an authoritative menu of hand-written, optimized, well-known queries. If you have a need for the order list, filtered by client ID, and a query already exists that does just that, then you can use that query and be on your way. If not, then it’s business as usual.
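For comparison, the stackable IQueryable<T> variant I decided against might look something like this. It is only a sketch: the Order entity, its properties, and the method names are hypothetical, and a plain list stands in for the DbSet so the example runs without a database.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity for illustration.
public class Order
{
    public int Id { get; set; }
    public int ClientId { get; set; }
    public bool IsOpen { get; set; }
}

public static class OrderQueryExtensions
{
    // Each method narrows whatever query it is handed, so calls can be stacked.
    public static IQueryable<Order> ForClient(this IQueryable<Order> orders, int clientId)
    {
        return orders.Where(o => o.ClientId == clientId);
    }

    public static IQueryable<Order> OpenOnly(this IQueryable<Order> orders)
    {
        return orders.Where(o => o.IsOpen);
    }
}

public static class Demo
{
    // In-memory stand-in for a DbSet<Order>.
    public static IQueryable<Order> SampleOrders()
    {
        return new List<Order>
        {
            new Order { Id = 1, ClientId = 7, IsOpen = true },
            new Order { Id = 2, ClientId = 7, IsOpen = false },
            new Order { Id = 3, ClientId = 9, IsOpen = true },
        }.AsQueryable();
    }

    // Two building blocks chained into one query.
    public static List<int> OpenOrderIdsForClient(int clientId)
    {
        return SampleOrders().ForClient(clientId).OpenOnly().Select(o => o.Id).ToList();
    }
}
```

A method declared on DbSet<T> can only start a query, because Where returns an IQueryable<T> rather than a DbSet<T>. That restriction is what keeps the DbSet version behaving like a menu of whole queries rather than a box of parts.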