Dynamic fuzzy search with LINQ, Entity Framework and jQuery autocomplete, part 4

By eidias on (tags: fuzzy search, categories: code)

Search has always been my most desired feature in applications and if I were to guess, I’m not the only one with this craving. The comfort and flexibility it gives you is enormous under one condition – it’s smart enough.

Time for part four

In part three we were left with two problems to solve in the method that fetches data for the search index.

  1. The code would select all columns from the db instead of just the searchable ones
  2. The data pulling operation is time consuming

Let’s take a look at the simpler problem first – the second one.

Constructing the search index every time someone searches is a waste of resources, so it makes sense to build it once and keep it in cache. The problem with a cache, of course, is that it gets stale. Expiring it on a timer is not optimal; refreshing it when an entity changes is. Fortunately, the project architecture gives me the means to do that, so each time an indexed item changes, I refresh the cache. Nice and smooth. I did throw in a 1h sliding expiration as well, because I didn’t want to hog resources, and if someone hasn’t used the app for an hour, chances are they won’t use it again until the next ‘session’ – it’s just that kind of an app.
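To make the idea concrete, here is a minimal sketch of a cache that combines both rules: explicit invalidation on entity change plus a sliding expiration. It is not the code from the project. SlidingCache, its constructor parameters and the injectable clock are hypothetical names chosen for illustration, and a real app would more likely lean on an existing cache such as System.Runtime.Caching.MemoryCache with CacheItemPolicy.SlidingExpiration.

```csharp
using System;

// Hypothetical helper (not the author's code): holds one cached value,
// rebuilds it on demand, and treats it as expired after a period of inactivity.
public sealed class SlidingCache<T> where T : class
{
    private readonly Func<T> _build;
    private readonly TimeSpan _slidingExpiration;
    private readonly Func<DateTime> _now;
    private readonly object _gate = new object();
    private T _value;
    private DateTime _lastAccess;

    // "now" is injectable only so that the expiration logic can be tested.
    public SlidingCache(Func<T> build, TimeSpan slidingExpiration, Func<DateTime> now = null)
    {
        _build = build;
        _slidingExpiration = slidingExpiration;
        _now = now ?? (() => DateTime.UtcNow);
    }

    public T Get()
    {
        lock (_gate)
        {
            var now = _now();
            // Rebuild when nothing is cached or the last access is too old.
            if (_value == null || now - _lastAccess > _slidingExpiration)
            {
                _value = _build();
            }
            _lastAccess = now; // every access pushes the expiration forward
            return _value;
        }
    }

    // Meant to be called from wherever an indexed entity is saved or deleted.
    public void Invalidate()
    {
        lock (_gate)
        {
            _value = null;
        }
    }
}
```

Note that this sketch expires lazily: a stale value is only replaced on the next Get, so unlike a real cache it does not release memory in the meantime.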

Now for the first problem – querying just the things I need. That one was a hassle.

I got a bit stuck on the previously mentioned issue of dynamically creating anonymous objects – that’s simply not possible in C# (at the moment). So, with a push from a helpful Stack Overflow member, I went in a slightly different direction. If I can’t create an anonymous type, I’ll need to use a type that’s defined. The .Select() method signature ruled out dictionaries, hash tables or anything of that sort, so I experimented with other options.

The first was to use the same type that was queried, but populate only the required fields. Something like this:

    AppContext.Set<Foo>().Select(f => new Foo { ID = f.ID });

Unfortunately that’s not possible in LINQ to Entities: it refuses to construct a mapped entity type inside a query. You can do a .Select(f => f), but you can’t do the former.

The second idea was to create a type that inherits from the queried one – same result, no go.
The third was to create a proxy that aggregated the queried type (because, as we found out, inheritance was not an option). That looked like a clean solution, but turned out to be problematic later down the road.
The final idea, the one I settled on, was to dynamically create a type that inherits from Entity and has all the properties that should be queried. I know it sounds bad, and I gave it much thought, because dynamically creating types is a slippery slope, but Entity Framework does it and NHibernate does it, so I guess I can try it as well.

Turns out that it’s not as difficult as I initially thought. One method was enough:

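    // Needs: System.Collections.Generic, System.Linq, System.Reflection, System.Reflection.Emit, System.Threading.
    // assemblyName, moduleName and suffix are presumably class-level members; they are not shown here.
    public static Type CreateTypeSearchProxyType(Type typeFrom, IEnumerable<PropertyInfo> properties)
    {
        // AssemblyBuilderAccess.Run: the assembly can be executed but not saved to disk.
        var assemblyBuilder = Thread.GetDomain().DefineDynamicAssembly(assemblyName, AssemblyBuilderAccess.Run);
        var module = assemblyBuilder.DefineDynamicModule(moduleName);

        // The proxy inherits from Entity, so members Entity already declares are skipped below.
        var typeBuilder = module.DefineType(typeFrom.Name + suffix, TypeAttributes.Public | TypeAttributes.Class, typeof(Entity));
        var entityMembers = new HashSet<string>(typeof(Entity).GetMembers().Select(m => m.Name));
        foreach (var property in properties.Where(p => !entityMembers.Contains(p.Name)))
        {
            // Every searchable property becomes a public string field with the same name.
            typeBuilder.DefineField(property.Name, typeof(string), FieldAttributes.Public);
        }

        return typeBuilder.CreateType();
    }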
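Emitting a type is not something worth repeating for the same source type, so the method is best called through some kind of guard. Below is a minimal sketch of one, assuming a ConcurrentDictionary keyed by the source type. SearchProxyTypes and GetOrCreate are hypothetical names, not part of the original code, and the actual type creation is passed in as a delegate so the sketch stays self-contained.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical guard (not the author's code): remembers which proxy type
// was generated for which source type, so each one is created only once.
public static class SearchProxyTypes
{
    private static readonly ConcurrentDictionary<Type, Lazy<Type>> Created =
        new ConcurrentDictionary<Type, Lazy<Type>>();

    // "create" would wrap the type-emitting method; Lazy<T> makes sure it
    // runs at most once per source type, even if two threads race here.
    public static Type GetOrCreate(Type typeFrom, Func<Type, Type> create)
    {
        return Created.GetOrAdd(typeFrom, t => new Lazy<Type>(() => create(t))).Value;
    }
}
```

With this in place, a call could look like SearchProxyTypes.GetOrCreate(typeof(Foo), t => CreateTypeSearchProxyType(t, searchableProperties)), where searchableProperties is whatever list of PropertyInfo the index is built from.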

Of course, before using this I make sure that I haven’t already defined the type (because who knows what will happen if I try to do it twice), and the whole thing is called only once – on the first search. With that I can create the last piece of the puzzle:

    // Needs: System.Linq, System.Linq.Expressions.
    public static IQueryable<TTo> PartialMap<TFrom, TTo>(this IQueryable<TFrom> source, params string[] members)
    {
        // Builds: x => new TTo { Member1 = x.Member1, Member2 = x.Member2, ... }
        var p = Expression.Parameter(typeof(TFrom));
        var body = Expression.MemberInit(Expression.New(typeof(TTo)),
                                         members.Select(m => (MemberBinding)Expression.Bind(typeof(TTo).GetMember(m).Single(), Expression.PropertyOrField(p, m))));

        return source.Select(Expression.Lambda<Func<TFrom, TTo>>(body, p));
    }

This extension method will help with mapping type Foo to a dynamically created type that contains all the searchable properties of type Foo. So now instead of

    var set = AppContext.Set<T>().ToList();

I can write

    var set = AppContext.Set<T>().PartialMap<T, TTo>(propertiesToQuery).ToList();

and that does a “SELECT SearchableProperty1, SearchableProperty2… FROM …” – job’s done.
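One detail worth spelling out: a type that only exists as a Type object at runtime can’t be written as a generic argument in source code, so a call like the one above presumably has to go through reflection. Here is a minimal sketch of that, run against an in-memory IQueryable instead of Entity Framework. Foo, FooSearchProxy (a hand-written stand-in for the generated type) and PartialMapTo are made up for illustration; PartialMap is repeated only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical source entity.
public class Foo
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Notes { get; set; }
}

// Hand-written stand-in for the runtime-generated type: public string fields only.
public class FooSearchProxy
{
    public string Name;
}

public static class PartialMapDemo
{
    // Same method as in the post, repeated so the sketch compiles on its own.
    public static IQueryable<TTo> PartialMap<TFrom, TTo>(this IQueryable<TFrom> source, params string[] members)
    {
        var p = Expression.Parameter(typeof(TFrom));
        var body = Expression.MemberInit(Expression.New(typeof(TTo)),
            members.Select(m => (MemberBinding)Expression.Bind(
                typeof(TTo).GetMember(m).Single(), Expression.PropertyOrField(p, m))));

        return source.Select(Expression.Lambda<Func<TFrom, TTo>>(body, p));
    }

    // Calls PartialMap when the target type is only available as a Type instance.
    public static IList<object> PartialMapTo<TFrom>(IQueryable<TFrom> source, Type targetType, params string[] members)
    {
        var method = typeof(PartialMapDemo)
            .GetMethod(nameof(PartialMap))
            .MakeGenericMethod(typeof(TFrom), targetType);

        // Static method: no instance, and the params array is passed as one argument.
        var query = (IQueryable)method.Invoke(null, new object[] { source, members });
        return query.Cast<object>().ToList();
    }
}
```

Usage would be along the lines of PartialMapDemo.PartialMapTo(foos, proxyType, "Name"), with the field values then read back from each result through reflection, since the compiler knows nothing about the generated type.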

Final words

What started as a simple search requirement turned out to be quite complex technically. On my local machine, the first search takes around 220ms. Once the types are constructed (and the cache is empty), it takes around 70ms; with a full cache, less than 5ms. That’s decent enough for me.

Additionally, to make it more usable, I added a keyboard shortcut that focuses the search input – these are getting more and more popular in web apps and there’s a reason for that.

There are a few things that could be improved here. First of all, moving the type creation to App_Start would reduce the ‘first ever’ search time. Then there’s the matter of having the Entity class as a dependency in the whole process. I did that so that generating a URL for the search item target would be easier. And it works, but sooner or later I’m going to run into a situation where I’ll want to search through something that doesn’t inherit from Entity. I already have an idea for that, but I’ll address it when the necessity arises.

All that hassle to get this (now you know what that Type property was in the SearchIndexItem – no magic, just additional data):

[Screenshot: search]

Was it worth it? I think so, but time will tell.

That’s all for the series.

Cheers