Monday, January 23, 2012

EF EntityObject Value Hashing

Another "problem" I've run into recently: how do you efficiently compare entity objects to determine whether changes have been made, when the objects are based on views constructed from multiple external tables, without hashing the entire object and its values? This one took a bit of digging and testing. Hashing each row was my solution. I started off trying to just serialize the EntityObject to XML or binary. XML was an issue because the data had to be sanitized, and it was memory heavy. Binary didn't have those issues, but the binary hashes could grow quite large, so comparing them was inefficient. Writing the initial hashes for 200,000 rows took only 15 minutes, but comparing the binary between two sets (identical, in fact) took an hour. I ended up using a custom GetValueHashCode method and an associated Hash field in a partial class of the view object. While this takes considerably longer to code and maintain, the payoffs are immense and it's more than feasible. I'll let you know how it pans out in a month or so...

Watch your arithmetic overflow! I replaced 17/23 with smaller primes, but if you're storing the hashes in a database, check your datatypes. I'd also suggest using a HashSet when comparing hashes (StackOverflow, MSDN).

public partial class myview
{
  private int _hash;
  
  public int Hash
  {
    get { return this.GetValueHashCode(); }
    set { this._hash = value; }
  }

  // Combines the value hashes of the view's fields into a single row hash.
  public int GetValueHashCode()
  {
    var hash = 2;

    unchecked // Overflow is fine, just wrap
    {
      if (this.field1 != null)
        hash = hash * 3 + this.field1.GetHashCode();
      if (this.field2 != null)
        hash = hash * 3 + this.field2.GetHashCode();
      if (this.field3 != null)
        hash = hash * 3 + this.field3.GetHashCode();
    }

    return hash;
  }
}

Source again, StackOverflow, full of peeps smarter than me :P
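
For the HashSet suggestion above, here's a rough sketch of what the comparison step could look like. The HashComparison class and FindChanged method are names I made up for illustration, and it assumes both sides have already been reduced to int row hashes (one from the database, one freshly computed). Treat it as a starting point, not the finished thing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of EF or of the view class above.
public static class HashComparison
{
    // Returns the hashes found in 'current' that are missing from 'stored',
    // i.e. rows that are new or whose values changed since the last run.
    public static HashSet<int> FindChanged(IEnumerable<int> stored, IEnumerable<int> current)
    {
        var changed = new HashSet<int>(current);
        changed.ExceptWith(stored); // in-place set difference
        return changed;
    }
}
```

An empty result means every current row hash was already stored. Keep in mind that two different rows can still produce the same int hash, so matching hashes suggest "unchanged" rather than prove it.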

Sunday, January 22, 2012

ObjectContext List of Fields/Tables

I've been digging around trying to find a way to dynamically pull a list of tables and their associated fields from an ObjectContext, and I think I just found a solution, courtesy again of StackOverflow. Here's what I came up with to solve my particular problem. It's not my best work, but it'll do. I've set it up to be added to a partial class of an existing ObjectContext, so it can be called via "context.GetTableList()" or "context.GetFieldList("TableName")".

// Needs System.Linq and the EDM metadata namespace (System.Data.Metadata.Edm in EF 4).
public List<string> GetTableList()
{
    // Entity types in the conceptual model (CSpace) stand in for tables.
    return this.MetadataWorkspace.GetItems<EntityType>(DataSpace.CSpace)
               .Select(e => e.Name)
               .ToList();
}

public List<string> GetFieldList(string table)
{
    // Find the entity type with the given name and list its property names.
    return this.MetadataWorkspace.GetItems<EntityType>(DataSpace.CSpace)
               .Where(e => e.Name == table)
               .SelectMany(e => e.Properties)
               .Select(p => p.Name)
               .ToList();
}

Source