Wednesday, June 18, 2008

Object Deep Cloning using IL in C# - version 1.0

Howdy,

If you've read the rest of my blog, you'll have noticed that I have another post that addresses cloning objects using IL (Intermediate Language).
If you haven't, you can find it under the title 'Object cloning using IL in C#'.

I published this post because the code in my previous post (mentioned above) makes a shallow copy of an object: for class-typed fields it copies only the reference, while value types are of course copied by value. So if you have a class with only value types or strings (which are immutable), the shallow copy will act as a deep copy (sort of).

Now imagine that you have (like in most projects) objects that contain other objects, like a person object that has one legal address and several optional addresses. You'll probably create a Person class and an Address class;
in the Person class you'll make a property of type 'Address' and a property that's a list of Addresses.

If you take a shallow clone and change an address in the clone, it will also be changed in the original person.Address object.

So, I've created a solution that clones the 'Person' object in this case, together with its addresses, but makes new instances of each address.
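
To make the difference concrete, here is a minimal sketch of the aliasing problem. It uses MemberwiseClone for the shallow copy rather than the IL code from this post, and the Customer, Place and AliasingDemo names are made up for the example.

```csharp
using System;

// Hypothetical types for illustration only; not the Person/Address classes from this post.
public class Place
{
    public string City;
}

public class Customer
{
    public string Name;
    public Place Home;

    // Shallow copy: the Home reference is shared with the original.
    public Customer ShallowClone()
    {
        return (Customer)MemberwiseClone();
    }

    // Deep copy: the clone gets its own Place instance.
    public Customer DeepClone()
    {
        Customer copy = (Customer)MemberwiseClone();
        if (Home != null)
        {
            copy.Home = new Place { City = Home.City };
        }
        return copy;
    }
}

public static class AliasingDemo
{
    public static void Run()
    {
        Customer original = new Customer { Name = "Ann", Home = new Place { City = "Ghent" } };

        Customer shallow = original.ShallowClone();
        shallow.Home.City = "Bruges";
        Console.WriteLine(original.Home.City); // the original changed along with the clone

        Customer deep = original.DeepClone();
        deep.Home.City = "Antwerp";
        Console.WriteLine(original.Home.City); // the original is left alone this time
    }
}
```

Calling AliasingDemo.Run() shows that only the deep clone can be edited without touching the original.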

The only constraint, however, is that the 'to-clone' objects MUST have a default constructor for instantiating the class. If the system can make an instance, the private fields will automatically be copied.
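
The default-constructor requirement can be checked up front with reflection. The following is only a sketch: the CloneGuard class, its GetDefaultConstructor method and the two sample types are hypothetical names, not part of the code in this post.

```csharp
using System;
using System.Reflection;

public static class CloneGuard
{
    // Returns the public parameterless constructor of the given type,
    // or throws a descriptive exception when there is none.
    public static ConstructorInfo GetDefaultConstructor(Type type)
    {
        ConstructorInfo ctor = type.GetConstructor(Type.EmptyTypes);
        if (ctor == null)
        {
            throw new InvalidOperationException(
                "Type " + type.FullName + " has no public default constructor and cannot be cloned this way.");
        }
        return ctor;
    }
}

// Hypothetical sample types for the check above.
public class WithDefault { }

public class WithoutDefault
{
    public WithoutDefault(int x) { }
}
```

A check like this fails with a readable message instead of passing a null ConstructorInfo on to the IL generator.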

Like I mentioned in my previous post, you'll only lose performance the first time, because of all the lookups and the generation of the IL code.
After that, the compiled code will be executed.
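
The build-once, run-many idea can be sketched separately from the IL itself. This is an illustration under assumptions, not the code from this post: DelegateCache, GetOrBuild and BuildCount are made-up names, the block uses the framework's own System.Func, and the lock is an addition of mine (the dictionaries further down are not protected against concurrent access).

```csharp
using System;
using System.Collections.Generic;

public static class DelegateCache
{
    private static readonly Dictionary<Type, Delegate> _cache = new Dictionary<Type, Delegate>();
    private static readonly object _sync = new object();

    // Counts how often the expensive build step actually ran (demo only).
    public static int BuildCount;

    // Returns the cached delegate for T, building it on the first request only.
    public static Func<T, T> GetOrBuild<T>(Func<Func<T, T>> build)
    {
        lock (_sync)
        {
            Delegate cached;
            if (!_cache.TryGetValue(typeof(T), out cached))
            {
                cached = build();   // the slow part: lookups and IL generation would go here
                BuildCount++;
                _cache.Add(typeof(T), cached);
            }
            return (Func<T, T>)cached;
        }
    }
}
```

The second request for the same type skips the build callback entirely and gets the delegate that was stored the first time.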
I tried to optimize the IL code as much as possible; maybe it can be optimized even more.

For example:

If you would clone the address of a person in normal C# code then you can do:

public Person Clone(Person p)
{
    Person clone = new Person();
    clone.Address = new Address();
    // normally you should check on null,
    // but let's assume that it's never null.
    clone.Address.ID = p.Address.ID;
    return clone;
}

or you can do:

public Person Clone(Person p)
{
    Person clone = new Person();
    Address a = new Address();
    clone.Address = a;
    // normally you should check on null,
    // but let's assume that it's never null.
    a.ID = p.Address.ID;
    return clone;
}

The performance of the second option is 'higher'.

You could ask, why?
Well, if you disassemble these two code samples, you'll notice that the second one needs fewer instructions.

First sample decompiled:

.locals init (
     [0] class Cloning.Person clone)
L_0000: nop
L_0001: newobj instance void Cloning.Person::.ctor()
L_0006: stloc.0
L_0007: ldloc.0
L_0008: newobj instance void Cloning.Address::.ctor()
L_000d: callvirt instance void Cloning.Person::set_Address(class Cloning.Address)
L_0012: nop
L_0013: ldloc.0
L_0014: callvirt instance class Cloning.Address Cloning.Person::get_Address()
L_0019: ldarg.1
L_001a: callvirt instance class Cloning.Address Cloning.Person::get_Address()
L_001f: callvirt instance int32 Cloning.Address::get_AddressID()
L_0024: callvirt instance void Cloning.Address::set_AddressID(int32)

L_0029: nop
L_002a: ldloc.0
L_002b: ret


Second sample decompiled:

.locals init (
     [0] class Cloning.Person clone,
     [1] class Cloning.Address a)
L_0000: nop
L_0001: newobj instance void Cloning.Person::.ctor()
L_0006: stloc.0
L_0007: newobj instance void Cloning.Address::.ctor()
L_000c: stloc.1
L_000d: ldloc.0
L_000e: ldloc.1
L_000f: callvirt instance void Cloning.Person::set_Address(class Cloning.Address)
L_0014: nop
L_0015: ldloc.1
L_0016: ldarg.1
L_0017: callvirt instance class Cloning.Address Cloning.Person::get_Address()
L_001c: callvirt instance int32 Cloning.Address::get_AddressID()
L_0021: callvirt instance void Cloning.Address::set_AddressID(int32)
L_0026: nop
L_0027: ldloc.0
L_0028: ret


As you can see, the first sample needs 5 callvirt instructions and more IL code overall,
while the second sample needs only 4 callvirt instructions and less code, because we have a kind of shortcut to the address object
in local store [1].

When you create an object, it is pushed onto the stack and then stored in a local store (local variable).
If you use a.ID, the Address reference is loaded straight from the local store;
if you use clone.Address.ID, you first need to look up the Address reference from the person object (also in a store), then look up the ID from the retrieved Address reference, and only then can you do something with it.
(Look at lines L_0013 to L_0024 of the first sample and lines L_0015 to L_0021 of the second sample.)

In that manner I tried to optimize all the calls in the IL code, so that it does the required job in as few instructions as possible.

Here is the result:

Now let's take a look at the code:

Our revisited Person class with an Address class:


using System;
using System.Collections.Generic;
using System.Text;

namespace Cloning
{
    public class Person
    {
        private int _id;
        private string _name;
        private string _firstName;
        private string _field1, _field2, _field3;

        public Person()
        {
            this.Addresses = new List<Address>();
        }

        public int ID
        {
            get { return _id; }
            set { _id = value; }
        }

        public string Name
        {
            get { return _name; }
            set { _name = value; }
        }

        public string FirstName
        {
            get { return _firstName; }
            set { _firstName = value; }
        }

        private Address _address;

        public Address Address
        {
            get { return _address; }
            set { _address = value; }
        }

        private List<Address> _addresses;

        public List<Address> Addresses
        {
            get { return _addresses; }
            set { _addresses = value; }
        }

    }

    
    public class Address
    {
        public Address()
        {
            this.AddressID = -1;
        }

        public Address(int aid)
        {
            this.AddressID = aid;
        }

        private int _addressID;

        public int AddressID
        {
            get { return _addressID; }
            set { _addressID = value; }
        }

        private string _street;

        public string Street
        {
            get { return _street; }
            set { _street = value; }
        }

        private string _city;

        public string City
        {
            get { return _city; }
            set { _city = value; }
        }

    }
}


The Cloning class used for testing:

using System;
using System.Collections.Generic;
using System.Text;
using System.Reflection;
using System.Reflection.Emit;
using System.Threading;
using System.Collections;

namespace Cloning
{
    /// <summary>
    /// Delegate handler that's used to compile the IL to.
    /// (This delegate is standard in .net 3.5)
    /// </summary>
    /// <typeparam name="T1">Parameter Type</typeparam>
    /// <typeparam name="TResult">Return Type</typeparam>
    /// <param name="arg1">Argument</param>
    /// <returns>Result</returns>
    public delegate TResult Func<T1, TResult>(T1 arg1);

    public class Cloning
    {
        /// <summary>
        /// This dictionary caches the delegates for each 'to-clone' type.
        /// </summary>
        private static Dictionary<Type, Delegate> _cachedIL = new Dictionary<Type, Delegate>();
        private static Dictionary<Type, Delegate> _cachedILDeep = new Dictionary<Type, Delegate>();
        private LocalBuilder _lbfTemp;

        
        /// <summary>
        /// Clone one person object with reflection
        /// </summary>
        /// <param name="p">Person to clone</param>
        /// <returns>Cloned person</returns>
        public static Person CloneObjectWithReflection(Person p)
        {
            FieldInfo[] fis = p.GetType().GetFields(System.Reflection.BindingFlags.Instance |
                System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.NonPublic);
            Person newPerson = new Person();
            foreach (FieldInfo fi in fis)
            {
                fi.SetValue(newPerson, fi.GetValue(p));
            }
            return newPerson;
        }

        
        /// <summary>
        /// Clone a person object by manually typing the copy statements.
        /// </summary>
        /// <param name="p">Object to clone</param>
        /// <returns>Cloned object</returns>
        public static Person CloneNormal(Person p)
        {
            Person newPerson = new Person();
            newPerson.ID = p.ID;
            newPerson.Name = p.Name;
            newPerson.FirstName = p.FirstName;
            newPerson.Address = new Address();
            newPerson.Address.AddressID = p.Address.AddressID;
            newPerson.Address.City = p.Address.City;
            newPerson.Address.Street = p.Address.Street;
            // Copy the list; the Address items themselves are shared with p.
            if (p.Addresses != null)
            {
                newPerson.Addresses = new List<Address>();
                foreach (Address a in p.Addresses)
                {
                    newPerson.Addresses.Add(a);
                }
            }
            return newPerson;
        }

        
        /// <summary>
        /// Generic cloning method that clones an object using IL.
        /// Only the first call of a certain type will hold back performance.
        /// After the first call, the compiled IL is executed.
        /// </summary>
        /// <typeparam name="T">Type of object to clone</typeparam>
        /// <param name="myObject">Object to clone</param>
        /// <returns>Cloned object</returns>
        public static T CloneObjectWithILShallow<T>(T myObject)
        {
            Delegate myExec = null;
            if (!_cachedIL.TryGetValue(typeof(T), out myExec))
            {
                // Create ILGenerator (both DM declarations work)
                // DynamicMethod dymMethod = new DynamicMethod("DoClone", typeof(T),
                //      new Type[] { typeof(T) }, true);
                DynamicMethod dymMethod = new DynamicMethod("DoClone", typeof(T),
                    new Type[] { typeof(T) }, Assembly.GetExecutingAssembly().ManifestModule, true);
                ConstructorInfo cInfo = myObject.GetType().GetConstructor(new Type[] { });
                ILGenerator generator = dymMethod.GetILGenerator();
                LocalBuilder lbf = generator.DeclareLocal(typeof(T));
                generator.Emit(OpCodes.Newobj, cInfo);
                generator.Emit(OpCodes.Stloc_0);
                foreach (FieldInfo field in myObject.GetType().GetFields(
                        System.Reflection.BindingFlags.Instance
                        | System.Reflection.BindingFlags.NonPublic
                        | System.Reflection.BindingFlags.Public))
                {
                    generator.Emit(OpCodes.Ldloc_0);
                    generator.Emit(OpCodes.Ldarg_0);
                    generator.Emit(OpCodes.Ldfld, field);
                    generator.Emit(OpCodes.Stfld, field);
                }
                generator.Emit(OpCodes.Ldloc_0);
                generator.Emit(OpCodes.Ret);
                myExec = dymMethod.CreateDelegate(typeof(Func<T, T>));
                _cachedIL.Add(typeof(T), myExec);
            }
            return ((Func<T, T>)myExec)(myObject);
        }

        
        public T CloneObjectWithILDeep<T>(T myObject)
        {
            Delegate myExec = null;
            if (!_cachedILDeep.TryGetValue(typeof(T), out myExec))
            {
                // Create ILGenerator (both DM declarations work)
                // DynamicMethod dymMethod = new DynamicMethod("DoClone", typeof(T),
                //      new Type[] { typeof(T) }, true);
                DynamicMethod dymMethod = new DynamicMethod("DoClone", typeof(T),
                    new Type[] { typeof(T) }, Assembly.GetExecutingAssembly().ManifestModule, true);
                ConstructorInfo cInfo = myObject.GetType().GetConstructor(new Type[] { });
                ILGenerator generator = dymMethod.GetILGenerator();
                LocalBuilder lbf = generator.DeclareLocal(typeof(T));
                generator.Emit(OpCodes.Newobj, cInfo);
                generator.Emit(OpCodes.Stloc_0);

                foreach (FieldInfo field in typeof(T).GetFields(System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Public))
                {
                    if (field.FieldType.IsValueType || field.FieldType == typeof(string))
                        CopyValueType(generator, field);
                    else if (field.FieldType.IsClass)
                        CopyReferenceType(generator, field);
                }
                generator.Emit(OpCodes.Ldloc_0);
                generator.Emit(OpCodes.Ret);
                myExec = dymMethod.CreateDelegate(typeof(Func<T, T>));
                _cachedILDeep.Add(typeof(T), myExec);
            }
            return ((Func<T, T>)myExec)(myObject);
        }

        
        private void CreateNewTempObject(ILGenerator generator, Type type)
        {
            ConstructorInfo cInfo = type.GetConstructor(new Type[] { });
            generator.Emit(OpCodes.Newobj, cInfo);
            generator.Emit(OpCodes.Stloc, _lbfTemp);
        }

        private void CopyValueType(ILGenerator generator, FieldInfo field)
        {
            generator.Emit(OpCodes.Ldloc_0);
            generator.Emit(OpCodes.Ldarg_0);
            generator.Emit(OpCodes.Ldfld, field);
            generator.Emit(OpCodes.Stfld, field);
        }

        private void CopyValueTypeTemp(ILGenerator generator, FieldInfo fieldParent, FieldInfo fieldDetail)
        {
            // Load the current temp object; it is only local 1 for the first reference field.
            generator.Emit(OpCodes.Ldloc, _lbfTemp);
            generator.Emit(OpCodes.Ldarg_0);
            generator.Emit(OpCodes.Ldfld, fieldParent);
            generator.Emit(OpCodes.Ldfld, fieldDetail);
            generator.Emit(OpCodes.Stfld, fieldDetail);
        }

        private void PlaceNewTempObjInClone(ILGenerator generator, FieldInfo field)
        {
            // Get object from custom location and store it in right field of location 0
            generator.Emit(OpCodes.Ldloc_0);
            generator.Emit(OpCodes.Ldloc, _lbfTemp);
            generator.Emit(OpCodes.Stfld, field);
        }

        
        private void CopyReferenceType(ILGenerator generator, FieldInfo field)
        {
            // We have a reference type.
            _lbfTemp = generator.DeclareLocal(field.FieldType);
            if (field.FieldType.GetInterface("IEnumerable") != null)
            {
                // We have a list type (generic).
                if (field.FieldType.IsGenericType)
                {
                    // Get argument of list type
                    Type argType = field.FieldType.GetGenericArguments()[0];
                    // Check that it has a constructor that accepts another IEnumerable.
                    Type genericType = Type.GetType("System.Collections.Generic.IEnumerable`1["
                            + argType.FullName + "]");

                    ConstructorInfo ci = field.FieldType.GetConstructor(new Type[] { genericType });
                    if (ci != null)
                    {
                        // It has! (Like the List<> class)
                        generator.Emit(OpCodes.Ldarg_0);
                        generator.Emit(OpCodes.Ldfld, field);
                        generator.Emit(OpCodes.Newobj, ci);
                        generator.Emit(OpCodes.Stloc, _lbfTemp);
                        PlaceNewTempObjInClone(generator, field);
                    }
                }
            }
            else
            {
                CreateNewTempObject(generator, field.FieldType);
                PlaceNewTempObjInClone(generator, field);
                foreach (FieldInfo fi in field.FieldType.GetFields(System.Reflection.BindingFlags.Instance
                    | System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Public))
                {
                    if (fi.FieldType.IsValueType || fi.FieldType == typeof(string))
                        CopyValueTypeTemp(generator, field, fi);
                    else if (fi.FieldType.IsClass)
                        CopyReferenceType(generator, fi);
                }
            }
        }

    }
}
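
A note on the constructor lookup in CopyReferenceType: Type.GetType with a plain string only finds types in the calling assembly or the core library unless the name is assembly-qualified, so the lookup may come back null when the list's element type lives somewhere else. Below is a minimal sketch of an alternative that builds the IEnumerable<T> type with MakeGenericType instead. The CopyConstructorLookup class and its Find method are names made up for the example; they are not part of the Cloning class above.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class CopyConstructorLookup
{
    // Finds a constructor on collectionType that accepts IEnumerable<element>,
    // such as List<T>(IEnumerable<T>). Returns null when there is none.
    public static ConstructorInfo Find(Type collectionType)
    {
        if (!collectionType.IsGenericType)
        {
            return null;
        }

        // First generic argument, e.g. Address for List<Address>.
        Type argType = collectionType.GetGenericArguments()[0];

        // Build IEnumerable<argType> from the open generic type, no string parsing involved.
        Type enumerableType = typeof(IEnumerable<>).MakeGenericType(argType);

        return collectionType.GetConstructor(new Type[] { enumerableType });
    }
}
```

For example, CopyConstructorLookup.Find(typeof(List<int>)) returns the copy constructor of List<int>, while a non-generic type such as string gives null.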


The Program class with the test code in it.

namespace Cloning
{
    class Program
    {
        static void Main(string[] args)
        {
            // Do some cloning tests...
            Cloning.TestCloning tc = new Cloning.TestCloning();
            tc.DoTest();
        }
    }
}


Explanation:

As you can see in the code, the model is expanded a little.
If you shuffle the code around a bit, you can put it all in one method again;
I only split it up to improve readability, so that you can understand what's happening and how the code is generated.

I also left out the comments from the IL parts. If you want to know what those statements mean, look at my other post regarding IL cloning (Object cloning using IL in C#), where they are explained.

Method list of cloning class with a short description:


  • Person CloneObjectWithReflection(Person p)
    This method clones an object using reflection (very slow).

  • Person CloneNormal(Person p)
    This method clones the person class manually.
    I think a lot of people will do/use this method :)


  • T CloneObjectWithILShallow<T>(T myObject)
    Make a shallow copy of an object using IL in a generic method.

  • T CloneObjectWithILDeep<T>(T myObject)
    Make a deep copy of an object using IL in a generic method.

  • void CreateNewTempObject(ILGenerator generator, Type type)
    Generate IL-code to create a new object of a certain type and store it in a local store. (local variable)

  • void CopyValueType(ILGenerator generator, FieldInfo field)
    Generate IL-code to copy the values of a value type to the clone.

  • void CopyValueTypeTemp(ILGenerator generator, FieldInfo fieldParent, FieldInfo fieldDetail)
    Generate IL-code to copy the values of a value type from a store location to the destination address in the clone.

  • void PlaceNewTempObjInClone(ILGenerator generator, FieldInfo field)
    Generate IL-code to reference an object from a store to an address in the clone.

  • void CopyReferenceType(ILGenerator generator, FieldInfo field)
    Generate IL-code to copy the values of a reference type (class) to the clone. Instantiate new objects if needed.


I didn't test LINQ support; I guess that an extra 'exception' has to be added for IQueryable or something like that. Arrays should also be added, but that's for version 2.0 or so ;). You can see how the system works and expand it to your needs...

I hope that this post and piece of code are useful for some people. If so, please let me know through a comment, thanks ;)

Regards,
F.

14 comments:

Sam said...

Cool, thanks (even though a local _lbfTemp is a bit ugly).

It is a pity it does not work for LINQ objects containing EntitySets.
It fails in CopyReferenceType, the call Type.GetType("System.Collections.Generic.IEnumerable`1["+field.FieldType.FullName+"]"); returns null, crashing the next statement :(

I'll try to make this work, but I'm not sure if I understand it enough to make it work...

thanks for the leg-up!

Whizzo said...

Hi Sam,

I just saw a little flaw in my code, so I've corrected it. It was in CopyReferenceType; the
Type.GetType("System.Collections.Generic.IEnumerable`1["+field.FieldType.FullName+"]"); call doesn't return null anymore... (normally)

This code is a draft version and I have to clean it, so yeah, the class var is maybe a little dirty, but it's effective at the moment :)
It tracks my local store declarations...

The 'List<>' objects become new list objects, but the contained objects are shallow copies. That means you can change the population of the lists in the cloned objects without changing the original, but changing the contained objects themselves will also change the originals....

I'll take a look at the LINQ stuff, but my environment doesn't let me at this moment.

Regards
F.

Sam said...

For my example, it still returns null.

As you said, enumerations do not yet work, and this:
field.FieldType = {Name = "EntitySet`1" FullName = "System.Data.Linq.EntitySet`1[[Bsoft.Libs.WcfInterface.LibsZugGruppenBenutzer, Bsoft.Libs.WcfInterface, Version=1.4.12.1856, Culture=neutral, PublicKeyToken=fef7aea2559add41]]"}
is an enumeration.

I really appreciate your work, and have full understanding of your limited time, so I'm glad for your fast and helpful responses!

If it would help you, I could post a LINQ entity class source that is throwing this error, so you don't have to whip up one yourself - just say, and I'll build a really small example for you.

Whizzo said...

Hi Sam,

It would be really helpful if you could send an example that has your problem... please make it work standalone...

You can mail it to whizzo#at# telenet#dot#be

Cheers!

Whizzo said...

EntitySet<TEntity> is indeed an enumeration, but it doesn't have a constructor with an IEnumerable<T> parameter, so that's why the ConstructorInfo is null.

Another approach is needed for EntitySet, if it's clonable, but I will look into it when I get your code...

tip: maybe try to add an extension method to EntitySet that accepts an IEnumerable<T> as parameter; you can't inherit because it's sealed.

Regards
F.

Sam said...

I've sent you an example project at your other address - holler if you need more!

Alex said...

Hy,

I have just found your blog post, searching for a way to clone LINQ to SQL Entity Classes.
The problem is ... my EntitySet<> are copied and I don't want this behavior :). I have tried both CloneObjectWithILShallow and CloneObjectWithIL from the original post, they both copy the EntitySet for the classes. The workaround I found is to add something like this

if (!field.FieldType.FullName.Contains("EntitySet"))


inside the foreach, but I am not sure this would be correct, even if it seems to work.

So, does it now work and the post is old:) or have I made some sort of mistake?

Thanks.

Whizzo said...

Hi Alex,

This is indeed an 'older' post, I have one that's more up to date (version 1.1 on this blog)

Your solution will skip the cloning of any field whose type name contains 'EntitySet'. In your situation it will work, but be sure this is what you want.

I have to take a look at a 'cleaner' solution for ignoring stuff, like an array or dictionary used for ignoring fields of a 'sealed' type that's within the .NET library.

Anonymous said...

Very nice articles with a good explanation of the code. Keep up the good work!

Greetings from Germany,
Daniel

Frode Nesbakken said...

Cool article, but I think your code has several problems. What if I have a linked list of objects, such as:

class MyClass
{
private MyClass NextInstance;
....
}

The code will fail because it recursively calls CopyReferenceType. I managed to do about 4000 linked objects before it crashed...

Now, it shouldn't be too difficult to detect and handle these situations, but I think the code has a long way to go before it is anywhere near "general".

Another problem is with object trees, where several objects might be referenced more than once. In these cases, the MSIL code will create a new object where it should only reference an already existing object.

Another problem is that it specifically REQUIRES all objects in the object graph to have a default constructor with no arguments.

But all in all, its a good start!

cialis said...

Interesting article, added his blog to Favorites

Host MAX said...

Great article :)
However it won't work for classes which use interfaces/abstract classes as fields.

And you can probably replace
Type.GetType("System.Collections.Generic.IEnumerable`1["+field.FieldType.FullName+"]")
with
Type genericType = typeof(IEnumerable<>).MakeGenericType(argType);

China Echofool said...

Cool,thanks for sharing.
I have a question to ask.
in this case:
public class MyClass:List<string>
{

}
var myStrings=new MyClass();
myStrings.Add("a");
myStrings.Add("b");
myStrings.Add("c");
var newStrings= CloneHelper.Clone(myStrings);
//newStrings.Count==0.

i found that
typeof(T).GetFields(BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public))
return an empty array.
how to fix this problem?
i'm confused with IL code.
email:1127597642@qq.com.
from:china

said...

Hello,

Thank you for this nice code. When I try to use it, I am facing a problem with this line:
generator.Emit(OpCodes.Newobj, cInfo);

It does not accept a null value for cInfo which is unfortunately my case. I don't have access to the definition of the "Original" class as I am developing an addin for a commercial software.

I wanted to have a local copy of the instance because they don't keep it between the different calls to the class that I write.

Thank you for any help,
Cédric