Unsuccessful article about speeding up reflection

Let me explain the title. The original plan was to give solid, reliable advice on speeding up the use of reflection, on a simple but realistic example. During benchmarking, however, it turned out that reflection is not as slow as I thought, and that LINQ is slower than I had dreamed in my worst nightmares. And in the end it turned out that I had also made a mistake in the measurements... The details of this life story are under the cut and in the comments. Since the example is quite everyday, implemented more or less the way it usually is in an enterprise, it became, it seems to me, an interesting demonstration of real life: the impact of the article's main subject on speed was drowned out by the external logic: Moq, Autofac, EF Core, and the other "bindings".

I started this work under the impression of the following article: Why is Reflection slow

As you can see, the author suggests using compiled delegates instead of calling reflection APIs directly, as a great way to speed up an application considerably. There is also IL emission, of course, but I wanted to avoid it: it is the most labor-intensive way to accomplish the task, and the one most fraught with errors.

Since I had always held a similar opinion about the speed of reflection, I did not intend to seriously question the author's conclusions.

It is not uncommon for me to see naive use of reflection in the enterprise. A type is obtained, the property info is fetched, the SetValue method is called, the value lands in the target field, and everyone is happy. Very intelligent people, seniors and team leads, write their own extension methods on object, basing "universal" mappers from one type to another on this naive implementation. The essence is as follows: take all the fields, take all the properties, iterate over them, and if the member names match, call SetValue. From time to time we hit a miss, where some property does not exist in one of the types, but even here there is a "way out" that preserves performance: try/catch.
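The kind of "universal" mapper described above can be sketched roughly like this (the types and names here are illustrative, not code from any of the projects mentioned):

```csharp
using System;
using System.Linq;

// Hypothetical demo entities; the names are not from the article's repository.
public class PersonSource
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public class PersonTarget
{
    public string Name { get; set; }
}

// A sketch of the naive mapper: match members by name, call SetValue
// through reflection, and paper over the misses with try/catch.
public static class NaiveMapper
{
    public static TTarget Map<TSource, TTarget>(TSource source) where TTarget : new()
    {
        var target = new TTarget();
        var sourceProps = typeof(TSource).GetProperties();
        foreach (var targetProp in typeof(TTarget).GetProperties())
        {
            try
            {
                // First() throws if the source has no property with this name.
                var sourceProp = sourceProps.First(p => p.Name == targetProp.Name);
                targetProp.SetValue(target, sourceProp.GetValue(source));
            }
            catch
            {
                // A "miss" is silently swallowed, exactly as the text describes.
            }
        }
        return target;
    }
}
```

Every call pays the full reflection price again, and every miss pays for a thrown and caught exception on top.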

I've seen people reinvent parsers and mappers without having fully studied how the wheels invented before them actually work. I've seen people hide their naive implementations behind strategies, behind interfaces, behind injections, as if that excused the bacchanalia that followed. I turned up my nose at such implementations. In truth, I never measured the real performance leak; where possible, I simply swapped the implementation for a more "optimal" one, if I got around to it. That is why the first measurements, discussed below, seriously confused me.

I think many of you, reading Richter or other ideologues, have come across the entirely fair statement that reflection in code is a phenomenon with an extremely negative effect on application performance.

A reflection call forces the CLR to traverse the loaded assemblies looking for the right one, pull up its metadata, parse it, and so on. In addition, reflection over sequences results in a large amount of memory allocation. We burn through memory, the CLR wakes up the GC, and the freezes begin. It should be noticeably slow, trust me. The huge amounts of memory on modern production servers or cloud machines do not save you from high processing delays. In fact, the more memory there is, the more likely you are to NOTICE how the GC works. Reflection is, in theory, an extra red rag for it.

Nevertheless, we all use both IoC containers and data mappers whose operation is also based on reflection, and there are usually no questions about their performance. No, it is not because dependency injection and abstraction from external bounded-context models are such necessary things that we have to sacrifice performance regardless. It is simpler than that: it really does not affect performance much.

The fact is that the most common frameworks built on reflection use all sorts of tricks to work with it more optimally. Usually it is a cache. Usually it is Expressions, and delegates compiled from the expression tree. AutoMapper, for example, keeps a concurrent dictionary under the hood, matching type pairs with functions that can convert one into the other without calling reflection.
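The caching trick can be sketched in miniature like this (this is an illustration of the idea, not AutoMapper's actual internals; all names are assumed): the expensive reflection and compilation happen once per type pair, and every later call is just a dictionary lookup plus a delegate invocation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical demo entities.
public class OrderSource { public string Id { get; set; } public decimal Total { get; set; } }
public class OrderDto    { public string Id { get; set; } public decimal Total { get; set; } }

public static class CachedMapper
{
    // Type pair -> compiled converter. Built once, reused forever.
    private static readonly ConcurrentDictionary<(Type, Type), Delegate> _cache = new();

    public static TTarget Map<TSource, TTarget>(TSource source) where TTarget : new()
    {
        var func = (Func<TSource, TTarget>)_cache.GetOrAdd(
            (typeof(TSource), typeof(TTarget)),
            _ => BuildMapper<TSource, TTarget>());
        return func(source);
    }

    private static Func<TSource, TTarget> BuildMapper<TSource, TTarget>() where TTarget : new()
    {
        var source = Expression.Parameter(typeof(TSource), "source");
        // Builds: source => new TTarget { Prop = source.Prop, ... }
        // for every property whose name and type match.
        var bindings =
            from targetProp in typeof(TTarget).GetProperties()
            let sourceProp = typeof(TSource).GetProperty(targetProp.Name)
            where sourceProp != null && targetProp.PropertyType == sourceProp.PropertyType
            select (MemberBinding)Expression.Bind(targetProp, Expression.Property(source, sourceProp));
        var body = Expression.MemberInit(Expression.New(typeof(TTarget)), bindings);
        return Expression.Lambda<Func<TSource, TTarget>>(body, source).Compile();
    }
}
```

The first Map call for a type pair is slow (reflection plus Compile), just like a method's first JIT compilation; subsequent calls run at the speed of a hand-written assignment.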

How is this achieved? In essence, it is no different from the logic the platform itself uses to generate JIT code. On a method's first call, it is compiled (and yes, this process is not fast); on subsequent calls, control is transferred to the already compiled method, with no particular performance loss.

In our case, we can likewise "JIT-compile" once and then use the compiled behavior with the same performance as its AOT counterparts. Expressions come to our aid here.

Briefly, the principle in question can be summarized as follows:
cache the final result of the reflection work as a delegate containing a compiled function. It also makes sense to cache all the necessary type-information objects in fields of your worker type, outside the objects being processed.

There is logic in this. Common sense tells us that if something can be compiled and cached, then it should be done.

Looking ahead, it should be said that caching pays off when working with reflection even if you do not use the proposed method of compiling expressions. Here I am simply repeating the theses of the author of the article I referred to above.

Now for the code. Let's look at an example based on a recent pain I had to deal with in serious production at a serious lending institution. All entities are fictitious, so that no one can guess.

There is an entity. Let it be Contact. There are letters with a standardized body, from which a parser and a hydrator create these very contacts. A letter arrives, we read it, parse it into key-value pairs, create a contact, and save it to the database.

It's elementary. Say a contact has the FirstName, Age, and ContactPhone properties. This data is transmitted in the letter. The business also wants support staff to be able to quickly add new keys for mapping entity properties to pairs in the body of the letter: in case someone makes a mistake in the template, or in case just before a release we urgently need to start accepting mail from a new partner in a new format. Then we can add a new mapping correlation as a cheap datafix. In other words, a real-life example.
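The real parser lives in the repository linked at the end of the article; as a rough idea of the first stage, here is a hypothetical minimal stand-in that assumes the letter body carries one "Key: Value" pair per line (the Parse method and its line format are my assumptions, only the PropertyToValueCorrelation name comes from the article's code):

```csharp
using System;
using System.Linq;

// The pair type used by the article's hydrators.
public class PropertyToValueCorrelation
{
    public string PropertyName { get; set; }
    public string Value { get; set; }
}

// Hypothetical minimal parser: one "Key: Value" pair per line.
public static class LetterParser
{
    public static PropertyToValueCorrelation[] Parse(string body) =>
        body.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Split(new[] { ':' }, 2))   // split on the first ':' only
            .Where(parts => parts.Length == 2)
            .Select(parts => new PropertyToValueCorrelation
            {
                PropertyName = parts[0].Trim(),
                Value = parts[1].Trim()
            })
            .ToArray();
}
```

The support-editable mapping keys then only have to connect these PropertyName strings to entity properties.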

We implement, create tests. Works.

I will not quote all the code here: there is a lot of it, and it is available on GitHub at the link at the end of the article. You can download it, torture it beyond recognition, and measure it for your own case. I will only show the two template methods that distinguish the hydrator that was supposed to be fast from the hydrator that was supposed to be slow.

The logic is as follows: the template method receives the pairs generated by the parser's basic logic. The LINQ layer is the parser plus the hydrator's underlying logic, which queries the db context and matches keys against the parser's pairs (there are non-LINQ versions of these functions for comparison). Next, the pairs are passed to the main hydration method, and the pairs' values are set on the corresponding properties of the entity.

"Fast" (Fast prefix in benchmarks):

        protected override Contact GetContact(PropertyToValueCorrelation[] correlations)
        {
            var contact = new Contact();
            foreach (var setterMapItem in _proprtySettersMap)
            {
                var correlation = correlations.FirstOrDefault(x => x.PropertyName == setterMapItem.Key);
                setterMapItem.Value(contact, correlation?.Value);
            }
            return contact;
        }

As you can see, a static collection of property setters is used: compiled lambdas that call the entity's setters. They are created by the following code:

        static FastContactHydrator()
        {
            var type = typeof(Contact);
            foreach (var property in type.GetProperties())
            {
                _proprtySettersMap[property.Name] = GetSetterAction(property);
            }
        }

        private static Action<Contact, string> GetSetterAction(PropertyInfo property)
        {
            var setterInfo = property.GetSetMethod();
            var paramValueOriginal = Expression.Parameter(property.PropertyType, "value");
            var paramEntity = Expression.Parameter(typeof(Contact), "entity");
            var setterExp = Expression.Call(paramEntity, setterInfo, paramValueOriginal).Reduce();

            var lambda = (Expression<Action<Contact, string>>)Expression.Lambda(setterExp, paramEntity, paramValueOriginal);

            return lambda.Compile();
        }

In general, it is clear enough: we walk over the properties, create delegates that call their setters, and store them. Then we call them when needed.

"Slow" (Prefix Slow in benchmarks):

        protected override Contact GetContact(PropertyToValueCorrelation[] correlations)
        {
            var contact = new Contact();
            foreach (var property in _properties)
            {
                var correlation = correlations.FirstOrDefault(x => x.PropertyName == property.Name);
                if (correlation?.Value == null)
                    continue;

                property.SetValue(contact, correlation.Value);
            }
            return contact;
        }

Here we simply walk over the properties and call SetValue directly.
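Note that both variants above scan the correlations array with FirstOrDefault for every property. A sketch of how that per-property scan could be avoided by indexing the pairs into a dictionary first (this is my illustration, not code from the repository; the Contact shape here is a guess based on the article):

```csharp
using System;
using System.Linq;
using System.Reflection;

// Pair type as used by the article's hydrators.
public class PropertyToValueCorrelation
{
    public string PropertyName { get; set; }
    public string Value { get; set; }
}

// Hypothetical entity with string properties, matching the Action<Contact, string> setters.
public class Contact
{
    public string FirstName { get; set; }
    public string Age { get; set; }
}

public static class DictionaryHydrator
{
    private static readonly PropertyInfo[] _properties = typeof(Contact).GetProperties();

    public static Contact GetContact(PropertyToValueCorrelation[] correlations)
    {
        // Index once: O(1) lookups instead of a FirstOrDefault scan per property.
        var byName = correlations.ToDictionary(x => x.PropertyName, x => x.Value);
        var contact = new Contact();
        foreach (var property in _properties)
        {
            if (byName.TryGetValue(property.Name, out var value) && value != null)
                property.SetValue(contact, value);
        }
        return contact;
    }
}
```

With a handful of properties the difference is small, but the linear scans are exactly the kind of LINQ-shaped cost the benchmarks below end up highlighting.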

For clarity and as a reference point, I implemented a naive method that writes the values of the correlation pairs directly into the entity's fields. Its prefix is Manual.

Now we take BenchmarkDotNet and examine the performance. And suddenly... (spoiler: this is not the correct result; details below)

[Image: benchmark results of the first (erroneous) run]

What do we see here? The methods triumphantly carrying the Fast prefix turn out slower in almost all passes than the methods with the Slow prefix, in both allocation and speed. Meanwhile, a beautiful and elegant implementation of the mapping, using the LINQ methods intended for exactly this, on the contrary drags performance down badly: by orders of magnitude. The trend does not change with the number of passes; only the scale differs. With LINQ everything is 4 to 200 times slower, and garbage is produced on roughly the same scale.

UPDATED

I could not believe my eyes, but more importantly, a colleague of ours, Dmitry Tikhonov 0x1000000, believed neither my eyes nor my code. Having rechecked my solution, he brilliantly found and pointed out an error that, due to a number of changes in the implementation, I had missed from start to finish. After fixing the bug he found in the Moq setup, all the results fell into place. The retest shows the main trend unchanged: LINQ still affects performance more than reflection does. However, it is nice to see that the work of compiling Expressions is not in vain: the result is visible in both allocation and execution time. The first run, when the static fields are initialized, is naturally slower for the "fast" method, but after that the situation reverses.

Here is the retest result:

[Image: benchmark results after the fix]

Conclusion: when using reflection in the enterprise, there is usually no particular need to resort to tricks; LINQ will gobble up performance far more. But in heavily loaded methods that require optimization, you can preserve reflection in the form of initializers and delegate compilers, which will then provide the "fast" logic. This way you retain both the flexibility of reflection and the speed of the application.

The benchmark code is available here. Anyone who wants to can double-check my words:
HabraReflectionTests

PS: the tests use IoC, while the benchmarks use explicit construction. In the final implementation I cut out all the factors that could affect performance and add noise to the result.

PPS: Thanks to Dmitry Tikhonov @0x1000000 for discovering my error in the Moq setup, which skewed the first measurements. If any reader has enough karma, please give him a plus. The man stopped, read, double-checked, and pointed out the mistake. That deserves respect and sympathy.

PPPS: thanks to the meticulous reader who dug into the style and formatting. I am for uniformity and convenience. The diplomacy of my presentation leaves much to be desired, but I have taken the criticism on board, so keep firing.

Source: habr.com
