Arthur Khachuyan: artificial intelligence in marketing

Arthur Khachuyan is a well-known Russian specialist in big data processing and the founder of Social Data Hub (now Tazeros Global). He is a partner of the National Research University Higher School of Economics, and together with it he prepared and presented a bill on Big Data in the Federation Council. He has spoken at the Curie Institute in Paris, St. Petersburg State University, the Federal University under the Government of the Russian Federation, Red Apple, International OpenDataDay, RIW 2016 and AlfaFuturePeople.

The lecture was recorded at the open-air festival “Geek Picnic” in Moscow in 2019.

Arthur Khachuyan (hereinafter AH): – If you had to pick, out of a huge number of industries, from medicine to construction, the one where big data, machine learning and deep learning are used most often, it would probably be marketing. For the last three years or so, everything that surrounds us in advertising communications has been tied to data analysis and to what can be called artificial intelligence. So today I will tell you about this, starting from very far back...

If you imagine artificial intelligence and what it looks like, it is probably something like this. This strange picture is one of the neural networks I wrote a year ago to find out how what my dog does, how many times she needs to go big or small, depends on how much she eats. It is a joke about how artificial intelligence could be imagined.

But still, let's think about how it all works in advertising communications. There are three ways in which modern algorithms in advertising and marketing interact with us. The first is to obtain and extract additional knowledge about you and me, and then use it for good and not so good purposes; the second is to personalize the approach to each specific person; and the third, naturally, is to create demand so that the person performs the main target action and a sale is made.

Using technology, they are trying to solve the problem of effective communication

If I ask you what Pornhub and “M.Video” have in common, what comes to mind?

Comments from the audience (hereinafter referred to as C): - TV, audience.

AH: – My idea is that these are two places where people come for a certain type of service, or let's call it a certain type of goods. And this audience is distinctive in that it does not want to tell the seller anything. It wants to come in and get what interests it, in some explicit or implicit form. Naturally, nobody who comes to “M.Video” wants to talk to salespeople, figure anything out, or answer any of their questions.

The first story follows from all this: technologies for obtaining additional knowledge appeared so that the person would not have to be questioned directly.

We all love it when we call the bank and the bank tells us: “Hello, Alexey, you are our VIP client. A super manager will talk to you now.” You come to this bank, and there really is a dedicated manager who can talk to you. Unfortunately or fortunately, no company has yet figured out how to hire a thousand personal managers for a thousand clients; and since most of these people are now online, the task is to understand what kind of person this is and how to communicate with him correctly before he even arrives at some advertising resource. That is why technologies have appeared that try to solve this problem.

Data extraction is the new oil

Let's imagine that you are the owner of a flower stall. Three people come to you. The first one stands there for a very long time, hesitates, tries to talk to you, picks up a bouquet; you go to wrap it, step out to do something, and he runs away from the stall with the bouquet. You have lost your three thousand rubles. Why did it happen? You don't know anything about this person: you don't know his arrest record at the Ministry of Internal Affairs, you don't know that he is a kleptomaniac and is registered at a psychiatric dispensary. Why? Because you are seeing him for the first time, and you are not a behavioral analyst.

Someone else comes... Vitaly. Vitaly also takes a very long time to make up his mind; he says, “Well, I need this and that.” And you tell him, “Flowers for mom, right?” And you sell him a bouquet.

The idea here is to gather enough data to understand what the person actually needs. Everyone immediately thought about some kind of advertising networks and so on...

Everyone has probably heard the stupid phrase “data is the new oil” more than once? Surely everyone has. In fact, people learned to collect data quite a long time ago, but extracting knowledge from this data is the task that artificial intelligence in marketing, or some kind of statistical algorithm, is now trying to solve. Why? Because if you talk to a person, he can give you a right, a wrong, or a somehow colored answer. There is a joke I tell my students about how surveys differ from statistics. I'll tell it to you as an anecdote:

So, in two villages they decided to conduct a study of the average length of manhood. In the first village, Villaribo, the average length was 15 centimeters; in the village of Villabaggio, 25. Do you know why? Because in the first village they took measurements, and in the second they ran a survey.

The porn industry is the flagship of recommendation systems

This is why the modern approach is to analyze all people without exception; even if it covers a little less than 100% of them, these are people you don't need to ask and don't need to watch in person. It is enough to analyze what is now called a digital footprint to understand what this person needs, how to speak to him correctly, and how to create demand around him. On the one hand, this is a soulless machine; but, as you and I know very well, we don't want to communicate with the people at “M.Video”, and even more so, when we go to resources like Pornhub, we want to get exactly what we need.

Why do I always talk about Pornhub? Because the adult industry is the first to adopt such technologies and such data analysis. If you take the three most popular libraries in this area (for example, TensorFlow, or Pandas for Python, for processing CSV files, and so on) and open them on GitHub, then after briefly googling the names you find there you will come across a couple of people who either worked or currently work at Pornhub and were the first to implement recommendation systems there. In general, this story is very advanced, and it shows how far this company has moved ahead.

Three levels of identification

A huge set of identifiable data surrounds every person. I usually divide it, somewhat formally, into three levels, going deeper and deeper. Naturally, a company has its own data.

If, say, we are talking about building a recommendation system, then the first level is the data held by the store itself (purchase history, all kinds of transactions, how a person interacted with the interface).

The next level, and relatively the largest, is what is called open sources. Don't think that I am encouraging you to scrape social networks, but what is available in open sources really does open up a huge set of data that you can learn about a person.

And the third major part is the environment of the person himself. Yes, there is an opinion that if a person is not on social networks, there is no data about him there (you probably already know that this is not true), but the most important thing is that the data in a person's profile (or in some application) is only 40% of the knowledge that can be obtained about him. The rest of the information is obtained from his environment. The phrase “tell me who your friend is and I will tell you who you are” takes on a new meaning in the 21st century, because a huge amount of data can be obtained from the people around that person.

Coming closer to advertising communications: receiving an advertising message not from an ad but from a friend, an acquaintance or some otherwise trusted person is a very cool feature that a lot of marketers use. Some application suddenly gives you a free promo code, you make a post about it and thereby attract a new audience. In fact, this promo code from, say, “Yandex.Taxi” was not handed out at random: a huge amount of data was analyzed about your potential to attract a new audience and interact with it.

They even analyze the behavior of TV series characters

I will show you three pictures, and you tell me what the difference is between them.

This one:

This:

And this one:

What's the difference between them? Everything is simple here. As in quantum mechanics, in this case the creative was shaped by the observer. That is, within the same advertising campaign, run by the same brand at the same time, the only difference is who was looking at the creative. Personally, when I go to Amediateka, they still show me Khal Drogo. I don't know what Amediateka thinks about my preferences, but for some reason this happens.

What is now called personalized communications is the most popular way of attracting an audience and interacting with it properly. If at the first stage we identified people using our own brand data, open source data and, for example, data from the person's environment, then after analyzing him we can understand who he is, how to talk to him correctly and, most importantly, in what language to talk to him.

Here technology has gone so far that the characters in the TV series people watch are now being analyzed. That is, you like certain TV series; they look at those likes and at which characters you interacted with, in order to understand what kind of persona would be suitable for communicating with you. It sounds like complete nonsense, but just for fun, try it on one of these resources: different people see different creatives, chosen so that each person interacts with them.

No modern media outlet or video resource simply shows you some news. When you open a media site, a huge number of algorithms are loaded that identify you, analyze all your previous activity, query a mathematical model and only then show you something. That is the strange story here.

How are needs determined? Psychometrics. Physiognomy

There are many real approaches to determining a person's actual needs and how to communicate with him correctly. Everything is solved differently, and it is impossible to say which approach is good and which is bad. Everyone seems to know the main ones.

Psychometrics. After the story with Cambridge Analytica it took, in my opinion, a rather shocking turn, because every second political campaign now comes and says: “Oh, can you do for me what was done for Trump? I also want to win, and so on.” In fact, this is of course nonsense for our realities, for political elections, for example. But to determine psychotypes, three models are used:

  • the first is based on the content you consume - the words you write, the information you like, videos, etc.;
  • the second is tied to how you interact with the web interface, how you type, which buttons you press - indeed, there are entire companies that, based on keystroke dynamics (“keyboard handwriting”), can quite reliably determine what is now called psychotypes.

I'm not much of a psychologist and I don't really understand how it works, but from the point of view of advertising communications, audiences divided into these segments work very well, because one person needs to be shown a red screen with a blue woman, another a dark blue background with some kind of abstraction, and it works very cool. It works at such a low level that a person doesn't even think about it.

What is the main problem in the advertising market now? Everyone is a spy, everyone is hiding, everyone has a million browser extensions installed so as not to be identified in any way - you probably have Adblock, Ghostery and all sorts of applications that block tracking. Because of this, it is very difficult to understand anything about a person. And technology has moved on: you need to know not only that this person has returned to your site for the 125th time, but also that he is such and such a strange person.

Physiognomy is a very controversial science. It's not even considered a science. This is a group of people who used to program lie detectors for some Ministry of Internal Affairs and are now engaged in what is called the personalization of creatives. The approach here is very simple: several of your public photographs are taken from social networks, and three-dimensional geometry is built from them. If you are a lawyer, you will now say that this is a face and personal data; but I'll tell you that these are 300 thousand points located in space, and this is not a face and not personal data. This is what everyone usually says when Roskomnadzor comes to them.

But seriously, your face on its own, if your first and last name are not attached to it, is not your personal data. The point is that the guys mark out various facial features that influence how a person makes decisions and how to interact with him correctly. In some advertising segments this works poorly; in others it works very well. In the end, it turns out that when you go to some resource, you see not one banner that is shown to everyone, but, for example... it is now normal to make 16 or 20 variants for different audiences, and it works very cool. Yes, this is even sadder from the consumer's point of view, because people are being manipulated more and more. But nevertheless, from a business point of view it works very well.

The black box of machine learning

This gives rise to the following problem with such technologies: for most developers, what is now called deep learning is a “black box”. If you've ever been immersed in this story and talked to developers, they always say: “Oh, listen, we coded something so incomprehensible there, and we don't know how it works.” Perhaps this has happened to some of you.

This is actually far from true. What is now called machine learning is far from a “black box”. There are a huge number of approaches to describing the input and output data, and in the end a company can thoroughly understand on the basis of which features the machine decided to show you this pornographic video or that one. The point is that no company ever discloses this, because, firstly, it is a trade secret, and secondly, it would reveal a huge amount of data that you didn't even know was being collected.

For example, just before this, in a discussion on ethics, we talked about how social networks analyze personal messages in order to tag people for advertising purposes. If you write something to someone, you receive a specific tag based on it for subsequent advertising communications. You will never prove it, and there is probably no point in trying. However, if companies did disclose such patterns, you would find them there. It turns out that the market for building such recommender systems pretends not to know why this happened.
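
As a minimal sketch of the idea that a model's decision can be traced back to its input features, consider a toy linear click model. The feature names, weights and bias here are hypothetical and chosen only for illustration; they do not come from the lecture or from any real system.

```python
import math

# Hypothetical weights of a toy linear click model (assumed, not from a real system).
WEIGHTS = {
    "viewed_category_before": 1.2,
    "minutes_on_site": 0.05,
    "came_from_social_network": 0.4,
}
BIAS = -2.0


def explain_score(features):
    """Return the click probability and each feature's contribution to the logit."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    # Sort so that the signals that moved the score the most come first.
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return probability, ranked


visitor = {"viewed_category_before": 1, "minutes_on_site": 12, "came_from_social_network": 0}
probability, ranked = explain_score(visitor)
for name, contribution in ranked:
    print(f"{name}: {contribution:+.2f}")
```

For a linear model the per-feature contributions add up exactly to the score, which is why this kind of breakdown is straightforward; deeper models need approximate attribution methods instead.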

People don't want to know what is known about them

And the second story is that the client never wants to know why he received this particular ad, this particular product. I'll tell you a story. My first experience of commercially deploying a recommendation system based on such algorithms, done precisely for the sake of research, was in 2015 at a very large chain of sex shops (yes, also not a particularly unpleasant story).

Customers were offered the following: they come in, log in with their social network account, and after about 5 seconds they get a store completely personalized for them, that is, all the products have changed, fall into a certain category, and so on. Do you know how much the conversion rate of this store increased? Not at all! People came in and immediately ran away. They came in and realized that they were being offered exactly what they had been thinking about...

The problem with this test was that under each product it was written why you were offered that particular one (“because you are a member of the hidden group “Powerful woman is looking for a man who is a doormat”). That is why modern recommendation systems never show the data on which the “prediction” was based.

The media are a very popular example, because they all use similar recommender systems. Previously, the algorithms were very simple: you look at the “Politics” category, and they show you news from the “Politics” category. Now everything is so complicated that they analyze where you paused the mouse, which words you concentrated on, what you copied, how you interacted with the page in general. Then the system analyzes the vocabulary of the articles themselves: aha, you are not just reading news about Putin, but news written in a certain way, with a certain emotional coloring. And when a person receives some piece of news, he doesn't even think about how he got there. Nevertheless, he then interacts with this content.

All this, naturally, is aimed at retaining the poor, unfortunate little man who is already going crazy from the huge amount of information around him. Here it must be said that it would be nice to use such systems yourself, to personalize the content around you and collect information, but, unfortunately, there are no such services yet.
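
The “simple” approach described above, where a reader of one category is shown more of the same category, can be sketched roughly as follows. The catalog, the article ids and the function name are invented for illustration and are not taken from any real media site.

```python
from collections import Counter

# Hypothetical catalog, invented for illustration only.
CATALOG = [
    {"id": 1, "category": "Politics"},
    {"id": 2, "category": "Sports"},
    {"id": 3, "category": "Politics"},
    {"id": 4, "category": "Politics"},
    {"id": 5, "category": "Culture"},
]


def recommend_by_category(read_ids, catalog, limit=3):
    """Suggest unread articles from the category the reader opened most often."""
    by_id = {article["id"]: article for article in catalog}
    counts = Counter(by_id[i]["category"] for i in read_ids if i in by_id)
    if not counts:
        return []  # nothing is known about the reader yet
    favorite = counts.most_common(1)[0][0]
    seen = set(read_ids)
    unread = [a["id"] for a in catalog if a["category"] == favorite and a["id"] not in seen]
    return unread[:limit]


suggestions = recommend_by_category([1, 2, 3], CATALOG)
print(suggestions)
```

The modern systems the speaker describes replace the single “favorite category” signal with many behavioral signals (mouse pauses, copied text, vocabulary of the articles), but the overall shape, history in and ranked items out, stays the same.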

Artificial intelligence catches the client in the air and creates demand

And here a very interesting philosophical question arises, as we move from building a recommendation system to creating demand. Rarely does anyone think about it, but when you ask some hypothetical Instagram, “Why are you collecting data? Why not show me absolutely random advertising?”, Instagram will tell you: “Friend, this is all done to show you exactly what is interesting to you.” As in, we want to know you so precisely that we can show you exactly what you are looking for.

But technology has long since crossed this terrible threshold, and such technologies no longer predict what you need. They (attention!) create demand. This is probably the scariest thing about artificial intelligence in such communications. The scary thing is that it has been used almost everywhere for the last 3-5 years, from Google search results to Yandex search results, to some systems... Okay, I won't say anything bad about Yandex; or anything good.

What's the point? Such advertising communications long ago moved away from the strategy where you write “I want to buy a child seat” and see a hundred thousand million publications. They moved on to the following: as soon as a woman posts a photo with a barely visible belly, her husband immediately begins to be followed by messages: “Man, the birth is coming soon. Buy a child seat.”

Here you might reasonably ask why, with such gigantic advances in technology, we still see such shitty advertising on social networks. The problem is that in this market everything is still decided by money, so at some point an advertiser like Coca-Cola may come and say: “Here's 20 million for you - show my shitty banners to the entire Internet.” And they really will do it.

But if you create a clean account and test how accurately such algorithms guess you, you will see that they first try to guess what you want, and then start offering you things in advance. And the human brain works in such a way that, when it receives information it considers reliable, it does not even register why it received this information. The first rule for determining that you are in a dream is to work out how you got there. A person never remembers the moment he ended up in a certain room. It's the same here.

Google May Begin to Shape Your Worldview

Such studies were carried out by several foreign companies that do eye tracking. They installed devices on special computers that record where the test subject's eyes are looking. They took five to seven thousand volunteers who simply scrolled the feed and interacted with social networks and advertising, and recorded which parts of the banners and creatives these people's eyes stopped on.

And it turns out that when people receive such hyper-personalized creative, they don't even think about it: they immediately go on and start interacting with it. From a business point of view this is good, but from our point of view as users it is not very cool, because what is there to fear? That at some point a hypothetical “Google” may begin (or, of course, may not begin) to shape your worldview. Tomorrow, for example, it could start showing people news that the earth is flat.

Jokes aside, they have been caught many times serving certain information to certain people during elections. We are all used to thinking that a search engine returns everything honestly. But, as I always say, if you really want to know how the world works, write your own search engine, without filters, without paying attention to copyright, without ranking some of your friends in the search results. The real picture of data on the Internet is generally different from what Google, Yandex, Bing and so on show. Some materials are hidden because of friends, colleagues, enemies or someone else (or a former lover with whom you slept) - it doesn't matter.

How Trump won

During the last election in the United States, a very simple study was conducted. They took the same queries in different places, from different IP addresses, from different cities; different people googled the same thing. Roughly speaking, the query was along the lines of: who will win the election? And amazingly, the results were arranged in such a way that in those states where the largest number of people intended to vote for the “wrong” candidate, they received good news about the candidate that Google promoted. Which one? Well, it's clear which one: the one who became president. This is an absolutely unprovable story, and all these studies are pure guesswork. Google can say: “Guys, all this is done so that we show the most relevant content for you.”

From now on, you should know that what is called “most relevant” is nothing of the sort. A company calls relevant whatever it needs to sell you, for good or bad reasons.

Those who do not have money now are already being prepared for future purchases

There is another interesting point here that I will tell you about. A huge share of the active audience on social networks and in apps is now young people. Let's call them young people without purchasing power: children of 8-9 who play moronic games, and 12-14-year-olds who are just registering on social networks. Why would huge companies spend huge budgets and resources to create applications for a non-paying audience that is never monetized? Because at the moment this audience becomes able to pay, there will be enough data about it to predict its behavior very well.

Now ask any targeting specialist: what is the most difficult audience? They will say: the high-income one. Because selling, for example, an apartment worth 150 million rubles through social networks is almost impossible. There are isolated cases when you run some advertising for 10 thousand people, one of them buys the apartment, and for the client that is a success... But one in ten thousand, from a statistical point of view, is complete crap. So why is it difficult to identify a high-income audience? Because the people who now make up the high-income audience were born when the Internet was still very small, when no one knew Artemy Lebedev yet, and there is no information about them. It is impossible to predict their behavior patterns, impossible to understand who their opinion leaders are and from which sources they get their content.

So when you all become billionaires in 25 years, the companies that are going to sell you something will have a huge amount of data about you. That's why we now have the wonderful GDPR in Europe, which restricts the collection of data from minors.

Naturally, this doesn’t work at all in practice, since all the children still play on their mother’s and father’s accounts - this is how information is collected. Next time you give your child a tablet, think about this.

This is not some scary dystopian future where everyone dies in a war with machines; it is an absolutely real story happening now. There are a huge number of companies creating algorithms for psychological profiling of people based on how they play games. A very interesting industry. Based on all this, people are then segmented in order to communicate with them in some way.

Prediction of these people's behavior will be available in 10-15 years, precisely at the moment when they become a paying audience. Most importantly, these people have already given permission in advance to process their personal data, transfer it to third parties, and all the rest of it.

Who will lose their job?

And my last story: everyone always asks what will happen in 50 years, whether we will all die, whether marketers will be unemployed... There are marketers here who are worried about unemployment, right? In general, there is no need to worry, because no highly qualified person will lose his job.

No matter what algorithms are created, no matter how close the machine gets to what we have here (points to his head), however quickly it develops, such people will never be left without work, because someone will have to make these creatives. Yes, there are all kinds of GANs that draw pictures resembling people and create music, but it's still unlikely that people in this area will ever lose their jobs.

That's all from me, so please ask questions if you have any. Thank you.

Moderator: – Friends, we are now moving on to the “Question and Answer” block. Raise your hand and I will come over to you.

Question from the audience (C): – A question about the “black box”. You said that it is possible to understand specifically why such and such a result was obtained for such and such a user. Are there algorithms for this, or does each model have to be analyzed every time ad hoc (author's note: “especially for this” - a Latin phraseological unit)? Or are there ready-made tools for certain kinds of neural network that can, roughly speaking, extract the business meaning?

AH: – Here you need to understand the following: there are a huge number of tasks in machine learning. For example, there is the task of regression. For regression, no neural networks are needed at all. Everything is simple: you have several indicators, and you need to calculate the next one. There are tasks where it is necessary to resort to deep learning. Indeed, in deep learning it is difficult to understand reliably which weights were assigned to which neurons, but legally all you need is to understand what data went in and how it played out at the output. Legally this is enough to patent such a solution, and it is enough to understand on what basis the decision was made.
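
To illustrate the remark that regression needs no neural network, here is a minimal sketch of ordinary least squares for one input variable, using only the standard library. The data points are made up, and the interpretation of them as ad spend and sales is an assumption for the example, not something stated in the lecture.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up observations: e.g. ad spend (x) against sales (y).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
next_value = slope * 5.0 + intercept  # predict the next indicator
print(slope, intercept, next_value)
```

Because the fitted model is just a slope and an intercept, it is fully transparent: anyone can read off how much each unit of input changes the output.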

It’s not like you went to the site and were shown some kind of banner because you took a photo with red hair on Instagram two months ago. If the developer does not include the collection of this data and the marking of hair color in this model, then it will not come out of nowhere.

How to sell the results of machine learning systems?

C: – My question is really about how to explain it, how to sell it to someone who doesn't understand machine learning. I want to be able to say: my model clearly leads from hair color to... well, hair color changes... Is this possible or not?

AH: – Perhaps, yes. But from a sales point of view, only one scheme will work: you have an advertising campaign, we replace the audience with the one generated by the machine, and you simply see the result. This, unfortunately, is the only way to reliably convince the customer that such a thing works, because there are a lot of solutions on the market that were once implemented and did not work.
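
The “replace the audience and look at the result” scheme is essentially an A/B comparison. As a rough sketch, one could compare the conversion rates of the original and the machine-generated audience with a two-proportion z-test. All the numbers below are invented for illustration, and the function name is hypothetical.

```python
import math


def compare_audiences(control_conv, control_size, test_conv, test_size):
    """Return the relative lift and the z-score of a two-proportion z-test."""
    p_control = control_conv / control_size
    p_test = test_conv / test_size
    lift = (p_test - p_control) / p_control
    # Pooled conversion rate under the assumption that both audiences behave the same.
    pooled = (control_conv + test_conv) / (control_size + test_size)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_size + 1 / test_size))
    z_score = (p_test - p_control) / std_err
    return lift, z_score


# Invented numbers: 100 purchases from 10,000 people in the original audience,
# 150 purchases from 10,000 people in the machine-generated audience.
lift, z_score = compare_audiences(100, 10_000, 150, 10_000)
print(f"lift: {lift:.0%}, z-score: {z_score:.2f}")
```

A z-score above roughly 1.96 is conventionally read as a difference unlikely to be due to chance at the 5% level, which gives the customer something more convincing than a single pair of percentages.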

About creating a virtual personality

C: – Hello. Thanks for the lecture. The question is: what chance does a person who for some reason does not want to be led by machine learning have of creating a virtual personality radically different from his own, through the way he interacts with the interface or by some other means?

AH: – There are a bunch of different plugins that deal specifically with randomizing behavior. There is a cool thing, Ghostery, which, in my opinion, almost completely hides you from a bunch of different trackers, which then cannot record this information. But in fact, right now all you need is a closed profile on social networks so that no one, no evil scrapers, can collect anything there. It's probably best to install some kind of extension or write something yourself.

You see, the idea here is that legally, for example, personal data means data by which you can be identified, and the law gives as examples your home address, age, and so on. Nowadays there is a countless amount of data by which you can be identified: the same keyboard handwriting, the same key presses, the digital fingerprint of the browser... Sooner or later, a person makes a mistake. He can be sitting somewhere in a cafe using Tor, but in the end, at some point, either he will forget to turn on the VPN, or something else will happen, and at that moment he can be identified. So the easiest way is to make a private account and install some extension.

The market is moving towards the point where you only need to press one button to get results.

W: - Thanks for the story. As always, always very interesting (I'm following you). The question is: what progress is there in terms of creating systems that are positive for users, recommendation systems? You said that at one time you were working on a recommendation system for finding a sexual partner, a friend in life (or music that a person could potentially like)... How promising is all this, and how do you see its development from the point of view of creating systems that people need?

OH: – In general, the market is moving to the point where people need to press one button and immediately get what they need. As for my experience in creating dating applications (by the way, we will relaunch it at the end of the year), in addition to the fact that 65% were married men, the most difficult recommendation problem was that a person was offered several models at the start of the application - “ Friendship", "Sex", "Sex Friendship" and "Business". People didn't choose what they needed. Men came and chose “Love,” but in reality they threw nudity at everyone, and so on.

The problem was to identify a person who does not fit the model he chose and somehow smoothly move him in the other direction. With so little data, it is very hard to tell whether this is an error of the forecasting algorithm or whether the person is simply in the wrong category. It is the same with music: there are now very few really decent algorithms that can “forecast” music well. Maybe Yandex.Music. Some people think the Yandex.Music algorithm is bad. I, for example, like it. I personally don’t like the YouTube music algorithm, and so on.

There are, of course, some subtleties: everything is tied to licenses... But in reality, the demand for such systems is quite high. At one time there was a well-known company, Retail Rocket, which implemented recommendation systems, but now it is somehow not doing very well, apparently because they did not develop their algorithms for a long time. Everything is heading to the point where we go in and, without pressing anything, get what we need (and become completely stupid, because our ability to choose has completely disappeared).

Influence marketing

W: - Hello. My name is Konstantin. I would like to raise a question about influence marketing. Do you know any systems that allow a business to select a suitable blogger for the business based on some statistical data and so on? And on what grounds is this done?

AH: – Yes. I’ll start from afar and say right away that the problem with all these technologies is that artificial intelligence in marketing is now like a tightrope walker. On one side there are large companies that have a lot of money, and for them everything will work effectively anyway, because their advertising campaigns are aimed simply at views. On the other side there are a lot of small businesses for which this will not work, because they have too little data. So far, the applicability of these stories is somewhere in the middle.

That is, where there are already decent budgets and the task is to spend them properly (and, as a rule, there is already quite a lot of data)... I know a couple of services, something like Getblogger, which seem to have such algorithms. To be honest, I have not studied these algorithms. I can tell you what approach we use to find opinion leaders when we need, say, to give a gift to some mothers.

We use a metric called Content Distribution Time. It works like this: you take the person whose audience you are analyzing and systematically (for example, once every 5 minutes) collect information on each post: who liked it, who commented on it, and so on. This way, you can understand at what point in time each person in the audience interacted with the content. You repeat this operation for every member of that audience, and then the average content distribution time can be used, for example, to color a large network graph of these people and to build clusters.
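
A minimal sketch of how the per-person average behind such a metric could be computed in Python. This is not the speaker's implementation: the record layout (`post_id`, `user_id`, `published_at`, `seen_at`) and the function name `mean_reaction_delay` are assumptions made for illustration, and `seen_at` is taken to be the time of the polling snapshot in which an interaction first showed up.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_reaction_delay(posts, interactions):
    """Average delay, in seconds, between a post and each user's first reaction to it."""
    published = {post["post_id"]: post["published_at"] for post in posts}

    # Periodic polling records the same like or comment many times,
    # so keep only the earliest snapshot per (user, post) pair.
    first_seen = {}
    for event in interactions:
        key = (event["user_id"], event["post_id"])
        if key not in first_seen or event["seen_at"] < first_seen[key]:
            first_seen[key] = event["seen_at"]

    delays = defaultdict(list)
    for (user_id, post_id), seen_at in first_seen.items():
        delays[user_id].append((seen_at - published[post_id]).total_seconds())

    return {user_id: mean(values) for user_id, values in delays.items()}


# Hypothetical data: two posts and a few polled interactions.
posts = [
    {"post_id": "p1", "published_at": datetime(2019, 6, 1, 12, 0)},
    {"post_id": "p2", "published_at": datetime(2019, 6, 1, 13, 0)},
]
interactions = [
    {"user_id": "alice", "post_id": "p1", "seen_at": datetime(2019, 6, 1, 12, 5)},
    {"user_id": "alice", "post_id": "p1", "seen_at": datetime(2019, 6, 1, 12, 10)},
    {"user_id": "alice", "post_id": "p2", "seen_at": datetime(2019, 6, 1, 13, 15)},
    {"user_id": "bob", "post_id": "p1", "seen_at": datetime(2019, 6, 1, 12, 30)},
]
delays = mean_reaction_delay(posts, interactions)
print(delays)
```

The resulting per-user averages are the kind of value that could then be attached to graph nodes as a color or used as a clustering feature.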

This works quite well if we want, for example, to find 15 mothers who shape public opinion on some site like woman.ru. But it is a rather complex technical implementation (although, purely theoretically, it can be done in Python). The bottom line is that the problem with influence marketing in large advertising agencies is that they want big, cool, expensive bloggers, who do not work at all. Say a car brand wants to sell some product through an opinion leader: a car blogger should be their last choice, because his audience has either already bought a car, or knows exactly what kind of car they want, or just sits and looks at cool cars. Here it is important not to skip the analysis of the person’s own audience.

Marketing bots

W: – Tell me, how much do bots on social networks affect the collection of information and its quality?

AH: – Bots are an interesting thing. Cheap bots are quite easy to identify: they either post the same content, or are friends with each other, or sit in the same network. There are also approaches to dealing with complex bots. Or are you asking about how to link a person to his fake account?

W: – How good will the output information be with all this garbage in it?

AH: – Here it works this way: because there is a huge amount of data (for example, for some kind of marketing research), all this riffraff can simply be thrown out. That is, it is better to throw out a few more real people than to keep bots, because showing bots any advertising is useless. Likewise, if you collect metrics such as interactions with banners or recommendation systems, such accounts can be thrown out.
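
The cheap-bot signals mentioned above (identical content, accounts befriending each other) could be combined roughly as in the sketch below. It is only an illustration under stated assumptions: the input layout, the function name `flag_cheap_bots` and the thresholds `min_copies` and `friend_share` are hypothetical, and the rule deliberately over-flags, in line with the idea that losing a few real people is cheaper than keeping bots.

```python
from collections import defaultdict


def flag_cheap_bots(accounts, min_copies=3, friend_share=0.8):
    """Return the set of account ids that look like cheap bots.

    accounts maps an account id to {"posts": set of texts, "friends": set of ids}.
    """
    # Signal 1: the very same text published by several different accounts.
    authors_by_text = defaultdict(set)
    for account_id, data in accounts.items():
        for text in data["posts"]:
            authors_by_text[text].add(account_id)

    suspects = set()
    for authors in authors_by_text.values():
        if len(authors) >= min_copies:
            suspects |= authors

    # Signal 2: accounts whose friend list consists mostly of suspects.
    flagged = set(suspects)
    for account_id, data in accounts.items():
        friends = data["friends"]
        if friends and len(friends & suspects) / len(friends) >= friend_share:
            flagged.add(account_id)
    return flagged


# Hypothetical sample: three copy-paste bots, one helper bot, two real people.
accounts = {
    "b1": {"posts": {"Buy now!"}, "friends": {"b2", "b3"}},
    "b2": {"posts": {"Buy now!"}, "friends": {"b1", "b3"}},
    "b3": {"posts": {"Buy now!"}, "friends": {"b1", "b2"}},
    "b4": {"posts": {"hello"}, "friends": {"b1", "b2", "b3"}},
    "anna": {"posts": {"My cat"}, "friends": {"oleg", "b1"}},
    "oleg": {"posts": {"Road trip"}, "friends": {"anna"}},
}
flagged = flag_cheap_bots(accounts)
print(sorted(flagged))
```

Complex bots would not be caught by rules this simple; the sketch covers only the cheap case.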

Right now, about six percent of accounts on social networks are virtual characters, abandoned pages, or simply introverts, whom algorithms classify as bots. As for linking a person to his fake account, here, too, everything comes down to the fact that the person will sooner or later make a mistake, and that the behavior model is the same for both his real account and his fake one. Sooner or later they will watch the same content or do something else alike.

Here it all comes down not to the percentage of error, but to the amount of time needed to reliably identify a person. For someone who lives on Instagram, reliable identification takes about five minutes. For others, it takes six to eight months.

To whom and how to sell data?

W: – Hello. I’m interested in how data is sold between companies. For example, I have an application in which I, as the developer, can see where a person goes, which stores he visits, and how much money he spends there. I would like to know how I can, let’s say, sell data about my audience to these stores, or put my data into one huge database and get paid for it.

AH: – As for selling data directly to someone, the OFDs (fiscal data operators) got ahead of you and everyone else: they cunningly wedged themselves in between the transfer of receipts and the Tax Service and are now trying to sell data to everyone. They effectively crashed the entire mobile analytics market. What you can do is embed, for example, the Facebook pixel (its DMP system) in your application and then use this audience for sales. Or, for example, the myTarget pixel. I just don’t know what kind of audience you have; that needs to be understood first. But in any case, you can integrate with either Yandex or myTarget, which are the largest DMP systems.

This is quite an interesting story. The only problem is that you will give them all the traffic, and they, as exchanges, will take upon themselves the monetization of this traffic. They may or may not tell you that 10 people have used your audience. Therefore, either you build your own advertising network, or you surrender to large DMPs.

Who will win - the artist or the techie?

W: – A question a little removed from the technical part. You mentioned marketers’ fears of coming mass unemployment. Is there some kind of competition between creative marketing (the guys who came up with the chicken advertising, the Volkswagen advertising, it seems) and those involved in Big Data (who say: now we’ll just collect all the data and deliver targeted advertising to everyone)? As someone directly involved, who do you think will win: the artist, the techie, or will there be some kind of synergy?

AH: – Listen, they work together. Engineers don’t come up with the creative. Those who make the creative don’t pick the audience. This is a multidisciplinary story. The real problems now are for those who sit and press buttons, for those who do the “monkey job”, pressing the same thing every day. These are the people who will disappear.

Those who analyze the data will naturally remain, because someone has to process this data. Someone will have to come up with these pictures and draw them. A machine can’t come up with such a creative! This is complete madness! Or take, for example, the viral advertising of Carprice, which, by the way, worked very well. Remember, there was this one on YouTube: “Sell it at Carprice”, absolutely crazy. Of course, no neural network will generate such a story.

In general, I believe that people will not lose their jobs; rather, they will have a little more free time and will be able to spend it on self-education.

Primitive advertising will die out

W: – By and large, the advertising that is shown, the banners, do not even contain proper selling copy: “You need windows – take them!”, “You need something else – take it!”. That is, there is no creativity there at all.

AH: – Such advertising will die out, of course, sooner or later. It will die out not so much because technology develops, but because you and I develop.

It is better to mix the relevant with the irrelevant

W: – I’m here! I have a question about the experiment that you said didn’t work out for you (with the recommender system). In your opinion, was the problem that the recommendations were labeled with the reason they were recommended, or that everything the user saw seemed relevant to them? I read about an experiment with mothers, from a time when there was not yet that much data, including data from the Internet: there was just data from a grocery retailer that predicted pregnancy (that the customers would become mothers). When they showed a selection of products for expectant mothers, the mothers were horrified that the retailer had found out about them before anything became official. And it didn’t work. To solve this problem, they deliberately mixed relevant products with something completely irrelevant.

AH: – We deliberately showed people the basis on which the recommendations were made, in order to get their feedback. Actually, this is where the idea was born that people should not be told that these are some super-relevant products for them.

Yes, by the way, there is an approach of mixing them with irrelevant ones. But it cuts the other way too: sometimes people come in and interact with this irrelevant product, random outliers appear, models break and things get even more complicated. But this really does exist. Moreover, many companies, if they know that someone is processing their data (someone could scrape such output from them), sometimes deliberately mix things in so that they can later prove that you took the data from their recommendation system and not from, say, Yandex.Market.
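
A minimal sketch of the mixing approach, assuming the recommender already returns a list of relevant items and that a full catalog is available to draw decoys from. The function name `mix_recommendations` and the parameters `n_noise` and `seed` are hypothetical.

```python
import random


def mix_recommendations(relevant, catalog, n_noise=2, seed=None):
    """Blend relevant items with a few random irrelevant ones and shuffle the result."""
    rng = random.Random(seed)
    relevant_set = set(relevant)

    # Decoys come only from catalog items the model did not recommend.
    pool = [item for item in catalog if item not in relevant_set]
    decoys = rng.sample(pool, min(n_noise, len(pool)))

    mixed = list(relevant) + decoys
    rng.shuffle(mixed)  # avoid an obvious "relevant first, decoys last" layout
    return mixed


# Hypothetical catalog and model output.
catalog = ["stroller", "diapers", "crib", "lawn mower", "wine glasses", "drill", "tent"]
relevant = ["stroller", "diapers", "crib"]
mixed = mix_recommendations(relevant, catalog, n_noise=2, seed=7)
print(mixed)
```

Logging which items were decoys would make it possible to exclude interactions with them from training, which addresses the outlier problem described above.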

Ad blockers and browser security

W: – Hello. You mentioned Ghostery and Adblock. Can you tell us how effective such blockers are in general (perhaps based on statistics)? And have you had any orders from companies along the lines of “make sure that our advertising cannot be closed by Adblock”?

AH: – We do not work with advertising platforms directly, precisely so that they do not ask us to make their advertising visible to everyone. I personally use Ghostery; I think it’s a very cool extension. Now all browsers are fighting for privacy: Mozilla has released a bunch of updates, Google Chrome is now super-secure. They all block everything they can. Safari has even disabled the gyroscope by default.

And this trend, of course, is good (not for those who collect data, although they found a way around it too). First people blocked cookies. Then everyone who owned advertising networks remembered such a wonderful technology as browser fingerprints: algorithms that take 60 different parameters (screen resolution, browser version, installed fonts) and compute a unique “ID” from them. So they switched to that. And browsers began to fight this as well. In general, this will be an endless battle of the titans.
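
The idea of turning a set of browser parameters into a single “ID” can be illustrated as follows. This is a sketch under assumptions, not how any particular advertising network does it: real fingerprinting collects the parameters inside the browser, while here they are simply passed in as a dictionary, and the function name `browser_fingerprint` and the parameter names are hypothetical.

```python
import hashlib
import json


def browser_fingerprint(params):
    """Hash a dictionary of browser parameters into a stable hexadecimal ID."""
    # Sorting keys makes the ID independent of the order in which
    # the parameters happened to be collected.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical parameters; a real fingerprint would use many more.
first_visit = {
    "screen": "1920x1080",
    "user_agent": "ExampleBrowser/1.0",
    "fonts": ["Arial", "Verdana"],
}
second_visit = {
    "fonts": ["Arial", "Verdana"],
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
}
other_device = dict(first_visit, fonts=["Arial", "Verdana", "Comic Sans MS"])

id_first = browser_fingerprint(first_visit)
id_second = browser_fingerprint(second_visit)
id_other = browser_fingerprint(other_device)
print(id_first == id_second, id_first == id_other)
```

No cookie is involved: the same parameters give the same ID on every visit, which is why blocking cookies alone does not stop this kind of tracking.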

The latest developer build of Mozilla is quite secure. It stores virtually no cookies and gives them a short lifetime. And if you turn on “Incognito”, no one will find you at all. The catch is that it will be inconvenient to enter passwords in every service.

Where does psychotyping and physiognomy work and not work?

W: – Arthur, thank you very much for the lecture. I also enjoy following your lectures on YouTube. You mentioned that marketers are increasingly resorting to using psychotyping and physiognomy. My question is: what brand categories does this work in? My belief is that this is only suitable for FMCG. For example, choosing a car is...

AH: – I can list where exactly it works. This works in all sorts of stories like “Amediateka”: TV series, films and so on. This works well in banks and banking products, if it is not the premium segment but all sorts of student cards, installment plans, those kinds of things. This really works very well in FMCG and all sorts of iPhones, chargers, all this crap. This works well in “mom and pop” products. Although I know that in fishing (there is such a topic)... There have been cases with fishermen several times: they can never be reliably segmented. I don’t know why. Some kind of statistical error.

This doesn’t work well with motorists, with jewelry, or with some household items. In fact, it doesn’t work well with things that people would never write about on social media; that is how you can check. Take, say, the purchase of a washing machine: how do you tell who has a washing machine and who doesn’t? It seems like everyone has one. You can use OFD data: see from the receipts who bought what and match these people by their receipts. But there are things that you would never talk about on Instagram, for example, and such things are difficult to work with.

Machines recognize tricks as statistical stuffing.

W: – I have a question about targeting. Is it possible to have (or do they perhaps already exist) a kind of random character who contradicts himself in everything: first he Googles “the best gyms”, and then he Googles “10 ways to do nothing”? And so on in everything. Can targeting keep track of someone who contradicts himself?

AH: – The only question here is this: if you have been using Google for 2 years, have told it everything you can about yourself, and now install a plugin that issues such random queries, then, of course, the statistics will show that what you are doing now is a statistical outlier, and all of it will be filtered out. If you want, register a new account, but the volume of advertising will not change. It will just get weird. Although it is weird already.

Source: habr.com
