
Friday, June 12, 2015

Watching the video "Improving Ranking Effectiveness in Web Search"

This is the second lecture of the course "Web search using information about user behavior" (lecture and tests at NOU INTUIT): identifying user intent, using click and navigation data in the ranking scheme. It sounds nice, but here the author examines a model that builds a vector space for selecting pages "similar" to exemplary ones... There is more mathematics than heuristics (I did not finish watching the lecture). Below are the video and the gibberish auto-generated transcript from YouTube.

Lecture plan

  1. Explicit Feedback in IR
     - Query expansion
     - User control
  2. From Clicks to Relevance
  3. Rich Behavior Models
     - Browsing
     - Sessions/Context Information
     - Eye tracking, mouse movement
Information seeking process (Marchionini 1989)

- Query formulation
- Action (query)
- Review results
- Refine query

It is worth spelling out the motives for this behavior here: we often do not really know what we are looking for, and even when we do, it is hard to formulate the query right away, because one has to use the right set of words (vocabulary). Only after that can queries be refined and replicated.

Assumptions

Rather than reweighting the query in a vector space, if a user has told us some relevant and nonrelevant documents, then we can proceed to build a classifier. One way of doing this is with a Naive Bayes probabilistic model.
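A minimal sketch of the classifier idea, under stated assumptions: documents are plain sets of terms, the function name `term_log_odds` and the 0.5 smoothing constant are hypothetical choices made for illustration, and the toy documents are invented. It estimates, for each term, how likely it is to occur in judged-relevant versus judged-nonrelevant documents and turns that into a log-odds weight.

```python
import math


def term_log_odds(relevant_docs, nonrelevant_docs, smoothing=0.5):
    """Return {term: log-odds weight} from documents the user has judged.

    Each document is a set of terms. For every term we estimate
    p = P(term present | relevant) and u = P(term present | nonrelevant)
    with additive smoothing, then score it by log(p(1-u) / (u(1-p))).
    """
    vocabulary = set().union(*relevant_docs, *nonrelevant_docs)
    n_rel, n_non = len(relevant_docs), len(nonrelevant_docs)
    weights = {}
    for term in vocabulary:
        rel_count = sum(term in doc for doc in relevant_docs)
        non_count = sum(term in doc for doc in nonrelevant_docs)
        p = (rel_count + smoothing) / (n_rel + 2 * smoothing)
        u = (non_count + smoothing) / (n_non + 2 * smoothing)
        weights[term] = math.log(p * (1 - u) / (u * (1 - p)))
    return weights


# Toy judgments (invented for illustration only).
relevant = [{"satellite", "launch"}, {"satellite", "space"}]
nonrelevant = [{"space", "hotel"}, {"hotel", "booking"}]
weights = term_log_odds(relevant, nonrelevant)
```

Terms with a positive weight (here "satellite") are evidence for relevance, terms with a negative weight (here "hotel") are evidence against it, and a term that is equally frequent in both sets (here "space") gets a weight of zero.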

- A1: User has sufficient knowledge for a reasonable initial query
- A2: Relevance prototypes are "well-behaved"

Appendix (auto-generated YouTube captions, largely gibberish)

0:15
welcome to the second
0:17
lecture on user behavior and interaction
0:25
okay, so today we're going to talk about
0:28
three things: one, explicit feedback in IR,
0:32
which includes query expansion, and I want to talk a little bit about
0:36
user control; two, I'm going to talk about clicks
0:40
as well; and three, I'm going to talk about
0:43
richer behavior models that include browsing, session information,
0:47
eye tracking and possibly mouse tracking, okay
0:52
so to start,
0:55
to remind you, we were looking at this information seeking
0:59
framework where we have an information need
1:03
that's formulated as a query, and that in turn
1:07
is submitted to a search engine; the user interacts
1:11
with the results, reformulates the query, and this process continues
1:15
and specifically we're going to focus on this part: review results, refine query
1:19
and this is relevance feedback, the topic
1:23
of at least the first part of this lecture. so why do you care?
1:28
well, for one thing, remember from the previous lecture, sometimes you don't know how to
1:34
phrase the query: you can recognize it, but you can't describe it
1:38
in fact, query formulation can be extremely difficult
1:41
and it's okay, again as we discussed, to start with a simple query
1:45
and then start to iterate
1:51
through the results, right. so
1:54
first, this is in some sense a recap of the last lecture: query formulation is
1:58
difficult, and people like to start with a simple query and then iterate
2:02
and of course it helps to discover whatever it is you might not know about
2:06
primarily, the premise is that this is to boost recall,
2:10
sort of a "find me more documents like this", you know,
2:13
feature
2:14
so there are three kinds, that's a good question: there's explicit feedback,
2:18
where users mark relevant documents; there's implicit feedback, where the system tries to infer user
2:23
intentions based on behavior; and then there's blind feedback, right
2:27
so we're not going to talk about the last
2:30
kind today, so we'll focus on these two. so here's an example
2:35
I'm given that they have some site throughout the world
2:39
as fears work to you know what's going on the swine flu in
2:43
in Russia and some a
2:46
site that is great result that a drink with you doing spoil
2:50
blue Russian answer at all Russian food also
2:54
Midwest a drink whiskey puppet so I'll
2:57
this well service a curious and there's you know I want to know there's any
3:01
other documents about this I clicked
3:02
on the civil right that's a former relevance feedback and the
3:06
right near the story about the the shift and i actually turns out that the
3:11
documents have nothing to do with
3:13
both swine flu or so I mean they have to do is life in that's about it
3:18
so so here's an example residents gather and weaken
3:22
so think work goes on underneath and why this results have
3:25
almost nothing to do with my initial court where um
3:29
and know something to think about try to guess what Scott up
3:32
alright so why did it so helping to use
3:36
um so here's a general sort of big assumption
3:39
that there is some optimal query or
3:42
said about the lockers and the goal is to try to move my query the user query
3:47
closer to this article great so how does it work
3:51
we're going to use that information to the quarry and use query to retrieve
3:54
Minnesota
3:55
doctors from of course this is a somewhat there is some overlap
3:59
if you lecture from yesterday arm from database search I think it
4:03
it's a good thing um and the you know the guy few
4:06
probably get a slightly different perspective today so
4:09
um so what exactly do with it back I'll be going to
4:12
we can for example with terms are relevant documents a good add additional
4:16
terms or
4:17
for example you could remove terms for my career you know this is somewhat
4:20
hidden from user
4:21
this by default way secure is a a
4:24
relevance feedback and pictures that have the classic formulation of this
4:28
we have initial Corey writers like triangle for initial core
4:32
where on the axes ardor not relevant documents
4:36
those other relevant documents and my goal is to move
4:39
my I'll curry closer to
4:42
you know the group opposed the the relevant documents and away from
4:46
you know bosnia and away from the non relentless K
4:50
simple thanks I start your I want to somehow get the right
4:53
because this chorus just disclosed the relevant you know to
4:56
um you know three non-relevant the relevant and then there's a bunch of
4:59
sort of farm number given documents that are outside of the initial
5:03
circle think of you know that the space motto cassettes murdered
5:06
Rep I so that's the you know big big idea
5:10
so the classic algorithm is called Rocchio, and I think it's been mentioned
5:14
yesterday as well. it is stated as follows: you start with the initial
5:18
query and modify it in the following way
5:23
and by the way, let me just suggest that you don't
5:27
look at the handouts right now, because it's probably counterproductive:
5:31
the slides don't match exactly what you see here, so
5:35
I suggest the handouts are best used after the fact, to review, right,
5:38
instead of
5:39
during the lecture. you can make notes, but it's better to
5:42
save the handouts for later
5:45
so again, this is the initial query
5:50
and we're going to try to add terms from relevant documents to it. so
5:55
think of it this way: this is a vector space representation, there's my
5:58
query vector, and I'm going to add terms that are
6:02
in the documents that are marked relevant, specifically their centroid, right:
6:06
take
6:07
all the vectors of the relevant documents
6:10
and find the centroid, dividing by the size of the set of
6:14
relevant documents, and subtract the weights of the terms
6:18
from the non-relevant documents: again, take the non-relevant documents, compute
6:22
the centroid
6:23
and so we have weights: alpha for the original query, beta for the
6:28
relevant, gamma for the non-relevant, right. so those are the weights that are
6:32
usually chosen [values unclear in the captions]
6:35
so the result, of course, is that the new query q_m,
6:38
the modified one, will hopefully be closer to the centroid
6:42
of the relevant documents and will be further from the
6:46
centroid of the non-relevant ones, and will also
6:49
of course stay some distance, so it doesn't jump too far, from the original query
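The update described here is the standard Rocchio formula: the modified query is alpha times the original query, plus beta times the centroid of the relevant documents, minus gamma times the centroid of the non-relevant documents. A minimal sketch follows, assuming sparse vectors stored as `{term: weight}` dictionaries. The function name, the default weights (1, 0.5, 0.25) and the toy vectors are illustrative assumptions, not the lecturer's code or data.

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.5, gamma=0.25):
    """Return the modified query vector as {term: weight}.

    query: {term: weight}; relevant / nonrelevant: lists of such dicts.
    """
    def centroid(docs):
        # Average the term weights over all documents in the set.
        if not docs:
            return {}
        total = {}
        for doc in docs:
            for term, weight in doc.items():
                total[term] = total.get(term, 0.0) + weight
        return {term: weight / len(docs) for term, weight in total.items()}

    rel_c = centroid(relevant)
    non_c = centroid(nonrelevant)
    terms = set(query) | set(rel_c) | set(non_c)
    return {
        term: alpha * query.get(term, 0.0)
        + beta * rel_c.get(term, 0.0)
        - gamma * non_c.get(term, 0.0)
        for term in terms
    }


# Toy vectors (invented for illustration only).
query = {"space": 1.0, "launch": 1.0}
relevant = [{"space": 2.0, "satellite": 4.0}, {"launch": 2.0, "satellite": 2.0}]
nonrelevant = [{"space": 4.0, "hotel": 4.0}]
new_query = rocchio(query, relevant, nonrelevant)
```

In this toy run "satellite" enters the query with a positive weight although it was not in the original query, and "hotel" ends up with a negative weight, which mirrors the behavior the lecture describes.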
6:53
okay, so here's an example
6:58
well, here's an original query vector
7:01
and we're going to, for example, do this, right:
7:04
its weight, alpha, is one
7:07
then the centroid of the relevant documents has this
7:11
vector,
7:12
and with beta, its weight, at 0.5 we're going to add it
7:16
and the centroid of the non-relevant ones, the negative feedback, say gamma is 0.25:
7:20
we're going to subtract this vector times 0.25. the terms are of course the
7:25
same as before
7:26
and so here's the new query: you see that some terms now have negative
7:29
weight
7:31
and some terms have positive weight
7:35
alright, so do you have any questions about how it works? it's a
7:38
very simple algorithm, but it's
7:40
the basis of a lot of ideas that people use in relevance feedback
7:46
ok with group alright so
7:49
just to see how it works were in a real task
7:53
we have for example we want to find out arm
7:56
a from you know Corey that retrieves new new
8:00
us peso DOP occasions I want to find all the documents about this
8:04
so the initial so the result is this so the green ones are the relevant 13
8:08
I'm so this one is relevant russia has a strap spectrometers and outside of the
8:12
cation
8:14
um night nasa but another head-scratcher so do it environment here
8:18
and also this on telecommunications has to do with that I said if you click
8:21
through to the document services
8:23
and those are on our so they're going to
8:26
I then try to the actually expand the query
8:30
this the terms red so right the new
8:33
with the new terms that have been in the no
8:36
with high weight from the relevant documents likes I do it not surprisingly
8:39
its original on so
8:41
space launch for example instrument which was not in the region Okorie
8:45
arm and others rare and then over going to
8:49
a submit this new curry insurers our
8:52
arm a hours at are original documents but they also found some new documents
8:57
that are also
8:58
arm relevant right so here tend to go to launch a sacred site I
9:02
presumably arm that's not the case in a satellite technology
9:06
um I'm not sure what this is but I suspect it so
9:09
you know has something to do with testing superconductors
9:12
basic respect took to the point is you've got some new documents
9:16
the originally rank high you know that parents are still
9:20
to rate very high %uh the one that the market for 465
9:23
as well because small to try to move towards that
9:26
and the man relevance are marked for one thing
9:30
to think about is this: what's more valuable, positive feedback or negative
9:34
feedback?
9:35
so, for example, if you say that
9:38
positive feedback is more valuable, you set gamma smaller than beta,
9:41
for example gamma 0.25 and beta
9:45
0.75. any idea why?
9:49
why do you think positive feedback
9:53
is more useful? someone's got to be awake
9:56
yes
10:00
a
10:04
so there's this dude two things in your own surrenders
10:06
more non-relevant documents and there's more terms
10:09
number and a cancer so that's desert two different things I think so
10:14
I think um a well thing about this is that suppose that
10:18
some a so in a bit there saying in the top 10 results there will be
10:22
them you know more non-relevant and okay okay
10:25
on up
10:29
them
10:33
all correct
10:38
that's right, that's more or less along those lines
10:42
so exactly: the space of non-relevant documents is much
10:46
bigger, right, and so it's much harder to eliminate
10:51
negative
10:51
documents. there is another issue,
10:54
and that is, by the way, that there's something called the cluster
10:58
hypothesis, introduced by van Rijsbergen,
11:00
which says, and it sort of seems to be
11:04
true, that the relevant documents tend to be
11:09
somewhat close together in space, right, in a cluster, whereas non-relevant
11:12
documents are
11:13
kind of anywhere, all over the place. so in some sense it's
11:17
easier to move towards the relevant cluster, because there is this cluster,
11:20
than to try to
11:22
avoid non-relevant documents, okay
11:27
and because of this, and because it's much simpler, many systems
11:31
do only
11:32
positive feedback. so
11:36
there are a couple of assumptions in relevance feedback, and they're very
11:40
serious
11:41
so one is that the user has sufficient knowledge for a
11:44
reasonable query. it's not always the case
11:47
and also
11:50
"reasonable" means that there are enough relevant documents in the initial result set
11:54
that you can perform this feedback
11:55
so this is again not always the case. the examples are of course misspellings,
12:01
cross-language retrieval, right, when I'm searching, say, a collection in Russian and
12:05
I don't know which words to use, and then there's vocabulary
12:10
mismatch, right, something like cosmonaut versus astronaut
12:13
right. so that's the first assumption: that the initial
12:18
query is reasonable, in this sense
12:20
okay. second assumption: the relevance prototypes are well behaved
12:24
it means, again, this clustering assumption,
12:28
the hypothesis that relevant documents are clustered together. that's not
12:31
necessarily the case
12:32
because it could be that there are different clusters of
12:35
relevant documents, but they should have at least some vocabulary overlap,
12:38
otherwise, right, you're not going to be able to move in the right direction
12:42
so there are some
12:44
violations. so, for example, here
12:47
there could be very diverse relevant examples, forming very different
12:52
clusters
12:53
so here, for example, if I want to find all the
12:56
pop stars that at one point worked at McDonald's,
12:59
you know, I find a few first examples:
13:02
[name unclear in the captions], and Sharon Stone, if you remember this picture, and I
13:07
forgot
13:07
who this one is
13:10
so if I use these as relevant examples, I might actually
13:15
quickly
13:16
drift to all kinds of things that have nothing to
13:20
do with McDonald's, because they're so different
13:22
right, okay. so there's one other problem with feedback that's worth thinking
13:26
about:
13:27
long queries are inefficient for a typical IR engine. so why is that the case,
13:31
anybody?
13:32
remember the basic
13:35
structure of a search engine: why are long queries not efficient?
13:49
let's say that you actually have many query terms
13:51
they're not efficient because of the
13:55
inverted lists: it's time-consuming, computationally expensive
13:58
right, okay. so for each term we have to look up
14:04
a posting list, an inverted list, right. so
14:08
you know, the run time is at least linear in the length
14:12
of the query in a standard implementation, but typically there are all
14:14
kinds of other issues as well
14:16
so the longer the list of terms, the longer the query,
14:20
the more posting lists you have to touch, right, to merge, right
14:24
so it becomes very expensive
14:27
so the solution to this is, for example, to cut off
14:31
to only the top 20 query terms
14:34
or, for example, keep only the most probable ones
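The cutoff mentioned here can be sketched in a few lines. This is only an illustration: the function name `truncate_query` is hypothetical, the expanded query is assumed to be a `{term: weight}` dictionary, and dropping non-positive weights before the cutoff is an added assumption (a common simplification), not something the lecture states.

```python
def truncate_query(weights, max_terms=20):
    """Keep only the max_terms highest-weighted terms of an expanded query.

    Terms with non-positive weight are dropped first (an assumption made
    for this sketch), so fewer posting lists have to be read and merged.
    """
    positive = [(term, w) for term, w in weights.items() if w > 0]
    top = sorted(positive, key=lambda item: item[1], reverse=True)[:max_terms]
    return dict(top)


# Toy expanded query (invented weights).
expanded = {"space": 0.5, "launch": 1.5, "satellite": 1.5, "hotel": -1.0}
short_query = truncate_query(expanded, max_terms=2)
full_query = truncate_query(expanded)
```

With `max_terms=2` only "launch" and "satellite" survive, so the engine touches two posting lists instead of four.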
14:38
but the other thing of course if users are you know who don't wanna provide
14:43
explicit feedback
14:44
and and other thing is you have this the remember I was wondering why was
14:48
one particular document the truth and other because you just don't know
14:52
rest you know there's magic happening under under the
14:55
under the covers so so arm
14:58
so it's there is a problem so this is definitely a problem users from
15:01
predictable behavior from assertion
15:02
so the course this was the very simple on their assertions them of course much
15:07
more common complicated models and starting from a simple
15:10
my based version a you could try to estimate the retina
15:14
the probability that term means that the relevant document writer for a bill to
15:18
terms radovan also for example just the frequency
15:22
how many times how many um relevant documents appear sources
15:25
all the local noon so um and the
15:28
sort of computer if so maybe usually base buspar something more interesting
15:33
the rewrite the results um you you can also use language modeling techniques
15:37
again from
15:38
EDS lecture yesterday very so from I'm not gonna go into too much detail about
15:42
how
15:42
I'm not gonna go to any more detail about sophomore offensive
15:46
relevance feedback methods because again it's been covered yesterday
15:49
I just want to get the basics and make sure that they understand exactly the
15:52
ideas that underneath all the things I
15:54
because they have released you know the ideas are the same and so so so if you
15:57
want to move towards to
15:58
relevant document cluster this is really important
16:02
you have to be extremely careful when evaluating relevance feedback,
16:07
because you can't compute precision and recall over all documents anymore
16:11
why is that, right? because the user has already marked some documents as relevant
16:16
and some others as non-relevant, so if you just keep those documents
16:19
in the collection you'll be cheating, right. so
16:23
what you have to do is evaluate on additional documents that are not
16:26
seen by the user, what's called the residual collection,
16:29
with the marked documents removed. and the sad thing is that often the final
16:33
performance then turns out to be lower than for the original query,
16:35
right?
16:39
cool
16:44
exactly, yep. so you remove
16:48
not only the relevant ones but also the easiest ones, you know, for the system to retrieve, right
16:52
so what you have to do is,
16:56
well, there are all kinds of issues,
16:59
but that's something to be aware of, right
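A small sketch of residual-collection evaluation, assuming document ids are strings and that the ranking, the relevance judgments and the set of documents the user already marked are all known. The function name and the toy ids are hypothetical.

```python
def residual_precision_at_k(ranking, relevant_ids, judged_ids, k):
    """Precision at k over the residual collection.

    Documents the user already judged during feedback are removed from the
    ranking before the top k are counted, so the system gets no credit for
    simply re-ranking documents it was told about.
    """
    residual = [doc_id for doc_id in ranking if doc_id not in judged_ids]
    hits = sum(doc_id in relevant_ids for doc_id in residual[:k])
    return hits / k


# Toy ranking after one feedback round (invented ids).
ranking = ["d1", "d2", "d7", "d3", "d9", "d4"]
relevant_ids = {"d1", "d2", "d7", "d4"}
judged_ids = {"d1", "d2", "d3"}  # documents the user marked during feedback

naive = residual_precision_at_k(ranking, relevant_ids, set(), k=2)
residual = residual_precision_at_k(ranking, relevant_ids, judged_ids, k=2)
```

Here the naive score is 1.0 because the two documents the user marked as relevant float to the top, while the residual score is 0.5, which illustrates why the residual numbers often look worse.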
17:03
so we'll see I'll give you one example relevance feedback about haitian
17:07
that sort of the classic one that's going to be sort of coming up in a
17:10
couple this
17:11
so again to the south and another sort of issue that
17:14
reports getting known for many many many for collaborations
17:18
is that resident status usually various stuff tomorrow
17:22
up two rounds are usually not very useful are the biggest
17:25
officer Ando a for search engines do over soon or patient just one round of
17:31
you that
17:31
for example the example showed you with the Google similar documents
17:35
so now knowing what we'd know about what
17:39
before happens on me what do you think if so this bridge just pretend that they
17:43
used a simple rocky others in which the dog freddy is something different
17:47
but but that if they did what would be the alpha beta gamma from other
17:51
ascending for the Google you know when you click on the
17:54
simone Peach what do you think is I'll
17:57
yes a
18:02
yes exit so it's all be alright so they just look at
18:06
though so that %uh the documents relevant to that
18:09
core rate so it's not relevant to the initial sir to that document not the
18:13
initial career
18:14
and there is no negative feedback of course okay everybody so
18:18
okay, so before I get into the evaluation example, let me just very
18:25
quickly run through the metrics, just in case you forgot
18:28
right: precision at K. not surprisingly, green is relevant, red is not relevant,
18:32
so
18:33
precision at three would be two thirds, precision at four would be
18:37
two out of four
18:40
so, easy, right. mean average precision, again:
18:44
you compute precision at each K where a relevant document appears. so here's an example; it's going to be,
18:48
again just like the example we saw yesterday,
18:50
one third times: one, because the first one is relevant; then two
18:55
thirds,
18:55
that is, two out of
18:59
three, because that's where the next relevant document is, right;
19:02
and three fifths. that equals 0.76, okay. and then you average this across multiple queries
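The arithmetic read out here can be checked with a short sketch. The ranking `[1, 0, 1, 0, 1]` (1 for relevant, 0 for not relevant) is an assumption chosen to be consistent with the numbers in the captions, and the average-precision function assumes every relevant document appears somewhere in the list.

```python
def precision_at_k(relevance, k):
    """Fraction of the top k results that are relevant (relevance is 0/1)."""
    return sum(relevance[:k]) / k


def average_precision(relevance):
    """Average of precision@rank over the ranks that hold a relevant result.

    Assumes all relevant documents for the query appear in `relevance`.
    """
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(relevance_lists):
    """Mean of average precision over several queries."""
    return sum(average_precision(r) for r in relevance_lists) / len(relevance_lists)


ranking = [1, 0, 1, 0, 1]                # relevant at ranks 1, 3 and 5
p_at_3 = precision_at_k(ranking, 3)      # 2/3
p_at_4 = precision_at_k(ranking, 4)      # 2/4
ap = average_precision(ranking)          # (1 + 2/3 + 3/5) / 3
```

Rounded to two decimals the average precision is 0.76, which matches the value given in the lecture.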
19:07
finally, DCG. so again, it's a
19:10
metric that rewards having relevant documents at the top,
19:14
at the top of the list, and we're going to just add up
19:17
the relevant documents with weights that decay exponentially
19:21
with each level, right. so here is
19:24
the DCG, sorry, for this example result set, right. to
19:28
normalize it,
19:29
you divide by the best possible ordering of those documents, which for this one
19:34
would be green, green, green, you know,
19:37
light green and red at the bottom, okay. so that's NDCG
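A hedged sketch of DCG and its normalized form. The captions do not preserve the exact formula used on the slide, so this uses one common formulation as an assumption: gain `2**grade - 1` and a `log2(rank + 1)` discount. The graded judgments in the example are invented.

```python
import math


def dcg(grades):
    """Discounted cumulative gain for a ranked list of relevance grades."""
    return sum(
        (2 ** grade - 1) / math.log2(rank + 1)
        for rank, grade in enumerate(grades, start=1)
    )


def ndcg(grades):
    """DCG divided by the DCG of the best possible ordering of the same grades."""
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0


# Invented grades: 2 = highly relevant, 1 = somewhat relevant, 0 = not relevant.
observed = [2, 0, 2, 1, 2]
score = ndcg(observed)
best = ndcg(sorted(observed, reverse=True))
```

The ideal ordering puts the highest grades first, so its NDCG is exactly 1.0, and any other ordering of the same documents scores lower.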
19:41
again, this was just sort of a refresher, just in case you forgot
19:45
from yesterday. so now that we've talked about metrics, I'll talk to you about a
19:50
classical study
19:51
on that I think I miss very nice example
19:54
of both relevance feedback and also how to do it with a valuation
19:57
and IR so arm so here with the research question s
20:02
services over fundamental on those relevance feedback to improve results
20:05
where and but the first did st. another question as well
20:09
how much UserControl there should be how much does it help
20:14
so they tried three different settings opaque
20:17
right that's the magic setting where the user doesn't get to see the feedback
20:20
process
20:21
and they just Oakley can get something that hopefully
20:25
transparent where the user shows the relevant to both terms
20:28
but can't modify the query impenetrable where the user shows
20:32
his has is the terms a connection to modify differs law
20:36
so the question of the asters basically one this doesn't help into
20:40
what's the best level control um so here's how they did it
20:45
um they of be used the number of track
20:48
topics Ultrabook topics %uh it ok if writer's
20:51
this is the correct about the company advertising for the other %uh this is
20:55
that what the query supposed to retrieve rates have noticed how the
20:57
core is a very you know both described right document
21:01
what what makes a doctor relevant so that all the 18 or the subjects so
21:05
exactly what documents should be to develop
21:08
and here's something more details for what the relevant document must the
21:12
must do rare so very very well
21:15
Best Buy information you because you love it though the user's what they
21:18
should be looking for
21:20
um then they did the pretest so they you there is no issue
21:24
users not understanding how to use the system again very important because of
21:27
course at this time for users to learn the system so that it's a 15 minutes or
21:30
so
21:31
and then the additional experiment for it each user were shown
21:34
um 1mon RightNow no relevance to the
21:37
opaque lets the a magic rituals parents impenetrable
21:41
and a used position at thirty issah metric for the study
21:45
but isn't all that much abuse but thats or I'll start with
21:49
so here's an example here with the interface that of course the reports
21:53
that people know
21:53
what's going on so few this is the vice versa
21:57
the snow a person where the user issues a query
22:00
summer here way so that's the current career
22:04
and then based on the retrieved the documents
22:07
right he said documenting this is a the title and see the document previously
22:11
you know relevant project Rep and then the
22:15
systems that have tries to add additional terms for the core
22:18
in this case that the queries about other manufactures threats
22:21
um here at the same thing with the
22:25
transparent for OpenTable version where the user were
22:28
not only day you can see what the terms are up you can delete
22:32
or so at terms right here so you can actually modify the query to submit
22:36
by so that's the difference but otherwise everything is the same
22:40
a what I wanted to show was first this part which is
22:43
comparison of precision right at
22:46
at thirty so this is no relevance feedback its I'm
22:50
after arm I'll the
22:53
the user sort of the the time was up I think they had up to 30 minutes or
22:56
when they felt they were done um day and
23:00
you know where the position was about .4 and you see the future
23:03
various prayer um and the
23:07
because there's a guy in multiple users okay so then
23:10
when there is a.m. i sorry there are no surprises this is the tutorial right
23:14
over and the you it's a good rest because some users didn't know what they
23:17
were doing right
23:18
after tutorial their precision for over a non relevance feedback when up
23:22
2.45 or so point five or so and then
23:25
so just take a look now at the University but part
23:32
so ability coercion does improve
23:35
both for mean and median precision right up to
23:39
I'll approximately you know 55
23:42
or so um transparent
23:45
a relevance it but doesn't seem to help at all except the various gets a little
23:50
small so you thinking
23:51
tempt them to convert a little faster sorry the file career
23:55
and when the users can actually modified it corrects on top of the seventeen
23:58
percent the
24:00
why I'm sorry black box relevance to but gives you also get
24:04
additional 15 percent on top of that right so it's really helpful to give
24:08
users the control
24:10
to modify delete terms at terms you know the see what's going on
24:14
as well as actually able to change this um
24:17
here's another interesting thing a.m. the this was the precision
24:21
but also the measure a the how many iterations how much time did the system
24:26
the mean
24:27
before you convergence of the good results right and this is
24:30
you know you get dramatically much faster convergence again when the user
24:34
is able to
24:35
or modify the query but you get less of numbers of people sent it just
24:39
the you know are puzzled why those terms are included
24:43
and it you know the convergence actually goes up if you allow them to see
24:47
so the things underneath but not allow them to change it bro
24:50
so the proper spend more time trying to somehow at Magic epitomize the crew
24:54
a instead of notice posting about so so the point is if you show people
24:58
something actually have to
25:00
let the modifier so right so when you that the modified so that the
25:05
that users final results much best so the summers
25:08
benefits to improve results about 66 percent of the time
25:12
it's a summary of course in a sense that this is summer like 10 years a research
25:15
analyst
25:16
but this is a study from spank it all based on excited career up
25:19
I'll on every typically you want to see at least five just documents that I
25:24
wasted by going to get stable results
25:26
and you need again sort of queries for which there is enough relevant documents
25:32
another interesting thing day tho and maybe this is changing our
25:37
know if the new search engines but I'll at that time
25:40
only four percent of the Corries arm four percent of the square special use
25:44
this relevance feedback
25:45
each feature little more like this are seeing or talking
25:49
arm but of course me know that menus are stop after just looking at it for 60
25:54
page so
25:55
says I'm so so those users who are not lazy read that actually do something
25:58
else
25:59
I both one a huge this so feature I'm
26:03
and some from the study of just described users
26:06
I'm are you know can be much more effective using relevance feedback when
26:10
they can modify create
26:12
so of course that you know implies we need to do good graces just so
26:16
and perhaps even 45 operations this chin no
26:19
that's something that's not done without by surgeons
26:22
K
26:26
so so over the summer I already talked about the explicit feedback
26:30
%uh specifically the kind where expand the career the
26:34
talked about user control and no let's sort of trying to move from explicit
26:39
deducting
26:39
visited specifically up two clicks
26:46
so as I said the users are reluctant to provide residents is there
26:50
no in for TN of turning the searchers maybe precision oriented but I don't see
26:54
more documents like this I just want to get to ask
26:57
right um and they can be sort of annoyed when you I
27:01
you know start asking questions you know what this document helpful
27:04
I'm so we would like to really got a relevant information without forcing
27:10
user
27:10
to do stuff prayer so that's the goal here
27:14
going to estimate felons from here so
27:17
just to put clicks in perspective to hear the different kinds of
27:21
observable behavior that he could try to go other or explode
27:25
for example you know somebody use a documentary leeson's
27:29
they might select an object or this is Kylie prayer
27:32
whereas know they clicked on on a document
27:35
up the white though all kinds of other useful things they might subscribe to
27:38
the channel or two above
27:40
they might term to my bookmark something or save it for purchase
27:44
or needed from the pickup base things on
27:47
you can imagine so there's actually quite a few things users can do with it
27:50
up
27:50
3 again severe just your focusing on the clicks because they're so common
27:55
but there's many other things a good sport so
27:58
limitations as you know clicks are difficult to interpret because
28:02
from the previous text your users who click on something always or almost
28:05
always
28:06
of them um there is a person's position on bus issue which will talk about
28:11
and the things like notebook I was also misleading because people like to
28:15
you know go get a cup of coffee you know click on a result the get a cup of
28:19
coffee come back
28:20
five minutes later and it to you do to us it seems like maybe they spend five
28:24
minutes reading the document
28:25
we don't know they might do things like opening multiple tabs team i'd the
28:29
multitasking those kinds of interesting things to go up
28:31
on that may click interpretation quite difficult
28:34
and the despite these limitations the we have lots of clicks
28:38
arm if you don't have to ask users to do anything
28:41
a special for us they just do their normal
28:45
whatever they're doing right so in some sense there is know the tradeoff for it
28:49
um getting lots of data about you don't know
28:52
easily how to interpret so here so we can start observing it
28:56
so this is a very nice study by joking though from those
28:59
fight they looked at the
29:02
percent of the time that people look that up for us
29:06
wrestled just sort of used I travel to try to examine how people to the result
29:12
so securities a arm the gray is the number
29:16
uppers percent but they say shows the percent of time people hope that the
29:19
abstract
29:19
and the blood is the percent of clicks and you can imagine that by
29:24
correlated and most of them again or near the top and a soda or
29:28
on Friday um and the
29:32
us and actually this is so they did something interesting so this is just
29:36
the
29:37
um passively use the results they tried to see what happens if it slapped
29:41
the order, right? So what this tests is whether, you know, there is a trust
29:45
bias to account for:
29:47
they deliberately distorted the relevance order, just to see if people
29:50
do things differently.
29:52
And there is some slight difference: you know, when you start swapping
29:54
results, the relevant ones may
29:56
get a little more clicks, but overall it is the same shape for the click
29:59
behavior. So again, that makes it a very nice example of a controlled
30:03
study. OK, so based on this,
30:07
what they saw, they did a qualitative analysis,
30:11
or rather, they tried to identify strategies that people
30:15
might use, to, you know, work out
30:18
what they mean by clicking. And the strategy that seemed to serve them the
30:22
most
30:23
is what they call Skip Above, meaning that if they click on result,
30:28
say, three, that means they've looked at results one and two
30:32
and they've judged them to be non-relevant based on the summaries.
30:35
So that's Skip Above, and
30:39
it's true that indeed those are judged as non-relevant about eighty
30:43
percent of the time.
30:45
There is another one that does even better,
30:49
called the Last Click, right: if the user clicks on a result
30:52
and that's the last click, then again about eighty-three percent to
30:56
eighty-one percent of the time that means it's a relevant document.
31:00
But there are other hypotheses that are not as useful. For
31:04
example,
31:05
maybe a click on three after clicking on one means that three is more relevant than
31:10
one.
31:11
That's not necessarily the case.
31:14
There is, you know,
31:17
actually not that much difference between,
31:20
you know, saying that the clicked result
31:23
three is better than two, as opposed to just three is better
31:27
than one and two.
31:28
And then the other interesting one was that
31:32
in the normal setting, right, without
31:35
swapping the results,
31:38
often a sort of Skip Next holds:
31:41
what they see is that users don't just scan
31:45
up to three and then click; they actually go down one more
31:49
and look at the following result, and then come back and click
31:52
on three, right?
31:53
Which means that you could say that four, in this case, for example, might not
31:56
be relevant.
31:57
And it was true in the original setting, and it's not true
32:02
after you start swapping
32:04
things. So those are the so-called strategies,
32:07
again: Skip Above and Skip Next,
32:10
as they're called.
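A minimal Python sketch of the two click-reading heuristics discussed here, Skip Above and Skip Next. The function name `click_preferences` and the document ids are made up for illustration; this is one reading of the heuristics, not code from the study.

```python
def click_preferences(ranking, clicked):
    """Derive (preferred, less_preferred) pairs from clicks on one result list."""
    clicked = set(clicked)
    prefs = set()
    for pos, doc in enumerate(ranking):
        if doc not in clicked:
            continue
        # Skip Above: unclicked results ranked above a click were seen and skipped.
        for above in ranking[:pos]:
            if above not in clicked:
                prefs.add((doc, above))
        # Skip Next: the unclicked result right below the click was probably seen too.
        if pos + 1 < len(ranking) and ranking[pos + 1] not in clicked:
            prefs.add((doc, ranking[pos + 1]))
    return prefs


ranking = ["d1", "d2", "d3", "d4", "d5"]  # hypothetical result list
prefs = click_preferences(ranking, clicked=["d3"])
print(sorted(prefs))
```

With a single click on the third result, the sketch prefers `d3` over `d1`, `d2` and `d4`, and says nothing about `d5`.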
32:14
So what do we do with that, right? So now that you know that people do,
32:17
well, you know that people do look from top to bottom, and
32:20
that they
32:21
are biased to click on the top result, like we saw before,
32:24
this is the simple model; I think you spoke about this
32:29
on Friday. For example, you could try to just de-bias by
32:33
trying to, you know, account for a preference for the
32:37
top document. It's actually much worse in the web setting, where we have
32:40
real users.
32:42
So here's an example where there is a higher click-through
32:46
at the document at the top,
32:49
the top-ranked document, even though the first relevant
32:52
document is actually in position three, right? There are more clicks on the first
32:55
one
32:56
than on the third, even though the third one is more relevant.
32:59
This is done by manually annotating a number of,
33:03
many, queries and knowing where the first relevant documents were shown to the
33:06
user.
33:07
So you get a very simple model. I think this is model
33:11
two or something that was shown yesterday,
33:13
you know, a mixture model or additive model, where
33:17
the observed
33:18
click-through is a random variable that's
33:24
generated by a mixture of two components, two distributions,
33:28
or, more specifically,
33:32
the position, a position preference, and the
33:37
relevance of the thing that was clicked. So generally you could try to recover
33:39
relevance
33:40
by subtracting the expected position bias.
33:44
OK, nothing difficult here.
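The subtraction idea can be sketched as follows, assuming we already know what share of clicks each rank attracts on average regardless of relevance. The `background` shares and the click counts are invented numbers, and `debiased_relevance` is a hypothetical helper, not the lecturer's implementation.

```python
def debiased_relevance(clicks_by_position, expected_share):
    """Observed click share at each rank minus the share expected from position alone."""
    total = sum(clicks_by_position)
    if total == 0:
        return [0.0] * len(clicks_by_position)
    return [c / total - e for c, e in zip(clicks_by_position, expected_share)]


# Assumed background distribution of clicks over ranks 1..5 (illustrative only)
background = [0.45, 0.25, 0.15, 0.10, 0.05]
# Assumed click counts for one query: rank 1 gets most clicks, rank 3 gets unusually many
observed = [50, 20, 25, 3, 2]

scores = debiased_relevance(observed, background)
best = max(range(len(scores)), key=scores.__getitem__) + 1  # 1-based rank
print(best, [round(s, 2) for s in scores])
```

Rank 1 still has the most raw clicks, but after subtracting the expected share rank 3 stands out, which mirrors the example of a relevant document shown in position three.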
33:47
So this, you know, suggests a
33:50
more precise strategy than that proposed by Joachims et al.,
33:53
building on the Skip Above and Skip Next strategies.
33:57
So suppose there were clicks on the results for a particular
34:01
query:
34:02
you know, there were four users or something, or two users, who
34:06
clicked on
34:07
results two and four, right?
34:10
And we're going to say that the click on result two is likely by chance,
34:13
because we expect about half of the clicks to
34:18
land at position two anyway,
34:20
so it goes away when we subtract the expected position bias, right? But
34:24
the click at position four is actually important, right?
34:27
So therefore we'll say that four is more relevant than one, two and three;
34:31
we ignore the other click because we think it's by chance,
34:34
so we do not say that two is more relevant than one or three.
34:38
OK, that's a simple extension of Skip Above, right?
34:42
And it works surprisingly well.
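One way to sketch this filtered strategy: keep only clicks whose share exceeds the position-bias expectation, then apply Skip Above and Skip Next to those. The `margin` threshold, the expected shares and the function names are assumptions made for the example.

```python
def trusted_clicks(click_counts, expected_share, margin=0.1):
    """1-based positions whose click share beats the expected share by more than `margin`."""
    total = sum(click_counts)
    if total == 0:
        return []
    return [
        pos
        for pos, (count, expected) in enumerate(zip(click_counts, expected_share), start=1)
        if count / total - expected > margin
    ]


def filtered_preferences(click_counts, expected_share, margin=0.1):
    """Skip Above + Skip Next, applied only to clicks that survive the bias filter."""
    trusted = set(trusted_clicks(click_counts, expected_share, margin))
    n = len(click_counts)
    prefs = set()
    for pos in trusted:
        # everything above the trusted click, plus the single result right below it
        for other in list(range(1, pos)) + [pos + 1]:
            if other <= n and other not in trusted:
                prefs.add((pos, other))
    return prefs


clicks = [0, 5, 0, 5, 0]                    # clicks landed on results 2 and 4
expected = [0.30, 0.50, 0.10, 0.05, 0.05]   # assumed: half of all clicks hit rank 2 anyway
prefs = filtered_preferences(clicks, expected)
print(sorted(prefs))
```

The click on result 2 is discarded as chance, so the only preferences produced put result 4 above results 1, 2, 3 and (via Skip Next) 5.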
34:46
So this is the
34:49
Skip Above plus Skip Next strategy implemented directly, right? So what we're trying to
34:53
predict
34:54
is relevance, or rather relative
34:57
relevance of pairs, so preferences,
35:01
pairwise relevance, for
35:04
documents. The x-axis is recall, the
35:08
y-axis is precision at the pair level, and
35:11
the red line is the baseline, the original search engine ranking.
35:15
So the Skip Above plus Skip Next strategy
35:19
does slightly better than the original search engine ranking, but once
35:22
you start
35:23
discarding, or sort of not trusting, clicks, you can actually get
35:25
significantly higher precision
35:27
on predicting relevance, just by using this very, very simple model.
35:34
I'm gonna move on to more difficult and more complicated ones in a minute, but
35:37
any questions on this stuff?
35:42
not bright side is the concert with more clicks that I like you by chance given
35:45
all those sort of things we discussed
35:49
All right, so then I'll skip some other ones
35:52
and get straight to the cascade model, which was, I think, one of the most
35:55
promising ones from
35:56
Friday. So here is a sort of extension of Cascade
36:01
to interpret clicks, you know, to get relevance out of clicks.
36:05
So this is the model, and this is a
36:08
generative model: it is trying to generate clicks,
36:11
right? So a click is generated from three components.
36:15
One is that, of course, the user must have examined this document,
36:19
right, to click on it. And there are two other ones:
36:22
the user must be attracted by the summary, right, we talked about
36:26
captions being attractive to users, and
36:30
third is, you know, once they've clicked, right, there is
36:33
whether they're satisfied or not by the page.
36:36
So again: did the user examine the URL,
36:40
was the user satisfied by the page, and was the user attracted by it? Right, so you have
36:45
two or three
36:46
stages, three components, that go into the
36:49
click generation, or click interpretation.
36:53
In particular,
36:57
you can look at the paper for details later,
37:01
but this is a set of equations that completely describes this
37:05
Bayes net. And I think,
37:08
if you start thinking about this, it's rather intuitive. For example,
37:12
little a, I'm sorry, little a sub u, that's the
37:15
actual attractiveness, you know,
37:19
whether the
37:23
URL was attractive, right, whether the user
37:27
found it relevant, or
37:28
rather thinks it was relevant. And s sub u means that
37:32
the document actually does satisfy the information
37:35
need, right? So those are the components. The assumption here is
37:40
that
37:41
if the document is attractive, then the user will find the snippet attractive.
37:44
So there are some kind of simplifying assumptions here that they needed to make
37:48
it easier,
37:48
and things like this, right?
37:52
This is another one, where they're basically saying that if the user
37:56
went on a page, clicked on a page, and never clicked on anything again,
38:01
then probably they're satisfied, right? So there are some strong assumptions here.
38:05
So there are strong assumptions, though I think it's
38:08
a reasonable model, probably one of the most reasonable ones there
38:11
is right now.
38:13
OK, so once you sort of have this,
38:18
once you train this Bayes net, and we'll talk a little bit about
38:21
training,
38:22
the goal, of course, is to use it to try to figure out
38:26
what the relevance was, right? Somebody clicks on the result u,
38:29
so
38:30
how do you figure out whether it's relevant? Well, it's relevant
38:34
if, given that the user has examined it, they will be satisfied,
38:38
right? So they're trying to estimate this probability, and
38:41
we can try to use the chain rule to sort of figure this out:
38:45
the probability that a document is relevant
38:48
is the probability that the user was satisfied with
38:54
the document
38:55
given that they've clicked on it, times the probability that they click given
38:58
that they've examined it, right? And in the end it comes out to be simply
39:02
the two variables associated with the document, namely
39:06
attractiveness
39:07
and, again, the satisfaction. And
39:11
because we do have a lot of clicks, they train it by using the EM algorithm, right,
39:15
so using, you know,
39:16
forward-backward, or the equivalent, right, which, if you know hidden Markov
39:21
models, this is,
39:22
sorry, you know, forward-backward is the kind of thing used for hidden Markov models,
39:25
right? So the idea is that they can train it on a dataset, and the model parameters are
39:29
effectively waiting still climbing or
39:31
britain said virtually all I think about and Bennett said the
39:35
the arm some this parameter
39:39
right, the perseverance parameter, right, that's the one
39:43
which says something about whether the user tends to
39:47
keep clicking results after they clicked on the first one.
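The click model described here (examine, be attracted, be satisfied, otherwise continue with some perseverance) can be sketched as a forward computation of click probabilities. The structure below follows my reading of the lecture and resembles the dynamic Bayesian network click model; the names `attract`, `satisfy` and `gamma`, and the numbers, are assumptions for illustration only.

```python
def click_probabilities(attract, satisfy, gamma=0.9):
    """Probability of a click at each rank under the assumed model.

    attract[i]: chance the snippet at rank i attracts an examining user (a_u)
    satisfy[i]: chance the page satisfies a user who clicked it (s_u)
    gamma:      chance an unsatisfied user goes on to examine the next rank
    """
    p_exam = 1.0  # the first result is always examined
    probs = []
    for a, s in zip(attract, satisfy):
        probs.append(p_exam * a)  # a click needs examination and attraction
        # the user continues only if not (attracted and satisfied), with probability gamma
        p_exam *= gamma * (1.0 - a * s)
    return probs


def relevance(a, s):
    """Relevance as attractiveness times satisfaction, per the chain-rule argument."""
    return a * s


probs = click_probabilities(attract=[0.5, 0.5], satisfy=[1.0, 0.5], gamma=1.0)
print(probs, relevance(0.5, 0.5))
```

A fully satisfying first result halves the chance that the second one is ever examined, which is why the second click probability drops even though both snippets are equally attractive.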
39:51
OK, so how does it work? Well, it turns out that it predicts
39:54
relevance with about eighty percent
39:58
agreement with the human relevance judgments, which is pretty good.
40:01
It is actually very, very good, because human
40:05
judges agree, I think, about 85 percent of the time, understandably.
40:09
But, you know, whatever you think about
40:12
that particular number, about 80 percent is really good for this
40:16
sort of problem. And then they've shown that, for example,
40:19
you can actually use those clicks as features,
40:23
and that's giving you a preview of tomorrow's lecture,
40:26
how you could use them. So, for example, not only does it correlate with
40:30
human judgments, it is also useful to improve ranking.
40:33
So
40:37
the story is that they can actually do better,
40:42
or rather, they can do almost as well as the
40:46
explicit relevance judgments, and better than models like just the
40:50
conventional cascade. So this is an extension of the cascade
40:53
model, right,
40:53
so they can do better than cascade in this case.
40:57
OK, so
41:01
let me just summarize, and then we'll talk; I'm sure you have questions
41:04
about this.
41:05
So I first showed you a very simple model, that is, the position bias,
41:09
and I sort of didn't talk about other extensions of the cascade model, because
41:14
I haven't seen a paper that shows how they improve on this one
41:17
already.
41:18
Anyway, so the one I just introduced is better;
41:22
in some sense it is more realistic, and it has been shown to work well in
41:26
practice.
41:26
That said, it has some limitations; I pointed out some of them already, and we
41:31
can discuss more.
41:32
And I think now is a good time to have questions
41:36
before I move on to the next topic. So,
41:39
a couple of minutes for questions, and then we'll take a break.
41:46
OK, I went back a couple of slides. The assumption here is that, again, the
41:50
user,
41:50
well, it's in some sense a single-click model,
41:53
in that even though there is a perseverance parameter that controls how long the
41:57
user is going to keep clicking,
41:58
in the end the assumption is that the user will be satisfied
42:02
if they stopped clicking, right, that they're satisfied. This is not always the
42:06
case: the user might just give up.
42:08
What other limitations are there? Again, they treat this perseverance
42:12
parameter as fixed; it would be nice if they could learn it.
42:16
And also the big, big assumption, in addition to those mentioned, is that the
42:20
user can estimate how good a document is,
42:23
you know, how attractive it is, based on the,
42:27
you know, based on the caption, and does it perfectly. But if the caption is
42:30
not well done,
42:32
if the summary is not well done, they might not click, and then they never made it to this
42:35
point, right, to actually look at the page. So
42:39
the problem, so, one limitation of
42:42
this model that we just discussed, the cascade, and I think the Bayes net, which
42:46
is similar to the cascade idea,
42:49
is that you model each click, query and result
42:53
only in isolation from all the rest of the search session.
42:57
Often sessions take multiple queries
43:00
and many clicks. So
43:03
a very nice paper came out recently that
43:07
tries to capture clicks in context, meaning
43:11
the context being a set of queries,
43:15
right, that are issued for the same need. And what they're trying to do,
43:19
or what they try to do, is the following: you want to figure out where there is a
43:23
query chain,
43:24
and I'll give you an example of what that is, and then they're going to model the
43:29
chain, not just,
43:30
you know, a click on a single page in isolation, but
43:33
in the context of this chain. And then, of course, they did some very nice,
43:37
sort of,
43:38
learning, you know, using standard boosted trees, which you saw
43:41
yesterday, right? So
43:45
what they're trying to do is again predict relevance, not based
43:49
simply
43:49
on the single click, but also on the whole chain. So this is the,
43:53
they build a probabilistic Bayes net, right, a layered net,
43:57
or hierarchical Bayes net actually, that's going to estimate relevance,
44:01
effectively based on the probability that you're in a particular state in the chain
44:06
that generates a particular search,
44:09
that in turn may generate a page view, or,
44:13
you know, a particular page from the search, and then a click.
44:16
And then in this context we're going to try to estimate whether you would find it
44:20
relevant
44:21
given all this information.
44:27
So basically they're saying:
44:31
not only are they just going to have a generative model of,
44:35
sort of, clicks given relevance and then try to recover relevance; they're
44:39
going to
44:40
build a more accurate discriminative model
44:43
by representing this chain through
44:47
its components as features, right? And then,
44:51
you know, they can train it by just maximum likelihood
44:54
estimation, right, just
44:55
go through and count, effectively. And then they associate
45:00
those features
45:03
with a relevance judgment from an editor, whether a person thinks this
45:07
document is relevant or not,
45:09
and use it for learning, right? So specifically they,
45:14
just like everybody else, use learning with boosted trees.
45:17
the use to read up for are they actually used 849
45:21
a B&B smile a system there was some of tweaking
45:24
know to make sure these are the go good result um
45:27
OK, and then of course they use disjoint training and test sets,
45:32
to first train separately the
45:36
Bayes net features, the click features, as well as
45:39
the trees on top of those features, and they did cross-validation.
45:44
So, again, the key
45:47
contribution here is that they have these sort of clever features derived from the
45:51
hierarchical Bayes net,
45:53
and they use those as features rather than just as a generative model,
45:58
so they can try to actually
46:02
learn which of the features, you know, correlate with
46:06
relevance. And I'm going to,
46:10
so the point is there are different configurations: this is the
46:13
baseline there, and these are different configurations of trees
46:16
and forests; the details are of course in the paper, but I just
46:21
want to point out that
46:22
it does much better than a reasonable baseline, which I believe is
46:25
based on the cascade model,
46:27
so the cascade equivalent of what I just described before.
46:30
So once you represent clicks in this context of chains and
46:34
train
46:35
a learner on top of the features,
46:39
then you get much better precision-recall, right, so much better accuracy
46:44
or performance.
46:47
OK, so before I move on: again, that
46:50
is the key, and I know I skipped some details,
46:53
for example how this is trained, but the key is the main idea.
46:57
Do you have any questions about what they were doing?
47:01
OK, so let's move on to richer models.
47:04
So I guess I talked about clicks, and if you remember the slide,
47:08
clicks are just one sort of behavior you could look at; it's very important,
47:12
but it's just one, and there are others, like browsing, scrolling, dwell time.
47:16
So how do we use those additional behavioral features to estimate relevance?
47:19
So there are two main approaches. There are heuristics,
47:24
and that's been sort of the case
47:26
back in, I guess, the early two thousands,
47:30
and there have been learning-based methods.
47:33
I think a very nice, sort of general model, where it all started,
47:36
is the Curious Browser model, published in 2005,
47:40
and then you can extend it to a query and browsing model.
47:44
So here's the Curious Browser. You start from
47:47
the user: the person searches for something, right,
47:50
they get back the results, they click on a result, they come back to the page,
47:54
and, again, this was done at Microsoft, a little
47:57
box pops up and says: did you find the result useful,
48:02
did you like it or not, right?
48:05
And sometimes it might actually prompt you:
48:09
well, you decided not to click on this result, why not?
48:13
So it's a curious browser, right?
48:17
So of course this was done on Microsoft employees and, you know,
48:21
with their consent and all the rest. But the point is you can collect
48:26
all kinds of useful data,
48:27
and you can even, you know, once you finally clicked on something and come back,
48:31
you can try to rate the full search session,
48:33
as a whole, you know, whether you were satisfied or not.
48:36
So the idea is to correlate behavior metrics like
48:39
browsing,
48:41
clicking, coming back to the page, with relevance.
48:45
So they used a Bayesian,
48:48
a Bayes net, to try to correlate
48:51
the
48:54
features like, you know, how long, for example, the user spent on a result, what
48:58
the position of the
48:59
result was, and so on.
49:03
You can look at the slides later, or the paper, to get the details,
49:06
but
49:06
the plan is to try to represent all kinds of useful features of
49:10
the search page and the result itself, right, and sort of the
49:14
path going to the result and back.
49:18
So the point is that both time and click-through are
49:21
the strongest predictors of actual satisfaction.
49:24
Not surprisingly, printing or adding a result to the favorites also predicts
49:28
satisfaction, but it's very rare.
49:30
And once you start combining those measures together using a learner,
49:34
of course you'll get a better result than with just click-through information.
49:38
So here, you know, for example, if you only use click-through
49:42
with some very simple model of click-through, not the sophisticated ones
49:45
that we discussed,
49:46
you still sort of get something, you know, as the end result,
49:50
but once you start adding
49:53
combined measures you get much higher accuracy, a more
49:56
accurate prediction of satisfaction collect points 7 months ago from 14.7
50:01
arm and I'm and the
50:05
so you can even a and often you can even once you start up cutting off my
50:09
confidence so if you wanna only really really good data
50:12
among the first to get less a doesn't but just to get it by factors
50:17
right so you can go up from 17 to wipe on something so
50:20
right? That was the plan. OK, so
50:24
what they've introduced, again, is a general model
50:27
for predicting satisfaction based on all kinds of
50:32
features, richer than just clicks. So then,
50:35
but there are many other things that you can do. For example,
50:38
what they didn't look at is whether the user was sort of biased by,
50:43
you know, the presentation of the result, or what they did after.
50:46
so they did talk about this tomorrow up balls back but perhaps these are my goal
50:50
for exploring
50:50
arm and see what a model that as well I'll
50:54
So the idea is going to be to represent these as features for the learner:
50:58
the presentation features, the click-through
51:02
features,
51:02
and the browsing behavior features,
51:06
and you can of course train a learner.
51:09
One useful thing to do is not just to predict satisfaction,
51:12
but,
51:13
as you learned yesterday, it's useful to
51:16
provide pairwise preferences for training the ranker.
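To make the pairwise idea concrete, here is a small sketch that fits a linear scorer with a pairwise logistic loss, so that the preferred result in each pair ends up with the higher score. It is a plain linear stand-in written for this note, not the neural ranker named in the lecture; the two features (click deviation, normalized dwell time) and all numbers are invented.

```python
import math


def train_pairwise(pairs, n_features, lr=0.5, epochs=200):
    """Fit weights so that score(preferred) > score(other) for each training pair.

    pairs: list of (preferred_features, other_features) tuples.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # gradient step on log(1 + exp(-margin)); g shrinks as the pair gets ordered
            g = 1.0 / (1.0 + math.exp(margin))
            w = [wi + lr * g * di for wi, di in zip(w, diff)]
    return w


def score(w, x):
    """Relevance score of one result under the learned weights."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical features per result: [click deviation, normalized dwell time]
pairs = [
    ([0.40, 0.9], [0.10, 0.2]),
    ([0.30, 0.7], [0.35, 0.1]),
]
w = train_pairwise(pairs, n_features=2)
print([round(wi, 2) for wi in w])
```

The second pair is the interesting one: the preferred result has slightly fewer clicks but much longer dwell time, so the learned weights lean on the browsing feature.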
51:20
So in this paper
51:24
the point was to predict pairwise preferences, not just the
51:28
absolute satisfaction. So the point is you have a bunch of features, and you
51:33
can train
51:33
pretty much any classifier, almost any classifier; they used a neural net
51:37
implementation called RankNet,
51:39
and that can assign a relevance score
51:42
based on the provided pairwise preferences
51:46
on this kind of features. So if you think back to the very simple model
51:50
shown before,
51:51
by adding these browsing behavior features,
51:55
like browsing, scrolling,
51:58
and so on, in addition you get both
52:02
higher precision, right, of predicting preferences, going up to
52:05
help with 7.6 up 176 178
52:09
as well as much higher recall, right? So in this region what you're getting is
52:13
information from the browsing, and that helps,
52:16
you know, to boost both precision and recall. OK, so,
52:19
not surprisingly, once you use the richer model you can do much better than
52:23
simple click-through models. Just, again, to put this back in context: I talked about
52:27
clicks,
52:28
and talked about some simple things,
52:31
like the something for ways to examining the
52:35
results and no what does the users to remember
52:39
eye-tracking studies well from so
52:42
I'll skip the sort of motivation for eye tracking, because they already
52:45
talked about it on Friday,
52:47
but I do want to talk in some more detail about how they could use eye
52:50
tracking for
52:51
richer models here;
52:54
in particular it could be potentially very useful. So here's,
52:58
again months it's a mess but know that's a myth that they can affect some
53:02
information from
53:03
these are the fixation points, you know, the data points for where the user looked
53:07
on the page, and the idea is to detect the saccades, right, and
53:12
where the user fixated to read the page, and from there to try to
53:16
extract things like whether the user was skimming
53:19
the text compared to where they were reading it carefully.
53:23
And I think you may have seen some of this yesterday, so
53:26
that means I can safely go fast. But
53:29
what they tried to compare in the study is:
53:32
can you do better by carefully modeling
53:36
things like reading speed and, you know,
53:40
other features of the gaze, to do better relevance feedback,
53:44
right? Instead of trying to do feedback
53:47
over the whole document, saying, look, you know, everything in a document is relevant,
53:50
now they're only looking at the particular words that
53:54
people perhaps focused on, by reading slowly versus skimming.
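A rough sketch of gaze-restricted feedback: instead of taking expansion terms from the whole document, take them only from passages that were read rather than skimmed, weighted by reading time. The `min_seconds` cutoff, the passage texts and the durations are all assumptions; the study's actual gaze features are not reproduced here.

```python
from collections import Counter


def gaze_weighted_terms(passages, top_k=2, min_seconds=1.0):
    """Pick feedback terms from passages the user actually read.

    passages: list of (text, seconds_looked_at) tuples.
    Passages viewed for less than `min_seconds` are treated as skimmed and ignored.
    """
    weights = Counter()
    for text, seconds in passages:
        if seconds < min_seconds:
            continue
        for term in text.lower().split():
            weights[term] += seconds  # longer reading time, larger weight
    return [term for term, _ in weights.most_common(top_k)]


# Hypothetical gaze log for one viewed page
passages = [
    ("click models for web search", 4.0),
    ("cookie policy banner", 0.3),
    ("click models need position bias", 2.5),
]
terms = gaze_weighted_terms(passages)
print(terms)
```

Terms from the banner never enter the feedback set because that passage was only glanced at, which is the contrast with whole-document feedback drawn in the lecture.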
53:57
OK, and it turns out that
54:00
yes, it helps, right? The baseline basically takes
54:04
some equivalent or extension of Rocchio that uses the whole
54:07
document.
54:08
Pretty much any kind of gaze data helps
54:12
MAP and DCG. In particular it helps
54:15
to look at how long the person looked
54:19
at a passage,
54:23
and, you know, modeling the reading speed maybe is not as useful as
54:27
other features, right?
54:28
So the point is you can do much better on DCG or other
54:32
precision metrics
54:33
than with standard feedback if you know which parts of the
54:38
page the user looked at. And of course,
54:41
you know, we don't have eye trackers for every machine, but even so, these
54:45
days
54:46
perhaps the mouse tells us a lot of things, and I guess
54:50
I probably should leave it for another time. So
54:53
I want to leave you with something: if you're looking for a
54:57
fun project to do,
54:58
it turns out there was a fun competition in 2005
55:05
where the goal was to infer relevance from eye movement.
55:08
It's still posted here for now,
55:12
as of yesterday, and
55:15
the goal is to predict relevance given, well, they provide
55:19
data with eye tracking that says how people looked at the titles, as well as
55:25
the actual,
55:26
you know, relevance, the answers. So then you can train, or try to play around
55:30
with, your own models and try to see if you can beat,
55:33
you know, the current best, which is based on an HMM extension.
55:37
So yes, there are some nice papers there as well.
55:40
So the point is there is data; so if you think eye tracking is a
55:44
very abstract thing, it's not: you can actually try it.
55:48
So in summary: we looked at explicit feedback,
55:52
I talked about clicks for quite a bit, I talked about richer
55:55
behavior models, not as much time as I wanted to, but
55:58
we're pressed for time, so, in particular, browsing, session
56:02
context information, and a little bit about
56:04
eye tracking; and I didn't talk about mouse movements, unfortunately.
56:07
And you can read more about this in these
56:11
very nice references, and again the key ones are here:
56:16
this one is more of a survey, this is the eye-tracking paper, and that's the
56:20
Bayesian network one, that's the best click model I know about.
56:24
so thanks for waking up on someone

