Everyone's an Influencer: Quantifying Influence on Twitter (Bakshy, Hofman, Mason, Watts, 2011)
- Stas Fomin
- Stas Fomin, 03:25, 31 August 2011
- 1 Summary ⌘⌘
- 2 Statistics ⌘⌘
- 3 Oddities ⌘⌘
- 4 The model is not constrained! ⌘⌘
- 5 Quotes
- 1.6M Twitter users
- 74 million events
- two months in 2009
- the subgraph: 56M users and 1.7B edges (directed closure!)
- maximum number of followers: 4M
- maximum number of friends: 760K
- The notion of a retweet was not used: «reposting more inclusive than “retweeting”»
- Diffusion is analyzed solely from posting times and follower relationships.
- «however, we find that predictions of which particular user or URL will generate large cascades are relatively unreliable»
- «diffusion can only be harnessed reliably by targeting large numbers of potential influencers, thereby capturing average effects»
- «marketing strategies, defined by the relative cost of identifying versus compensating potential “influencers”»
The model is not constrained! ⌘⌘
Neither a cascade model nor a threshold neural network:
These three assignments effectively make different assumptions about the influence process:
- “first influence” rewards primacy, assuming that individuals are influenced when they first see a new piece of information, even if they fail to immediately act on it, during which time they may see it again;
- “last influence” assumes the opposite, instead attributing influence to the most recent exposure; and
- “split influence” assumes either that the likelihood of noticing a new piece of information, or equivalently the inclination to act on it, accumulates steadily as the information is posted by more …
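The three attribution rules quoted above can be illustrated with a minimal Python sketch. The function name `attribute_influence`, the `(timestamp, user)` input shape, and the equal split of credit under “split influence” are assumptions made for illustration; this is not the authors' code.

```python
from collections import defaultdict

def attribute_influence(exposures, rule="first"):
    """Assign credit for one repost among followees who posted the URL earlier.

    exposures: list of (timestamp, user) pairs (hypothetical input shape).
    Returns a dict mapping user -> share of credit; shares sum to 1.
    """
    if not exposures:
        return {}  # no prior exposure: the poster is treated as a seed
    ordered = sorted(exposures)  # earliest exposure first
    if rule == "first":
        return {ordered[0][1]: 1.0}
    if rule == "last":
        return {ordered[-1][1]: 1.0}
    if rule == "split":
        # Assumption: credit is divided equally among all prior posters.
        share = 1.0 / len(ordered)
        credit = defaultdict(float)
        for _, user in ordered:
            credit[user] += share
        return dict(credit)
    raise ValueError(f"unknown rule: {rule}")

exposures = [(10, "alice"), (25, "bob"), (40, "carol")]
first = attribute_influence(exposures, "first")
last = attribute_influence(exposures, "last")
split = attribute_influence(exposures, "split")
```

Under “first” all credit goes to the earliest poster, under “last” to the most recent one, and under “split” each of the three prior posters receives one third.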
influencing another individual to pass along a piece of information does not necessarily imply any other kind of influence, such as influencing their purchasing behavior, or political opinion. Our use of the term “influencer” should therefore be interpreted as applying only very narrowly to the ability to consistently seed cascades that spread further than others.
the distribution of cascade sizes is approximately power-law, implying that the vast majority of posted URLs do not spread at all (the average cascade size is 1.14 and the median is 1)
depth of the cascade (Figure 4b) is also right skewed, but more closely resembles an exponential distribution, where the deepest cascades can propagate as far as nine generations from their origin; but again the vast majority of URLs are not reposted at all, corresponding to cascades of size 1 and depth 0 in which the seed is the only node in the tree.
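To make the size and depth definitions concrete, here is a small Python sketch that computes both for each cascade tree. It assumes each post has already been assigned a single influencing parent, with `None` marking a seed; the `parent` mapping and the function name `cascade_stats` are hypothetical.

```python
from collections import defaultdict

def cascade_stats(parent):
    """Return {seed: (size, depth)} from a child -> parent mapping.

    parent maps each posting node to the node credited with influencing it,
    or to None when the node is a seed (assumed input shape).
    """
    children = defaultdict(list)
    seeds = []
    for node, par in parent.items():
        if par is None:
            seeds.append(node)
        else:
            children[par].append(node)
    stats = {}
    for seed in seeds:
        size, depth = 0, 0
        stack = [(seed, 0)]  # (node, generation counted from the seed)
        while stack:
            node, generation = stack.pop()
            size += 1
            depth = max(depth, generation)
            stack.extend((child, generation + 1) for child in children[node])
        stats[seed] = (size, depth)
    return stats

# "a" seeds a cascade reaching b, c and (via b) d; "e" is never reposted.
parent = {"a": None, "b": "a", "c": "a", "d": "b", "e": None}
stats = cascade_stats(parent)
```

A URL that is never reposted, like the one seeded by `"e"`, yields size 1 and depth 0, matching the quoted description.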
Cascade Sizes and Depths ⌘⌘
To identify consistently influential individuals, we aggregated all URL posts by user and computed individual-level influence as the logarithm of the average size of all cascades for which that user was a seed. We then fit a regression tree model, in which a greedy optimization process recursively partitions the feature space, resulting in a piecewise-constant function where the value in each partition is fit to the mean of the corresponding training data.
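The procedure described in this quote can be sketched in plain Python: compute the log of the mean cascade size per user, then greedily split the feature space to minimize squared error, predicting the mean in each leaf. This is a toy illustration, not the authors' implementation; the feature order (followers, past local influence), the stopping rules, and all data values are assumptions.

```python
import math

def log_mean_cascade_size(sizes):
    """Individual-level influence: log of the mean size of a user's seeded cascades."""
    return math.log(sum(sizes) / len(sizes))

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def fit_tree(X, y, max_depth=2, min_leaf=1):
    """Greedy regression tree; each leaf predicts the mean of its training targets."""
    best = None  # (error, feature index, threshold)
    if max_depth > 0:
        for j in range(len(X[0])):
            # Candidate thresholds: every distinct value except the largest.
            for t in sorted({row[j] for row in X})[:-1]:
                left = [yi for row, yi in zip(X, y) if row[j] <= t]
                right = [yi for row, yi in zip(X, y) if row[j] > t]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                err = sse(left) + sse(right)
                if best is None or err < best[0]:
                    best = (err, j, t)
    if best is None or best[0] >= sse(y):
        return {"value": sum(y) / len(y)}  # leaf
    _, j, t = best
    li = [i for i, row in enumerate(X) if row[j] <= t]
    ri = [i for i, row in enumerate(X) if row[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": fit_tree([X[i] for i in li], [y[i] for i in li], max_depth - 1, min_leaf),
        "right": fit_tree([X[i] for i in ri], [y[i] for i in ri], max_depth - 1, min_leaf),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's mean."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]

# Made-up users: (followers, past local influence) and sizes of seeded cascades.
X = [(100, 0.0), (200, 0.0), (5000, 0.5), (8000, 0.7)]
cascades = [[1, 1], [1, 1, 1], [2, 4], [3, 5]]
y = [log_mean_cascade_size(sizes) for sizes in cascades]
tree = fit_tree(X, y, max_depth=1)
```

With `max_depth=1` the tree is a single split, so the fitted function is piecewise constant with two pieces. A production analysis would more likely use an established library and cross-validation to choose the tree size.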
Regression Tree ⌘⌘
Influence depends on PLI and followers ⌘⌘
Cascade Size by type of URL ⌘⌘
Cascade Size for content category ⌘⌘
Although content was not found to improve predictive performance, it remains the case that individual-level attributes, in particular past local influence and number of followers, can be used to predict average future influence. Given this observation, a natural next question is how a hypothetical marketer might exploit available information to optimize the diffusion of information by systematically targeting certain classes of individuals.
To illustrate this point we now evaluate the cost-effectiveness of a hypothetical targeting strategy based on a simple but plausible family of cost functions $c_i = c_a + f_i c_f$, where $c_a$ represents a fixed “acquisition cost” per individual $i$, and $c_f$ represents a “cost per follower” that each individual charges the marketer for each “sponsored” tweet. Without loss of generality we have assumed a value of $c_f = \$0.01$, where the choice of units is based on recent news reports of paid tweets (http://nyti.ms/atfmzx, http://dealbook.nytimes.com/2009/11/23/a-friends-tweet-could-be-an-ad/). For convenience we express the acquisition cost as multiplier $\alpha$ of the per-follower cost; hence $c_a = \alpha c_f$.
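As a worked illustration of the quoted cost function, the sketch below evaluates c_i = c_a + f_i c_f with c_a = α c_f, reading f_i as the follower count of individual i. The function names and the influence-per-dollar ratio are assumptions added here to show how cost-effectiveness could be compared; the numbers are made up.

```python
def sponsorship_cost(followers, alpha, cost_per_follower=0.01):
    """Cost c_i = c_a + f_i * c_f, with acquisition cost c_a = alpha * c_f."""
    acquisition_cost = alpha * cost_per_follower
    return acquisition_cost + followers * cost_per_follower

def influence_per_dollar(expected_influence, followers, alpha, cost_per_follower=0.01):
    """Hypothetical cost-effectiveness measure: expected influence divided by cost."""
    return expected_influence / sponsorship_cost(followers, alpha, cost_per_follower)

# A user with 1,000 followers and alpha = 100: 100 * 0.01 + 1000 * 0.01 dollars.
cost = sponsorship_cost(followers=1000, alpha=100)
ratio = influence_per_dollar(expected_influence=2.2, followers=1000, alpha=100)
```

With α = 0 the cost is proportional to follower count alone, while a large α makes the fixed acquisition cost dominate for users with few followers.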