Everyone's an influencer: quantifying influence on Twitter (Bakshy, Hofman, Mason, Watts, 2011)
- Author
- Стас Фомин
- Стас Фомин, 00:25, 31 August 2011
Summary ⌘⌘
- http://research.yahoo.com/files/wsdm333w-bakshy.pdf
- 1.6M Twitter users
- 74 million events
- two months in 2009
- the subgraph: 56M users and 1.7B edges (directed closure!)
- maximum number of followers: 4M
- maximum number of friends: 760K
statistics ⌘⌘
Oddities ⌘⌘
- They did not use the notion of a retweet: «reposting more inclusive than “retweeting”»
- Diffusion is analyzed solely from posting times and follower relationships.
Conclusion ⌘⌘
- «however, we find that predictions of which particular user or URL will generate large cascades are relatively unreliable»
- «diffusion can only be harnessed reliably by targeting large numbers of potential influencers, thereby capturing average effects»
- «marketing strategies, defined by the relative cost of identifying versus compensating potential “influencers”»
The model is unconstrained! ⌘⌘
Neither a cascade model nor a threshold neural network:
These three assignments effectively make different assumptions about the influence process:
- “first influence” rewards primacy, assuming that individuals are influenced when they first see a new piece of information, even if they fail to immediately act on it, during which time they may see it again;
- “last influence” assumes the opposite, instead attributing influence to the most recent exposure; and
- “split influence” assumes either that the likelihood of noticing a new piece of information, or equivalently the inclination to act on it, accumulates steadily as the information is posted by more
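The three attribution rules quoted above can be sketched in Python. This is only an illustration, not the authors' code: the function name `assign_influence`, the input format (a list of `(friend, post_time)` pairs for friends who posted the URL earlier), and the equal split used for "split influence" are assumptions made for the example.

```python
from collections import defaultdict

def assign_influence(exposures, rule):
    """Return {friend: credit} for one user who reposted a URL.

    exposures: (friend, post_time) pairs for friends who posted the same
               URL before the user did (hypothetical input format).
    rule: "first", "last" or "split".
    """
    if not exposures:
        return {}  # no earlier-posting friend: the user seeds a new cascade
    credit = defaultdict(float)
    if rule == "first":
        # full credit to the earliest exposure
        friend, _ = min(exposures, key=lambda e: e[1])
        credit[friend] = 1.0
    elif rule == "last":
        # full credit to the most recent exposure
        friend, _ = max(exposures, key=lambda e: e[1])
        credit[friend] = 1.0
    elif rule == "split":
        # assumption: credit is divided equally among all earlier posters
        share = 1.0 / len(exposures)
        for friend, _ in exposures:
            credit[friend] += share
    else:
        raise ValueError(f"unknown rule: {rule}")
    return dict(credit)

# Toy data: three friends posted the URL at times 10, 25 and 40.
exposures = [("alice", 10), ("bob", 25), ("carol", 40)]
first = assign_influence(exposures, "first")
last = assign_influence(exposures, "last")
split = assign_influence(exposures, "split")
```

With the toy data, "first" credits only `alice`, "last" credits only `carol`, and "split" gives each friend one third.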
ways-of-influences ⌘⌘
cascades ⌘⌘
Quotes
influencing another individual to pass along a piece of information does not necessarily imply any other kind of influence, such as influencing their purchasing behavior, or political opinion. Our use of the term “influencer” should therefore be interpreted as applying only very narrowly to the ability to consistently seed cascades that spread further than others.
the distribution of cascade sizes is approximately power-law, implying that the vast majority of posted URLs do not spread at all (the average cascade size is 1.14 and the median is 1)
depth of the cascade (Figure 4b) is also right skewed, but more closely resembles an exponential distribution, where the deepest cascades can propagate as far as nine generations from their origin; but again the vast majority of URLs are not reposted at all, corresponding to cascades of size 1 and depth 0 in which the seed is the only node in the tree.
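The size and depth definitions in these quotes can be made concrete with a small Python sketch. It assumes a cascade is stored as a hypothetical `parent` mapping from each user to the user credited with influencing them, with `None` for the seed; the paper does not prescribe this representation.

```python
def cascade_size_and_depth(parent):
    """Return (size, depth) of one cascade tree.

    parent: dict mapping each node to the node that influenced it,
            or None for the seed (assumed representation).
    """
    def depth_of(node):
        # count edges on the path from this node back to the seed
        steps = 0
        while parent[node] is not None:
            node = parent[node]
            steps += 1
        return steps

    size = len(parent)
    depth = max(depth_of(node) for node in parent)
    return size, depth

# A URL that nobody reposts: the seed is the only node.
lonely = cascade_size_and_depth({"seed": None})

# Seed "a" influences "b" and "d"; "b" influences "c".
small = cascade_size_and_depth({"a": None, "b": "a", "c": "b", "d": "a"})
```

The unreposted URL has size 1 and depth 0, matching the quote; the second toy cascade has four nodes and two generations.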
Cascade Sizes and Depths ⌘⌘
To identify consistently influential individuals, we aggregated all URL posts by user and computed individual-level influence as the logarithm of the average size of all cascades for which that user was a seed. We then fit a regression tree model [6], in which a greedy optimization process recursively partitions the feature space, resulting in a piecewise-constant function where the value in each partition is fit to the mean of the corresponding training data.
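As a rough illustration of the procedure in this quote, the sketch below computes the log of the mean cascade size per user and fits a tiny regression tree by greedy recursive partitioning. It is not the authors' implementation (they cite [6] for their model): the function names, the squared-error split criterion, the two features (followers, past local influence), and the toy numbers are all assumptions.

```python
import math

def log_mean_cascade_size(sizes):
    """Individual-level influence: log of the average size of seeded cascades."""
    return math.log(sum(sizes) / len(sizes))

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared errors around the mean (assumed split criterion)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def fit_tree(rows, targets, max_depth=2, min_leaf=1):
    """Greedy recursive partitioning; each leaf stores the mean of its targets."""
    if max_depth == 0 or len(rows) < 2 * min_leaf:
        return {"value": mean(targets)}
    best = None
    for j in range(len(rows[0])):
        for threshold in sorted({r[j] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[j] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[j] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, j, threshold)
    if best is None or best[0] >= sse(targets):
        return {"value": mean(targets)}  # no split reduces the error
    _, j, threshold = best
    lo = [i for i, r in enumerate(rows) if r[j] <= threshold]
    hi = [i for i, r in enumerate(rows) if r[j] > threshold]
    return {
        "feature": j,
        "threshold": threshold,
        "left": fit_tree([rows[i] for i in lo], [targets[i] for i in lo],
                         max_depth - 1, min_leaf),
        "right": fit_tree([rows[i] for i in hi], [targets[i] for i in hi],
                          max_depth - 1, min_leaf),
    }

def predict(tree, row):
    """Walk down to a leaf and return its piecewise-constant value."""
    while "value" not in tree:
        side = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["value"]

# Toy data (made up): features are (followers, past local influence).
rows = [(10, 0.0), (20, 0.1), (1000, 0.5), (2000, 0.6)]
cascade_sizes = [[1, 1], [1, 1], [2, 4], [3, 3]]
targets = [log_mean_cascade_size(s) for s in cascade_sizes]
tree = fit_tree(rows, targets, max_depth=1)
low = predict(tree, (15, 0.05))
high = predict(tree, (5000, 0.9))
```

On the toy data a single split on the follower count separates the two groups, so every prediction is one of two constants. A real analysis would use far more data and some form of validation to choose the tree size.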
Regression Tree ⌘⌘
Influence depends on PLI and followers ⌘⌘
Cascade Size for type of URLs ⌘⌘
Cascade Size for content category ⌘⌘
Although content was not found to improve predictive performance, it remains the case that individual-level attributes, in particular past local influence and number of followers, can be used to predict average future influence. Given this observation, a natural next question is how a hypothetical marketer might exploit available information to optimize the diffusion of information by systematically targeting certain classes of individuals.
To illustrate this point we now evaluate the cost-effectiveness of a hypothetical targeting strategy based on a simple but plausible family of cost functions c_i = c_a + f_i c_f, where c_a represents a fixed “acquisition cost” c_a per individual i, and c_f represents a “cost per follower” that each individual charges the marketer for each “sponsored” tweet. Without loss of generality we have assumed a value of c_f = $0.01, where the choice of units is based on recent news reports of paid tweets (http://nyti.ms/atfmzx, http://dealbook.nytimes.com/2009/11/23/a-friends-tweet-could-be-an-ad/). For convenience we express the acquisition cost as multiplier α of the per-follower cost; hence c_a = α c_f.
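The cost function from this quote is easy to evaluate directly. The Python sketch below is illustrative only: the function name `sponsorship_cost` is hypothetical, and f_i is read as the follower count of individual i, which the quote implies through "cost per follower" but does not state.

```python
COST_PER_FOLLOWER = 0.01  # c_f in dollars, the value assumed in the quote

def sponsorship_cost(followers, alpha, cost_per_follower=COST_PER_FOLLOWER):
    """Cost of one sponsored tweet: c_i = c_a + f_i * c_f with c_a = alpha * c_f.

    followers: f_i, assumed here to be the individual's follower count.
    alpha: acquisition cost expressed as a multiple of the per-follower cost.
    """
    acquisition_cost = alpha * cost_per_follower  # c_a
    return acquisition_cost + followers * cost_per_follower

# alpha = 0: only the per-follower cost matters.
free_acquisition = sponsorship_cost(followers=1000, alpha=0)

# alpha = 100: c_a = 100 * $0.01 = $1, plus 1000 * $0.01 = $10.
with_acquisition = sponsorship_cost(followers=1000, alpha=100)
```

A larger α makes the fixed part dominate for users with few followers, which is why the relative cost of identifying versus compensating influencers decides which users are worth targeting.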