Suppose a friend told you that he was planning on doing a TED Talk, and he asked your advice on how to make his talk one of the most popular TED Talks out there. What would you tell him?

This is exactly the type of question Data Scientists seek to answer. The way Data Scientists approach such a problem is to gather information on past TED Talks and analyze that information to see which factors distinguish the most popular TED Talks from the less popular ones. For our purposes, we’ll define “popular” TED Talks as Talks that generate a lot of views.

Following the Data Scientists’ route, we obtain a database that contains all TED Talks posted on the TED website from its inception in June 2006 through September 2017. There are 2,550 talks. The distribution of views per talk is presented in Figure 1.

Figure 1: TED Talks by views

What the distribution of TED Talks by views shows is that (i) a small number of talks have received over 5 million views each, (ii) a large number of talks have received several million views each, and (iii) a large number of talks have received fewer than a million views each. We can present the same information contained in Figure 1 a little differently, as shown in Figure 2.

Figure 2: Distribution of TED Talks and views

Figure 2 shows that the 4% of talks that each had more than 5 million views collectively accounted for 25% of the total number of views for all TED Talks from June 2006 through September 2017. That is, a small number of talks generated a large portion of the total views for all TED Talks. What we want to know is: what characteristics do those top 4% of talks share that the other talks don’t?
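The kind of concentration figure quoted above (a share of talks versus the share of views they account for) can be computed in a few lines. The following is a minimal sketch, not the analysis behind Figure 2: the function name `concentration` and the toy view counts are made up for illustration.

```python
def concentration(views, threshold):
    """Return (share of talks above threshold, share of total views they hold)."""
    if not views:
        raise ValueError("views must not be empty")
    top = [v for v in views if v > threshold]
    share_of_talks = len(top) / len(views)
    share_of_views = sum(top) / sum(views)
    return share_of_talks, share_of_views


# Toy view counts, invented for illustration (NOT the TED data).
toy_views = [12_000_000, 3_000_000, 2_000_000, 1_500_000, 900_000, 600_000]
talk_share, view_share = concentration(toy_views, threshold=5_000_000)
print(f"{talk_share:.0%} of talks hold {view_share:.0%} of views")
```

Run on the real dataset with a 5 million threshold, the same two ratios would correspond to the 4% and 25% figures in the text.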

The database of information on TED Talks contains information on characteristics of speakers and their talks, including, for example: date posted; duration of talk; title and description of talk; identity and occupation of speakers; number of languages for which transcripts of the talk were provided; number of comments by other viewers; and tags (or themes) for each talk. There is also a rating system TED provides for viewers to rate the talks. Viewers are given a set of 14 descriptors from which they can choose up to 3 to describe a particular talk: Beautiful, Confusing, Courageous, Fascinating, Funny, Informative, Ingenious, Inspiring, Jaw-dropping, Longwinded, Obnoxious, OK, Persuasive, or Unconvincing.

The tags for each talk appear to be inconsistently defined. For the 2,550 TED Talks, there are 19,098 total tags, of which 417 are unique. The 6 most popular tags are: Technology, Science, Global Issues, Culture, TEDx, and Design, which collectively account for 16% of the total tags assigned. The tags assigned per talk vary from 1 to 32 in a seemingly unsystematic manner (see Figure 3). This inconsistency in assignment of tags suggests that the tags variable would not be a good predictor of TED Talk popularity. The analyses were run both with and without information on tags, and, as suspected, the tags didn’t provide any additional information.

Figure 3: Distribution of tags
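The tag statistics above (total tags, unique tags, share held by the most popular tags, and the range of tags per talk) can be tallied with a counter. This is a sketch under assumptions: `tag_summary` is a hypothetical helper, and the toy tag lists are invented rather than taken from the TED data.

```python
from collections import Counter


def tag_summary(tags_per_talk, top_n=5):
    """Summarize tag usage across talks (one list of tags per talk)."""
    counts = Counter(tag for tags in tags_per_talk for tag in tags)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    sizes = [len(tags) for tags in tags_per_talk]
    return {
        "total_tags": total,
        "unique_tags": len(counts),
        "top_tags": [tag for tag, _ in top],
        "top_share": sum(n for _, n in top) / total,
        "min_per_talk": min(sizes),
        "max_per_talk": max(sizes),
    }


# Toy tag lists, invented for illustration (NOT the TED data).
toy_tags = [
    ["technology", "science"],
    ["technology", "design", "culture"],
    ["science"],
    ["technology", "global issues", "culture", "TEDx"],
]
summary = tag_summary(toy_tags, top_n=2)
print(summary)
```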

Jumping Right In!

So, now, if we jump right to the analysis, what does it tell us? If we regress the number of views a TED Talk receives on the various data elements in the dataset, across all 2,550 TED Talks, we get the results presented in Figure 4. To be conservative, I’m labeling as “statistically significant” those results that are significant at the 1% level (i.e., p-value ≤ 0.01). Variables with statistically significant coefficients have been highlighted in yellow.

Figure 4: Regression results for all talks

The first observation from the analysis is that the adjusted R² is 0.82. A good amount of the variation in the views a Talk generates, 18%, isn’t captured by the variables that have been included in the regression.
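The 18% figure is simply one minus the adjusted R². As a sketch of how the adjusted value relates to the raw R², here is the standard textbook adjustment for the number of predictors. The sample size of 2,550 comes from the text; the predictor count of 30 and the raw R² of 0.82 are placeholders, since the article does not state either.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted R^2: penalizes R^2 for the number of predictors used."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r2) * (n_obs - 1) / dof


def unexplained_share(r2_value):
    """Share of variation in the outcome not captured by the regression."""
    return 1 - r2_value


# n_obs is from the article; n_predictors=30 and r2=0.82 are hypothetical.
example = adjusted_r2(r2=0.82, n_obs=2550, n_predictors=30)
print(f"adjusted R^2: {example:.4f}, unexplained: {unexplained_share(example):.1%}")
```

With thousands of observations and a few dozen predictors, the adjustment is small, so the raw and adjusted values tell nearly the same story.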

The second observation is that the impact of languages is large and positive. So, talks posted in more languages generate more views. Or talks that generate more views are posted in more languages. This is correlation, not necessarily causation.

The third observation is that the year the talk was presented has by far the largest impact on the number of views a talk has received, where talks given in later years are more popular. We’ll explore this more in a minute, but first, let’s go through the other variables in the regression.

The fourth observation is that talks with more ratings generate fewer views. Before we interpret this unintuitive result, let’s consider the impacts of the individual ratings descriptors. It turns out that the ratings descriptors that generate the most views are Confusing and OK, not particularly favorable descriptors. The way I interpret the information on ratings is that it’s the less popular talks that viewers give ratings to, and those ratings are not favorable. So ratings reflect people voicing dissatisfaction with the talk, and people who enjoy talks simply don’t provide ratings.

So now let’s return to the strong relationship between Year and the number of views a TED Talk receives. Consider the pattern in Views per Talk over time, presented in Figure 5.

Figure 5: TED Talks by year

Talks during the first year received a lot of Views, but there were relatively few talks that year, so those large Views per Talk get less weight in the analysis. Views per Talk peaked in 2013, but they were also relatively high for 2014 and 2015. Also, there were enough talks presented during those years to give their large Views per Talk large weight in the analysis. So it looks like the large positive impact of Year on Views reflects the fact that talks in 2013 through 2015 (later years in the analysis) generated more views. Again, this is correlation, not causation.
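The point about weight can be made concrete by tracking the number of talks alongside Views per Talk for each year. The sketch below uses a hypothetical helper, `views_per_talk_by_year`, and invented numbers in which two years have the same average but very different talk counts.

```python
from collections import defaultdict


def views_per_talk_by_year(talks):
    """talks: iterable of (year, views). Returns {year: (n_talks, mean_views)}."""
    totals = defaultdict(lambda: [0, 0])  # year -> [talk count, total views]
    for year, views in talks:
        totals[year][0] += 1
        totals[year][1] += views
    return {year: (n, total / n) for year, (n, total) in sorted(totals.items())}


# Toy records, invented for illustration (NOT the TED data).
toy_talks = [
    (2006, 4_000_000),
    (2013, 1_000_000),
    (2013, 3_000_000),
    (2013, 8_000_000),
]
by_year = views_per_talk_by_year(toy_talks)
for year, (n_talks, mean_views) in by_year.items():
    print(year, n_talks, f"{mean_views:,.0f}")
```

Both toy years average 4 million views per talk, but the later year contributes three observations to a regression while the earlier one contributes only one.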

Let’s take one more deep dive and compare the distributions over time of Talks with less than 5 million views and Talks with more than 5 million views. That is, we’re splitting the blue line in Figure 5 into two sub-components. The distribution of Talks over time for Talks with more than 5 million views and Talks with less than 5 million views is presented in Figure 6.

Figure 6: TED Talks by year, above and below 5 million views

It turns out that of the 99 talks in the dataset with more than 5 million views, 22 of them were presented in 2013. So what the large positive coefficient in the regression on Year is saying is that talks that were presented in later years, particularly 2013, generated more views. Again, this is correlation, not causation. It doesn’t say that if you want to generate more views, you should present your talk in 2013. Rather, it says that talks that generated more views took place in 2013. Correlation, not causation.

So now recall the distribution of Views per Talk in Figure 1. The distribution is nonlinear for talks with more than 5 million views. So then what happens if we look at the analysis of talk characteristics that affect Views separately for the two subgroups? That is, what happens if we subdivide the talks into those with less than 5 million views and those with more than 5 million views, and then we run the analysis separately for each subgroup? Are there differences in the patterns of characteristics that predict numbers of views for the two different groups of talks?
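Splitting the talks and fitting each subgroup separately can be sketched as follows. This is a deliberately simplified, single-predictor illustration (the article's regressions use many variables), and the helper names and toy rows are hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slopes_by_group(rows, threshold):
    """rows: (predictor, views). Fit separately below and at/above the threshold."""
    low = [(x, v) for x, v in rows if v < threshold]
    high = [(x, v) for x, v in rows if v >= threshold]
    return {"low": ols_slope(*zip(*low)), "high": ols_slope(*zip(*high))}


# Toy rows of (number of languages, views), invented for illustration.
toy_rows = [
    (10, 1_000_000), (20, 2_000_000), (30, 3_000_000),
    (40, 9_000_000), (50, 6_000_000), (60, 9_000_000),
]
slopes = slopes_by_group(toy_rows, threshold=5_000_000)
print(slopes)
```

In the toy data the predictor tracks views closely in the low group and not at all in the high group, which is the kind of difference between subgroups the separate analyses are designed to reveal.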

Talks with Less Than 5 Million Views

Let’s first take a look at what the analysis says for talks with fewer than 5 million views, which is presented in Figure 7.

Figure 7: Regression results for talks with fewer than 5 million views

The results of the analysis for talks with fewer than 5 million views show the same pattern as those for all talks combined. This suggests that the weird pattern we saw for the Ratings variables in the analysis of all talks, where people tended to submit more ratings for talks they don’t like, pertained to the less popular talks, that is, talks that had fewer than 5 million views.

Talks with More Than 5 Million Views

So what do the results have to say about the talks with more than 5 million views?

Figure 8: Regression results for talks with more than 5 million views

As Figure 8 shows, for the most popular talks, none of the characteristics of the talks are significant predictors of views. In other words, if you ask, “What are the characteristics of the most popular TED Talks?” the answer is, “There is no predictor.”

So What’s Going On?

Here’s my hypothesis.

TED Talks can be viewed on TED’s website, but they can also be viewed on YouTube and other sites, such as Facebook, iTunes, and Hulu. Which talks are people most inclined to view? Do they go to the TED website and start with the most recently presented TED Talks? I don’t think so. I posit that most TED Talks are viewed through either (i) a link sent to people by friends, (ii) a link others posted on social media, or (iii) talks posted under a label of “Top 10 TED Talks,” “Most watched TED Talks,” or some other such label.

In other words, I posit that the most popular TED Talks are the ones that have been caught up in a success-breeds-success loop, which has been facilitated or fostered by choice architecture, so as to propel those Talks into the group of most popular.

Success-breeds-success phenomena occur when things that are popular become even more popular, because they are given more chances to succeed. For example, once a piece of content has garnered enough clicks, other people will click on it simply because many others have also done so.
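A toy simulation can illustrate the loop. In the sketch below, every talk is identical (no quality differences at all), and each new viewer picks a talk with probability proportional to its current views plus one. The model and its parameters are assumptions made for illustration, not something estimated from the TED data.

```python
import random


def simulate_views(n_talks, n_viewers, seed=0):
    """Each viewer picks a talk with probability proportional to (views + 1)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    views = [0] * n_talks
    for _ in range(n_viewers):
        weights = [v + 1 for v in views]  # +1 so unseen talks can still be chosen
        chosen = rng.choices(range(n_talks), weights=weights, k=1)[0]
        views[chosen] += 1
    return views


# Hypothetical sizes chosen only to keep the run fast.
sim = simulate_views(n_talks=50, n_viewers=5000, seed=42)
ranked = sorted(sim, reverse=True)
print("top 5 talks:", ranked[:5])
print("bottom 5 talks:", ranked[-5:])
print("share of views held by top 5:", sum(ranked[:5]) / sum(ranked))
```

Because early clicks raise a talk's chance of attracting later clicks, views in this kind of model tend to end up spread unevenly across talks even though nothing distinguishes one talk from another.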

Wikipedia defines choice architecture as the design of different ways choices can be presented to consumers, and the impact of that presentation on consumer decision-making. In other words, choice architecture recognizes that the way you present choices to people can affect which of the options they choose.

Choice architecture feeds success-breeds-success phenomena by labeling certain content as “Top 10,” “Most Viewed,” “Now Trending,” etc. People will tend to skip individual pieces of content posted on the site in favor of what’s most popular. Either they view what’s popular as a proxy for high quality content, or they fear missing out (FOMO) on what so many others have experienced.

So, I posit that the most popular TED Talks are not viewed through a visit to TED’s website. Rather, I propose that the most popular TED Talks are more likely to be viewed because they either serendipitously end up in the path of viewers or they appear under a label of “Most Popular.” A TED Talk joins the most popular when it starts to gain momentum in views, gets passed around more on social media, makes it into a Top 20 list, and continues to become ever more popular because it’s popular.

The other contributing factor that might make the most popular TED Talks so popular is that they exhibit some intangible quality about the speaker or the talk that appeals to viewers, but that hasn’t been captured in the database of TED Talks data. The 14 ratings descriptors capture some elements of this, such as Ingenious or Inspiring or Funny. But they don’t capture information, for example, about speakers who are dynamic, or wry, or captivating. It’s also possible that intangible characteristics lead the most popular TED Talks to gain the initial momentum they need to get caught up in a success-breeds-success loop, which then propels them into the top Talks.

So What Does This Mean?

The first implication is that the key information we need to answer our questions is often not captured in the data we have. Sure, we might be able to get some small scraps of understanding from the information we have. But relative to the primary understanding we actually seek, the scraps are often irrelevant. And we won’t know what we’re missing unless we take time, before jumping into the data, to understand the dynamics that drive the situation. If we jump right into the TED Talk data without first thinking about what might drive the popularity of TED Talks, then we’re very likely to completely miss the big picture.

The second implication is this. In a world flooded with information, in which everyone is vying for our attention, success-breeds-success phenomena and choice architecture are increasingly determining which content ends up becoming popular or successful. That is, a product’s success is increasingly determined as much by extrinsic factors that propel the product into success as by the nature or quality of the product itself. Merit won’t necessarily win the day. Is that what we want?
