Cricket has many superstitions, and one of them holds that a wicket is more likely to fall when the score is 111 or one of its multiples. The number 111 is called the 'Nelson Figure' in cricket. Legend has it that the name comes from Admiral Lord Nelson, the Royal Navy commander of the Napoleonic Wars, who is popularly said to have lost one arm, one leg and one eye (in fact he lost an arm and the sight of one eye, but never a leg). I am not sure whether he played cricket or was connected to it in some way, but his supposed condition became the cricketing expression for a score of 111. The famous umpire David Shepherd made the number even more special with his one-legged hop whenever the score reached 111.

On the second day of the third Test between India and England, while Cook and Strauss were trouncing the hapless Indian bowling attack, there was a discussion on Cricinfo about the probability of a wicket falling on 111. So I looked into the partnership data from the last 1950 Test matches.

The score at the fall of the first wicket seems to follow an exponential distribution. The figure below shows the probability of scoring a given number of runs before the first wicket falls, with one line for each innings (top plot). According to this figure, there is about a 0.6% chance that the first wicket falls at a score of 50.
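As a rough illustration of what an exponential model implies, the sketch below computes the chance that a wicket falls at exactly a given score. The helper names `survival` and `prob_fall_at` are hypothetical, and the mean partnership of 30 runs is an assumed value chosen for illustration, not one fitted to the Test data.

```python
import math

def survival(runs, mean_runs):
    """Chance that the partnership is still unbroken after `runs` runs."""
    return math.exp(-runs / mean_runs)

def prob_fall_at(score, mean_runs):
    """Chance the wicket falls at exactly `score` under an exponential model.

    Scores are whole numbers, so the exponential density is integrated
    over the one-run interval [score, score + 1).
    """
    return survival(score, mean_runs) - survival(score + 1, mean_runs)

# mean_runs = 30.0 is an assumption for illustration, not a fitted value
for score in (0, 50, 111):
    print(score, round(100 * prob_fall_at(score, 30.0), 3), "%")
```

Under this model the chance drops by the same factor for every extra run, so no single score, 111 included, is singled out.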

Given the nature of the data, it makes sense to draw these graphs on a log-linear plot (lower plot). In such a plot a straight line indicates an exponential distribution. The pale blue line corresponds to an exponential function. By and large, the exponential behavior does not change from one innings to the next.
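To see why a straight line on a log-linear plot points to an exponential, here is a minimal sketch that uses synthetic scores rather than the real partnership data. The function `log_linear_slope`, the bin width, and the assumed mean of 35 runs are all illustrative choices of mine. The fitted slope of log(fraction) against score should come out close to minus one over the mean.

```python
import math
import random

def log_linear_slope(scores, bin_width=10, max_score=150):
    """Least-squares slope of log(fraction per bin) against bin centre."""
    n_bins = max_score // bin_width
    counts = [0] * n_bins
    for s in scores:
        b = int(s // bin_width)
        if b < n_bins:          # scores beyond max_score are ignored
            counts[b] += 1
    xs, ys = [], []
    for b, c in enumerate(counts):
        if c > 0:               # log is undefined for empty bins
            xs.append((b + 0.5) * bin_width)
            ys.append(math.log(c / len(scores)))
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

rng = random.Random(0)          # fixed seed so the run is repeatable
assumed_mean = 35.0             # assumed mean partnership, illustration only
scores = [rng.expovariate(1 / assumed_mean) for _ in range(20000)]
slope = log_linear_slope(scores)
print("fitted slope:", slope, "compare with:", -1 / assumed_mean)
```

A real analysis would replace the synthetic `scores` list with the actual partnership totals and would probably weight the bins by their counts, since the sparsely populated tail is noisy.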

Next, we can look at the runs scored for each wicket in the different innings in the same way. In the figure below I chose the log-linear representation because it is more revealing. Different colors correspond to different wickets, with the first wicket in blue and the 10th wicket in red. Each subplot shows a different innings.

Naturally, the lines are steeper for the lower-order wickets, but the distribution remains largely exponential. In other words, the fall of a wicket, when viewed across all matches, all conditions, and all bowlers and batsmen, turns out to be a random process. The exponential distribution is a simplification, and more sophisticated distributions could describe the phenomenon better. Maybe I will do that some other time, when I am sitting in a boring lecture, like now...

Fraction of innings scoring a certain number of runs before the fall of the first wicket.

**Therefore, nothing in the data suggests an increased chance of the first wicket falling at 111.**
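One simple way to put a number on the 111 question is to compare how many wickets fell at 111 with how many fell at the neighbouring scores. The sketch below is not the method used for the plots in this post. `nelson_excess` is a hypothetical helper, the window of five runs on either side is an arbitrary choice, and the toy data are invented. The z-score is only a rough Poisson estimate that ignores the uncertainty in the expected count.

```python
import math
from collections import Counter

def nelson_excess(fall_scores, target=111, half_window=5):
    """Compare wickets falling at `target` with the neighbouring scores.

    Returns (observed, expected, z): `expected` is the mean count over
    the neighbouring scores and z is a rough Poisson z-score.
    """
    counts = Counter(fall_scores)
    neighbours = [s for s in range(target - half_window, target + half_window + 1)
                  if s != target]
    expected = sum(counts[s] for s in neighbours) / len(neighbours)
    observed = counts[target]
    if expected > 0:
        z = (observed - expected) / math.sqrt(expected)
    else:
        z = float("nan")        # no neighbouring data to compare against
    return observed, expected, z

# Toy data, invented for illustration: 4 wickets at every score from 100 to 120
toy = [s for s in range(100, 121) for _ in range(4)]
print(nelson_excess(toy))
```

With real data, `fall_scores` would be the list of team scores at which wickets fell. A z-score near zero would agree with the conclusion above, while a large positive value would hint at a genuine excess at 111.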

Innings-by-innings fraction of wickets falling as a function of runs scored.

see also

http://en.wikipedia.org/wiki/Nelson_%28cricket%29

good jobs....how come the probability >1 ?

Thanks SD, indeed the labels on the y-axis should have been percentage and not Prob.
