Tuesday, June 9, 2009

Criterion to classify a player as a bowler, batsman or allrounder in Test Cricket



I once again took data from Anantha's blog, where he analyzes the runs and wickets of Test players to identify the players with the best allrounder abilities (in terms of scoring runs and taking wickets).
He admits that there is some arbitrariness in dividing the data into separate groups. The batting and bowling records of Test cricket players form more or less a continuum, as can be seen in the scatter plot shown below. In the figure below I plotted the runs scored by each player versus the wickets taken by that player. The color of each dot is the allrounder index defined by Anantha. Dark blue colors show a low index (more of a bowler) and red-brown colors show a higher index (more of a batsman).
Here I propose an additional criterion to cluster players into a bowler group, a batsman group, or somewhere in between (allrounder).
I hypothesize that bowlers are the ones who consistently increase their total wickets, as a function of matches played, at a greater pace than non-bowlers. The speed at which a player increases his total wickets can be captured by the slope of the wickets versus matches curve. The slope will be highest for the pure-bowlers group and smallest for the pure-batsmen group. Similarly, the slope of the total runs versus matches curve can be estimated to define a criterion that qualifies a player as a batsman. The logic is the same: pure batsmen tend to increase their aggregate runs at a faster rate than lesser batsmen. An allrounder candidate should obviously increase both runs and wickets with matches.
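
One way to make "slope" concrete, assuming an ordinary least-squares line with a free intercept (the fitting method is not stated here, so this is an assumption): if W_m is a player's cumulative wickets after match m, for m = 1, ..., n, the fitted slope is

```latex
\hat{s}_W = \frac{\sum_{m=1}^{n} (m - \bar{m})\,(W_m - \bar{W})}
                 {\sum_{m=1}^{n} (m - \bar{m})^2},
\qquad
\bar{m} = \frac{n + 1}{2},
\qquad
\bar{W} = \frac{1}{n} \sum_{m=1}^{n} W_m .
```

The runs slope follows the same formula with cumulative runs R_m in place of W_m. The symbols W_m, R_m and \hat{s}_W are introduced only for this illustration.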

In panel A of the figure below, I plotted the number of wickets taken by various players as a function of the number of matches they played. The color of the circles is defined by the player index (defined by Anantha): blue shades indicate a low index and red shades a high index. Further, I identified the top 20% of bowlers (i.e. the players coded in blue shades) as defined by Anantha's criterion and fitted a line (blue) to them. The line has a slope of 4.73. Similarly, I identified the top 20% of batsmen (i.e. those with a high player index, in red shades) and fitted a line (orange) to estimate the slope of wickets as a function of matches. The top 20% of batsmen have a slope of 0.3.
In the same vein, in panel B I plotted total runs scored with respect to matches played. The color code for the circles is the same as in panel A. Again I fitted straight lines through the top batsmen and the top bowlers. The batsmen tend to increase their aggregate runs with a slope of 72.67, while the top bowlers increase their aggregate with a slope of 9.8. These two numbers could correspond to the average of the average runs scored by batsmen and bowlers, respectively.
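
A minimal sketch of how such group fits could be computed, under stated assumptions: the line is an ordinary least-squares fit with a free intercept, the "top 20%" groups are taken as the lowest and highest fifth of the index, and each player is one (matches, wickets, runs) point. The function names, the dictionary keys and the sample players are all hypothetical and are not the data behind the figure.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx  # fails if every player in the group has the same match count


def group_slopes(players, fraction=0.2):
    """Fit wickets-vs-matches and runs-vs-matches lines for both extremes of the index."""
    ranked = sorted(players, key=lambda p: p["index"])  # low index = more of a bowler
    k = max(2, round(len(ranked) * fraction))           # a line needs at least 2 points
    groups = {"bowlers": ranked[:k], "batsmen": ranked[-k:]}
    result = {}
    for name, group in groups.items():
        matches = [p["matches"] for p in group]
        result[name] = {
            "wickets": ols_slope(matches, [p["wickets"] for p in group]),
            "runs": ols_slope(matches, [p["runs"] for p in group]),
        }
    return result


# Made-up players for illustration only (not real career figures).
players = [
    {"index": 0.05, "matches": 40, "wickets": 160, "runs": 400},
    {"index": 0.10, "matches": 80, "wickets": 320, "runs": 800},
    {"index": 0.30, "matches": 60, "wickets": 150, "runs": 1500},
    {"index": 0.40, "matches": 70, "wickets": 140, "runs": 2100},
    {"index": 0.50, "matches": 90, "wickets": 180, "runs": 3600},
    {"index": 0.60, "matches": 55, "wickets": 60, "runs": 2500},
    {"index": 0.70, "matches": 65, "wickets": 40, "runs": 3300},
    {"index": 0.80, "matches": 75, "wickets": 30, "runs": 4500},
    {"index": 0.90, "matches": 50, "wickets": 10, "runs": 3500},
    {"index": 0.95, "matches": 100, "wickets": 20, "runs": 7000},
]

slopes = group_slopes(players)
print(slopes["bowlers"], slopes["batsmen"])
```

With real data, the four numbers returned here would play the role of the slopes quoted for panels A and B.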

These four slopes, 4.73 and 0.3 (bowler criterion) and 72.67 and 9.8 (batsman criterion), set the boundaries for classifying a player as a bowler, a batsman, or both.
We can study how a player's number of wickets and aggregate runs develop as a function of the matches he has played. His two slopes (runs vs matches and wickets vs matches) will place him at some location in the two panels, and this provides a more objective criterion to decide whether a player is more of a bowler, more of a batsman, or a candidate to be both.
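
As a rough illustration of how the four slopes could be turned into a decision rule, the sketch below places a player's two slopes between the reference values and applies a cutoff. The linear rescaling, the 0.5 cutoff, the function names and the example inputs are all assumptions made for this sketch. Only the four reference slopes come from the fits above.

```python
# Reference slopes quoted in the post (wickets per match, runs per match).
BOWLER_WICKET_SLOPE, BATSMAN_WICKET_SLOPE = 4.73, 0.3
BATSMAN_RUN_SLOPE, BOWLER_RUN_SLOPE = 72.67, 9.8


def position(value, low, high):
    """Where value sits between low (0.0) and high (1.0), clipped to [0, 1]."""
    return min(1.0, max(0.0, (value - low) / (high - low)))


def classify(wicket_slope, run_slope, cutoff=0.5):
    """Label a player from his two career slopes. The cutoff is an assumed threshold."""
    bowling = position(wicket_slope, BATSMAN_WICKET_SLOPE, BOWLER_WICKET_SLOPE)
    batting = position(run_slope, BOWLER_RUN_SLOPE, BATSMAN_RUN_SLOPE)
    if bowling >= cutoff and batting >= cutoff:
        return "allrounder"
    if bowling >= cutoff:
        return "bowler"
    if batting >= cutoff:
        return "batsman"
    return "unclassified"


# Hypothetical slope pairs, not taken from any real player.
print(classify(4.5, 12.0))
print(classify(0.2, 70.0))
print(classify(3.0, 45.0))
```

A different cutoff, or a nonlinear rescaling, would shift the borders between the groups, so the labels should be read as indicative rather than definitive.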

Next, I will analyze the progression of Gary Sobers, Jacques Kallis, Imran Khan, Sid Barnes, Kapil Dev and Ian Botham to see where they fall in the above two plots.

right arm over
Arvind

