Friday, March 9, 2012

My tribute to Rahul Dravid: What if he debuted with Sachin

Rahul Dravid has decided to call it a day, and with that an illustrious career comes to an end. A lot has been written and more will be written in the days to come, but those who have witnessed Rahul's batting know that no amount of runs can do justice to the man and his career. I often take refuge in 'numbers' when it comes to defining a career, because memory gets clouded and a lot of prejudices enter our evaluation.

This is especially true in the case of Rahul, who all through his career remained in the shadow of Sachin Tendulkar. They say that he did not mind that, but what choice did he have when the whole country decided to remain mesmerized by Sachin (for a few reasons, of course)?

Dravid's first Test match somehow defined his career. He played a glorious knock of 95 at Lord's on his debut, but Ganguly, another debutant, scored a century and took all the limelight. This has been the fate of Rahul all through his career. The judgement of the Indian cricket masses is strongly clouded by their emotions, and no one really pays attention to numbers. For some reason we treat Zaheer Khan as a great bowler (in the Indian context, of course), but we forget to compare his record with that of Javagal Srinath, who is not considered a great bowler.

So I looked at the numbers again and compared Rahul Dravid with our benchmark of greatness, Sachin Tendulkar, in the matches they played together. I hate to make this comparison with bare numbers, so I made some graphics. I asked a simple question: when Sachin and Rahul both batted in the same innings of a Test match, who scored more runs? I do not want to calculate averages, which tell us nothing; I will only give you the raw data in pictorial form so that you can judge with less emotion and more objectivity. Here is how to read the figure.
Figure: Sachin Tendulkar vs Rahul Dravid. Rahul is marginally better, if at all, when they play together.


Panel A: Innings-by-innings runs scored by Sachin Tendulkar. The dark line is a smoothed version of the same data.
Panel B: Innings-by-innings runs scored by Rahul Dravid. The dark line is a smoothed version of the same data.
Panel C: Probability distribution of runs scored by Sachin (brown line) and Rahul (blue line), that is, the probability of scoring a given number of runs in an innings. There is no visible difference between the two when they batted in the same innings. Moreover, the scores look like noise: the distribution is close to an exponential (thick gray line), which is characteristic of a memoryless random process (the waiting times of a Poisson process follow an exponential distribution).
Panel D: Scatter plot of runs scored by Sachin (Y-axis) and Rahul (X-axis) when they played together. If a point is above the gray line Sachin scored more runs; if it is below the gray line Rahul scored more runs. It turns out that Rahul outscored Sachin 130 times while Sachin outscored Rahul 118 times. But there is not much difference in the runs they scored: Rahul (12981) scored 395 runs more than Sachin (12586). Ideally we would want both of them to score heavily in the same innings, but that was rare.
Panel E: The scatter data in panel D presented as a probability in pseudocolor.
Panel F: Innings-by-innings difference between the scores of Rahul and Sachin. A positive number means Rahul scored more and a negative number means Sachin scored more.
Panel G: The distribution of the run difference shown in panel F looks like Gaussian noise.
Panel H: Autocorrelation of run scoring by Sachin (brown line) and Rahul (blue line). Once again, as in panel C, the run scoring of the two players looks like a stochastic process, although for Rahul there is a somewhat increased chance of repeating his performance (failure or success) every 5 matches (I am not sure if it is significant).
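
For readers who want to try the same kind of comparison, here is a minimal Python sketch of the head-to-head tally (panels D and F) and the autocorrelation (panel H). It is not the code used for the figure. The function names and the toy scores are made up for illustration, and the autocorrelation uses one common normalisation (dividing by the lag-0 sum of squares), which is an assumption on my part.

```python
from statistics import mean


def head_to_head(pairs):
    """Tally paired innings given as (sachin_runs, rahul_runs) tuples."""
    sachin_wins = sum(1 for s, r in pairs if s > r)
    rahul_wins = sum(1 for s, r in pairs if r > s)
    # Positive difference means Rahul scored more (same sign convention as panel F).
    diffs = [r - s for s, r in pairs]
    return {
        "sachin_wins": sachin_wins,
        "rahul_wins": rahul_wins,
        "ties": len(pairs) - sachin_wins - rahul_wins,
        "sachin_total": sum(s for s, _ in pairs),
        "rahul_total": sum(r for _, r in pairs),
        "diffs": diffs,
    }


def autocorrelation(scores, lag):
    """Sample autocorrelation of a score series at the given lag.

    Normalised by the lag-0 sum of squared deviations, so lag 0 gives 1.0.
    """
    n = len(scores)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(scores)")
    m = mean(scores)
    denom = sum((x - m) ** 2 for x in scores)
    if denom == 0:
        return 0.0  # a constant series carries no correlation information
    num = sum((scores[i] - m) * (scores[i + lag] - m) for i in range(n - lag))
    return num / denom


# Hypothetical toy data, NOT the real scorecards.
toy_pairs = [(10, 50), (100, 4), (0, 0), (35, 36), (72, 12)]
summary = head_to_head(toy_pairs)
```

With the real innings list, `summary["rahul_wins"]` and `summary["sachin_wins"]` would correspond to the 130 and 118 quoted for panel D, and `autocorrelation(scores, lag)` evaluated over a range of lags would give one trace of panel H.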

What does this analysis tell us? Because I considered only the innings in which they both batted, I automatically controlled for the conditions, the bowlers (they bat at no. 3 and 4) and other effects. So when I look at this analysis I do not see any difference between Sachin and Rahul. If I had to choose, perhaps I would go for Rahul, given that he scored 395 more runs. But I might not play them together, because the chance that both of them score a good number of runs in the same innings is rather small.

This great similarity between the two giants of Indian cricket, and indeed of world cricket in the 2000s, makes me ask why Rahul remained in Sachin's shadow. What if Rahul had debuted slightly before Sachin Tendulkar? Would we be worshipping Rahul and not Sachin?

Perhaps Sachin is so popular because of his early start and his explosive batting in limited-overs cricket, a style Rahul was never allowed to play. I will do that analysis some other day. But here is a humble request: please respect Rahul for what he has achieved. His numbers tell a very good story, and they say quite clearly that Rahul Dravid is, if anything, better than Sachin Tendulkar. Rahul is a humble man when he says that the next generation is well equipped; we all know the vacuum he is leaving at number 3.

right arm over
Arvind

Wednesday, February 8, 2012

The myth of bowling in the right areas

Whenever a bowler is hit for a boundary, modern cricket commentators shower all kinds of cliches on the shot, and when they return to their senses they make a sorry remark about the bowler: he should be bowling in the right areas. With batting dominating, 'bowling in the right areas' is becoming the new cliche of contemporary cricket commentary. It certainly gives the impression that the commentator knows what these so-called right areas are. But unfortunately they never make any sensible suggestion, so I assume that, like the poor bowler and his captain, they also have no idea where the 'right area' might be on the pitch.
Ranji invented the leg glance to deal with leg-side bowling
Let's try to narrow down these 'right areas' by first isolating the bad areas, starting with the 'bad line to bowl'. Down the leg side is never considered good. The reason is simple: a slight error and the delivery will be a wide. A ball pitching outside leg stump cannot be given LBW, and traditionally umpires are reluctant even when it pitches in line with leg stump. Finally, there may still be some stigma from the Bodyline series. Too far outside off stump is also not good, because it gives the batsman enough room to play his shot. Moreover, the batsman can decide to leave the ball. So unless the bowler can move the ball in, there is not much point in bowling outside off stump.

Next, the length of the ball. Bowling too full or too short is bad. Usually they recommend a 'good length', which is something like two-thirds of the length of the pitch. Short balls without much pace only invite well-executed pull shots. Too full is rather easy to play if it is not combined with swing.

So is that it, then: bowl around off stump at about two-thirds of the length, in the so-called 'corridor around off and middle stump'? I am sure all bowlers know this. This is what they practice. This is what the coaches train them for.

However, it is not true that a ball bowled in the 'right areas' will give you a wicket, or even trouble the batsman, every time you bowl it. Evidently, balls pitched well within the 'right areas' are smacked for easy boundaries, and at times balls pitched in the so-called 'bad areas' get you a prized wicket. In fact, I did a little survey myself on the Hawk-Eye data that is available for some matches on the Cricinfo website to confirm this.

So if you have played cricket at any serious level, you know that in reality there are no right areas. Once a batsman knows what the bowler is going to do, he can execute any shot to any delivery. So the biggest enemy of the bowler is predictability.

Along the same lines, there is no perfect ball that will give you a wicket every time you bowl it. You may get some success on a few occasions, but soon batsmen will have a strategy to play that ball for maximum score. The invention of the leg glance by Ranjitsinhji was the first such instance, and the development of a number of new shots in recent times, including the 'switch hit', clearly indicates that batsmen can come up with an antidote to any delivery given some time.
Glenn surprised everyone. He bowled not just in the corridor but in the corridor of uncertainty.


So in my experience the 'right bowling area' is a myth created by our modern cricket commentators, who want to sound educated in cricket. The biggest enemy of a bowler is the monotony of his bowling, no matter how elegant it appears from the commentary box. The biggest ally a bowler can have in the middle of a spell is variability and a sense of surprise in his bowling.


Unfortunately, this part of bowling never makes it into the statistics, and we think that Glenn McGrath was a great bowler because he consistently bowled in the corridor of uncertainty. But I think you should watch him again. And not just Glenn: pick videos of any successful bowler and you will find that unpredictability was the main weapon in their armoury.

right arm over
Arvind

Tuesday, January 3, 2012

Making of a Test bowler: Time it takes to find your feet

Interesting times are back in cricket. Debutant bowlers since September 2011 have surprised the batsmen, and as many as six new entrants (four of them fast bowlers) have started their careers with a five-for. In fact, the 12 debutant bowlers of 2011 have shared as many as 18 five-fors. Although the data is not sufficient, this new breed of bowlers seems to bring hope that the balance between ball and bat will finally be restored (see Number Game). What is really encouraging is that six of these new bowlers are less than 22 years of age. So I think there are interesting times ahead for Test cricket, where runs will not come easily, or at least that is what I would like to happen.


To appreciate the importance of these superb performances by the debutant bowlers, we (together with Ajit Padmanabhan) looked at the probability of taking a certain number of wickets in the first four innings. To this end we looked at the bowlers who took at least 200 wickets and bowled in at least 71 innings (otherwise the great Clarrie Grimmett would be left out, and we don't want that). This gives us 57 bowlers for the analysis.


So we asked how these obviously successful bowlers performed early in their careers. The figure below shows the probability of taking a certain number of wickets in each of the first four innings of a bowler who finished his career with at least 200 wickets (left panel). The probability of taking a certain number of wickets decreases as the wicket count increases. This decrease is almost linear, but more data may change the picture. The figure also shows that even for these very successful bowlers the probability of taking five wickets or more is only 7% (right panel). This really reveals how amazing the success of the class of 2011 has been.
Figure 1. Left: Probability of taking a certain number of wickets in the first, second, third and fourth career innings of a bowler. The linear decay of the probability traces shows how difficult it is to take more wickets in a Test innings.
Right: Probability of taking 'n' or more wickets in the first four career innings. The dark line in both panels is the average of the four innings. It turns out that there is only a 7% chance of taking five or more wickets in your first four outings as a Test bowler. Data taken from Cricinfo.
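
The two panels of Figure 1 boil down to an empirical probability of taking exactly n wickets and the probability of taking n or more. The Python sketch below shows one way to compute both from a list of wicket hauls. It is only an illustration under my own assumptions: the function names and the toy hauls are invented, and it pools the innings together rather than reproducing the separate traces of the figure.

```python
from collections import Counter


def wicket_pmf(hauls, max_wickets=10):
    """Empirical probability of taking exactly n wickets, for n = 0..max_wickets."""
    if not hauls:
        raise ValueError("need at least one innings")
    counts = Counter(hauls)
    total = len(hauls)
    return [counts.get(n, 0) / total for n in range(max_wickets + 1)]


def prob_at_least(pmf, n):
    """Probability of taking n or more wickets (right panel of Figure 1)."""
    return sum(pmf[n:])


# Hypothetical wicket hauls from early-career innings, NOT the Cricinfo data.
toy_hauls = [0, 1, 1, 2, 3, 0, 5, 2, 1, 4]
toy_pmf = wicket_pmf(toy_hauls)
five_or_more = prob_at_least(toy_pmf, 5)
```

Feeding in the first-four-innings hauls of the 57 bowlers would let you check the 7% figure for five or more wickets quoted above.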


How much time should you invest in a new bowler?
In cricket, early success or failure is no predictor of long-term success. So how much time should a team invest in a bowler if he is not delivering? We turned to the numbers of the bowlers who took at least 200 wickets and extended our previous analysis. This time we averaged the innings-by-innings wickets of the bowlers in the 200+ club. It turns out that, when averaged over 57 bowlers, it takes about 10 innings before the bowlers reach their steady state of taking on average 1.5 wickets per innings.


Figure 2. Evolution of a Test bowler. The probability distribution shown in Figure 1 is color coded and shown for the first 71 innings of the 57 bowlers who have taken at least 200 wickets. Dark colors mean low probability and bright colors mean high probability. The thick blue line is the mean of the probability distributions. The slow rise of the blue line indicates that Test bowlers take some time to find their feet at the highest level of cricket. After about 10 innings the line hovers around 1.5, indicating that from then on the bowlers' stats don't vary much. It also shows that even the very successful bowlers on average take only about 2 wickets per innings. This is consistent with the fact that most teams field 4-5 bowlers who fight to take 10 wickets.
So we argue that teams should persist with a new bowler for at least 5 Test matches (roughly 10 bowling innings) before giving up on him. By the same token, if a bowler has not reached his steady state of about 1.5 wickets per innings by the fifth match, there is little chance for him, statistically speaking.
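
The averaging behind the blue line in Figure 2 can be sketched in a few lines of Python. This is a rough illustration, not our actual analysis code: the function names, the toy careers and the simple "first innings at which the mean reaches a level" criterion are all assumptions made for the example.

```python
def mean_by_innings(careers, n_innings):
    """Average wickets at each career innings index across bowlers.

    careers[b][i] is the number of wickets bowler b took in his (i+1)-th
    career innings. Every bowler must have at least n_innings entries.
    """
    if not careers:
        raise ValueError("need at least one bowler")
    if any(len(c) < n_innings for c in careers):
        raise ValueError("every bowler needs at least n_innings innings")
    return [sum(c[i] for c in careers) / len(careers) for i in range(n_innings)]


def innings_to_reach(means, level):
    """1-based innings number at which the mean first reaches `level`, or None."""
    for innings_number, value in enumerate(means, start=1):
        if value >= level:
            return innings_number
    return None


# Hypothetical careers of three bowlers over four innings, NOT real data.
toy_careers = [
    [0, 1, 2, 2],
    [1, 1, 1, 3],
    [0, 2, 3, 1],
]
toy_means = mean_by_innings(toy_careers, 4)
reached_at = innings_to_reach(toy_means, 1.5)
```

With the real data, `careers` would hold the first 71 innings of each of the 57 bowlers, and `innings_to_reach(means, 1.5)` would give a crude estimate of how long it takes to find your feet. A noisy mean can cross the level early by chance, so smoothing the means first would be a sensible refinement.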


right arm over
Arvind

PS: I was helped by Ajith Padmanabhan in collection and analysis of the data.