Earlier this week Serena Williams made news with her take on how often she is required to undergo drug testing. Williams took to Twitter to voice her opinion, suggesting that she is being singled out given how many more tests she faces than her peers. I figured this would be a good question to try to answer with our good ol’ data visualization tool, Power BI :-). Read on!
The Data
I compiled data from the International Tennis Federation’s (ITF) yearly summary reports for 2015-2017 and filtered it to US players. The ITF does not release per-player data for 2018 until the end of the year, so we won’t have a clear picture of the ITF’s 2018 testing until then. I then combined the ITF data with the U.S. Anti-Doping Agency’s (USADA) numbers for the same time frame. The USADA data is updated weekly, so we can see how often players have been tested in the US so far this year. The ITF and USADA are the two main bodies that conduct drug tests on tennis athletes; WADA can conduct tests as well, but usually abstains from doing so. By filtering to US players, I aimed to get a fair comparison of the total number of tests conducted, given that non-US players do not get tested by USADA.
Lastly, I included some of the top-ranked players from outside the US to provide some comparison with the number of tests given to top players on the circuit. These players are Rafael Nadal, Roger Federer, and Caroline Wozniacki.
The ITF numbers are given in the form of ranges (1-3, 4-6, 7+). I standardized these to 2, 5, and 7 respectively so that I could sum them up. The 7+ category has no upper bound, and there is no way to determine how high it actually goes, so the totals shown here are a low estimate.
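To make the standardization concrete, here is a minimal Python sketch of the idea. The report itself was built in Power BI, so this is only an illustration: the function name `estimate_total`, the list-of-ranges input, and the sample values are all hypothetical, not taken from the actual dataset.

```python
# Representative value used for each ITF range, as described above.
# The 7+ bucket has no upper bound, so 7 is a floor rather than a midpoint.
RANGE_ESTIMATE = {"1-3": 2, "4-6": 5, "7+": 7}


def estimate_total(itf_ranges, usada_tests=0):
    """Sum the per-range ITF estimates and add an exact USADA count.

    itf_ranges: range labels reported for one player (hypothetical input shape).
    usada_tests: exact number of USADA tests for the same player.
    """
    return sum(RANGE_ESTIMATE[label] for label in itf_ranges) + usada_tests


# Made-up example: two ITF ranges plus three USADA tests.
example_total = estimate_total(["4-6", "7+"], usada_tests=3)
print(example_total)
```

Because every 7+ entry counts as exactly 7, any total that includes that range is a lower bound on the real number of tests.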
The Report
I wanted to put some of the new features from July’s Power BI Desktop update to use in this report. One of the first things I did was add a wallpaper. I played around with a semi-transparent background so that my visuals didn’t get overwhelmed by the color and noise in the image. Once published and viewed through the Power BI portal, the wallpaper really stands out and gives the page a very modern, slick look, while the semi-transparent background color keeps the report and its visuals easy to read.
Since I had two datasets that had Player Name and Year, I tested out the Composite models feature to create a many-to-many relationship between the two tables using the Player Name, and this worked out pretty nicely. It didn’t make it into the final version of the report though, as it’s still in preview and the report can’t be published with that feature enabled yet.
Visual Header – This is one of my new favorite features. Having the ability to remove the header on the visuals gives you a lot more flexibility, and gives your report a more polished look.
Lastly, I used the Histogram Chart custom visual to give a perspective on how the number of tests is distributed and to highlight any outliers. Looking at the distribution this way, we can click on any of the bins to see which players fell within a given number of tests, and what the range was for the majority of the players.
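For readers who want to see what the binning amounts to outside of Power BI, here is a rough Python sketch. It is not how the Histogram Chart visual is implemented; the function name `bin_players`, the bin width of 5, and the player names and totals are all made up for illustration.

```python
from collections import defaultdict


def bin_players(totals, bin_width=5):
    """Group player names by the fixed-width bin their test count falls in.

    totals: mapping of player name to estimated number of tests.
    Returns a dict keyed by (low, high) bin bounds, sorted by bin.
    """
    bins = defaultdict(list)
    for player, count in totals.items():
        low = (count // bin_width) * bin_width
        bins[(low, low + bin_width - 1)].append(player)
    return dict(sorted(bins.items()))


# Hypothetical totals, not real figures from the ITF or USADA.
totals = {"Player A": 4, "Player B": 7, "Player C": 9, "Player D": 23}
for (low, high), players in bin_players(totals).items():
    print(f"{low}-{high}: {players}")
```

A player who lands alone in a bin far from the rest, like the made-up "Player D" here, is the kind of outlier the histogram is meant to surface.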
Using the Report
The goal of the report is to help you draw your own conclusion about whether Serena Williams is being singled out by these agencies, and also to provide some insight into the data they make available. One thing to note is that Serena was out for most of 2017 because of her pregnancy, so I provided a link in the report that switches the histogram visual to show the data without the 2017 numbers.