Ravi Manjunatha
Google Cloud - Community
5 min read · Feb 10, 2023


Video AI to build your personal Tennis Video Library!

AI in sports and fitness is an exciting prospect. Increasingly, professional clubs employ specialists who are adept at harnessing data, be it match stats, player stats or performance videos, and who recommend ways to improve one’s game or outwit the opponent.

I recently had the opportunity to work on a use case that explored using Video AI models to build a video library containing specific segments of videos. Let's say you have hours of tennis videos and would like to split a particular player's forehands, backhands and drop shots into separate videos, then analyze their style of play. The ask was to use a GCP-native AI solution to achieve this.

To accomplish this, we will use the AutoML Video Intelligence models in Vertex AI, GCP’s unified AI platform. The solution is described below.

To demonstrate this solution, I was looking for publicly available labelled tennis videos :-) Unfortunately, I couldn't find any. I ended up using labelled videos of fights and no-fights, found at this link. The dataset has over 150 video clips of fights and no-fights recorded in hotels, bars and shopping centers. The solution should work exactly the same way for the tennis video library use case, given labelled videos of forehands, backhands or drop shots.

We will use these videos for the demonstration. The AutoML training used in this demo is charged per hour of training, so the total cost for this demo will be around 20 USD.

1. As a first step, we need to upload all these videos to a Cloud Storage (GCS) bucket. You can keep the fight and no-fight videos in different folders if required.

2. We will use AutoML Video Classification in Vertex AI.

3. Create a dataset in Vertex AI to hold the videos.

4. The video files are not listed directly at import time. Instead, the Cloud Storage URI of each video is placed in a .csv file. Each row of the CSV file should contain the following fields:

URI of the file, label of the video (in our case, fight or no-fight), time when the event starts, time when the event ends.

In the import file path, we select this .csv file and leave the data split at its default.
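Rather than typing the import file by hand, it can be generated. The following is a minimal sketch, assuming the four-field row layout described above with times given in seconds; the bucket name, file names and time values are hypothetical and only illustrate the shape of the file.

```python
import csv
import io


def build_import_csv(clips):
    """Return CSV text with one row per labelled video segment.

    Each clip is a tuple: (gcs_uri, label, start_seconds, end_seconds).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for uri, label, start, end in clips:
        if end <= start:
            raise ValueError(f"segment end must be after start: {uri}")
        writer.writerow([uri, label, start, end])
    return buffer.getvalue()


# Hypothetical bucket and file names, for illustration only.
clips = [
    ("gs://my-bucket/fights/fi001.mp4", "fight", 0.0, 2.5),
    ("gs://my-bucket/no-fights/nofi001.mp4", "no-fight", 0.0, 3.0),
]
csv_text = build_import_csv(clips)
print(csv_text)
```

The resulting text can be saved as a .csv file, uploaded to the bucket and selected as the import file.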

5. Importing all the videos should take about 5 minutes. The resulting distribution of videos across the labelled categories should be as below.

6. Once the import is complete, we proceed to train the AutoML model.

7. It took my model almost 5 hours to complete training.

8. We can now see the model's evaluation metrics. It gives good precision and recall, even with just 100+ samples.

9. It is now time to test the model by passing videos for prediction. AutoML Video models only support batch prediction.

10. We pass the test videos in the same way as we passed the input files, by listing them in a file. In this case, the file has to be in JSONL format.

11. A sample JSONL input line for batch prediction is shown below:

{"content": "gs://sourcebucket/datasets/videos/source_video.mp4", "mimeType": "video/mp4", "timeSegmentStart": "0.0s", "timeSegmentEnd": "2.366667s"}
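JSON requires double-quoted strings, so generating the file with Python's `json` module avoids quoting mistakes. Here is a minimal sketch that produces one JSONL line per test video, using the same four fields as the sample; the helper names `to_instance` and `build_jsonl` are my own and not part of any Google library.

```python
import json


def to_instance(uri, start_s, end_s, mime_type="video/mp4"):
    """Build one batch prediction instance for a video segment."""
    return {
        "content": uri,
        "mimeType": mime_type,
        "timeSegmentStart": f"{start_s}s",
        "timeSegmentEnd": f"{end_s}s",
    }


def build_jsonl(instances):
    """Serialize instances as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(item) for item in instances) + "\n"


jsonl_text = build_jsonl([
    to_instance("gs://sourcebucket/datasets/videos/source_video.mp4", 0.0, 2.366667),
])
print(jsonl_text)
```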

12. We now upload the JSONL file with the instances for the test videos and give the output path where the predictions will be stored.

13. The batch prediction results for the test files come in the format shown below. They show how each video was classified, with timestamps for the section of the video where the event occurs, such as when the fight took place. Similarly, for tennis videos with multiple classes, the results will give us the instances when a forehand, backhand or drop shot was played.

14. This output can then be used to slice the videos into time segments with a custom Python script. A sample script can be found here. All fight segments can go to one folder and no-fight segments to another.
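As one possible shape for such a slicing script (not the linked sample), the sketch below reads a prediction line and builds an ffmpeg command per confident segment. It rests on several assumptions: that each output line holds an `instance` object plus a `prediction` list whose entries carry `displayName`, `timeSegmentStart`, `timeSegmentEnd` and `confidence`; that ffmpeg is installed; and that the video has already been downloaded to a local path. Check these against your own prediction files before relying on it.

```python
import json


def parse_seconds(value):
    """Convert a duration string such as '2.366667s' to float seconds."""
    return float(value.rstrip("s"))


def confident_segments(prediction_line, min_confidence=0.5):
    """Return (video_uri, [(label, start, end), ...]) for one output line.

    The field names used here are an assumption about the output layout.
    """
    record = json.loads(prediction_line)
    segments = [
        (
            pred["displayName"],
            parse_seconds(pred["timeSegmentStart"]),
            parse_seconds(pred["timeSegmentEnd"]),
        )
        for pred in record.get("prediction", [])
        if pred["confidence"] >= min_confidence
    ]
    return record["instance"]["content"], segments


def ffmpeg_cut_command(local_video, start, end, output_path):
    """Build an ffmpeg command that re-encodes the [start, end] window."""
    return ["ffmpeg", "-i", local_video, "-ss", str(start), "-to", str(end), output_path]


# Hypothetical prediction line, for illustration only.
sample_line = json.dumps({
    "instance": {"content": "gs://sourcebucket/datasets/videos/source_video.mp4"},
    "prediction": [
        {"displayName": "fight", "timeSegmentStart": "0s",
         "timeSegmentEnd": "2.366667s", "confidence": 0.93},
        {"displayName": "no-fight", "timeSegmentStart": "0s",
         "timeSegmentEnd": "2.366667s", "confidence": 0.07},
    ],
})
uri, segments = confident_segments(sample_line)
commands = [
    # One output folder per label, e.g. fight/clip_000.mp4
    ffmpeg_cut_command("source_video.mp4", start, end, f"{label}/clip_{i:03d}.mp4")
    for i, (label, start, end) in enumerate(segments)
]
print(commands)
```

Each command list can be passed to `subprocess.run(cmd, check=True)` once the label folders exist. Re-encoding is slower than stream copying but cuts at the exact timestamps instead of the nearest keyframe.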

15. We can then combine all the fight video segments into a single file using a Python script like the one found here. Similarly, we can combine all the tennis video segments of forehands, backhands or drop shots into one video for each shot type.
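One way to do the combining step (again, not the linked script) is ffmpeg's concat demuxer, which reads a text file listing the segments in order. The sketch below only builds that list and the command; it assumes ffmpeg is installed and that all segments share the same codec and resolution, which stream copying with `-c copy` requires. The file names are hypothetical.

```python
def concat_list_text(segment_paths):
    """Build the list file content expected by ffmpeg's concat demuxer."""
    lines = []
    for path in segment_paths:
        # Inside a single-quoted entry, a quote is written as '\''
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def ffmpeg_concat_command(list_file, output_path):
    """Build an ffmpeg command that joins the listed segments without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_file, "-c", "copy", output_path]


# Hypothetical segment files, for illustration only.
list_text = concat_list_text(["fight/clip_000.mp4", "fight/clip_001.mp4"])
command = ffmpeg_concat_command("fight_segments.txt", "all_fights.mp4")
print(list_text)
print(command)
```

Write `list_text` to `fight_segments.txt`, then run the command with `subprocess.run(command, check=True)`. Repeating this per label gives one combined file for each class.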

So, with about half a day’s effort, you should be able to build your tennis video library with all the shots of your favorite player in a single file! Match point!
