Gaining Strokes Lost

Losing Strokes Gained covers the idea of a comprehensive, predictive suite of three golf statistics. One of these statistics, Stuff+, aims to be a core component of college golf scouting because it judges a shot’s physical characteristics rather than its outcome. This methodology allows a college golfer, or any golfer for that matter, to get on a launch monitor and instantly know how far above or below PGA TOUR average their shots would be.

The above paragraph seems fantastic in theory. But, as Paul Graham writes, “Putting ideas into words is certainly no guarantee that they’ll be right. Far from it. But though it’s not a sufficient condition, it is a necessary one.” My theory may seem like a worthwhile idea, but without a concrete implementation it is just that, a theory. This article aims to investigate further which features could be most beneficial for computing Stuff+, or expected strokes gained, for a given shot off-the-tee.

After much back and forth, the PGA TOUR finally sent me documentation for its ShotLink API. The TOUR’s API access costs in the five figures, so this article won’t be finished research, but rather a reflection on what actually makes a tee shot valuable. The thoughts expressed here aren’t backed by objective, data-driven reasoning. They are the direct result of thinking about the different types of ball flights and play styles that I believe lead to strokes gained.

The TOUR separates shot tracking into four different categories: RawBallTrajectory, RawNormalizedTrajectory, BallTrajectory, and NormalizedTrajectory. Each of these ball tracking categories shares similar statistics (features in a potential model), but the approach to tracking itself differs between them. For the purposes of this project, NormalizedTrajectory would be best suited to provide the statistics needed to adequately model Stuff+.

The TOUR’s documentation defines NormalizedTrajectory as “Trajectory of the ball flight converted to shotlink coordinates without taking weather in account.” Included in NormalizedTrajectory are a host of useful statistics, chief among them SpinAxis, XFit, YFit, ZFit, MaxHeight, and Curve. It’s important to mention that WindVelocity is also a tracked category, which should be a helpful way to adjust carry numbers given conditions. Other weather factors are likely important for further adjusting ball flights, but given my current understanding of what the TOUR offers, they would need to be accessed through a different API.

Outside of trajectory data, the API carries several other statistics of particular interest: apex_range, apex_side, vertical_launch_angle, and horizontal_launch_angle. Paired with the aforementioned trajectory statistics, these give a complete picture of ball flight that aligns with the statistical goal described above.

Let’s dive deeper into a few of the aforementioned features and their use cases. Each appears in the format [statistic] - “TOUR definition”, followed by my thoughts. Please note that any italicized text indicates a statistic stored by the PGA TOUR’s ShotLink dataset. Stuff+ would be the result of modeling strokes gained based on the italicized statistics mentioned, rather than by determining strokes gained on proximity/lie.

[spinAxis] - “The degree of rotation relative to the horizon. Negative = a left tilt, Positive = a right tilt.”

Picture yourself perched in the sky on a Boeing 747, looking at the wing outside your nearby window. You’re mid-flight and the wings are running parallel to the ground; the spin axis of the plane at this point is 0. As the wings bank left and right, the spin axis moves to negative and positive values. In golf terms, a spin axis between -2 and 2 degrees would be considered a straight shot with no discernible shot shape. Any ball with a spinAxis below -2 would be a ball going left, and anything above 2 would be a ball going right. For a righty, this means that a spinAxis below -2 is a draw or hook and a spinAxis above 2 is a cut, fade, or slice.
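As a minimal sketch of the thresholds described above, the function below labels a shot from its spin axis. The function name, the right_handed flag, and the choice to treat values of exactly -2 and 2 as straight are my own assumptions for illustration; none of them come from the ShotLink documentation.

```python
def classify_shot_shape(spin_axis_deg: float,
                        right_handed: bool = True,
                        straight_band_deg: float = 2.0) -> str:
    """Label a shot as 'straight', 'draw', or 'fade' from its spin axis.

    Negative spin axis = left tilt, positive = right tilt (per the TOUR
    definition quoted above). The 2-degree band is the article's rule of
    thumb; treating the band edges as straight is an assumption.
    """
    if abs(spin_axis_deg) <= straight_band_deg:
        return "straight"
    curves_right = spin_axis_deg > 0
    # A ball curving right is a fade for a righty and a draw for a lefty.
    if curves_right == right_handed:
        return "fade"
    return "draw"


if __name__ == "__main__":
    for axis in (-6.0, 0.5, 4.0):
        print(axis, classify_shot_shape(axis))
```

The labels flip for a left-handed player, since a ball curving right is a draw for a lefty. Hooks and slices would need a second, wider threshold that the article does not specify.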

What interests me most about spin axis data is the ability to fit golfers into categories based on their shot shapes. Most golf fans know Rory McIlroy plays a draw, but new TOUR golfers have very little information available on them. Spin axis data could allow analysts to glean insights into newcomers’ play style as well as reveal changes in shot shape preference over time for tenured TOUR players. Additionally, spin axis data has the ability to separate truth from conjecture in course fit narratives. We often hear pre-tournament that a course is a “drawer’s course” or a “fader’s course”, but that’s all unproven. Spin axis data could show actual trends in success based on shot shape. Research could also examine whether smaller spin axis numbers lead to higher strokes gained. I imagine the answer isn’t so clear, but we often hear that one of Scottie Scheffler’s greatest skills is hitting balls in tight windows with minimal curve. I would guess someone of Scheffler’s shot-making stature varies his spin axis numbers but keeps them in a tight range.

[apex_range] - “distance of the ball from the tee at the balls highest point in ft”

This statistic’s use case lies in its ability to define a shot type. In essence, apex_range measures how far from the tee the ball is when it reaches its apex. Think of your once-a-round skyball struck off your driver’s crown: the ball doesn’t travel very far; it goes straight up and down. Its apex_range would be much smaller than that of your perfectly struck tee shot, whose apex falls at a more desirable distance down the fairway. Apex_range could be used to identify flush strikes. If two players hit driver at 175 mph ball speed, the one with the higher apex_range likely struck it as intended, since a higher apex_range indicates the ball’s propensity to hold velocity. A feature that combines apex_range and apex_height could also identify stingers. Being able to identify stingers, characterized by low apex_range, low apex_height, and large runout, would allow analysts to understand whether choosing a stinger is a positive expected value choice for golfers. This shot by Luke Clanton would be defined as such:
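The stinger feature described above could be sketched as a simple rule. Everything numeric here is a placeholder: the three thresholds are invented for illustration and would need to be fit to real data, and the carry_ft and total_ft fields are hypothetical names I am assuming exist so that runout can be computed as total distance minus carry.

```python
from dataclasses import dataclass


@dataclass
class TeeShot:
    apex_range_ft: float   # distance from the tee at the ball's highest point
    apex_height_ft: float  # height of the ball at its highest point
    carry_ft: float        # hypothetical field: distance flown in the air
    total_ft: float        # hypothetical field: carry plus runout


def is_stinger(shot: TeeShot,
               max_apex_range_ft: float = 400.0,   # placeholder threshold
               max_apex_height_ft: float = 50.0,   # placeholder threshold
               min_runout_ft: float = 120.0) -> bool:  # placeholder threshold
    """Flag shots with low apex_range, low apex_height, and large runout."""
    runout_ft = shot.total_ft - shot.carry_ft
    return (shot.apex_range_ft <= max_apex_range_ft
            and shot.apex_height_ft <= max_apex_height_ft
            and runout_ft >= min_runout_ft)


if __name__ == "__main__":
    low_runner = TeeShot(apex_range_ft=350, apex_height_ft=40,
                         carry_ft=690, total_ft=840)
    stock_drive = TeeShot(apex_range_ft=540, apex_height_ft=110,
                          carry_ft=900, total_ft=930)
    print(is_stinger(low_runner), is_stinger(stock_drive))
```

With a flag like this in place, strokes gained on flagged shots could be compared against unflagged shots on the same holes to test whether the stinger is a positive expected value choice.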

[apex_side] - “distance of the ball left or right relative to the x axis at its highest point in ft”

More than anything, apex_side serves as a way to figure out where players are missing. Wyndham Clark’s 2025 season would be a worthwhile case study for apex_side. Clark is known for his preference to cut out the left side of the golf course by playing a massive fade, but his form has fallen off since his 2023 U.S. Open victory at Los Angeles Country Club. Clark’s frustration has been highlighted in a bad off-the-tee moment at Quail Hollow this year (below) and his alleged Oakmont locker room mishap.

If you were to create a feature where apex_side pairs with curve, you could quantify exactly where a player’s struggles come from. The above video shows Clark’s ball start just left of the target line. I’m guessing Clark intended to start it at the left edge of the fairway with the apex_side well to the left of the broadcast’s target. Instead, the apex_side on this shot falls right of the broadcast’s target, hence Clark’s intense frustration. Any shot with apex_side on the short side of the fairway and a curve heading away from the fairway would be docked heavily because the player hasn’t controlled the flight as intended.
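A rough sketch of that paired feature might look like the following. It simplifies the idea considerably and rests on several assumptions that are mine, not the TOUR’s: that apex_side and curve share a sign convention (negative = left, positive = right), that apex_side is measured from the fairway centerline, and that a half-width of 45 ft is a reasonable stand-in for the fairway edge.

```python
def curving_away_from_fairway(apex_side_ft: float,
                              curve_ft: float,
                              fairway_half_width_ft: float = 45.0) -> bool:
    """Flag a shot that is already wide at its apex and still curving wider.

    Assumed sign convention (not confirmed by the ShotLink docs):
    negative = left, positive = right for both inputs.
    The default half-width is a placeholder, not a measured value.
    """
    outside_fairway_at_apex = abs(apex_side_ft) > fairway_half_width_ft
    # Same sign means the curve carries the ball further from the centerline.
    curving_further_offline = apex_side_ft * curve_ft > 0
    return outside_fairway_at_apex and curving_further_offline


if __name__ == "__main__":
    # Wide right at the apex and still fading: flagged.
    print(curving_away_from_fairway(apex_side_ft=60.0, curve_ft=20.0))
    # Wide right at the apex but drawing back toward the fairway: not flagged.
    print(curving_away_from_fairway(apex_side_ft=60.0, curve_ft=-20.0))
```

A real implementation would need per-hole fairway geometry rather than a single half-width, and a graded penalty would likely serve a model better than a yes/no flag.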

[apex_height] - “height of the ball at the highest point along its trajectory in ft”

Last among the apex statistics, apex_height is the one the TOUR releases to the public. Being able to hit a driver high allows players to cut corners in places where course architects didn’t believe it possible, a massive advantage from a strokes gained perspective. Check out this Rory McIlroy drive from the 4th round of his 2025 AT&T Pebble Beach Pro-Am victory:

You can see the broadcast graphic show a stunning apex of 166 ft, well above the average of the PGA TOUR’s apex leader Frankie Capan III (135 ft). McIlroy’s sheer ability to elevate his ball over the trees due to the speed he creates allows him to hit one of the best drives of the day on the 14th hole: 

[ball_speed] - “Provides the speed (in mph) at which the ball exited the club face for the given stroke”

Ball_speed needs no introduction. Alongside clubhead speed, ball_speed’s importance is well-documented across golf discourse. Look no further than the conversation on X after the PGA TOUR’s ball speed leader, Aldrich Potgieter, bombed and gouged his way around Detroit Golf Club for his maiden PGA TOUR victory. Only two players ranking in the top 30 in average ball speed off-the-tee have netted negative true strokes gained off-the-tee, according to DataGolf. This trend suggests that 180 mph ball speed is an inflection point when identifying truly talented off-the-tee players.

Below is my favorite Potgieter drive of the year at TPC San Antonio’s 17th hole and the drives the rest of the field hit on that hole in round 2.

No drive in round 2 came close to how good Potgieter’s was. It’s in the fairway, just off the green, and leaves Potgieter the entire green to work with. A large majority of the field decided to lay up given the hole location, but when you have the raw ability of Potgieter, there’s no doubt when it comes to strategy. Making this drive even more impressive is the sheer ball speed paired with the trajectory he hit it on. At the ball’s apex, its apex_side already falls right of the bunker. Pair that with minimal curve, and the ball falls exactly where Potgieter intended, creating the most impressive drive of the day on the 17th.

[horizontal_launch_angle] - “Provides the horizontal launch angle of the ball for the given stroke”

Simply put, combining horizontal_launch_angle with spin_axis and curve allows us to understand what kind of miss the player is playing and correctly reward or dock them for it. Here’s Scottie Scheffler at Quail Hollow’s driveable 8th hole in round 4 of this year’s PGA Championship:

Scheffler’s coordinated footwork and hand motion reveal his intent to hit a fade towards the green, which sits tucked right of the tree standing in the middle of the fairway. This shot’s intended start line is that tree, straight out from the tee box, which we can deduce from Scheffler’s feet being aligned directly at it. A closed clubface causes Scheffler to have a horizontal_launch_angle well to the left of the tree despite his body’s best efforts to guide it right. If we were to study every shot on the 8th hole this particular day, I would suspect that none with such a negative horizontal_launch_angle yielded anything near positive strokes gained. Even if this ball had curved enough to finish in a more favorable spot to the tree’s right, a horizontal_launch_angle trend like this would still be a hard-to-manage shot pattern throughout a round or tournament. Scheffler’s left miss persisted for one more hole until he settled in on Quail Hollow’s back nine, leading to his first PGA Championship victory. Most golfers wouldn’t be likely to adjust as quickly as golf’s #1, and the resulting statistics should reflect that reality.

[SpinRateEffectiveFit] - “Normalized Spin rate coefficients. Can be used to calculate the spin rate of the ball at a given time”

One of the most nuanced statistics stored by ShotLink, SpinRateEffectiveFit uses polynomial coefficients to store the lift-producing component of spin at any given time during the ball’s flight. Polynomial coefficients are used because they make RPM retrievable while remaining computationally efficient. So, instead of storing every single point in flight, ShotLink finds coefficients that can represent the ball’s flight in equation form. Why does SpinRateEffectiveFit matter? Assessing spin rate over time makes it easier to identify shots that lift too early and lose distance, have high lift potential, or create too much rifle spin than a single spin rate number does. Rifle spin is the key term here. Think of it as revolutions per minute (RPM) being created, but not being used for carry. A quarterback’s spiral is the extreme case: the pass has a spin axis that points straight ahead, so the ball has 0% active spin, meaning none of its spin is being used to create movement. For a golf shot, rifle spin causes decreased carry and increased sensitivity to wind because that spin produces no lift to hold the ball’s flight. SpinRateEffectiveFit’s most important contribution is finding the golfers who excel at optimizing their spin, indicating their ability to strike the ball as intended.
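To illustrate how a spin rate could be recovered from stored coefficients, here is a minimal sketch that evaluates a polynomial at a given time. I am assuming the coefficients are ordered from the constant term upward and that time is in seconds and the result in RPM; the ShotLink documentation quoted above does not specify either, and the example coefficients are made up.

```python
from typing import Sequence


def spin_rate_at(coeffs: Sequence[float], t_seconds: float) -> float:
    """Evaluate c0 + c1*t + c2*t^2 + ... using Horner's method.

    Assumption: coeffs are ordered lowest degree first. If the real feed
    uses the opposite order, reverse the list before calling this.
    """
    rate = 0.0
    for c in reversed(coeffs):
        rate = rate * t_seconds + c
    return rate


if __name__ == "__main__":
    # Made-up coefficients: 2600 rpm at launch, decaying during flight.
    example_fit = [2600.0, -80.0, 2.0]
    for t in (0.0, 2.0, 4.0):
        print(t, spin_rate_at(example_fit, t))
```

Storing three or four coefficients per shot instead of hundreds of sampled points is what makes the spin rate at any moment cheap to look up, which is the efficiency argument made above.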

Butch Harmon, one of golf’s greatest teachers, has always been a proponent of the golf ball being the ultimate teacher. Harmon said:

“I learned to teach at a time when we didn’t have any technology to analyze the swing or ball flight. We had our eyes - and the ball is the ultimate teacher. If you watch what the ball does, it tells you the clubface angle and the swing path at impact, and those are the biggest things. I see teachers today who rely on technology way too much, looking at all the data more than the student or the actual swing. We’re teaching people, not robots. Find me something that works better than my own eyes, and I’ll change my tune.”

Harmon refers to “all the data” here, which would include many of the statistics I mentioned above. What if “all the data” could be synthesized into one number, telling coaches and players a shot’s propensity to net strokes gained? That’s the core aim of Stuff+. Harmon says it himself: the clubface angle and swing path at impact are two of the biggest components of a successful swing. If we let the ball flight dictate a shot’s quality, we are better off when analyzing a golfer’s ability. In focusing on those statistics that truly matter, Stuff+ may be the number that works better than Harmon’s own eyes.
