Why Being a Chief Data Officer is Like Running a Minor League Baseball Team
I recently finished a book that’s been on my list for a while: The Only Rule Is It Has To Work by Ben Lindbergh and Sam Miller. I’m a huge baseball fan (Go Cubs!) and love thinking about how using data can help improve organizations, so this was a perfect book to read. The main premise of the book is that Ben and Sam — two baseball data nerds — are given the opportunity to run a minor league baseball team in Sonoma, California, and they want to use data and sabermetrics to guide all decisions. They plan to change pitchers more frequently, adjust the defense so there are sometimes five infielders instead of four, find prospective players by sorting spreadsheets, and more.
As I read the book, their efforts seemed similar to mine as the City of Syracuse’s Chief Data Officer (also, both authors have good names!). Their initial idea of using data to guide their decisions was similar to what I thought I’d be able to do when I first took my job with the City of Syracuse. I had dreams of building predictive models about all of the different types of work that occurred on a daily basis, transforming the organization into an efficient machine.
I quickly found, as they did, that it is often not that easy. And though they were running a minor league baseball team, and I work with public servants, and we’re located on opposite sides of the country, it turns out there are a lot of similarities.
Counting stuff/information on paper
I often say that most of my job as Chief Data Officer is just figuring out how to count stuff. If the Mayor asks how many potholes the City has filled, or how many vacant properties are in the City, the answer is not always as clear as I’d like. Similarly, when Ben and Sam take over the Sonoma Stompers, they find that players who haven’t caught on in a better league and only qualify to play in an independent league — essentially as low as it gets on the professional baseball ladder — generally do not have a long track record of digitized baseball statistics. Finding out batting averages was difficult enough, let alone some of the more complex statistics more commonly understood at the major league level, or within sabermetrics.
Data that only exists on paper is a constant challenge. Sometimes digitizing the information is too costly and not valuable enough to justify the effort, so the data is never used.
Partnerships
One way to collect or digitize data is to partner with people and organizations that want to help. In our case, that has often meant working with students who want real-world experience or have to do a project to satisfy a requirement for academic credit. We’ve worked with students who have helped us document where every sidewalk curb corner is in the city and have looked in 100-year-old water engineering books to find the year a water main was installed. We have also leveraged data from GPS units on our fleet to understand where and when potholes are being filled.
In the baseball context, the Stompers were lucky that Ben and Sam had an established fan base, and a select few were willing to help do data entry, build databases, and make sense of the information. Additionally, they got help from companies like PitchFX that supplied technologies to better understand and collect data about what was happening during games, like pitch locations.
As with any technology, though, sometimes it doesn’t work. The software the Stompers used could be glitchy, and it depended on a lot of cameras. Our GPS units sometimes go down and we lose access to some data.
As with any volunteers, you get what you pay for, and sometimes the results are not as perfect as you’d like. But in the pursuit of better data, there will always be obstacles.
Showing information helps
In the book, many of the players initially are not interested in data or technology. The players have played baseball their entire lives, have one shot left to make it to any level of pro baseball, and are not about to start trusting a couple of guys who think data will help make them better ballplayers. Once Ben and Sam show video that helps to give tips on improving swings or use the information at hand to change a strategy, some of the players start to realize the potential and ask for more.
Though showing video and using data is not foolproof and does not suddenly make the players into all-stars, they did make some progress.
In city government, many staff feel similarly when it comes to data. The staff and supervisors have been in their roles sometimes for decades, and are not about to have people who have never done the job, but can crunch numbers on a computer, show them how to do the job better. Our approach to incorporating data has been either to think about where their frustrations are and how visualizing data could help, or to perform an analysis that essentially justifies a view they already hold.
In the first approach, we knew that during major snow storms the Department of Public Works snow crews need to plow a lot of streets very quickly. Sometimes, for a variety of reasons, a street might get missed. We built a snow plow tracker that shows which streets have been plowed during a storm. Visualizing information in this way helped to ensure all streets were plowed as quickly as possible and no street was missed.
In the second approach, we worked with the Water Department to identify water mains that were at a high risk of breaking. Initially, just showing a heatmap of where the breaks occur most frequently got the staff in the department to engage with us. They recognized blocks in the city where they had fixed breaks over the years, and they told us stories about those experiences, which in turn helped us learn more about important features in the data.
Just like with the Stompers, initially finding information that helps or is easily relatable got staff to trust and understand our approach, and also to realize that the work we were doing was not meant to hurt or shame them, but rather to help make them more successful in their jobs by providing the tools they need to make decisions.
Culture
No matter what the data say, sometimes people will ignore the recommendation. In the book, there are a number of instances where the manager just tells Ben and Sam that he won’t follow their advice. The recommendation was either too strange or different from common baseball practice, or the downside if it didn’t work was too great. In the book, Ben and Sam are left with little they can do in the moment, because ultimately, they are providing information, but not actually doing the work of managing the at bats.
In city government, we sometimes experience similar challenges. The data may point in a pretty clear direction. In one instance, they seemed to show that we should drastically change the way we repave our roads because we were doing too little preventative maintenance. Our recommendation was to essentially stop doing major road work on a few roads, and instead focus on minor road work on many roads. The data were clear, but the recommendation was largely not taken. For better or worse, though, the roads will always need maintenance, so we continue to show data that supports our claim.
Gut decisions
Near the end of the book, after most of the season is complete and there have been some successes but also many challenges, Ben and Sam review the work they think they accomplished. There are some questions about what impact they even had. Given that the data were sparse when they started their journey, and they certainly made their share of poor recommendations, they realize that even they, the data-driven, FanGraphs-loving sabermetricians, still relied on their gut when making a call during a game. This is because ultimately, you have to decide which data to consider and which to ignore. You have to understand that you are dealing with people, not robots, and so sometimes you may push harder for your recommendation than other times, even if you know which way would be the most efficient or effective.
In city government, we often face the same challenge. Yes, we want to push for use of data in decision making. But we also know the data often aren’t in the best shape, so there may be a lack of clarity in the recommendation. We decide to analyze certain data instead of others, or work with specific departments even though others need help, too. These gut decisions mean that we are not always as evidence-based as we would like, but we also know that in order to improve and embed the practice of data-driven decision making in local government, we have to put ourselves out there and take a chance based on the best information we have.
Success feels great
Ben and Sam have a clear advantage when it comes to experiencing success in baseball over experiencing success in local government. The quick rush of seeing the prospect you found based on a data analysis hit a home run, or watching an out get recorded because of the defensive shift you recommended would be hard to beat. Nonetheless, correctly predicting a water main break, or giving someone an insight into their work that sparks a new idea is pretty cool — the closest version of stealing home that we get as data nerds in public service. Another similarity: in both baseball and local government, the recommendations are tangible and the results can be quick and clear.
While Ben Lindbergh and Sam Miller were trying to fill the lineup card more effectively and I’m trying to help the Department of Public Works fill potholes, dealing with personalities, a shoestring budget, and a lack of clean data are a few clear overlaps between running a baseball team and running the data team for a local government. The book is a fun one to read, and I recommend it. Even if you aren’t a huge baseball fan, the book offers good insights into the challenges of building a slightly more data-driven organization and strategies for doing so.