Who better to forecast the NCAA men’s basketball tournament than the software gurus at SAS and their partners?

After all, last year a team using SAS tools for its “Dance Card” correctly predicted every team that would make the field. This year, the forecast hit 97 percent.

Their only miss? N.C. State, says Shannon Heath of SAS.

What happens when the games begin, however, is an entirely different matter – as Duke Blue Devil fans well know after that stunning loss to Mercer. (The “Dance Card” effort is led by Jay Coleman of the University of North Florida, Mike DuMond of Charles River Associates, and Allen Lynch of Mercer University. How ironic, eh? They even have a lengthy YouTube video describing “Method to the Madness.”)

But if the men and women using analytics software and high-performance computing power to predict what shoppers will buy next and what products will be hot at Christmas can’t forecast March Madness, then don’t weep because your bracket is in tatters.

Among those in mourning is Jared Dean, confessed basketball junkie, who heads Research and Development at SAS.

“My bracket was not as good this year as in the past,” he tells WRALTechWire.

“In the first round there were seven upsets (higher seed beating lower seed) but I didn’t get them all.

“Dayton, North Dakota, and Mercer were unexpected and the Duke loss really hurt my chances because I had predicted that Duke would make it to the final four.”

Ouch!

Yes, Blue Devils, he is blue, too.

“I had bought session tickets before the brackets were announced so I sat in stunned silence as the final seconds ticked away on the Mercer win.”

Now if Dean can’t predict Mercer beating Duke, who can? Look at his bio at SAS: “Jared Dean is Director of Research and Development at SAS. He is responsible for development in SAS’ global data mining solutions. This includes customer engagements, new feature development, technical support, sales support and product integration. He also has a new book on data mining and big data in the later stages of publication. Prior to joining SAS, Jared worked as a mathematical statistician for the US Census Bureau. He holds a MS degree in computational statistics from George Mason University.”

He’s got to be the Dick Vitale of data miners.

However, Duke wasn’t the only reason Dean’s own bracket was devastated.

“My bracket has nine of the 16 teams left in the tournament and my final four prediction was Florida, Arizona, Michigan St. and Duke,” he says.

That’s probably better than most.

Words of Warning

As the 2014 tournament opened, Dean blogged about March Madness and Predictive Modeling.

“This is a great time of year for me, because I get to combine several of my passions,” he wrote, citing statistics, analysis and basketball. 

“In the tournament history stretching back more than 75 years, only 14 universities have won more than one championship, and three schools local to SAS world headquarters are on that list (the University of North Carolina, Duke University, and North Carolina State University). That concentration, combined with the fact that this area is a well-known cluster for statistics, means that I am not alone amongst my neighbors in combining my passions.”

He likes blending bracketology and history.

“I’m sure some readers have used these kinds of strategies and lost or maybe even won the ‘kitty’ in these betting pools, but the best results will come using historical information to identify patterns in the data. For example, did you know that since 2008 the 12th seed has won 50% of the time against the 5th seed? Or that the 12th seed has beat the 5th seed more often than the 11th seed has beat the 6th seed?”
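For readers who want to check seed patterns like these themselves, here is a minimal Python sketch of the idea. The game records are invented placeholders, not real tournament results, and the `upset_rates` helper is a hypothetical name chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy records: (year, favorite_seed, underdog_seed, underdog_won).
# These rows are made up for illustration and are NOT real tournament results.
games = [
    (2008, 5, 12, True),
    (2008, 5, 12, False),
    (2009, 5, 12, True),
    (2009, 5, 12, False),
    (2008, 6, 11, False),
    (2008, 6, 11, False),
    (2009, 6, 11, True),
    (2009, 6, 11, False),
]

def upset_rates(records):
    """Return the share of games won by the underdog for each seed matchup."""
    wins = defaultdict(int)
    total = defaultdict(int)
    for _year, favorite, underdog, underdog_won in records:
        total[(favorite, underdog)] += 1
        wins[(favorite, underdog)] += int(underdog_won)
    return {matchup: wins[matchup] / total[matchup] for matchup in total}

rates = upset_rates(games)
for (favorite, underdog), rate in sorted(rates.items()):
    print(f"{underdog} seed over {favorite} seed: {rate:.0%}")
```

Swapping in real historical results would let anyone test claims such as the 12-over-5 rate for themselves.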

But analysis can go much deeper to create what should be reliable predictive models, he added.

“Upon analyzing tournament data, patterns like these emerge about the tournament, specific teams (e.g. NC State University struggles to make free throws in the clutch), or certain conferences. To make the best predictions, use this quantitative information in conjunction with your own domain expertise, in this case about basketball.

“Predictive modeling methodology generally comes from two groups: statisticians and computer scientists (who may take a more machine learning approach). The field of data mining encompasses both groups with the same aim – to make correct predictions of a future event. Common data mining techniques include logistic regression, decision trees, generalized linear models, support vector machines (SVM), neural networks, and many many more (all available in SAS).”
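To make one of those techniques concrete, the following is a rough Python sketch of logistic regression fitted by plain gradient descent. It is not how SAS or Dean builds models. The single input (the gap between the two seeds), the toy outcomes, and the function names are all assumptions made up for illustration.

```python
import math

# Hypothetical toy data: (seed_gap, favorite_won).
# seed_gap = underdog seed minus favorite seed. All values are invented.
data = [
    (1, 1), (1, 0), (1, 0), (3, 1), (3, 0), (5, 1),
    (7, 1), (7, 1), (9, 1), (11, 1), (13, 1), (15, 1),
]

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(samples, lr=0.01, epochs=5000):
    """Fit weight w and intercept b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            err = sigmoid(w * x + b) - y  # predicted probability minus outcome
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict(w, b, seed_gap):
    """Estimated probability that the favorite wins, given the seed gap."""
    return sigmoid(w * seed_gap + b)

w, b = fit_logistic(data)
print(f"P(favorite wins | gap 1)  = {predict(w, b, 1):.2f}")
print(f"P(favorite wins | gap 15) = {predict(w, b, 15):.2f}")
```

On this toy data the fitted model gives the favorite a better chance as the seed gap widens, which is the kind of pattern a real model would try to learn from real results.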

Get all that? Wonder if bookies use neural networks and support vector machines? 

Yet how many people – even software fanatics – predicted Dayton would topple Ohio State on opening night? Dean noted that game alone knocked out an estimated 80 percent of bracket players hoping for perfection.

Remember the line Mark Twain popularized about stats: there are three kinds of lies, “lies, damned lies, and statistics.”

In concluding his blog, Dean pointed out that models aren’t perfect.

“Statistician George Box is famous for saying, ‘essentially, all models are wrong but some are useful,’” he wrote.

Not Giving Up 

Dean is in Washington, D.C. for the big SAS Global Forum, which drew a record crowd of more than 4,500. They are there to hear the latest about “big data” analytics. But Dean is still following action on the court.

“With the SAS Global Forum conference going on this first part of this week I haven’t had much time to look at revising my predictions,” he says, “but watching the games I still like my original final four picks with the exception of changing Louisville for Duke.”

Dean also plans to blog about lessons he’s learned.

“This week I plan on using the NCAA tournament seedings to illustrate how to assess the quality of a model,” he says. “Next week I plan to continue on the NCAA tournament and ask people to send me their brackets so that I could discuss ensemble models which is crowdsourcing for models.”

Ah, there’s a magic word. Crowdsourcing.

It seems crowds are seen as possible solutions for everything these days.
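What might "crowdsourcing for models" look like in practice? One simple form of ensemble is a majority vote across many brackets. The Python sketch below assumes each bracket is just a mapping from a game label to a predicted winner. The team names, game labels, and the `majority_vote` helper are hypothetical, and Dean has not said this is the method he would use.

```python
from collections import Counter

# Hypothetical picks: each bracket maps a game label to a predicted winner.
brackets = [
    {"game1": "Team A", "game2": "Team C"},
    {"game1": "Team A", "game2": "Team D"},
    {"game1": "Team B", "game2": "Team D"},
]

def majority_vote(all_brackets):
    """Combine brackets by taking the most common pick for each game."""
    consensus = {}
    for game in all_brackets[0]:
        votes = Counter(bracket[game] for bracket in all_brackets)
        consensus[game] = votes.most_common(1)[0][0]
    return consensus

consensus = majority_vote(brackets)
print(consensus)
```

With three voters and two candidates per game there are no ties here. A larger pool would need a tie-breaking rule, such as deferring to the better seed.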

Could crowdsourcing produce the perfect bracket in 2015?

“Would you be willing to send me your bracket and participate?” he asks The Skinny.

Wish I could, but despite being a Hoosier state native and basketball fanatic myself, I gave up on brackets long before Bob Knight and Indiana University parted ways.

Even with Knight, Indiana was always too unpredictable after that magical unbeaten season of 1976. And I always picked Indiana to win.

The worst human factor of all in picking a bracket?

Loyalty.
