Two artificial intelligence (AI) programs have finally proven they “know when to hold ’em, and when to fold ’em,” recently beating human professional card players for the first time at the popular poker game of Texas Hold ’em. And this week the team behind one of those AIs, known as DeepStack, has divulged some of the secrets to its success—a triumph that could one day lead to AIs that perform tasks ranging from beefing up airline security to simplifying business negotiations.
AIs have long dominated games such as chess, and last year one conquered Go, but they have made relatively lousy poker players. With DeepStack, researchers have broken that losing streak by combining new algorithms with deep machine learning, a form of computer science that in some ways mimics the human brain, allowing machines to teach themselves.
“It’s … a scalable approach to dealing with [complex information] that could quickly make a very good decision even better than people,” says Murray Campbell, a senior researcher at IBM in Armonk, New York, and one of the creators of the chess-besting AI, Deep Blue.
Chess and Go have one important thing in common that let AIs beat them first: They’re perfect information games. That means both sides know exactly what the other is working with—a huge assist when designing an AI player. Texas Hold ’em is a different animal. In this version of poker, two or more players are randomly dealt two face-down cards. As each new set of public cards is revealed, players must bet, hold, or abandon the money at stake on the table. Because of the random nature of the game and the two private starting cards, players’ bets are predicated on guessing what their opponent might do. Unlike chess, where a winning strategy can be deduced from the state of the board and all the opponent’s potential moves, Hold ’em requires what we commonly call intuition.
The aim of traditional game-playing AIs is to calculate a game’s possible outcomes as far ahead as they can, and then rank the strategy options using a formula that searches data from other winning games. The downside to this method is that in order to compress the available data, algorithms sometimes group together strategies that don’t actually work, says Michael Bowling, a computer scientist at the University of Alberta in Edmonton, Canada.
His team’s poker AI, DeepStack, avoids abstracting data by calculating only a few steps ahead rather than an entire game. The program continually re-solves its strategy as new information is acquired. When the AI must act without new information—before the opponent bets or holds—deep learning steps in. Neural networks, the systems that encode the knowledge acquired through deep learning, can limit the number of situations the algorithms must consider, because they have been trained on play from the game. This makes the AI’s reaction both faster and more accurate, Bowling says. To train DeepStack’s neural networks, the researchers required the program to solve more than 10 million randomly generated poker game situations.
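The idea of searching only a few moves ahead and then substituting a learned value estimate can be sketched on a toy game. To be clear, none of this is DeepStack’s code: the game (take-1-to-3 Nim), the depth cutoff, and the heuristic below are illustrative stand-ins for its trained counterfactual-value networks.

```python
# Toy sketch of depth-limited lookahead backed by a value estimate,
# the core pattern behind DeepStack's continual re-solving.

def value_estimate(pile):
    # Stand-in for the trained neural network: a cheap guess at a
    # position's value instead of solving to the end. (In this Nim
    # variant the player to move loses exactly when the pile is a
    # multiple of 4.)
    return -1.0 if pile % 4 == 0 else 1.0

def lookahead(pile, depth):
    """Value for the player to move (+1 = win, -1 = loss), searching
    only `depth` plies before falling back on the estimate."""
    if pile == 0:
        return -1.0  # previous player took the last object and won
    if depth == 0:
        return value_estimate(pile)
    # Negamax: our best value is the best negation of the opponent's.
    return max(-lookahead(pile - take, depth - 1)
               for take in (1, 2, 3) if take <= pile)

print(lookahead(10, 4))  # shallow search plus estimate, not a full solve
```

DeepStack repeats this shallow re-solving at every decision point; the sketch keeps only the depth-limit-plus-estimate pattern that lets it skip searching to the end of the game.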
To test DeepStack, the researchers pitted it last year against a pool of 33 professional poker players selected by the International Federation of Poker. Over the course of 4 weeks, the players challenged the program to 44,852 games of heads-up no-limit Texas Hold ’em, a two-player version of the game in which participants can bet as much money as they have. After using a formula to eliminate instances where luck, not strategy, caused a win, the researchers found that DeepStack’s final win rate was 486 milli-big-blinds per game. A milli-big-blind is one-thousandth of the bet required to win a game. That’s nearly 10 times what professional poker players consider a sizable margin, the team reports this week in Science.
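That unit converts to money straightforwardly. A sketch of the arithmetic, with a hypothetical stake (the $100 big blind below is illustrative, not a figure from the study):

```python
# Converting DeepStack's reported win rate into plain stakes.
# A milli-big-blind (mbb) is 1/1000 of the big blind.
MBB_PER_GAME = 486        # DeepStack's variance-reduced win rate
big_blind = 100.0         # hypothetical table stake, in dollars
winnings_per_game = MBB_PER_GAME / 1000 * big_blind
print(winnings_per_game)  # average dollars won per game at this stake
```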
The team’s findings coincide with the very public success several weeks ago of Libratus, a poker AI designed by researchers at Carnegie Mellon University in Pittsburgh, Pennsylvania. In a 20-day poker competition held in Pittsburgh, Libratus bested four of the top-ranked human Texas Hold ’em players in the world over the course of 120,000 hands. Both teams say their system’s superiority over humans is backed by statistically significant findings. The main difference is that, because of its lack of deep learning, Libratus requires more computing power for its algorithms and initially needs to solve to the end of the game every time to create a strategy, Bowling says. DeepStack can run on a laptop.
Though there’s no clear consensus on which AI is the true poker champ—and no match between the two has been arranged so far—both systems are already being adapted to solve more complex real-world problems in areas like security and negotiations. Bowling’s team has studied how AI could more successfully randomize ticket checks for honor-system public transit.
Researchers are also interested in the business implications of the technology. For example, an AI that can reason about imperfect information could help a buyer estimate the final sale price of a house before knowing the other bids, allowing that buyer to better plan for a mortgage. A system like AlphaGo, the perfect information game–playing AI that defeated a Go world champion last year, couldn’t do this, because there are no limits on the possible size and number of other bids.
Still, DeepStack is a few years away from truly being able to mimic complex human decision-making, Bowling says. The machine still has to learn how to more accurately handle scenarios where the rules of the game are not known in advance, like versions of Texas Hold ’em that its neural networks haven’t been trained on, he says.
Campbell agrees. “While poker is a step more complex than perfect information games,” he says, “it’s still a long way to go to get to the messiness of the real world.”