TECHNOLOGY SHARING
Provides open-source algorithms, high-level baseline AI,
training and replay data, AI development kits, and more.
Neural Fictitious Self-Play
CFR against a best responder
Hansen, Steven, et al. "Fast deep reinforcement learning using online adjustments from the past." Advances in Neural Information Processing Systems. 2018.
Trust Region Policy Optimization
Counterfactual Regret Minimization (CFR)
External sampling Monte Carlo CFR
Best Response
Outcome sampling MC CFR
Deep CFR
Regret Policy Gradient
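
Several of the algorithms listed above (Counterfactual Regret Minimization and its Monte Carlo and deep variants) build on regret matching as the per-decision update rule. The following is a minimal sketch of that rule in self-play on rock-paper-scissors, not the platform's own implementation. The names PAYOFF, regret_matching, and train are hypothetical and chosen only for illustration, and the choice of game and the asymmetric starting regrets are assumptions made to keep the example small.

```python
# Row player's payoff in rock-paper-scissors (illustrative game, an assumption).
# Index 0 = rock, 1 = paper, 2 = scissors; the column player receives the negative.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def regret_matching(cumulative_regrets):
    """Map cumulative regrets to a strategy proportional to the positive regrets."""
    positives = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positives)
    n = len(cumulative_regrets)
    if total <= 0.0:
        # No action has positive regret: fall back to the uniform strategy.
        return [1.0 / n] * n
    return [p / total for p in positives]


def train(iterations):
    """Run regret-matching self-play and return both players' average strategies."""
    n = 3
    # Hypothetical asymmetric start so the dynamics are not trivially uniform.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sums = [[0.0] * n, [0.0] * n]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for p in range(2):
            for a in range(n):
                strategy_sums[p][a] += strategies[p][a]
        # Expected value of each pure action against the opponent's current strategy.
        values = [
            [sum(PAYOFF[a][b] * strategies[1][b] for b in range(n)) for a in range(n)],
            [sum(-PAYOFF[a][b] * strategies[0][a] for a in range(n)) for b in range(n)],
        ]
        for p in range(2):
            expected = sum(s * v for s, v in zip(strategies[p], values[p]))
            for a in range(n):
                # Regret of action a: what it would have earned minus what was earned.
                regrets[p][a] += values[p][a] - expected
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(train(100_000)):
        print(player, [round(x, 3) for x in average])
```

In a two-player zero-sum game, the average strategies produced this way approach a Nash equilibrium, which for rock-paper-scissors is the uniform strategy. CFR applies the same update at every information set of a sequential game, weighting the values by counterfactual reach probabilities.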