How can I train a neural network to play the 2048 game?

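I want to train a neural network to play the 2048 game. I know that a NN is not a good choice for a state-based game like 2048, but I would like the NN to play the way an experienced human does, i.e. moving tiles in only three directions.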

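But I don't know how to make the neural network train itself, because we don't know the valid output. Normally, e.g. in regression, you know the correct output, so you can compute a loss (e.g. mean squared error) and update the weights. In 2048, however, the valid output is basically unknown (of course you can compute the score for each direction you can move in, e.g. the direction with the highest difference score_after_move - previous_score would be our valid output, but I don't think that counts as the NN learning by itself). So is it possible to define a loss function for the 2048 game? Preferably a differentiable one.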
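For reference, the greedy rule mentioned in the parentheses (pick the direction with the largest score_after_move - previous_score) could be sketched roughly like this. It is only an illustration: the function names are made up, the direction order beyond index 0 = "up" is an assumption, and the move logic follows the standard 2048 merge rule without spawning a new tile.

```python
import numpy as np

# Index 0 = "up" matches the [1, 0, 0, 0] example; the rest of the order is assumed.
DIRECTIONS = ["up", "right", "down", "left"]

def slide_row_left(row):
    """Slide one row to the left, merging equal neighbours once; return (row, gain)."""
    tiles = [int(v) for v in row if v != 0]
    merged, gain, i = [], 0, 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(2 * tiles[i])
            gain += 2 * tiles[i]          # score grows by the value of the merged tile
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged)), gain

def move(board, direction):
    """Return (new_board, gain) for a move; no random tile is spawned here."""
    k = (direction + 1) % 4               # rotate so that the move becomes "slide left"
    rows, total = [], 0
    for row in np.rot90(board, k):
        new_row, gain = slide_row_left(row)
        rows.append(new_row)
        total += gain
    return np.rot90(np.array(rows), -k).copy(), total

def greedy_direction(board):
    """Pick the legal move with the largest immediate score gain, or None."""
    best, best_gain = None, -1
    for d in range(4):
        new_board, gain = move(board, d)
        if np.array_equal(new_board, board):
            continue                      # the move changes nothing, so it is not legal
        if gain > best_gain:
            best, best_gain = d, gain
    return best
```

Such a rule only looks one move ahead, which is why it is more of a hand-written heuristic (or a source of training labels) than self-learning.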

The next question is when to update the weights: after each move, or after the game is finished (game over)?

If it matters: my NN topology is simple for now:

2D matrix of gaming board -> 2D matrix of input neurons -> 2D fully-connected hidden layer -> 1D 4-neuron layer

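So each tile is fed into its corresponding neuron in the first layer (is there a special name for a 2D fully-connected layer?). The expected output of the last layer is a vector of length 4, e.g. [1, 0, 0, 0] would be the "up" move direction.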
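To make the topology concrete, here is a minimal NumPy sketch of the forward pass. Several things in it are assumptions and not part of the description above: the log2 encoding of tile values, the hidden size of 32, the ReLU and softmax activations, and flattening the 2D board into a 16-element vector (which is what a fully-connected layer effectively sees anyway).

```python
import numpy as np

def encode_board(board):
    """Flatten the board to 16 inputs, using log2 of each tile (empty = 0)."""
    board = np.asarray(board, dtype=np.float64)
    exponents = np.zeros_like(board)
    exponents[board > 0] = np.log2(board[board > 0])
    return exponents.reshape(-1) / 16.0   # scale roughly into [0, 1]

def init_params(rng, n_in=16, n_hidden=32, n_out=4):
    """Small random weights for one hidden layer and a 4-neuron output layer."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, x):
    """Return a probability for each of the 4 move directions."""
    hidden = np.maximum(0.0, x @ params["W1"] + params["b1"])   # ReLU
    logits = hidden @ params["W2"] + params["b2"]
    exp = np.exp(logits - logits.max())                          # stable softmax
    return exp / exp.sum()

def one_hot_move(probabilities):
    """Turn the output into a vector like [1, 0, 0, 0] (index 0 = "up")."""
    chosen = np.zeros(4, dtype=int)
    chosen[int(np.argmax(probabilities))] = 1
    return chosen
```

Usage would look like `one_hot_move(forward(init_params(np.random.default_rng(0)), encode_board(board)))`; with untrained weights the chosen direction is of course arbitrary.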

For now I have implemented a headless class for the 2048 game (in Python/NumPy), because using visual input is slow and also more work to do.

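P.S. Maybe I have the wrong idea about NN learning for this game (or for games in general). Feel free to show me a better way, I would appreciate it. Thanks :)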

EDIT: Reinforcement learning seems to be the right way. Here are some useful links:

Demystifying Deep Reinforcement Learning

Action-Value Methods and n-armed bandit problems

Q-learning for Keras

Deep Reinforcement Learning for Keras
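As a rough illustration of how Q-learning gets around the missing "correct output": the network predicts one Q-value per direction, and the regression target for the move that was actually played is built from the observed reward plus the discounted best Q-value of the next state. The squared error against that target is differentiable, and it can be computed after every single move. The function names and the discount factor below are illustrative assumptions.

```python
import numpy as np

def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return float(reward)              # no future reward after game over
    return float(reward) + gamma * float(np.max(next_q_values))

def squared_td_loss(q_values, action, target):
    """Squared error between the predicted Q-value of the played move and the target."""
    return 0.5 * (float(q_values[action]) - target) ** 2
```

Here the reward could be, for example, the score gained by the move; that choice is also an assumption.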

So https://github.com/matthiasplappert/keras-rl seems to be the best way. You only have to implement a few methods defined by the OpenAI Gym environment API. These are the step() and reset() methods: https://github.com/matthiasplappert/keras-rl/blob/master/rl/core.py#L330

For more information, see the answer from the keras-rl developer: https://github.com/matthiasplappert/keras-rl/issues/38
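A headless 2048 class with reset() and step() in the classic Gym style (step returns observation, reward, done, info) might look roughly like the sketch below. This is not the code from my project and it does not subclass gym.Env; a real keras-rl setup would likely also need things such as action_space and observation_space. The class name, the reward (score gained by the move), the action order (0 = up, 1 = right, 2 = down, 3 = left) and the handling of illegal moves (board unchanged, reward 0) are assumptions.

```python
import numpy as np

class Game2048Env:
    """Headless 2048 with a Gym-style reset()/step() interface (sketch)."""

    def __init__(self, size=4, seed=None):
        self.size = size
        self.rng = np.random.default_rng(seed)
        self.board = np.zeros((size, size), dtype=np.int64)

    def reset(self):
        """Start a new game with two random tiles and return the board."""
        self.board = np.zeros((self.size, self.size), dtype=np.int64)
        self._spawn()
        self._spawn()
        return self.board.copy()

    def step(self, action):
        """Apply a move and return (observation, reward, done, info)."""
        new_board, reward = self._move(self.board, action)
        if not np.array_equal(new_board, self.board):
            self.board = new_board
            self._spawn()                 # a tile appears only after a legal move
        else:
            reward = 0                    # illegal move: nothing happens
        done = all(
            np.array_equal(self._move(self.board, a)[0], self.board)
            for a in range(4)
        )                                 # game over when no move changes the board
        return self.board.copy(), float(reward), done, {}

    def _spawn(self):
        """Put a 2 (90 %) or a 4 (10 %) on a random empty cell."""
        empty = np.argwhere(self.board == 0)
        r, c = empty[self.rng.integers(len(empty))]
        self.board[r, c] = 2 if self.rng.random() < 0.9 else 4

    @staticmethod
    def _slide_left(row):
        """Slide one row to the left, merging equal neighbours once."""
        tiles = [int(v) for v in row if v != 0]
        merged, gain, i = [], 0, 0
        while i < len(tiles):
            if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
                merged.append(2 * tiles[i])
                gain += 2 * tiles[i]
                i += 2
            else:
                merged.append(tiles[i])
                i += 1
        return merged + [0] * (len(row) - len(merged)), gain

    def _move(self, board, action):
        """Return (new_board, gained_score) without spawning a tile."""
        k = (action + 1) % 4              # rotate so that the move becomes "slide left"
        rows, total = [], 0
        for row in np.rot90(board, k):
            new_row, gain = self._slide_left(row)
            rows.append(new_row)
            total += gain
        return np.rot90(np.array(rows, dtype=np.int64), -k).copy(), total
```

A training loop would then call `obs = env.reset()` once per game and `obs, reward, done, info = env.step(action)` after every move until `done` is true.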

When my 2048 game AI project is finished, I will post the link here (if I don't forget to do so :))

EDIT: Here is the promised link to the source, I completely forgot about it :/ https://github.com/gorgitko/MI-MVI_2016

python

artificial-intelligence

neural-network

keras