15 commits (4b7e0d4b-519b-4415-a677-9c85114005f5)

Author SHA1 Message Date
Arthur Juliani 982fab41 Initial commit 7 years ago
vincentpierre cde3c8f7 formating and added documentation 7 years ago
Arthur Juliani 71591043 PPO additions and warnings 7 years ago
GitHub aee5d336 Fix discrete state (#33) 7 years ago
Arthur Juliani adac2683 Fix for multi-agent with observations 7 years ago
Arthur Juliani 06d9bbec Log lesson in TensorBoard 7 years ago
vincentpierre 22db3d64 added the modified files from dev-cooperative-env 7 years ago
Arthur Juliani 51f23cd2 0.2 Update 7 years ago
Arthur Juliani b56259f6 Fix cumulative reward (Unity) and Nan reward (python) bugs 7 years ago
Arthur Juliani 75ea16ff Add comments and alphabetize flags 7 years ago
Arthur Juliani adedd491 Initial support for multiple observations (#256) 7 years ago
Arthur Juliani 54652c69 dev-logParam (#135) 7 years ago
GitHub faa53e35 Fix observations on PPO trainer (#340) 7 years ago
Arthur Juliani c21a391d Various bug fixed and changes 7 years ago
Arthur Juliani 9d26767d Instantiate training buffer with trainer 7 years ago