yanchaosun
|
1b86b3ae
|
initialize
|
4 years ago |
yanchaosun
|
3ef4196e
|
Added the algorithm named ppo_transfer
|
4 years ago |
yanchaosun
|
c2d6f5c0
|
basic implementation
|
4 years ago |
yanchaosun
|
a9c788d7
|
new model
|
4 years ago |
yanchaosun
|
ac4c80c2
|
integrate the implementation and hyperparameters
|
4 years ago |
yanchaosun
|
1e52ad3d
|
ready for cloud training
|
4 years ago |
yanchaosun
|
05a96355
|
remove slim package
|
4 years ago |
yanchaosun
|
ad95032b
|
transfer path
|
4 years ago |
yanchaosun
|
59251abe
|
change yamls
|
4 years ago |
yanchaosun
|
a80915a8
|
yaml update
|
4 years ago |
yanchaosun
|
666c8ba9
|
new cloud training change
|
4 years ago |
yanchaosun
|
59e93b0b
|
transfer config
|
4 years ago |
yanchaosun
|
d7402406
|
multiple sizes configs
|
4 years ago |
yanchaosun
|
5eccb4c9
|
new transfer test for cloud
|
4 years ago |
yanchaosun
|
d1e8d344
|
with swish activation
|
4 years ago |
yanchaosun
|
f74af710
|
fix problem
|
4 years ago |
yanchaosun
|
3f0cc587
|
fix
|
4 years ago |
yanchaosun
|
4ba543e5
|
fix bug
|
4 years ago |
GitHub
|
839eb2cb
|
Develop model transfer test (#4214)
* test env, and code integration
* delete results
|
4 years ago |
yanchaosun
|
7e3216ae
|
simple env test
|
4 years ago |
yanchaosun
|
cdaaa318
|
bisim
|
4 years ago |
yanchaosun
|
3d0d359c
|
bisimulation draft
|
4 years ago |
yanchaosun
|
1fdbfe65
|
no normalization
|
4 years ago |
yanchaosun
|
5a778ca3
|
fix normalization
|
4 years ago |
yanchaosun
|
a212fef9
|
new bisim implementation
|
4 years ago |
yanchaosun
|
aca8cd58
|
update with new alternating
|
4 years ago |
yanchaosun
|
0e2f6e19
|
small fix
|
4 years ago |
yanchaosun
|
ec929746
|
minor update
|
4 years ago |
Andrew Cohen
|
d0133066
|
working
|
4 years ago |
yanchaosun
|
9bc90956
|
fix bug with bisimulation
|
4 years ago |
yanchaosun
|
f8b91faa
|
try to fix the bisim metric
|
4 years ago |
yanchaosun
|
ce36349b
|
some changes
|
4 years ago |
Andrew Cohen
|
1b17ae56
|
add tanh activ
|
4 years ago |
Andrew Cohen
|
5fa28f5f
|
merge YC changes
|
4 years ago |
yanchaosun
|
28355444
|
bisim fix, disable stop gradient
|
4 years ago |
yanchaosun
|
3246570c
|
added action encoder, and flags related to action training/transferring; set model_schedule as a changeable hyperparameter
|
4 years ago |
GitHub
|
9f041970
|
Develop bisim action encoder, incorporate related hyperparameter settings (#4253)
|
4 years ago |
yanchaosun
|
b991096b
|
update target encoder soft copy
|
4 years ago |
yanchaosun
|
b74294bf
|
target encoders and new forward loss
|
4 years ago |
yanchaosun
|
0c468084
|
sac transfer implementation; disable action encoder
|
4 years ago |
yanchaosun
|
36f36750
|
target critic for ppo
|
4 years ago |
yanchaosun
|
6df774ed
|
update: separate model train as an option
|
4 years ago |
yanchaosun
|
e8fcc4bb
|
ppo new implementation
|
4 years ago |
Andrew Cohen
|
72bf7b72
|
reward loss separate
|
4 years ago |
yanchaosun
|
b5e02978
|
sac crawler config
|
4 years ago |
yanchaosun
|
2e927257
|
separate policy net
|
4 years ago |