My Awesome Blog

Vanilla policy gradient

The value-function approach on reinforcement learning has worked well in many applications. It obtains the policy by selecting the action in each state with highest estimated value iteratively.

My paper reading list

The fllowing is the papers I am reading and have read.

Awesome site

The fllowing is the papers I am reading and have read.

MathJax is a simple way of including Tex/LaTex/MathML based mathematics in HTML webpages. To get up and running you need to include the MathJax script in the header of your github pages page, and then write some maths. For LaTex, there are two delimiters you need to know about, one for block or displayed mathematics \[ ... \], and the other for inline mathematics \( ... \).

My Awesome Blog

My Git Set Up

My os Set Up

Vim Set Up

Vanilla policy gradient

My paper reading list

Awesome site

MathJax Example