Have a great summer break

It was good having you guys in the seminar. Have a great summer break and keep in touch.

Posted in Announcements | Leave a comment

Lecture 16: The Exp3 algorithm

(Guest post by Bobby Dygert and Dan Padgett)

In this lecture, we consider a simple modification to the PSIM algorithm presented last class, to try and get a lower bound on the algorithm’s normalized regret. Instead of exploring every expert during each phase, we choose a probability p that, at time t, we explore (with expert chosen uniformly) rather than exploit. We call this new algorithm EXP3. We show that, assuming we use Hedge as our Bext Expert algorithm, we can achieve average regret of at most O\left(\sqrt{Tn\log{n}}\right).

(The slides have been posted. –Atri)

Posted in Lecture summary, presentation | Leave a comment

Lunch on Thursday

I’d like to take you guys out for lunch next Thursday,  April 28 at noon 12:30. Please let me know if you can make it. We’ll have lunch at the Indian place in the commons.

Posted in Announcements | 5 Comments

Lecture 15: Multi-armed Bandit problem

(Guest post by Bobby Dygert and Dan Padgett)

We introduced the multi-armed bandit problem, which is similar to the best-expert problem except the algorithm only gets to see the value c_t(x_t) rather than the entire cost function c_t at each step. The trade-off of exploration versus exploitation is central to the multi-armed bandit problem. We discussed a few example scenarios that contain this type of trade-off such as clinical drug trials, server selection, ad placement, selling goods, routing, and go along with how the multi-armed bandit problem can be fitted to these problems. We also discuss one multi-armed bandit algorithm, PSim, which uses phases to simulate the best-expert problem and prove a bound for its regret.

(The slides have been posted. –Atri)

Posted in Lecture summary, presentation | Leave a comment

Lecture 14: Zinkevich’s Algorithm

(Guest post by Caiming Xiong)

We first saw how to apply the Hannan-Kalai-Vempala Algorithm into the best expert problem. Then we discussed  Zinkevich’s algorithm which is a deterministic algorithm for solving the Many Expert Online Learning problem and obtained the bounds for the best expert situation and “concept drift” situation.

Finally we presented  two examples from the Kalai-Vempala paper: online shortest path and update tree problems that are good applications for the Hannan-Kalai-Vempala Algorithm. They demonstrate that online algorithms can perform almost as well as the best single decision when there are exponential many possible decisions.

(The slides have been uploaded. –Atri)

Posted in Lecture summary, presentation | Leave a comment

Lecture 13: Handling exponentially many experts

(Guest post by Swapnoneel Roy and Steve Uurtamo)

Coming Soon…

The slides have been uploaded.

Posted in Lecture summary, presentation | Leave a comment

Lecture 12: Blackwell’s Approachability Theorem

In yesterday’s lecture, we saw that Blackwell’s approachability theorem implies the existence of no external and internal regret learning algorithms. The slides have been uploaded. The material was from Week 4 notes from Bobby’s course.

The presentations start next week.

Posted in Lecture summary | Leave a comment