by throwawayGT 3446 days ago
I took Isbell's class as well, and perhaps here we can share our respective experiences.

In the year I did it, the class was structured as follows:

At the beginning of the semester, you'd pick two datasets.

Every two weeks, you'd apply two or so algorithms that were being covered at the time (maybe k-means and SVD, or a NN and SVM) to your chosen data sets. There would be a set of variations that you were supposed to apply to each algorithm. Typically you'd normalize or clean the data in some way. Perhaps you'd filter outliers, etc...

The result would be a set of experiments to run (2 datasets) x (2 algorithms) x (2^3 variations per algorithm). You would compile the results into a (10 page max) paper, with analysis about how the dimensions differed.
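To make the size of that grid concrete, here is a minimal Python sketch that enumerates it. The dataset, algorithm, and variation names are placeholders invented for illustration, not the ones used in the class, and treating the 2^3 variations as three on/off switches is an assumption.

```python
from itertools import product

# Hypothetical names: the real datasets, algorithms, and variations
# differed by student and by assignment.
DATASETS = ["dataset_a", "dataset_b"]
ALGORITHMS = ["kmeans", "svd"]
VARIATIONS = ["normalize", "filter_outliers", "drop_missing"]  # 3 on/off switches


def experiment_grid(datasets, algorithms, variations):
    """Return one config dict per (dataset, algorithm, on/off combination)."""
    # Every way of turning the switches on or off: 2 ** len(variations) tuples.
    switch_settings = list(product([False, True], repeat=len(variations)))
    grid = []
    for dataset, algorithm, switches in product(datasets, algorithms, switch_settings):
        grid.append({
            "dataset": dataset,
            "algorithm": algorithm,
            "variations": dict(zip(variations, switches)),
        })
    return grid


grid = experiment_grid(DATASETS, ALGORITHMS, VARIATIONS)
print(len(grid))  # 2 * 2 * 2**3 = 32 runs per assignment
```

Even at this small scale that is 32 runs to execute, record, and compare, which is where a results table (sqlite in my case) starts to pay for itself.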

It was up to the student to figure out how to actually implement this pipeline (I used sqlite + numpy/scipy/scikit-learn; many used Matlab).

On paper, this sounds like a great class - what a wonderful way to learn about how different approaches relate to each other, and how crucial the process of preparing data is to the effectiveness of the algorithm. In practice, however, this did not happen for most students I knew.

These students spent most of their time finding implementations of the algorithms and hacking at them to actually run all the experiments. They then rushed through gluing the results together through some semblance of analysis. Alumni of the class I knew said the same thing about their experience.

This analysis was read by TAs. There were, I think, 3 of them for about 100 students. We wouldn't get the papers back for weeks (long after we had moved on to new material). When we got our papers back there was very little feedback on the content - mostly it was noted that we had submitted the work on time and had successfully performed all the experiments required.

I agree that Isbell is a joy to listen to - he is charismatic, entertaining, and I too enjoyed his anecdotes. However, I felt like you would only get something out of his lectures if you already knew what you were talking about.

When I think about the quality of the class, I think about how responsive the class is to the individual needs and progress of the student.

If you say that it's up to the student what they get out of the class, and your bar for a good class is that the content is arranged in a nice manner, then here you go https://pe.gatech.edu/sites/pe.gatech.edu/files/agendas/CS-4... ... any self-directed student can grab Mitchell, and do the weekly assignments I describe above - all for free and in the comfort of their own home.


I agree that the latency and detail of feedback are an enormous problem with this sort of partially-guided coursework. However, it's a general problem with higher education, not one specific to GT: when implemented effectively, this kind of coursework is one of the most valuable educational experiences, but it is difficult to scale because it demands time-consuming supervision.

This is especially true of term project courses, where the final portion of the project to which you devote the most time and creativity is also the part for which you're likely to receive the least feedback.

>However, I felt like you would only get something out of his lectures if you already knew what you were talking about.

I disagree (having taken the course as an undergraduate, with it being my first major exposure to machine learning). Certainly if all you do is attend the lectures, you're going to miss some background knowledge, but that is true of most (if not all) university courses. You're supposed to devote 2-3 hours of outside work for each hour of lecture, meaning 6-9 hours of studying per week outside of those lectures.

Some of this is doing the projects, although some of it is personal investigation.

There are failings of his course (one of the biggest at this point is that it doesn't do any work with the state of the art now), but I think that the fact that his course caters toward people who are self-driven is not a failing.

The best way to see what the goal of the course is is to look at his exams. If they weren't different when you took them, they were intentionally too difficult for the allotted time, leading to low averages and incomplete work by the majority of students.

However, the course allows motivated students to make connections between concepts, with the help of the professor and the coursework. Having someone "leading you" down the right path is very helpful, much more so than a textbook alone.

I really do think that there is one exam question that sums up Isbell's course perfectly: it's the one where you are asked to compare and contrast 4-5 aspects of 4 randomized optimization algorithms (RHC, GA, SA, and MIMIC) and explain situations where you'd use each and why.

The course's goal is to lead to a strong intuition for the algorithms covered (sadly at the partial expense of a theoretical understanding). Not everyone puts in the work to develop that understanding, but that's not a failure of the course, necessarily.

I do agree that having materials that provide an approach to a topic is very useful, but as I mention elsewhere such materials are available for free online.

You can find the syllabus for Isbell's class and follow along. You can do the readings and programming investigations. If you like lectures, you can find many full courses on YouTube (I found Caltech's lectures https://www.youtube.com/watch?v=eHsErlPJWUU to be the best at presenting SVMs out there, although this was probably my third attempt at understanding them, so maybe the other resources rubbed off. They also skim over the quadratic programming detail, but I get that this may be beyond the detail that many people desire in an intro class).

If you have to teach the material to yourself, how is your experience improved by being in the class?

>You can find the syllabus for Isbell's class and follow along

To be fair, most of Isbell's course (lectures) is also available on Udacity.

>If you have to teach the material to yourself, how is your experience improved by being in the class?

There are a couple of advantages. One of the most obvious is the lower latency of responses when you have confusion or misunderstanding. In a lecture, you can ask a question and get an answer almost immediately. This is most useful (imo) with algorithms and mathematical concepts, because you can ask about, and lecturers are often quick to provide insight into, the interrelationships between algorithms (both in machine learning and in a more theoretical sense like computability). There are topics that come up a lot, and getting instant feedback on those connections means you spend less time misunderstanding them.

That alone is a fairly weak justification; I think the stronger one is feedback in general. Watching lectures only gets you so far. With implementation of algorithms, often your feedback is testable correctness (although my experience in DS&A suggests that most people are capable of constructing incredibly incorrect models for things that perform well on some input, and even on decent autograders), but with things like machine learning algs and intuition about those algorithms, you can't get that. So the feedback that yes, your understanding is correct (even if that feedback is slow) is invaluable. In that regard I think online courses and MOOCs can be good, but MOOCs that don't provide feedback aren't as valuable. I've attended a lot of lectures, and I've ignored a lot of lectures. Listening to someone say something does not mean one has learned it.

I'd also note that, if I recall, the way that Isbell approaches teaching the material and the way the textbook does are very different. Textbooks are (often) references. They provide information on what something is and how it works theoretically, but very often lecturers are able to provide the kinds of things that aren't (and shouldn't be?) in textbooks.

If I'm reading a textbook, it's very likely that I want to know how to implement an algorithm, so I care that the algorithm for simulated annealing says that you jump to a worse point when e^(D/T) > Rand[0,1] (D being the change in fitness and T the temperature). Whereas in a lecture, I'm likely much more interested in the idea that simulated annealing is conceptually very similar to throwing a ping-pong ball into a large, complex, convex plastic surface and seeing where it lands.
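For anyone who wants the textbook half spelled out, here is a rough Python sketch of that acceptance rule. It assumes we are maximizing fitness, so D is negative for a worse neighbor, and that better neighbors are always accepted; the function and parameter names are mine, not from the course or from any library.

```python
import math
import random


def accept_move(current_fitness, neighbor_fitness, temperature, rng=random):
    """Decide whether simulated annealing (maximizing fitness) takes the move."""
    # The "D" from the rule above: negative when the neighbor is worse.
    delta = neighbor_fitness - current_fitness
    if delta >= 0:
        return True  # assumption: never reject an equal or better neighbor
    # Worse neighbor: accept with probability e^(D/T), with temperature > 0.
    return math.exp(delta / temperature) > rng.random()


def acceptance_rate(delta, temperature, trials=1000, seed=0):
    """Fraction of worse moves (fitness change = delta) accepted at a temperature."""
    rng = random.Random(seed)
    accepted = sum(accept_move(0.0, delta, temperature, rng) for _ in range(trials))
    return accepted / trials


# A hot system takes almost any downhill step; a cold one almost never does.
print(acceptance_rate(-1.0, temperature=100.0))
print(acceptance_rate(-1.0, temperature=0.1))
```

That is the ping-pong ball in code: at high T it bounces out of nearly any dip, and as T drops it settles wherever it happens to be.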

My criticism is precisely that feedback was lacking. The assignments were only graded on submission - there was no feedback there (likely because every student worked with different data so going in-depth would have required the grad student TAs to spend too much time per student digging in).

I don't agree that feedback during lecture is valuable or low-latency as you say - not with 100 students attending. It might work to ask a clarifying question here and there, but again - you're only in a position to take advantage of that if you're already comfortable with the material and are generally keeping up.

Books are different than lectures, sure, but I don't think there's much difference between attending a lecture with 100 students and watching one online. Indeed many people claim the online way is better, since you can rewind and skip around, pause and look up references, etc...

When I took it we were encouraged to use Weka for the algorithm implementations themselves. This certainly allowed me (and I'd never so much as touched machine learning prior to taking the class -- I took it on a bit of a lark that wasn't related to my research at all) to focus on understanding the behavior of the algorithms rather than worrying about hacking them together.

I'd agree that Tech has too few TAs for too many students, generally, for its graduate courses, but I don't know that other schools do a better job. A brief survey of the folks around my desk elicited howls of laughter at the notion of useful or accessible TAs in grad school.

> I agree that Isbell is a joy to listen to - he is charismatic, entertaining, and I too enjoyed his anecdotes. However, I felt like you would only get something out of his lectures if you already knew what you were talking about.

I think this assertion is, at best, too strong. A better assertion might be that his lectures depended on coming in with sufficient background.

As I said, I came into the course with no experience with machine learning at all. On the other hand, I did have a fairly strong theoretical computer science, stats, and linear algebra background. I will admit that may have made me blind to things he was simply assuming with respect to educational background that were not actually safe to assume. That said, I still refer back to his primer on information theory (http://www.cc.gatech.edu/~isbell/tutorials/InfoTheory.fm.pdf) when discussing work relying on it, so he certainly made some effort to fill in gaps as he discovered they were common.

> When I think about the quality of the class, I think about how responsive the class is to the individual needs and progress of the student.

For a graduate-level course I feel a class clears this bar when it accurately and thoroughly documents the prerequisites. Now, I'm not saying Charles's class necessarily does this. As I said, I came in with a pretty strong background that turned out to be more than sufficient, and with that background I personally felt his lectures were quite tractable, even assuming complete ignorance of ML itself.

> These students spent most of their time finding implementations of the algorithms and hacking at them to actually run all the experiments. They then rushed through gluing the results together through some semblance of analysis. Alumni of the class I knew said the same thing about their experience.

Ironically, this sounds quite a lot like much of industry.

Or unsurprisingly...
yeah, true.