|
|
|
|
|
by tensor
5329 days ago
|
|
Libraries like Weka and Mahout are no more toys than any other library that implements standard and widely applicable algorithms. Yes, you need to do a lot of extra work to properly model your problems, choose features, and combine different algorithms into a final product. But it's not often that you need to tweak the core algorithms that these libraries provide. If you really understand enough to implement new classifiers or other types of learning algorithms, these libraries are still useful to you. For one, they provide a solid framework for allowing your new algorithm to easily interact with other algorithms. Two, it's not unlikely that your new algorithm is a variation on an existing one. Don't re-implement it. These libraries are open, so copy the source and modify it. And three, mahout uses hadoop. Distributed processing systems are another topic altogether. If you are proposing to write your own, I would hope that you have good reasons for spending the time. Hadoop is certainly no toy. In summary, don't waste time reimplementing core algorithms unless you are doing it for a learning exercise. But do still take a good course on machine learning, because using the provided algorithms in these packages and others correctly is highly non-trivial. |
|