Hacker News new | ask | show | jobs
by Ragib_Zaman 2420 days ago
Flip side to this - I remember studying a series of papers in statistics/optimisation/machine learning with highly non-trivial content published between 1998 and 2008 by the Della Pietra brothers. I had assumed they were mathematicians/statisticians at some major research university. At the bottom of one of these papers it had some personal blurbs which stated they had both been at RenTech since 1995 working on "statistical methods to model the stock market", which surprised me greatly given the amount of research they output. I assume the content of those papers were applied to their strategies, in which case there would also be a good amount of people who would be surprised if they knew how advanced their methods are. I say this especially in comparison to the "quant" strategies I know many other trading firms use/have used, which are actually often quite simple. Indeed, at some places they seem to think any systematic/automated strategy is "quant"...
1 comments

Do you remember what the titles of papers? I'd like to read them.
You can find a bunch of papers published by people at RenTec. Search MathSciNet for "Renaissance Technologies" as the corporate affiliation for the author.

Likewise, search Google Scholar for "@rentec.com", or "Renaissance Technologies."

As throwawaymath mentioned, it's not too hard to find papers written by them, but I'll highlight some papers that I found particularly interesting.

Many optimization problems that arise in machine learning can be viewed as minimizing a particular Bregman distance subject to affine constraints (I'll call these "Bregman distance optimization problems").

In [1], the authors develop some quite general and widely applicable convex analysis type results for Bregman distances and use those results to give the technique of "Auxiliary Functions" which can be used to derive and prove the convergence of algorithms for particular Bregman distance optimization problems.

In [2], the authors show that finding the optimal parameters of two quite different approaches of binary classification, AdaBoost and Logistic Regression, can both be simultaneously viewed as the same Bregman distance optimization problem (with slightly different initial parameters). They then present several algorithms for solving this unified problem (and thus, for optimizing both AdaBoost and Logistic Regression), and prove the convergence of their algorithms with the method of Auxiliary functions developed in [1]. This paper was the first general proof of convergence for AdaBoost which was proposed by Yoav and Schapire (one of the authors of [2]) and earned them the Gödel prize in 2003.

If you don't mind me plugging in some expository work of mine, I wrote an essay ([3]) giving an introduction to Bregman distances and is pitched at a lower level than [1] and [2]. The end of the first chapter discusses in particular the general Bregman distance optimization problem, setting it up in the same framework as [1] and [2], and the final chapter presents the algorithm and proof of convergence originally given in [2] but hopefully with a slightly simplified presentation due to focusing only on the Logistic Regression case of the algorithm.

In case you want to play around with it, I have implemented the algorithm from [2], as well as a related algorithm from [4] which incorporates L1 (Ivanov) regularization into that algorithm, both being available at [5]. The middle chapters contain brief discussions on the relation of Bregman distances with Exponential Families, and on Generalized Linear Models, which are relevant to the overall purpose of the essay but not so closely related to the content of [1] and [2] and may safely be skipped.

If you generally enjoy the types of problems discussed in [1] and [2], you might also enjoy some of these papers: [6] [7] [8] [9].

--------------------------------------

[1] - "Duality and Auxiliary Functions for Bregman Distances". Stephen Della Pietra, Vincent Della Pietra, and John Lafferty, 2001.

[2] - "Logistic Regression, AdaBoost and Bregman Distances". Robert Schapire, Michael Collins, Yoram Singer, 2001.

[3] - "Optimisation of Bregman Divergences". Ragib Zaman, 2018. https://github.com/RagibZaman/mathematical-optimisation/blob...

[4] - "Bregman distance to L1 regularized logistic regression". T. Huang and M. Gupta, International Conference on Pattern Recognition, 2008.

[5] - Notebook containing implementations of Bregman divergence based algorithms for Logistic Regression from [2] and [4]. https://github.com/RagibZaman/mathematical-optimisation/tree...

[6] - "Legendre functions and the method of random Bregman projections." H. Bauschke and J. Borwein, Journal of Convex Analysis, 1997.

[7] - "Inducing features of random fields". Stephen Della Pietra, Vincent Della Pietra, and John Lafferty, 1997.

[8] - "Statistical learning algorithms based on Bregman distances". Stephen Della Pietra, Vincent Della Pietra, and John Lafferty, 1997.

[9] - " A maximum entropy approach to natural language processing". Adam L. Berger, Stephen Della Pietra and Vincent Della Pietra. Computational Linguistics, 1996.