|
I used to work at ITA. I did not work on the QPX product (which is the bit you're referring to), so take everything I say with a grain of salt. I can't talk about ITA-specific info, but I can talk in general terms about knowledge that is public in the airline industry. Generally speaking, if you wanted to do what ITA does, you'd have to get three types of data: (1) schedules, (2) fares and fare rules, and (3) availability data. Getting (1) and (2) is relatively simple since there are special clearinghouses where airlines publish their schedule and fare data. ATPCO does fares, and I've forgotten the name of the organization that handles SSIMs for schedules, but in either case you pay for a subscription and they make the data available to you. Simple. Except that the protocols used to transmit the data are baroque and painful, the data itself has all kinds of quality issues, and airlines are really, really dumb when it comes to thinking about the semantics of what they're trying to convey. Specifically, airline standards groups typically go nuts specifying syntax while completely ignoring semantics. The result is that lots of carriers are "standards compliant" but functionally unable to communicate. There are a lot of N-squared implementations in the industry (i.e., every carrier that needs to talk to other carriers writes custom code for each one of them). The nice thing about (1) and (2) is that we're talking about relatively static data that is updated infrequently; many systems process fare data on a daily basis, for example. You may wonder: how can this be, since airline prices vary rapidly even within the same day? The answer boils down to availability data (3), which changes extremely rapidly. A rule of thumb is that you might see one availability change per flight segment per second. Getting access to (3) is a good deal harder than (1) or (2).
Typically, it involves talking directly with carriers or with the more sophisticated carrier alliances that are smart enough to pool infrastructure for their members. Alternatively, one can buy access from various global distribution systems (Sabre, Galileo, Amadeus, etc.), but this can be a very expensive proposition. Plus, if you're buying availability data from a GDS, odds are good that you're competing with them, which doesn't give them much incentive to play nice. There are all sorts of problems you need to solve to do what ITA does, but for a host of reasons I won't go into, availability is the hardest. As for when they were just starting: I wasn't there, but based on later conversations, I can say that Carl and Jeremy (ITA founders) were just incredibly gutsy. And shockingly smart. And really lucky. They threw something together that was complete crap but impressed some industry people just enough that those people offered them a static data file. They had no idea how to read the damn thing, so they made some educated guesses and demoed a search engine built on it to the industry folks, who compared the demo results to their own internal system. Obviously, they got lots of stuff wrong, but they got enough right to really impress some of the staff, who then started giving them data dictionaries, etc. The rest is history. |
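To put rough numbers on the N-squared remark about carrier-to-carrier integrations, here is a minimal sketch. It assumes every pair of carriers needs its own custom link, counted once per pair, and compares that with a hypothetical shared hub that each carrier connects to once. The hub comparison and the function names are illustrative assumptions, not something described in the answer.

```python
def pairwise_links(carriers: int) -> int:
    """Custom integrations if every pair of carriers builds its own link."""
    return carriers * (carriers - 1) // 2


def hub_links(carriers: int) -> int:
    """Integrations if each carrier connects once to a shared hub (assumed)."""
    return carriers


# Carrier counts below are arbitrary illustrative values.
for n in (5, 20, 100):
    print(f"{n} carriers: {pairwise_links(n)} pairwise links, {hub_links(n)} hub links")
```

With 100 carriers that is 4950 pairwise links against 100 hub connections, and if each side of a pair writes its own code the pairwise effort doubles.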