Our security team uses this, we’re about 1500 employees. However, I believe they opted to use a fork over this linked version, citing (IIRC) that Facebook’s roadmap for this project was a little bit too unpredictable, and that they weren’t responsive to PRs and community requests. I think we went with https://github.com/osql/osql.
It’s installed on every laptop (Chromebooks and MacBooks), and I believe on every EC2 instance, and they have nothing but good things to say about it. We did have to come up with an aggregator solution for storing the results of the queries, but I’m under the impression that it wasn’t too big of a lift.
Facebook has since transferred the project to the Linux Foundation, and the group behind osql is largely the same group of maintainers on the current osquery.
We have been a big part of the Osquery community for a while and we think it's an awesome project that can be used to get an insane level of visibility across your fleet. We also think in the context of rolling this out to end-users, companies need to really consider the privacy implications of the data Osquery is capable of collecting.
To that end, we soft-launched a product in May that helps fast-growing tech companies use tools like Osquery to implement something called User Focused Security. User Focused Security involves treating employees like adults and understanding the context in which they work before rolling out a security strategy.
We want to be the best choice for organizations that want to get serious about the security of their laptops but don't want to lock down devices, violate their users' privacy, or hurt their internal culture with opaque surveillance.
The three values that we use to build our software:
1. User Education over Enforcement
2. Trust through Transparency
3. Quality conclusions over Quantity of data
We use Osquery because it helps us fulfill that second value by giving end-users visibility into what is running on their device.
If your team uses Slack and wants to see our approach, you should check us out at https://kolide.com
BTW, if you are interested in learning more about User Focused Security and how it might scale to really large companies, I definitely recommend reading a recent interview we did with Jesse Kriss at Netflix https://blog.kolide.com/ufs-spotlight-jesse-kriss-of-netflix...
AFAIK, one of the main motivations for starting the osquery project was precisely to have a cross-platform tool for collecting much-needed information from all your hosts in an enterprise setting.
I love the idea of providing various functionality under a sql interface. Sure sometimes it doesn't fit, but overall it's one of the better lowest common denominators I've met so far.
I've used osquery a few times on my personal laptop (this post reminds me to try to get the company I work in to adopt it!) and for me it was one of the bigger inspirations for creating OctoSQL[1] as a means for such tools to interoperate.
A filesystem isn't so different from a database in the first place. Not really relational, but still.
The olde PalmOS had databases as primary storage. Though databases seem to have had capacity for blobs, since apps themselves were stored that way, aside from text files, images and whatnot.
In fact, afaik some mainframe OSes were built around databases.
>WinFS includes a relational database for storage of information, and allows any type of information to be stored in it, provided there is a well defined schema for the type. Individual data items could then be related together by relationships, which are either inferred by the system based on certain attributes or explicitly stated by the user. As the data has a well defined schema, any application can reuse the data; and using the relationships, related data can be effectively organized as well as retrieved. Because the system knows the structure and intent of the information, it can be used to make complex queries that enable advanced searching through the data and aggregating various data items by exploiting the relationships between them.
From what I heard, it was slow, and devs just were not that interested in some clean schema based interface because it complicated their ability to ship; interesting that the modern approach seems to also favor schema on write.
I still remember the initial announcement of this years ago... I wasn't able to use it back then but saved it for later.
I'm currently in a situation in which I'd love to use osquery, which is why I tried it out a few months ago.
Sadly, there wasn't any inbuilt multi-node/cluster functionality to speak of.
I gave up on it, as its utility is pretty low if you're constrained to localhost queries... And the third-party "cluster" tools looked pretty barebones and seemed a hassle to set up. And not even really useful, as they just enable you to execute queries on several nodes.
I would want to run queries across servers (i.e. select load, uptime, hostname where servertype = "worker" and kernelversion = "3.4").
There was very little value in it for me that I couldn't already get with an ad hoc Ansible task on my servers.
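A rough sketch of what such a cross-server query could look like once per-host results are collected in one place. This does not use osquery itself: it loads made-up sample rows into an in-memory SQLite table, and the table name `hosts`, the column names, and the host data are all hypothetical.

```python
import sqlite3

# Made-up sample rows standing in for what each host might report.
# Table and column names are hypothetical, chosen to mirror the query above.
HOST_REPORTS = [
    {"hostname": "web-1", "servertype": "web", "kernelversion": "3.4",
     "load_avg": 0.7, "uptime": 86400},
    {"hostname": "worker-1", "servertype": "worker", "kernelversion": "3.4",
     "load_avg": 2.1, "uptime": 3600},
    {"hostname": "worker-2", "servertype": "worker", "kernelversion": "4.19",
     "load_avg": 0.3, "uptime": 7200},
]


def build_fleet_db(reports):
    """Load per-host rows into a single in-memory SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE hosts (hostname TEXT, servertype TEXT, "
        "kernelversion TEXT, load_avg REAL, uptime INTEGER)"
    )
    db.executemany(
        "INSERT INTO hosts VALUES "
        "(:hostname, :servertype, :kernelversion, :load_avg, :uptime)",
        reports,
    )
    return db


def workers_on_kernel(db, kernel):
    """The cross-server query from the comment, written as valid SQL."""
    return db.execute(
        "SELECT load_avg, uptime, hostname FROM hosts "
        "WHERE servertype = 'worker' AND kernelversion = ?",
        (kernel,),
    ).fetchall()


if __name__ == "__main__":
    fleet = build_fleet_db(HOST_REPORTS)
    print(workers_on_kernel(fleet, "3.4"))
```

The point of the single table is that the filter runs over all hosts at once, rather than running the same local query on each node and merging the output by hand.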
I had a bit of the same problem, and while I think third-party software like Fleet should work (I never really had enough time to try it out at work), I wrote a small Ansible module to integrate osqueryi.
OSQuery is pretty powerful and the SQL-like query makes it easier to correlate various system metrics in 1 step.
However compared to a central metric system that can aggregate metrics across all the hosts, its use quickly becomes less important.
Also, there are some CPU considerations, as OSQuery is not as lightweight as other metrics-gathering tools. Several times I've run into OSQuery interfering with the actual application, competing for resources. So if you do run it, make sure to renice it to mitigate this, especially if you're running time-sensitive apps like video/audio.
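For what it's worth, here is a minimal sketch of starting a process at a lower priority from Python on a POSIX system. The helper name `run_niced` and the increment of 10 are my own choices, and the child command is a stand-in that only prints its niceness; for a daemon that is already running, the `renice` command is the usual route.

```python
import os
import subprocess
import sys


def run_niced(cmd, increment=10):
    """Run cmd with its niceness raised by `increment` (POSIX only)."""
    return subprocess.run(
        cmd,
        # Runs in the child process just before exec, so only the child is affected.
        preexec_fn=lambda: os.nice(increment),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Stand-in child that prints its own niceness; swap in the real command.
    child = [sys.executable, "-c", "import os; print(os.nice(0))"]
    print(run_niced(child).stdout.strip())
```

A higher niceness only lowers CPU scheduling priority; it does not cap memory or disk I/O, so it may not cover every kind of contention.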
I'd like something that integrates canned DTrace/eBPF scripts, along with authorization (some canned scripts might leak sensitive data). Really, a bit of an idempotent, extensible, remote OS observability protocol.
The Linux kernel already has a pretty good API available via file nodes. And there are other lightweight tools to gather and parse information. Not sure I understand the benefits of exposing it through SQL, but I know some people are obsessed with SQL.
My understanding is there are security and standardization benefits as well. If I’m remembering correctly, there was a local keychain credential-stealing attack around the time I was first looking at it, and they had a plug-in available for detection the same day. While it wasn’t something magical that you couldn’t write, test, and run on your fleet yourself, a central place to deduplicate that sort of work/test cycle and collaborate was compelling.
The real power of SQL is joins. But not only might you want to simply query, think about grouping and group functions. For example: say you wanted to know the highest rate of gif file creation per second between two dates, for a certain user. That’s what you could do with a simple sql query.
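As an illustration of the grouping example, here is a sketch using SQLite from Python. The table `file_created`, its columns, and the sample events are made up for the example and are not osquery's schema; timestamps are assumed to be whole Unix seconds.

```python
import sqlite3

# Made-up sample events: (username, path, created_at as a Unix second).
EVENTS = [
    ("alice", "/home/alice/a.gif", 1000),
    ("alice", "/home/alice/b.gif", 1000),
    ("alice", "/home/alice/c.gif", 1001),
    ("alice", "/home/alice/notes.txt", 1000),
    ("bob", "/home/bob/d.gif", 1000),
]

PEAK_RATE_SQL = """
SELECT MAX(per_second) FROM (
    SELECT COUNT(*) AS per_second
    FROM file_created
    WHERE username = ?
      AND path LIKE '%.gif'
      AND created_at BETWEEN ? AND ?
    GROUP BY created_at  -- one bucket per second
)
"""


def build_db(events):
    """Create the hypothetical table and load the sample events."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE file_created (username TEXT, path TEXT, created_at INTEGER)"
    )
    db.executemany("INSERT INTO file_created VALUES (?, ?, ?)", events)
    return db


def peak_gif_rate(db, username, start, end):
    """Highest number of .gif files created in any one second, or None if none."""
    return db.execute(PEAK_RATE_SQL, (username, start, end)).fetchone()[0]


if __name__ == "__main__":
    print(peak_gif_rate(build_db(EVENTS), "alice", 0, 2000))
```

The inner query counts files per second and the outer query takes the maximum of those counts, which is the "group function over groups" idea in one statement.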
Personally, I'll be glad if this serves me as a tool to get info from `ps`, `netstat` and similar things without looking through man pages every time I'm doing something other than the handful of routine invocations.
Thought of writing such a util myself, actually, though not with SQL.
My use case was something like: for all users of the OS, give me all SSH fingerprints in .ssh/authorized_keys and stuff like that. Mostly for security and compliance.
My other use case is netstat working differently on Linux and macOS, so I alias an osqueryi command on macOS to show me which process opens which port.
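A sketch of how both use cases might be scripted around `osqueryi --json`. The table and column names are meant to match osquery's documented schema (`users`, `authorized_keys`, `listening_ports`, `processes`), but verify them with `.schema` in osqueryi for your version. The helper names are my own, and the fingerprint helper assumes the `key` column holds the base64 public key blob.

```python
import base64
import hashlib
import json
import subprocess

# Queries believed to match osquery's schema; verify column names locally.
QUERIES = {
    "authorized_keys": (
        "SELECT u.username, ak.algorithm, ak.key, ak.key_file "
        "FROM users u JOIN authorized_keys ak USING (uid);"
    ),
    "listening_ports": (
        "SELECT DISTINCT p.name, p.pid, lp.port, lp.address "
        "FROM listening_ports lp JOIN processes p USING (pid) "
        "WHERE lp.port != 0;"
    ),
}


def run_osquery(sql, binary="osqueryi"):
    """Run one query through osqueryi and return the rows as a list of dicts."""
    proc = subprocess.run(
        [binary, "--json", sql], capture_output=True, text=True, check=True
    )
    return json.loads(proc.stdout)


def ssh_fingerprint(key_b64):
    """OpenSSH-style SHA256 fingerprint of a base64-encoded public key blob."""
    digest = hashlib.sha256(base64.b64decode(key_b64)).digest()
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")
```

With osquery installed, something like `for row in run_osquery(QUERIES["listening_ports"]): print(row)` would list processes with open ports, and `ssh_fingerprint(row["key"])` could be applied to each row of the `authorized_keys` query.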
You don't like SQL? The language syntax isn't great but the relational model behind it is a thing of beauty once you get it IMHO.
It sounds like a really interesting idea to me.
I was disappointed that Microsoft's attempt at a relational-database filesystem (WinFS, was it?) failed. Not that I use Windows, but it also seemed to kill the open source attempts at doing something similar.