| Hey, For those of you who don't know my little tool Musoq, I wanted to introduce it as a small tool that allows you to query with SQL-like syntax without any database. It allows you to query various things from niche ones like CAN DBC files, weird ones like C# code, interesting ones with Git querying to regular stuff like CSV, TSV and various others. I am quite a bit experimenting with various things so I'm hybridizing the engine with LLMs or doing other weird stuff that are more or less practical :-) I wanted also to share some recent developments in this little project as I hope it might be interesting to some of you. New Experimental Plugins:
* Git Plugin (Beta): I've been working on Git repository querying - managed to test it on the EF Core repo (16k commits) and it seems to work okay
* Roslyn Plugin (Beta): Added basic C# code analysis capabilities For the very first time:
I've extended CROSS APPLY to use computed results as arguments! Now the operator can use values from the current row as inputs. Here's an example: SELECT
f.DirectoryName,
f.FileName
FROM #os.directories('/some/path', false) d
CROSS APPLY #os.files(d.FullName, true) f
WHERE d.Name IN ('Folder1', 'Folder2')
After another pack of fixes I'm finally able to query multiple git repositories AT ONCE! with ProjectsToAnalyze as (
select
dir2.FullName as FullName
from #os.directories('D:\repos', false) dir1
cross apply #os.directories(dir1.FullName, false) dir2
where
dir2.Name = '.git'
)
select
c.Message,
c.Author,
c.CommittedWhen
from ProjectsToAnalyze p cross apply #git.repository(p.FullName) r
cross apply r.Commits c
where c.AuthorEmail = 'my-email@email.ok'
order by c.CommittedWhen desc
Under the Hood:
- Added a Buckets feature for memory management (currently just testing it with the Roslyn plugin)- Moved to .NET 8 - Added CROSS/OUTER APPLY operators - Made some improvements to error messages and runtime behavior New piping features:
I've been experimenting with piping capabilities:
* Image Analysis with LLMs: ./Musoq.exe image encode "image.jpg" | ./Musoq.exe run query "select s.Shop, s.ProductName, s.Price from ..."
* Text Data Extraction: Get-Content "ticket.txt" | ./Musoq.exe run query "select t.TicketNumber, t.CustomerName ... from #stdin.text('Ollama', 'llama3.1') t"
* Data Source Combination: { docker image ls; ./Musoq.exe separator; docker container ls } | ./Musoq.exe run query "..."
I'm working on comprehensive documentation:
I encourage you especially to look at section "Practical Examples and Applications" and "Data Sources" where you can look at all the tables the tool currently provides. <https://puchaczov.github.io/Musoq/>Other Changes: - Made some improvements to OS and Archive data sources (OS can now query metadata like EXIF) - Added a few fields to CAN DBC plugin - Command outputs can now be used as inputs for queries I'm hoping to: - Improve stability and add more tests - Flesh out the documentation - Work on package distribution (Scoop, Ubuntu packages) - Share some examples of source code querying with Roslyn Ideas for later: - WHERE robust analysis and optimizations - DISTINCT operator implementation - PROTOBUF schema support - Performance improvements - Query parallelization - Recursive CTEs - Subqueries I'd really appreciate any thoughts or feedback! The documentation section where I write a short analysis of EF Core with git plugin: <https://puchaczov.github.io/Musoq/practical-examples-and-app...> |