Show HN: VirtualStorageLibrary – .NET Tree solutions for items, dirs, symlinks (shimodateakira.github.io)
63 points by shimodateakira 672 days ago
repository: https://github.com/shimodateakira/VirtualStorageLibrary

Hello HN, I'm excited to share a project I've been working on called VirtualStorageLibrary. It's a .NET library designed to manage in-memory tree structures, including items, directories, and symbolic links.

VirtualStorageLibrary operates entirely in memory and provides a tree-structured collection. It offers a foundation for managing hierarchical data, supporting items, directories, and symbolic links built around a user-defined generic type <T>. This library is not a file system. Instead, it was designed from scratch to provide a more flexible and user-friendly tree structure. The library aims to make it intuitive to reference, traverse, and manipulate nodes by specifying paths.

The standard .NET collections, such as hash sets, arrays, lists, and dictionaries, are essentially linear structures. In contrast, common file systems can be viewed as tree-shaped collections, where elements are managed as nodes in a hierarchy. While there are existing libraries that support tree-shaped collections, I couldn't find one that models a file system-like structure. Therefore, I worked out a logical interpretation of a file system and asked, "Can we implement a tree collection purely as objects?" The goal was to create a system that can flexibly manage hierarchical data and allow intuitive access.

Key Features:

Flexible Tree Structure: Hierarchical structure based on parent-child relationships, enabling flexible node management.

Various Nodes: Items, directories, and symbolic links, including user-defined types <T>.

Intuitive Node Operations via Paths: API for referencing, searching, adding, deleting, renaming, copying, and moving nodes using paths.

Link Management: Symbolic links managed with a link dictionary, tracking changes in target paths.

Circular Reference Prevention: Detects and prevents circular references in paths involving symbolic links by throwing an exception.

Flexible Node List Retrieval: Retrieves node lists within directories, filtered, grouped, and sorted by specified node types and attributes.
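To make the link management and circular reference prevention features above more concrete, here is a minimal C# sketch of path resolution that follows symbolic links and throws when a link chain loops back on itself. It is not VirtualStorageLibrary's implementation or API: `MiniTree`, `Dir`, `Item<T>`, `Link`, and `Resolve` are hypothetical names made up for illustration, and the sketch assumes absolute paths with `/` as the separator.

```csharp
using System;
using System.Collections.Generic;

// Build: /dir1/item1 = 123, /link1 -> /dir1/item1, /a -> /b, /b -> /a
var root = new Dir();
var dir1 = new Dir();
root.Children["dir1"] = dir1;
dir1.Children["item1"] = new Item<int>(123);
root.Children["link1"] = new Link("/dir1/item1");
root.Children["a"] = new Link("/b");
root.Children["b"] = new Link("/a");

var viaLink = (Item<int>)MiniTree.Resolve(root, "/link1");
Console.WriteLine(viaLink.Value);            // 123

try
{
    MiniTree.Resolve(root, "/a");
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message);           // Circular reference at /a
}

// Hypothetical node types, for illustration only.
abstract class Node { }

sealed class Dir : Node
{
    public Dictionary<string, Node> Children { get; } = new();
}

sealed class Item<T> : Node
{
    public T Value { get; }
    public Item(T value) => Value = value;
}

sealed class Link : Node
{
    public string TargetPath { get; }
    public Link(string targetPath) => TargetPath = targetPath;
}

static class MiniTree
{
    // Resolves an absolute path, following links; throws on a link cycle.
    public static Node Resolve(Dir root, string path) =>
        Resolve(root, path, new HashSet<string>());

    private static Node Resolve(Dir root, string path, HashSet<string> inProgress)
    {
        Node current = root;
        var walked = "";
        foreach (var name in path.Split('/', StringSplitOptions.RemoveEmptyEntries))
        {
            if (current is not Dir dir || !dir.Children.TryGetValue(name, out var child))
                throw new KeyNotFoundException($"Path not found: {path}");
            walked += "/" + name;
            if (child is Link link)
            {
                // A link that is already being resolved further up the chain means a cycle.
                if (!inProgress.Add(walked))
                    throw new InvalidOperationException($"Circular reference at {walked}");
                child = Resolve(root, link.TargetPath, inProgress);
                inProgress.Remove(walked);
            }
            current = child;
        }
        return current;
    }
}
```

The set only tracks links that are currently being resolved, so a path that legitimately passes through the same link twice is not reported as circular.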

Anticipated Use Cases:

- Natural Language Processing (NLP)
- Knowledge Base Systems
- Game Development
- Hierarchical Clustering
- Education and Learning

Development Status and Future Plans:

The current version is 0.8.0 (prerelease). As of 2024/08, all essential features for version 1.0.0 have been implemented. However, some bug fixes, around 30 feature improvements, and refactoring tasks remain. With version 0.8.0, we aim to gather user feedback, including bug reports and feature enhancement suggestions. Simultaneously, we plan to work through the remaining tasks for version 0.9.0, targeted for release in October 2024. During this period, class names, method names, property names, and other elements in the library may change, merge, or be deprecated without notice. Detailed information will be provided in the release notes, so please check them for updates.

For more detailed information on the current issues and planned improvements, please visit the following page (content in Japanese):

https://github.com/users/shimodateakira/projects/3/views/3

Thank you for your interest, and I look forward to your feedback.

- Akira Shimodate, AkiraNetwork

8 comments

Version 0.9.1 - Prerelease

This is the prerelease of the project, focusing on bug fixes, enhancements, and new features.

New Features:

Adapter for Items and Symbolic Links in Indexer (#189)

We've added new adapters for items and symbolic links to improve the indexing capabilities of the VirtualStorage class. This allows for more intuitive access and manipulation of items and symbolic links within the storage.

Added Classes:

  VirtualItemAdapter<T>
  VirtualSymbolicLinkAdapter<T>

These adapters provide a streamlined way to interact with items and symbolic links in the storage, enhancing usability and functionality.

Example Usage:

Below is an example demonstrating the new adapters in action:

  VirtualStorage<int> vs = new();
  vs["/"] += new VirtualDirectory("dir1");
  vs["/dir1"] += new VirtualItem<int>("item1", 123);
  vs["/"] += new VirtualSymbolicLink("link1", "/dir1/item1");
  Console.WriteLine($"item1 ItemData = {vs.Item["/dir1/item1"].ItemData}");
  Console.WriteLine($"link1 TargetPath = {vs.Link["/link1"].TargetPath}");
  Console.WriteLine($"link1 ItemData = {vs.Item["/link1"].ItemData}");

Output:

  item1 ItemData = 123
  link1 TargetPath = /dir1/item1
  link1 ItemData = 123

Hello HN, I'm excited to share the latest update of my project, VirtualStorageLibrary. This .NET library is designed to manage in-memory tree structures, including items, directories, and symbolic links. In this 0.9.0 release, we've focused on important bug fixes and enhancements to improve the library's robustness and usability.

What's Improved in Version 0.9.0:

- #56: Implemented validity checks for target node names during link creation.
- #69: Renamed the `GetNodesWithPath` method to `GetPaths`.
- #86: Updated `AddLinkToDictionary` to pass absolute paths for target paths.
- #144: Fixed an issue in the `RemoveNode` method that allowed deletion of the current path; now throws an exception instead.
- #145: Added validation checks to prevent incorrectly specified regular expressions.
- #146: Introduced a new exception for when a path is not found.
- #147: Added a mechanism to dynamically switch wildcard matchers, improving pattern matching flexibility.
- #148: Improved the organization of `DebuggerStepThrough` attributes across the codebase for a better debugging experience.
- #158: Fixed a bug during initialization in the `VirtualPath` class.
- #184: Resolved an exception that occurred when copying items under a symbolic link in the `CopyNode` method.

These updates aim to provide a more robust and flexible solution for managing hierarchical data structures in .NET environments.

For a detailed list of all changes, please see the release notes: https://shimodateakira.github.io/VirtualStorageLibrary/index...

Thank you for your continued interest and feedback. I'm looking forward to hearing your thoughts on this release!

- Akira Shimodate, AkiraNetwork

Release 0.9.0.3 - Prerelease

This is the prerelease of the project, focusing on bug fixes and enhancements.

Issue: #188 When updating symbolic links, the link dictionary was not being updated.

I like this idea.

My first remark is about VirtualStorage<T>. As I understand it, T is the type the library user can associate with the items (nodes, files) of the tree. I think this library could be made more useful with VirtualStorage<T, U>, where U is a type you can associate with the directories of the tree.

VirtualStorage<T> is then equivalent to VirtualStorage<T, some-dummy-unit-type>. (If I'm not mistaken, F# allows unit where a type parameter is expected, but C# does not allow void.) The use case I have in mind has been mentioned in other comments as well: keeping a representation of a file system in memory. For such use cases, you could then use VirtualStorage<System.IO.FileInfo, System.IO.DirectoryInfo>.

But also in other use cases, associating information at the directory level might be very useful.

Going even further, maybe a third type might be associated with links as well?
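The two-type-parameter idea in this comment can be sketched in plain C#. This is only an illustration of the proposal, not the library's current or planned API: `DirNode<TItem, TDir>`, `DirNode<TItem>`, and `Unit` are hypothetical names, and the sketch models a single directory node rather than a full storage class. `FileInfo` and `DirectoryInfo` are constructed from path strings only; nothing is read from disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// File-system-like use: FileInfo on items, DirectoryInfo on directories.
var root = new DirNode<FileInfo, DirectoryInfo>(new DirectoryInfo("/data"));
root.Items["readme.txt"] = new FileInfo("/data/readme.txt");
Console.WriteLine(root.Data.Name);                 // data
Console.WriteLine(root.Items["readme.txt"].Name);  // readme.txt

// Single-parameter use: directories carry no payload.
var plain = new DirNode<int>();
plain.Items["item1"] = 123;
Console.WriteLine(plain.Items["item1"]);           // 123

// Stand-in for a "unit" type, since C# does not accept void as a type argument.
readonly struct Unit { }

// Hypothetical directory node carrying TDir data and holding TItem payloads.
class DirNode<TItem, TDir>
{
    public TDir Data { get; }
    public Dictionary<string, TItem> Items { get; } = new();
    public Dictionary<string, DirNode<TItem, TDir>> Dirs { get; } = new();
    public DirNode(TDir data) => Data = data;
}

// The one-parameter form expressed in terms of the two-parameter one.
class DirNode<TItem> : DirNode<TItem, Unit>
{
    public DirNode() : base(default) { }
}
```

Deriving the one-parameter type from the two-parameter one keeps existing single-parameter call sites compiling, which is one possible way to introduce the second type parameter without breaking users.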

Hello, this is Akira. Thank you for your comment.

The idea of using VirtualStorage<T, U> is very interesting. Currently, the class representing a directory is of the VirtualDirectory type, which is not a generic type. I think the idea of making directories a generic type, VirtualDirectory<U>, to encapsulate a user-defined type U, similar to how VirtualItem<T> represents items, is very useful. Consequently, it would allow us to comprehensively manage everything as VirtualStorage<T, U>.

With your proposed approach, when using VirtualStorageLibrary to represent a file system in memory, it would be possible to create an instance of VirtualStorage<T, U>. This would intuitively allow for the retention of information about physical files and directories.

The key benefit in this use case is efficient reference and traversal of nodes in a file system held in VirtualStorage<T, U> through the VirtualStorageLibrary API. However, node operations (deletion, moving, copying) present some challenges: each operation would also require updating the System.IO.DirectoryInfo held within VirtualDirectory<U>, so the node operations in VirtualStorageLibrary would overlap with those in System.IO. Given this, it seems more practical to focus solely on efficient node referencing and traversal in this use case.

The idea of associating a third type with VirtualSymbolicLink in VirtualStorageLibrary might require further consideration, but it certainly seems worth exploring.

Thank you for sharing such meaningful and interesting ideas. If you would like, please consider posting your ideas in the issues or discussions on the VirtualStorageLibrary repository.

VirtualStorageLibrary GitHub: https://github.com/shimodateakira/VirtualStorageLibrary

It looks like you did the sensible thing and used U+002F for path separator. Snover got this right when he did PowerShell and it stayed that way up through the open source releases. More recently someone fucked it up and started prescribing backslash for no good reason when you're using it on Windows, so now it goes so far as aggressively rewriting any paths you've typed in when you use Tab to autocomplete, replacing U+002F with U+005C.

I had a look at the project, and it doesn't look like anything like `if (os.platform.name().includes("Windows")) separator = '\'` is going on here, but I haven't done an exhaustive audit.

How do you see the relationship between VirtualStorageLibrary and the relevant/analogous PowerShell subsystem(s)?

Hello, Akira here.

Thank you so much for taking an interest in this project and for your thoughtful comments.

First, regarding the use of \, VirtualStorageLibrary is designed to be platform-agnostic, and there are no plans to tie it to any specific OS in the future. Therefore, you won't find any code like os. in this library. It’s a purely logical hierarchical object model inspired by file systems.

While the default separator is /, it's fully customizable. You can set it to any character you like, such as : or -, and of course, you can also use \.

Regarding the relationship with PowerShell subsystems, as of now, there is no functional integration between VirtualStorageLibrary and PowerShell. That means there’s no capability to import or export data handled by cmdlets—yet. However, there are indeed ideas for future versions that could involve collaboration with PowerShell. Although this is not yet detailed in the roadmap, I believe that being able to manipulate VirtualStorageLibrary via cmdlets would be a highly useful and interesting feature.

When designing VirtualStorageLibrary, I carefully examined the culture surrounding file systems both in Linux and in PowerShell. Although I’ve primarily been a Windows user, I’m also familiar with Linux, and I’ve always found its file system to be simple and elegant. That’s why I chose / as the default separator for this library.

For more information on customizing separators, please refer to the documentation here: https://shimodateakira.github.io/VirtualStorageLibrary/intro...

Please feel free to reach out if you have any more questions or feedback!

I recently needed something like this (and ended up rolling my own) while using the Dropbox API.

They have a fairly conservative rate limit, so you have to think about every API call you make. When traversing, it makes the most sense to do a recursive "ls" of the entire directory tree, keep it in memory, and perform whatever other actions you want (search, list, etc.) on the cached tree.

The implementation ended up being a pretty simple IDictionary<string, string[]>, but it would be interesting to try switching to something purpose-built.
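For readers wondering what such a cache might look like, here is a minimal C# sketch of searching a listing held in an `IDictionary<string, string[]>`. The exact shape the commenter used is not stated, so the layout here is an assumption: keys are directory paths, values are child names, and a trailing `/` marks a subdirectory. `TreeCache.Find` is a hypothetical helper, and no Dropbox API calls are involved.

```csharp
using System;
using System.Collections.Generic;

// Pretend this came from one recursive listing call (assumed layout:
// directory path -> child names, trailing "/" marks a subdirectory).
IDictionary<string, string[]> cache = new Dictionary<string, string[]>
{
    ["/"] = new[] { "docs/", "todo.txt" },
    ["/docs/"] = new[] { "notes.txt", "old/" },
    ["/docs/old/"] = new[] { "notes.bak" },
};

foreach (var path in TreeCache.Find(cache, "/", name => name.StartsWith("notes")))
    Console.WriteLine(path);
// Prints:
// /docs/notes.txt
// /docs/old/notes.bak

static class TreeCache
{
    // Depth-first search over the cached listing; makes no remote calls.
    public static IEnumerable<string> Find(
        IDictionary<string, string[]> cache, string dir, Func<string, bool> match)
    {
        if (!cache.TryGetValue(dir, out var children))
            yield break;

        foreach (var child in children)
        {
            var full = dir + child;
            if (child.EndsWith("/"))
            {
                // Subdirectory: recurse into its cached listing.
                foreach (var hit in Find(cache, full, match))
                    yield return hit;
            }
            else if (match(child))
            {
                yield return full;
            }
        }
    }
}
```

Every search after the initial listing runs against memory only, which is the point of the approach when the remote API is rate limited.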

Hello, this is Akira. Thank you for your interest.

I haven't used the Dropbox API myself, but your approach of caching the recursive ls results and then performing actions on the cached tree is very appropriate, especially considering the rate limits. It sounds like a smart way to handle things efficiently.

I’m not sure what types of nodes the Dropbox API provides, but if it’s just files and directories, it could be possible to build a tree in VirtualStorageLibrary based on the cached structure, and then perform reference or search actions. Also, if the Dropbox API has an option to fetch the entire tree structure at once, that would likely be the most efficient approach.

I've prepared an API reference for VirtualStorageLibrary, so feel free to take a look: https://shimodateakira.github.io/VirtualStorageLibrary/api/A...

Please feel free to reach out again if you have any more comments.

Your license badge shows AGPL 1.0, but the license file is GPL 3.0.

I would suggest a commercial option in addition to GPL, otherwise it is going to have a very limited reach. Or possibly switch to LGPL (depending on your intentions), but this decision needs to be made before you accept any other contributions (or add a requirement to sign a CLA to the PR checklist).

Hello, Akira here. Thank you for your valuable feedback and advice.

I'm glad to see your interest in my project. Your point about the mismatch between the license badge and the file is indeed important. To avoid confusion, I'll make sure to correct that immediately.

I also appreciate your suggestion regarding the addition of a commercial license and switching to LGPL. In fact, I plan to switch from GPL to LGPL to broaden the project's usability. As for the commercial license option, it is still under consideration, but I see it as a real possibility for the project's future.

Additionally, I am considering the introduction of a Contributor License Agreement (CLA) to foster deeper collaboration with contributors.

Your continued feedback and suggestions are highly valued. Thank you for your support, and I look forward to hearing more from you in the future.

can u give examples of these anticipated use cases in natural language processing and also in education?

Hello, this is Akira. Thank you for your interest.

First, regarding natural language processing, I’m not an expert in that field, so I’m unable to provide specific examples. However, I included it as a potential use case because I heard from a colleague that this library could be useful in such areas.

As for the field of education, I mentioned it because I believe the source code of this library could be helpful for learning programming. This is one of the advantages of open source. The specific code you might find useful depends on which programming techniques you want to focus on. For example, if you're interested in recursion, the WalkPathToTarget method might be a good reference.
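As a small illustration of the recursive technique mentioned here, the following C# sketch walks a path one segment at a time. It is a generic teaching example and does not reproduce the library's WalkPathToTarget method, whose signature and behavior are not shown in this thread; `TreeNode` and `Walker.Walk` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

var root = new TreeNode("");
var dir1 = root.Add("dir1");
dir1.Add("item1");

var segments = "/dir1/item1".Split('/', StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(Walker.Walk(root, segments, 0)?.Name);           // item1
Console.WriteLine(Walker.Walk(root, new[] { "missing" }, 0) is null); // True

class TreeNode
{
    public string Name { get; }
    public Dictionary<string, TreeNode> Children { get; } = new();
    public TreeNode(string name) => Name = name;

    // Adds a child with the given name and returns it.
    public TreeNode Add(string name)
    {
        var child = new TreeNode(name);
        Children[name] = child;
        return child;
    }
}

static class Walker
{
    // Base case: no segments left, so the current node is the target.
    // Recursive case: descend into the child named by the next segment.
    // Returns null when a segment does not exist.
    public static TreeNode Walk(TreeNode node, string[] segments, int index)
    {
        if (index == segments.Length)
            return node;

        return node.Children.TryGetValue(segments[index], out var child)
            ? Walk(child, segments, index + 1)
            : null;
    }
}
```

Each call handles exactly one path segment and hands the rest to the next call, which is the usual shape of a recursive tree walk.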