The Autobiography of Russell
Life from a different perspective
zimzat
Tagged file system?
Why not make a file system where each file is given a root tag (e.g. the name of the program or person it belongs to) and then as many optional tags as the user wants (e.g. image, document, etc). The file system could then be displayed using a list of root tags and common tag filters defined by the user or program. You could also make auto-tag features where programs are automatically given a tag of 'executable' or 'program', images are given a tag of 'image' and optionally the type of image, etc.

This would make it very easy for users to select the root of a program and see all files related to it under descriptive tags, or select a tag and see all of the root tags with files that use that tag.

Why not? Sounds like it would make a good alternate file system. (And for those of you who are already crying about losing compatibility with existing systems, just have it default to /root-tag/optional-tag/optional-tag/file)
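To make the idea concrete, here is a toy in-memory sketch of the tag-filter view described above. The `TagIndex` class and its methods are invented for illustration; a real file system would store this in its metadata tables, not a Python dict.

```python
# Toy tag index: each file gets a root tag plus any optional tags,
# and queries return every file matching all requested tags.
# All names here are illustrative, not a real filesystem API.

class TagIndex:
    def __init__(self):
        self.files = {}  # filename -> set of tags (root tag included)

    def add(self, filename, root_tag, *tags):
        self.files[filename] = {root_tag, *tags}

    def query(self, *tags):
        # The "tag filter" view: files carrying every requested tag.
        want = set(tags)
        return sorted(f for f, t in self.files.items() if want <= t)

idx = TagIndex()
idx.add("firefox.exe", "firefox", "executable", "program")
idx.add("bookmarks.html", "firefox", "document")
idx.add("logo.png", "firefox", "image", "png")

print(idx.query("firefox"))           # every file rooted under firefox
print(idx.query("firefox", "image"))  # -> ['logo.png']
```

Selecting a root tag shows everything belonging to a program; adding more tags narrows the view, which is the "descriptive tags" browsing the post describes.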

I personally like Macintosh's ability to make programs look like one file/folder that gets executed in the system, and if you want to delete the program just delete the one file and you're done. None of this spraying itself all over the system. But I wouldn't trade anything for Macintosh's 'fancy' graphics or overpriced computers. No thanks. I just think we could learn a few things from them.

Current Mood: bored
Current Music: None

3 comments or Leave a comment
Comments
From: raist_ Date: July 31st, 2006 03:42 am (UTC)

Devil's Advocate

If you have a nigh-unlimited amount of tags, doesn't that make it turn into a O(n) time constraint to get to the actual meat of the file?

Also, without a designation (read: control) of how you know when you actually get to the file, how do you program something to know when it reaches where it needs to go?

I might send you a file with 23493028 tags, but you only are expecting 3 tags and then the meat of the file; that's going to look pretty fucked up.

And what if one of those 1349382904283491 tags gets a corruption in it? The more tags you have, the higher the probability that something's going to get corrupted along the way.

Plus, you'd be looking at (slightly) higher file sizes. Now, space might not be as much of an issue as it used to be, but it does still come into effect. (especially on network environments)

This is just the stuff that popped into my head as I was reading this. I haven't really thought extensively of reasons why not yet.
From: zimzat Date: July 31st, 2006 05:13 pm (UTC)

Angel's Advocate? ;-)

If you have a nigh-unlimited amount of tags, doesn't that make it turn into a O(n) time constraint to get to the actual meat of the file?

I really need to look up these "O(n)" things and how that's supposed to work.

There are optimizations that can be done, though. The tag information could be put at the end of the file, if you really want to put that information in the file itself (although I think it would be better in the file system information table).

Tags could also be associated with an id and that could be put in the file [system table] instead of the tag string itself. Using something like that the system could do a sort of "autofill completion" for tags when being entered to help ensure the right tags get reused and not misspelled. It would also make searching for tags a lot easier, so you would only search for one number rather than strlen(tag) characters.
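A small sketch of that id optimization: intern each tag string once, store only the integer id per file, and use the string table for the "autofill completion" mentioned above. `TagTable` and its method names are hypothetical.

```python
# Sketch of tag interning: each tag string is stored once and given a
# small integer id, so per-file metadata holds ids instead of strings.
# Names here are illustrative, not from any real file system.

class TagTable:
    def __init__(self):
        self.by_name = {}  # tag string -> id
        self.by_id = []    # id -> tag string

    def intern(self, name):
        # Reuse the existing id if the tag is already known.
        if name not in self.by_name:
            self.by_name[name] = len(self.by_id)
            self.by_id.append(name)
        return self.by_name[name]

    def complete(self, prefix):
        # "Autofill" helper: suggest existing tags so users reuse
        # spellings instead of creating misspelled near-duplicates.
        return sorted(t for t in self.by_name if t.startswith(prefix))

tags = TagTable()
doc_id = tags.intern("document")
tags.intern("image")
assert tags.intern("document") == doc_id  # reused, no duplicate id
print(tags.complete("im"))                # -> ['image']
```

Searching for a fixed-size id is then a single integer comparison rather than comparing strlen(tag) characters, which is the speedup described above.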

You could also define some tags like "_dir" that store where in a 'normal' file system the file would otherwise be.

I might send you a file with 23493028 tags, but you only are expecting 3 tags and then the meat of the file; that's going to look pretty fucked up.

That sounds like a screw-up in file transfer communication protocol, not in the file system protocol itself.

And what if one of those 1349382904283491 tags gets a corruption in it? The more tags you have, the higher the probability that something's going to get corrupted along the way.

I'm pretty sure this is also a problem with hierarchical file systems, so designing the system to be redundant and/or explicit shouldn't be harder than for any of the current file systems facing this problem.

Plus, you'd be looking at (slightly) higher file sizes. Now, space might not be as much of an issue as it used to be, but it does still come into effect. (especially on network environments)

Yes, that would be a concern. If using the above optimizations, though, each tag would only increase the file system size by whatever size the id for each tag was. The transfer protocol could also be designed to default to only sending the above-mentioned "_dir" tag unless the receiving client specifically asked for the full set of tags. In that case it could send them as a sort of meta-data above and beyond the file itself. The same would apply when sending compressed files and what-not.
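One way that transfer default could look, as a hypothetical wire format (the message shape and `make_message` helper are invented here, not specified anywhere in the post):

```python
# Sketch: a sender strips tag metadata down to the "_dir" tag by
# default, and only includes the full tag set when the receiver asks.
# The message format is invented for illustration.

def make_message(filename, tags, payload, want_full_tags=False):
    if want_full_tags:
        meta = tags                              # full tag metadata
    else:
        meta = {"_dir": tags.get("_dir", "/")}   # compatibility default
    return {"name": filename, "meta": meta, "data": payload}

tags = {"_dir": "/home/russell/pics", "root": "photos", "type": "image"}

default = make_message("cat.png", tags, b"...")
full = make_message("cat.png", tags, b"...", want_full_tags=True)

print(default["meta"])   # only the "_dir" tag crosses the wire
print(len(full["meta"])) # all 3 tags sent as extra meta-data
```

Clients that don't understand tags just see a path, while tag-aware clients can opt in to the richer metadata.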
From: raist_ Date: August 3rd, 2006 10:03 pm (UTC)

I need to reread this and finish posting, but...

I might send you a file with 23493028 tags, but you only are expecting 3 tags and then the meat of the file; that's going to look pretty fucked up.

That sounds like a screw-up in file transfer communication protocol, not in the file system protocol itself.


I was referencing what sounded like the ability to easily customize how many tags each user is using. For ex: One user decides he wants to use 4 tags with your file system. Another user decides he wants to use 2342 tags with your file system. Having them communicate is going to be a bitch.

-----

"Big O Notation"

It's a way of describing the absolute worst case time measurement. It uses upper limits for theoretical amounts of data. It's mostly used in comparing time in algorithms and data structures; because "time" is so arbitrary for a given set of hardware, OS, etc., you use the theoretical n. For example:

for (i = 1 to n) {
    // do something
}
runs in n time, so it's O(n). (Pronounced "order n.") Searching a doubly linked list, for example, runs in n time.

The always popular Shitty Sort:
for (i = 1 to m) {
    for (j = 1 to n) {
        if (a[i] < b[j]) {
            return
        }
        if (a[i] > b[j]) {
            // swap, etc.
        }
    }
}

This runs in O(n^2) time, and is most undesirable. You can do sorts much better than n squared. The good sorts (merge sort, heap sort, etc.) are designed to run in n*log(n) time.

Make sense? Most data structures and/or algorithms classes/books will teach you this.
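To see why the difference matters, here is a toy illustration counting loop iterations for a linear pass versus a nested quadratic pass (the function names are made up for this example):

```python
# Count inner-loop steps for a linear scan vs. a nested (quadratic)
# scan over n items, mirroring the pseudocode above.

def linear_steps(n):
    steps = 0
    for _ in range(n):       # O(n): one pass over the data
        steps += 1
    return steps

def quadratic_steps(n):
    steps = 0
    for _ in range(n):       # O(n^2): a full inner pass per element
        for _ in range(n):
            steps += 1
    return steps

print(linear_steps(100))     # 100
print(quadratic_steps(100))  # 10000 -- grows much faster with n
```

At n = 100 the quadratic version already does 100 times the work, which is why an O(n^2) sort loses badly to an n*log(n) one as data grows.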