Categories instead of directories, or the Semantic File System for Linux

Data classification is an interesting research topic in its own right. I love collecting information that seems useful, and I have always tried to build logical directory hierarchies for my files. Then one day I dreamed of a beautiful, convenient program for assigning tags to files, and decided that I couldn't live like this anymore.

The problem of hierarchical file systems

Users often struggle with two things: choosing where to save the next new file, and finding their own files later (some file names are not meant to be remembered by a person at all).

One way out is a semantic file system, which is usually an add-on to a traditional file system. Directories are replaced by semantic attributes, also called tags, categories, or metadata. I will mostly use the term "category", because in the context of file systems the word "tag" can sound odd, especially once "subtags" and "tag aliases" appear.

Assigning categories to files largely eliminates the problems of storing and searching for a file: if you remember (or guess) at least one of the categories assigned to a file, then the file will never disappear from sight.

This topic has been raised on Habré more than once (one, two, three, four, etc.); here I describe my own solution.

Path to implementation

Immediately after that dream, I sketched in my notebook a command interface covering the necessary operations on categories. I decided that in a week or two I could write a prototype in Python or Bash, and then move on to a graphical shell in Qt or GTK. Reality, as always, turned out to be much harsher, and development dragged on.

The original idea was to start with a program with a convenient, concise command-line interface that would create and delete categories, assign categories to files, and remove categories from files. I called the program Vitis.

The first attempt to create Vitis came to nothing, because work and college began to take up a lot of time. The second attempt produced something: by the time of my master's thesis, I had finished the planned project and even made a prototype GTK shell. But that version turned out to be so unreliable and inconvenient that a lot had to be rethought.

I used the third version myself for a very long time, having moved several thousand of my files into categories. The bash auto-completion I had implemented helped a great deal with this. But some problems remained, such as the lack of automatic categories and the inability to store files with the same name, and the program was already buckling under its own complexity. So I had to face the problems of developing complex software: writing detailed requirements, building a functional testing system, studying packaging guidelines, and much more. I have now reached the point where this humble creation can be presented to the free software community. Managing files through categories raises unexpected questions and problems, and in solving them Vitis spawned five more projects around it, some of which are mentioned in this article. Vitis still has no graphical shell, but for me the convenience of using file categories from the command line already outweighs any advantages of a conventional graphical file manager.

Usage examples

Let's start simple and create a category, "Музыка" ("Music"):

vitis create Музыка

Let's add a track to it, for example:

vitis assign Музыка -f "The Ink Spots - I Don't Want To Set The World On Fire.mp3"

You can view the contents of the "Music" category with the "show" subcommand:

vitis show Музыка

You can play it using the "open" subcommand:

vitis open Музыка

Because there is only one file in the "Music" category, only that file is opened. To open files with their default programs, I wrote a separate utility, vts-fs-open (standard tools such as xdg-open or mimeopen did not suit me for a number of reasons; if needed, you can specify a different utility for opening files in the settings). This utility works well across distributions and desktop environments, so I recommend installing it along with vitis.

You can also directly specify a program to open files:

vitis open Музыка --app qmmp

We create more categories and add files using "assign". If files are assigned to categories that do not exist yet, Vitis asks whether to create them. The extra prompt can be avoided with the --yes flag.

vitis assign Программирование R -f "Введение в R.pdf" "Статистический пакет R: теория вероятностей и матстатистика.pdf" --yes

Now we want to add the category "Математика" ("Mathematics") to the file "Статистический пакет R: теория вероятностей и матстатистика.pdf". We know that this file is already in the category "R", so we can refer to it by its category path within Vitis:

vitis assign Математика -v "R/Статистический пакет R: теория вероятностей и матстатистика.pdf"

Luckily, bash completion makes this easy.

Let's see what happened, using the --categories flag to see a list of categories for each file:

vitis show R --categories

Note that the files have also been categorized automatically by format, type (which groups formats), and file extension. These categories are optional. I definitely plan to localize their names later.

Let's add something else to "Mathematics" for a change:

vitis assign Математика -f "Математический анализ - 1984.pdf" Перельман_Занимательная_математика_1927.djvu

Now the interesting part begins. Instead of a single category, you can write an expression with union, intersection, and subtraction, that is, use set operations. For example, intersecting "Mathematics" with "R" yields one file:

vitis show R i: Математика

Subtract everything that mentions the "R" language from "Mathematics":

vitis show Математика c: R

We can pointlessly combine music and the R language:

vitis show Музыка u: R
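
The i:, u:, and c: operators correspond to ordinary set operations. Here is a minimal Python sketch of that idea, assuming each category is simply a set of file names and that a flat expression is evaluated left to right. The file names, the `CATEGORIES` table, and the `evaluate` helper are illustrative only and say nothing about how Vitis itself is implemented:

```python
# Hypothetical model: a category is just a set of file names.
CATEGORIES = {
    "Music": {"ink_spots.mp3"},
    "R": {"intro_to_r.pdf", "r_statistics.pdf"},
    "Mathematics": {"r_statistics.pdf", "calculus_1984.pdf", "perelman_1927.djvu"},
}

# Operator tokens as they appear in the examples above.
OPERATORS = {
    "u:": set.union,         # union
    "i:": set.intersection,  # intersection
    "c:": set.difference,    # subtraction
}


def evaluate(tokens, categories=CATEGORIES):
    """Evaluate a flat expression such as ["R", "i:", "Mathematics"].

    Assumption: operators are applied strictly left to right.
    """
    result = set(categories[tokens[0]])
    for operator, name in zip(tokens[1::2], tokens[2::2]):
        result = OPERATORS[operator](result, categories[name])
    return result


if __name__ == "__main__":
    print(sorted(evaluate(["R", "i:", "Mathematics"])))
    print(sorted(evaluate(["Mathematics", "c:", "R"])))
    print(sorted(evaluate(["Music", "u:", "R"])))
```

With this toy data, the intersection of "R" and "Mathematics" contains exactly one file, as in the example above.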

The -n flag lets you pick specific files out of the query result by number and/or range, for example -n 3-7, or something more complicated: -n 1,5,8-10,13. It is often useful with the open subcommand, to open just the files you want from a list.
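
A selection string of this kind is easy to expand into a list of positions. The sketch below is an illustration in Python, assuming 1-based numbering and inclusive ranges; the function names are made up, and how Vitis parses -n internally is not described here:

```python
def parse_selection(spec):
    """Expand a spec like "1,5,8-10,13" into a sorted list of 1-based indices."""
    indices = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            # Ranges are assumed to be inclusive on both ends.
            indices.update(range(int(first), int(last) + 1))
        else:
            indices.add(int(part))
    return sorted(indices)


def pick(items, spec):
    """Return the items whose 1-based positions are named by spec."""
    return [items[i - 1] for i in parse_selection(spec) if 1 <= i <= len(items)]


if __name__ == "__main__":
    print(parse_selection("1,5,8-10,13"))
    print(pick(["a.pdf", "b.pdf", "c.pdf", "d.pdf"], "2-3"))
```

Positions outside the list are silently ignored in this sketch; a real tool might prefer to report them.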

While we're moving away from using the usual directory hierarchy, it's often useful to have nested categories. Let's create a subcategory "Statistics" for the category "Mathematics" and add this category to the appropriate file:

vitis create Математика/Статистика

vitis assign Математика/Статистика -v "R/Введение в R.pdf"

vitis show Математика --categories

We can see that this file is now categorized as "Mathematics/Statistics" instead of "Mathematics" (redundant links are tracked).
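
One way to picture this tracking: when a file gains a subcategory, any ancestor of that subcategory already assigned to the file becomes redundant and can be dropped. A small Python sketch of that rule, assuming categories are slash-separated paths (the `add_category` helper is hypothetical and not part of Vitis):

```python
def add_category(current, new):
    """Add `new` to a file's categories, dropping ancestors it makes redundant.

    `current` is a set of slash-separated category paths.
    """
    # An ancestor of "Mathematics/Statistics" is any path P
    # such that the new path starts with P + "/".
    kept = {c for c in current if not new.startswith(c + "/")}
    return kept | {new}


if __name__ == "__main__":
    print(sorted(add_category({"Mathematics", "R"}, "Mathematics/Statistics")))
```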

Referring to the full path can be inconvenient, so let's create a "global" alias:

vitis assign Математика/Статистика -a Статистика

vitis show Статистика

Not just regular files

Internet links

To unify the storage of any information, it would be useful, at a minimum, to categorize links to Internet resources. And this is possible:

vitis assign Хабр Цветоаномалия -i https://habr.com/ru/company/sfe_ru/blog/437304/ --yes

A file named after the title of the HTML page, with the .desktop extension, is created in a dedicated location. This is the traditional shortcut format on GNU/Linux. Such shortcuts automatically receive the NetworkBookmarks category.

Naturally, shortcuts are created to be used:

vitis open Цветоаномалия

Executing the command opens the newly saved link in the browser. Categorized shortcuts to Internet sources can serve as a substitute for browser bookmarks.
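
For reference, a link shortcut in the freedesktop.org Desktop Entry format is a small text file with Type=Link and a URL key. The Python sketch below renders such an entry; which keys Vitis actually writes is an assumption, and both helper names are illustrative:

```python
def desktop_link(title, url):
    """Render a Desktop Entry of Type=Link for a web page."""
    return "\n".join([
        "[Desktop Entry]",
        "Type=Link",
        f"Name={title}",
        f"URL={url}",
        "",  # trailing newline
    ])


def desktop_file_name(title):
    """Derive a file name from the page title (a slash cannot appear in a file name)."""
    return title.replace("/", "_") + ".desktop"


if __name__ == "__main__":
    print(desktop_file_name("Example page"))
    print(desktop_link("Example page", "https://habr.com/ru/company/sfe_ru/blog/437304/"))
```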

File fragments

It is also useful to be able to categorize individual fragments of files. A nice use case, right? For now, though, the implementation only covers plain text files, audio, and video. Say you need to mark a certain part of a concert or a funny moment in a movie: when using assign, you can pass the --fragname, --start, and --finish flags. Let's save the intro from DuckTales:

vitis assign -c Заставки -f Duck_Tales/s01s01.avi --finish 00:00:59 --fragname "Duck Tales intro"

vitis open Заставки

No file is actually cut. Instead, a pointer file is created that describes the fragment: the file type, the path to the file, and the start and end of the fragment. Creating and opening fragment pointers is delegated to two utilities I wrote for this purpose, mediafragmenter and fragplayer. The first creates pointers, the second opens them. For audio and video, playing a media file between two positions is done with the VLC player, so it must be installed as well. At first I wanted to build this on mplayer, but for some reason its seeking to the right moment was very unreliable.
In our example, the file "Duck Tales intro.fragpointer" is created (in a dedicated location), and the fragment is then played from the beginning of the file (because --start was not specified at creation) to the 59-second mark, after which VLC closes.
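
To illustrate how such a fragment can be played, the Python sketch below converts HH:MM:SS timestamps to seconds and builds a VLC command line using VLC's standard --start-time, --stop-time, and --play-and-exit options. How fragplayer really invokes VLC, and what a .fragpointer file looks like inside, is not described in this article, so treat the helper names and structure as assumptions:

```python
def to_seconds(timestamp):
    """Convert an "HH:MM:SS" timestamp to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def vlc_command(path, start=None, finish=None):
    """Build a VLC command line that plays only the requested fragment."""
    command = ["vlc", "--play-and-exit"]  # quit when playback stops
    if start is not None:
        command.append(f"--start-time={to_seconds(start)}")
    if finish is not None:
        command.append(f"--stop-time={to_seconds(finish)}")
    command.append(path)
    return command


if __name__ == "__main__":
    # The argument list could be passed to subprocess.run() to start playback.
    print(vlc_command("Duck_Tales/s01s01.avi", finish="00:00:59"))
```

When no start is given, the start option is omitted and playback begins at the start of the file, which matches the behavior described above.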

Another example: we want to categorize a single performance from a concert by some famous artist:

vitis assign Лепс "Спасите наши души" -f "Григорий Лепc - Концерт Парус - песни Владимира Высоцкого.mp4" --fragname "Спасите наши души" --start 00:32:18 --finish 00:36:51

vitis open "Спасите наши души"

When opened, the file starts playing at the desired position and closes after about four and a half minutes.

How it all works + additional features

Category storage

When I first thought through the organization of the semantic file system, three approaches came to mind: storing symbolic links, using a database, or describing everything in XML. The first approach won because, on the one hand, it is easy to implement and, on the other, the user can inspect the categories directly in the file system (which is convenient and important). On first use, Vitis creates the directory "Vitis" and the configuration file ".config/vitis/vitis.conf" in the user's home directory. Directories corresponding to categories are created in ~/Vitis, and symbolic links to the original files are created inside those category directories. Category aliases are also just links to the categories. Of course, having a "Vitis" directory in the home directory may not suit everyone. We can switch to any other location:

vitis service set path /mnt/MyFavoriteDisk/Vitis/
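
The symlink-based layout is simple enough to mimic in a few lines. The following Python sketch creates a category directory and links a file into it, using a temporary directory in place of ~/Vitis. It only illustrates the storage idea; the `assign` and `show` functions here are simplified stand-ins, not the real subcommands:

```python
import os
import tempfile


def assign(root, category, original):
    """Link `original` into the directory that represents `category`."""
    category_dir = os.path.join(root, category)
    os.makedirs(category_dir, exist_ok=True)
    link = os.path.join(category_dir, os.path.basename(original))
    if not os.path.lexists(link):
        # The link stores the absolute path of the original file.
        os.symlink(os.path.abspath(original), link)
    return link


def show(root, category):
    """List the names linked into a category."""
    return sorted(os.listdir(os.path.join(root, category)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "song.mp3")
        open(original, "w").close()
        root = os.path.join(tmp, "Vitis")
        assign(root, "Music", original)
        print(show(root, "Music"))
```

Because the links hold absolute paths, moving either the category tree or the original files breaks them unless the links are rewritten.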

At some point it becomes clear that there is little sense in categorizing files scattered across different places, since their locations can change. So at first I created a single directory, simply dumped everything into it, and assigned categories to all of it. Then I decided it would be nice to support this at the program level. This is how the concept of a "file space" appeared. When you start using Vitis, it is worth setting up such a place right away (all the files we need will be stored there) and enabling autosave:

vitis service add filespace /mnt/MyFavoriteDisk/Filespace/

vitis service set autosave yes

Without autosave, the "assign" subcommand needs the --save flag if you want the added file to be saved to the file space.

You can also add several file spaces and change their priorities, which is useful when there are many files stored on different media. I will not cover this here; details can be found in the program's help.

Semantic file system migration

One way or another, the Vitis directory and the file spaces may occasionally move from place to place. To get everything working again after a move, I created a separate utility, link-editor, which can edit links in bulk, replacing parts of their paths:

cp -r /mnt/MyFavoriteDisk/Vitis/ ~/Vitis
link-editor -d ~/Vitis/ -f /mnt/MyFavoriteDisk/Vitis/ -r ~/Vitis/ -R
cp -r /mnt/MyFavoriteDisk/Filespace/ ~/MyFiles
link-editor -d ~/Vitis/ -f /mnt/MyFavoriteDisk/Filespace/ -r ~/MyFiles -R

In the first case, after we moved from /mnt/MyFavoriteDisk/Vitis/ to the home directory, the symbolic links associated with the aliases are edited. In the second case, after changing the location of the file space, all links in Vitis are changed to new ones according to the request to replace part of their path.
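
The core of such a bulk edit can be sketched in Python with os.readlink and os.symlink. This is a simplified illustration that only replaces a leading path prefix; it is not the link-editor implementation, and the function name is made up:

```python
import os


def retarget_links(directory, old_prefix, new_prefix):
    """Rewrite symlink targets under `directory`, replacing a leading path prefix.

    Returns the number of links changed. Keep trailing slashes consistent
    between `old_prefix` and `new_prefix`, since the replacement is textual.
    """
    changed = 0
    # os.walk does not descend into symlinked directories by default,
    # but it still reports them in `dirnames`.
    for dirpath, dirnames, filenames in os.walk(directory):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)
            if target.startswith(old_prefix):
                os.remove(path)
                os.symlink(new_prefix + target[len(old_prefix):], path)
                changed += 1
    return changed
```

Links whose targets do not start with the old prefix are left untouched, so the function can be run repeatedly without harm.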

Automatic categories

If you run the command vitis service get autocategorization, you can see that automatic categories are assigned by default by format (Format and Type) and file extension (Extension).

This is useful when, for example, you need to find something among your PDFs or see what you have stored in e-book formats such as MOBI and FB2. You can simply run a query:

vitis show Format/MOBI u: Format/FB2

It so happened that standard GNU/Linux tools such as file or mimetype did not suit me, precisely because they do not always determine the format correctly, so I had to write my own implementation based on file signatures and extensions. Determining file formats is an interesting research topic in itself and deserves a separate article. For now I can say that I have probably not covered correct recognition of every format in the world, but overall it already works well. Admittedly, EPUB files are currently detected as ZIP (which is technically justified, since EPUB is a ZIP container, but in practice should not be considered normal behavior). For the time being, consider this feature experimental and report bugs. In odd cases you can always fall back on the file extension categories, such as Extension/epub.
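
As an illustration of signature-based detection, the Python sketch below checks the first bytes of a file against a few well-known magic numbers and falls back to the extension. It also shows why EPUB needs a special case: the file starts with the ZIP signature, and only the uncompressed "mimetype" entry that follows the 30-byte local header distinguishes it. This is not the Vitis implementation; the table and function name are assumptions:

```python
# A few well-known signatures (magic bytes at the start of the file).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"AT&TFORM", "DjVu"),
    (b"PK\x03\x04", "ZIP"),
]

# In a well-formed EPUB the first ZIP entry is an uncompressed file named
# "mimetype", so its name and content directly follow the 30-byte header.
EPUB_MARKER = b"mimetypeapplication/epub+zip"


def detect_format(header, extension=""):
    """Guess a format from the first bytes of a file, else from the extension."""
    if header.startswith(b"PK\x03\x04") and header[30:58] == EPUB_MARKER:
        return "EPUB"
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return extension.lstrip(".").upper() or "Unknown"


if __name__ == "__main__":
    print(detect_format(b"%PDF-1.4\n"))
    print(detect_format(b"<?xml version", ".fb2"))
```

Formats without a distinctive signature, such as XML-based FB2, can only be recognized here by extension, which is one reason to keep extension categories alongside format categories.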

If autocategories by format are enabled, autocategories are also enabled that combine some formats by type: "Archives", "Pictures", "Video", "Audio", and "Documents". Localized names will also be made for these subcategories.

What is not said

Vitis turned out to be a very multifaceted tool, and it is hard to cover everything at once. I will briefly mention what else you can do:

  • categories can be deleted and removed from files;
  • query results by expressions can be copied to the specified directory;
  • files can be run as programs;
  • the show command has many options, such as sorting by name, modification or access date, size, or extension, showing file properties and paths to the originals, displaying hidden files, etc.;
  • when you save links to Internet sources, you can also save local copies of the HTML pages.

All details can be found in the user help.

Prospects

Skeptics often say that "nobody will assign all these tags themselves". My own example proves the opposite: I have already categorized more than six thousand files and created more than a thousand categories and aliases, and it was worth it. When a single command, vitis open План, opens your to-do list, or vitis open LaTeX opens Stolyarov's book on the LaTeX typesetting system, it becomes psychologically hard to go back to using the file system "the old-fashioned way".

A number of ideas follow from this. For example, you could make an automatic radio that plays themed music according to the current weather, holiday, day of the week, time of day, or season. A related idea is a music player that knows about categories and can play music selected by an expression with set operations on categories. It would be useful to have a daemon that watches the Downloads directory and offers to categorize new files. And, of course, a proper graphical semantic file manager should be made. I once even built a file-sharing web service for an enterprise, but it was not a priority and became irrelevant, even though it reached a high level of functionality. (Due to major changes in Vitis, it is no longer usable.)

Here is a little demo: [demo animation]

Conclusion

Vitis is not the first attempt to radically change the way we work with data, but I considered it important to implement my ideas and make the implementation publicly available under the GNU GPL license. For convenience, a deb package for x86-64 is provided; it should work on all modern Debian-based distributions. There were minor difficulties on ARM (although all the other programs related to Vitis work fine there), but a working package for this platform (armhf) will be built in the future. I have stopped building RPM packages for the time being because of problems on Fedora 30 and the difficulty of covering the many RPM-based distributions, but later packages will be built for at least a couple of them. Until then, you can use make && make install or checkinstall.

Thank you all for your attention! I hope this article and this project can be useful.

Link to project repository

Source: habr.com