Data classification is an interesting research topic in its own right. I love collecting information that seems useful, and I have always tried to build logical directory hierarchies for my files. Then one night I dreamed of a beautiful, convenient program for assigning tags to files, and decided that I couldn't go on living like this.
The problem of hierarchical file systems
Users often struggle simply to decide where to save a new file, and later to find their own files (some file names are not meant to be remembered by a human at all).
A way out is offered by semantic file systems, which are usually an add-on to a traditional file system. Directories in them are replaced by semantic attributes, also called tags, categories, or metadata. I will mostly use the term "category", since in the context of file systems the word "tag" sometimes sounds odd, especially once "subtags" and "tag aliases" appear.
Assigning categories to files largely eliminates the problems of storing and finding a file: if you remember (or guess) at least one of the categories assigned to a file, the file will never be lost from sight.
This topic has been raised on Habr more than once before.
Path to implementation
Right after that dream, I described in my notebook a command interface that covers the necessary work with categories. I figured that in a week or two I could write a prototype in Python or Bash, and then move on to a graphical shell in Qt or GTK. Reality, as always, turned out to be much harsher, and development dragged on.
The original idea was to start with a program that has a convenient and concise command line interface for creating and deleting categories, assigning categories to files, and removing categories from files. I called the program Vitis.
The first attempt to create Vitis came to nothing, because work and college started taking up a lot of time. The second attempt produced something: by the time of my master's thesis I had finished the planned project and even made a prototype of a GTK shell. But that version turned out to be so unreliable and inconvenient that a lot had to be rethought.
I actually used the third version myself for a very long time and moved several thousand of my files into categories. The bash auto-completion I had implemented helped a great deal. But some problems remained, such as the lack of automatic categories and of a way to store files with the same name, and the program was already buckling under its own complexity. So I had to face the problems of developing complex software: writing detailed requirements, building a functional testing system, studying packaging, and much more. I have now reached the point where this humble creation can be presented to the free software community. Managing files through categories raises unexpected questions and problems, and in solving them Vitis spawned five more projects around itself; some of them are mentioned in this article. Vitis still has no graphical shell, but for me the convenience of using file categories from the command line already outweighs any advantages of a conventional graphical file manager.
Usage examples
Let's start simple - create a category:
vitis create Музыка
Let's add a song to it, for example:
vitis assign Музыка -f "The Ink Spots - I Don't Want To Set The World On Fire.mp3"
You can view the contents of the "Музыка" ("Music") category with the "show" subcommand:
vitis show Музыка
You can play it using the "open" subcommand:
vitis open Музыка
Since the "Music" category contains only one file, only that file will be played. To open files with their default programs, I wrote a separate utility, vts-fs-open (standard tools like xdg-open or mimeopen did not suit me for a number of reasons; if need be, you can specify another utility for universal file opening in the settings). This utility works well on different distributions with different desktop environments, so I recommend installing it along with vitis.
You can also directly specify a program to open files:
vitis open Музыка --app qmmp
Let's create more categories and add files using "assign". If files are assigned to categories that do not yet exist, you are prompted to create them. The extra prompt can be avoided with the --yes flag.
vitis assign Программирование R -f "Введение в R.pdf" "Статистический пакет R: теория вероятностей и матстатистика.pdf" --yes
Now we want to add the category "Математика" ("Mathematics") to the file "Статистический пакет R: теория вероятностей и матстатистика.pdf" ("The R Statistical Package: Probability Theory and Mathematical Statistics"). We know that this file is already in the category "R", so we can use a category path inside the Vitis system:
vitis assign Математика -v "R/Статистический пакет R: теория вероятностей и матстатистика.pdf"
Luckily, bash completion makes this easy.
Let's see what happened, using the --categories flag to see a list of categories for each file:
vitis show R --categories
Note that the files have also been automatically categorized by format, type (which groups formats), and file extension. These categories are optional. Later I will definitely localize their names.
Let's add something else to "Mathematics" for a change:
vitis assign Математика -f "Математический анализ - 1984.pdf" Перельман_Занимательная_математика_1927.djvu
Now the interesting part begins. Instead of categories, you can write expressions with union, intersection, and subtraction, that is, use set operations. For example, intersecting "Mathematics" with "R" yields a single file:
vitis show R i: Математика
Subtract everything related to the "R" language from "Mathematics":
vitis show Математика R # or: vitis show Математика c: R
We can, for no particular reason, combine music and the R language:
vitis show Музыка u: R
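The i:, c: and u: operators behave like ordinary set operations. A minimal Python sketch of that idea, assuming each category is just a set of file names; the `categories` dictionary and the file names in it are illustrative and say nothing about how Vitis stores or evaluates expressions internally:

```python
# Illustrative data only: each category is modelled as a set of file names.
# The names are hypothetical stand-ins for the files used in the examples.
categories = {
    "Music": {"ink_spots.mp3"},
    "R": {"intro_to_r.pdf", "r_statistics.pdf"},
    "Mathematics": {"r_statistics.pdf", "calculus_1984.pdf", "perelman_1927.djvu"},
}

# Intersection (i:): files that are in both categories.
r_and_math = categories["R"] & categories["Mathematics"]

# Subtraction (c:): files in "Mathematics" that are not in "R".
math_without_r = categories["Mathematics"] - categories["R"]

# Union (u:): files that are in at least one of the categories.
music_or_r = categories["Music"] | categories["R"]

print(sorted(r_and_math))
print(sorted(math_without_r))
print(sorted(music_or_r))
```

Modelling categories as sets also explains why one file can appear in any number of categories without being copied.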
The -n flag lets you "pull out" the files you need from a query result by number and/or range, for example -n 3-7, or something more complicated: -n 1,5,8-10,13. It is often useful with the open subcommand, which lets you open the desired files from a list.
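To make the selection syntax concrete, here is a small Python sketch of how a specification such as 1,5,8-10,13 could be expanded. It assumes 1-based numbering and inclusive ranges; `parse_selection` and `pick` are hypothetical helpers written for this illustration, not functions taken from Vitis:

```python
def parse_selection(spec: str) -> list[int]:
    """Expand a spec such as '1,5,8-10,13' into a list of item numbers."""
    numbers = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range such as '8-10' is assumed to include both ends.
            first, last = part.split("-", 1)
            numbers.extend(range(int(first), int(last) + 1))
        else:
            numbers.append(int(part))
    return numbers


def pick(items: list, spec: str) -> list:
    """Return the items whose 1-based positions are named in spec."""
    return [items[n - 1] for n in parse_selection(spec) if 1 <= n <= len(items)]


print(parse_selection("1,5,8-10,13"))
print(pick(["a", "b", "c", "d", "e", "f", "g", "h"], "3-7"))
```

Numbers that fall outside the result list are silently skipped in this sketch; a real tool might prefer to report them.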
Although we are moving away from the usual directory hierarchy, nested categories are often useful. Let's create a subcategory "Статистика" ("Statistics") of the category "Mathematics" and add it to the appropriate file:
vitis create Математика/Статистика
vitis assign Математика/Статистика -v "R/Введение в R.pdf"
vitis show Математика --categories
We can see that this file is now categorized as "Mathematics/Statistics" instead of "Mathematics" (redundant links are tracked).
Referring to the full path can be inconvenient, so let's create a "global" alias:
vitis assign Математика/Статистика -a Статистика
vitis show Статистика
Not just regular files
Internet links
To unify the storage of all kinds of information, it would be useful at least to categorize links to Internet resources. This is possible:
vitis assign Хабр Цветоаномалия -i https://habr.com/ru/company/sfe_ru/blog/437304/ --yes
A file named after the title of the HTML page, with the .desktop extension, is created in a special location. This is the traditional GNU/Linux shortcut format. Such shortcuts automatically receive the NetworkBookmarks category.
Naturally, shortcuts are created to be used:
vitis open Цветоаномалия
Running this command opens the newly saved link in the browser. Categorized shortcuts to Internet resources can replace browser bookmarks.
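For readers unfamiliar with the format, a link shortcut of this kind is a small text file. The Python sketch below builds one using the standard Desktop Entry keys for links (Type=Link, Name, URL). Which keys Vitis itself writes is not stated in this article, so `desktop_link` and the sample title are assumptions made for illustration:

```python
def desktop_link(title: str, url: str) -> str:
    """Return the text of a .desktop shortcut that points at a URL."""
    lines = [
        "[Desktop Entry]",
        "Type=Link",       # a link entry rather than an application launcher
        f"Name={title}",   # shown to the user; here, the page title
        f"URL={url}",      # required for entries of type Link
    ]
    return "\n".join(lines) + "\n"


# Hypothetical title; the URL is the one used in the example above.
text = desktop_link(
    "Example page title",
    "https://habr.com/ru/company/sfe_ru/blog/437304/",
)
print(text)
```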
File fragments
It is also useful to have categories for individual file fragments. A nice use case, isn't it? So far, though, the implementation covers only plain text files, audio, and video. Say you need to mark a certain part of a concert or a funny moment in a movie: when using assign you can add the --fragname, --start, and --finish flags. Let's save the intro from DuckTales:
vitis assign -c Заставки -f Duck_Tales/s01s01.avi --finish 00:00:59 --fragname "Duck Tales intro"
vitis open Заставки
No file is actually cut. Instead, a pointer file to the fragment is created, which describes the file type, the path to the file, and the beginning and end of the fragment. Creating and opening fragment pointers is delegated to two utilities I wrote specifically for this purpose: mediafragmenter creates them and fragplayer opens them. For audio and video, playing a media file from one position to another is done with the VLC player, so it must also be installed. At first I wanted to build this on mplayer, but for some reason it handled seeking to the right moment very poorly.
In our example, the file "Duck Tales intro.fragpointer" is created (it is placed in a special location), and then the fragment is played from the beginning of the file (because --start was not specified at creation) to the 59-second mark, after which VLC closes.
Another example: we decided to categorize a single song from a concert by a famous performer:
vitis assign Лепс "Спасите наши души" -f "Григорий Лепc - Концерт Парус - песни Владимира Высоцкого.mp4" --fragname "Спасите наши души" --start 00:32:18 --finish 00:36:51
vitis open "Спасите наши души"
When opened, the file starts at the desired position and closes after about four and a half minutes.
How it all works + additional features
Category storage
When I first started thinking through the organization of a semantic file system, three approaches came to mind: storing symbolic links, using a database, or describing everything in XML. The first approach won because, on the one hand, it is easy to implement, and on the other hand, the user can look at the categories directly in the file system (which is convenient and important). When you first start using Vitis, the directory "Vitis" and the configuration file ".config/vitis/vitis.conf" are created in the user's home directory. Directories corresponding to the categories are created in ~/Vitis, and symbolic links to the original files are created in these category directories. Category aliases are also just links to the categories. Of course, having the "Vitis" directory in the home directory may not suit everyone. We can switch to any other location:
vitis service set path /mnt/MyFavoriteDisk/Vitis/
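The symlink-based layout described above is easy to reproduce in miniature. The Python sketch below creates a category directory under a root and places a symbolic link to the original file inside it. It assumes a POSIX system where creating symlinks is permitted; `assign` and `show` are illustrative names that only mimic the idea, not the actual Vitis code:

```python
from pathlib import Path


def assign(root: Path, category: str, original: Path) -> Path:
    """Link `original` into the directory that represents `category`."""
    category_dir = root / category
    # Nested categories such as "Mathematics/Statistics" become nested directories.
    category_dir.mkdir(parents=True, exist_ok=True)
    link = category_dir / original.name
    if not link.is_symlink():
        link.symlink_to(original.resolve())
    return link


def show(root: Path, category: str) -> list[str]:
    """List the names of the entries stored in a category directory."""
    return sorted(entry.name for entry in (root / category).iterdir())
```

Because a category is just a directory of links, its contents can be inspected with ls or any file manager, which is the convenience mentioned above.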
At some point it becomes clear that it makes little sense to categorize files scattered across different places, since their location can change. So, to begin with, I created a directory into which I simply dumped everything and then assigned categories to all of it. Then I decided it would be nice to support this at the program level. This is how the concept of a "file space" appeared. When you start using Vitis, it is a good idea to set up such a place right away (all the files you need will be stored there) and enable autosave:
vitis service add filespace /mnt/MyFavoriteDisk/Filespace/
vitis service set autosave yes
Without autosave, the "assign" subcommand needs the --save flag if you want the added file to be saved to the file space.
Moreover, you can add several file spaces and change their priorities, which is useful when there are many files stored on different media. I will not cover this here; details can be found in the program's help.
Semantic file system migration
One way or another, the Vitis directory and the file spaces may occasionally move from place to place. To restore a working state after such a move, I wrote a separate utility, link-editor, which can edit links in bulk, replacing parts of their paths with others:
cp -r /mnt/MyFavoriteDisk/Vitis/ ~/Vitis
link-editor -d ~/Vitis/ -f /mnt/MyFavoriteDisk/Vitis/ -r ~/Vitis/ -R
cp -r /mnt/MyFavoriteDisk/Filespace/ ~/MyFiles
link-editor -d ~/Vitis/ -f /mnt/MyFavoriteDisk/Filespace/ -r ~/MyFiles -R
In the first case, after moving from /mnt/MyFavoriteDisk/Vitis/ to the home directory, the symbolic links that implement aliases are edited. In the second case, after the file space changes location, all links in Vitis are updated by replacing the specified part of their path with the new one.
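The bulk edit amounts to rewriting the target of every symbolic link whose path starts with the old prefix. A minimal Python sketch of that technique, assuming a POSIX system; `retarget_links` is a hypothetical function written for this illustration and is not the link-editor source:

```python
import os


def retarget_links(directory: str, old_prefix: str, new_prefix: str) -> int:
    """Replace old_prefix with new_prefix in symlink targets under directory.

    Returns the number of links that were rewritten.
    """
    changed = 0
    # os.walk does not follow symlinked directories by default, but it still
    # lists them in dirnames, so both lists have to be checked.
    for dirpath, dirnames, filenames in os.walk(directory):
        for name in dirnames + filenames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue
            target = os.readlink(link)
            if target.startswith(old_prefix):
                os.remove(link)  # removes the link itself, not its target
                os.symlink(new_prefix + target[len(old_prefix):], link)
                changed += 1
    return changed
```

Running the function a second time with the same arguments changes nothing, because no target starts with the old prefix any more.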
Automatic categories
If you run the command vitis service get autocategorization, you can see that automatic categories are assigned by default by format (Format and Type) and by file extension (Extension).
This is useful when, for example, you need to find something among your PDFs or see which e-books you have stored in MOBI and FB2. You can simply run a query:
vitis show Format/MOBI u: Format/FB2
As it happens, standard GNU/Linux tools like file or mimetype did not suit me precisely because they do not always determine the format correctly, so I had to write my own implementation based on file signatures and extensions. Determining file formats is an interesting research topic in general and deserves a separate article. For now I can say that I have probably not provided correct recognition for every format in the world, but on the whole it already works well. Admittedly, EPUB files are currently detected as ZIP (technically justified, since EPUB is a ZIP container, but in practice this should not be considered normal behavior). For the time being, consider this feature experimental and report bugs. In odd situations you can always fall back on file extension categories, such as Extension/epub.
If autocategories by format are enabled, autocategories are also enabled that combine some formats by type: "Archives", "Pictures", "Video", "Audio", and "Documents". Localized names will also be made for these subcategories.
What is not said
Vitis turned out to be a very multifaceted tool, and it is difficult to cover everything at once. I will briefly mention what else you can do:
- categories can be deleted and removed from files;
- query results by expressions can be copied to the specified directory;
- files can be run as programs;
- the show command has many options, such as sorting by name/date modified or accessed/size/extension, showing file properties and paths to originals, enabling display of hidden files, etc.;
- when you save links to Internet sources, you can also save local copies of HTML pages.
All details can be found in the user help.
Prospects
Skeptics often say that "nobody is going to assign all these tags by hand". My own example proves the opposite: I have already categorized more than six thousand files and created more than a thousand categories and aliases, and it was worth it. When a single command, vitis open План, opens your to-do list, or a single command, vitis open LaTeX, opens Stolyarov's book about the LaTeX typesetting system, it becomes psychologically hard to go back to using the file system "the old-fashioned way".
This leads to a number of ideas. For example, you could make an automatic radio that plays themed music according to the current weather, holiday, day of the week, time of day, or season. Closer to the topic is a music player that knows about categories and can play music selected by an expression with set operations on categories. A daemon that watches the Downloads directory and suggests categorizing new files would be useful. And, of course, there should be a proper graphical semantic file manager. I once even built a web service for sharing files within a company, but it was not a priority and became irrelevant, although it had reached a fairly functional state. (Due to major changes in Vitis, it is no longer usable.)
here is a little demo
Conclusion
Vitis is not the first attempt to radically change the way we work with data, but I considered it important to implement my ideas and make the implementation publicly available under the GNU GPL. For convenience there is a deb package for x86-64, which should work on all modern Debian-based distributions. There were minor difficulties on ARM (although all the other Vitis-related programs work fine there), but a working package for this platform (armhf) will be built in the future. I have stopped building RPM packages for now because of problems on Fedora 30 and the difficulty of targeting many RPM-based distributions, but packages for at least a couple of them will appear later. Until then, you can use make && make install or checkinstall.
Thank you all for your attention! I hope this article and this project can be useful.
Source: habr.com