Sharing files from Google Drive using nginx

Backstory

It so happened that I needed to store more than 1.5 TB of data somewhere, and also let ordinary users download it via a direct link. Traditionally, volumes like that go onto a VDS, but the rental cost does not really fit the budget of a project from the “nothing better to do” category. All I had to start with was a VPS with a 400 GB SSD, where, with all the will in the world, 1.5 TB of images without lossless compression simply will not fit.

And then I remembered that if I cleared the junk out of my Google Drive, such as programs that only run on Windows XP and other things that have been wandering from one medium to another since the days when the Internet was neither that fast nor unlimited (those 10-20 versions of VirtualBox, for example, hardly had any value beyond nostalgia), everything should fit nicely. No sooner said than done. And so, once I had pushed through the limit on the number of API requests (by the way, technical support raised the quota of requests per user per 100 seconds to 10 without any problems), the data quickly flowed to its new home.

Everything seemed fine, but now the data had to be delivered to the end user. And without any redirects to other resources: a person should simply press the “Download” button and become the happy owner of the treasured file.

Here, honestly, I went off the deep end. First there was an AmPHP script, but I was not happy with the load it created (a sharp spike at the start, up to 100% of a core). Then a curl wrapper for ReactPHP came into play, which fit my wishes for CPU cycles consumed but delivered nowhere near the speed I wanted (it turned out you can simply reduce the curl_multi_select call interval, but then you get the same gluttony as with the first option). I even tried writing a small service in Rust, and it worked quite fast (it is surprising that it worked at all, given my knowledge), but I wanted more, and it was not easy to customize. On top of that, all these solutions buffered the response in some strange way, and I wanted to know the moment the file download finished as precisely as possible.

All in all, for a while it worked, if crookedly. Until one day I came up with an idea wonderful in its craziness: nginx, in theory, can do what I want, it is fast, and it even allows all sorts of tricks with its configuration. I had to try: would it work? After half a day of persistent searching, a solution was born that has now been running stably for several months and meets all my requirements.

Setting up NGINX

# First of all, create a separate location in our site's config.
location ~* ^/google_drive/(.+)$ {

    # And hide it from prying eyes (hands, feet and other body parts).
    internal;

    # Limit users' download speed to something reasonable (I am all for equality).
    limit_rate 1m;

    # So that nginx can find the Google Drive servers, give it a resolver address.
    resolver 8.8.8.8;

    # Build the URL of our file (the file ID is passed in later via headers).
    set $download_url https://www.googleapis.com/drive/v3/files/$upstream_http_file_id?alt=media;

    # And the Content-Disposition header; the file name is likewise passed in via headers.
    set $content_disposition 'attachment; filename="$upstream_http_filename"';

    # Forbid buffering the response to disk.
    proxy_max_temp_file_size 0;

    # And, importantly, pass the header with the token (I don't know why, but I could not pass the token via the $http_upstream headers. Or rather, passing it worked, but it most likely needs escaping somewhere, because Google returns an authorization error).
    proxy_set_header Authorization 'Bearer $1';

    # That's it; all that is left is to send the request to Google at the URL we built earlier.
    proxy_pass $download_url;

    # And so that the user sees the correct file name when downloading, add the corresponding header.
    add_header Content-Disposition $content_disposition;

    # Optionally, strip the Google headers we do not need.
    proxy_hide_header Content-Disposition;
    proxy_hide_header Alt-Svc;
    proxy_hide_header Expires;
    proxy_hide_header Cache-Control;
    proxy_hide_header Vary;
    proxy_hide_header X-Goog-Hash;
    proxy_hide_header X-GUploader-UploadID;
}

A short version without comments:

location ~* ^/google_drive/(.+)$ {
    internal;
    limit_rate 1m;
    resolver 8.8.8.8;
    
    set $download_url https://www.googleapis.com/drive/v3/files/$upstream_http_file_id?alt=media;
    set $content_disposition 'attachment; filename="$upstream_http_filename"';
    
    proxy_max_temp_file_size 0;
    proxy_set_header Authorization 'Bearer $1';
    proxy_pass $download_url;
    
    add_header Content-Disposition $content_disposition;
    
    proxy_hide_header Content-Disposition;
    proxy_hide_header Alt-Svc;
    proxy_hide_header Expires;
    proxy_hide_header Cache-Control;
    proxy_hide_header Vary;
    proxy_hide_header X-Goog-Hash;
    proxy_hide_header X-GUploader-UploadID;
}
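
Because the location above is marked internal, a client can never request it directly; nginx only enters it when a backend response carries an X-Accel-Redirect header. As a rough sketch of the missing piece, a public entry point that hands the request to a PHP backend might look like the fragment below. The /download path, the script path and the PHP-FPM address are all assumptions made up for illustration and are not taken from the original setup.

```nginx
# Hypothetical public entry point (path, script location and
# PHP-FPM address are assumptions, adjust them to your setup).
location = /download {
    include fastcgi_params;

    # Assumed location of the PHP script shown in the next section.
    fastcgi_param SCRIPT_FILENAME /var/www/example/download.php;

    # Assumed PHP-FPM listener. When its response contains
    # X-Accel-Redirect, nginx performs the internal redirect
    # into the /google_drive/ location.
    fastcgi_pass 127.0.0.1:9000;
}
```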

Writing a script to manage all this happiness

The example is in PHP and deliberately written with minimal trimmings. I think anyone with experience in any other language will be able to adapt this part using my example.

<?php

# Token for the Google Drive API.
define('TOKEN', '*****');

# ID of the file on Google Drive.
$fileId = 'abcdefghijklmnopqrstuvwxyz1234567890';

# Optional, but since we do not send any body, why not?
http_response_code(204);

# Set a header with the file ID (in the nginx config we will later read it as $upstream_http_file_id).
header('File-Id: ' . $fileId);
# And a header with the file name ($upstream_http_filename, respectively).
header('Filename: ' . 'test.zip');
# Internal redirect. The URL also carries the token, the same one that nginx reads from $1.
header('X-Accel-Redirect: ' . rawurlencode('/google_drive/' . TOKEN));
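
In a real project the file ID and file name will usually come from a database or from user input rather than from constants, so it may be worth cleaning them before they end up in response headers. The sketch below wraps the three headers from the script above in a helper. The function name buildDriveRedirectHeaders is hypothetical, the sanitizing rules are my own addition, and the ID check assumes that Drive file IDs consist only of letters, digits, "-" and "_".

```php
<?php

declare(strict_types=1);

/**
 * Builds the three response headers that the nginx location expects.
 * Hypothetical helper, not part of the original script.
 *
 * @return string[] header lines ready to be passed to header()
 */
function buildDriveRedirectHeaders(string $fileId, string $fileName, string $token): array
{
    // Assumption: Drive file IDs contain only letters, digits, "-" and "_".
    if (preg_match('/^[A-Za-z0-9_-]+$/', $fileId) !== 1) {
        throw new InvalidArgumentException('Unexpected characters in file ID');
    }

    // Remove control characters (including CR/LF), double quotes and
    // backslashes, so the name cannot break the header line or the
    // quoted filename="..." value that nginx builds from it.
    $safeName = (string) preg_replace('/[\x00-\x1F\x7F"\\\\]/', '', $fileName);

    return [
        'File-Id: ' . $fileId,
        'Filename: ' . $safeName,
        // Same encoding as in the script above: the whole path is URL-encoded.
        'X-Accel-Redirect: ' . rawurlencode('/google_drive/' . $token),
    ];
}
```

To use it, call http_response_code(204) as before and pass each returned line to header() in a foreach loop.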

Results

In general, this method makes it quite easy to serve files to users from any cloud storage, even from Telegram or VK (provided the file size does not exceed the limit allowed by that storage). I had an idea along those lines, but unfortunately some of my files are as large as 2 GB, and I have not yet found a method or module for stitching together responses from upstream, while writing wrappers of some kind for this project would be unreasonably laborious.

Thank you for your attention. I hope my story was at least a little interesting or useful to you.

Source: habr.com
