My eight-year quest to digitize 45 videocassettes. Part 1

Over the past eight years, I have moved this box of videotapes between four different apartments and one house. They are family videos from my childhood.

After more than 600 hours of work, I finally digitized and organized them properly so that the cassettes can be thrown away.

Here's what the footage looks like now:

All family videos are digitized and available for viewing from a private media server

This resulted in 513 individual video clips. Each has a title, a description, a recording date, and tags for everyone who appears in it, along with their ages at the time of recording. Everything is on a private media server that only family members can access, and hosting costs less than $1 a month.

This article covers everything I did, why it took eight years, and how to achieve the same result much more easily and quickly.

First naive attempt

Around 2010, my mom bought some kind of VHS to DVD converter and ran all of our home videos through it.

The original DVDs my mom recorded (don't know what happened to the missing letters)

The problem is, Mom only made one set of DVDs. All relatives live in different states, so it was inconvenient to pass discs around.

In 2012, my sister gave me these DVDs. I copied the video files and uploaded everything to cloud storage. Problem solved!

DVD rips of family videos in Google Cloud storage

A few weeks later I asked whether anyone had watched the videos. It turned out that no one had. Even I hadn't. In the era of YouTube, downloading three-hour files of unknown content in search of interesting footage is foolish.

Only my mother was delighted: "Great," she said, "now can we finally throw away all these cassettes?"

Uh-oh. This is a terrible question. What if we missed some recordings? What if the tapes could be digitized at higher quality? What if the labels contain important information?

I've always felt uncomfortable throwing away originals until there's absolute certainty that the video is copied to the highest possible quality. Thus, I had to get down to business.

I didn't even know what I was getting into.

Doesn't sound so hard

If you don't understand why it took me eight years and hundreds of hours, I don't blame you. I also thought it would be easy.

Here is what the digitization process looks like from start to finish:

More precisely, this is how it looks in theory. Here's how it turned out in practice:

Most of the time was spent reworking what had already been done. I would finish one stage, and then one or two stages later I would find a flaw in my technique and have to go back and redo it. For example, I captured video from 20 tapes before I realized that the audio was slightly out of sync. Or, after weeks of editing, I discovered that I had been exporting video in a format that didn't support streaming on the web.

To save the reader's sanity, I'm laying out the process as if it moved forward systematically, so that you don't have to constantly jump back and redo everything, as I had to.

Step 1. Capture video

Okay, back to 2012. Mom really wanted to throw away the cassettes she had kept for twenty years, so the next time we saw each other, she immediately handed me a huge cardboard box. Thus began my quest to digitize.

The obvious decision was to entrust the work to professionals. Many companies are engaged in digitization, and some specialize specifically in home video.

But I'm quite sensitive about privacy, and I didn't want strangers viewing our family videos with intimate moments of my personal life, including my potty training (at the right age; nothing strange!). I also assumed that digitization couldn't be that complicated.

Spoiler: it turned out to be really difficult.

First attempt at capturing video

My father still had the family's old VCR, so I asked him to dig it out of the basement for the next family dinner. I bought a cheap RCA-to-USB adapter on Amazon and got down to business.

TOTMC Video Capture Device, the first of many A/V devices I bought during a multi-year quest

To process video from the USB capture device, I used VirtualDub. The 2012 version is a little outdated, but not critically so.

Frames in the VirtualDub program, as I read a book to my father at the age of four

Attack with sound distortion

When I started the editing process, I noticed that the audio and video were slightly out of sync. Okay, no problem. I can shift the sound a little.

Ten minutes further into the tape, the sound was out of sync again. Hadn't I shifted it enough the first time?

It gradually dawned on me that audio and video are not just out of sync, they are actually recorded at different speeds. Throughout the tape, they diverge more and more. To synchronize, I had to manually adjust the sound every few minutes.

If your setup captures audio and video at different rates, then the only solution is to manually correct the audio every few minutes
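
To see why a one-time nudge never holds, it helps to work through the arithmetic. The sketch below is only an illustration: the durations are made-up numbers, the function names are hypothetical, and it assumes the mismatch is a constant rate difference, which may not describe every capture setup.

```python
def rate_mismatch(audio_seconds: float, video_seconds: float) -> float:
    """Ratio of captured audio length to captured video length.

    A value above 1.0 means the audio track runs long, so the sound
    falls further behind the picture as the tape plays.
    """
    if audio_seconds <= 0 or video_seconds <= 0:
        raise ValueError("durations must be positive")
    return audio_seconds / video_seconds


def drift_at(elapsed_seconds: float, mismatch: float) -> float:
    """Audio lag in seconds (negative = audio early) after elapsed_seconds of video."""
    return elapsed_seconds * (mismatch - 1.0)


# Hypothetical capture: one hour of video, audio track 1.8 seconds longer.
mismatch = rate_mismatch(3601.8, 3600.0)
for minutes in (1, 10, 30, 60):
    lag_ms = drift_at(minutes * 60, mismatch) * 1000
    print(f"after {minutes:2d} min: audio lags by about {lag_ms:.0f} ms")
```

Under that assumption, an offset that looks right at the start of the tape is off by hundreds of milliseconds ten minutes later, which matches the experience of having to re-align the sound again and again.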

Can you imagine how difficult it is to tell whether sound arrives 10 milliseconds early or 10 milliseconds late? It's really hard! Judge for yourself.

In this video, I am playing with my poor, patient kitten, whose name is Black Magic. The sound is slightly out of sync. Can you tell whether it is ahead of the picture or behind it?


An example of a video clip with sound and picture out of sync

Here is the moment when Black Magic jumps, slowed down five times:


Sound and picture out of sync, five times slower

Answer: The sound arrives a few milliseconds late.

Maybe spend an extra hundred dollars instead of hundreds of hours of personal time?

The sound correction alone required many hours of tedious, maddening work. Eventually it occurred to me that the desync might be avoided with a better, more expensive video capture device. After some research, I bought a new one on Amazon:

My second video capture device purchase

Even with the new device, the desync did not go away.

VCR with the prefix "super"

Maybe the problem was the VCR. People on digitization forums said there would be no desync on a VCR with a "time base corrector" (TBC), a feature they said was available on all Super VHS (S-VHS) VCRs.

Well, of course! Why was I messing around with a stupid ordinary VCR when there was a super VCR that would solve the problem?

No one makes S-VHS VCRs anymore, but they are still available on eBay. For $179, I bought a JVC SR-V10U model, which seems to be well suited for VHS digitization:

Vintage JVC SR-V10U VCR I bought on eBay for $179

The "super" VCR arrived in the mail. After several months of struggling with out-of-sync audio, I was overjoyed that there was equipment that would solve all my problems.

I opened the box, connected everything - but the sound was still recorded at a different speed. Eh.

Tedious search, troubleshooting and years of struggle

I embarked on a pathetic attempt at troubleshooting. It was painful to watch. Each time I pulled all the equipment out of the closet, crawled on my knees behind the desktop to connect everything, tried to capture video - and again watched that nothing worked.

I came across a random forum post from 2008 about installing some strange unsigned Chinese driver... It was a terrible idea, but I was desperate. It didn't help, though.

I tried different digitizing programs. I bought a special VHS cassette to clean the VCR's magnetic heads. I bought a third video capture device. Nothing helped.

I invariably gave up, unplugged everything, and hid the equipment in a closet for a few more months.

Surrender and give cassettes to professionals

Then came 2018. I had moved the videotapes and tons of equipment between four different apartments and was about to move from New York to Massachusetts. I couldn't find the strength to haul them along yet again, because I had already realized that I would never finish this project on my own.

I asked the family whether they would be OK with handing the cassettes over to a digitization firm. Fortunately, no one objected - everyone wanted to see the recordings again.

Me: But that means some company will have access to all of our home videos. Are you OK with that?
Sister: Yes, I don't care. You're the only one who's worried. Wait, so you could have just paid someone in the first place?
Me: Uh-uh…

Digitizing all 45 cassettes cost $750. It seems expensive, but by then I would have paid anything to not have to deal with this equipment anymore.

When they handed over the files, the video quality was definitely better. In my captures, distortions were always visible at the edges of the frame, but the specialists digitized everything without any distortion at all. Most importantly, the audio and video were perfectly in sync.

Here is a video comparing professional digitizing and my homegrown attempts:


Comparison of professional and homemade digitization in the video where my mother films my first attempt at programming

Step 2. Editing

In home shoots, about 90% of the material is boring, 8% is interesting, and 2% is amazing. After digitizing, you still have a lot of work to do.

Editing in Adobe Premiere

On a VHS cassette, a long stream of video clips is interspersed with blank sections. To edit a tape, you must determine where each clip starts and ends.

For editing, I used Adobe Premiere Elements, which costs less than $100 for a lifetime license. Its most important feature is a scalable timeline. It lets you quickly find the edges of a scene and then zoom in to find the exact video frame where the clip begins or ends.

Essential zoom timeline in Adobe Premiere Elements

The problem with Premiere is that the process requires constant manual steps, while digitizing and exporting take a long time. Here is my sequence of operations:

  1. Open a raw file that contains 30-120 minutes of video.
  2. Mark the boundaries of an individual clip.
  3. Export clip.
  4. Wait 2-15 minutes for the export to complete.
  5. Repeat steps 2-4 until the tape runs out.

The long wait meant I was constantly switching back and forth between video editing and some other task, shifting my attention back and forth for hours.

Another disadvantage was non-reproducibility. Fixing a small mistake was almost as difficult as starting from scratch. It hit me hard when it came to posting a video. Only then did I realize that in order to stream on the Internet, it was necessary to initially export the video to a format that web browsers natively support. I was faced with a choice: restart the tedious process of exporting hundreds of clips, or re-encode the exported videos to another format with degraded quality.

Editing automation

After a lot of time spent on manual work, I wondered whether AI could be applied here somehow. Determining the boundaries of clips seemed like a suitable task for machine learning. I knew the accuracy would not be perfect, but it could do at least 80% of the work, and I would fix the last 20%.

I experimented with a tool called pyscenedetect, which parses video files and outputs timestamps where scene changes occur:

 $ docker run \
     --volume "/videos:/opt" \
     handflucht/pyscenedetect \
     --input /opt/test.mp4 \
     --output /opt \
     detect-content --threshold 80 \
     list-scenes
[PySceneDetect] Output directory set:
  /opt
[PySceneDetect] Loaded 1 video, framerate: 29.97 FPS, resolution: 720 x 480
[PySceneDetect] Downscale factor set to 3, effective resolution: 240 x 160
[PySceneDetect] Scene list CSV file name format:
  $VIDEO_NAME-Scenes.csv
[PySceneDetect] Detecting scenes...
[PySceneDetect] Processed 55135 frames in 117.6 seconds (average 468.96 FPS).
[PySceneDetect] Detected 33 scenes, average shot length 55.7 seconds.
[PySceneDetect] Writing scene list to CSV file:
  /opt/test-Scenes.csv
[PySceneDetect] Scene List:
-----------------------------------------------------------------------
 | Scene # | Start Frame |  Start Time  |  End Frame  |   End Time   |
-----------------------------------------------------------------------
 |      1  |           0 | 00:00:00.000 |        1011 | 00:00:33.734 |
 |      2  |        1011 | 00:00:33.734 |        1292 | 00:00:43.110 |
 |      3  |        1292 | 00:00:43.110 |        1878 | 00:01:02.663 |
 |      4  |        1878 | 00:01:02.663 |        2027 | 00:01:07.634 |
 ...

The tool did show an accuracy of about 80%, but checking its work took more time than it saved. However, pyscenedetect led me to one of the most important discoveries of the entire project: defining scene boundaries and exporting clips are separate tasks.
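
The frame numbers and timecodes in the scene list are easy to work with in code. Here is a minimal Python sketch, assuming the HH:MM:SS.mmm format and the 29.97 FPS frame rate shown in the output above. The function names are hypothetical and are not part of pyscenedetect.

```python
FPS = 29.97  # frame rate reported in the pyscenedetect output above


def timecode_to_seconds(timecode: str) -> float:
    """Convert 'HH:MM:SS.mmm' into seconds."""
    hours, minutes, seconds = timecode.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def seconds_to_timecode(seconds: float) -> str:
    """Convert seconds into 'HH:MM:SS.mmm', rounded to the millisecond."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


# Scene 2 from the table: frames 1011 to 1292.
start = seconds_to_timecode(1011 / FPS)
end = seconds_to_timecode(1292 / FPS)
length = timecode_to_seconds(end) - timecode_to_seconds(start)
print(f"scene 2: {start} to {end} ({length:.3f} s)")
```

Once the boundaries are plain numbers like these, they can be stored, corrected by hand, and reused without touching the video itself.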

I remembered that I am a programmer

Up to this point, I considered everything I did in Adobe Premiere to be "editing". Cutting clips from the raw footage seemed to go hand in hand with finding the boundaries of a clip, because that's how Premiere framed the task. When pyscenedetect printed out the metadata table, it made me realize that I could separate scene search from video export. It was a breakthrough.

The reason editing was so tedious and time-consuming was that I had to wait while Premiere exported each clip. If I recorded the metadata in a spreadsheet and wrote a script that exported the video automatically, the editing process would fly by.

Moreover, a spreadsheet greatly expanded the scope of the metadata. Initially, I crammed metadata into the file name, but that was limiting. Having an entire spreadsheet allowed me to catalog much more information about each clip, such as who was in it, when it was recorded, and any other data I wanted to show alongside the video.

Giant spreadsheet with metadata about my home videos

Later, I was able to use this metadata to add information to the clips, such as how old we all were and a detailed description of what's going on in the clip.

Spreadsheet functionality allows you to record metadata that gives more information about clips and makes them easier to view
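
Deriving ages from the spreadsheet is simple date arithmetic. Here is a minimal sketch, assuming each person's birth date and each clip's recording date are known. The function name and the dates are hypothetical.

```python
from datetime import date


def age_on(birth: date, recorded: date) -> int:
    """Age in whole years on the day the clip was recorded."""
    if recorded < birth:
        raise ValueError("recording date is before birth date")
    years = recorded.year - birth.year
    # Subtract a year if the birthday had not happened yet that year.
    if (recorded.month, recorded.day) < (birth.month, birth.day):
        years -= 1
    return years


# Hypothetical example: born 1990-06-15, clip recorded the day before
# and on the fourth birthday.
print(age_on(date(1990, 6, 15), date(1994, 6, 14)))
print(age_on(date(1990, 6, 15), date(1994, 6, 15)))
```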

Success of the automated solution

With the spreadsheet in place, I wrote a script that sliced the raw video into clips based on the CSV data.

Here's what it looks like in action:

By that point, I had spent hundreds of hours tediously selecting clip boundaries in Premiere, hitting export, waiting a few minutes for it to finish, and then starting over. Not only that, I had repeated the process multiple times on the same clips whenever quality issues were discovered later.

As soon as I automated the clip slicing, a huge weight fell off my shoulders. I no longer had to worry about forgetting the metadata or choosing the wrong output format. If an error came up later, I could simply tweak the script and run everything again.
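
A script along these lines could look like the sketch below. It is not the actual script from this project: the CSV column names (source_file, start_seconds, end_seconds, output_file) and the function names are assumptions, and it relies on ffmpeg being installed. The encoder settings (H.264 video, AAC audio, yuv420p pixels, faststart) are a common choice for playback in web browsers, not something this article specifies.

```python
import csv
import io
import subprocess


def build_ffmpeg_command(source: str, start: float, end: float, output: str) -> list[str]:
    """Build an ffmpeg command that cuts [start, end) out of source."""
    if end <= start:
        raise ValueError("end must be after start")
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-ss", f"{start:.3f}",        # seek to the clip start
        "-t", f"{end - start:.3f}",   # clip duration in seconds
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-movflags", "+faststart",    # lets playback start before the full download
        output,
    ]


def commands_from_csv(csv_text: str) -> list[list[str]]:
    """Turn spreadsheet rows (exported as CSV text) into ffmpeg commands."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        build_ffmpeg_command(
            row["source_file"],
            float(row["start_seconds"]),
            float(row["end_seconds"]),
            row["output_file"],
        )
        for row in reader
    ]


def export_all(csv_path: str) -> None:
    """Run ffmpeg once per row; stops on the first failure."""
    with open(csv_path, newline="") as handle:
        for command in commands_from_csv(handle.read()):
            subprocess.run(command, check=True)
```

Because the clip boundaries live in the CSV rather than in an editor project, changing the output format means editing one function and calling export_all again, with no manual re-cutting.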

Part 2

Digitizing and editing video footage is only half the battle. We still need to find a convenient option for publishing on the Internet so that all relatives can watch the family video in a convenient format with streaming like on YouTube.

In the second part of the article, I will detail how to set up an open source media server with all video clips, which costs me only 77 cents per month.

Source: habr.com