Google releases data and machine learning model for splitting sounds

Google has published an annotated reference database of blended sounds that can be used in machine learning systems designed to separate arbitrary sound mixtures into individual components. Also published is a generic deep learning model (TDCN++) that can be used with TensorFlow to separate sounds. The data was prepared from the freesound.org collection and is published under the CC BY 4.0 license.

The presented project, FUSS (Free Universal Sound Separation), aims to solve the problem of separating any number of arbitrary sounds whose nature is not known in advance. Other such systems are generally limited to separating specific sound classes, such as voice from non-voice, or distinguishing different speakers.

The database contains about 20 thousand mixtures. The suite also includes pre-computed room impulse responses prepared with a custom-built room simulator that accounts for wall reflections, sound source position, and microphone position.
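To illustrate how such reverberant mixtures are typically constructed, the sketch below convolves each dry source with a room impulse response and sums the results. This is a minimal NumPy illustration of the general technique, not the actual FUSS pipeline; the signal lengths and impulse responses are invented for the example.

```python
import numpy as np

def simulate_mixture(sources, rirs, mix_len):
    """Convolve each dry source with its room impulse response (RIR)
    and sum the reverberant signals into a single mixture."""
    mixture = np.zeros(mix_len)
    for src, rir in zip(sources, rirs):
        # Convolution applies the room's echo pattern to the dry signal.
        reverberant = np.convolve(src, rir)[:mix_len]
        mixture[:len(reverberant)] += reverberant
    return mixture

# Toy example: two one-second "sources" at 16 kHz and two short
# decaying impulse responses standing in for simulated RIRs.
rng = np.random.default_rng(0)
sources = [rng.standard_normal(16000) for _ in range(2)]
rirs = [np.array([1.0, 0.5, 0.25]), np.array([1.0, 0.3])]
mix = simulate_mixture(sources, rirs, 16000)
print(mix.shape)  # (16000,)
```

A separation model such as TDCN++ is then trained to recover the individual dry or reverberant sources from a mixture like `mix`.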

Source: opennet.ru