BlazingSQL is suited to running single analytical queries over large data sets (tens of gigabytes) stored in tabular formats (for example, logs, NetFlow statistics, etc.). BlazingSQL can execute queries against raw CSV and Apache Parquet files hosted on network and cloud file systems such as HDFS and AWS S3, transferring the result directly to GPU memory. By parallelizing operations on the GPU and using faster video memory, BlazingSQL queries run up to
To work with the GPU, it uses the RAPIDS set of libraries, developed with the participation of NVIDIA.
BlazingSQL provides the ability to use SQL instead of a data processing API
BlazingSQL greatly simplifies working with data: instead of hundreds of cuDF function calls, a single SQL query is enough. Using SQL makes it possible to integrate RAPIDS with existing analytics systems without writing specific handlers and without loading the data into an additional intermediate DBMS, while maintaining full compatibility with all parts of RAPIDS, translating existing functionality into SQL, and providing cuDF-level performance. Including support for integration with libraries
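As a rough illustration of the "one SQL query instead of many cuDF calls" point, the sketch below registers a file as a table and aggregates it with a single query. It assumes a machine with an NVIDIA GPU and the `blazingsql` package installed. The table name `netflow`, the columns `src_ip` and `n_bytes`, and the helper `top_talkers` are hypothetical names chosen for the example and do not come from the article.

```python
# Sketch only: needs an NVIDIA GPU with RAPIDS and blazingsql installed.
# Table name, column names, and the helper below are hypothetical.

QUERY = """
    SELECT src_ip, SUM(n_bytes) AS total_bytes
    FROM netflow
    GROUP BY src_ip
    ORDER BY total_bytes DESC
    LIMIT 10
"""


def top_talkers(path):
    """Run QUERY against a CSV or Parquet file and return the result."""
    # Imported lazily so this module can still be loaded without a GPU.
    from blazingsql import BlazingContext

    bc = BlazingContext()             # SQL context backed by the GPU
    bc.create_table("netflow", path)  # register the file as a queryable table
    return bc.sql(QUERY)              # result comes back as a cuDF DataFrame
```

Calling `top_talkers("netflow.parquet")` (a hypothetical file) would return a GPU-resident cuDF DataFrame that can be passed on to other RAPIDS libraries without copying it back to host memory.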
Source: opennet.ru