How to create a simple SQL engine

This article aims to help developers build their own SQL engine. We simplify the storage system by using an in-memory table so that we can focus on the implementation of the engine.

Before diving into the implementation, let's briefly outline how to design a simple SQL engine using HybridSE:

  1. Design memory table storage

  2. Implement Catalog and TableHandler, e.g., SimpleCatalog and SimpleCatalogTableHandler.

  3. Build and execute engine

More details: simple_engine_demo

1. Memory table storage

typedef std::deque<std::pair<uint64_t, Row>> MemTimeTable;
typedef std::map<std::string, MemTimeTable> MemSegmentMap;
  • Our memory table supports multiple indexes, and each index is bound to its own MemSegmentMap.

  • MemSegmentMap is a map<Key, MemTimeTable> whose key is the index key string. Rows with the same key are collected together, ordered by time, and added to the same MemTimeTable.
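The grouping described above can be illustrated with a minimal, standalone sketch. It reuses the two typedefs, but Row is replaced by a std::string stand-in, InsertRow is a hypothetical helper that does not exist in HybridSE, and the newest-first ordering is an assumption (the text only says rows are ordered by time).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <deque>
#include <map>
#include <string>
#include <utility>

// Stand-in for the engine's row type (assumption for this sketch).
using Row = std::string;

typedef std::deque<std::pair<uint64_t, Row>> MemTimeTable;
typedef std::map<std::string, MemTimeTable> MemSegmentMap;

// Hypothetical helper: add a row under `key`, keeping that key's
// MemTimeTable sorted by timestamp, newest first (assumed order).
void InsertRow(MemSegmentMap* segments, const std::string& key,
               uint64_t ts, const Row& row) {
    assert(segments != nullptr);
    // operator[] creates an empty MemTimeTable the first time a key is seen.
    MemTimeTable& table = (*segments)[key];
    // Find the first entry that is not newer than the new row.
    auto pos = std::find_if(
        table.begin(), table.end(),
        [ts](const std::pair<uint64_t, Row>& e) { return e.first <= ts; });
    table.insert(pos, std::make_pair(ts, row));
}
```

With this layout, all rows for one index key sit in a single deque, so a window over that key is a contiguous scan.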

2. Catalog Implementation

To create a HybridSE engine for our own purposes, we have to implement a Catalog class specifically adapted to our storage system. The catalog holds the metadata of the dataset and defines a set of operations for accessing the memory table efficiently.

Fields

We use database_ and table_handler to maintain and manage databases and tables. type::Database is our database prototype, and SimpleCatalogTableHandler will be discussed later.

Functions

Here we list some of the implementations (more details can be found in simple_catalog.h and simple_catalog.cc).

  • Constructor

The constructor initializes enable_index, which enables or disables index-based optimization. The database and table metadata and data start out empty.

  • GetDatabases and GetTable

  • AddDatabase and InsertRows

Strictly speaking, AddDatabase and InsertRows aren't required for a Catalog class, but here we do need them to initialize the database and to prepare data.
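To make the shape of such a catalog concrete, here is a minimal sketch written against the standard library only. ToyCatalog, ToyDatabase and ToyTable are hypothetical stand-ins, not the HybridSE classes, and the method signatures are assumptions chosen for illustration; the real ones are in simple_catalog.h.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// All types below are hypothetical stand-ins for this sketch.
using Row = std::string;

struct ToyTable {
    std::string name;
    std::vector<Row> rows;
};

struct ToyDatabase {
    std::string name;
    std::map<std::string, std::shared_ptr<ToyTable>> tables;
};

class ToyCatalog {
 public:
    // The flag only records whether index-based optimization is wanted;
    // no databases or rows exist right after construction.
    explicit ToyCatalog(bool enable_index = false)
        : enable_index_(enable_index) {}

    // Register a database together with the tables it declares.
    void AddDatabase(const ToyDatabase& db) {
        assert(!db.name.empty());
        databases_[db.name] = db;
    }

    // Return the database, or nullptr when it is unknown.
    const ToyDatabase* GetDatabase(const std::string& db) const {
        auto it = databases_.find(db);
        return it == databases_.end() ? nullptr : &it->second;
    }

    // Return the table, or nullptr when the database or table is unknown.
    std::shared_ptr<ToyTable> GetTable(const std::string& db,
                                       const std::string& table) const {
        const ToyDatabase* database = GetDatabase(db);
        if (database == nullptr) return nullptr;
        auto it = database->tables.find(table);
        return it == database->tables.end() ? nullptr : it->second;
    }

    // Append rows to an existing table; returns false if it does not exist.
    bool InsertRows(const std::string& db, const std::string& table,
                    const std::vector<Row>& rows) {
        std::shared_ptr<ToyTable> target = GetTable(db, table);
        if (!target) return false;
        target->rows.insert(target->rows.end(), rows.begin(), rows.end());
        return true;
    }

    bool index_enabled() const { return enable_index_; }

 private:
    bool enable_index_;
    std::map<std::string, ToyDatabase> databases_;
};
```

The lookup methods are what an engine needs from a catalog; AddDatabase and InsertRows exist only so that a demo can populate it.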

Fields

Internally, the table handler (SimpleCatalogTableHandler) uses table_storage to maintain and manage the memory table storage described in section 1.

MemPartitionHandler is a TableHandler implementation. It makes it very convenient to implement GetWindowIterator.

At the same time, we also use full_table_storage_ to store the full table data. Although this costs extra memory, it makes GetIterator simple to implement.

Functions

Here we list some of the implementations (more details can be found in simple_catalog.h and simple_catalog.cc).

  • Constructor

At the very beginning, we have to initialize the table information, e.g., TableDef, IndexHint and Types.

  • Get table information

  • GetPartition and GetWindowIterator

MemPartitionHandler makes it easy to implement GetPartition() and GetWindowIterator().

More details: MemPartitionHandler::GetWindowIterator()

  • GetIterator

Then we can simply implement GetIterator by returning full_table_storage_->GetIterator().

More details: MemTableHandler::GetIterator()

  • GetCount and At

If we are not ready to support an operation, we can just return 0 or null. An error-reporting system is not supported currently.
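The handler functions above can be sketched in a standalone form as follows. ToyTableHandler is a hypothetical class, Row is a std::string stand-in, and the methods return plain containers rather than the iterator objects the real TableHandler interface uses; only the division of work between the indexed storage and the full-table copy is meant to carry over.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Row = std::string;  // stand-in row type (assumption)
typedef std::deque<std::pair<uint64_t, Row>> MemTimeTable;
typedef std::map<std::string, MemTimeTable> MemSegmentMap;

// Hypothetical handler: one MemSegmentMap per index plus a copy of all rows.
class ToyTableHandler {
 public:
    // `index_keys` maps each index name to this row's key in that index.
    // Rows are assumed to arrive already ordered by time.
    void AddRow(const std::map<std::string, std::string>& index_keys,
                uint64_t ts, const Row& row) {
        assert(!index_keys.empty());
        full_table_storage_.push_back(row);
        for (const auto& entry : index_keys) {
            table_storage_[entry.first][entry.second].emplace_back(ts, row);
        }
    }

    // Full scan: the extra copy of the data makes this trivial.
    const std::vector<Row>& GetIterator() const { return full_table_storage_; }

    // Per-key segments of one index, or nullptr if the index is unknown.
    const MemSegmentMap* GetWindowIterator(const std::string& index) const {
        auto it = table_storage_.find(index);
        return it == table_storage_.end() ? nullptr : &it->second;
    }

    // Unsupported operations return 0 or null instead of reporting an error.
    uint64_t GetCount() const { return 0; }
    const Row* At(uint64_t /*pos*/) const { return nullptr; }

 private:
    std::map<std::string, MemSegmentMap> table_storage_;
    std::vector<Row> full_table_storage_;
};
```

Keeping a second, unindexed copy doubles the memory used for rows, which is the trade-off mentioned above: the full scan needs no merging across segments.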

3. Build and execute engine

Build engine

  • Prepare Catalog

  • Prepare data

  • Configure EngineOptions

We simply use the default EngineOptions.

  • Build Engine

Compile and execute SQL

  • Compile SQL

  • Execute SQL
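The order of these steps can be shown with a toy, standalone sketch. Nothing here is the HybridSE API: ToyEngine, ToyEngineOptions, ToyPlan and the limit field are all hypothetical, the catalog is reduced to a map from table name to rows, and the "compiler" accepts only the exact form SELECT * FROM <table>. The real calls are in simple_engine_demo.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Row = std::string;  // stand-in row type (assumption)
// Toy catalog: table name -> rows.
using ToyCatalog = std::map<std::string, std::vector<Row>>;

// Hypothetical options; default-constructed means "no special settings".
struct ToyEngineOptions {
    std::size_t limit = 0;  // 0 means no row limit
};

// Result of compiling a statement: here just the table to scan.
struct ToyPlan {
    bool ok = false;
    std::string table;
};

class ToyEngine {
 public:
    ToyEngine(std::shared_ptr<ToyCatalog> catalog, ToyEngineOptions options)
        : catalog_(std::move(catalog)), options_(options) {
        assert(catalog_ != nullptr);
    }

    // "Compile": accept only `SELECT * FROM <table>` (uppercase keywords,
    // no trailing semicolon) and check the table against the catalog.
    ToyPlan Compile(const std::string& sql) const {
        std::istringstream in(sql);
        std::string select, star, from, table;
        ToyPlan plan;
        if ((in >> select >> star >> from >> table) && select == "SELECT" &&
            star == "*" && from == "FROM" && catalog_->count(table) > 0) {
            plan.ok = true;
            plan.table = table;
        }
        return plan;
    }

    // "Execute": run the compiled plan as a full table scan.
    std::vector<Row> Execute(const ToyPlan& plan) const {
        std::vector<Row> out;
        if (!plan.ok) return out;
        for (const Row& row : catalog_->at(plan.table)) {
            if (options_.limit != 0 && out.size() >= options_.limit) break;
            out.push_back(row);
        }
        return out;
    }

 private:
    std::shared_ptr<ToyCatalog> catalog_;
    ToyEngineOptions options_;
};
```

A caller would follow the same sequence as the list above: fill the catalog, construct default options, build the engine from both, compile a statement into a plan, and then execute the plan to obtain rows.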

4. Run SimpleEngineDemo
