Terality, simple explanation

Mehmet Akif Cifci
Feb 11, 2022

Terality is a cloud-based, serverless data processing engine aimed at engineers and data scientists. It runs on large clusters and scales computational resources for you, so there is no infrastructure to maintain. You can process very large datasets quickly without managing the clusters yourself.

Terality main and side engines. (Source: t.ly/Bvwb)

There are two implications to this:

1- The dataset size is almost unrestricted, because the data lives in the large pool of RAM on Terality's clusters rather than on your machine.

2- Even on a machine with 4 GB of RAM, you can work with hundreds of gigabytes of data, provided you have a good Internet connection.
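To make the second point concrete, here is a rough back-of-the-envelope sketch of why such a dataset would not fit locally. The column count (50), the 8 bytes per value, and the 50% headroom factor are assumptions chosen for illustration, not figures from Terality; the function names are hypothetical.

```python
def estimate_frame_bytes(n_rows: int, n_numeric_cols: int, bytes_per_value: int = 8) -> int:
    """Rough in-memory size of an all-numeric table (ignores index and object overhead)."""
    return n_rows * n_numeric_cols * bytes_per_value


def fits_in_ram(frame_bytes: int, ram_gb: float, headroom: float = 0.5) -> bool:
    """True if the table fits within a fraction (headroom) of the local RAM."""
    return frame_bytes <= ram_gb * 1024**3 * headroom


# Hypothetical example: 60 million rows with 50 float64 columns (assumed shape).
size = estimate_frame_bytes(60_000_000, 50)
print(f"Estimated size: {size / 1024**3:.1f} GiB")
print("Fits on a 4 GB machine:", fits_in_ram(size, ram_gb=4))
```

Under these assumptions the table needs several times more memory than a 4 GB machine has, which is the situation where pushing the work to a remote engine pays off.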

Terality’s main selling point is that it uses the same syntax as the Pandas Python package. To switch from Pandas to Terality, all you have to do is change one line of code: the import.

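import pandas as pd

# Load the original training data
df = pd.read_csv("data/train.csv")

# Sample with replacement to build a 60-million-row dataset
large_df = df.sample(6 * 10 ** 7, replace=True)

# Write the result to a Parquet file using the PyArrow engine
large_df.to_parquet(
    "data/tps_may_large.parquet", row_group_size=len(df) // 15, engine="pyarrow"
)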
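The snippet above is written against plain Pandas. As a minimal sketch of the one-line change, the helper below picks whichever DataFrame module is importable and binds it to the usual `pd` alias. It assumes the package is importable under the name `terality`; `pick_dataframe_module` is a hypothetical helper written for this example, not part of either library.

```python
import importlib


def pick_dataframe_module(candidates=("terality", "pandas")):
    """Return the first importable module from candidates, or None if none import."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


# Pandas-only code:     import pandas as pd
# The one-line change:  import terality as pd   (assumed package name)
pd = pick_dataframe_module()

if pd is None:
    print("Neither terality nor pandas is installed")
else:
    print(f"Using {pd.__name__} as the DataFrame backend")
```

Because the rest of the script only ever refers to `pd`, nothing after the import needs to change.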

When you call a Pandas function, the Python module sends an HTTPS request to the Terality engine. The engine processes the inputs and returns the results.
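The request and response flow can be pictured with a toy example. This is only an illustration of the thin-client idea: the JSON payload shape, `build_request`, and `fake_engine` are all invented for this sketch and are not Terality's actual protocol or API. A local function stands in for the remote engine, where a real client would send the payload over HTTPS.

```python
import json


def build_request(method: str, *args, **kwargs) -> str:
    """Client side: describe a pandas-style call as a JSON payload (hypothetical format)."""
    return json.dumps({"method": method, "args": list(args), "kwargs": kwargs})


def fake_engine(payload: str) -> str:
    """Stand-in for the remote engine: run the requested call and return a JSON result."""
    call = json.loads(payload)
    handlers = {
        "sum": lambda values: sum(values),
        "count": lambda values: len(values),
    }
    handler = handlers.get(call["method"])
    if handler is None:
        return json.dumps({"ok": False, "error": f"unsupported method: {call['method']}"})
    return json.dumps({"ok": True, "result": handler(*call["args"], **call["kwargs"])})


# The client never computes anything itself; it only ships the call and reads the reply.
response = json.loads(fake_engine(build_request("sum", [1, 2, 3])))
print(response)
```

The point of the design is that the heavy work happens on the engine's side, so the local machine only needs enough memory to hold the request and the returned result.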

Setup takes about a minute: you install the library with pip and enter your API key to link your computer to the engine.

Is Terality FREE to use?

Terality features a free plan with which you may process up to 500 GB of data each month. Customers with more demanding needs may upgrade to a premium plan.

Data Scientists, in particular, will benefit greatly from the free plan, which will be the primary emphasis of this essay.

You can follow me on Twitter and YouTube.


Mehmet Akif Cifci holds the position of associate professor in the field of computer science at TU Wien, located in Austria.