DataPrep.Connector aims to simplify data collection from Web APIs and databases by providing a standard set of operations.


Connector wraps complex API calls in a set of easy-to-use Python functions. With Connector, you can skip the complex API configuration process and query different Web APIs in a few steps, so you can go straight to the analysis workflow you are already familiar with.

Watch our introduction at the PyGlobal Conference here.

Connector offers essential features to facilitate the process of collecting data, for example:

  • Concurrency: Collect data from websites in parallel, which speeds up large requests.

  • Pagination: Retrieve more rows for a query without dealing with the details of each API's pagination scheme.

  • Authorization: Access more Web APIs quickly, including those that require authorization.
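
As a rough sketch of how these features surface in code, the snippet below queries the publication table of dblp. The `connect` and `query` names and the `_concurrency`, `_count`, and `_auth` keywords follow the DataPrep documentation as best recalled, so treat them as assumptions to check against your installed version. The `fetch_publications` helper and its default values are hypothetical and exist only for illustration.

```python
async def fetch_publications(keyword, limit=100, concurrency=3, auth=None):
    """Fetch up to `limit` dblp publications matching `keyword` (illustrative helper)."""
    # Imported inside the function so this module loads even without dataprep installed.
    from dataprep.connector import connect

    # Concurrency: number of requests Connector may issue in parallel.
    connect_kwargs = {"_concurrency": concurrency}
    if auth is not None:
        # Authorization: credentials for APIs that require them (dblp does not).
        connect_kwargs["_auth"] = auth

    conn = connect("dblp", **connect_kwargs)
    # Pagination: request `limit` rows and let Connector issue the page requests.
    return await conn.query("publication", q=keyword, _count=limit)
```

In a notebook this could be called as `df = await fetch_publications("CVPR 2020", limit=500)`; in a plain script, wrap the call in `asyncio.run(...)`.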

The user guide first walks through a case study on dblp to give an overview of the process, and then explains each functionality in detail in the following sections.


Connector wraps connectorx to let users fetch data from databases with SQL queries. The result of a query is returned as a Python dataframe.

Supported databases:

  • Postgres

  • MySQL

  • SQLite

  • SQL Server

  • Oracle

  • Redshift (through the Postgres protocol)

  • ClickHouse (through the MySQL protocol)
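
A minimal sketch of the database path follows. It assumes that `dataprep.connector` exposes `read_sql(conn, query)` and that connection strings use a URI scheme such as `postgresql://` or `mysql://`, as in connectorx; both are worth verifying against the installed version. The `build_conn_uri` and `load_rows` helpers, the credentials, and the table name are hypothetical.

```python
from urllib.parse import quote


def build_conn_uri(scheme, user, password, host, port, database):
    """Assemble a connection URI, percent-encoding the credentials."""
    # safe="" also encodes "/" and "@", which would otherwise break the URI.
    return (
        f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


def load_rows(conn_uri, sql):
    """Run `sql` against the database and return the result as a dataframe."""
    # Imported lazily so this module loads even when dataprep is not installed.
    from dataprep.connector import read_sql

    return read_sql(conn_uri, sql)
```

For example, `load_rows(build_conn_uri("postgresql", "alice", "p@ss/word", "localhost", 5432, "sales"), "SELECT * FROM orders")` would query a hypothetical `orders` table. Percent-encoding the credentials keeps characters such as `@` and `/` in a password from being misread as URI delimiters.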