Universal Python database client (supporting MySQL, PostgreSQL, MongoDB, or other databases with the same query language)

2018-01-03 03:11:19

I usually need to:

1. Read a dataset from a database (MySQL, MongoDB).
2. Split the dataset into several groups, then process or compute on each group.
3. Use multiprocessing or distributed workers to process the data.
4. Be able to stop, resume, and recover a task (this requires saving the task status and knowing how the data was split in step 1).
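To make the stop/resume requirement concrete, here is a minimal sketch of how task status could be saved. It assumes each split has a stable chunk id and that a local JSON file is good enough as the status store; the names `load_done`, `mark_done`, `run_all` and the file path are hypothetical, not from any existing library.

```python
import json
import os


def load_done(path):
    """Return the set of chunk ids already recorded as finished."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(json.load(f))


def mark_done(path, done, chunk_id):
    """Record one finished chunk; write to a temp file, then swap it in."""
    done.add(chunk_id)
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(sorted(done), f)
    os.replace(tmp, path)  # atomic rename, so a crash never leaves a half-written file


def run_all(chunk_ids, process, path):
    """Process every chunk not yet marked done; safe to call again after a crash."""
    done = load_done(path)
    for chunk_id in chunk_ids:
        if chunk_id in done:
            continue  # finished in an earlier run
        process(chunk_id)
        mark_done(path, done, chunk_id)
    return done
```

Calling `run_all` again with the same chunk ids and status file skips whatever finished before the interruption. This only works if the split is deterministic, which is why the splitting rule has to be known up front.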

One-off processing is easy, but the dataset is usually large. Splitting the dataset into tasks up front would also be slow, so it is better to generate split queries and run each one in its own worker.
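A rough sketch of what I mean by "generate split queries and run them in each worker", assuming the table has an integer `id` key and using SQLite only so the example is self-contained. The table layout, the `value` column, and the function names are made up for illustration. Only the cheap `MIN`/`MAX` lookup runs up front; each worker then runs its own range query.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def split_ranges(lo, hi, size):
    """Return half-open (start, end) id ranges that cover lo..hi inclusive."""
    return [(s, min(s + size, hi + 1)) for s in range(lo, hi + 1, size)]


def plan(db_path, table, size):
    """Look up the key bounds once and turn them into chunk ranges."""
    con = sqlite3.connect(db_path)
    try:
        # The table name is interpolated, so it must come from trusted code.
        lo, hi = con.execute(f"SELECT MIN(id), MAX(id) FROM {table}").fetchone()
    finally:
        con.close()
    return [] if lo is None else split_ranges(lo, hi, size)


def work(task):
    """Run one split query on the worker's own connection."""
    db_path, table, start, end = task
    con = sqlite3.connect(db_path)
    try:
        row = con.execute(
            f"SELECT COALESCE(SUM(value), 0) FROM {table} WHERE id >= ? AND id < ?",
            (start, end),
        ).fetchone()
    finally:
        con.close()
    return row[0]


def run(db_path, table, size, workers=4):
    """Fan the chunks out to a pool and collect one result per chunk, in order."""
    tasks = [(db_path, table, s, e) for s, e in plan(db_path, table, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, tasks))
```

Threads are used here only to keep the sketch simple. `ProcessPoolExecutor` exposes the same `map` interface, and a task tuple like `(db_path, table, start, end)` is small enough to send to a distributed worker. Range splitting by id gives uneven chunks when the ids have large gaps.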

But I haven't found a really powerful way to generate the split queries (in step 1). I know it is similar to MapReduce, but not exactly the same.

I would like to see a library or framework that can connect to many different databases and use the same query language (SQLAlchemy can't do this).
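To illustrate the kind of thing I'm after, here is a tiny sketch where one backend-neutral chunk description is translated into either a SQL query or a MongoDB filter. The `Chunk` class and both functions are hypothetical, not an existing library. The `%s` placeholder style is the one used by common MySQL and PostgreSQL drivers such as PyMySQL and psycopg2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Backend-neutral description of one slice of the dataset."""
    source: str  # table name (SQL) or collection name (MongoDB)
    key: str     # field used for range splitting
    start: int   # inclusive lower bound
    end: int     # exclusive upper bound


def to_sql(chunk):
    """Render the chunk as a parameterized SQL query plus its parameters."""
    # source and key are interpolated, so they must come from trusted code.
    sql = (
        f"SELECT * FROM {chunk.source} "
        f"WHERE {chunk.key} >= %s AND {chunk.key} < %s"
    )
    return sql, (chunk.start, chunk.end)


def to_mongo_filter(chunk):
    """Render the same chunk as a MongoDB filter document."""
    return {chunk.key: {"$gte": chunk.start, "$lt": chunk.end}}
```

With this, a worker would only need the `Chunk` plus the backend type: `cursor.execute(*to_sql(chunk))` for MySQL or PostgreSQL, or `collection.find(to_mongo_filter(chunk))` with pymongo. A real tool would need much more than range filters.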