-
-
Notifications
You must be signed in to change notification settings - Fork 32.7k
Description
Feature or enhancement
Proposal:
Here's some code/benchmark that demonstrates the concept.
import sqlite3, json
import time
import subprocess
def dict_factory(cursor, row):
d = {}
for idx, col in enumerate(cursor.description):
# col[0] is the column name
d[col[0]] = row[idx]
return d
def get_data_to_json():
conn = sqlite3.connect("/home/user/database.sqlite")
conn.row_factory = dict_factory
c = conn.cursor()
c.execute("SELECT * FROM server")
rst = c.fetchall() # rst is a list of dict
return rst
start = time.time()
rst = get_data_to_json()
print(f"normal data load: {time.time() - start}")
start = time.time()
out = subprocess.run(["sqlite3", "--readonly", "/home/user/database.sqlite", ".mode json", "select * from server"], check=True, capture_output=True).stdout
print(f"sqlite3 command: {time.time() - start}")
start = time.time()
loadit = json.loads(out)
print(f"load json from command: {time.time() - start}")
start = time.time()
json.dumps(loadit)
print(f"dump loaded json: {time.time() - start}")
$ python benchmark.py
normal data load: 0.7838420867919922
sqlite3 command: 0.19471359252929688
load json from command: 0.21944022178649902
dump loaded json: 0.16810369491577148
I recently found out that sqlite3's CLI has a JSON mode. This JSON mode is considerably faster than any Python option that I'm aware of.
There's lots of cases where this might be useful. Let's say you have an API that needs to dump a decent bit of data from your database. There's no reason to load the JSON, then dump it to send it. In this case, it's about 5x faster to let sqlite dump the JSON and pass that along.
What's crazy is that it's even 2x faster to call sqlite3
and load JSON, than it is to fetchall()
.
I am not sure how the sqlite interface works and if this would require work on sqlite's part, but to me this is a substantial improvement.
I imagine this might have an interface like:
session.exec(query, mode=json)
or session.exec(query, mode=dict)
Curious what you all think.
Thanks!
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response
Update: There's now a discussion for it: https://discuss.python.org/t/json-output-support-for-sqlite3/103048
Linked PRs
Metadata
Metadata
Assignees
Labels
Projects
Status
Status