This really would've come in handy when I was debugging my own SQLite parser a couple weeks ago.
One thing that initially confused me was how exactly the pages worked w.r.t. the first page on disk... I misunderstood the SQLite documentation in different ways, but it's really rather simple: the very first page is just treated as containing the file header in it, and it pushes down the rest of the data, making the page shorter than the other pages. You can see that illustrated clearly if you click into the first page of a database using this tool: the database header comes first, then the page header.
This tool will undoubtedly come in handy for anyone who has a reason to be dealing with SQLite data structures directly for whatever reason, especially since the SQLite documentation is a bit terse at times.
Yes I’m also working on a SQLite parser, mine is in raw WebAssembly. Is yours open source too? This tool will be so useful. I have basic page reading and parsing of the CREATE TABLE schema: https://github.com/RoyalIcing/SilverOrb/blob/9dacad0ce521b0d...
My plan is to create a miniature .wasm module to read .sqlite files that works in the browser. It will be in the tens of kilobytes rather than the 1 megabyte that the official fantastic sqlite.wasm is. The reduced download means even on 3G you ought to be able to load within a few seconds. You can use SQLite files as your network payloads then, and perhaps even as the working mutable state synced between server and clients.
Mine is in TypeScript and for the purpose of parsing a file that happens to use sqlite, so I don't think I'll bother parsing the CREATE TABLE schema unless I have to. It's not currently posted anywhere but will be open source. It works, but the code isn't particularly great :)
I really want a data format that is effectively binary JSON. What is the subset of all of the features of SQLite that makes either a read-only or an updatable data set that is compact. But better searchability than a streaming parser.
If you want to maintain the properties that SQLite has for read use cases, you'll need to replicate a couple of features. At the very least, you'll probably want the format to still be page-based with a BTree structure. You really could get away with just using the SQLite format if you didn't mind the weirdness; a functional SQLite parser that can read tables would not be a significant amount of code. I think, though, that if you want to read the schema as SQLite understands it, you'd need to interpret the CREATE TABLE syntax, which would make it a bit more complex for sure. Otherwise, you can read tables and columns themselves relatively easily, and the values are all stringified.
Yeah if I wasn’t clear I’m talking about a minimal file that SQLite can still open read only without errors, not a third party implementation. Though there might be a few tweaks that would allow SQLite to be a bit more lenient. For instance missing metadata that can be assumed. Maybe b tree nodes exceeding the usual load factor.
I would probably just use bson or gRPC. As o clarified elsewhere, I means JSON as an analogy. I want something that can be scanned and queried cheaply.
When I said binary JSON I didn’t mean literal JSON. I meant “common denominator interchange format”. It’s too chatty by far and has dismal performance for queries. So you’re better off asking a specific question and getting a larger document that could answer many questions that you do t yet have. For CDNs things like this matter a lot.
I especially like that it provides a sample database. Would be nice it were downloadable as a file, so we could explore it with our own tools as well and get a better feeling for how the tool works.
I tried with a simple database, and it just shows me a blank screen. The console in developer tools shows an exception (both in Firefox and Chromium):
index-CGfYebmt.js:85 Uncaught RangeError: Offset is outside the bounds of the DataView
at DataView.prototype.getUint32 (<anonymous>)
at R0 (index-CGfYebmt.js:85:4683)
at X0 (index-CGfYebmt.js:85:7966)
at index-CGfYebmt.js:85:8157
at Array.map (<anonymous>)
at Q0 (index-CGfYebmt.js:85:8150)
at index-CGfYebmt.js:85:8240
at Object.useMemo (index-CGfYebmt.js:48:46087)
at My.$.useMemo (index-CGfYebmt.js:17:7276)
at V0 (index-CGfYebmt.js:85:8228)
R0 @ index-CGfYebmt.js:85
X0 @ index-CGfYebmt.js:85
(anonymous) @ index-CGfYebmt.js:85
Q0 @ index-CGfYebmt.js:85
(anonymous) @ index-CGfYebmt.js:85
useMemo @ index-CGfYebmt.js:48
My.$.useMemo @ index-CGfYebmt.js:17
V0 @ index-CGfYebmt.js:85
tc @ index-CGfYebmt.js:48
bc @ index-CGfYebmt.js:48
lo @ index-CGfYebmt.js:48
No @ index-CGfYebmt.js:48
wm @ index-CGfYebmt.js:48
Xc @ index-CGfYebmt.js:48
Ao @ index-CGfYebmt.js:48
Qo @ index-CGfYebmt.js:48
me @ index-CGfYebmt.js:25
4. Click on the "No Throttling" dropdown and check "Offline".
5. You should now see a yellow warning icon in your Network tab and it should prevent all network access. If you reload the tab you should get the "No Internet Connection" error page.
I've long been wishing for the same! With service workers etc. this would be a perfect approach for many small utilities. I've started building things as single HTML files using vite-plugin-singlefile[0], but users either have to manually block network requests, or they have to trust me. It would be awesome if they could be sure that nothing will be uploaded anywhere.
The pledge would have to be persistent, which would make e.g. service worker updates difficult - ideally the browser should show you that an update is available, and offer to delete all stored information so nothing can be saved to localStorage etc. to be uploaded afterwards.
AFAICT it's purely frontend, there is no uploading. Aside from running it yourself, another thing you can do is go into devtools and set the network throttling to "offline". (Note that this is not fool-proof, but it does prevent new connections from being established.)
It's hard to know whether anything is actually uploaded, but thankfully SQLite databases are plentiful that don't contain sensitive data (and this, also, is why there is a sample database included).
This is super cool and will really help anyone studying how to build their own storage engines.
A note: the table that describes the data structures (Field # | Description | Value) should also list the data type (uint8 uint16 int32 text etc.). If it's an enum ("Table Leaf"), reference other possible enum values. If its length depends on another field, reference that field No.
I really like the VS Code extension for sqliteview.app - [0]. It also offers an edit feature for a reasonable price, in my opinion. I'm curious if there are any competing editor alternatives.
This really would've come in handy when I was debugging my own SQLite parser a couple weeks ago.
One thing that initially confused me was how exactly the pages worked w.r.t. the first page on disk... I misunderstood the SQLite documentation in different ways, but it's really rather simple: the very first page is just treated as containing the file header in it, and it pushes down the rest of the data, making the page shorter than the other pages. You can see that illustrated clearly if you click into the first page of a database using this tool: the database header comes first, then the page header.
This tool will undoubtedly come in handy for anyone who has a reason to be dealing with SQLite data structures directly for whatever reason, especially since the SQLite documentation is a bit terse at times.
Yes I’m also working on a SQLite parser, mine is in raw WebAssembly. Is yours open source too? This tool will be so useful. I have basic page reading and parsing of the CREATE TABLE schema: https://github.com/RoyalIcing/SilverOrb/blob/9dacad0ce521b0d...
My plan is to create a miniature .wasm module to read .sqlite files that works in the browser. It will be in the tens of kilobytes rather than the 1 megabyte that the official fantastic sqlite.wasm is. The reduced download means even on 3G you ought to be able to load within a few seconds. You can use SQLite files as your network payloads then, and perhaps even as the working mutable state synced between server and clients.
Mine is in TypeScript and for the purpose of parsing a file that happens to use sqlite, so I don't think I'll bother parsing the CREATE TABLE schema unless I have to. It's not currently posted anywhere but will be open source. It works, but the code isn't particularly great :)
I really want a data format that is effectively binary JSON. What is the subset of all of the features of SQLite that makes either a read-only or an updatable data set that is compact. But better searchability than a streaming parser.
If you want to maintain the properties that SQLite has for read use cases, you'll need to replicate a couple of features. At the very least, you'll probably want the format to still be page-based with a BTree structure. You really could get away with just using the SQLite format if you didn't mind the weirdness; a functional SQLite parser that can read tables would not be a significant amount of code. I think, though, that if you want to read the schema as SQLite understands it, you'd need to interpret the CREATE TABLE syntax, which would make it a bit more complex for sure. Otherwise, you can read tables and columns themselves relatively easily, and the values are all stringified.
Yeah if I wasn’t clear I’m talking about a minimal file that SQLite can still open read only without errors, not a third party implementation. Though there might be a few tweaks that would allow SQLite to be a bit more lenient. For instance missing metadata that can be assumed. Maybe b tree nodes exceeding the usual load factor.
Have you tried MessagePack[0]?
0: https://msgpack.org/index.html
I would probably just use bson or gRPC. As o clarified elsewhere, I means JSON as an analogy. I want something that can be scanned and queried cheaply.
Parquet or some other column oriented data format is probably closest to what you want without getting into indexing your flat files or similar
sqlite itself supports a binary encoding of JSON: https://sqlite.org/jsonb.html
When I said binary JSON I didn’t mean literal JSON. I meant “common denominator interchange format”. It’s too chatty by far and has dismal performance for queries. So you’re better off asking a specific question and getting a larger document that could answer many questions that you do t yet have. For CDNs things like this matter a lot.
MongoDB's BSON?
Mongo sits on a throne of lies and I will never condone anyone using it for any purpose except to make a joke.
Glad you like it.
I especially like that it provides a sample database. Would be nice it were downloadable as a file, so we could explore it with our own tools as well and get a better feeling for how the tool works.
https://github.com/invisal/sqlite-internal/tree/main/public, or just download it by copying the URL from your network tab when you load it.
I tried with a simple database, and it just shows me a blank screen. The console in developer tools shows an exception (both in Firefox and Chromium):
Probably I might made mistake somewhere in my parser. Do you have small reproducible database with no sensitive database and add to the https://github.com/invisal/sqlite-internal/issues
I will try to fix it
Issue created: https://github.com/invisal/sqlite-internal/issues/3
Really nice, looks great!
However, I am not a fan of uploading databases to "strange" sites on the internet, so I will probably never use this.
I wish that we had a browser version of “pledge” that would:
1. Permanently restrict the browser tab from accessing the network.
2. Show an indication of this in the browser UI outside of the content area.
It would be perfect for this kind of app.
In this case, itt's easier to just run it locally: https://github.com/invisal/sqlite-internal
But you can get sort of what you want by:
1. Loading a site.
2. Go into the Web Dev Tools.
3. Select the Network tab.
4. Click on the "No Throttling" dropdown and check "Offline".
5. You should now see a yellow warning icon in your Network tab and it should prevent all network access. If you reload the tab you should get the "No Internet Connection" error page.
What a fantastic method, I'm gonna start using this for all the random tools I upload data to expecting they are fully clientside.
I've long been wishing for the same! With service workers etc. this would be a perfect approach for many small utilities. I've started building things as single HTML files using vite-plugin-singlefile[0], but users either have to manually block network requests, or they have to trust me. It would be awesome if they could be sure that nothing will be uploaded anywhere.
The pledge would have to be persistent, which would make e.g. service worker updates difficult - ideally the browser should show you that an update is available, and offer to delete all stored information so nothing can be saved to localStorage etc. to be uploaded afterwards.
[0]: https://www.npmjs.com/package/vite-plugin-singlefile
AFAICT it's purely frontend, there is no uploading. Aside from running it yourself, another thing you can do is go into devtools and set the network throttling to "offline". (Note that this is not fool-proof, but it does prevent new connections from being established.)
It's hard to know whether anything is actually uploaded, but thankfully SQLite databases are plentiful that don't contain sensitive data (and this, also, is why there is a sample database included).
I loaded up my Star Trek database into the program to toy with: https://chiselapp.com/user/chungy/repository/startrek-db/uvl...
You can probably run it yourself: https://github.com/invisal/sqlite-internal
It is completely frontend. Everyone can review the code or host it themselve. :)
There is a github link so you can run it yourself
This is super cool and will really help anyone studying how to build their own storage engines.
A note: the table that describes the data structures (Field # | Description | Value) should also list the data type (uint8 uint16 int32 text etc.). If it's an enum ("Table Leaf"), reference other possible enum values. If its length depends on another field, reference that field No.
Very cool.
It would be nice if it tied the format in a bit more to the schema and data being shown, but that is a very minor gripe given how nice a tool this is.
I've been enjoying watching you build this on X – nice work.
Thanks so much :)
This is really cool.
this is a wonderful learning tool, thank you.
[dead]
Meta: always found it interesting that .dev was allowed to be a TLD:
* https://en.wikipedia.org/wiki/.dev
More on-topic: another online option:
* https://sqliteviewer.app
* https://inloop.github.io/sqlite-viewer/
Local app:
* https://sqlitebrowser.org
This is a different tool—SQLite File Format Viewer is made for anyone interested in exploring the internal structure of SQLite.
I really like the VS Code extension for sqliteview.app - [0]. It also offers an edit feature for a reasonable price, in my opinion. I'm curious if there are any competing editor alternatives.
[0] - https://vscode.sqliteviewer.app/
The only reasonable price for editing a SQLite database is $0.
Or $2000 for a perpetual license to the Sqlite Encryption Extension, if the database is encrypted ;)
Honestly not a bad price if you're building it into a real product. Steep for hobbyists though.