| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by AndrewDucker 1370 days ago
	Can you give an example of what you mean, and what we'd gain from it?

4 comments

dkjaudyeqooe 1370 days ago

When you're accessing a SQLite database in code you have to generate a query string. Parameters ameliorate that somewhat, but in many cases you still have to regenerate a new string for each new query. It's inefficient to translate your query into a string only to have it parsed back into something structured by SQLite.

AndrewDucker 1370 days ago

Are you suggesting an alternate syntax for SQL-like queries, which is more compact?

Or a specific one for SQLite?

I'd be very surprised if generating the SQL query string and parsing it again was more than a trivial percentage of the query execution time, but happy to be proven wrong.

dkjaudyeqooe 1369 days ago

Probably, but it's still inefficient in other ways. There's memory issues and the code overhead of dealing with strings. It'd be much nicer to have a binary interface, but admittedly it would be much more complex.

The real solution is to include an alternative to SQL that looks and works like Datalog including things like variables. That would make SQLite 100x more productive for programmers. But that will never happen.

AndrewDucker 1369 days ago

Wouldn't that then just be a prepared statement with variables?

rprospero 1370 days ago

As I said in another part of the thread, I was in a scenario where we were performed millions of inserts into a table of four integers, one row at a time. Generating the string and parsing it again wound up being enough to blow our 10µs time budget.

dkjaudyeqooe 1369 days ago

But with parameters you don't need to do that, or in any other scenario where only (literal) values change.

On the other hand, if you were importing data and each line specified which table and column it had to go into you'd probably have to write a new SQL statement each time.

aaaaaaaaata 1369 days ago

What was the time constraint due to, out of curiosity?

What did you do for a solution?

rprospero 1369 days ago

It was a neutron detector that stored the the position, energy, and time stamp into the database. The design spec was 10,000 neutrons per second, so that was the origin of the time limit.

Of course, half those neutrons arrived within a 20ms window, so we had a buffer to handle the load. However, however, if the average remained above the limit, the buffer would fill up. There was a brief discussion of just ignoring events when the buffer was full, but that could introduce systematic errors in the data that would be impossible to detect, so it was better for the database to just crash.

The solution was to tighten the slits on the neutron beam to lower the count rate to the point that we never filled the buffer. Granted, we were testing a high flux technique that, so that was a bit of a disappointment. Everything else in the instrument could handle an order of magnitude more neutrons, except this database.

By the way, to be fair to the database designers, they were working with Clinton era tech and were probably told that the flux rate would never be a SUSTAINED 10,000 events per second.

chasil 1370 days ago

SQLite was released in the year 2000; it is 22 years old.

Because of its age, it does carry deprecated API components that are maintained solely for backward compatibility.

Notice how many _v2 and _v3 variants are present, denoting reworked aspects of the API:

https://sqlite.org/c3ref/funclist.html

A product of this age was designed (and redesigned) for specific needs. Unfortunately, your use case is not among them.

ThemalSpan 1370 days ago

Is that a fact or an intuition?

rprospero 1370 days ago

Not the OP, but I can give two gains. First off, by passing an AST, instead of just an SQL string, we cut out a huge number of possibly SQL injection attacks. Second, in most of the projects where I've used of SQL, there's been some kind of database object that builds the actual query, which it then converts to a string. The database then takes that string and parses it into an AST. There's some performance gains to be made by skipping the middle man and just creating the AST directly.

Many years ago, I was on a project that needed to add rows to a database with a hard 10µs limit. Each rows was just four integers, so the writing part was trivial. However, allocating the string, formatting the integers to strings, then parsing the resulting string often put us over the time limit. Every time the time limit was breached, we lost about five grand. Why we were using an SQL database for this at all is a story for a different time.

Taywee 1369 days ago

You should be using prepared statements and variable bindings. You should never have to parse any statement more than once (many dynamic language interfaces even cache statements for you, and many SQLite libraries also offer an interface for this out of the box). You should also never be formatting integers into strings, but simply binding them to the prepared statement. I really hope you weren't interpolating them and formatting each statement with the literal values in the string. That would be horridly inefficient, and a bad misuse of any modern SQL database. I also hope you were properly using transactions to speed up your batches.

See: https://www.sqlite.org/c3ref/stmt.html

danielheath 1370 days ago

Having application build a string and pass it to a library which parses the string into an AST cannot be as efficient as just building the AST, right?

yread 1370 days ago

It's a lot more efficient for the human who is used to reading SQL to read the SQL (or even SQL-producing code) than AST

pif 1370 days ago

Type safety at compile time to begin with.