A blogging platform for personal use is not magic. Avoiding things like concurrency problems in small platforms like this is often trivial.
Personally I also use a flat file blogging platform, and I explicitly rejected putting stuff in a database because I want to be able to edit the articles with emacs and check them into a git repository. Concurrency in my case is a non-issue because, well, there's only one of me.
There are plenty of scenarios where you really should use neither SQLite nor any other RDBMS, because it overcomplicates things that are really exceedingly simple.
I hate non-standard stuff like this, even though I myself do similar things all over the place. It's the kind of thing that's OK if you do it to yourself, but immediately rings an alarm when it is distributed to the general public. When you're already doing something non-standard by using a flat file instead of a well-known DB format, you might as well use standards in other places so that people have fewer reasons to complain.
For example, Markdown has a special syntax for <h1> tags. It looks like this:
This is a title.
================
Since it is extremely unlikely that an <h1> tag will be used for anything other than the title of a post, why not use it to mark a line as the title?
Or maybe use the MultiMarkdown convention of colon-separated header fields at the top of the file, like:
Title: This is the title.
Tags: foo, bar
Date: March 3, 2014
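To make the header-field idea concrete, here is a rough sketch of how such a file could be split into metadata and body. The function name `parse_post` and the sample text are made up for illustration, and it assumes the header block ends at the first blank line; it is not how any particular engine does it.

```python
def parse_post(text):
    """Split a post into (metadata dict, body) using 'Key: value' header lines."""
    meta = {}
    lines = text.splitlines()
    body_start = 0
    for index, line in enumerate(lines):
        key, sep, value = line.partition(":")
        if not line.strip() or not sep:
            # A blank line is consumed as the separator; a line without a
            # colon is treated as the first line of the body.
            body_start = index + 1 if not line.strip() else index
            break
        meta[key.strip().lower()] = value.strip()
        body_start = index + 1
    body = "\n".join(lines[body_start:])
    return meta, body


# Hypothetical sample post, using the header fields from the comment above.
sample = """Title: This is the title.
Tags: foo, bar
Date: March 3, 2014

The body of the post starts here."""

meta, body = parse_post(sample)
tags = [tag.strip() for tag in meta["tags"].split(",")]
```

One catch with this naive version: a post with no header whose first line happens to contain a colon would be misread as metadata, so a real parser would probably want a whitelist of known keys.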
Outside of "the bubble" the web is basically made of PHP and Perl. It's like the dark matter of the internet. If you want to reach all the way down to the weekend dabbler, you really need to choose one of those 2, and of those 2, PHP seems more immediately accessible.
I think you'll find that graphs like that are meaningless. I work for an agency that does mostly PHP work, and we never advertise, because for common skills like PHP, recruiters and potential hires inundate us with calls every single day. I expect the very common skills like PHP are severely under-represented in ads for this reason. For less common skills, you need to advertise, as people won't know where to go.
Presumably because of how many webhosts have it available automatically.
I'd pondered a flat file based php 'blog' engine once - with all the benefits of markdown or whatever - no database complexity, easily rsyncable for deployment, git can keep revisions, etc...
Yep - and there are also some benefits, performance- and security-wise, to not having any server-side code at all (other than nginx, or whatever).
The problem you have with jekyll, pelican, et al. is that you lose site-side search, 'related posts' (without some reasonably complex compile-side clobber), etc.
Using extremely minimal PHP lets you deploy just as easily, you don't get too much of a performance hit (a hell of a lot better than wordpress, etc), you can still do search, related posts, forms, embedding, and all that.
PHP as a server-side 'clever templating' language really isn't that bad. It's only awful when used to build anything massively complex (such as Joomla! or Drupal...), and in that it encourages messy project design.
> PHP as a server-side 'clever templating' language really isn't that bad.
I only touch PHP as little as possible, and work with a legacy PHP codebase, but AFAIK, it hasn't evolved a tag to automatically HTML-escape/JSON-escape content. So, it's as good a templating language as it is a programming language: pretty terrible. I'll trade PHP for something as barebones as Python with WSGI + Jinja2 any day.
To be honest, not implying you're wrong, but you could have "related posts" with pre-computation, and "site-side search" with a mix of pre-computed data structures and front-end Javascript (which could be quite efficient, provided the blog is not too large).
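For what it's worth, the pre-computation part could look roughly like this: a build step produces a word-to-posts index and dumps it as JSON, and the front-end script only has to intersect a few lists. This is a sketch in Python with made-up post data and function names (`build_index`, `search`); the lookup is shown in Python too, just to illustrate what the browser side would do with the JSON.

```python
import json
import re
from collections import defaultdict


def build_index(posts):
    """Map each lowercase word to the sorted list of post slugs containing it."""
    index = defaultdict(set)
    for slug, text in posts.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(slug)
    return {word: sorted(slugs) for word, slugs in index.items()}


def search(index, query):
    """Return the slugs of posts that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    results = set(index.get(words[0], []))
    for word in words[1:]:
        results &= set(index.get(word, []))
    return sorted(results)


# Hypothetical posts; a real build step would read them from the site source.
posts = {
    "flat-files": "Why I keep my blog in flat files and git",
    "sqlite-blog": "Using SQLite as a blog backend",
    "static-search": "Search for a static blog without a backend",
}
index = build_index(posts)
index_json = json.dumps(index)  # what the build step would publish next to the HTML
hits = search(index, "blog backend")
```

The index grows with the vocabulary of the blog, which is why this only stays cheap while the blog is not too large.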
HTMLy has built-in search and related posts. It relies on a file-naming convention, so it stays fast even if the blog has, say, about 3k posts with hundreds of tags. Why? HTMLy doesn't read the content first; it filters the files first. I have already tested it on a mini VPS (128 MB RAM) and saw no speed penalty.
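A sketch of what "filter first, read later" can mean in practice. The naming scheme used here (`date_tags_slug.md`) and the helper names are assumptions for illustration only, not necessarily what HTMLy does internally; the point is that tag and date lookups never open a file.

```python
def parse_name(filename):
    """Extract date, tags and slug from a name like '2014-03-03_php,blog_my-post.md'."""
    stem = filename.rsplit(".", 1)[0]
    date, tags, slug = stem.split("_", 2)
    return {"date": date, "tags": tags.split(","), "slug": slug, "file": filename}


def posts_with_tag(filenames, tag):
    """Pick matching posts from the directory listing alone, newest first."""
    parsed = [parse_name(name) for name in filenames]
    matches = [post for post in parsed if tag in post["tags"]]
    return sorted(matches, key=lambda post: post["date"], reverse=True)


# Hypothetical directory listing (e.g. the result of os.listdir on the content folder).
filenames = [
    "2014-01-05_php,blog_hello-world.md",
    "2014-02-11_sqlite_why-not-a-database.md",
    "2014-03-03_php_flat-file-tricks.md",
]
php_slugs = [post["slug"] for post in posts_with_tag(filenames, "php")]
```

Only the handful of files that survive the filter need to be opened and rendered, so the cost depends on the page size rather than on the total number of posts.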
Interesting choice. A lot of places still use it, but it definitely feels dirty. Not the kind of thing you usually see around here. While it is still used a lot, I have to say, when I saw PHP I immediately closed GitHub.
Do you close your browser when you find out it is written in C++? Or shut down your OS when you find out it is written in C? Or angrily close the browser tab when you find out something on the page uses JavaScript? Do you then go and wash your hands and wipe the nervous sweat from your brow?
If a tool gets a job done, does the language matter?
No language is perfect. I know PHP feels "dirty", but misuse of it is likely the cause of the widespread disparagement. Even the English language is not perfect. Does this stop you using it?
Just a thought. I tend to close tabs when I come across stuff written in languages I can't write in... :-)
I don't think it's a static generator as there's no compilation step. Rather, they just replaced the database with file lookup. The server still needs a php interpreter.
Currently, this listing is for only projects that are either or both a Flat File CMS and Static Site Generator, but not for projects which are only Dynamic Servers (such as WordPress and Ghost).
The biggest problem with static blogs is the lack of comments. I see this project uses either facebook or disqus which is a solution a lot of people like.
I prefer not to rely upon external comment-providers though, which is why I wrote my own self-hosted comment-server:
I use jekyll for my blog and looked for ways to integrate comments. I ended up with a simple solution: Provide a per-post unique email address where people can send comments to. The idea was then to manually process the comments and only put the most useful ones on the website.
I wanted to think that the hurdle is very low. It's a bit higher than just entering text in a text box and clicking a button. But sending an email isn't that hard either.
I received maybe half a dozen emails since I introduced my form of static comments. Most in the form of 'Does this work?'. For obvious reasons I didn't publish them. I redesigned my blog a few weeks ago and didn't add the comment feature back in.
It's interesting that we're still writing such applications by hand. One thing that interested me when I learned about CouchDB was the possibility of skipping that and just exposing the database to the browser, with a few schemas and a couple of data validation functions configured. After all, that system is almost a dumb HTTP storage mechanism.
Presumably that means that users can mass-scrape the submitted comments though? (Potentially allowing the email addresses users submitted to be harvested.)
Other than that it doesn't seem like an unreasonable approach.
Why can't they mass scrape your service, though? After all, what you built is essentially a very specialized REST database.
As for harvesting email addresses, I think you could solve that by using a CouchDB view, which is essentially a function that processes and returns JSON documents. In this case, it could just delete the "email" key and return the rest.
You would probably still need to block direct access to the document via a frontend proxy, since I don't think Couch allows you to specify fine-grained per-user permissions, which is definitely a drawback.
Alternatively, since you're already willing to send hashed versions of the emails (as Gravatars), you could just store only the hashes in the first place, and never commit the plaintext to disk.
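The hash-only idea is small enough to sketch. Gravatar's classic scheme is the MD5 of the trimmed, lowercased address; the function names and the comment structure below are made up for illustration.

```python
import hashlib


def email_hash(email):
    """Gravatar-style hash: MD5 of the trimmed, lowercased address."""
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def make_comment(author, email, body):
    """Build the record that gets stored; the plaintext address is never kept."""
    return {"author": author, "email_hash": email_hash(email), "body": body}


# Hypothetical submission.
comment = make_comment("Someone", "  Someone@Example.com ", "Nice post!")
```

Keep in mind that an unsalted MD5 of an email address can still be matched against a list of guessed addresses, so this reduces exposure rather than eliminating it.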
I might have been making assumptions on CouchDB which aren't valid - that remote users could query all documents (== pages) to get the comments.
With my thing yes it can be crawled, since requests to /comments/ID will return the JSON comment-data. However there is no enumeration of the valid IDs possible, short of a dictionary attack. (This is where I was thinking that exposing CouchDB might expose more data.)
I did consider not storing emails, and for my use-case that's fine, but I figured sooner or later somebody will want to access them so ruling it out unduly would eventually result in a bug report.
> I might have been making assumptions on CouchDB which aren't valid - that remote users could query all documents (== pages) to get the comments.
Yes, you'd probably need to block that URL with a proxy, and only allow single page views to be requested. I think this is definitely a shortcoming of the DB; it should allow finer-grained permissions.
> However there is no enumeration of the valid IDs possible, short of a dictionary attack.
Well, by default CouchDB uses UUIDs, so enumeration shouldn't be possible either. Of course, both are subject to simple scraping of the HTML pages; a simple wget + grep can probably list them all, so you don't gain much, except for private pages you might have.
> I did consider not storing emails, and for my use-case that's fine, but I figured sooner or later somebody will want to access them so ruling it out unduly would eventually result in a bug report.
Fair enough. I actually don't think CouchDB, as it is now, would necessarily be a better solution than yours. But the question is, why not? I believe the direction is correct, but the current implementation falls short, and that's a shame.
We use BitNami and WordPress to generate static files. Works great. Paired with Cloudflare, the site is faster than 99% of the web (according to http://tools.pingdom.com/fpt/).
Well, let me add my old project, http://flatpress.org. It does feel dated, as it uses BBCode, but there are plugins for Markdown. If I had the time, I would make MD the default nowadays.
Never really understood why people try to avoid DBs. It's OK for a very basic pile-of-online-texts blog, but the moment you try to show any relation between blog posts (e.g. related articles or latest articles) you end up reinventing the wheel.
Most blogs are small enough that caching all the content in memory and sorting/selecting is so cheap and simple it really makes no difference these days, and you gain simplicity.
My blog is flat file because I like to work on a version on my home server, editing stuff in emacs, commit to git and push an updated version atomically. Even if I continue writing at my current pace for the next 100 years, my current server would hardly notice having to re-read every single article.
As for reinventing the wheel, the code for pulling in the articles from flat files and slicing and dicing them simply by iterating over an in-memory collection is so small and simple that there's hardly any wheel to re-invent.
Edit in your editor of choice. Use your source control for edit history. Trivial to keep a separate instance to edit/test on, and push with rsync. Makes it trivial to treat the code and the content as one unit, so that e.g. if you change the parsing of the articles, update the articles and need to revert, you don't need to mess with reversing database updates separately.
Fewer moving parts. For a typical modern blog with comments farmed out to Disqus or similar, the data is likely to be tiny and very static. My blog's data, for example, is about 8.3MB of text that changes maybe a couple of times a month, so the 1-2 second cost of reading every article I've ever written in from individual files on disk is hardly an issue.
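To give a feel for how little code that is, here is a sketch in Python. The file layout (`date_slug.txt`), the helper names and the sample articles are invented for the example and are not taken from my actual blog code.

```python
import tempfile
from pathlib import Path


def load_articles(directory):
    """Read every article into memory; names look like '2014-03-03_slug.txt'."""
    articles = []
    for path in sorted(Path(directory).glob("*.txt")):
        date, _, slug = path.stem.partition("_")
        articles.append(
            {"date": date, "slug": slug, "text": path.read_text(encoding="utf-8")}
        )
    return articles


def latest(articles, count):
    """The 'latest articles' list is just a sort and a slice."""
    return sorted(articles, key=lambda a: a["date"], reverse=True)[:count]


def containing(articles, word):
    """A naive full-text filter over the in-memory list."""
    return [a for a in articles if word.lower() in a["text"].lower()]


# Hypothetical content directory, created on the fly so the sketch is runnable.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "2014-01-05_first.txt").write_text("Hello from emacs", encoding="utf-8")
    Path(tmp, "2014-02-11_second.txt").write_text("Pushed with git", encoding="utf-8")
    Path(tmp, "2014-03-03_third.txt").write_text("More notes on git and rsync", encoding="utf-8")
    articles = load_articles(tmp)

newest = [a["slug"] for a in latest(articles, 2)]
git_posts = [a["slug"] for a in containing(articles, "git")]
```

Anything relational a small blog needs (latest, by tag, related) tends to be another one-line filter or sort over the same list.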
I wonder what simple blogging / CMS engines are out there that use sqlite as the backend. This also has the benefits of simple backups without the complexity of another process running. Most (if not all) webhosts will have this baked into their PHP installs.
Perhaps using sqlite as the "backend", in that you build it like any other blog, so you get the simplicity of building it out, using one good SQL framework, etc.
But then, when you click "post", it rips through the database using a Markdown -> HTML library, or whatever the case may be. There can't be much overhead to a single include to pull in the HTML, but you could also render the entire page. I just can't see PHP falling down too much with a few includes and functions being called to generate a header, the included rendered HTML, and a footer with some design elements.
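Something like the following is what I have in mind, sketched in Python rather than PHP. The table layout and function names are invented, and `render_page` is a trivial stand-in for a real Markdown library: it only escapes the text and wraps paragraphs.

```python
import html
import sqlite3


def render_page(title, body):
    """Stand-in renderer: escape the text and wrap each paragraph in <p> tags."""
    paragraphs = "\n".join(
        f"<p>{html.escape(p)}</p>" for p in body.split("\n\n") if p.strip()
    )
    return f"<h1>{html.escape(title)}</h1>\n{paragraphs}"


def publish(conn, slug, title, body):
    """Store the source and the pre-rendered HTML together when 'post' is clicked."""
    rendered = render_page(title, body)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO posts (slug, title, body, html) VALUES (?, ?, ?, ?)",
            (slug, title, body, rendered),
        )
    return rendered


conn = sqlite3.connect(":memory:")  # a real blog would use a file path here
conn.execute(
    "CREATE TABLE posts (slug TEXT PRIMARY KEY, title TEXT, body TEXT, html TEXT)"
)
publish(conn, "hello", "Hello", "First paragraph.\n\nSecond <b>paragraph</b>.")
page = conn.execute("SELECT html FROM posts WHERE slug = ?", ("hello",)).fetchone()[0]
```

The rendered string could just as well be written to a file for the web server to include; either way the expensive step happens once per edit instead of once per page view.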
It's not much fun building your own database out of flat files. I cut my teeth on a Mac-only system that more or less only talked to FileMaker, which was only as fast as the actual screen could redraw and search out the data. It could be painfully slow.
In a way I am glad, as I learned how to do things and think differently when most were just running a "select * from foo where bar = 'x'", which was a 15-minute luxury I didn't have. There were no joins, no tables; you actually ran AppleScripts on the database and it returned the data somehow back to a web server on the Mac.
So we used the database on the backend, where admins could be more patient, or do more intricate things, but almost always generated out some HTML, so in the end, the site was semi-dynamic. I think I was doing "caching" of data as a result almost 15 years ago.
The rule was: no more than 2 database calls per page, ever. And you couldn't do things like update foo set name = 'me' where id = 1, because there was no exposed notion of a record id; it was internal. So you had to select name from foo where name = 'whatever', which would return the name but also a RecID value, which you then used in another query: update foo set name = 'me' where id = <special RecID token>
I actually have a 100% static site deployed in production. It is served off nginx, built with make, shell and sed (does some include processing and index generation) and is deployed with rsync from make and uses git for version control.
The design of news.arc is notoriously quirky. Given the fact that paging is entirely based on continuations, I don't think "flat-file-based" covers it.
That explains a lot. I get the principle of why this might be done but boy is it ugly. You'd have to maintain state between pages. State is bad. Which probably explains all those link expired errors...
Basically, I just want all dependencies to work exactly as they did when I tested them, in accordance with the guidelines at getcomposer.org :)
This platform prioritizes writing through the admin panel and convenience for users, particularly non-programmers and those who are not familiar with coding at all.
If you have other views, you can contribute to the project, so that we can discuss it further.
You will either end up with an ad hoc, informally-specified, bug-ridden, slow implementation of half of SQLite ... or, you will fail to even attempt the features that SQLite gives you - such as locking and dealing with concurrency - and you will have bugs.
For example, you use file_put_contents. See this comment: http://www.php.net/manual/en/function.file-put-contents.php#...
Please don't go and add locking now! (It's hard to get right). SQLite was invented to be a better fopen() - use it! There is no reason not to, if you are requiring PHP 5.3. If you want a simple plain text dump of your SQLite DB, that's not hard to add.