by Extropy_ 102 days ago
Even if they reset to several days ago and lose, say, thousands of edits, even tens of thousands of minor edits, they're still in a pretty good place. Losing a few days of edits is less-than-ideal but very tolerable for Wikipedia as a whole

At $work we're hosting business knowledge databases. Interestingly enough, if you need to revert a day or two of edits, you're better off doing it asap rather than postponing and mulling it over. Especially if you can keep a dump or an export around.

People usually remember what they changed yesterday and have uploaded files and such still around. It's not great, but quite possible. Maybe you need to pull a few content articles out from the broken state if they ask. No huge deal.

If you decide to roll back after a week or so, editors get really annoyed, because now they are usually forced to backtrack and reconcile the state of the knowledge base. Maybe you need both a current and a rolled-back system, it may have regulatory implications, and it's a huge pain in the neck.

I preach to everyone to fail as loudly as possible and as fast as possible. Don't try to "fix" unknown errors in code. It often catches fresh graduates off guard. If you fail loudly and fast, most issues will be found asap and fixed.

I had to help out a team with the cleanup of a bug that silently corrupted some data for a while before it was found. Too much time had passed to roll back, and they needed all the help they could get to identify which data was real and which was wrong.
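
To make the "fail loud" point concrete, here's a minimal Python sketch. Everything in it is made up for illustration (the `quantity` field, `CorruptRecordError`, both function names), it's not from any real codebase. The first function refuses bad input on the spot; the second is the "fix it quietly" anti-pattern that leaves you with plausible-looking wrong data.

```python
class CorruptRecordError(ValueError):
    """Raised when a record fails validation (hypothetical name)."""


def parse_quantity(raw: dict) -> int:
    """Fail loudly: reject anything unexpected instead of guessing."""
    if "quantity" not in raw:
        raise CorruptRecordError(f"missing 'quantity' in record {raw!r}")
    value = raw["quantity"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise CorruptRecordError(f"'quantity' is not an integer: {value!r}")
    if value < 0:
        raise CorruptRecordError(f"'quantity' is negative: {value!r}")
    return value


def parse_quantity_silently(raw: dict) -> int:
    """Anti-pattern, shown for contrast: swallow the error, return a default."""
    try:
        return int(raw["quantity"])
    except Exception:
        # The caller can no longer tell a real 0 from a broken record.
        return 0
```

With the first version a bad record stops the import the same day it shows up. With the second one you find out weeks later, which is exactly the cleanup job described above.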

Nah, you can snapshot every 15 minutes. The snapshot interval depends on the frequency of changes and their capacity, but it's up to them how to allocate these capacities... but it's definitely doable and there are real reasons for doing so. You can collapse deltas between snapshots after some time to make them last longer. I'd be surprised if they don't do that.

As an aside, snapshotting would have prevented a good deal of horror stories shared by people who give AI access to the FS. Well, as long as you don't give it root.......
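
Rough sketch of what I mean by thinning out snapshots as they age, in Python. The policy (keep everything for a day, one per hour for a week, one per day after that) and the name `snapshots_to_keep` are my own assumptions for illustration; it only decides which snapshot timestamps to retain. Actually merging the deltas is the storage system's job and I have no idea what Wikimedia runs.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshots, now,
                      keep_all_for=timedelta(days=1),
                      hourly_for=timedelta(days=7)):
    """Return the snapshot timestamps to retain, oldest first.

    Assumed policy: keep every snapshot up to `keep_all_for` old,
    the newest one per hour up to `hourly_for` old, and the newest
    one per day beyond that.
    """
    kept = []
    seen_buckets = set()
    # Walk newest first so the newest snapshot in each bucket wins.
    for ts in sorted(snapshots, reverse=True):
        age = now - ts
        if age <= keep_all_for:
            kept.append(ts)
            continue
        if age <= hourly_for:
            bucket = ("hour", ts.strftime("%Y-%m-%d %H"))
        else:
            bucket = ("day", ts.strftime("%Y-%m-%d"))
        if bucket not in seen_buckets:
            seen_buckets.add(bucket)
            kept.append(ts)
    return sorted(kept)
```

Feed it ten days of 15-minute snapshots and the last 24 hours stay at full resolution while the older ones get much sparser, so the storage cost stays bounded.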

>Nah, you can snapshot every 15 minutes.

obviously you can. but, what is the actual snapshot frequency? like, what is the timestamp of the last known good snapshot? that is what matters.

in any case, the comment you are replying to is a hypothetical, which correctly points out that even a day or two of lost edits is fine (not ideal, but fine). your reply doesn't engage with their comment at all.

> the comment you are replying to is a hypothetical, which correctly points out that even a day or two of lost edits is fine (not ideal, but fine). your reply doesn't engage with their comment at all.

I did engage, by pointing out that it wasn't relevant nor a realistic scenario for a competent sysadmin. (Did you read the OP?) That's a /you/ problem if you rely on infrequent backups, especially for a service with so much flux.

> what is the actual snapshot frequency? like, what is the timestamp of the last known good snapshot?

? Why would I know what their internal operations are?

>I did engage, by pointing out that it wasn't relevant nor a realistic scenario for a competent sysadmin.

>Why would I know what their internal operations are?

i mean... you must, right? you know that once-a-day snapshots are not relevant to this specific incident. you know that their sysadmins are apparently competent. i just assumed you must have some sort of insider information to be so confident.

I think you are misreading my comments and have made a bad assumption. The reason I'm confident is because this has been my bread and butter for a decade.

>The reason I'm confident is because this has been my bread and butter for a decade.

my decade of dealing with incompetent sysadmins and broken backups (if they even exist) has given me the opposite of confidence.

but i'm glad you have had a different experience

Nowadays I refuse to do any serious work that isn't in source control anywhere except on my NAS, which takes copy-on-write snapshots every 15 minutes. It has saved my butt more times than I can count.

Yeah, same here. Earlier I had a sync error that corrupted my .git, somehow. No problem; I went back 15 minutes and copied the working version.

Feels good to pat oneself on the back. Mine is sore, though. My E&O/cyber insurance likes me.

The problem isn't the granularity of the backup. Since the worm silently nukes pages, it's virtually impossible to reconcile the state before the attack with the current state, so you have to forfeit any changes made since then and ask the contributors to do the legwork of reapplying the correct changes.
Why would nuked pages matter? Snapshots capture everything and are not part of wikimedia software.
The nuke might be legitimate?
That's not a lot of state lost. Destructive operations are easier to replay than constructive ones.
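
Toy illustration of why replaying deletions is the easy part, in Python. The dict-of-pages model and the name `replay_deletions` are purely hypothetical, nothing to do with how MediaWiki stores things. A deletion only needs the page title and can safely be applied twice; a constructive edit needs the full new content, which is gone once you roll back.

```python
def replay_deletions(restored_pages: dict, deletion_log: list) -> dict:
    """Re-apply legitimate deletions on top of a restored snapshot.

    `restored_pages` maps title -> content as of the snapshot;
    `deletion_log` lists titles that were deleted on purpose afterwards.
    """
    pages = dict(restored_pages)  # leave the restored snapshot untouched
    for title in deletion_log:
        # pop with a default makes the replay idempotent: deleting an
        # already-missing page is a no-op rather than an error.
        pages.pop(title, None)
    return pages
```

So after a restore you only need the list of titles that were deleted on purpose, not any of their content.
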
Is Wikimedia overreacting then?