These updates most definitely could've been handled better. I was having a busy week with exams and I get a call about 10 or so machines not booting (this was before the announcement). Sure enough, last thing everyone reported was updating. I call the supplier and apparently they have reports of at least 2000 machines (at that moment) that had to be reimaged across the city (from what I could tell, all were older AMD PCs) because of this dumb update.
I was used to not being to able to trust software, but if I can't trust hardware now either, farming does suddenly appear much more appealing.
Trust me, if you don't like the idea of your entire livelihood hinging on weather or not a piece of equipment boots up, farming is NOT the career for you.
Looks like a field ripe for AI to introduce random auto-correct induced typos into paragraphs where a change of word order could be perceived as poetic.
One of the attempts at adding to the list, "Nobody is ever going to sell you goats as a service.", is actually wrong. Companies do rent goats for land clearing[1]. Go for beekeeping, then you too can do "bees as a service"[2].
Yeah, AWS had scheduled reboots that were supposed to happen around the day this was all announced, so we had to scramble to deal with them manually beforehand so we'd ensure our systems booted up properly.
This actually made the system boot, but there are some leftovers being installed on first boot that I've been unable to disable, and they also cause the system to be unable to boot.
So now, the machine is running but as soon as it is restarted we have to re-image the disk, go through the process of manually removing patches, and then pray that we don't have a power shortage as we'd have to do everything yet again on next boot.
I'm not convinced that this patch will solve the issue either, because if this update requires a reboot, the fix won't be installed if we can't boot. I might try to install this update from the recovery console to see if that works.
There is a very fundamental difference between how Unix and Windows view open files:
On Windows, once the file is open, it is that filename that is open; You can't rename or delete it; Therefore, if you want to replace a DLL (or any other file) that is in use, you have to kill any program that uses it before you can do that; And if it's a fundamental library everything uses (USER32.DLL, COMCTL.DLL, etc.), the only effectively reliable way to do that is to reboot.
On Unix, once you have a handle (descriptor) for the file, the name is irrelevant; you can delete or rename the file, and the descriptor still refers to the file that was opened. Thus, you can just replace any system file; existing programs will keep using the old one until they close/restart, and new programs will get the new file.
What this means is, that even though you don't NEED to restart anything for most upgrades in Unix/Linux, you're still running the old version until you restart the program that uses it. Most upgrade procedures will restart relevant daemons or user programs, or notify you that you should (e.g. Debian and Ubuntu do).
You always need a reboot to upgrade a kernel (Ksplice and friends notwithstanding), but otherwise it is enough in Unixes to restart the affected programs.
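A minimal sketch of the Unix behaviour described above, assuming a POSIX system (on Windows the replace step would typically fail while the file is open). The file name libexample.so and the function name are made up for illustration; nothing here is a real library.

```python
import os
import tempfile


def replace_while_open():
    """Replace a file by rename while an older descriptor is still open."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "libexample.so")  # hypothetical file name
    with open(path, "w") as f:
        f.write("version 1")

    # Stands in for a running program that already has the file open.
    old = open(path)

    # The "updater" writes the new version under a temporary name and then
    # renames it over the old name.
    tmp = path + ".new"
    with open(tmp, "w") as f:
        f.write("version 2")
    os.replace(tmp, path)

    seen_by_old = old.read()        # descriptor still refers to the old file
    with open(path) as new:
        seen_by_new = new.read()    # a fresh open resolves the name again
    old.close()
    return seen_by_old, seen_by_new
```

On a POSIX system the already-open descriptor keeps returning the old content, while anything that opens the name afterwards gets the new one, which is why the old version keeps running until the program is restarted.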
> On Windows, once the file is open, it is that filename that is open; You can't rename or delete it;
This is wrong... there's no clear-cut thing like the "file name" or "file stream" that you can specify as "in-use". It depends on the specifics of how the file is opened; often you can rename but not delete files that are open. Some (but AFAIK not all) in-use DLLs are like this. They can be renamed but not deleted. And then there's FILE_SHARE_DELETE which allows deletion, but then the handle starts returning errors when the file is deleted (as opposed to keeping the file "alive").
To make it even more confusing, you can pretty much always even create hardlinks to files that are "in use", but once you do that the new name cannot be deleted unless the old name can also be deleted (i.e. they follow the same rules). This should also make it clear that it's not the "name" that's in use, but the "actual file" (whatever that means... on NTFS I'd suppose it corresponds to the "file record" in the MFT).
The rule that Windows always abides by is that everything that takes up space on the disk must be reachable via some path. So you can't delete in-use files entirely because then they would be allocated disk space but unreachable via any path.
> What this means is, that even though you don't NEED to restart anything for most upgrades in Unix/Linux, you're still running the old version until you restart the program that uses it.
What I expect it also means is that you'll get inconsistencies when doing inter-process communication, since they'll be using different libraries with potential mismatches. Is this correct? Because it seems to me that the Windows method might be less flexible but is likely to be more stable, since there's a single coherent global view of the file system at any given time.
Yes, in principle what you've said about the Unix approach here is correct: if you upgrade one half of a system and not the other half and now they're talking different protocols, that might not work.
But keep in mind that if your system can't cope with this what you've done there is engineer in unreliability. You've made a system that's deliberately not very robust. Unless it's very, very tightly integrated (e.g. two sub-routines inside the same running program), the cost savings had better be _enormous_, or what you're doing is just amplifying a problem and giving it to somebody else, like "solving" a city's waste problem by just dumping all the raw sewage into a neighbouring city's rivers.
Now, the "you can't delete things because then the disk space is unreachable" argument makes plenty of sense for, say, FAT, a filesystem from the 1980s.
But (present year argument) this is 2018. Everybody's main file systems are journalled. Sure enough, both systems _can_ write a record to the journal which will cause the blocks to be freed on replay and then remove that journal entry if the blocks actually get freed up before then. The difference is that Windows doesn't bother doing this.
> Now, the "you can't delete things because then the disk space is unreachable" argument makes plenty of sense for, say, FAT, a filesystem from the 1980s.
Unix semantics were IIRC in place as far back as v7 (1979), possibly earlier - granted, a PDP disk from that time was bigger (~10-100MB) than the corresponding PC disk from a few years later (~1-10MB), but an appeal to technological progress in this particular example case is a moot point.
> But keep in mind that if your system can't cope with this what you've done there is engineer in unreliability
It's weird that you're blaming my operating system's problems on me. "My system" is something a ton of other people wrote, and this is the case for pretty much every user of every OS. I'm not engineering anything into (or out of) my system so I don't get the "you've made a system that [basically, sucks]" comments.
> [other arguments]
I wasn't trying to go down this rabbit hole of Linux-bashing (I was just trying to present it, as objectively as I could, as a flexibility-vs.-reliability trade-off), but given the barrage of comments I've been receiving: I don't know about you, but it happens more often than I would like that I update Linux (Ubuntu) and, lo and behold, I can't really use any programs until I reboot. Sometimes the window rendering gets messed up, sometimes I get random error pop-ups, sometimes stuff just doesn't run. I don't get why it happens in every instance, and there might be lots of different reasons in different instances. IPC mismatch is my best guess for a significant fraction of the incidents. All I know is it happens and it's less stable than what you (or I) would hope or expect. Yet from everyone's comments here I'm guessing I must be the only one who encounters this. Sad for me, but I'm happy for you guys I guess.
> What I expect it also means is that you'll get inconsistencies when doing inter-process communication, since they'll be using different libraries with potential mismatches. Is this correct?
Only if the libraries that use IPC have changed their wire format between versions, which would be a pretty bad practice, so I wouldn't expect that to happen often (if ever).
If something that's already running has its data files moved around or changed sufficiently, and it later tries to open (that is, the app was running but the data file wasn't open when the upgrade happened) what it thinks is an old data file, but is either new and different or just missing, that could cause problems.
> Because it seems to me that the Windows method might be less flexible but is likely to be more stable, since there's a single coherent global view of the file system at any given time.
In practice I've never had an issue with this (nearly 20 years using various Linux desktop and server distros). Upgrade-in-place is generally the norm, and most people will only reboot if there's a kernel update or an update to the init system.
*nixes have a system for handing off to the newer version such as kpatch and kgraft.
kgraft for example swaps off each syscall for a process while it is not being used. This lets the OS slowly transfer to the new kernel as it is running.
kpatch does it all in one go but locks up the system for a few milliseconds.
The version that is currently merged into 4.0+ kernels is a hybrid of the two developed by the authors of both systems.
Runtime kernel patching is a new thing. "*nixes" do not generally have these systems. Some proprietary technologies exist which enable specific platforms to use runtime kernel patching.
> What I expect it also means is that you'll get inconsistencies when doing inter-process communication, since they'll be using different libraries with potential mismatches.
In theory, but Linux systems tend to do very little IPC other than X11, pipelines, and IP-based communication, where the protocols tend to support running with different versions.
In practice you can achieve multi-year uptimes with systems until you get a mandatory kernel security update.
How can you have a multi-year uptime unless you willfully ignore kernel security updates? In this day and age, year-long uptimes are an anti-pattern (if only because you cannot be sure whether your services are actually reboot-safe).
X11 (or other window-related tooling) was exactly what I was thinking of actually, because every time I do a major Linux (Ubuntu) update I can't really launch programs and use my computer normally until I reboot. It always gets finicky and IPC mismatch is the best explanation I can think of.
> What I expect it also means is that you'll get inconsistencies when doing inter-process communication, since they'll be using different libraries with potential mismatches. Is this correct?
At first glance this is true, but you can guard against this in several ways. If your process only forks children, then they already inherit the loaded libraries from the parent as part of the forked address space. Alternatively you can pass open file descriptors between processes. Another option is to use file-system snapshots, at least if the filesystem supports them.
Yet another option is to not replace individual files but complete directories and swap them out via RENAME_EXCHANGE (an atomic swap, available since kernel 3.15). As long as the process keeps a handle on its original working directory it can keep working with the old version even if it has been replaced with a new one.
Some of those approaches are tricky, but if you want to guard against such inconsistencies at least it is possible. And if your IPC interfaces provide a stable API it shouldn't be necessary.
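A rough sketch of the "keep a handle on the original directory" idea, assuming a POSIX system where os.open accepts dir_fd. Python's os module does not expose RENAME_EXCHANGE, so this stand-in uses two plain renames, which leaves a short window where the name does not exist; the atomic exchange mentioned above would close that gap. The directory layout and names are invented for the example.

```python
import os
import tempfile


def read_via_old_directory():
    """Show that a pinned directory handle keeps seeing the old tree."""
    root = tempfile.mkdtemp()
    current = os.path.join(root, "current")        # hypothetical layout
    os.mkdir(current)
    with open(os.path.join(current, "data.txt"), "w") as f:
        f.write("old release")

    # The long-running process pins its directory before the update.
    dir_fd = os.open(current, os.O_RDONLY)

    # The updater stages a complete new tree, then swaps it into place.
    staged = os.path.join(root, "staged")
    os.mkdir(staged)
    with open(os.path.join(staged, "data.txt"), "w") as f:
        f.write("new release")
    os.rename(current, os.path.join(root, "previous"))
    os.rename(staged, current)

    # Lookups relative to the pinned handle still reach the old directory.
    fd = os.open("data.txt", os.O_RDONLY, dir_fd=dir_fd)
    with os.fdopen(fd) as f:
        via_handle = f.read()
    # Lookups by path see the new directory.
    with open(os.path.join(current, "data.txt")) as f:
        via_path = f.read()
    os.close(dir_fd)
    return via_handle, via_path
```

The point of swapping whole directories rather than individual files is that a process never sees a mix of old and new files, as long as it resolves everything relative to the handle it opened at startup.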
> And then there's FILE_SHARE_DELETE which allows deletion
That has some issues when the file is mmapped. If I recall correctly you can't replace it as long as a mapping is open.
> What I expect it also means is that you'll get inconsistencies when doing inter-process communication, since they'll be using different libraries with potential mismatches.
While this is true, I've never seen this to be a problem. If two programs use IPC, they usually use either a stable or a compatible protocol.
To make things even more complicated, you can have two programs, either each in its own container, or statically linked, or with their private bundles of libraries, doing IPC, and then they are free to have different versions of the underlying libraries, while the users still expect them to work fine.
In principle, yes, IPC can fail between different versions of the same software. However, the chances that communication will fail between a new version and some other utility are IMO much higher. A surprise comm failure between two copies of the same software (even different revisions) usually makes developers look pretty bad.
Some versions are known to be incompatible, and most Linux distributions do a very good job of recommending and doing a restart of affected services in a way transparent to users. I have been running Linux at home and at work for years, almost never restart those workstations and, as far as I can tell, have never had problems from piecemeal upgrades. My 2c.
Here is the simplest way I can put it: When you delete a file in NT, any NtCreateFile() on its name will fail with STATUS_DELETE_PENDING until the last handle is closed.[1] Unix will remove the name for this case and the name is re-usable for any number of unrelated future files.
[1] Note that is not the same as your "must be reachable via some path". It is literally inaccessible by name after delete. Try to access by name and you get STATUS_DELETE_PENDING. This is unrelated to the other misfeature of being able to block deletes by not including FILE_SHARE_DELETE.
"Reachable" doesn't mean "openable". Reachable just means there is a path that the system identifies the file with. There are files you cannot open but which are nevertheless reachable by path. Lots of reasons can exist for this and a pending delete is just one of them. Others can include having wrong permissions or being special files (e.g. hiberfil.sys or even $MFTMirr).
This has led to some interesting observations for me on Linux, when really large log files were "deleted" while still in use. (I think cat /dev/null > file will do this.) Tools like du then cannot find where the disk usage actually is; usage only shows correctly again once the app is restarted. Kinda hard to troubleshoot if you were not aware this was what happened.
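The situation above can be reproduced in a few lines. This is only a sketch, assuming a POSIX system and a local temp directory; the function name and result keys are invented for the example.

```python
import os
import tempfile


def deleted_but_open(size=64 * 1024):
    """Unlink a file that is still open and report what the kernel sees."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"x" * size)
    os.unlink(path)                  # "delete" the log while it is still open

    info = os.fstat(fd)
    result = {
        "name_exists": os.path.exists(path),  # the name is gone
        "links": info.st_nlink,               # no directory entry points here
        "bytes_still_held": info.st_size,     # but the data is still there
    }
    os.close(fd)                     # only now can the space be reclaimed
    return result
```

A tool that walks directory names has nothing left to find for such a file, while the filesystem still counts its blocks as used until the last descriptor is closed.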
> So you can't delete in-use files entirely because then they would be allocated disk space but unreachable via any path.
Isn't there a $MFT\\{file entry number} virtual directory that gives an alternate access path to each file? Wouldn't that qualify as "a way to access the file?"
Also, you might say that in practice Linux abides by the same rule - the old file can be referenced through some /proc/$pid/fd/$fd entry.
> What I expect it also means is that you'll get inconsistencies when doing inter-process communication, since they'll be using different libraries with potential mismatches.
That's why you should restart those programs that were using the library. You can find this out via `lsof`.
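The lookup does not have to be done by hand. As a rough sketch, assuming Linux and its /proc/<pid>/maps format (where the kernel appends " (deleted)" to a mapped file whose name has been removed or replaced), something like the following lists processes still running on replaced libraries. The function names and the ".so" filter are heuristics made up for this example, not part of any real tool.

```python
import os


def deleted_mappings(maps_text):
    """Return mapped library paths whose on-disk name was removed or replaced."""
    stale = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.strip().split(None, 5)
        if len(fields) != 6 or not fields[5].endswith(" (deleted)"):
            continue
        path = fields[5][: -len(" (deleted)")]
        # Heuristic: only report things that look like shared objects.
        if path.startswith("/") and ".so" in os.path.basename(path):
            stale.add(path)
    return sorted(stale)


def processes_needing_restart(proc="/proc"):
    """Map pid -> stale libraries, for the processes we are allowed to read."""
    found = {}
    if not os.path.isdir(proc):
        return found                 # not a Linux-style /proc
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "maps")) as f:
                stale = deleted_mappings(f.read())
        except OSError:
            continue                 # process exited, or permission denied
        if stale:
            found[int(entry)] = stale
    return found
```

Run as root after an upgrade, processes_needing_restart() would give a pid-to-libraries map of candidates for a restart; as a normal user it only sees that user's own processes.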
Really? Somehow procure a list of all libraries that were updated in a system update, go through each one, find out which program was using it, and kill that program? Every single time I update? You can't be serious.
In a case where one could get some sort of inconsistency because of different library versions, you restart the applications. That’s the point, that this can be handled by restarting the applications and not the entire operating system.
Heck, on Windows I couldn't even rename audio/video files when playing them in an app, but I can on macOS, without anything crashing (except some stubborn Windows-logic apps like VLC which will fail to replay something from its playlist that has since been renamed, but at least on macOS it will still allow you to rename or move the files while they're being played.)
It's small details like these that make so much difference in daily convenience.
I think a general purpose UX should err on the side of "make it difficult for non-technical users to make mistakes"
Renaming/Deleting files in use is one of those things that us nerds like to complain about, but it makes sense when you think of an accountant that has an open spreadsheet and accidentally deletes a folder with that file. For average non-technical people (on any OS) I would say it makes sense to block that file from being deleted.
I see you’ve never actually experienced it? It is actually more intuitive for the average user, as the file name is updated across every application immediately. In fact, you can actually change the file name from the top of the window directly.
No idea if the OS design plays into this or if it's just an application design convention, but on desktop Linux how it often works (for example with KDE programs) is that if a program has a file open which was moved (including moved to the trash), then the program will pop open a little persistent notification with a button offering to save the content that you still have in the program to the location where the file used to be, effectively allowing you to recover from such mistakes without hindering you from moving/deleting the file.
This doesn't fully explain why a reboot is not required on Linux. If a *nix operating system updates sysfile1.so and sysfile2.so in the way you describe, then there will be some time where the filename sysfile1.so refers to the new version of that file while sysfile2.so refers to the old version. A program that is started in this brief window will get mixed versions of these libraries. It is unlikely that all combinations of versions of libraries have been tested together, so you could end up running with untested and possibly incompatible versions of libraries.
> This doesn't fully explain why a reboot is not required on Linux.
Of course there is a theoretical possibility that this will happen; however, in practice, updates (especially security updates) on Linux happen with ABI-compatible libraries. E.g. on Debian/Ubuntu,
    apt-get update && apt-get upgrade
will generally only do ABI-compatible updates, without installing additional packages (you need 'dist-upgrade' or 'full-upgrade' for that).
Some updates will go as far as to prevent a program restart while updating (by temporarily making the executable unavailable).
Firefox on Ubuntu is an outlier - an update will replace it with one that isn't ABI compatible. It detects this and encourages you to restart it.
All in all, it's not that a reboot is never required for Linux theoretically; it is that practically, you MUST reboot only for a kernel update, and may occasionally need to restart other programs that have been updated (but are rarely forced to).
This generally should never happen. Linux distributions don’t wholesale replace shared objects with ABI-incompatible versions; sonames exist to protect against this very issue.
I had a program with a rarely reported bug that turned out to be exactly this, caused by lazy loading of .so files. Switched to eager loading and it went away.
> On Windows, once the file is open, it is that filename that is open; You can't rename or delete it
It's simple for any application to open a file in Windows such that it will allow a rename or delete while open - set the FILE_SHARE_DELETE bit on the dwShareMode arg of the win32 CreateFile() function. In .NET, the same behaviour is exposed by File.Open / FileShare.Delete.
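For illustration, here is roughly what that looks like when calling CreateFileW through ctypes. This is a sketch only: the constant values are the ones from the Win32 headers, the two function names are invented for the example, and the open path is Windows-only (on other systems it just raises).

```python
import ctypes
import os

# Values as defined in the Win32 headers.
GENERIC_READ = 0x80000000
FILE_SHARE_READ = 0x00000001
FILE_SHARE_WRITE = 0x00000002
FILE_SHARE_DELETE = 0x00000004
OPEN_EXISTING = 3
FILE_ATTRIBUTE_NORMAL = 0x80


def permissive_share_mode():
    """dwShareMode that lets others read, write, rename and delete the file."""
    return FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE


def open_allowing_delete(path):
    """Open `path` for reading without blocking rename/delete by others."""
    if os.name != "nt":
        raise OSError("CreateFileW is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    create_file = kernel32.CreateFileW
    create_file.argtypes = [
        ctypes.c_wchar_p,   # lpFileName
        ctypes.c_uint32,    # dwDesiredAccess
        ctypes.c_uint32,    # dwShareMode
        ctypes.c_void_p,    # lpSecurityAttributes
        ctypes.c_uint32,    # dwCreationDisposition
        ctypes.c_uint32,    # dwFlagsAndAttributes
        ctypes.c_void_p,    # hTemplateFile
    ]
    create_file.restype = ctypes.c_void_p
    handle = create_file(path, GENERIC_READ, permissive_share_mode(),
                         None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)
    # INVALID_HANDLE_VALUE is (HANDLE)-1.
    if handle is None or handle == ctypes.c_void_p(-1).value:
        raise ctypes.WinError(ctypes.get_last_error())
    return handle
```

Passing 0 as the share mode instead would request exclusive access, which is the case where other processes cannot rename or delete the file until the handle is closed.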
That is incorrect. On Windows it just depends on how you call the Win32 API and what parameters you specify. There are many options there; in the end it's just an object in the NT kernel space.
What is really irritating is when, for example, an update that only changes mshtml.dll requires a reboot because a program unnecessarily depends on it. These are not as common as they used to be, though.
> Speaking of which, why do so many things require reboot to update on Windows?
Can't speak for everyone else, but Windows fully supports shared file access, which prevents the kind of file locks that cause reboot requirements.
The problem is that, by default, the common Windows APIs (unless you want to get verbose in your code) have the opening process demand exclusive access to, and a lock on, the underlying file for the lifetime of that file handle.
So unless the programmer takes the time to research 1. that these file-share permissions exist, 2. which permissions are appropriate for the use cases in their code, and 3. how to apply these more lenient permissions in their code...
Unless all that, you get Windows programs which create exclusive file locks, which again cause reboot requirements upon upgrades. Not surprising really.
In Linux/UNIX, the default seems to be the other way around: Full-sharing, unless locked down, and people seem prepared to write defensive code to lock-down only upon need, or have code prepare for worst-case scenarios.
Windows executables are opened with mandatory exclusive locking. So you can't overwrite a program or its DLLs while any instances of it are running. If a DLL is widely used, that makes it essentially impossible to update while the system is in use.
> Speaking of which, why do so many things require reboot to update on Windows?
We are getting there on Linux too, with atomic or image-based updates of the underlying system. On servers you will have (or already have) A/B partitions (or ostrees), on mobiles and IoT too, and some desktops (looking at Fedora) also prefer a reboot-update-reboot cycle, to prevent things like killing X while doing your update and leaving your machine in an inconsistent state.
macOS also does system updates with reboot, for the same reasons.
It's a miracle that reboot-less updates mostly work in Linux. You need to restart services, etc. to make sure they have the latest libs. Gnome3+systemd does that now:
> macOS also does system updates with reboot, for the same reasons.
And speaking from experience, the recent macOS way of applying an update is absolutely insane - on my Macbook Pro (with the stock SSD with 3GB/sec read and 2GB/sec of write) a small system update can take 10+ minutes to install
I used to joke that Windows was alone in this issue, my work laptop being a prime example, but even Apple tends towards reboots more often than not, especially of late. Fortunately both can do it during slow periods, as in overnight, and make updates nearly invisible to users.
So that's a fine question to ask, and you've received many fascinating answers, but can I just suggest that this case - that is, applying patches that relate to the security of your processor cache - is a very fine reason for requiring a reboot, since it will ensure that your processor cache starts out fresh and all behaviors that cause data to be placed there are correctly following the patched behavior.
The main reason is that in Windows executable files and dynamic libraries (.exe and .dll) are locked while a process is using them, while in other systems, e.g. Linux, you can delete them from disk. The only absolute need for reboot should be an OS kernel update (there are cases where a kernel could be updated/patched without a reboot).
I think a better question would be: why does Windows need multiple successive reboots?
Too often my experience can be summed up as: update-reboot-reboot-update continuing-reboot... ad nauseam.
At least on *nix, even when you need a reboot, once is enough.
To force a reinitialisation of all security contexts. Same reason that many websites make you log in again immediately after changing your password (which interestingly Windows doesn’t)
Historically (Windows 95 and earlier) reboots were required to reload DLLs and so on but that’s not really true anymore. Still a lot of installers and docs say to reboot when it’s not really necessary as a holdover from then
I was under the impression that the reason is what u/beagle3 mentions (in a sibling comment to yours): open system files. I'm curious to see your comment on what he describes, as what you mention (reloading some security context) does not seem to be the whole truth. That websites make one log in again after changing your password has nothing to do with this.
> That websites make one log in again after changing your password has nothing to do with this.
No, it is exactly the same principle: something has changed, therefore invalidate all existing contexts. That is far less error-prone than trying to recompute them. What happens, e.g., if a resource has already been accessed in a context that is now denied? Security 101.
I don't see how changing my password changes a "security context". I don't suddenly get more or fewer permissions.
As for logging other places out, that's a design choice. People change their password either because they routinely change it (they either need to or choose to), or because of a (suspected) compromise. In the latter case you'll probably want to log everyone else out (though, who says you're logging out the attacker and not the legitimate user?) and in the former case you shouldn't (otherwise changing your password becomes annoying and avoided). The interface for changing the password could have a "log out all sessions" checkbox or it could just be a feature separate from changing your password.
No, it's not as simple as you put it. No need to condescendingly pass it off as "security 101".
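To make the design choice above concrete, here is a toy sketch of a session store where logging out other sessions is an option on the password change rather than a side effect of it. Everything in it (class, method and parameter names) is hypothetical, and passwords are kept in plaintext only to keep the example short.

```python
import secrets


class SessionStore:
    """Toy in-memory session store; every name here is hypothetical."""

    def __init__(self):
        self._passwords = {}   # user -> password (plaintext only for the sketch)
        self._sessions = {}    # session token -> user

    def register(self, user, password):
        self._passwords[user] = password

    def login(self, user, password):
        if self._passwords.get(user) != password:
            raise PermissionError("bad credentials")
        token = secrets.token_hex(16)
        self._sessions[token] = user
        return token

    def is_valid(self, token):
        return token in self._sessions

    def change_password(self, token, new_password, logout_others=False):
        """Change the password; optionally drop the user's other sessions."""
        user = self._sessions[token]
        self._passwords[user] = new_password
        if logout_others:
            for other, owner in list(self._sessions.items()):
                if owner == user and other != token:
                    del self._sessions[other]
```

A routine password change would leave logout_others off, while a change after a suspected compromise would turn it on; whether the default should be on or off is exactly the disagreement in the comments above.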
Same thing happened to me this weekend, and we are not alone [1]. The worst part is that it's actually the second time on this computer (Kaby Lake i3 on an MB with Intel B250 chipset). I had the same issue in December last year (exact same behaviour with probably an earlier version of that hotfix).
I'm running with the Windows Update service disabled till this is fixed for good!
I have not been able to disable the update service. I'm supposed to be able to, but damn if I don't open my computer in the morning and see everything closed (and lock files all over the place) and all kinds of annoying shit like this.
I actually like Win 10, but it's shit like this that keeps me from becoming a true convert. Oh, for $X00 I can get enterprise update, but IMO that's just Win 10 home being used as ransomware. /rant
That patch disables the use by the kernel of the new IBPB/IBRS features provided by the updated microcode, when it's of a "known bad" revision. Since Linux prefers the "retpoline" mitigation instead of IBRS, and AFAIK so far the upstream kernel (and most of the backports to stable kernels) doesn't use IBPB yet, that might explain why Linux seems to have been less affected by the microcode update instabilities than Windows.
Also interesting: that patch has a link to an official Intel list of broken microcode versions.
> In a related development, there are proposed patches to the Linux kernel (not yet merged) to blacklist the broken microcode updates
Linus probably won't pull it until it's truly known to be stable, because of his attitude towards having decent quality code and not causing needless system instability.
Without Linus... who knows what would have happened by now.
They are on the "tip" tree (https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/...), so they'll probably be sent to Linus as soon as the merge window opens (Linux 4.15 has just been released, so the merge window should open soon). I expect these patches to be on 4.16, and also to be backported to the stable releases (4.15.x and others).
But yeah, upstream Linux kernel development is taking it slow. As far as I can see, variant 3 mitigations (PTI) are already in, variant 2 mitigations are partially in (retpoline) and partially not (the microcode dependent ones), and variant 1 mitigations are still under discussion.
"But yeah, upstream Linux kernel development is taking it slow."
Taking it slow seems very appropriate to me. This seems to me to have been a case of everybody grossly overestimating the short-term portion of the catastrophe, and underestimating the long term.
In the short term, the only people who were going to be plausibly affected in the next three to six months are people on shared hosting of some sort where you may share a server with somebody else's untrusted code, where an accelerated fix is in order, but also something that can be centrally handled. I'm not that worried in the next three to six months that my personal desktop is somehow going to be compromised by either Meltdown or Spectre, and personally, if I see a noticeable performance issue I may well revert the fixes (I'm on Linux), because first you have to penetrate my defenses to deliver anything anyhow, then you have to be in a situation where you're not going to just use a root exploit, which probably means you're in a sandbox or something which means it's that much more difficult to figure out how to exploit this. For most users, uses, and systems, spectre and meltdown aren't that immediately pressing.
Meanwhile, in the long term this may require basically redesigning CPUs to a very significant degree; there is no software patch that can fix the underlying issues. It is difficult to overstate the long term impact of this class of bugs. IMHO the real problem from the jousting match with Linus and Intel last week isn't that Intel's patches today aren't quality code, but that it makes me concerned that they're just going to sweep this fundamental problem under the rug. As I said in another post on HN, I fully understand that remediating this is going to be years, and I don't expect Intel to have an answer overnight, or a full solution in their next "tock". But if they're not taking this seriously, we have a very large long-term problem. We're only going to see more leaks in the long term.
I read somewhere that people have developed POCs of these using JavaScript. At minimum, you'll want to keep your browser up to date as there are mitigations happening there too. Who knew that exposing high precision timers to untrusted JavaScript would be a bad idea?
Apart from browsers, it's fortunately pretty easy to avoid running code you don't trust on your devices.
What I've seen the POCs actually do is not worth running around with your hair on fire over.
Note I did not say there is no reason to be concerned about Meltdown and Spectre... just that for most users, uses, and systems, it's not that important. In the next three-to-six months, if you care about security at all, unless you are already running a tip-top tight operation, your money and effort is better spent defending against the many already-realistic threats, rather than worrying about the vector that may someday be converted into a realistic threat. Meltdown isn't what is going to drag your business to a halt next week; it's that ransomware that one of your less-savvy employees opened while mapped to the unbacked-up world-writable corporate share that has all the spreadsheets your business runs on. At the moment, the net risk of applying the Meltdown fix comfortably exceeds by several orders of magnitude the risk that Meltdown itself poses.
And my point is precisely that for most users and uses, that panic was not justified. Those for whom that is not true (VM hosting companies) already know they need to be more aggressive. There was no point in pushing out patches that nearly bricked some computers.
Exactly. He's very intelligent about how he manages the kernel, which is precisely why it is preferred by a majority of businesses throughout the world, and is the absolute #1 in the supercomputer world, for the top 500:
> However, Intel does not appear too concerned that the incident will affect its bottom line - the company expects 2018 to be a record year in terms of revenue
There is an interesting paradox in our industry. If you pay enough attention (read: money) to security, you will be late to the market, your costs will be high and you lose profit. If you don't pay enough attention, you take the market, get your profits, but your product (be it hardware or software) and reputation will be screwed later. And worst of all: there's never enough attention to security.
So by simple logic, an optimal strategy is to forge your product quickly, take your profits within a [relatively] short period and vanish from the market. I guess we'll see this strategy executed by IoT vendors when the market starts to punish them for their bad security.
For Intel, that "long period" just happened to be REALLY long.
I doubt Intel will see serious punishment in the market. As usual, there will be a lot of wailing and gnashing of teeth but when push comes to shove most people will prioritize nearly everything over security.
All markets work like this. People bitch about the quality of products, but still buy the cheap stuff.
This will be true until something uses Meltdown in the wild to cause massive damage. When a digital superflu comes, businesses and individuals will be faced with a choice: continue to use Intel and be vulnerable to a flu that is literally wiping out businesses, exchanges, hospitals, etc., or replace ALL of their hardware with AMD.
Interestingly, I think AMD has a lot of motive to create such a superflu, or at least encourage its creation.
Your conclusion isn't in agreement with the section you quoted, so are you saying that Intel will be punished by the market in the mid to distant future (after 2018)?
"Here's a patch" - "Here's a patch to disable that other patch" - ...
What's next? Repeat? Sounds like this could turn into a maintenance nightmare quickly. I say that because I've introduced things like that myself in the past, and that was for normal applications and not a kernel or OS. Somewhere, someday, there's usually this one exception for which none of your rules hold true and the thing blows up in your face.
Anyway, I'd love to see the actual code for this. Not a chance probably?
I'm really wondering: they had more than 6 months to do these patches and they did not bother testing on a good number of systems? It's not like MS + Intel don't have enough money to buy a few thousand testing machines and get some testers on it.
Have you released a bugfix to a large application? I've had one line fixes break some use case I hadn't even heard of before, and it doesn't always show up right away, either. Intel's fix has to work on every application in every version of Windows, macOS, Linux, for multiple versions of processors with multiple different chipsets. And it has to be done yesterday. That's a nightmare scenario.
I think Spectre may have appeared later, after Meltdown? Remember the investigations into what's possible were proceeding in parallel with the attempted fixes.
Also, CPU design changes take a long time. 6 months may seem a long time from the perspective of HackerNews node.js type hackers, but it's a bit harder to patch decades' worth of CPU microcode than a website.
I get that it may take a long time (it would have been fine even if the patches took a few more days). What I don't get is that they released it to production (server) environments seemingly without testing. Surely even rudimentary testing, such as deploying on a few thousand different server platforms for at least a few hours, should be something that Intel does for all microcode updates. After all, they are rather more important than Node.js packages, as you point out.
I haven't heard of microcode updates that hurt stability before. Presumably the collapse of the embargo caused them to do an accelerated release, skipping their usual long testing cycle.
Currently, they are not patching a decade's worth of CPU microcode, since we have zero working microcode. And the previously released microcode only went back to Ivy Bridge EP (~2014).
I have a feeling there's going to be a lot of demand for future Intel hardware that's immune to Spectre and Meltdown. I think it might cause _more_ sales of Intel chips in the future, not fewer.
Regardless of sales, I would think damage to their image would be a concern. Maybe it's not a concern to investors because their "PR nightmare" turned out to be softball for them, and it's been hard to pin anything on Intel when they keep pointing fingers in all directions.
I think it's our responsibility as technology literate folks and decision makers to explicitly highlight their failures so that mistakes and poor handling like this are not normalized.
I tend to think it's because the folk who trade with stock very well know that these "issues" are actually features.
I can also bet that the "grilling" they get from the US government is not about the mess the chip flaws are causing; it's about why the "flaws" were publicly announced.
I know in my company this has indeed resulted in a full stop in buying Intel and going AMD instead. There is also an active "project" to replace the Intel servers.
I work in a bank and they are terrified of the possibility of user processes reading privileged memory. Not necessarily out of actual fear but out of the insane amount of paperwork this will require to satisfy the auditors that it is still safe.
Anecdotal, but you asked for "someone" and here is someone :)
Not by Meltdown which enabled user processes to read kernel memory.
And as we've seen in the aftermath, they have not nearly as much trouble with their patches for Spectre as Intel has had.
Well, Linus already said perhaps Linux should look at the ARM folk. Should that happen, well, guess what?
From my very limited knowledge, 90+% of internet infrastructure runs on Linux.
The Meltdown/Spectre class of attacks affects certain CPUs. Spectre is a microarchitectural attack on any CPU that does speculation and uses data caches, regardless of architecture.
Arguably, it has been handling Meltdown and Spectre in a much much more humble and transparent way. See https://developer.arm.com/support/security-update/compiler-s... for work they are doing with the compiler communities to address the Variant 1 both on current and future chips.
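One software-side approach to Variant 1 (bounds check bypass) is to clamp the index with arithmetic instead of relying only on a branch, similar in spirit to the Linux kernel's `array_index_nospec`. Python exposes nothing about speculation, so the sketch below only illustrates the masking arithmetic; `index_mask`, `clamped_read`, and the 64-bit word size are assumptions for the example, not ARM's or any compiler's actual output:

```python
BITS = 64  # assumed machine word size for the illustration


def index_mask(index: int, size: int) -> int:
    """Return -1 (all ones) if 0 <= index < size, else 0, without branching.

    Valid for non-negative index and size below 2**63.
    """
    return ~(index | (size - 1 - index)) >> (BITS - 1)


def clamped_read(table: list, index: int):
    """Bounds-check as usual, then mask the index before using it."""
    if 0 <= index < len(table):
        # In native code, the mask forces an out-of-range index to 0 even if
        # the CPU speculates past the bounds check above.
        return table[index & index_mask(index, len(table))]
    return None
```

In real code the point is that the mask is a data dependency rather than a prediction, so a mispredicted bounds check can at worst read element 0.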
Spectre affects AMD, too, so there's no competitor to run to... and CERT was saying at one point that only new processors would fully fix it. They're looking at everyone needing to buy a bunch of replacement products, aren't they?
Intel has had literally months upon months to test and stabilize their microcode and kernel-side patches for Meltdown/Spectre... but it seems like Intel just doesn't give half a shit, if they're having major issues now.
Intel's actions seem to shout that they have cared far more about releasing Kaby/Skylake X and Coffee Lake in short order, as a response to Ryzen/ThreadRipper, than actually really digging into fixing their major security flaws. Their actions speak of them preferring to keep their market and mindshare over actually fixing any security issues.
Intel is still so deeply entrenched that they likely believe that they can get away with their lazy approach. They make millions upon millions, if not billions of dollars ~ why should they give a shit, when their monopoly and half-hearted attempt at a solution will get them by? Intel is being strangled by their shitty management, seemingly...
By my reading of the article, Microsoft is disabling some mitigations for Spectre due to instabilities that Intel's microcode update has been causing.
Intel certainly isn't making any friends these days...
I'm eyeing Threadripper for my next build but beyond that I'm going to choose a motherboard vendor based on the level of support they offer in this scenario. Some observations that I'm making:
* How promptly did they address the issue via official channels? I.e., did they leave users in the dark as they appealed to vendors in their forums (hint: most of them seem to have gone down this route), or did they share updates directly on their official sites, social media accounts, etc.?
* Did they provide some estimates as to when users could expect patches?
* How much of their product catalogue were they willing to cover with security updates? Since this is a unique security issue with high impact I would have expected them to cover motherboards at least 4-5 years old.
MSI seems [1] to be fairly proactive right now with patches going back to several X99 motherboards. Asus for example has so far only committed to provide updates for two X99 motherboards.
AMD equipment should be fine, current-gen Ryzen/Threadripper is more than adept at workstation tasks and next-gen Ryzen (named Ryzen 2 and Threadripper 2) will edge out any advantage that Intel's CPUs have.
I have my macbook for work and programming, my PC has windows on it (much to my dislike) and I use it for occasional gaming. I will probably not get new parts anytime soon though as performance is currently fine. Just wondering for when someone asks me to build them a PC.
Well, I built my PC with gaming in mind and chose the i7 8700k. But I got it a week before the Spectre/Meltdown spectacle. I decided to keep it because of its superior single-core performance.
I've not been impressed with my Windows 10 installations of late. All my machines that don't have the Long Term Servicing Branch have had wild instabilities and performance issues the past few months - crazy things, like the task manager taking minutes to launch, and the whole shell periodically crashing. The Fall Creators Update was so bad I had to wipe and start over on some boxes. It's not engendering a lot of confidence in their competence of late.
I never got them. The last update in my windows is from Dec 2017. My antivirus is compliant, the registry key correctly set up and yet it refuses to update.
I still haven't had the time to debug it, but I wonder how many people are out there with their OS silently refusing to update.
I had huge problems with Win 10. Updates would fail and install again and again without actually getting installed. Sometimes I would get an opaque error number, but web searches revealed nothing for that number, and it was rare that I would be able to find even an error number. I don't do Windows, and just installed it for VR, and didn't spend that much time in Windows, so I would spend 15-30 minutes a month looking into it, before realizing I had spent more time debugging than in VR that week.
After probably 9 months of this, and with Windows doing ever more intrusive pop overs whenever I launched it for updates that don't take, I wiped all boot sectors everywhere and installed from scratch. That seemed to work, but it was incredibly frustrating that the boot process was so buggy as was error reporting. I've never encountered a situation like it in the past 15 years of heavy Linux use. Problems there are usually solvable with a couple web searches, even for extremely obscure kernel bugs with obscure packages. Windows refused to tell me anything as did the web.
I built a Windows machine for 3D work and VR just over a year ago, after being a Mac only user for 15+ years. Honestly my Win 10 experience has been the total opposite, it's been stable, fast, minimum update nagging. Overall I've actually been shocked how stable and hassle free the experience has been.
Maybe I just got lucky with the right combination of hardware.
My brother was hit by a recent update to Windows 7 that prevented the machine from booting. He went to Microcenter to buy a hard drive. There were a lot of people doing the same thing for the same reason when he was there.
I don't use the windows side of my machine very often, but decided to update it last night. Booted fine (OS on SSD), but one of the HDDs with all of the windows files was corrupted. No go with ntfsfix, chkdsk, partition table destroyed. Reformatted it as ext4 and windows doesn't get to touch it anymore. Haven't tested it too much yet but seems to be working fine.
Intel was called out by Linus Torvalds several days ago for the crappy fixes they delivered for GNU/Linux. I would be very surprised if Intel actually shipped proper fixes for Windows. It's a shame, really.
The “dude” is also probably working under an insane amount of pressure and being made to feel like he is somehow responsible or at fault for the whole situation. Best not to make it personal from the peanut gallery.
So what you are saying is that Intel didn't bother to support Linux even with a crappy fix, like they did for Microsoft? Good to hear the Chinese are well supported though.
That is not what I said and not what the situation implied. In a situation where you have basically zero factual information, you went out of your way to make up the most damaging possibility you could.
I doubt some of the best/most popular players of the USA-tech-industry dream team will get any real punishment on their own soil. Any fines will give a sense of justice to the public but it will just be peanuts.
I wonder how much all this cleanup will cost in hours, downstream, for all the installed users? Judging by all the grief on this thread it's substantial.
Just in case you, like me, missed the memo where Microsoft said they'd stop supplying security updates if you have no AV / AV incompatible with the patches installed. The fix to the former is creating the registry entry manually.
In cases where customers can’t install or run antivirus software, Microsoft recommends manually setting the registry key as described below in order to receive the January 2018 security updates.
Sounds like Microsoft can't tell the difference between "has AV installed that will break" and "has no AV installed", which makes sense. It's probably infeasible to reliably fingerprint all existing AV software.
> Sounds like Microsoft can't tell the difference between "has AV installed that will break" and "has no AV installed", which makes sense. It's probably infeasible to reliably fingerprint all existing AV software.
For something like this, I think best-effort bad-AV detection would have been best. Seems pretty insane to disable security patching because they can't be 100% certain that you have a compatible AV.
It makes sense though. Only AV programs that comply may set the setting. Without a compliant AV program, there's nothing to set it, unless you do it manually.
Microsoft does not have any way of knowing whether you have an antivirus or not, and because the Spectre patch causes a bluescreen on boot if you have an antivirus that's not updated, they require the antivirus to set the registry key to say "hey, it's safe to update". Absence of AV means that registry key doesn't get set.
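To make the gating concrete, here is a small Python sketch that models the check and generates the `.reg` text for setting the flag by hand. The key path and value name are what I recall from Microsoft's January 2018 advisory; treat them as assumptions and verify them against the advisory before relying on this. The `update_offered` and `build_reg_file` helpers are hypothetical and only model the behaviour described above:

```python
# Key path and value name as recalled from Microsoft's January 2018 advisory;
# treat both as assumptions and verify against the advisory before use.
QUALITY_COMPAT_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat"
)
COMPAT_VALUE_NAME = "cadca5fe-87d3-4b96-b7fb-a231484277cc"


def update_offered(registry: dict) -> bool:
    """Model of the gate: the update is offered only if the flag exists and is 0."""
    return registry.get((QUALITY_COMPAT_KEY, COMPAT_VALUE_NAME)) == 0


def build_reg_file() -> str:
    """Return .reg file text that sets the flag manually (e.g. on a machine with no AV)."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{QUALITY_COMPAT_KEY}]",
        f'"{COMPAT_VALUE_NAME}"=dword:00000000',
        "",
    ]
    return "\r\n".join(lines)


no_av_machine = {}  # nothing installed that would set the flag
patched_av_machine = {(QUALITY_COMPAT_KEY, COMPAT_VALUE_NAME): 0}
```

In this model `update_offered(no_av_machine)` is false and `update_offered(patched_av_machine)` is true, which is exactly the "no AV looks the same as bad AV" problem: the gate sees only the flag, never the reason it is missing.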
MS doesn't provide an easy, GUI way of disabling built-in Defender by the way. If you 'disable' defender by using the control panel on windows 10, it only stops its activity temporarily and it can reactivate itself after 24 hours or something like that. You can permanently disable it through registry keys but it's not an officially supported, accepted method to edit the registry by yourself. There's a group policy for 10 Pro and other corp editions though.
For a normal home user, Defender is never fully disabled. It will deactivate itself if you install a third party antivirus, and reenable itself when you uninstall them. Bottom line, the average user is not supposed to be AV-less.
Nah, it was more than that: The patches do things like add the garbage MSR writes to the kernel entry/exit points. That's insane. That says "we're trying to protect the kernel". We already have retpoline there, with less overhead.
Yes, but as far as I know Linus has made no comment on the microcode patches, so mm-vorticesoft is probably referring to the Spectre patches in general.
His overall point was bewilderment at the incompetent and nonsensical patches that were being given as "fixes" in this issue. Linus was pointing out a particular instance of that, but this news and other behaviour from Intel seems to indicate this is part of an endemic, cultural, administrative issue inside the company.
OP is probably referencing the 'bullshit patches from Intel' comment from Linus about the patches they were sent, and that Microsoft might have been sent similar obfuscatory patches.
It's not the same company - David Woodhouse works for Amazon. He used to work for Intel but not for a year or so.
It's also not the same reason. Linus doesn't like the mitigation in the kernel, disagreeing on how Intel intends to implement it. This article is about unstable microcode patches that Intel retracted, and that retraction has been discussed on here a few times. The article is just exceedingly bad at describing the actual issue. It also doesn't help that the kernel mitigation depends on new flags introduced by the faulty microcode update, but the update being faulty is orthogonal to Linus' opinion.
The day this blew up we rented our first physical server for the express purpose of running secure critical workloads in unpatched environments. Yes, I know that there is nothing secure, but not everything we do is running a chunk of logic uploaded by an attacker, so we will take our chances.
What does the Spectre bug mean for a person planning to buy a new windows computer? Should I buy an AMD CPU based computer instead of an Intel based computer?
A more accurate title would be that Microsoft disabled the Spectre mitigations due to a flawed Intel update, right? I thought this was all Microsoft's fault until getting halfway through the article.
It's really telling that even Linus Torvalds was not happy with their "fixes" and now Microsoft isn't either. Intel needs to start taking the situation completely seriously, because their actions don't imply they are.
Vote with your wallet. That's really the only thing that you can do. Intel is too comfortable in their position as market leader. Until they start to feel some pressure, they have shown they don't really care. I know AMD is not a perfect company either, but I elected to buy a Ryzen processor for my upcoming build. People need to at least consider the competition without defaulting to "I need a processor, I buy the latest Intel chip."
Yes, I know that most people don't build new PCs or upgrade their processors that regularly. Yes, I know that many people don't have much of a choice because they have some requirement that currently ties them to Intel. However, those that do have that choice should remember this debacle the next time they are buying a CPU/system (even if they are not doing it for a while). Intel is hoping they can sweep this under the rug; we can't let them until they make amends. Do not buy an Intel chip until they've proven they will do better. I am not endorsing AMD either. You can still vote with your wallet by not buying anything at all. If enough people put off their upgrade, it would put a dent in Intel's bottom line. Things won't change until it hurts them financially.
I understand your point, and agree with the spirit of it, but a few consumers buying Ryzen chips isn't going to make one bit of difference. A couple data centers buying dozens of racks of them, however, would be more measurable. Hit em in the B2B not the B2C.
I agree with you, and you are right. However, most people aren't making decisions about what types of chips to use in a data center. It would be wonderful if the people in those positions explored non-Intel options. For the average consumer, all we can do is choose which company we buy a CPU from every few years. People buying Ryzen chips incentivizes AMD to keep making chips and stay in the market. Competition is good for consumers. I totally get that it is a lot more complex than that, but I personally feel like it's the best we can do as the average consumer.
Before Intel Core processors became the standard hot processor I was an AMD guy. I'm heading back towards that route. This means no Macbook or Surfacebook for my next development laptop for me. If anyone wants my money they better build a developer worthy laptop with an AMD processor and a sweet AMD graphics card. Also AMD is working on providing open source GPU Vulkan drivers.
For the longest time there was no practical alternative, but nowadays... AMD is back! The whole Ryzen lineup has turned out to be pretty good. You may even save some money in the process.
Have AMD processors been confirmed as not vulnerable? As I recall the original investigation only covered Intel processors, but hypothesized that AMD would be affected as well, as they more or less have the same fundamentals around branch prediction.
This is why I always preferred AMD. They gave you either more or the same bang for your buck. I hope they don't slack off on pushing to innovate ahead of Intel now that they're "even" in a sense.
I don't think that's true. Meltdown and Spectre are sub-ISA issues, so you could have a RISC-V implementation with them if it handled caching and speculation similarly.
Ugh, is this the cause of the weird bugchecks I've been having this week? Just gave myself a 64 GB page file and enabled full memory dumps so I could track it down in WinDbg. I always forget something on fresh installs...