Linus Tech Tips recently posted a video about building the PERFECT Linux PC with the actual Linus Torvalds:
- full entertaining video: https://youtu.be/mfv0V1SxbNA?si=AkLZOlwuf3nKqZ7F
- just the short part talking about ECC: https://www.youtube.com/watch?v=mfv0V1SxbNA&t=510s
---
So Linus Torvalds (and of course our very own MUUG vice-president Trevor Cordes) need ECC when their computers are responsible for bisecting or merging their own kernels.
I get that.
But can't the rest of us non-ECC memory people simply reset our computers and continue on?
Debate started...
:)
On Sun, 30 Nov 2025, at 18:43, Bradford C. Vokey wrote:
But can't the rest of us non-ECC memory people simply reset our computers and continue on?
I view ECC as a critical step to reliability at multiple levels. Today's OSs keep significant amounts of data in RAM so they don't have to go to disk as often. If a memory error goes undetected, then when that RAM gets flushed to disk, corrupted data has just been written to disk. The more RAM you have, the greater the probability that something will go wrong. Higher-density (and typically lower operating voltage) chips (7nm for example) hold a smaller charge in each memory cell, which increases the probability of an error. IMHO, if you care about your data, you want ECC.
-- David Milton
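To put the "more RAM, more risk" point in numbers, here is a minimal sketch. It assumes bit flips arrive independently at a constant rate per GB (a Poisson model), and the rate used below is a made-up placeholder, not a measured figure; the function name and parameters are hypothetical. Only the shape of the curve matters: the chance of at least one flip climbs as the amount of RAM grows.

```python
import math


def p_at_least_one_flip(ram_gb: float, flips_per_gb_per_year: float,
                        years: float = 1.0) -> float:
    """Probability of one or more bit flips, assuming independent flips
    at a constant rate (Poisson process)."""
    expected_flips = ram_gb * flips_per_gb_per_year * years
    return 1.0 - math.exp(-expected_flips)


# ASSUMPTION: purely illustrative rate, NOT a measurement of any real DIMM.
ASSUMED_RATE = 0.05  # flips per GB per year

for gb in (8, 32, 64, 128):
    print(f"{gb:>4} GB: {p_at_least_one_flip(gb, ASSUMED_RATE):.1%}")
```

Swap in whatever rate you trust for your own hardware; the model itself is the simplification here.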
I buy laptops that can take ECC memory. Currently that means a Lenovo P-series. Refurbished, not new, as I'm not made of money... -Adam
Roundtable mailing list -- roundtable@muug.ca
To unsubscribe send an email to roundtable-leave@muug.ca
I totally ECCo the sentiments expressed by Adam and David, especially the tech details by David, that more RAM, denser circuits, and lower voltages all contribute to a greater probability of memory errors (flipped bits). My two HP laptops each have 64 GB RAM (non-ECC), leaving me quite nervous. They are a semi-consumer model, which I chose because reasonably-priced HP ProBook and EliteBook laptops had a motherboard limitation of 32 GB RAM (and might not even do ECC either, I don't know).
Hartmut
I don't think HP has ever released a notebook (or even "portable workstation") that supported ECC, but please don't consider that statement to be 100% authoritative. Both Dell and IBM/Lenovo have, but neither has consistently had such a model in production at all times.
For an example of the importance of ECC, we seemingly have only to look at last week's Airbus Emergency Airworthiness Directive regarding bit-flip susceptibility in one of the flight computers on certain models of A320: that corruption caused a major flight control upset in mid-air, with either 17 or 77 people (depending on which article I believe) needing immediate medical attention. It seems they don't use ECC; instead they use some software refresh technique to mitigate bit-flips, but accidentally disabled it in the latest update. (Go read the coverage yourself if you want complete & accurate details.) MS isn't the only company that releases buggy patches :-(.
-Adam
Ha! I was going to mention the Airbus A320 matter in my post, but I didn't. As soon as I read that it may have been cosmic rays, I thought flipped bits.
And yes, hardware ECC is always superior to software substitutes, and for mission-critical applications like passenger aircraft, it would be good to use both.
Back in the day when I was assembling desktop computers for myself and for customers, I always chose Asus and ECS (Elitegroup Computer Systems) motherboards that supported ECC RAM, and I always installed ECC RAM on those motherboards. I came from the IBM System/360 background, which famously did full-fledged ECC, and I automatically carried that mindset forward.
Hartmut
On 2025-12-01 Hartmut W Sager wrote:
Back in the day when I was assembling desktop computers for myself and for customers, I always chose Asus and ECS (Elitegroup Computer Systems) motherboards that supported ECC RAM, and I always installed ECC RAM on those motherboards. I came from the IBM System/360 background, which famously did full-fledged ECC, and I automatically carried that mindset forward.
IBM has Chipkill, which I wish were available on non-"Big Iron" hardware. But in general ECC will stop most errors, especially cosmic-ray ones. (Flawed memory modules with entire rows blowing up are a bit different.) The odds of the system not at least detecting a (low-)multi-bit-flip error are astronomically low.
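For anyone curious how "correct one flip, detect two" works, here is a toy sketch: an extended Hamming(8,4) code protecting 4 data bits with 4 check bits. Real ECC DIMMs typically apply the same idea to 64 data bits plus 8 check bits; this is an illustration of the principle, not how any particular memory controller is implemented, and the function names are my own.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits as 8 bits: index 0 is the overall parity bit,
    indexes 1..7 are the classic Hamming(7,4) positions."""
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]  # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2            # makes total parity even
    return code


def decode(code: List[int]) -> Tuple[str, Optional[int]]:
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                # XOR of the positions holding a 1
    overall_ok = sum(code) % 2 == 0
    fixed = list(code)
    if syndrome == 0 and overall_ok:
        status = "ok"
    elif not overall_ok:
        # Odd number of flips: treat as a single flip and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        fixed[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return "uncorrectable", None
    nibble = fixed[3] | (fixed[5] << 1) | (fixed[6] << 2) | (fixed[7] << 3)
    return status, nibble


word = encode(0b1011)
word[6] ^= 1          # simulate a single cosmic-ray flip
print(decode(word))
word[2] ^= 1          # a second flip in the same word
print(decode(word))
```

Note that three or more flips in one word can fool a code like this, which is part of why schemes such as Chipkill exist.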
As a reminder, it is possible today to buy ECC on consumer-grade hardware without having to splurge on Xeon or similar. And you can do it and still get an x16 slot for a real video card. You just have to buy the "right" Ryzen mobo and CPU -- I'm using one to type this email. And in a shameless act of plugging, my own company specializes in selling such hardware -- so ask me about it at a meeting if you want to get one for yourself! :-)
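If you do end up with such a box running Linux, one way to see whether ECC is actually reporting anything is the kernel's EDAC counters in sysfs. The sketch below assumes the usual layout (`/sys/devices/system/edac/mc/mcN/ce_count` and `ue_count`) and that an EDAC driver for your memory controller is loaded; the function name is hypothetical. An empty result does not prove the machine lacks ECC, only that no EDAC controller is exposed.

```python
from pathlib import Path


def read_edac_counts(edac_root: str = "/sys/devices/system/edac/mc") -> dict:
    """Collect corrected (ce) and uncorrected (ue) error counts for each
    memory controller. Returns {} if the directory does not exist."""
    counts = {}
    root = Path(edac_root)
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None  # file missing or unreadable
        counts[mc.name] = entry
    return counts


result = read_edac_counts()
if not result:
    print("No EDAC memory controllers exposed (driver not loaded, or no ECC).")
for mc_name, entry in result.items():
    print(f"{mc_name}: corrected={entry['ce_count']} uncorrected={entry['ue_count']}")
```

A slowly rising corrected count means ECC is doing its job; any uncorrected count is worth investigating.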
On 2025-12-01 Adam Thompson wrote:
For an example of the importance of ECC, we seemingly have only to look at last week's Airbus Emergency Airworthiness Directive regarding bit-flip susceptibility in one of the flight computers on certain models of A320: that corruption caused a major flight control upset in mid-air, with either 17 or 77 people (depending on which article I believe) needing immediate medical attention. It seems they don't use ECC; instead they use some software refresh technique to mitigate bit-flips, but accidentally disabled it in the latest update.
Holy crap! The stupidity (and parsimony) of some companies is astounding. They need to hire technology people who have been MUUG members...
It's all the more stupid because as you get closer to space you get more cosmic ray memory errors because of the thinner atmosphere.
"some software refresh technique to mitigate": ya right. I'd love to see the paradigm/algorithm for that. What happens if the code that does the "software refresh" itself gets corrupted? Hitchhiker's Guide level silliness.
Great... now we get to choose between a fundamentally flawed 737 Max (now renamed to hide that it's still a Max) and non-ECC Airbuses. The scene from Red Dwarf where they are crashing onto a planet and read the boring in-flight magazine between their legs comes to mind.
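On the "I'd love to see the paradigm/algorithm" question: nothing in this thread says what Airbus actually does, so the following is only a sketch of one well-known generic software technique, namely keeping three copies of each value, reading by majority vote, and periodically "scrubbing" the copies back into agreement. The class and function names are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority: each output bit is the value held by at least
    two of the three inputs."""
    return (a & b) | (a & c) | (b & c)


class TripleStore:
    """Holds three copies of every value; a flip in one copy is outvoted."""

    def __init__(self, values):
        self.copies = [list(values) for _ in range(3)]

    def read(self, index: int) -> int:
        a, b, c = (copy[index] for copy in self.copies)
        return majority_vote(a, b, c)

    def scrub(self) -> int:
        """Rewrite any copy that disagrees with the majority.
        Returns the number of copies repaired."""
        repaired = 0
        for i in range(len(self.copies[0])):
            good = self.read(i)
            for copy in self.copies:
                if copy[i] != good:
                    copy[i] = good
                    repaired += 1
        return repaired


store = TripleStore([0b1010, 7])
store.copies[1][0] ^= 0b0100   # simulate a bit flip in one copy
print(store.read(0) == 0b1010, store.scrub())
```

This does not answer the objection above: the voting code and the scrub loop live in the same unprotected memory, and a scrub that never runs (for instance because an update disabled it) protects nothing.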