View Full Version here: : Dual Xeons - a processing platform I am playing with
g__day
16-11-2024, 11:49 AM
Living 13km north-east of Sydney there is not much I can do about our night-time skies and light pollution - but I did want to experiment with what one can achieve by integrating massive amounts of data on a target. To do that in a sensible amount of time requires a lot of compute and I/O power.
So it got me thinking about my options - and I could see four obvious paths:
1. Upgrade my existing workstation from an 8-core Extreme i7 to a 16-core i9-10980XE
2. Build an AMD Ryzen Threadripper 3990X based rig (64 cores)
3. Buy an old HP Z640 dual Xeon E5-2699 v3 (36 cores)
4. Rent a massive AWS compute and I/O cluster whenever I want to stack and integrate shots
The first option I may do down the track - the chips aren't so common, and finding one plus potentially upsizing my water cooler for the larger, hotter CPU meant this path would probably cost $1,400.
The second option is likely the most powerful - the CPU alone costs around $5K (it's basically a baby EPYC) - but a build down that path would land in the $8K - $10K range - a bit too much for a processing hobby.
Option 4 was a bit tricky to justify - there would be a lot of data to move up and down from the cloud, which would require considerable time and internet bandwidth - and I would need not just very high compute but very high IOPS too - so lacking the experience I discounted this option.
So that left option 3 - and during a road trip to the Sunshine Coast over the past two weeks opportunity struck. The TechFactory just before Brisbane had an HP Z640 Win 10 Pro based dual Xeon with 36 cores and 72 threads, a Quadro P4000 graphics card (roughly equivalent to an NVIDIA GTX 1070), a 1TB SSD, 8TB HDD and 256GB RAM for $1,900, so I jumped on it.
I also picked up a 27" 2K monitor, keyboard and mouse for $200, and my wife got a great Mac Air - so it was a real fun visit to a place that re-purposes end-of-life high-end servers, switches and workstations. It was a real geek's playground - they had military-grade equipment everywhere.
So I got back home and set it all up. First task was to improve I/O - so I added an old ASUS Hyper M.2 x16 Gen 4 PCIe card that I had bought years ago and never used (for lack of PCIe x16 lanes). To this I added 4 x 2TB MP600 Elite drives and put them in RAID 0 as my working space - an 8TB scratch file space. All up that is another $1K in gear. The Z640 BIOS makes it easy to bifurcate a PCIe x16 slot into 4x/4x/4x/4x, which you need to see all four M.2 drives and then stripe them under Windows Disk Management into a dynamic disk. The only loss is that the Z640 is PCIe Gen 3.0 only, whereas the drives and Hyper card are Gen 4 - so although each drive is rated at 7,000 MB/sec under Gen 4, this halves under Gen 3. But in RAID 0 scores above 8,000 MB/sec are commonplace, which makes for blindingly fast I/O.
All that was left to do was install the latest NVIDIA Quadro P4000 drivers, install PI, and install GPU acceleration for PI using CUDA. These CPUs are not approved for Windows 11 - so I am stuck on Win 10 Pro for a while.
Testing PI's WBPP showed interesting results - PI only uses half the processor cores. It sees 72 logical cores and schedules work for that many - but Windows seems to dispatch all the work to just one of the CPUs - the first NUMA node of 36 logical cores - while the rest just sit idle.
So at 50% CPU load this new workstation took about 3 hours to stack and integrate 730 subs - a job that took my old workstation 4 hours. If half the processors can do that, I will be really keen to see what the rig can do when fully loaded! It is also super quiet - at full load it is basically silent and cool. The PI guys are working on release 1.9, due out in the next few weeks, and are trying to investigate why my rig and a few others only use half the available processors (and this is only a Windows behaviour - on Linux all processors get used).
I also saw very unusual CPU scheduling behaviour with the RC-Astro Xterminator suite. My old workstation could process an image with BlurXterminator in about 3 minutes on CPU and 20 seconds on GPU; the new one took 40 minutes on CPU (with only 5-6 cores at about 5% load) but on GPU it did it in under 40 seconds - a 60-fold improvement - pointing to something really off with the workload dispatcher.
So for anyone thinking of having a dedicated astro processing rig - a used HP Z640 of the right configuration, core count and memory size can be nought for a real steal nowadays!
rustigsmed
17-11-2024, 12:01 PM
looks like a fun project Matthew - so many cores will be interesting to see how it runs when all cores are up and running properly. might be worth dual booting to see the linux performance - it is generally higher scoring. the system will be great in core heavy tasks!
Leo.G
17-11-2024, 04:32 PM
Did you mean "Bought" or "Nought"?
I went down this path several years back with an old IBM desktop server with twin Xeon processors, nowhere near as much RAM and I can't remember which graphics card. My son at the time had his Lenovo home server with a much later Xeon processor and more RAM and it was by far quicker, less noise and a lot less electricity.
In saying that, my son has a HP C7000 blade cabinet fully populated, including one double-sized unit with 4 x Xeon processors and a few disk shelves, along with a 48-tape library - but we haven't played with any of it for processing. It's all large equipment and not so cheap to run.
One thing, Registax from memory was the one program which always threw up an exception error if I tried to run more than one core for processing and it didn't matter which version I tried (or it could have been DSS, I can't remember).
We used to buy a lot of cheap server gear from Grays auctions going back a few years. I've always wanted to try one of the smaller HP units or the near-identical machine from the other brand whose name currently eludes my pathetic brain.
The same with my sons Lenovo home/small business server (S30), it was great for everything including gaming when we first got it.
I have a D30 which takes slower dual processors but it has a dead BIOS my son was re-writing for it, he may still finish it one day. It's a HUGE machine though.
We suffer from small house syndrome, too much junk, not enough house.
AlexN
18-11-2024, 10:26 AM
I have a HP Z-640 too that is my PI machine.
Its specs are
Dual Xeon E5-2680 v4's (28c/56t total)
256GB DDR4
12Gb nVidia Titan Xp.
512GB SSD for boot/windows
Pci-e riser with 4x2tb nvme SSD's for storage/processing swap etc.
Including the Titan Xp and NVMe drives etc it was less than $1200... and in PI, it's ridiculous. WBPP full pipeline including normalisation, calibration, image solving, integration and drizzle integration usually takes less than 1hr for 3~400 subs, and less than 10 minutes for preliminary stacks of 56 or fewer images (56 individual threads means each operation runs in parallel if I have fewer than 56 subs).
The Titan Xp GPU pumps through Blur/Noise/Star Xterminator or Graxpert in a matter of seconds too...
I literally couldn't think of a better machine for PI. I'd love to get a pair of 2699's for it, just to boost that thread count and base speed up a little - but I think the 28c/56t at 2.4GHz base, 3.3GHz boost is plenty..
AlexN
18-11-2024, 10:35 AM
Oh, worth noting that LOTS of users seem to complain that dual CPU rigs do not get fully loaded by PI, as only one CPU gets loaded.
Mine gets 100% load across all cores/threads during WBPP. I don't recall doing anything specific to make that happen - it may have been a BIOS setting or a CPU affinity setting from when I first got/configured the rig.
g__day
22-11-2024, 01:31 AM
An update on what I know so far - coming from discussion on the PixInsight forum - including running some code Juan, the CEO of PI, shared on my machines.
https://pixinsight.com/forum/index.php?threads/new-xeon-gen-5-system-low-performance-looks-like-1-8-9-2-3-dont-like-w11-24h2.24249/
So I haven't programmed in 40 years - but last week I downloaded a C++ compiler and, with some amazing help from ChatGPT, trialled a few simple parallel programs on my workstations to confirm something that hasn't been well shared about PI.
On boot-up it detects how many logical processors (72) are on my machine. But when running benchmarks and general PI tasks, PI only uses 36 cores and reports it sees only 36 logical cores - although PI's CEO confirms PI has been meant to be fully multi-core aware for versions past.
Windows divides machines with over 64 logical processors into two (or more) processor groups aligned with NUMA nodes - each with an equal number of cores. The C++ code under Windows to determine the core count has to be slightly more sophisticated than what Juan shared, to correctly count the logical cores on a machine with over 64 of them. So that is problem one - getting the logical core count correct.
Problem two is making sure every core gets work. Once you have over 64 logical processors, your programming has to be processor-group / NUMA-node affinity aware - otherwise all work goes to the group the main program started on.
So until this is sorted by PI I have decided on a simple workaround - turn off hyperthreading! This lowers the logical CPU count from 72 back down to 36 - and being under the 64 limit - PI then assigns all work to all processors.
Now hyperthreading improves CPU throughput by about 10% from memory - meaning 36 physical cores will perform on many workloads like 40 physical cores. But if switching it on means I am only using 18 physical cores - well, I will take 36 physical cores over 36 logical ones representing only 18 physical cores.
My CPU scores doing this went from the low 12,000s in PI to 15,500 - so I will take that gain.
Just have to see that everything is stable now. With HT on everything worked very reliably. When I first switched it off, Windows didn't get past the splash screen the first time, and later hung with the CPU stalled in Task Manager. Now I am running a WBPP workload that took 3 hrs 15 mins last time I ran it - we'll see how it performs this time!
g__day
22-11-2024, 06:19 PM
So to complete this update - PI WBPP ran in 2 hours 18 mins once it could throw the work at both of my new workstation's CPUs.
So that is roughly half the time my old workstation - which was no slouch - took to process 721 subs!
Put another way - my total processing time has decreased from 20 seconds a sub to 10 seconds :)
AlexN
22-11-2024, 06:37 PM
That's awesome Matthew.
The old Z640 is an awesome processing rig, I've been thoroughly impressed by it over the last 12 months... It hammers at PI
joshman
23-11-2024, 03:07 PM
How well do the benchmark results translate into actual PI Performance?
I recently built a machine dedicated to PI Processing (and some casual gaming):
AMD 7950X, 64GB Ram, NVIDIA 4070 TI Super. 2TB NVME for Windows, 4TB NVME for PI Working Space/file storage, and dedicated 2x 500GB NVME for PI Swap Space.
I had wanted to put the 2x 500GB into Raid 0 for Swap, but ended up just directing 8 swap folder pointers to each drive in PI, and setting up 32 File/Folder I/O threads.
From a pure Benchmark result, it's fantastic
rmuhlack
23-11-2024, 05:20 PM
I have been using refurbished workstations for pixinsight for a while now, and I agree - fantastic bang-for-buck.
My current workstation is a Dell T7910 which also features a dual Xeon E5-2699v3 (36 physical processors, 72 logical processors) setup, with 128GB RAM and a Quadro P4000 GPU. I also encountered the issue with only one CPU loading up (apparently a Win10 limitation), so the answer to utilise all logical cores is to use linux - which yielded a Total performance PI benchmark score of 25992.
AlexN
24-11-2024, 01:14 AM
Yeah mine is in the mid 20k range too with the 2x E5 2680v4s with raid 0 512gb nvme drives for swap, and data on a 1tb nvme drive
g__day
24-11-2024, 06:50 PM
Josh - that is an awesome result - what was the build cost of your rig? It shows well what modern technology can deliver!
Richard - that is a brilliant result - do you remember what your rig was scoring on Windows 10? I didn't know Linux was that much faster than Windows!
I do follow STAstro's analysis of the last 3 versions of PI on Windows and how his rig went from 31K on the benchmark on 1.8.9-1 to 7K on 1.8.9-3
https://pixinsight.com/forum/index.php?threads/new-xeon-gen-5-system-low-performance-looks-like-1-8-9-2-3-dont-like-w11-24h2.24249/
So PI seems to have issues under Windows getting all it can from a dual Xeon. If I could double my CPU scores I would be stoked too!
AlexN
25-11-2024, 10:53 AM
As a note - My rig is running Windows 10.
I think it must be the CPUs that you're running, resulting in > 64 logical processors as was mentioned earlier.
Under Windows 10, with my 28c/56t dual 2680 v4's, all 56 logical processors are pinned at 100% utilisation during many operations in WBPP - namely calibration and integration - and if I have 56 or more subs, all 56 logical processors will be at 100% utilisation.
It would be interesting to see if jumping to Linux would make any sort of difference; however this machine is used for more than just PI, and some of those tasks are either not possible in Linux due to drivers/applications not existing, or far less performant.
I'm stuck with Windows... but with 56 threads, I'm having a good time in PI with the 2680 v4's..
g__day
25-11-2024, 01:48 PM
Watching a YouTube video on dual booting Ubuntu for a Windows user - the steps don't seem so hard - so I think I will wait until release 1.9 comes out and see how performance is reported on large-core rigs on Ubuntu versus Windows.
If I were to dual boot I would likely add a new drive to be the boot drive, then see if I can add a boot manager to a PCIe x4 NVMe drive and have the rig boot from that - and likely switch hyperthreading back on when I play with Linux.
Seems like a lot of effort to get back the 30% - 70% of performance that has been lost from release 1.8.9-1 to release 1.8.9-3 for reasons Juan, as CEO of PI, simply can't get to the bottom of!
But yes on the upside - these systems are still very fast!
rustigsmed
26-11-2024, 10:07 AM
Hi Matthew - great idea on dual booting, and having a separate drive makes it cleaner rather than reducing the windows partition size (but that can be done). just be aware that the systems will run different partition formats - windows uses NTFS and on linux you can choose either ext4 or btrfs. what this effectively means is that windows won't be able to see files on the linux formatted drives, while on the other hand linux can see the windows partition and copy / paste files from the windows drives. I was thinking you may have had some trouble editing files on the windows partition from linux but i've just tested it and it seems to work fine.
Also if you don't like the GNOME desktop look that Ubuntu uses you can choose another of the ubuntu flavours that is ubuntu under the hood but with a different desktop environment (look and feel of the icons, some default programs, application menu/launcher etc) https://ubuntu.com/desktop/flavours - I would recommend Kubuntu 24.10 https://kubuntu.org/getkubuntu/ which uses the KDE Plasma desktop environment if you want it to be a bit more windows-familiar in layout. or alternatively you are able to install different desktop environments at the same time and choose which one you want at log in https://en.ubunlog.com/how-to-have-multiple-desktop-environments-on-ubuntu-and-derivatives/ but it is cleaner and takes up way less space just having one option. the Explaining Computers youtube channel has a lot of good info on dual booting and introducing people to linux.
keep us posted on your results :thumbsup:
g__day
01-12-2024, 12:28 PM
Well the latest workstation news is I wanted to up its CUDA capabilities - for the PI Xterminators, for SetiAstro Suite denoise and sharpen, and for DaVinci Resolve encoding. So after a lot of research about power and cabling I decided to step away from 2nd-hand cards like the P6000 and go with an ASUS ProArt GeForce RTX 4070 - the 8-pin power connector version (using a dual female 6-pin PCIe to single male 8-pin adapter).
So the GPU will be available on Monday and hopefully the required power cable will arrive next week so I can put it all together. I am still trying to work out if it will / should have a bracket to help support the weight - the way the P4000 currently does.
This approach will up the number of CUDA cores from 1,792 to 5,888. Now from a gaming perspective - one wants to keep a CPU speed to GPU speed balanced - and one would be generally looking for a 4-5 GHz CPU to pair with a 4070 or faster card - but for compute processing a ton of 2.8 GHz cores trying to keep a 4070 busy will be interesting to observe; and I guess it could still play games - though that is not its key purpose.
The 4070 was the most modern card I wished to go to - not a Super or Ti version or a 4080 or 4090, nor waiting a few months for the 5000 series - because of power draw and unknowns about the 5000 series. All the larger 4xxx cards need more than a single 8-pin PCIe power cable to feed a 12- or 16-pin connector - and I didn't want crazy wiring gluing together 6-pin PCIe and 2 x SATA power cables to clear that hurdle.
So hopefully in the next few days I can report how it all goes down!
g__day
07-12-2024, 06:26 PM
So the RTX 4070 is added now - glad that is all set up - a tad tricky - but it's rocking along, and the PixInsight Xterminators are far faster, as are SetiAstro AstroSuite and DaVinci Resolve - things are between 3x - 6x faster in initial benchmarks - but I will post a more comprehensive update later.
Bottom line - I am completely happy with this set up now!
Test results:
So a bit more of an astronomy workload benchmarking afternoon since installing a current-generation graphics card (ASUS ProArt RTX 4070) in my new workstation. I concentrated on benchmarking SetiAstro's amazing astro suite, running Sharpen and Denoise on a 73 MB test image - testing both the CPUs and new vs old GPU - then I ran PassMark for good measure.
The results are rather solid and pleasing:
SetiAstro Sharpen took:
1. Dual CPUs 36 cores at 100% load took 15 mins 30 seconds
2. Original Quadro P4000 GPU 3 mins 20 seconds
3. New RTX 4070 GPU took 76 seconds
SetiAstro Denoise took:
1. Dual CPUs took 5 mins 40 seconds
2. Original P4000 GPU took 3 min 27 seconds
3. RTX 4070 GPU took 47 seconds
PassMark results were all pleasing for a system whose CPU and memory heyday would have been ten years ago!
g__day
15-12-2024, 12:57 PM
So it appears the Qt6 library that PI is compiled with under Windows may be causing some material delays in all processing since 1.8.9-2.
Interestingly I see in the benchmarks for PI a rig with the exact same CPUs as mine - it completes the benchmark in just 60% of the time mine does under Linux - so that is something worth looking into!
By.Jove
17-12-2024, 05:38 PM
Switch to the dark side - Apple Silicon M4 will toast your rig.
rustigsmed
19-12-2024, 07:54 AM
not so sure about that ... for PI purposes anyway..
fastest M4 benchmark - https://www.pixinsight.com/benchmark/benchmark-report.php?sn=7UWEJCWWZ86XKBF378X051MI27XK407Z (Total 26356)
fastest E5-2699 v3 (linux) - https://www.pixinsight.com/benchmark/benchmark-report.php?sn=CPB06G83ETWN2O26BQN64CY8O8L08466 (Total 25992)
g__day
19-12-2024, 02:11 PM
If you look into that benchmark - the dual Xeons are faster than the M4 - it's the NVMe drives the owner has paired them with that are holding his score back.
But regardless, it is rather impressive that an M4 can keep up with a ten-year-old dual Xeon - likely because PI is not compiled to leverage the Xeon CPUs' specific matrix manipulation capabilities :)
A further update from STAstro - he has found two interesting insights:
1. WBPP is faster under Windows than Ubuntu - total surprise there
2. Under Windows, v1.8.9-1 is materially slower in WBPP than v1.8.9-3 - most of this accrues to Local Normalisation being about 40% faster in the latest version of PI running WBPP v2.7.8
rmuhlack
21-12-2024, 11:33 AM
That linux benchmark is my PC, a used Dell Precision T7910 workstation that I bought off eBay for $1200 in Jan 2024. What does a Macbook Pro M4 Max (the machine used in the other comparison test linked above) set you back...?
rmuhlack
21-12-2024, 04:54 PM
curious about your comment re NVME swap drives compromising performance. The swap storage directories (16 in all) are on a fast NVME SSD that is separate to both the OS drive and the PI working directory.
Other than using a ramdisk for PI swap, how else would you suggest setting up the swap directories to improve performance?
rustigsmed
21-12-2024, 11:52 PM
nice job with the setup / benchmark.
i think you're looking at about $5k for that particular model.
g__day
22-12-2024, 07:32 PM
I am not sure how he has set up his NVMe drives - but if he used an add-on Gen 4 PCIe card and set it up in RAID 0, that score would double - and if he is on Gen 5 then its performance would double again! RAM disks sometimes don't score as high as the very best NVMe drives in RAID 0 - not sure why that is the case though...
BTW PixInsight v1.9 dropped yesterday - and for a brief moment in time I had the two fastest Windows-based system scores in the world :) that's right - my setup provided the only two Windows entries. V1.9 for Windows scores about 2,000 points lower than v1.8.9-3 - I haven't tried WBPP on it yet though!
g__day
04-03-2025, 09:34 PM
Well, a very interesting latest release of PixInsight - v1.9.3 delivers 20% - 30% performance improvements (real world and benchmarks) through the thoughtful practice of once-off performance monitoring of worker thread counts and spawned sub-threads versus image sizes. This is basically per-machine processor optimisation of thread counts for the most frequent and heavily used tasks in PI.
So once this update is installed and all patches downloaded, users are encouraged to run processor optimisation to create their own optimisation profile. On 8-core machines this takes about 10 - 12 minutes. On my 36-core Xeons it took about 35 minutes. For one user with a 48-core / 96-thread Platinum Xeon the diagnostic took 70 minutes.
So the end results - my best run with PI version 1.9.2 scored 17,523 - my best run with version 1.9.3 is 20,207 - a very nice improvement for free!
Interesting to see the third fastest rig on the table - a Platinum Xeon 8558P - its scores top out at 21,473 for Windows and 55,091 on Linux - with exactly the same gear.
Makes me ponder whether my scores would also increase by 2.5x (giving me the 4th fastest rig benchmarked, vs my current 28th ranking)!
Lastly, a user with a Xeon Gold 6536N processor has not only mapped Windows vs Linux scores - he has done this with 8, 16, 24, 32, 48, 56 and 64 threads - and the surprising conclusion is that this (and most recent) versions of PI scale very poorly with increased processor threads on Windows, while on the exact same gear performance scales linearly on Linux - pointing to a Windows coding issue!
Leo.G
05-03-2025, 10:33 PM
Wow, I want your computer, I must check Grey's auctions.
I tried a couple of old servers but they weren't that good. I may have been running Windows server edition on them, I don't remember, one was a 4 processor unit but old processors and they sucked energy like sand sucks water.
rustigsmed
06-03-2025, 02:25 PM
so plenty more free performance to be had with an OS upgrade! would be interesting to see your results if you decide to add a linux partition. :thumbsup:
g__day
09-03-2025, 04:09 PM
I may try the Linux path in the future - if the PI team can't figure out what in the thread management is bottlenecking PI. The fact that the very same rig runs far faster under Linux than directly under Windows should give them some hint of where to look for what is going askew!
Leo.G
10-03-2025, 01:46 PM
Which Pi are you using Matthew?
Sorry, it's probably mentioned up above somewhere.
g__day
10-03-2025, 10:33 PM
The latest Leo, v1.9.3 running on Windows 10 Pro
g__day
25-06-2025, 09:09 AM
Hi all,
Back from a long trip in Europe and now attempting my latest software-aided journey - exploring different O/S for PixInsight image processing.
So I have an interesting challenge coming up. I want to try dual booting my system across Windows 10 Pro and Windows 11 Pro operating systems - and may eventually add a third O/S on a third SSD - likely Ubuntu (Linux).
I intend to set each O/S up on its own specific SATA SSD - so imagine 3 SSDs each with its own O/S.
I need to share one (in future two) large HDDs across each O/S (both formatted in NTFS) - this should be simple.
The challenge is I have a scratch file drive that will be heavily used - it is currently a dynamic disk created by Windows 10 Pro on a PCIe x16 card comprising four separate 2TB NVMe M.2 Crucial drives, set up by the O/S as a dynamic RAID 0 array.
Now my challenge is this RAID 0 array was created by the Windows 10 Pro O/S instance - not in the BIOS by the HP Z640 workstation's motherboard. That may be something it can do - but I haven't gone down the path yet of seeing whether the HP Z640 can create a RAID array on a PCIe x16 card, thus possibly making it visible to any O/S I choose to launch.
When I try to dual boot Windows 11 Pro next, I am wondering: 1) where should I put the boot loader - on the initial C: drive that holds Windows 10 Pro, letting it see the first original SSD as the Windows 10 Pro drive and making the new second drive the Windows 11 Pro drive (which, when it launches, will likely assign letter C: to that second drive)? and 2) will the RAID 0 array (my R: drive in Windows 10 Pro) be usable by Windows 11 Pro - or will it just see 4 un-initialised disks, because it didn't create the dynamic drive?
Then later I will create an Ubuntu drive - so I face the problems of where do I store, and which boot loader do I choose, to launch 3 O/S - and will the Ubuntu O/S be able to see a Windows-created RAID 0 array?
My fear is if one O/S creates a RAID array, the other O/S may not be able to understand it - forcing them to nuke it and recreate it from scratch every time they are launched. I also read somewhere that when a HP Z640 BIOS tries to create a RAID array it sometimes forces all attached drives to try to become RAID arrays - so I have to look into that more.
The last (as if those aren't enough) challenge I will face is how a dual Xeon workstation launches an O/S - you don't generally see anything for 60 seconds until the Windows boot screen appears - so how, or even will, I see a multi-boot loader to choose which O/S I desire? Unlike a normal PC - where the BIOS power-on is quick and rather visible - a dual CPU workstation boot seems to be hidden until the OS splash screen appears.
I am off on a learning curve - any help that can be given will be greatly appreciated!
Many thanks,
Matthew
PS
If it all becomes too hard - I realise I can simply delete my 4 x 2TB RAID 0 array (R:) and create 4 x 2TB simple drives (R: S: T: U:) and tell my software (PixInsight), which requires fast I/O, to use 4 specific drives (with multiple shares for I/O speed) for holding astro images and the scratch files required to process them in large batches (stacking of 1,000s of images).
Peter Ward
26-06-2025, 10:40 AM
I'm running a six year old AMD Threadripper 3970X under Windoze10, which cost me some coin in the day but still performs quite well with a PI benchmark in the 28k region.
That said, the latest Threadrippers are quite remarkable. I'd likely avoid Intel and go there again if I were to update my current machine.
Leo.G
26-06-2025, 12:31 PM
My son does this between Windows 8.x, Windows 10 and Windows 11 (all pro) and the easiest way is he boots from a flash drive with the particular OS on it set up as bootable (ISO from memory, I forget more than I remember now). Multi boot hard drives can be problematic at times.
Take your bank manager, they are far from cheap.
https://www.scorptec.com.au/product/cpu/amd-threadripper/107024-100-100000884wof
That's just the processor.
96 cores / 192 threads, but very little software to take full advantage in astronomy circles (possibly in paid software I can't afford to play with, I haven't researched that).
I must admit I like AMD processors, they've been better bang for the buck for a long time and were the first to 500MHz, then 1GHz.
I spent years studying IT and building and repairing PCs at a component level on boards back in the XT and 286/386/486 days when everything wasn't plug and play and I hear so many people say "don't even get me started on AMD builds", hinting there's always a problem. I've never come across any problems with AMD builds. The bigger issue is people not doing the research, RAM for Intel is different from RAM for AMD (didn't used to be but is now) and if matching of hardware is researched and done correctly they are an amazing machine.
Peter Ward
26-06-2025, 01:36 PM
Indeed I can echo those sentiments.
I ran Intel for decades and while I did have a failure with an early AMD CPU the current crop has proved to be bulletproof.
BTW just tweaked the BIOS and am now scoring a 31k PixInsight benchmark
Not bad for a 6 year old machine. :)
Leo.G
26-06-2025, 04:23 PM
I'm not in a position to buy new equipment, pensions don't make good for people with too many hobbies (not complaining but I went from a 6 figure salary in 91 down the $12K annually, slight change in life style was necessary) but I mainly run used stuff bought at auction, cheap. I don't need the latest and greatest for what I play with but I keep my son in reasonably current gear but never the latest and greatest, he has a nice 2 in 1 Lenovo for most of his programming needs and a games suitable desktop, bought used (MB and graphics, new RAM) near new goods at a bargain price from a local notice board.
Were I building a new system it would be only AMD and when I still occasionally throw systems together for friends I always insist AMD is the way to go.
g__day
26-06-2025, 07:45 PM
Very nice results on your gear Peter!
I am just in the process of making a current recovery USB for my Win 10 Pro machine - before I try installing Windows 11 Pro on a new drive.
Hey Leo - I saw an amazing YouTube video of a guy taking a Gigabyte MS73-HB1 motherboard (a dual Xeon Platinum board with multiple Gen 5 PCIe lanes - Gen 5 is 4x the speed of Gen 3), then buying two 56-core Xeon Platinum ES chips (engineering samples) to create a 112-core / 224-thread system for under $1,600. I would be impressed to see what it can do with PI.
I am wondering whether sharing a software RAID array on PCIe x16 is difficult across operating systems - should I lean towards creating a hardware RAID array using the PCIe ASUS Hyper card and my PC's BIOS, or just revert the 4 NVMe drives on the Hyper card to individual disks and give PI 4 x 2TB NVMe drives to play with.
Leo.G
27-06-2025, 10:26 AM
Sorry Matthew, I replied last night then bumped the back button on my mouse (I hate this mouse) and lost it all and I was tired.
Hardware RAID is not dependent on the software, as software RAID is, and would be your best bet if running multiple OS's.
I should imagine sharing it across operating systems could present some problems but I haven't played with it. I long ago got out of any networking, that's my son's forte now. I will see what I can find either in my books or online however.
I have a Lenovo dual-CPU motherboard my son was rewriting the BIOS code for because of corrupt code in the chip, but he's sort of stopped most things with his depression - still a work in progress. Twin CPUs however are far from cheap for anything decent (and restricted to the motherboard), and I doubt in Aus we'd be getting twin 56-core processors for that price, maybe US?
g__day
27-06-2025, 02:12 PM
Well my dual boot is working fine, all device drivers for Windows 11 Pro installed and according to device manager - no issues. All Windows 11 Pro updates applied fine.
A pleasant surprise - the RAID 0 NVMe PCIe-based drive created as a dynamic disk under Windows 10 Pro appears and works fine under Windows 11 Pro - I wasn't expecting that - total win! CrystalDiskMark shows about 8,000 MB/sec read/write speeds - so all good.
But under my astrophotography suite (PixInsight) - set up the same way under Win 10 and 11 - Win 10 is almost twice as fast (a lot higher CPU utilisation under the benchmarks), and the swap file - my RAID array's scratch directory - shows 3 times the throughput under Windows 10 compared to Windows 11 - which is frankly weird - off to research this!
I checked that each setup has the same swap file location (the RAID array), the same number of shares and the same number of read and write threads. I also ensured under Windows 11 Pro that xisf file types, PixInsight.exe and the QT web engine exe were all added to Win 11 Virus and Security exclusions.
So I am trying to work out why Win 11's CPU processing runs at 50% of the rate of Win 10 (all 36 cores are utilised - just to a lower extent) and why I/O throughput is a third that of Win 10.
Puzzling...
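Since the dynamic-disk array survived the OS change, one crude way to cross-check CrystalDiskMark's numbers from both boots is a short sequential read/write script. This is a rough sketch, assuming Python 3 is installed on both OSes; the target path defaults to the system temp directory and is a stand-in - point it at the RAID scratch directory to test the array itself.

```python
# Rough sequential-throughput check, run from each OS boot and compare.
# TARGET defaults to the temp dir; change it to the RAID scratch
# directory (e.g. a folder on the NVMe RAID 0 volume) for a real test.
import os
import tempfile
import time

TARGET = os.path.join(tempfile.gettempdir(), "throughput_test.bin")
CHUNK = 32 * 1024 * 1024   # 32 MiB per block
CHUNKS = 8                 # 256 MiB total - increase for a longer run

def write_speed(path: str) -> float:
    """Write CHUNKS blocks of CHUNK random bytes, fsync, return MB/s."""
    data = os.urandom(CHUNK)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(CHUNKS):
            f.write(data)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return (CHUNK * CHUNKS) / elapsed / 1e6

def read_speed(path: str) -> float:
    """Read the file back in CHUNK-sized blocks, return MB/s."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(CHUNK):
            pass
    elapsed = time.perf_counter() - start
    return (CHUNK * CHUNKS) / elapsed / 1e6

if __name__ == "__main__":
    print(f"write: {write_speed(TARGET):,.0f} MB/s")
    print(f"read:  {read_speed(TARGET):,.0f} MB/s")
    os.remove(TARGET)
```

Note this measures raw sequential throughput only, like CrystalDiskMark's sequential test - it won't reproduce PixInsight's mixed read/write swap pattern, but identical hardware giving very different numbers across the two boots would point at the OS rather than at PI.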
rustigsmed
27-06-2025, 02:28 PM
in terms of multi-booting, perhaps it is worth looking into the "rEFInd" boot manager to see if there is an easy solution there.
i can't really provide much guidance on your raid situation other than i recall hearing tech people say hardware raid is pretty much considered dead (since ~2022).
since my earlier posts i've ended up getting a 9950X3D and repurposed the old machine for a homelab/personal ai. the new cpu has pretty much doubled my benchmark score to 60693 - 3rd fastest machine on the current benchmark. although realistically i don't intend on running the PBO function 24/7, which impacts the score somewhat.
what was the purpose of w10 and w11? just to check any performance differences?
*note i just saw you have made a post since i started on this
Camelopardalis
27-06-2025, 08:07 PM
Maybe Win11 forces all the speculative execution mitigations that the Intel chips have, and Win10 doesn’t? Just guessing, but it’s the kind of scenario where you would feel the impact.
Dual booting Linux is easy, it’ll setup the bootloader when it’s done.
Camelopardalis
27-06-2025, 08:09 PM
Why would you disable PBO?
g__day
27-06-2025, 10:01 PM
I wanted to see if Windows 11 Pro offers a discernible upgrade to Windows 10 Pro performance (but it seems a severe downgrade so far).
I thought I would try Win 11 first and then tri-boot to Ubuntu next.
For me it was 80% about seeking more performance and 20% having a fallback if PI on Windows 10 Pro stops doing what it is supposed to!
Nice performance by the way
g__day
07-07-2025, 06:31 PM
Well this is totally unexpected by me - I ran WBPP in High Quality mode against 437 subs I captured of NGC 6537 last year that I had yet to process.
I ran it twice on Windows 10 Pro and twice on Windows 11 Pro.
PixInsight's WBPP stacking at high quality - Windows 11 Pro is 3.5 times faster than Windows 10 Pro!
Windows 11 Pro took 46 minutes vs Windows 10 Pro's 2 hours and 36 mins!
LN reference generation took around 60 secs in Win 11 and 17 minutes in Win 10
Local Normalisation was similarly much, much faster in Win 11.
So for WBPP I think I will be booting into Win11 henceforth!
Note that for general post-stacking processing, Windows 10 Pro still seems to be about 30% faster!
PS
Just re-processed NGC 6744 - WBPP on High Quality settings took 3 hrs 6 mins originally on Windows 10 Pro; on Windows 11 Pro - 56 mins - very, very nice performance boost!
rustigsmed
09-07-2025, 10:49 PM
nice improvement!! (still aren't you curious on what it might be on linux?? ;)
g__day
10-07-2025, 01:40 PM
I certainly am - whilst I expect Ubuntu to be discernibly faster overall - noting the PI benchmarks often show a 60% - 100% improvement under Linux - WBPP is a bit of a surprise beast. I remember a user with a very high core count system (close to 100 cores) found Windows was the fastest for WBPP - which seemed totally unexpected and wrong to me.
I seriously think the PixInsight team should take more care with benchmarks - organise them by operating system variant and add a benchmark for WBPP too. The benchmarks, as I have said over and over on their forums, can tell you if something fundamental is going wrong - but generally Juan and team tend not to prioritise this, as users are always clamouring for more and improved functionality. My view is you have to keep the quality of the product high too - else tech debt looms and sooner or later everyone has to pay the price...
Eventually I do plan to benchmark it - the only real way to know performance is to throw a real stacking workload at it across all operating systems!
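Short of a full WBPP run, a tiny CPU-bound stand-in can at least show whether each OS delivers a similar parallel speedup across all cores. A minimal sketch, assuming Python 3 - this is generic busywork, not PixInsight's actual workload, so treat the ratios (not the absolute times) as the thing to compare between boots:

```python
# Time the same CPU-bound task with 1 worker vs all logical cores.
# If Win10 and Win11 schedule parallel work equally well, the
# speedup ratio should be roughly the same on both boots.
import os
import time
from multiprocessing import Pool

def burn(n: int) -> int:
    """CPU-bound busywork: sum of squares up to n."""
    return sum(i * i for i in range(n))

def timed(workers: int, tasks: int = 32, n: int = 200_000) -> float:
    """Run `tasks` copies of burn(n) on `workers` processes; return seconds."""
    start = time.perf_counter()
    with Pool(workers) as pool:
        pool.map(burn, [n] * tasks)
    return time.perf_counter() - start

if __name__ == "__main__":
    t1 = timed(1)
    tn = timed(os.cpu_count())
    print(f"1 worker: {t1:.2f}s, {os.cpu_count()} workers: {tn:.2f}s, "
          f"speedup {t1 / tn:.1f}x")
```

On a 36 core / 72 thread box a markedly lower speedup on one OS would mirror the lower CPU utilisation seen in the PI benchmarks, without involving PI at all.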
I would love to be able to review the Thread Optimisation data file - to see if Windows 11 Pro has different settings to Windows 10 Pro.
If it were easy I could take the Win 11 settings and try them on the Win 10 system - and see if that is better or worse - and the same with the Win 10 settings on Win 11.
I still suspect the Thread Optimisation logic may get the logical core count wrong under Windows 10 in some use cases!
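A minimal sanity check on the core-count suspicion, assuming Python 3 on both boots: print what each OS actually reports for logical processors. (One thing worth ruling out on a 72-thread machine: Windows splits systems with more than 64 logical CPUs into processor groups, and software that is not group-aware can end up confined to a single group - that is speculation here, not a confirmed cause.)

```python
# Print the logical processor count each OS reports - run on both
# the Win10 and Win11 boots and compare. A mismatch, or a value of
# 36 instead of 72, would point at a core-detection problem.
import multiprocessing
import os

print("os.cpu_count():             ", os.cpu_count())
print("multiprocessing.cpu_count():", multiprocessing.cpu_count())
# Windows also advertises the count via an environment variable
# (unset on other OSes):
print("NUMBER_OF_PROCESSORS env:   ", os.environ.get("NUMBER_OF_PROCESSORS"))
```

If both boots report the full 72 threads here but PI still drives them unevenly, the difference is more likely in the scheduler or PI's own Thread Optimisation file than in what the OS exposes.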