Whip a duo-core PC into submission...?

Started by Rune Allnor November 18, 2008
On 19 Nov., 14:25, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 19 Nov, 14:01, Martin Thompson <martin.j.thomp...@trw.com> wrote:
>
> > Rune Allnor <all...@tele.ntnu.no> writes:
> > > So, how can I trick my PC to assign this single-thread
> > > program to run on the idle core? (Yeah, I know. It's no
> > > reason to expect that it can be done, but I have to ask...)
>
> > You can (under windows) use the task manager to assign a process to a
> > particular core, but I'm not sure that's your problem.
>
> Ah! That did the trick. I just assigned the process
> to the 2nd CPU and the thing just went through the roof.
> I'm watching the task manager as I write this, and the
> program is easily chewing 750 MB of RAM, increasing as
> we speak. Well, write.
Strange, it just doesn't make sense. The OS picks whatever core is available, and the process will then max that core out. If I start a heavy Matlab script on my dual-core machine, it immediately shows 100% load on one of the CPUs; if I start another Matlab instance and run another heavy script in it, the other CPU gets fully loaded as well. (Some Matlab functions, like fft, can use both cores at the same time.)

As for memory, AFAIK 32-bit Windows can use 4 GB, but a single task can only get 2 GB; the other 2 GB is for the kernel. There's a boot option that makes it 3 GB for apps and 1 GB for the kernel, but I've heard that not all drivers are compatible with it.

-Lasse
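The arithmetic behind the 2 GB / 2 GB split and the 3 GB boot option (as far as I know, the `/3GB` switch in boot.ini) can be sketched in a few lines of Python. This is only an illustration of the numbers in the post; the function name `split_address_space` and its parameters are made up for this example and are not any Windows API.

```python
GIB = 2**30  # bytes in one GiB


def split_address_space(pointer_bits=32, user_gib=2):
    """Return (total, user, kernel) sizes in GiB for a flat virtual address space.

    Hypothetical helper: it only does the arithmetic described in the post.
    """
    total_gib = 2**pointer_bits // GIB
    if not 0 < user_gib < total_gib:
        raise ValueError("user share must be between 0 and the total size")
    return total_gib, user_gib, total_gib - user_gib


if __name__ == "__main__":
    for label, user in (("default split", 2), ("3 GB boot option", 3)):
        total, usr, kern = split_address_space(user_gib=user)
        print(f"{label}: {total} GiB total = {usr} GiB per process + {kern} GiB kernel")
```

Note that these figures describe virtual address space per process, not how much RAM is installed or which core a program runs on.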
On 20 Nov, 00:22, "langw...@fonz.dk" <langw...@fonz.dk> wrote:
> On 19 Nov., 14:25, Rune Allnor <all...@tele.ntnu.no> wrote:
>
> > On 19 Nov, 14:01, Martin Thompson <martin.j.thomp...@trw.com> wrote:
>
> > > Rune Allnor <all...@tele.ntnu.no> writes:
> > > > So, how can I trick my PC to assign this single-thread
> > > > program to run on the idle core? (Yeah, I know. It's no
> > > > reason to expect that it can be done, but I have to ask...)
>
> > > You can (under windows) use the task manager to assign a process to a
> > > particular core, but I'm not sure that's your problem.
>
> > Ah! That did the trick. I just assigned the process
> > to the 2nd CPU and the thing just went through the roof.
> > I'm watching the task manager as I write this, and the
> > program is easily chewing 750 MB of RAM, increasing as
> > we speak. Well, write.
>
> strange, it just doesn't make sense, the OS pick what ever core is
> available and it will then max that out.
Not strange. The techies who sold me the thing (see another post) made sure I only got 3GB RAM. One core has 1GB and the other 2GB. I need to manually make sure that these programs run on the 2nd core.
> as for memory, afaik 32bit windows can use 4gb, but a single task
> can only get 2gb, 2gb is for kernel.
The reason why you can't get more than 2 GB is that one needs a signed offset to access the memory randomly. In the binary number system you use one bit for the sign, so a 32-bit int can hold a 31-bit number + sign.
> theres a boot option that makes
> it 3gb for apps and 1gb fopr kernel but I've heard not all driver are
> compatible.
I've never heard of that option before. Hopefully, people use it to map addresses of peripherals, like graphics cards etc.

Rune
On 20 Nov., 09:59, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 20 Nov, 00:22, "langw...@fonz.dk" <langw...@fonz.dk> wrote:
>
> > On 19 Nov., 14:25, Rune Allnor <all...@tele.ntnu.no> wrote:
>
> > > On 19 Nov, 14:01, Martin Thompson <martin.j.thomp...@trw.com> wrote:
>
> > > > Rune Allnor <all...@tele.ntnu.no> writes:
> > > > > So, how can I trick my PC to assign this single-thread
> > > > > program to run on the idle core? (Yeah, I know. It's no
> > > > > reason to expect that it can be done, but I have to ask...)
>
> > > > You can (under windows) use the task manager to assign a process to a
> > > > particular core, but I'm not sure that's your problem.
>
> > > Ah! That did the trick. I just assigned the process
> > > to the 2nd CPU and the thing just went through the roof.
> > > I'm watching the task manager as I write this, and the
> > > program is easily chewing 750 MB of RAM, increasing as
> > > we speak. Well, write.
>
> > strange, it just doesn't make sense, the OS pick what ever core is
> > available and it will then max that out.
>
> Not strange. The techies who sold me the thing (see another post)
> made sure I only got 3GB RAM. One core has 1GB and the other 2GB.
> I need to manually make sure that these programs run on the
> 2nd core.
That's not how it works. It is shared memory: the two cores both have equal access to all the memory in the same 4 GB memory space.
> > as for memory, afaik 32bit windows can use 4gb, but a single task
> > can only get 2gb, 2gb is for kernel.
>
> The reason why you can't get more than 2 GB is that one needs
> a signed offset to access the memory randomly. In the binary
> number system you use one bit for the sign, so a 32-bit int
> can hold a 31-bit number + sign.
Might be. It could also be that it was simply a convenient way of dividing the 32-bit memory space between application and executive access.
> > theres a boot option that makes
> > it 3gb for apps and 1gb fopr kernel but I've heard not all driver are
> > compatible.
>
> I've never heard of that option before. Hopefully, people use
> it to map addresses of peripherals, like graphics cards etc.
If you have an application that needs more than 2 GB, it is either that or going to 64-bit ...

-Lasse
32 bits will address 4GB - unsigned integer.  And, that's how a PC works 
unless I'm very mistaken.

System applications use address *space* - not RAM.  That's why you see many 
new mainboards being offered with 3GB of RAM and DDR3 (3 channel) memory 
access.
The older DDR2 would access with 2 channels and, thus, it was nice to have 
the memory symmetrical in halves.
But, with the 3GB+ limitation of available address space, it makes some 
sense to offer machines with 3GB of memory - even though perhaps 3.2GB or so 
would work just as well.
(With memory being relatively cheap, it's easy to just throw 4GB into a DDR2 
machine and only use 3+GB of it in reality.  But, cost is important in a 
commodity market so populating with 3GB total makes sense if you can use it 
speedily - thus DDR3).

I believe that the graphics card memory uses part of that address space so 
that could readily be 0.5GB by itself.

The whole issue of parallel processing hinges on the ability to build 
programs that make good use of multiple processors.  This remains somewhat 
of a challenge to realize.
Left to its own devices, a multicore CPU system will task the processors in 
some fashion that makes sense to the CPU/system.
I don't know but it surely makes sense to give some ability to the 
programmer to assign cores among threads if you know that you have at least 
two heavy-duty threads that could run in parallel advantageously.

A quick Google on multicore thread assignment resulted in this article .. 
which, if a bit sales oriented, touches on the issues.  If they can do it, 
so can you.

http://zone.ni.com/devzone/cda/tut/p/id/6452

You can well expect that Intel and AMD and Microsoft would pay attention to 
such matters:

http://download.intel.com/technology/advanced_comm/315697.pdf

and

http://software.intel.com/en-us/articles/multi-core-introduction

and

http://msdn.microsoft.com/en-us/magazine/cc163340.aspx

Rune, how did you do it?

Fred


Martin Thompson wrote:
> You can (under windows) use the task manager to assign a process to a
> particular core, but I'm not sure that's your problem.
Martin,

Ah! I see how one can set "Affinity" of a process to cores in Vista Ultimate at least..... Does XP do this as well? Home? Pro?

Fred
On 21 Nov, 21:28, "Fred Marshall" <fmarshallx@remove_the_x.acm.org>
wrote:
> 32 bits will address 4GB - unsigned integer.  And, that's how a PC works
> unless I'm very mistaken.
After I gave up following the HW circus, some 20 years ago, my main influence is SW methods. In C++ the two concepts used are the Pointer type, which on Win32 contains the 32 bit pointer, and the Pointer Difference type, which is needed to handle address offsets. Since differences can be negative, the address space available is 2GB.
> System applications use address *space* - not RAM.  That's why you see many
> new mainboards being offered with 3GB of RAM and DDR3 (3 channel) memory
> access.
> The older DDR2 would access with 2 channels and, thus, it was nice to have
> the memory symmetrical in halves.
> But, with the 3GB+ limitation of available address space, it makes some
> sense to offer machines with 3GB of memory - even though perhaps 3.2GB or so
> would work just as well.
> (With memory being relatively cheap, it's easy to just throw 4GB into a DDR2
> machine and only use 3+GB of it in reality.  But, cost is important in a
> commodity market so populating with 3GB total makes sense if you can use it
> speedily - thus DDR3).
The way I interpret this is that each core can handle 2GB of address space. The worst-case scenario is that you have one 2GB process running on each core (ignoring OS and other overheads) in which case I naively assume that 4 GB of RAM is needed.
> I believe that the graphics card memory uses part of that address space so
> that could readily be 0.5GB by itself.
That's what the salesman who sold me the thing said.
> The whole issue of parallel processing hinges on the ability to build
> programs that make good use of multiple processors.  This remains somewhat
> of a challenge to realize.
Well, yes. However, for now I'll settle for one core that can access 2GB of RAM, leaving the other core to run the OS and whatever other stuff is going on. In the long run the plan is to use some multi-threading where some pre-processing of data, main processing and GUI are done in different sub-threads. It will be a couple of years till I get there; hopefully quad-core laptops are available by then.
> Left to its own devices, a multicore CPU system will task the processors in
> some fashion that makes sense to the CPU/system.
> I don't know but it surely makes sense to give some ability to the
> programmer to assign cores among threads if you know that you have at least
> two heavy-duty threads that could run in parallel advantageously.
The problem is that *I* know that the threads will eventually become heavy-duty (because I know what I will use the program for), but the task scheduler doesn't. It only sees a GUI that starts and then goes idle. No CPU load to speak of; no memory assignment. Until I start importing data from file. Then it grabs all it can get. I know there are some smart programmers out there, but I don't see how they can anticipate the characteristics of my programs just like that.
> A quick Google on multicore thread assignment resulted in this article ..
> which, if a bit sales oriented, touches on the issues.  If they can do it,
> so can you.
Yes, but there is the time factor. Not much time to learn. ...
> Rune, how did you do it?
Do what? I used the task manager with the Affinity setting to allow the process to run on the one core which can access the 2 GB of RAM (WinXP Pro). So I can do what I want; the problem is that I need to do that manually every time I start the program.

The data processing took some time to crack. The general ideas have fermented, so to speak, in my mind for over a decade already; some simplifications came about this summer. Only last night did I find the way to simplify the processing to the extent that the goals can be achieved with the resources at my disposal. It's 20 years since I last worked with memory-limited programs (those infamous 64k segments and 640 kB RAM on the XT/AT) so it took some time to get back in that mode of thinking.

Anyway, a night of insomnia in a freak winter weather was all I needed to see how to get the job done without requiring 16GB RAM.

Rune
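On having to set the affinity by hand at every start: a program can, in principle, set its own affinity when it launches. Below is a minimal Python sketch, not something tested on XP. The affinity is a bitmask with bit n set for core n, so "2nd core only" is mask 2. The helper names `affinity_mask` and `pin_current_process` are made up for this example; the calls they wrap (`os.sched_setaffinity` on Linux, the Win32 `SetProcessAffinityMask` through `ctypes` on Windows) are the real interfaces as far as I know.

```python
import os
import sys


def affinity_mask(cores):
    """Build the affinity bitmask: bit n is set for each core index n."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core indices must be non-negative")
        mask |= 1 << core
    return mask


def pin_current_process(cores):
    """Restrict the calling process to the given cores, where supported."""
    if hasattr(os, "sched_setaffinity"):
        # Linux: pid 0 means "the calling process".
        os.sched_setaffinity(0, set(cores))
    elif sys.platform == "win32":
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.GetCurrentProcess.restype = wintypes.HANDLE
        kernel32.SetProcessAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
        kernel32.SetProcessAffinityMask.restype = wintypes.BOOL
        handle = kernel32.GetCurrentProcess()
        if not kernel32.SetProcessAffinityMask(handle, affinity_mask(cores)):
            raise ctypes.WinError(ctypes.get_last_error())
    else:
        raise NotImplementedError("no affinity call known for this platform")


if __name__ == "__main__":
    # Core indices start at 0, so the "2nd CPU" in the task manager is core 1.
    print(hex(affinity_mask([1])))
```

Calling `pin_current_process([1])` at program start would stand in for the manual task manager step, assuming the machine actually has a second core.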
On Nov 21, 1:08 pm, Rune Allnor <all...@tele.ntnu.no> wrote:

...
.>
.> After I gave up following the HW circus, some 20 years ago,
.> my main influence is SW methods. In C++ the two concepts
.> used are the Pointer type, which on Win32 contains the 32 bit
.> pointer, and the Pointer Difference type, which is needed to
.> handle address offsets. Since differences can be negative,
.> the address space available is 2GB.

 What C++ sees is a meta-machine that exists for the convenience of
the programmer. It is not the machine implemented by the OS, assigned
to an application or addressed by a core.

.> ...
.> The way I interpret this is that each core can handle 2GB of
.> address space. The worst-case scenario is that you have one
.> 2GB process running on each core (ignoring OS and other
.> overheads) in which case I naively assume that 4 GB of RAM
.> is needed.
.>

As Fred said, the OS and cores can address 4GB of ram, some of which
is mapped to things like graphics memory.

> ...
> Well, yes. However, for now I'll settle for one core
> that can access 2GB of RAM, leaving the other core to
> run the OS and whatever other stuff is going on.
Under XP, applications are limited to a 2GB space which may be mapped anywhere in available RAM or virtual memory. Each application gets its own space. If total space requirements exceed available RAM, virtual memory is mapped into the space. Since the virtual memory is on disk, it slows the processing considerably. The space of an application is mapped the same for any core that may be executing that application's threads; however, cache contents may have to be reloaded if there is a core switch.

.> ...
.> It's 20 years
.> since I last worked with memory-limited programs (that
.> infamous 64k segments and 640 kB RAM on the XT/AT) so
.> it took some time to get back in that mode of thinking.

The 640 kB of a 1MB RAM memory space is now the 3.x GB of a 4GB RAM space. All of 3 GB of installed RAM may be usable by the OS. Not all of 4 GB of installed RAM will be usable by the OS. I think the Vista x32s are the same. Vista x64 Home is up to 8 GB RAM and the more expensive x64 Vistas run to 128 GB RAM space. Server style motherboards are available today that mount 8 DIMMs of 2 GB DDR2. There -are- solutions in bigger RAM spaces today.

.> ...
.> Rune

Dale B. Dalrymple
Rune Allnor wrote:
> On 21 Nov, 21:28, "Fred Marshall" <fmarshallx@remove_the_x.acm.org>
> wrote:
>> 32 bits will address 4GB - unsigned integer. And, that's how a PC
>> works unless I'm very mistaken.
>
> After I gave up following the HW circus, some 20 years ago,
> my main influence is SW methods. In C++ the two concepts
> used are the Pointer type, which on Win32 contains the 32 bit
> pointer, and the Pointer Difference type, which is needed to
> handle address offsets. Since differences can be negative,
> the address space available is 2GB.
I don't see where we disagree on this.... just different perspectives. The limit of addressable memory on the 32-bit architecture *is* 4GB.
> The way I interpret this is that each core can handle 2GB of
> address space. The worst-case scenario is that you have one
> 2GB process running on each core (ignoring OS and other
> overheads) in which case I naively assume that 4 GB of RAM
> is needed.
I don't think this is the model that's in use. The processor can address 4GB and the cores address different parts of memory for the most part and can use the multiple channels that are generally available.

I really don't believe that one core gets 2GB and the other core gets 2GB any less than a single core 32-bit machine gets 4GB. But, that doesn't mean that the channel architecture doesn't make it look that way. I don't know about that part.

Threads get assigned to a core. That's the model in my head. You don't control (in general) which core the assignment goes to. That's why the word is "affinity" I suspect. One set of processes favors one core while another set of processes favors another core.

When my dual core Vista machine is loafing I see work being done on both cores more or less equally either at one instant or averaged over a short period of time where they seem to ping pong the loads. That makes sense if the tasks are sent off to the least loaded processor.

With a heavy set of processes then one might usefully tell them which core to "hit" because we have a better idea of what is to come. "Look ahead" scheduling done manually.

That's how I think it works......

Fred
On 22 Nov, 07:55, "Fred Marshall" <fmarshallx@remove_the_x.acm.org>
wrote:
> Rune Allnor wrote:
> > On 21 Nov, 21:28, "Fred Marshall" <fmarshallx@remove_the_x.acm.org>
> > wrote:
> >> 32 bits will address 4GB - unsigned integer. And, that's how a PC
> >> works unless I'm very mistaken.
>
> > After I gave up following the HW circus, some 20 years ago,
> > my main influence is SW methods. In C++ the two concepts
> > used are the Pointer type, which on Win32 contains the 32 bit
> > pointer, and the Pointer Difference type, which is needed to
> > handle address offsets. Since differences can be negative,
> > the address space available is 2GB.
>
> I don't see where we disagree on this.... just different perspectives.  The
> limit of addressable memory on the 32-bit architecture *is* 4GB.
But if you want to access that space in a random manner you need a signed integer type as pointer offset. If you don't have the signed offset, you don't have *Random* Access Memory but *Sequential* Access Memory. Or whatever the correct term might be.

So to use that *R*AM you either expand the address signed offset (and thus memory pointer, since these for practical reasons need to be of the same type) to 32 bits + sign or you include the sign in the 32 bits you already have available. The latter solution is what is done in practice on Intel CPUs (it's a HW decision; other systems might do things differently), which is why application programs can access at most 2 GB of *R*AM on 32-bit systems.

The question is what is done with memory associated with peripherals, like the graphics card. Do you map this memory into the address space available to applications? Or do you map it into the address space associated with the 2GB of memory that's not accessible from applications? I hope it's the latter.
> > The way I interpret this is that each core can handle 2GB of
> > address space. The worst-case scenario is that you have one
> > 2GB process running on each core (ignoring OS and other
> > overheads) in which case I naively assume that 4 GB of RAM
> > is needed.
>
> I don't think this is the model that's in use.  The processor can address
> 4GB and the cores address different parts of memory for the most part and
> can use the multiple channels that are generally available.
>
> I really don't believe that one core gets 2GB and the other core gets 2GB
> any less than a single core 32-bit machine gets 4GB.
I have yet to see a 32-bit single-core computer, PC or other, with more than 2 GB RAM.

Rune
On Nov 21, 11:21 pm, Rune Allnor <all...@tele.ntnu.no> wrote:
.> ...
.> But if you want to access that space in a random manner you
.> need a signed integer type as pointer offset. If you don't
.> have the signed offset, you don't have *Random* Access Memory
.> but *Sequential* Access Memory. Or whatever the correct term
.> might be.
.> ...

You are still confusing the C++ programming model with what happens on
hardware address busses. Pointer offsets are only used to calculate
addresses. RAM isn't addressed by the values of pointer offsets.
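One way to see the point about offsets only being used to calculate addresses is to model the arithmetic directly. The Python sketch below assumes two's-complement arithmetic that wraps modulo 2^32, as a 32-bit x86 address calculation does; the helper names `to_signed32` and `add_offset` are invented for this illustration. It shows that a signed 32-bit offset added to a 32-bit base can still land on any of the 2^32 addresses, including ones above the 2 GB mark.

```python
WORD = 2**32  # number of distinct 32-bit addresses


def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value %= WORD
    return value - WORD if value >= 2**31 else value


def add_offset(base, offset):
    """Model a 32-bit address calculation: the sum wraps modulo 2**32."""
    return (base + offset) % WORD


if __name__ == "__main__":
    base = 0x10000000
    target = 0xF0000000  # well above the 2 GB mark
    offset = to_signed32(target - base)  # does not fit as a positive 31-bit value
    print(f"signed offset: {offset:#x}")
    print(f"base + offset: {add_offset(base, offset):#010x}")
```

The distance from `base` to `target` is larger than a signed 32-bit integer can hold as a positive number, yet the wrapped negative offset still reaches the target. This models the address arithmetic only; the 2 GB per-process figure on 32-bit Windows is the OS's user/kernel split, which is a separate matter.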

Just as your 16 bit processor used to have a 20 bit hardware address
bus to access 640 KB of 1 MB, Intel's modern 32-bit processors have 36
bit address busses. One thing that is still the same is that the OS
still limits the access your program has to the paging capabilities of
the memory management. See:

http://en.wikipedia.org/wiki/Physical_Address_Extension
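For a sense of scale, the reach of the three address widths mentioned here (20, 32 and 36 bits) is just a power of two. A small Python sketch of that arithmetic follows; the function name `addressable_bytes` and the labels are made up for this example.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2**address_bits


if __name__ == "__main__":
    widths = (
        ("20-bit bus (XT/AT era)", 20),
        ("32-bit flat address", 32),
        ("36-bit physical address (PAE)", 36),
    )
    for label, bits in widths:
        size = addressable_bytes(bits)
        print(f"{label}: {size} bytes = {size / 2**30:g} GiB")
```

The 36-bit figure is what the physical bus can reach; what a single 32-bit process can see is still bounded by its 32-bit virtual addresses and by what the OS chooses to map.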

See, things haven't changed so much in the last 20 years. There are
just more address lines in the CPU flat memory model. Didn't some guy
named Moore suggest something that would make that useful?

Dale B. Dalrymple