Maker Pro

Finite State Machine (OT?)

B

Brian Inglis

Jan 1, 1970
0
A subroutine that doesn't call any other subroutines. I thought this
was standard terminology; if it's IBMese, I apologise.

ISTM on commonly used stack based machines the difference tends to be
insignificant. Whereas on non-stack machines like IBM 3x0 with
generalized subroutine call conventions, where you have to
(dynamically) allocate a register save area in non-leaf routines,
there's more significance to and awareness of the difference.
 
C

Casper H.S. Dik

Jan 1, 1970
0
A subroutine that doesn't call any other subroutines. I thought this
was standard terminology; if it's IBMese, I apologise.


"leaf subroutine" is also used on SPARC (leaf subroutines don't bother
to use a register window and they return with "retl", return from leaf
subroutine)


Casper
 
K

KR Williams

Jan 1, 1970
0
Maybe so. But at a couple of billion dollars a pop, CPUs can get
expensive. Add another bunch of billions for applications support,
maybe.

$Billions? Methinks you exaggerate. There are still survivors.
Even the 8051 will survive, much to your chagrin. ;-)
The argument was that PA-risc and Alpha have already fallen, and Sparc
is in trouble. Itanic isn't looking too good, either.

PA-RISC and Alpha didn't fall. They were pushed. Itanic? It
may indeed make a big thud. ;-)
 
J

John Larkin

Jan 1, 1970
0
$Billions? Methinks you exaggerate. There are still survivors.
Even the 8051 will survive, much to your chagrin. ;-)

Afraid so. Small embedded-type chips are easy to make, so they will be
around forever. There have been something like 100 or so variants on,
say, the HC05 alone. But at 90 nm, 8-layer copper, strange
dielectrics, multiple CPU cores running at 5 GHz, and megabytes of
cache, the air's getting mighty thin. At 45 nm, it will only be worse.

It's just sad that the most hideous, klugey architectures (80xx,
Windows) win through sheer force and mass.
PA-RISC and Alpha didn't fall. They were pushed. Itanic? It
may indeed make a big thud. ;-)

Itanic is now ten years old and costs are in the billions. And sales
are pathetic. Not even the big boys are any good at predicting the
future.

John
 
S

Sander Vesik

Jan 1, 1970
0
In comp.arch Casper H.S. Dik said:
"leaf subroutine" is also used on SPARC (leaf subroutines don't bother
to use a register window and they return with "retl", return from leaf
subroutine)

Until this discussion I could have sworn this was a universal term.
 
A

Anne & Lynn Wheeler

Jan 1, 1970
0
Casper H.S. Dik said:
"leaf subroutine" is also used on SPARC (leaf subroutines don't bother
to use a register window and they return with "retl", return from leaf
subroutine)

the original cp/67 kernel that i got in jan '68 had all intra-kernel
linkages via 360 supervisor call. the kernel convention was that the
svc call interrupt routine would dynamically allocate a savearea for
the called routine ... also do a little bookkeeping and trace entry
for debugging. the svc return would unallocate the savearea and return.

one of the pathlength things i did was go thru all of the kernel and
identify all leaf routines. i identified these and modified the kernel
call macro to recognize a leaf routine was being called and do a BALR
in place of an svc8. i then defined a fixed (unused/reserved) location
in page zero for temporary register save ... this then eventually came
to be called "balrsave" (so the leaf routine saved caller's registers
in page zero temporary area rather than in a passed save area).

the next thing was to go thru and identify all non-leaf routines that
only made calls to leaf routines ... these then became sort of 2nd
order leaf routines. these were also modified so that the caller used
BALR in place of svc8. however, these routines instead of using
"balrsave" for temporary save of the caller's registers used an
adjacent area that came to be called "freesave".
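
The two passes described above (find the leaf routines, then find the routines that call only leaf routines) can be sketched as a small call-graph classification. This is only an illustration under assumptions: the routine names and the call graph are invented, and the mapping to BALR/svc8 and to "balrsave"/"freesave" simply restates the description in this post, not the actual cp/67 source.

```python
def classify_routines(calls):
    """Classify routines from a call graph.

    calls maps a routine name to the set of routines it calls.
    Returns a dict: name -> "leaf", "second-order leaf", or "other".
    """
    kinds = {}
    # Pass 1: a leaf routine calls nothing.
    for name, callees in calls.items():
        if not callees:
            kinds[name] = "leaf"
    # Pass 2: a second-order leaf calls only leaf routines.
    for name, callees in calls.items():
        if name in kinds:
            continue
        if all(kinds.get(c) == "leaf" for c in callees):
            kinds[name] = "second-order leaf"
        else:
            kinds[name] = "other"
    return kinds


# Hypothetical call graph; these routine names are made up for illustration.
CALLS = {
    "getfree": set(),
    "lockpage": set(),
    "pagein": {"getfree", "lockpage"},
    "dispatch": {"pagein", "getfree"},
}

# Linkage and register save area per class, as described in the post.
LINKAGE = {
    "leaf": ("BALR", "balrsave"),
    "second-order leaf": ("BALR", "freesave"),
    "other": ("svc8", "dynamically allocated savearea"),
}

KINDS = classify_routines(CALLS)
for routine, kind in KINDS.items():
    print(routine, kind, LINKAGE[kind])
```

One plausible reading of why two fixed areas suffice: a second-order routine keeps its caller's registers in "freesave" while the leaf routine it calls uses "balrsave", so the two saves never overlap.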

For various reasons, the svc8/svc12 calling convention originally took
approx 275 microseconds on 360/67 (per call) ... it was possible to
optimize that down to about 100 microseconds by recoding some of the
stuff used for debugging purposes. Several of the leaf routines were
high frequency calls and performed operations on the order of a hundred
microseconds or less ... and therefore the svc8/svc12 calling
convention was on the order of half that processing time.
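
As a rough check on those numbers, here is the arithmetic worked out. The 275 and 100 microsecond linkage costs come from the post; taking the routine's own work as exactly 100 microseconds is an assumption (the post says "on the order of a hundred microseconds or less"), so the fractions below are a lower bound on the linkage share.

```python
def linkage_fraction(linkage_us, work_us):
    """Share of a call's total time spent in call/return linkage."""
    return linkage_us / (linkage_us + work_us)


# Assumed: the leaf routine itself does 100 microseconds of work.
WORK_US = 100

ORIGINAL = linkage_fraction(275, WORK_US)   # original svc8/svc12 linkage
OPTIMIZED = linkage_fraction(100, WORK_US)  # after recoding the debug bookkeeping

print(f"original linkage share:  {ORIGINAL:.2f}")
print(f"optimized linkage share: {OPTIMIZED:.2f}")
```

Even with the optimized 100 microsecond linkage, half of the time for such a call goes to the linkage itself, which is why replacing the svc with a BALR for high-frequency leaf routines was worth doing.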

The svc call to BALR change picked up something like 20-30 percent of
(remaining) kernel time ... on a kernel that I had already optimized
to pick up something like 80 percent with fastpath changes described
in previous posts
http://www.garlic.com/~lynn/2004f.html#6 Infiniband - practicalities for small clusters

the earlier 80 percent kernel overhead optimization (presented at fall
'68 boston share) had included various interrupt and dispatching
fastpath as well as special case fastpath for various virtual machine
simulation operations. It also included the reduction in the svc8/12
call/return overhead from 275mics to around 100mics ... but didn't
include the BALR call changes for leaf routines.

The BALR call changes were done the following summer (of 69), when I
got con'ed into going to Boeing (student summer employee with a
fulltime management job classification level and a badge that let me
park in the management parking lot at corporate hdqtrs next to boeing
field) to help get BCS setup and operational. That summer, I also did
the first version of dynamic adaptive fair share scheduler, the global
LRU page replacement, and the hack that allowed portions of the cp
kernel to be non-resident and pageable.

scheduler refs:
http://www.garlic.com/~lynn/subtopic.html#fairshare
page replacement refs:
http://www.garlic.com/~lynn/subtopic.html#wsclock
 
R

Rob Warnock

Jan 1, 1970
0
+---------------
| [email protected] (hack) wrote:
| >>What's a leaf?
| >
| >A subroutine that doesn't call any other subroutines. I thought this
| >was standard terminology; if it's IBMese, I apologise.
|
| ISTM on commonly used stack based machines the difference tends to be
| insignificant.
+---------------

Actually, it can make a big difference even on stack based machines,
since leaf routines -- knowing they will never call any other routines --
can avoid saving any caller-save registers at procedure entry. Also,
if a leaf routine needs *no* local storage (other than available temp
registers), it can avoid most or even all of the standard procedure
prolog/epilog code (allocating some stack, setting a frame pointer, etc.).

[Obviously, for maximum debugability you need a way to turn off such
aggressive optimization, but that's no different than other optimization
tradeoffs.]


-Rob
 
A

Anton Ertl

Jan 1, 1970
0
+---------------
| ISTM on commonly used stack based machines the difference tends to be
| insignificant.
+---------------

Actually, it can make a big difference even on stack based machines,
since leaf routines -- knowing they will never call any other routines --
can avoid saving any caller-save registers at procedure entry.

Caller-saved registers are typically saved around calls; in a leaf
function they are automatically not saved, because there are no calls.
The compiler does not need to do anything special to get that benefit.

Callee-saved registers are typically saved on entry and restored on
exit. You can avoid that only if you don't use the register in the
function.

A simple compiler might use leaf/non-leaf as a criterion for deciding
whether a value should be allocated to a caller-saved or a
callee-saved register. However, a sophisticated compiler will do data
flow analysis and find out which values survive which calls and it
will decide the allocation based on that (this subsumes the leaf
heuristic).
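
A toy sketch of that allocation rule, under heavy simplifying assumptions: straight-line code only, each value has a single live range given as (definition index, last-use index), and calls are just instruction indices. The function and variable names are hypothetical; a real compiler does this with data flow analysis over a control flow graph and a cost model.

```python
def choose_register_class(live_ranges, call_sites):
    """Pick a register class per value.

    live_ranges maps a value name to (def_index, last_use_index).
    call_sites is a list of instruction indices that are calls.
    A value that is live across some call goes to a callee-saved
    register; everything else goes to a caller-saved register.
    """
    choice = {}
    for name, (start, end) in live_ranges.items():
        crosses_call = any(start < call < end for call in call_sites)
        choice[name] = "callee-saved" if crosses_call else "caller-saved"
    return choice


# Hypothetical function body with one call at instruction 3.
RANGES = {"a": (0, 5), "b": (1, 2), "c": (4, 6)}

NON_LEAF = choose_register_class(RANGES, call_sites=[3])
LEAF = choose_register_class(RANGES, call_sites=[])

print(NON_LEAF)
print(LEAF)
```

With no call sites, every value lands in a caller-saved register, so the leaf case falls out of the general rule without being treated specially.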

Followups to comp.arch

- anton
 
B

Brian Inglis

Jan 1, 1970
0
PA-RISC and Alpha didn't fall. They were pushed. Itanic? It
may indeed make a big thud. ;-)

HP might do well to securely archive their PA and Alpha design
documents, far from political interference, if it's not too late
already: they may come in handy.
 
S

Sander Vesik

Jan 1, 1970
0
In comp.arch John Larkin said:
Afraid so. Small embedded-type chips are easy to make, so they will be
around forever. There have been something like 100 or so variants on,
say, the HC05 alone. But at 90 nm, 8-layer copper, strange
dielectrics, multiple CPU cores running at 5 GHz, and megabytes of
cache, the air's getting mighty thin. At 45 nm, it will only be worse.

Will there be actual 45nm in production? Considering the pains with 90
and that nobody is talking about 60 that much yet it seems to be a
bit cloudy.
 
J

Jim Thompson

Jan 1, 1970
0
Did you accidentally misslep perhaps "1972" or "1962"?

RSW considers himself a "prefect" ?????

Bwahahahahahahahahahahaha! Gag!

ROTFLMAO!!!

...Jim Thompson
 
T

Tim Auton

Jan 1, 1970
0
Hank Oredson said:
Did you accidentally misslep perhaps "1972" or "1962"?

comp.arch predates usenet and perhaps even arpanet? Interesting...


Tim
 
K

KR Williams

Jan 1, 1970
0
HP might do well to securely archive their PA and Alpha design
documents, far from political interference, if it's not too late
already: they may come in handy.

Too late, IMO. When the design team is scattered to the winds,
no documentation will help. Note that the designers are a small
part of the team. The framework supporting the design is huge.
 
S

Spehro Pefhany

Jan 1, 1970
0
Tim Auton said:
R. Steve Walz said:
Hank Oredson wrote: [yadda yadda]
Did you accidentally misslep perhaps "1972" or "1962"?
You first tell us what "misslep" means, moron.

I konw.


Tim

Damn you Tim, I laughed so hard I choked. ROTFLTICTD (?)

Cheers
Terry (oops I mean trery)

I assumed it was a metaphor. To "schlep" ("slep" could be a variation,
since it comes from ML German "slepen") is "to move slowly or
laboriously", so to misschlep (misslep) would be to move slowly or
laboriously to the wrong place.

Best regards,
Spehro Pefhany
 
K

KR Williams

Jan 1, 1970
0
Afraid so. Small embedded-type chips are easy to make, so they will be
around forever. There have been something like 100 or so variants on,
say, the HC05 alone. But at 90 nm, 8-layer copper, strange
dielectrics, multiple CPU cores running at 5 GHz, and megabytes of
cache, the air's getting mighty thin. At 45 nm, it will only be worse.

You're not telling me anything new here. The products I work on
tend to be the first in line for these processes (why 8-layer?).
As they say, BTDT.
It's just sad that the most hideous, klugey architectures (80xx,
Windows) win through sheer force and mass.

Kinda like 3x0? Yeah, it's kludgy by today's standards, but
it's not the $billions of development costs, nor support costs
that keep it going. $Billions in *applications* keep it going.
X86 is no different. This is why my bets are on AMD64, rather
than Opteron. History has shown that evolution works. Revolution
doesn't cut it.

BTW, 3x0 has been around twice as long as x86 and still has
a significant following.
Itanic is now ten years old and costs are in the billions. And sales
are pathetic. Not even the big boys are any good at predicting the
future.

You completely misunderstood my comments. Intel bluffed, and
almost pulled it off. Others have noticed (both the bluff and
the failure to deliver). The market may not be wide open (that
revolution thing again), but it's there for those with chips (and
not $Billions) to push onto the table.
 