Maker Pro

PCI and caching

John Larkin

Consider a Pentium PC with main memory and a PCI bus. One can plug
memory-type devices into the PCI bus, things like video or ADC
buffers, CPCI cards, and including, I suppose, more program-space
memory.

A couple of questions:

Is there (I guess there must be) a mechanism for a Windows program to
directly map a chunk of PCI-bus memory into its virtual address space?
Anybody know how this works?

Does anybody know how the BIOS decides what should be cached? There's
nothing in a device's PCI config registers that says "don't cache me"
as far as I can tell.

I know the guy who wrote the book "PCI Bus Demystified" so I asked
him; he hadn't a clue about any of this.

Thanks,

John
 
John said:
Consider a Pentium PC with main memory and a PCI bus. One can plug
memory-type devices into the PCI bus, things like video or ADC
buffers, CPCI cards, and including, I suppose, more program-space
memory.

A couple of questions:

Is there (I guess there must be) a mechanism for a Windows program to
directly map a chunk of PCI-bus memory into its virtual address space?
Anybody know how this works?

Don't know about Windows, but on Linux you'd use mmap() to map a
physical memory range, such as a PCI VGA card's frame buffer, into a
process's virtual memory. Check out Linux Device Drivers for more
info: http://www.xml.com/ldd/chapter/book/ . Specifically, check out
chapter 13.

I guess it's much the same on Windows; look for documentation of
mmap() or its Windows equivalent.
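
For what it's worth, a minimal user-space sketch of the Linux
approach, assuming root privileges and /dev/mem access; the base
address and length are made-up placeholders, not anything from a real
card:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* hypothetical PCI BAR base and length; build with
       -D_FILE_OFFSET_BITS=64 so the offset fits in off_t */
    off_t phys_base = 0xE0000000;
    size_t len = 4096;

    /* O_SYNC asks for an uncached mapping of device memory */
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) { perror("open /dev/mem"); return 1; }

    volatile uint32_t *regs = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                   MAP_SHARED, fd, phys_base);
    if (regs == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    printf("first register reads 0x%08x\n", regs[0]);

    munmap((void *)regs, len);
    close(fd);
    return 0;
}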

Does anybody know how the BIOS decides what should be cached? There's
nothing in a device's PCI config registers that says "don't cache me"
as far as I can tell.

Don't know, but mmap() seems to handle this. CPUs that implement
caching typically have ways to prevent certain memory I/O from being
cached. Usually this can only be done in supervisory mode, so you really
need to ask your OS to set it up for you.
 
Mac

Consider a Pentium PC with main memory and a PCI bus. One can plug
memory-type devices into the PCI bus, things like video or ADC
buffers, CPCI cards, and including, I suppose, more program-space
memory.

A couple of questions:

Is there (I guess there must be) a mechanism for a Windows program to
directly map a chunk of PCI-bus memory into its virtual address space?
Anybody know how this works?

Not really. Once upon a time I wrote some DOS programs that did this
(using djgpp), but I've never done it in Windows. You may have to write
device driver code to do this sort of thing; normal application code may
not have enough privilege to perform this type of mapping.

Does anybody know how the BIOS decides what should be cached? There's
nothing in a device's PCI config registers that says "don't cache me"
as far as I can tell.

I'm not 100% sure what you mean by cached, in this instance. Do you mean
cached by the CPU itself (L2 cache)? If so, I have no idea. I am pretty
sure that when control initially passes to the BIOS the cache is disabled,
though. At some point, the BIOS turns on the cache, but this may be very
late in the boot process.

But if you are talking about accesses being cached by the North Bridge,
then I would guess that PCI accesses are never cached.

I think the way this works is that memory access from the CPU goes over
the host bus to the North Bridge, and the North Bridge routes the access
to PCI space or SDRAM as appropriate. If it is a PCI access, the North
Bridge would never cache it, I assume.

I know the guy who wrote the book "PCI Bus Demystified" so I asked
him; he hadn't a clue about any of this.

Thanks,

John

What are you trying to do? As far as I know (which may not be very far!),
you don't need to worry about this issue unless you are doing something
really obscure.

I have written DOS code (again, with djgpp) which accessed PCI config
space (and memory mapped areas) for reading and writing and there were no
problems with caching.

HTH!

--Mac
 
John Larkin

What are you trying to do? As far as I know (which may not be very far!),
you don't need to worry about this issue unless you are doing something
really obscure.

I'm thinking about doing a CPCI board that would look like a smallish
block of stuff in memory space (as opposed to i/o space) on the PCI
bus. I was wondering if a Win application could get at it directly,
without doing a driver call for every i/o access, and how, in general,
a PC decides what's cachable and what's not. My board would deliver
realtime data in the register block ('volatile' in C-speak, I think)
so it must not be cached at any level.
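
To make the 'volatile' part concrete, here's a minimal sketch of how
such a register block might look from C once it's mapped in. The
layout and base address are hypothetical, and volatile only stops the
compiler from caching values in registers - the CPU and chipset
caches are a separate matter:

#include <stdint.h>

/* Hypothetical register layout for the board */
struct board_regs {
    volatile uint32_t status;   /* bit 0: sample ready (made up) */
    volatile uint32_t adc_data; /* realtime data; must come from the bus */
    volatile uint32_t control;
};

/* Hypothetical base address, as mapped by the BIOS/OS */
#define BOARD_BASE ((struct board_regs *)0xFEB00000)

uint32_t read_sample(void)
{
    struct board_regs *regs = BOARD_BASE;

    while (!(regs->status & 0x1))
        ;                       /* every poll is a real bus read */
    return regs->adc_data;
}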

I'm guessing everything beyond the contiguous RAM space is not cached,
and/or maybe anything above the 2 GB line doesn't get cached. Funny
how little seems to be known about this.

I have written DOS code (again, with djgpp) which accessed PCI config
space (and memory mapped areas) for reading and writing and there were no
problems with caching.

Yeah, we have a couple of PowerBasic programs that run under DOS or
9x, that search PCI config space for a device and then drag it down
into a hole in real space, between 640K and 1M. Of course, first you
have to locate an unused, uncached hole to plop it into, and that
seems to be different from BIOS to BIOS. We've found systems that have
unused, cached holes!

John
 
Hal Murray

I'm thinking about doing a CPCI board that would look like a smallish
block of stuff in memory space (as opposed to i/o space) on the PCI
bus. I was wondering if a Win application could get at it directly,
without doing a driver call for every i/o access, and how, in general,
a PC decides what's cachable and what's not. My board would deliver
realtime data in the register block ('volatile' in C-speak, I think)
so it must not be cached at any level.

I'm rusty on this stuff, so be suspicious.

I/O space is a kludge carried over from ISA. Best to avoid it, but I
don't think it's really any different (other than not having many
address bits).

I think there is a cachable bit in the config space options.

If you have appropriate driver support, you can map some of an
application's virtual address space to "memory" on your device, and
it's easy to make that uncached.

With appropriate driver support, I've written diagnostics/hacks that
ran in user space. Interrupts are a bit tricky; I forget the details.
I think there was a magic location to read that turned off the
interrupt. We probably had an API call to tell the driver the address
of that location, so it would read it and save the answer, plus
another call to get that saved data (probably just the last one) and
a count of how many interrupts had happened since the last time you
asked.
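
From user space the pattern might look something like this; every
name here (device node, ioctl numbers, struct) is a hypothetical
illustration of the idea, not a real driver API:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

struct irq_info {
    unsigned long last_ack_value; /* what the driver read to clear the IRQ */
    unsigned long count;          /* interrupts since the last query */
};

/* hypothetical ioctls for a hypothetical diagnostic driver */
#define DIAG_SET_ACK_ADDR _IOW('d', 1, unsigned long)
#define DIAG_GET_IRQ_INFO _IOR('d', 2, struct irq_info)

int main(void)
{
    struct irq_info info;
    int fd = open("/dev/diag0", O_RDWR); /* hypothetical device node */

    if (fd < 0) { perror("open"); return 1; }
    /* tell the driver where the "magic location" lives */
    ioctl(fd, DIAG_SET_ACK_ADDR, 0x10UL);
    sleep(1);                            /* let some interrupts happen */
    ioctl(fd, DIAG_GET_IRQ_INFO, &info);
    printf("%lu interrupts, last ack value 0x%lx\n",
           info.count, info.last_ack_value);
    close(fd);
    return 0;
}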


Yeah, we have a couple of PowerBasic programs that run under DOS or
9x, that search PCI config space for a device and then drag it down
into a hole in real space, between 640K and 1M. Of course, first you
have to locate an unused, uncached hole to plop it into, and that
seems to be different from bios to bios. We've found systems that have
unused, cached holes!

The deal with PCI is that the BIOS allocates PCI addresses. If a
device wants X bytes of address space, it must be aligned on an
X-byte boundary. Unless you are very lucky, that will result in
holes. (Lucky means you filled in all the holes with chunks from
other devices.)
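
That alignment rule is also how the BIOS sizes each region: it writes
all 1s to the BAR, reads it back, and the device returns 0s in the
address bits it ignores. A sketch of the decode arithmetic for a
32-bit memory BAR:

#include <stdint.h>
#include <stdio.h>

/* Decode the value read back after writing 0xFFFFFFFF to a memory BAR */
uint32_t bar_size(uint32_t readback)
{
    uint32_t mask = readback & ~0xFu; /* strip the low 4 attribute bits */
    return ~mask + 1;                 /* size; the base must be aligned to it */
}

int main(void)
{
    /* e.g. a card that decodes 64K reads back 0xFFFF0008 */
    printf("region size: %u bytes\n", bar_size(0xFFFF0008u));
    return 0;
}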
 
Keith

Does anybody know how the BIOS decides what should be cached? There's
nothing in a device's PCI config registers that says "don't cache me"
as far as I can tell.

Not PCI config registers. Memory configuration registers. MTRRs (Memory
Type Range Registers) in the processor and chipset control such things.
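
On Linux you can actually see what got set up, since the kernel
exposes the MTRRs; a quick sketch that just dumps them:

#include <stdio.h>

int main(void)
{
    /* Each line shows a range and its memory type, e.g.
       "reg00: base=0x00000000 (   0MB), size= 512MB: write-back" */
    FILE *f = fopen("/proc/mtrr", "r");
    char line[256];

    if (!f) { perror("/proc/mtrr"); return 1; }
    while (fgets(line, sizeof line, f))
        fputs(line, stdout);
    fclose(f);
    return 0;
}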
I know the guy who wrote the book "PCI Bus Demystified" so I asked him;
he hadn't a clue about any of this.

Memory on the PCI bus can no longer be cached (deprecated in V2.2,
IIRC). Cacheable memory on the PCI bus is a mess, so most systems,
even before it was yanked out of the spec, didn't support it.

I *highly* recommend "PCI System Architecture" by Shanley and Anderson
as an introduction/reference. MindShare has a very good series of books
on buses and processors. They do an *excellent* series of courses too,
if a tad expensive. Their books are available in dead-tree form or as
e-books: http://www.mindshare.com
 
Mac

I'm thinking about doing a CPCI board that would look like a smallish
block of stuff in memory space (as opposed to i/o space) on the PCI
bus. I was wondering if a Win application could get at it directly,
without doing a driver call for every i/o access, and how, in general,
a PC decides what's cachable and what's not. My board would deliver
realtime data in the register block ('volatile' in C-speak, I think)
so it must not be cached at any level.

Hmm. It seems like other people must have done this (or something
like it) before. I am guessing that this will work fine without any
special precautions.

I'm guessing everything beyond the contiguous ram space is not cached,
and/or maybe anything above the 2Gig line doesn't get cached. Funny how
little seems to be known about this.

Well, there are people out there who know this stuff, but they might not
be reading here. There is a lot of information about Intel Architecture
that is hard to ferret out. Most of it doesn't seem to be in any kind of
real specification, either.

Anyway, I am quite sure that accesses to memory-mapped areas on the PCI
bus are totally distinct from true memory accesses, in the sense that
the North Bridge is well aware which area it is accessing. So I don't
think you have to worry about caching there.

As for the L2 cache, I'm not sure how that is managed. Maybe it only
caches instructions. But it obviously doesn't interfere with normal device
driver operation, so I don't think you have to worry about it, either.
As you can tell, I'm just guessing, but it seems like there are a lot of
things that wouldn't work right if these types of accesses were cached by
hardware.

Yeah, we have a couple of PowerBasic programs that run under DOS or
9x, that search PCI config space for a device and then drag it down
into a hole in real space, between 640K and 1M. Of course, first you
have to locate an unused, uncached hole to plop it into, and that
seems to be different from BIOS to BIOS. We've found systems that have
unused, cached holes!

John

Oh, wow. That sounds kind of hard. With djgpp and a protected mode stub,
you can write real 32-bit programs for DOS. No worries about dragging
stuff down below 1M. You don't get any kind of GUI, but for some tasks
that is not a problem.

The Intel architecture and its associated DOS baggage are unbelievably
arcane. It sure would be nice to dump it all and start over. ;-)

--Mac
 
John Larkin

Oh, wow. That sounds kind of hard. With djgpp and a protected mode stub,
you can write real 32-bit programs for DOS. No worries about dragging
stuff down below 1M. You don't get any kind of GUI, but for some tasks
that is not a problem.

It wasn't bad... just had to figure out a couple of BIOS calls. It's
available for anybody who's interested. We also wrote a couple of cute
utilities that scan the 0..1M address space; one shows the data
contents, one graphs access time vs address. Between the two you can
pretty much figure out where the BIOS has put things (like shadows of
itself!) and what's cached.
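
The access-time trick is easy to reproduce on a Pentium; a sketch of
the timing core using the TSC (GCC inline asm, x86):

#include <stdint.h>

static inline uint64_t rdtsc(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Cycles for one read at the given address: slow and constant suggests
   uncached bus cycles; fast after the first read suggests caching. */
uint64_t time_read(volatile uint32_t *addr)
{
    uint64_t t0 = rdtsc();

    (void)*addr;              /* the access being timed */
    return rdtsc() - t0;
}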

We like to use DOS for our dedicated VME/PCI test programs (rackmount
PC or VME card embedded Pentium) because we can own the CPU and run
pretty much realtime, even turning off or dancing around the 18 Hz
clock tick in extreme cases.

The Intel architecture and its associated DOS baggage are unbelievably
arcane. It sure would be nice to dump it all and start over. ;-)

Well, I'm typing on, essentially, a heavily kluged 8008 CPU, which was
a ghastly bad architecture to start on. How we wound up with Intel and
Microsoft as the dominant computer architecture is one of the
tragedies of our time. Think of how things would be if IBM had gone
with the 68K and anybody but Bill.

John
 
Mark McDougall

Keith said:
Memory on the PCI bus can no longer cached (deprecated in V2.2,
IIRC). Cacheable memory on the PCI bus is a mess so most systems,
even before it was yanked out of the spec, didn't support it.

I'm not sure where you got this info, but it's news to me! :O

To answer the OP:

As for mapping PCI memory from Windows, there's a toolkit called TVICPCI
<http://entechtaiwan.net/dev/pci/index.shtm> which grants user-space
access to PCI resources, which of course includes memory. They have a
demo version for evaluation. You won't be able to tell Windows to use
this as generic memory space (i.e. Windows won't be executing out of
it), but you will be able to use it for your own data.

There's a bit in the PCI BAR that specifies whether or not the region is
pre-fetchable. If set, a PCI master will know that it is allowed to use
MRL (memory read line) or MRM (memory read multiple) on that address
space. That doesn't necessarily mean it will.
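
For reference, the low bits of a memory BAR read back from config
space decode like this (a sketch; bit 3 is the prefetchable bit):

#include <stdint.h>
#include <stdio.h>

void decode_bar(uint32_t bar)
{
    if (bar & 0x1) {                     /* bit 0 set: I/O space BAR */
        printf("I/O BAR at 0x%04x\n", bar & ~0x3u);
        return;
    }
    int type = (bar >> 1) & 0x3;         /* 00 = 32-bit, 10 = 64-bit */
    int prefetchable = (bar >> 3) & 0x1; /* the bit discussed above */

    printf("memory BAR: %s, %sprefetchable, base 0x%08x\n",
           type == 2 ? "64-bit" : "32-bit",
           prefetchable ? "" : "not ",
           bar & ~0xFu);
}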

Regards,
Mark
 
Keith

I'm not sure where you got this info, but it's news to me! :O

The specs, perhaps?
To answer the OP:

As for mapping PCI memory from windows, there's a toolkit called TVICPCI
<http://entechtaiwan.net/dev/pci/index.shtm> which grants user space
access to PCI resources, which of course includes memory. They have a
demo version for evaluation. You won't be able to tell Windows to use
this as generic memory space (ie. Windows won't be executing out of it),
but you will be able to use it for your own data.

There's a bit in the PCI BAR that specifies whether or not the region is
pre-fetchable. If set, a PCI master will know that it is allowed to use
MRL (memory read line) or MRM (memory read multiple) on that address
space. That doesn't necessarily mean it will.

That doesn't change the cacheability. Caches are a processor thing and
*NOT* under control of any PCI device.
 
Mark McDougall

Keith said:
That doesn't change the cacheability. Caches are a processor thing
and *NOT* under control of any PCI device.

Which begs the question, why would the PCI spec refer to something that
has nothing to do with PCI?

If a PCI memory space is marked as 'pre-fetchable' then it guarantees,
among other things, that the act of pre-fetching memory has no
side-effects. This means nothing more than the fact that it may be a
suitable candidate for caching, if the platform supports it. In this
case, a master may issue MRL (& MRM) commands.

OTOH, cache-coherency (which I assume you're hinting at) is a different
problem altogether - especially if you've got multiple bus masters
accessing PCI memory space with their own caches. However, this is a
*system* problem and (IMHO) not really any concern of the PCI bus spec
group to mandate that PCI memory is not 'cacheable' - whatever that
means in each context!

In fact, there's little discussion whatsoever in the spec (that I can
see) about 'caches' - which is just what I would expect.

BTW I'm quite happy to be shown the error in my reasoning!

Regards,
Mark
 
John Larkin

Which begs the question, why would the PCI spec refer to something that
has nothing to do with PCI?

If a PCI memory space is marked as 'pre-fetchable' then it guarantees,
among other things, that the act of pre-fetching memory has no
side-effects. This means nothing more than the fact that it may be a
suitable candidate for caching, if the platform supports it. In this
case, a master may issue MRL (& MRM) commands.

OTOH, cache-coherency (which I assume you're hinting at) is a different
problem altogether - especially if you've got multiple bus masters
accessing PCI memory space with their own caches. However, this is a
*system* problem and (IMHO) not really any concern of the PCI bus spec
group to mandate that PCI memory is not 'cacheable' - whatever that
means in each context!

In fact, there's little discussion whatsoever in the spec (that I can
see) about 'caches' - which is just what I would expect.

But only the PCI card itself knows whether its memory-space registers
are suitable for caching. If it's video memory, they likely are; if
it's an ADC buffer, it sure ain't. Seems to me that, if PCI space is
allowed to be cached, there should be a mechanism that allows a PCI
card to tell the BIOS (or OS) whether caching is a good idea for it.

John
 
Mark said:
Which begs the question, why would the PCI spec refer to something that
has nothing to do with PCI?

If a PCI memory space is marked as 'pre-fetchable' then it guarantees,
among other things, that the act of pre-fetching memory has no
side-effects. This means nothing more than the fact that it may be a
suitable candidate for caching, if the platform supports it. In this
case, a master may issue MRL (& MRM) commands.

OTOH, cache-coherency (which I assume you're hinting at) is a different
problem altogether - especially if you've got multiple bus masters
accessing PCI memory space with their own caches. However, this is a
*system* problem and (IMHO) not really any concern of the PCI bus spec
group to mandate that PCI memory is not 'cacheable' - whatever that
means in each context!

In fact, there's little discussion whatsoever in the spec (that I can
see) about 'caches' - which is just what I would expect.

BTW I'm quite happy to be shown the error in my reasoning!

Don't confuse pre-fetching with caching. And don't confuse MRM with
cacheable. To be able to pre-fetch data simply requires that a read of
that data not cause any side effect. To be able to cache data requires,
in addition, that holding back a write to that data not cause any side
effect.

Generally the only cacheable device is generic RAM. When you output to
any other device besides RAM you expect to see that output in the real
world. RAM is therefore the only safe device to cache.

For example, say you have a memory mapped variable 'foo'. You need to
send a signal to the device that 'foo' belongs to and it needs to be a
pulse. Say you code something like:

foo = 1;
delay(1);
foo = 0;

If 'foo' were cached, then the signal may or may not be generated in
this case. Even with the delay statement (which prevents some C
compilers from optimising away foo = 1), 'foo' may not be written back
to the device if there is still enough room in the cache to avoid
writing 'foo' out.

Pre-fetchable means it is safe to read data multiple times. Cacheable
means it is safe to not write data out immediately.
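
As an aside, in a real program 'foo' would be declared through a
volatile pointer so the compiler itself can't defer or elide the
accesses; a sketch with a placeholder address (hardware caching still
has to be handled separately, via MTRRs or page attributes):

#include <stdint.h>

extern void delay(int ms); /* as in the example above */

/* hypothetical mapped register address */
static volatile uint32_t *const foo = (volatile uint32_t *)0xFEB00000;

void pulse(void)
{
    *foo = 1; /* with volatile, each store reaches the bus, in order */
    delay(1);
    *foo = 0;
}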
 
I guess the answer for Linux is hidden between

"Chapter 9, page 236: I/O Registers and Conventional Memory"

and

"Chapter 12, page 316: Accessing the I/O and Memory Spaces"

of LDD3.

yusuf
 
Keith Williams

Which begs the question, why would the PCI spec refer to something that
has nothing to do with PCI?

Short answer: Snoops from any PCI initiator into PCI cached memory
are in the purview of the spec. ;-)

Longer answer[*]: The SBO# (Snoop Back Off) and SDONE (snoop done)
signals are part of the pre-2.2 PCI spec (actually, I believe in 2.2
it's recommended that they not be used and instead be pulled high).
These are used by the memory bridge to initiate retries to cached
memory. SDONE indicates an access to cached memory is complete. SBO#
active indicates a cached line is being accessed and the access must
be terminated by the initiator and retried later.

[*] I've never used these things, so I'm not really up on the spec
here. The performance is horrible, so it isn't often implemented and
is even less often used.
If a PCI memory space is marked as 'pre-fetchable' then it guarantees,
among other things, that the act of pre-fetching memory has no
side-effects. This means nothing more than the fact that it may be a
suitable candidate for caching, if the platform supports it. In this
case, a master may issue MRL (& MRM) commands.

OTOH, cache-coherency (which I assume you're hinting at) is a different
problem altogether - especially if you've got multiple bus masters
accessing PCI memory space with their own caches. However, this is a
*system* problem and (IMHO) not really any concern of the PCI bus spec
group to mandate that PCI memory is not 'cacheable' - whatever that
means in each context!

But it *is* part of the (pre 2.2) spec. The bus must guarantee
coherency in this case. The way it does it is with back-offs and
retries. Now think about this with multiple bridges and
initiators. It gets to be a mess.
In fact, there's little discussion whatsoever in the spec (that I can
see) about 'caches' - which is just what I would expect.

It's there in the older versions of the spec. As I've mentioned,
it's been deprecated in later versions.
 
Frank-Christian Kruegel

Consider a Pentium PC with main memory and a PCI bus. One can plug
memory-type devices into the PCI bus, things like video or ADC
buffers, CPCI cards, and including, I suppose, more program-space
memory.

A couple of questions:

Is there (I guess there must be) a mechanism for a Windows program to
directly map a chunk of PCI-bus memory into its virtual address space?
Anybody know how this works?

With Windows NT (2k/XP/2k3), applications can never access hardware
directly. (*) You will have to write a kernel-mode device driver to
control your piece of hardware. The NT kernel-mode API is very
different from what you know from the user-mode Windows API. You will
need the Microsoft Platform SDK, the DDK, and the Microsoft C compiler.
The DDK contains samples and documentation for everything. Try to
minimize the user/kernel mode switches.

(*) There are kludges like giveio.sys etc., which allow user
applications to access I/O ports. Forget about these - they won't be
enough for you.
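
Inside such a driver, the piece relevant to the caching question here
is the mapping call; a kernel-mode sketch, where the physical address
would really come from the PnP-assigned resources rather than being
passed in arbitrarily:

#include <ntddk.h>

/* Map the card's BAR region into system space with caching disabled. */
volatile ULONG *MapBoardRegs(PHYSICAL_ADDRESS barPhys, SIZE_T len)
{
    /* MmNonCached is what keeps the CPU from caching the range */
    PVOID va = MmMapIoSpace(barPhys, len, MmNonCached);

    return (volatile ULONG *)va; /* NULL on failure */
}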


With kind regards,

Frank-Christian Krügel
 
John said:
Does anybody know how the BIOS decides what should be cached? There's
nothing in a device's PCI config registers that says "don't cache me"
as far as I can tell.

The BIOS doesn't. The device driver (which knows the PCI card
intimately) asks for a non-cached mapping when it requests a virtual
mapping of the card's physical address range, if that's appropriate.
If there's no reason the range can't be cached, it doesn't bother.
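
On Linux, for example, that driver-side step is ioremap; a sketch
with placeholder arguments:

#include <asm/io.h>

static void __iomem *regs;

/* Map the card's physical range, uncached, into kernel virtual space */
int board_map(unsigned long phys, unsigned long len)
{
    regs = ioremap_nocache(phys, len);
    return regs ? 0 : -1;
}

unsigned int board_status(void)
{
    return readl(regs); /* each call is a real, ordered bus read */
}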

Steve
 
John Larkin

The BIOS doesn't. The device driver (which knows the PCI card
intimately) asks for non-cached memory when it asks for a virtual
mapping of the physical address range of the card. If that's
appropriate. If there's no reason why it can't be cached, it doesn't.

Steve

--------------------------------------------------------------------------------------------
Steve Schefter phone: +1 705 725 9999 x26
The Software Group Limited fax: +1 705 725 9666
642 Welham Road,
Barrie, Ontario CANADA L4N 9A1 Web: www.wanware.com


But we often use PCI devices under DOS, with no device driver at all.
It appears to me that the BIOS locates devices in PCI config space,
looks at the requested resources (in the PCI overhead reggies), and
assigns memory space, as requested, to the gadgets. Usually these
addresses are really high, past 2G as I recall. So the cached/uncached
situation must be resolved, somehow, before the OS boots, although it
can certainly be changed by drivers later.


John
 