Software in many cases interacts with hardware, the peripheral devices, in order to interact with its physical environment. Those hardware-dependent software parts, in the context of an operating system better known as device drivers, are crucial for system performance and stability. In order to design hardware-dependent software, the principles and foundations of the interaction between hardware and software need to be understood on the lowest level as well as on an abstract level. The reader can follow the ideas and principles from foundations in computer architecture via low-level communication up to software design and development methods. Describing the interaction with UML gives the software engineer direct hints on how to design the software based on model-driven techniques and shows the limits of its expressiveness in this area. The textbook avoids programming language or operating system dependencies to reveal the underlying, often hidden principles. Nevertheless, as software development is complex in this area, one focus point in the development cycle is on debugging techniques for hardware-dependent software.
Number of pages: 311
Introduction
Objectives of Hardware-dependent Software
Computer Architecture Foundations
3.1. Central Processing Unit
3.2. Bus-System
3.3. Peripheral Devices
3.3.1. Device Core
3.3.2. Register File
3.3.3. Device and Computer System Structure
3.4. Conclusion
Communication and Synchronisation
4.1. Communication
4.1.1. Nature of the Register File
4.1.2. Communication on Software Side
4.2. Synchronisation
4.2.1. Synchronisation Principles
4.2.2. Strange Behaviour of Register Files
4.2.3. Message Transfers
4.2.4. Functionality Adjustment
4.2.5. The Semantic Layer
4.2.6. Functional Synchronisation
Design
5.1. Object-Oriented Design
5.1.1. Object-Oriented Layering
5.1.2. Applying Design Pattern
5.1.3. Device Counterparts
5.1.4. Topology Analysis
5.1.5. Descriptors and Caching
5.2. Activity Based Design
5.2.1. Overview
5.2.2. Data Acquisition Example
5.2.3. Conclusion and Open Problems
5.3. State Machine Based Design
5.3.1. Finite State Machines
5.3.2. Hardware/Software-Codesign Perspective
5.3.3. Software Modelling with FSM
5.3.4. Conclusion
Device Driver
6.1. Interfacing Device Drivers
6.2. System Start-up and Shut-down
6.3. Management of Devices
6.3.1. Handling Single Devices
6.3.2. Multiple Devices and Variant Management
6.3.3. Service Addressing
6.3.4. Hot-Plugging
6.3.5. Reconfigurable Devices
Development
7.1. Software Engineering
7.1.1. Life Cycle and Project Management
7.1.2. Requirements Engineering
7.1.3. Architecture and Design
7.1.4. Implementation and Integration
7.2. Debugging
7.2.1. Observing the System
7.2.2. Software Side Debugging
7.2.3. Hardware Side Debugging
7.3. System Simulation/Emulation and Testing
7.3.1. System Simulation
7.3.2. Stubs, Mocks and other Integrated Simulations
7.3.3. Device Simulation
7.4. Summary
A. Appendix
A.1. Replacement Techniques
A.1.1. Control by Configuration Management
A.1.2. Code Changes on Calling Side
A.1.3. Binding Time Selections
A.1.4. Dynamic Branching during Runtime
What is Hardware-dependent Software (HdS) in comparison to “normal software”? What is the difference? What is specific to a device driver or a Hardware Abstraction Layer (HAL)? First of all, the term “normal software” is already hard to define, which makes the comparison more difficult. Within this book, Hardware-dependent Software (HdS) is seen as a software system which operates in close relationship with digital hardware, often called a peripheral device or simply a device. HdS interacts with the device, controls it, and steers it. Furthermore, it extends the hardware functionality, restricts it, and hides away or encapsulates problems of the hardware/software interaction.
Functionality in software components and, on the opposite side, functionality in hardware operate in their own execution domains. HdS interacts with the execution domains in hardware. The execution domains need to be connected and synchronised, which is the responsibility of HdS. This synchronisation is a major topic, so this book will focus on it.
HdS is more than a Hardware Abstraction Layer (HAL), which just provides a first common interface to the underlying hardware platform: a low-level abstraction that focuses on portability at the lowest level, that is, the level closest to the hardware.
Figure 1.1.: HdS as part of the system, without HAL or as dominating part
Within the microcontroller domain, a whole application can be interpreted as HdS, as in most cases the software just coordinates the interaction of the peripheral devices, or the control logic of the application is only a fraction in comparison to the part that can be interpreted as HdS. Within that domain, HdS can additionally be seen as a library for the development of embedded applications. Here, the HdS is the foundation for higher-level applications. As a component, it provides services similar to a software library, but including hardware interaction (see Figure 1.2).
Regarding the functionality provided, HdS and so-called device drivers are closely related. To distinguish the two, a device driver is here a hardware-dependent software component that is integrated into or operates in the context of an operating system (OS). This needs extra consideration in the design: on the one hand, the OS restricts interfaces and communication channels, or rather determines parts of the architecture. On the other hand, the OS provides common services that ease the design of device drivers. So software that is typically denoted as a device driver is here HdS in the environment of an operating system.
Figure 1.2.: Hardware-Dependent Software in comparison to libraries
Figure 1.3.: HdS as low-level component of a library/OS
Peripheral devices often interact with other (hardware) components or, to be more precise, with an outside world. As software contains a model of the domain it is designed for, the behaviour of the outside world and the interaction with it need consideration during the design of the HdS. But the focus stays on the interaction with peripheral devices. So the control of a robot, factory, UAV, etc. falls into the application domain and is off topic here.
Nevertheless, the transition is smooth. HdS ranges from a tiny HAL, via more complex modules and device drivers, up to nearly full applications. All of them are closely related to or interact with hardware devices. This interaction leads to typical structures and often makes the software design look tricky. On the other hand, the exploding demand for embedded systems with real-time, reliability, security and safety constraints makes the design of HdS a crucial domain. Within this book, means of interaction are analysed and design approaches are discussed.
So the whole book is organised in a bottom-up approach. All topics are discussed as generally as possible, to emphasise principles and methodology, and not “and in case of the controller x454B92A, you have to create a function that toggles bit 5 ...”. The latter can be found in various books on microcontroller programming, but without a look at the underlying principles and methodologies.
This book starts with a short discussion of the intention of HdS. After that, a brief review of computer architecture principles follows, to establish a common understanding of the underlying hardware architecture.
HdS and hardware devices are parallel executing elements. In parallel systems, the communication and synchronisation means are essential. Hardware devices and software execute according to different paradigms. Both sides need consideration to design software components which aim to reach higher levels of abstraction.
HdS can be designed with usual software engineering methods for parallel systems based on the found communication and interaction model. A software design is driven by different views on the system. The design of HdS from different perspectives is discussed based on the design and modelling with description languages like the Unified Modelling Language (UML).
Specific issues on device management, device drivers, and software development in this special area are later on discussed in separate chapters.
Hardware-dependent Software is software that provides services in cooperation with hardware, the device. The main objective is to make hardware services available to other, higher-level software components. Furthermore, the Hardware-dependent Software may emulate missing functionality or restrict the access to hardware functionality, so that the hardware and the environment it interacts with are protected against illegal operation requests. Another objective is the management of multiple devices, as hot-plugging techniques and power management let devices disappear during runtime. This chapter briefly introduces the objectives in order to give an overview of what the analysis and design steps of the following chapters are aiming for.
For now, let's focus on the nature of Hardware-dependent Software (HdS) for a moment. We will return to the hardware part in the section on computer architecture. In this section, the various objectives of an HdS are introduced to give a first insight into the complexity of HdS design.
Hardware-dependent Software is often denoted as low¹-level software. This is true, as it operates close to the hardware of the execution platform and its peripheral components, simply denoted as devices. In embedded systems, the focus of the system is on the interaction with a physical environment via those peripheral components/devices. Hence, the majority of software components are related to hardware. Furthermore, operating systems can be seen as a collection of device drivers, which are, in terms of this book, Hardware-dependent Software components within an operating system environment.
Figure 2.1.: Overview of the interaction
First of all, HdS makes the hardware functionality available to higher-level software: at the beginning with the initialisation and configuration of the device for the expected operations, later with interaction functions. The availability of device functionality is often crucial for the commercial success of the device vendor. If the driver (HdS) does not make the functionality available with the required performance and quality, the device cannot demonstrate its full potential. One key competence is to reveal this potential and to grant interoperation with it.
The design of the HdS determines which part of the hardware is available via the interface towards higher software functionality, here denoted as Application. Which level of abstraction is reached at the application interface? Furthermore, as the hardware may interact with other hardware components or the outside world, does the HdS provide appropriate access to them? These questions must be answered by the intended use, and the answers influence the design. For one kind of application, a very low-level channel to parts of the hardware is required; for other applications, a very high-level view is needed.
Figure 2.2.: HdS as adaptor layer between application and hardware
So in some cases, the HdS allows transparent access to parts of the device and at the same time a transparent access to devices beyond the own device with a high level of abstraction. However, HdS is the adaptor between the needs of an application and the provided capabilities of the peripheral device and the associated end point(s) inside the device core.
Figure 2.3.: HdS accesses different endpoints in device core
Software on the CPU and hardware devices execute in parallel, whereas the digital circuit in the device is a massively parallel system itself. The functionality on the software side and on the hardware side needs to synchronise at specific points. Devices provide a wide range of synchronisation means, which must be understood and whose counterpart needs to be realised in software.
Hardware devices and HdS in combination provide a service as a whole. So functionality can be implemented either in hardware or in software (compare the HW/SW-Codesign approach). Thus HdS not only makes hardware functions available, it can also extend them with additional functionality (compare Figure 2.4 a)). The functionality is emulated in software by using resources of the target platform, like the main CPU and main memory. In extreme cases, the HdS provides the service with just the latter and without any really existing device. A good example is the RAM-disk: it behaves like a hard-disk but is emulated with the CPU and the main memory by the RAM-disk driver. Another, mid-level approach is to emulate a behaviour at the application interface with the support of hardware of a different kind (compare Figure 2.4 b)). For example, a CD-ROM is emulated with a hard-disk. Read access to the disk image is supported, but write access is not allowed (or only for initialisation) in order to fulfil the semantics of a read-only device.
The partial emulation of devices implies another approach: not all functionality of a device is accessible and usable via the HdS (compare the additional hardware ports in Figure 2.4 c)). The non-availability restricts the usage of the device. This is often the case where hardware devices are designed for a superset of possible use-cases. For marketing or production reasons, the combination of specific HdS, mostly in the sense of a device driver, and the common device determines as a whole the set of available services. The variation of the software determines the functionality variant. In summary, HdS provides restricted access, enables the full potential of a device, or extends it by emulating additional functionality.
Figure 2.4.: HdS a) extends functionality, b) emulates functionality, or c) restricts access to the device core
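The two emulation examples can be illustrated with a small behavioural model. The following is only a sketch under stated assumptions: the class names RamDisk and ReadOnlyView, the block interface (read_block, write_block) and the block size are hypothetical and chosen for illustration; they are not taken from any real driver interface.

```python
class RamDisk:
    """Block device emulated purely in main memory (no real device)."""

    def __init__(self, block_count, block_size=512):
        self.block_size = block_size
        self._blocks = [bytes(block_size) for _ in range(block_count)]

    def read_block(self, index):
        return self._blocks[index]

    def write_block(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self._blocks[index] = bytes(data)


class ReadOnlyView:
    """Presents any block device as a read-only medium, e.g. an emulated CD-ROM."""

    def __init__(self, backing):
        self._backing = backing
        self.block_size = backing.block_size

    def read_block(self, index):
        return self._backing.read_block(index)

    def write_block(self, index, data):
        # The application interface keeps the semantics of a read-only device.
        raise PermissionError("medium is read-only")


disk = RamDisk(block_count=4, block_size=8)
disk.write_block(1, b"HdS-demo")      # writing is only used to initialise the image
cdrom = ReadOnlyView(disk)
```

The application sees the same block interface in both cases; only the HdS decides whether a real device, main memory, or a device of a different kind stands behind it.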
Some devices are very similar, especially if they are from the same vendor. Hence, in many cases HdS must handle a set of similar devices. The software part is common to the devices and the whole functionality variant is here determined by the device variant. In all cases, to adapt to the given hardware, it is essential to clearly identify the hardware device in type and revision. And the hardware design has to provide means to identify those variations of the device core.
An additional objective of HdS is to hide away little variations in the hardware or, better to say, hide away revision variations. A new revision of the device removes defects and the previous software work-arounds can be removed or disabled. Vice versa, new defects may be introduced and workarounds need to be supplied in HdS. Or in the sense of HW/SW-Codesign, functionality has moved between software and hardware.
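How the identification of type and revision can drive the selection of workarounds is sketched below. The layout of the identification value (upper byte type, lower byte revision), the type code 0x2A and the workaround names are purely hypothetical assumptions for illustration.

```python
from collections import namedtuple

DeviceId = namedtuple("DeviceId", "dev_type revision")


def decode_id(raw):
    """Split a 16-bit ID value; assumed layout: upper byte type, lower byte revision."""
    return DeviceId(dev_type=(raw >> 8) & 0xFF, revision=raw & 0xFF)


# Hypothetical errata table: (type, revision) -> workarounds needed in software.
WORKAROUNDS = {
    (0x2A, 1): {"slow_reset"},
    (0x2A, 2): set(),                   # defect fixed in hardware
    (0x2A, 3): {"fifo_flush_twice"},    # new defect introduced by the new revision
}


def select_workarounds(raw_id):
    """Return the workarounds for the identified device or refuse unknown hardware."""
    dev = decode_id(raw_id)
    key = (dev.dev_type, dev.revision)
    if key not in WORKAROUNDS:
        raise LookupError(f"unsupported device {dev}")
    return WORKAROUNDS[key]
```

Refusing an unknown revision instead of guessing is a design choice of this sketch; a real HdS might fall back to the most conservative variant instead.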
Along with the management of variations of single devices, a computer system may consist of a set of devices of the same kind. The HdS manages multiple devices of similar functionality in combination with multiple using applications (compare Figure 2.5). So the functionality of a specific device has to be linked to a specific application. The device management includes managing devices which are only temporarily available (hot-plugging) or which allow power-saving sleep modes that make them temporarily disappear.
Figure 2.5.: HdS adapts between Applications and Devices
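A minimal sketch of such a device management is given below, assuming exclusive use of a device by one application at a time. The class DeviceManager and its method names are hypothetical; real operating systems offer far richer mechanisms.

```python
class DeviceManager:
    """Tracks present devices and links each one to at most one application."""

    def __init__(self):
        self._owner = {}   # device name -> application name, or None if unused

    def plug(self, device):
        # A device appears (hot-plugging or wake-up from a sleep mode).
        self._owner.setdefault(device, None)

    def unplug(self, device):
        # A device disappears; returns the application that lost it, if any.
        return self._owner.pop(device, None)

    def open(self, device, application):
        if device not in self._owner:
            raise LookupError(f"{device} is not present")
        if self._owner[device] is not None:
            raise RuntimeError(f"{device} is already in use")
        self._owner[device] = application

    def close(self, device):
        if device in self._owner:
            self._owner[device] = None
```

The return value of unplug lets the HdS inform the affected application that its device has vanished during runtime.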
Furthermore, HdS has to protect the hardware and the surrounding system, either in software or in hardware. Obviously, the HdS must not allow the device hardware to be damaged or destroyed by illegal operations of the application. The commands and requested actions need to be filtered according to the current system state and maybe according to the requester of the action. HdS operates in combination with hardware that often controls physical systems that can do harm to the outside environment. So in the sense of safety, it must not be the origin of any hazard, especially if it is part of a safety-critical system, for instance in medical devices or avionic control systems².
And, last but not least, HdS has to be very stable, reliable, and of high software quality. HdS typically has full access to the hardware without any limitations. The HdS must not, either by intention or by failure, access hardware other than the assigned hardware or have any side effects on other hardware or software components in the computer system. Device drivers as part of an operating system are allowed to execute privileged commands, so they have nearly all rights and hence, again, all responsibility. So in the sense of security, the driver design and implementation must not allow the security integrity of the operating system to be compromised. Although safety and security are different aspects, they interact and must be handled together in system design [4].
In conclusion, HdS is not simply an adaptor between the device and a high-level perspective on it. Besides interacting with complex hardware, it might have to emulate hardware, add missing features, protect the hardware and the system, manage resources and usage, and make everything work in a stable, reliable, safe and secure manner in a complex environment.
All aspects need consideration during software development. So software engineering methods have to be applied as for “normal” software. This book will provide insights into what HdS is, how it interacts with hardware and what engineering methods can be applied.
¹ Low in the sense of abstraction level or in the sense of layered software-architecture.
² Systems that have been driven to a fail-safe state must ignore all commands for normal operation until leaving this state.
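Filtering of commands according to the current system state can be sketched as follows. The state names, the command names and the table ALLOWED are hypothetical; the sketch only shows the principle that a request is checked before it reaches the device, and that a fail-safe state ignores commands for normal operation.

```python
# Hypothetical table: state -> commands that are legal in this state.
ALLOWED = {
    "idle":      {"configure", "start", "reset"},
    "running":   {"stop", "reset"},
    "fail_safe": {"reset"},          # commands for normal operation are ignored
}


class GuardedDevice:
    def __init__(self):
        self.state = "idle"
        self.accepted = []           # commands that would be forwarded to the device

    def request(self, command):
        """Forward the command only if it is legal in the current state."""
        if command not in ALLOWED[self.state]:
            return False
        self.accepted.append(command)
        if command == "start":
            self.state = "running"
        elif command in ("stop", "reset"):
            self.state = "idle"
        return True

    def fault(self):
        # A detected hazard drives the system into the fail-safe state.
        self.state = "fail_safe"
```

A filter according to the requester of the action could be added in the same way, by making the table depend on the requester as well as on the state.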
This chapter contains a brief review of computer architecture as it is assumed in this book. Although today's computer architectures are more sophisticated in order to achieve more performance, the fundamentals discussed here are still valid.
Figure 3.1.: Principle structure of a computer system
A computer system basically consists of four components:
the central processing unit (CPU), executing the software,
the main memory, storing data and programs,
the communication facilities, like bus-systems and
the input/output systems (I/O), the peripheral devices, providing specialised computations and/or connection to the outside world.
All those components interact with each other and influence the design of HdS. Even though they have evolved into highly optimised complex systems, their basic functionality has more or less stayed the same.
The Central Processing Unit (CPU), the processing core, executes the commands of a program. Internal registers hold data that is processed by the internal processing units, like the Arithmetic Logic Unit (ALU), or other specialised units, like floating point units or units for vector or cryptographic operations. The CPU transfers data via bus-systems from the main memory to the internal registers and vice versa (compare for instance [27]). The transfer is always initiated by the CPU; CPU registers cannot be accessed from the outside, not even in multi-master bus-systems. Hence the transfers from or to CPU registers are only initiated by software running on the CPU.
From the programmer's perspective, the execution of the commands is strictly sequential. Data are processed one after the other and the result is stored after the calculation has completed. Modern out-of-order-execution processors optimise the internal schedule of the commands to gain a high utilisation of each processing unit. The sequential semantics still holds, because the results are re-assembled as if the execution had been in-order. This strict sequential order does not hold for the sequence of bus-operations. It might differ between in-order execution and out-of-order execution even if the same software is executed. To avoid out-of-order transfers on the bus, which might violate the desired communication protocol to I/O, explicit commands, so-called memory barriers, control the (partial) order of bus-operations (compare [23] for a detailed discussion of race-conditions).
Figure 3.2.: Core and Cache structure
Today's CPUs consist of multiple cores. Each core is a CPU itself, with its own register set, Instruction Unit, ALU, etc. Each core is capable of executing an independent sequential stream of commands. All cores access the same main memory via a common bus system. As each core is independent but shares the same resources of the computer system, synchronisation on different levels is required to avoid performance loss and race-conditions.
A fast memory close to the CPU registers, the cache, holds copies of main memory data to provide fast access on further read requests. The cache size is limited by the core frequency and the speed of light, so a hierarchy of caches with different speed and size is used. In a multicore architecture, the view of the main memory must be coherent on each core. Cache coherence protocols between the caches synchronise this view. Depending on the cache architecture, the coherence is realised on different levels. However, if data must be read from a main memory location without using a cached value (or must be written to a memory location), the cache must be configured to not cache this location, or an explicit cache control command in software must enforce the transfer. This is often the case for the access to peripheral devices, as peripheral devices can alter the content independently of the CPU. Otherwise data is not really read from the peripheral device; instead only outdated data is read from the cache.
The simple form of a CPU with just one processing core and only one bus-system with multiple attached peripheral devices and main memory is still found in many microcontroller systems.
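The effect of a cached device location can be made visible with a toy model. This is a behavioural sketch only: the class CachedBus, the address 0x4000 and the dictionary standing in for the address space are assumptions for illustration, and the model ignores cache lines, eviction and write handling.

```python
class CachedBus:
    """Toy read cache in front of a flat address space (no eviction, no writes)."""

    def __init__(self, backing, uncached=()):
        self._backing = backing          # address -> value, shared with the device
        self._uncached = set(uncached)   # addresses configured as non-cacheable
        self._cache = {}

    def read(self, address):
        if address in self._uncached:
            return self._backing[address]
        if address not in self._cache:
            self._cache[address] = self._backing[address]
        return self._cache[address]

    def invalidate(self, address):
        # Stands in for an explicit cache control command in software.
        self._cache.pop(address, None)


STATUS = 0x4000                       # hypothetical address of a device register
space = {STATUS: 0}
cached = CachedBus(space)
uncached = CachedBus(space, uncached=[STATUS])

first = cached.read(STATUS)           # fills the cache with the value 0
space[STATUS] = 1                     # the device changes its register on its own
```

After the device has changed the register, a further read through cached still delivers the outdated value, whereas uncached, or cached after an invalidate, delivers the current one.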
The bus-system connects the CPU with the main memory and other devices, the peripheral devices, of the computer architecture. In simple architectures, the bus is line-oriented, connecting each device and memory with the CPU (see Fig. 3.1). In modern computer architectures, multiple bus systems with maybe different technologies span a tree from the CPU toward main memory and the peripheral devices. The root of the tree is connected with a Processor Local Bus (PLB) connecting all cores or caches in a cache-coherent[48] manner (compare Fig. 3.2 and Fig. 3.3).
Figure 3.3.: Tree like structure of bus systems
Often, the communication link towards the main memory branches off from the bus-systems towards the peripheral devices at an early stage of the bus topology, mostly directly at the root node; they form individual trees.
Additional bus-bridges connect the peripheral bus systems of different technology and protocol within the tree. Those bridges are devices on their own, and the leaves of the tree are peripheral devices. A peripheral device can again be the root node of a further network, like for instance the USB host-controller. Examples of that kind of bus topology are the architectures of the embedded PowerPC® processors, for instance the PowerPC440 family[28].
The topology of the bus-system from the cache down to the peripheral devices is hidden from software. The CPU, or rather each core, sees the memory and all devices in a flat address-space. Each peripheral device is embedded in the address-space of the processor as memory-mapped I/O. The CPU is intended to perform transfers between memory and its internal registers. Hence, the root node of the tree understands memory transfers, and consequently each device has to act like memory.
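The flat view can be pictured as a table of address ranges. The following sketch assumes a purely hypothetical memory map (base addresses, sizes and device names are invented) and shows how an address is resolved to a target and an offset, which is what the bus topology does on behalf of the software.

```python
# Hypothetical memory map: (base address, size in bytes, target)
MEMORY_MAP = [
    (0x0000_0000, 0x1000_0000, "main memory"),
    (0x4000_0000, 0x0000_1000, "uart"),
    (0x4000_1000, 0x0000_1000, "timer"),
]


def decode(address):
    """Return (target, offset) for an address, as the bus tree would route it."""
    for base, size, target in MEMORY_MAP:
        if base <= address < base + size:
            return target, address - base
    raise LookupError(f"no device at {address:#010x}")
```

From the software side, an access to the timer differs from an access to main memory only in the address that is used.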
The bus-system performs data transfers with different technology and protocols. Each attached component must understand and follow the protocol. If the interface of a digital component does not comply with the protocol, an adaptor component adapts to the protocol. In case of peripheral devices, the register file adapts the device core to the bus-system, its protocol and the scattered access in time.
Figure 3.4.: Device connected to bus system
So from the software point of view, memory and devices are accessed in the same way. At first glance, everything looks and feels like memory. As we will see later, the interface for peripheral devices, the register file, just looks like memory from the perspective of the bus protocol. The real behaviour in comparison to memory cells can be totally different.
Today's bus-systems are based on serial communication links instead of parallel lines. Examples are PCIe®[49], Hypertransport® [14] and QuickPath Interconnect[33]. Nevertheless, the perspective from the CPU is still an access to a location in a memory address-space. The transfer is performed via a serial protocol instead of a parallel one.
Figure 3.5.: Tree-structure of a PCIe system and mapping to a flat memory perspective
Peripheral devices are digital devices as part of the computer architecture. Attached as leaves of the bus topology, they are linked with the CPU and accessible from the software. Peripheral devices provide additional computational functionality and/or connect the computer with the outside world.
Figure 3.6.: Peripheral device as link between the discrete and the analogue outside world
The outside world is either digital again or analogue. The devices need to interface to the analogue signals in value and time. Analog-to-Digital Converters (ADCs) translate those signals to digital signals and Digital-to-Analog Converters (DACs) back again. Even with those converters, the outside world operates with physical time instead of a discrete time that can be stopped at will. This unstoppable time requires special multi-paradigm debugging techniques, as discussed later in Sec. 7.2.
The inner structure of peripheral devices can be separated into the device core and the Register File (RF). The device core provides, as a processing element implemented in digital hardware, the device functionality. The register file and its attached bus-adaptor interface the digital input-output lines of the device core to the attached bus-system.
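The conversion in the value domain can be sketched with an idealised converter model. The reference voltage of 3.3 V, the resolution of 10 bits and the function names are assumptions for illustration; real converters add offset, gain and timing effects that are not modelled here.

```python
def adc_convert(voltage, v_ref=3.3, bits=10):
    """Map a voltage in [0, v_ref] to an unsigned code of the given width."""
    max_code = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)   # inputs outside the range saturate
    return round(clamped / v_ref * max_code)


def dac_convert(code, v_ref=3.3, bits=10):
    """Map an unsigned code back to the voltage it represents."""
    max_code = (1 << bits) - 1
    return code / max_code * v_ref
```

Converting a voltage to a code and back reproduces the original value only up to the quantisation step, which is the discretisation in value mentioned above.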
Figure 3.7.: Register file as adaptor between bus and device core
The device core realises the major functionality of the peripheral device. It can be substructured into a mesh of interconnected digital components providing the functionality. The real internal structure of the device core is in most cases unknown; from the outside it is a black box. One cannot know whether it is pure hardware with state machines, a complete microcontroller with software (referred to as firmware), or another sophisticated architecture, for instance a Graphics Processing Unit (GPU). Even the interconnection between the components can be of various kinds, like direct digital lines, links with dedicated protocols, or a full internal bus-system.
Figure 3.8.: Mesh of internal components, linked directly or via bus system
However, the device core is a processing element realised with digital hardware executing massively parallel. Combinational logic, realised with logical gates, processes digital encoded values on digital signal lines. Computation speed is limited by the delay of the logical gates and interconnection links of the given technology.
In combination with clocked registers, sequential logic can be built. This ranges from registers that synchronise the results of the logic gates in a Register Transfer Level (RTL) design, via pipeline architectures, up to state machines realising sequential control. They form the control path of a digital circuit. The synchronisation of the registers with a clock signal allows a step-wise execution or control.
The complexity of device circuits can reach the level of processor architectures that execute firmware. Important examples are General Purpose Graphics Processing Units (GPGPUs). The real physical architecture is unimportant; from the perspective of the software executing on the main CPU, those complex devices are still peripheral devices (compare Fig. 3.9) that have a dedicated job. They operate fast, optimised, and nearly independently. The fundamental communication scheme is still a memory access in the address range of the device.
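The interplay of combinational logic and a clocked state register can be modelled in a few lines. The class ClockedFsm, its states and its input names are hypothetical; the sketch only demonstrates that the state changes at a clock tick and not when the inputs change.

```python
class ClockedFsm:
    """Toy sequential circuit: the next state is computed combinationally,
    but only taken over into the state register on a clock tick."""

    TRANSITIONS = {              # hypothetical control path
        ("idle", "start"): "busy",
        ("busy", "done"): "idle",
    }

    def __init__(self):
        self.state = "idle"      # content of the state register
        self.inputs = None       # current values on the input lines

    def next_state(self):
        # Combinational logic: depends only on the current state and inputs.
        return self.TRANSITIONS.get((self.state, self.inputs), self.state)

    def tick(self):
        # Clock edge: the register samples the combinational result.
        self.state = self.next_state()
```

Calling tick once per clock period gives the step-wise execution described above; stopping the calls corresponds to stopping the discrete time of the model.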
Figure 3.9.: GPGPU as device in the CPU perspective (according to [39])
In conclusion, a device is a mesh of encapsulated functional behaviours realised in digital hardware that interact internally fully in parallel. The execution and internal communication are often synchronised by a common clock signal. Internal components communicate with the external world by way of adaptors that transform the signals in the value and time domain. Communication with the computer components is realised via the Register File and its bus-adaptor. How many and which of the device’s components are visible via the Register File depends on the device’s design.
The bus-system transfers chunks of bytes between the CPU and memory locations. The Register File (RF) is an adaptor between the device core and the bus-system the device is attached to (see Fig. 3.10). On the bus-side, the register file must follow the bus-protocol and consequently behaves like memory. On the device core side, the communication paradigm is groups of directly linked digital signals with dedicated encodings and protocols. The adaptors between those two domains are groups of registers which can be read and changed from both sides and hence adapt to either side.
The two domains, bus and device core, behave differently in the time domain. Data are transferred by the bus-system from time to time. Even with a stream of data, only sections of a few bytes change in the register file at once. In each bus cycle, at most as many bits as the data bus is wide can change in the register file. On the device side, each register bit is attached to the device core via a dedicated signal line. Consequently, each register can be read at any time in parallel and can be changed at any time as well. The access is massively parallel and only restricted in time by the device clock¹.
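A consequence of the different access paradigms can be demonstrated with a small model: a register that is wider than the data bus is updated in several bus cycles, while the device core sees all bits at once. The widths (32-bit register, 16-bit bus) and the class Register32 are assumptions chosen for illustration.

```python
class Register32:
    """32-bit register written over a 16-bit bus, read in parallel by the device."""

    def __init__(self):
        self.value = 0

    def bus_write_halfword(self, index, halfword):
        # One bus cycle changes only 16 of the 32 bits.
        shift = 16 * index
        mask = 0xFFFF << shift
        self.value = (self.value & ~mask) | ((halfword & 0xFFFF) << shift)

    def device_view(self):
        # The device core sees all 32 signal lines at the same time.
        return self.value


reg = Register32()
reg.bus_write_halfword(0, 0xFFFF)
reg.bus_write_halfword(1, 0x0000)           # register now holds 0x0000FFFF

seen = []                                   # what the device core observes
for index, halfword in ((0, 0x0000), (1, 0x0001)):   # update to 0x00010000
    reg.bus_write_halfword(index, halfword)
    seen.append(reg.device_view())
```

After the first bus cycle the device core observes 0x00000000, a value that is neither the old content 0x0000FFFF nor the intended 0x00010000. Such intermediate values are one reason why synchronisation between both sides is needed.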
So the register file is a storage for signals from or to the device core. Signal values from the device core are buffered and can be read from the bus side as if they are values in memory cells. Vice versa, values transferred to the register file are stored and applied as logical signals to the device core. As the register file is digital hardware and only has to look like memory by complying with the bus protocol, some registers implement additional behaviour.
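What such additional behaviour can look like is sketched below with a status register that is cleared by reading it. This particular behaviour, the register offsets and the class RegisterFile are assumptions for illustration and not a description of a specific device.

```python
class RegisterFile:
    """Two registers behind a memory-like interface; STATUS clears when read."""

    STATUS, CONTROL = 0x0, 0x4       # hypothetical register offsets

    def __init__(self):
        self._regs = {self.STATUS: 0, self.CONTROL: 0}

    # --- bus side: looks like memory, but does not always behave like it ---
    def bus_read(self, offset):
        value = self._regs[offset]
        if offset == self.STATUS:
            self._regs[offset] = 0    # side effect: reading clears the flags
        return value

    def bus_write(self, offset, value):
        if offset == self.STATUS:
            return                    # read-only from the bus side
        self._regs[offset] = value & 0xFFFFFFFF

    # --- device core side: signals set by the hardware ---------------------
    def core_set_status(self, flags):
        self._regs[self.STATUS] |= flags
```

Two consecutive reads of STATUS return different values although no write took place in between, which a real memory cell would never do. Software that reads such a register must therefore keep the value it has read.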
Thus, the entire communication between the device core and software executing on the CPU is carried out via the register file. The behaviour of this adaptor, including its additional behaviour, has to be taken into consideration when designing Hardware-dependent Software. Sometimes the software designer gets the feeling of operating through a key-hole.
Figure 3.10.: Different access paradigms on each side of the register file
Figure 3.11.: Connection of peripheral bus and device internal bus via bus-bridge
A rare alternative means of communication is to link the bus-system with a device-internal bus-system via a device-internal bus bridge to gain access to an internal shared memory (see Fig. 3.11). The internal memory is mapped into the memory space of the computer system. GPU devices use this technique for fast data transfer between their own main memory and the main memory of the computer system, utilising direct-memory-access (DMA) controllers.
The real internal structure of a device is often unknown. A functional reconstruction is feasible by analysing the device’s manual. Or at least developers derive an equivalent behaviour from the description in the manuals².
Some parts are visible through the interfaces, the register file. The structure or at minimum a protocol of signals for the communication with the device components attached to the register file can be derived. These signals are visible in the register file and hence the protocol for signals can be determined.
Nevertheless, a device is a mesh of functional components. The functionality of each component of the mesh is encapsulated and either directly accessible or only accessible via other components (compare Fig. 3.12). So from the perspective of the HdS, sometimes key-hole operations need to be performed to steer one component via another component, because it is not directly reachable.
Figure 3.12.: Hidden components in the device core without direct connection to the register file
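One common shape of such a key-hole operation is an index/data register pair: the software first selects a hidden register through one bus-visible register and then accesses it through a second one. The sketch below simulates this in C. The device model, the number of hidden registers and all function names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

enum { HIDDEN_REGS = 4 };

/* Simulated device: only INDEX and DATA are reachable from the bus. */
typedef struct {
    uint8_t  index;                 /* bus-visible selector register */
    uint32_t hidden[HIDDEN_REGS];   /* registers of the hidden component */
} device_t;

/* Bus-visible operations (the "key-hole"). */
void write_index(device_t *dev, uint8_t idx)
{
    dev->index = idx;
}

void write_data(device_t *dev, uint32_t value)
{
    if (dev->index < HIDDEN_REGS) {
        dev->hidden[dev->index] = value;
    }
}

uint32_t read_data(const device_t *dev)
{
    return dev->index < HIDDEN_REGS ? dev->hidden[dev->index] : 0u;
}

/* HdS helpers: two bus accesses are needed to steer one hidden register. */
void hidden_write(device_t *dev, uint8_t idx, uint32_t value)
{
    assert(idx < HIDDEN_REGS);
    write_index(dev, idx);
    write_data(dev, value);
}

uint32_t hidden_read(device_t *dev, uint8_t idx)
{
    assert(idx < HIDDEN_REGS);
    write_index(dev, idx);
    return read_data(dev);
}
```

Note that the selector is state inside the device. If a second flow of control changes it between the two bus accesses, the data access reaches the wrong hidden register, so the pair of accesses has to be treated as one indivisible operation.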
Device hardware is normally fixed in its structure. The chip has a fixed structure and hence the implemented resources are fixed. Dynamic growth or resource allocation is typically not possible. There are two exceptions where the hardware structure is not fixed. The first one is hot-plugging, where a whole device can be added to or removed from a bus-system. The computer system is able to expand or shrink its device resources at the granularity of full devices. Modern architectures allow the power supply of individual devices to be controlled. From the perspective of the software, turning off the device power has the same effect as removing the device. The second exception is reconfigurable devices. Those systems are based on programmable logic, such as a modern FPGA architecture, which allows the interconnection and the functionality of logic gates, and hence the functionality provided by the whole chip, to be modified at runtime. Some architectures even allow dedicated areas to be changed at runtime, a partial reconfiguration. Within the limits of the FPGA resources, the hardware device is able to expand, to shrink, or to change behaviour at runtime.
Within the device, different clock domains may exist. Typically, all components are clocked by the same clock line. To save energy, or for other design reasons, components inside the chip can operate with different clock signals, mostly derived from the same master clock. Synchronisation mechanisms on the communication links must ensure data integrity when data crosses the domains.
Within the computer system, each processing element has its own execution pace: the software executing on the CPU and the functionality implemented in the device. Both domains operate asynchronously and the processing domains are only coupled by the register file. Sequentially executing software on the CPU copies data from memory to the register file in chunks, whereas the device core has fully parallel access to the register file content. Additionally, both processing elements operate in clocked time, whereas the outside world operates in physical time.
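A consequence of chunked access on one side and fully parallel access on the other can be sketched with a simulated counter. The assumption here is a hypothetical free-running 64-bit counter that the device core updates as a whole but that the bus exposes as two 32-bit registers; the parameter `step` models how far the core advances during one bus access. None of this is taken from a specific device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated free-running 64-bit counter exposed as two 32-bit registers. */
typedef struct {
    uint64_t count;   /* the device core updates all 64 bits at once */
    uint64_t step;    /* progress of the core during one bus access */
} counter_dev_t;

/* Every bus access takes time, during which the device core keeps running. */
uint32_t read_low(counter_dev_t *dev)
{
    uint32_t v = (uint32_t)dev->count;
    dev->count += dev->step;
    return v;
}

uint32_t read_high(counter_dev_t *dev)
{
    uint32_t v = (uint32_t)(dev->count >> 32);
    dev->count += dev->step;
    return v;
}

/* Naive chunked read: the halves may belong to different counter values. */
uint64_t read_counter_naive(counter_dev_t *dev)
{
    assert(dev != NULL);
    uint32_t lo = read_low(dev);
    uint32_t hi = read_high(dev);
    return ((uint64_t)hi << 32) | lo;
}

/* Consistent read: repeat until the high half is unchanged around the low read. */
uint64_t read_counter_consistent(counter_dev_t *dev)
{
    assert(dev != NULL);
    uint32_t hi, lo, hi_again;
    do {
        hi = read_high(dev);
        lo = read_low(dev);
        hi_again = read_high(dev);
    } while (hi != hi_again);
    return ((uint64_t)hi << 32) | lo;
}
```

If the counter stands at `0xFFFFFFFF` and advances by one per access, the naive read combines the old low half with the new high half and returns a value the counter never held. The high-low-high pattern detects the carry and retries. Some devices instead latch the second half in hardware when the first half is read, which is another instance of a register with additional behaviour.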
Finally, we have to distinguish between the execution paradigms. Hardware executes in a massively parallel fashion and software strictly sequentially. Again, both paradigms meet at the register file.
Figure 3.13.: General interconnection structure of processing elements
A CPU is a set of sequentially executing Processing Elements (PEs), the Cores, connected to the main memory and, via a tree-like connection system, the bus-system, to a set of massively parallel executing processing elements, the peripheral devices (compare Fig. 3.13). The software executed on the Cores controls the data transfers between internal registers and the main memory or the devices’ register files. The devices communicate with their register file fully in parallel.
HdS executes on the Cores and communicates with the associated devices. Both execute fully in parallel. So the key questions for the design of HdS are communication and coordination/synchronisation. The register file in the devices plays an important role here. Furthermore, the influence of the opposed execution paradigms needs consideration.
The next chapter discusses principles of synchronisation and communication via the register file. With these foundations, the lowest levels of the HdS are designed. As the CPU initiates nearly all communication, communication initiated by the device needs extra consideration. With two-way communication and synchronisation in place, the design of HdS can begin.
1 The access frequency is actually limited only by the technology used, but access faster than the device clock does not make sense in RTL designs.
2 Most developers would not say that they create a model; nevertheless, although it is not explicit, something like a behavioural model exists in their minds. Otherwise, they would not be able to design any software for the device.
As indicated in the previous chapter, a computer architecture can be reduced to a system of four components, the CPU, the bus-system, the memory, and peripheral devices (register file and device core). Functionality is executed on the CPU and the device cores. HdS executes as part of the functionality on the CPU. Thus, it is a parallel computing system with two kinds of execution units, processor cores and digital circuits¹.
Crucial aspects of parallel systems are the communication and the synchronisation of the parallel executing components. The opposed execution paradigms as well as the behaviour of the bus-system and the register file, as elements of the communication link, have an impact on communication and synchronisation. So the design of HdS starts with the communication between those parallel components, or more precisely with the register file that adapts each side to the other. Each side has a very different processing architecture, which leads to asymmetric handling of communication and synchronisation. Based on the execution model, the communication model, and the register file behaviour, the synchronisation mechanisms are analysed and the implications for the software design are discussed here.
Figure 4.1.: Functionality determines HW, SW and Communication in Hardware/Software Codesign whereas the HW in combination with Communication determines the HdS-Design and hence Functionality
It is assumed that the peripheral device is given and that its behaviour at the interface cannot be changed. Even with Hardware/Software Codesign methodology, it is still widely established practice that a peripheral device is designed first and then integrated into a computer system or a microcontroller system as a System-on-a-Chip (SoC).
The section on communication starts with an analysis of the nature of the register file. The information exchange at the lowest level will lead to a layered architecture on the software side that handles the encoding and the transfer. General implementation issues will be discussed as well.
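As a first impression of this layering, the following C sketch separates a transfer layer, which only moves whole words, from an encoding layer, which knows the meaning of individual bits. The control register with an enable bit at bit 0 and a 3-bit mode field at bits 4 to 6 is a hypothetical layout chosen for illustration, and the `regs_t` structure stands in for a register that would be reached via the bus on a real system.

```c
#include <assert.h>
#include <stdint.h>

/* Transfer layer: moves whole words and knows nothing about their meaning. */
typedef struct {
    uint32_t ctrl;   /* stand-in for a bus-mapped control register */
} regs_t;

uint32_t xfer_read(const regs_t *r)
{
    return r->ctrl;
}

void xfer_write(regs_t *r, uint32_t word)
{
    r->ctrl = word;
}

/* Encoding layer: knows the (hypothetical) bit layout of the register. */
#define CTRL_ENABLE_MASK 0x00000001u
#define CTRL_MODE_SHIFT  4u
#define CTRL_MODE_MASK   (0x7u << CTRL_MODE_SHIFT)

void ctrl_set_mode(regs_t *r, uint32_t mode)
{
    assert(mode <= 7u);
    uint32_t word = xfer_read(r);                         /* read   */
    word &= ~CTRL_MODE_MASK;                              /* modify */
    word |= (mode << CTRL_MODE_SHIFT) & CTRL_MODE_MASK;
    xfer_write(r, word);                                  /* write  */
}

uint32_t ctrl_get_mode(const regs_t *r)
{
    return (xfer_read(r) & CTRL_MODE_MASK) >> CTRL_MODE_SHIFT;
}

void ctrl_enable(regs_t *r, int on)
{
    uint32_t word = xfer_read(r);
    word = on ? (word | CTRL_ENABLE_MASK) : (word & ~CTRL_ENABLE_MASK);
    xfer_write(r, word);
}
```

Code above the encoding layer calls `ctrl_set_mode(&regs, 5u)` and never sees masks or shifts. The read-modify-write sequence is only safe under the assumption that reading the register has no side effects and that nothing else changes it in between.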
The section on synchronisation starts with an overview of synchronisation principles. With the help of these principles, the somewhat strange behaviour of some register file implementations will be explained. Additionally, modern CPU architectures exhibit strange communication behaviour of their own (out-of-order execution), which has an impact on synchronisation and will be explained as well. As a result, a layered architecture for the HdS is derived. The layers provide a more abstract interface, or rather interfaces with more semantic meaning.
The low-level synchronisation will be extended to the exchange of messages and to functional synchronisation. Here the two concepts of polling and of message handling by Interrupt Service Routines will be discussed, including the communication between the two flows of control.
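The interplay of the two flows of control can be previewed with a small C sketch: an interrupt service routine deposits a message in a one-slot mailbox, and the main flow polls for it with a bounded number of attempts. The mailbox type and the function names are hypothetical, and the routine `device_isr` is an ordinary function here that a real system would invoke through its interrupt mechanism. The sketch also assumes a single core with in-order memory accesses; on out-of-order architectures, `volatile` alone does not guarantee the required ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One-slot mailbox shared between the ISR and the main flow of control. */
typedef struct {
    volatile bool     full;      /* set by the ISR, cleared by the main flow */
    volatile uint32_t message;   /* payload handed over by the ISR */
} mailbox_t;

/* Would be invoked by the interrupt mechanism; called directly in a test. */
void device_isr(mailbox_t *mb, uint32_t data_from_device)
{
    mb->message = data_from_device;
    mb->full = true;             /* publish only after the payload is stored */
}

/* Polling with a bounded number of attempts instead of waiting forever. */
bool poll_message(mailbox_t *mb, unsigned max_attempts, uint32_t *out)
{
    assert(mb != NULL && out != NULL);
    for (unsigned i = 0; i < max_attempts; ++i) {
        if (mb->full) {
            *out = mb->message;
            mb->full = false;    /* hand the slot back to the ISR */
            return true;
        }
    }
    return false;                /* no message within the attempt budget */
}
```

The bounded loop makes the failure case explicit: a device that never answers leads to a `false` return value that the caller has to handle, rather than to a software flow that hangs.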
The communication, low-level synchronisation, and the functional synchronisation are the foundations for the design of HdS. Based on the findings in this chapter, the design of the higher functionality, the counterpart of the device core functionality, is described in the following chapter.
Parallel executing functionality is never totally decoupled and those parallel threads need to communicate. The communication depends on the processing elements they are mapped to and the communication link in-between. HdS is mapped to the CPU and the associated functionality is mapped to the device core of the peripheral device.
The crucial component in the communication between the software on the CPU and the device core is the register file, which interfaces the two execution and communication paradigms on each side. The typical register file design in combination with the connection to a bus-system leads to a layered architecture that transparently hides the communication at the lowest level.
Inside the peripheral device, the register file (de-)couples the systems of opposed execution paradigms, the massively parallel hardware (device core) and the strictly sequential software (HdS). The major objective is the handling of the asynchronous information exchange. The software side of the register file is attached to the peripheral bus system of the computer, while the other side is directly attached to the device core.
Figure 4.2.: The