Abstract
Various multiprocessor systems have been developed to enhance the
performance of computer systems. Because increasing the number of
processors in a multiprocessor system adds communication overhead for
synchronization among processors, system performance may not improve
as the number of processors increases. Many current systems use the
spin lock synchronization scheme because of its simple structure.
However, the scheme needs to be implemented carefully, because the
processors contending for the lock variables keep spinning, which
degrades system performance. Motivated by the use of caches for
efficient synchronization, two schemes were proposed recently. One is
the QOLB (Queue On Lock Bit) scheme, in which a queue of spinning
processors is implemented in the shared memory and caches; the other
is the LBP (Lock Based Protocol) scheme, in which a cache protocol and
a synchronization protocol are combined. Combining cache and
synchronization protocols, however, makes the implementation
difficult. Moreover, modifying the cache protocol would be impossible
when a system is built from microprocessors that implement the cache
protocol inside the chip.
To solve this problem, this dissertation proposes a new scheme for
efficient processor synchronization called QCX (Queueing and Caching
with the exchange primitive based on the spin lock). The main
characteristic of the scheme is that it provides a synchronization
protocol completely separated from the cache protocol. In the scheme,
both the caching of the lock variables and the queueing of the
spinning processors are handled by a simple hardware architecture
composed of a lock cache, a lock address
buffer, a queue buffer, three comparators, and a state machine. The QCX
scheme was implemented in the TICOM III system, a typical shared
memory multiprocessor system developed by ETRI. The hardware for the
scheme was implemented in the Main Processing Unit of the TICOM III
system as an ASIC using 0.8 micron BiCMOS technology.
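For readers unfamiliar with a spin lock built on an exchange primitive,
the basic idea can be sketched in software as follows. This is only a
generic illustration under stated assumptions: it uses C++ atomics
rather than the QCX hardware, and the names SpinLock and
run_counter_demo are hypothetical and do not come from the
dissertation. A processor acquires the lock when the exchange returns
the "unlocked" value; otherwise it keeps spinning, which is the
behaviour whose cost the schemes discussed here try to reduce.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Illustrative spin lock built on an atomic exchange.
// Generic software sketch only; NOT the QCX hardware mechanism.
class SpinLock {
public:
    void lock() {
        // exchange() atomically writes `true` and returns the old value.
        // An old value of `false` means this thread acquired the lock.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // While the lock is held, spin on plain loads so that waiting
            // threads read a cached copy instead of writing repeatedly.
            while (locked_.load(std::memory_order_relaxed)) {
                std::this_thread::yield();
            }
        }
    }

    void unlock() {
        // Release the lock so that one spinning thread can take it.
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical helper: every thread increments a shared counter while
// holding the lock, and the final count is returned.
long run_counter_demo(int num_threads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by the spin lock
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Calling run_counter_demo with several threads returns the product of
the thread count and the iteration count only if the lock provides
mutual exclusion. As contention grows, more threads spin at the same
time, which is the overhead described above.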
A hardware simulation model and a workload model were established to
evaluate the performance of the QCX scheme. In the simulation,
performance was measured while increasing the number of processors up
to 30 under different levels of contention for the lock variables. To
compare the QCX scheme with the QOLB and LBP schemes, the performance
of those two schemes was also measured in the same simulation
environment.
The QCX scheme was evaluated comprehensively in three different
aspects. First, the hardware efficiency of the QCX scheme is the best.
Second, the software utilization is good for all three schemes. Last,
the performance of the QOLB scheme is the best under light contention
and that of the LBP scheme is the best under heavy contention, while
the QCX scheme ranks second in both cases.
Although the QCX scheme ranks second in terms of performance, it is
considered the most efficient one when implementation feasibility and
practicality are taken into account. The LBP scheme is very difficult
to implement, so the performance shown in the simulation cannot
realistically be expected in practice. The QCX scheme, in turn, shows
better performance than the QOLB scheme under heavy contention on
shared data. In conclusion, the QCX scheme is evaluated to be the most
efficient of the three processor synchronization schemes considered
here.