Publications (January 2001 - December 2001)
- Kenji Watanabe, Wanming Chu, and Yamin Li, "Exploiting
Java Instruction/Thread Level Parallelism with Horizontal
Multithreading", Australian Computer Science Communications,
Vol.23, No.4, IEEE Computer Society Press, 2001, pp.122-129.
Abstract - Java bytecodes
can be executed with the following three methods: a Java
interpreter running on a particular machine interprets
bytecodes; a Just-In-Time (JIT) compiler translates bytecodes
to the native primitives of the particular machine and
the machine executes the translated codes; and a Java
processor executes bytecodes directly. The first two methods
require no special hardware support for the execution
of Java bytecodes and are widely used currently. The last
method requires an embedded Java processor, picoJavaI
or picoJavaII for instance. The picoJavaI and picoJavaII
are simple pipelined processors with no support for ILP
(instruction level parallelism) or TLP (thread level
parallelism). A so-called MAJC (microprocessor architecture
for Java computing) design can exploit ILP and TLP by
using a modified VLIW (very long instruction word) architecture
and a vertical multithreading technique, but it has its
own instruction set and cannot execute Java bytecodes
directly. In this paper, we investigate a processor architecture
that can execute Java bytecodes directly while exploiting
Java ILP and TLP simultaneously. The proposed
processor consists of multiple slots implementing horizontal
multithreading and multiple functional units shared by
all threads executed in parallel. Our architectural simulation
results show that the Java processor could achieve an
average 20 IPC (instructions per cycle), or 7.33 EIPC
(effective IPC), with 8 slots and a 4-instruction scheduling
window for each slot. We also evaluate other configurations
and report the utilization of functional units as well as
the performance improvement under various kinds of
workloads.
- Yamin Li and Shietung Peng, "Fault-tolerant Routing
and Disjoint Paths in Dual-cube: a New Interconnection Network",
Proceedings of the 2001 International Conference on Parallel
and Distributed Systems (ICPADS'2001), IEEE Computer Society
Press, 2001, pp.315-322.
Abstract - In this paper,
we introduce a new interconnection network, the dual-cube,
its topological properties, and the routing/broadcasting
algorithms in the dual-cube. Advanced topics such
as fault-tolerant routing and constructing multiple disjoint
paths in dual-cubes are also covered in this paper. The
binary hypercube, or r-cube, can connect 2^r
nodes. In contrast, a dual-cube with r links for
each node, F_r, can connect 2^(2r-1) nodes while
keeping most of the topological properties of hypercubes. Fault-tolerant
routing and constructing multiple disjoint paths in dual-cubes
can be solved elegantly using a new structure, called the
extended cube. We show that for any two nonfaulty nodes
s and t in F_r, which contains up to
r-1 faulty nodes, we can find a fault-free path
from s to t, of length at most 3d(s,t),
in O(r) optimal time, where d(s,t) is the
distance between s and t. We also show that,
in a fault-free F_r, r disjoint paths from s
to t, of length at most d(s,t)+6, can be
constructed in O(2r) optimal time.
- Yamin Li and Shietung Peng, "Algorithms of Routing
and Matrix Multiplication on Dualcube", Proceedings of
the Second International Conference on Software Engineering,
Artificial Intelligence, Networking and Parallel/Distributed
Computing (SNPD01), Nagoya Institute of Technology, Japan,
Aug., 2001, pp.422-429.
Abstract - Dualcube
is an interconnection network that has a hypercube-like
structure with the capacity to hold many more nodes than
the conventional hypercube with the same number of links
per node. The motivation for using dualcube as an interconnection
network is to mitigate the problem of the increasing number
of links in large-scale hypercube networks while keeping
most of the topological properties of the hypercube network.
In this paper, we focus on the design of efficient algorithms
for routing and numerical operations on dualcube such
as prefix computation, vector-matrix and matrix-matrix
multiplications. Our results show that the routing and
the basic numerical computations can be done on dualcube
almost as fast as those on hypercube.
- Yamin Li, Shietung Peng, and Wanming Chu, "Efficient
Collective Communications in Dual-cube", Proceedings of
the Thirteenth IASTED International Conference on Parallel
and Distributed Computing and Systems (PDCS-2001), Anaheim,
USA, Aug., 2001, pp.266-271.
Abstract - The binary
hypercube, or n-cube, has been widely used as the interconnection
network in parallel computers. However, the major drawback
of the hypercube is the increase in the number of communication
links for each node with the increase in the total number
of nodes in the system. This paper introduces a new interconnection
network for large-scale parallel computers called dual-cube.
This network mitigates the problem of the increasing number
of links in large-scale hypercube networks while keeping
most of the topological properties of the hypercube network.
The design of efficient routing algorithms for collective
communications is the key issue for any interconnection
network. In this paper, we show that collective communications
can be done efficiently in dual-cube.
- Yamin Li, Shietung Peng, Wanming Chu, and Sanli Li, "Properties
and Performance of Dual-cube Architecture", Proceedings
of The Second International Conference on Computer and
Information Technology (CIT'2001), Journal of Shanghai
University, Vol. 5, Suppl. Sep. 2001, Shanghai University
Press, pp.53-61.
Abstract - The properties
and performance of the dual-cube architecture, an interconnection
network for large-scale parallel computers, are studied.
The dual-cube is derived from the conventional hypercube
in order to mitigate the problem of the increasing number of
communication links in hypercubes for large-scale parallel computers.
We investigate the properties of the dual-cube, including
diameter, bisection width, cost, and average distance,
give algorithms for total exchange communication and arithmetic
applications in the dual-cube, and evaluate the performance
of these algorithms.
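
Note - The dual-cube abstracts above state that, with r links per node, a binary hypercube connects 2^r nodes while a dual-cube F_r connects 2^(2r-1) nodes. The short Python sketch below only tabulates those two formulas to show how quickly the gap grows; the function names are illustrative and do not come from the papers.

```python
def hypercube_nodes(r: int) -> int:
    """Nodes in a binary r-cube, where every node has r links."""
    if r < 1:
        raise ValueError("r must be at least 1")
    return 2 ** r


def dual_cube_nodes(r: int) -> int:
    """Nodes in a dual-cube F_r, where every node has r links."""
    if r < 1:
        raise ValueError("r must be at least 1")
    return 2 ** (2 * r - 1)


if __name__ == "__main__":
    # Compare network sizes for the same number of links per node.
    print("links  hypercube  dual-cube")
    for r in range(1, 9):
        print(f"{r:5d}  {hypercube_nodes(r):9d}  {dual_cube_nodes(r):9d}")
```

For the same r, the dual-cube holds 2^(r-1) times as many nodes as the hypercube, which is the scaling advantage the abstracts describe.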