Huazhong University of Science and Technology, School of Computer Science: Past PhD Entrance Examination Papers

Published: 2020-07-30 09:34:28. Source: 文档文库

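HUST 200× PhD Entrance Examination: Computer Architecture

I. Multiple Choice (choose the single best answer; 3 points each, 18 points total)

Answers marked in red are provided for reference.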

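1. To measure a computer's transaction-processing performance, the benchmark to use is:

(a) Whetstone; (b) SPECint; (c) TPC-C; (d) SPECfp

2. From an assembly-language programmer's point of view, which of the following is not transparent:

(a) cache; (b) data-path width; (c) virtual memory; (d) pipeline

3. An application needs at least 320 pages of storage to run, with 32 KBytes per page. The computer has 256 MBytes of physical memory, of which 250 MBytes are already occupied by other programs, and a 40-GByte hard disk. The virtual address space that the operating system allocates to this program should be:

(a) 6 MBytes; (b) 10 MBytes; (c) 256 MBytes; (d) 40 GBytes

4. In a multiprocessor server, each processor has its own memory, but all the memories share a single address space. This organization is:

(a) UMA; (b) NUMA; (c) SMP; (d) cluster

5. A computer uses superscalar pipelining with an instruction-level parallelism of 8 (that is, on average 8 instructions complete per clock cycle). Given that the machine has two pipelines, its superpipeline cycle is:

(a) 2 clock cycles; (b) 1 clock cycle; (c) 1/2 clock cycle; (d) 1/4 clock cycle

6. A disk array consists of 8 disks, with its redundancy check information stored on one of the disks. Its RAID level is:

(a) RAID 0; (b) RAID 1; (c) RAID 3; (d) RAID 5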

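II. Short-Answer Questions (10 points)

1. Compared with CISC machines, what is the basic principle by which RISC machines improve performance? In terms of instruction count (increases), CPI (decreases), and clock rate ( ), where does the key to the performance improvement lie? (4 points)

2. What is the difference between a cluster system and a local area network made up of multiple computers? (3 points)

3. What are the similarities and differences between SIMD machines and vector machines? (3 points)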

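III. Computation Problems (49 points)

(1) (8 points) 40% of a program can be parallelized. Multiple CPUs are used to speed up the parallelizable part, and the total run time drops from 120 seconds on a single CPU to 80 seconds. At least how many CPUs are needed?

Solution: Let x be the number of CPUs. Then

120/80 = 1/(0.4/x + 0.6), which gives x = 6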
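The Amdahl's-law equation in this solution can be checked numerically. The following is a minimal sketch; the function name `cpus_needed` and its parameters are illustrative and not part of the exam.

```python
import math

def cpus_needed(parallel_fraction, t_serial, t_target):
    """Smallest CPU count n such that Amdahl's law reaches the target speedup.

    Solves 1 / ((1 - f) + f / n) = t_serial / t_target for n and rounds up.
    """
    speedup = t_serial / t_target
    serial_fraction = 1.0 - parallel_fraction
    remaining = 1.0 / speedup - serial_fraction
    if remaining <= 0:
        raise ValueError("target speedup is unreachable for this parallel fraction")
    n = parallel_fraction / remaining
    # Subtract a tiny tolerance so floating-point noise does not push 6.0 up to 7.
    return math.ceil(n - 1e-9)

n_cpus = cpus_needed(0.4, 120.0, 80.0)
print(n_cpus)
```

With a 40% parallel fraction the speedup can never reach 1/0.6 ≈ 1.67, which is why the function rejects unreachable targets.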

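(2) (12 points) A storage subsystem consists of a SCSI controller and one disk. The SCSI controller sends a request to the disk over the bus and then writes the data to the disk; during this process it does not respond to other requests. Assume the SCSI controller's total command overhead is 1 ms, the disk transfer rate is 40 MBytes/s, and the average rotational latency plus seek time is 6.6 ms. Ignore the effect of the disk's cache. The CPU issues 60 write requests of 16 KB each per second to the storage subsystem, and the I/O requests arriving at the SCSI controller are exponentially distributed.

a. Compute the storage subsystem's service time for a 16 KB read request.

b. What is the utilization of the storage subsystem?

c. Compute the average system response time for an I/O request, including queuing time and service time.

Solution: a: average service time = 1 + 6.6 + 16/40 = 8 ms

b: utilization = arrival rate × average service time = 60 × 0.008 = 48%

c: Time(queue) = 8 × 0.48/(1 - 0.48) = 7.4 ms

average response time = 8 + 7.4 = 15.4 ms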
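The three steps of this solution follow the single-server M/M/1 queuing model. The sketch below reproduces them; the function name `mm1_response` is illustrative, and the transfer time takes 1 MB = 1000 KB, as the solution's 16/40 term does.

```python
def mm1_response(arrival_rate_per_s, service_time_ms):
    """Return (utilization, queuing time in ms, response time in ms) for an M/M/1 queue."""
    utilization = arrival_rate_per_s * service_time_ms / 1000.0
    if utilization >= 1.0:
        raise ValueError("the server is saturated; the queue grows without bound")
    queue_ms = service_time_ms * utilization / (1.0 - utilization)
    return utilization, queue_ms, service_time_ms + queue_ms

overhead_ms = 1.0               # SCSI controller command overhead
seek_plus_rotation_ms = 6.6     # average seek plus rotational latency
transfer_ms = 16.0 / 40.0       # 16 KB at 40 MB/s = 40 KB/ms
service_ms = overhead_ms + seek_plus_rotation_ms + transfer_ms

utilization, queue_ms, response_ms = mm1_response(60, service_ms)
print(round(service_ms, 2), round(utilization, 2), round(queue_ms, 1), round(response_ms, 1))
```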

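(3) (10 points) A machine has a hierarchical storage structure consisting of a cache, main memory, and a disk system. The average access time of main memory is 50 clock cycles, and that of the disk is 1,000,000 clock cycles. With no memory stalls, every instruction normally completes in two clock cycles (CPI = 2). A program running on this machine makes an average of 1.5 memory accesses per instruction; the cache hit rate is 98%, and the miss rate for main-memory reads is 0.01%. The program executes 1,000,000 instructions in total and the clock cycle is 10 ns. Find the program's total run time T, and determine how much of that time is spent on disk I/O.

Solution: T = instruction count × (cycles per instruction + miss rate × memory accesses per instruction × miss penalty) × clock cycle time

Miss penalty at the next level = 50 + 0.0001 × 1,000,000 = 150 clock cycles

Time spent on disk I/O = 1,000,000 × 1.5 × 0.02 × 0.0001 × 1,000,000 × 10 ns = 30 ms

T = 1,000,000 × (2 + 0.02 × 1.5 × 150) × 10 ns = 65 ms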
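The arithmetic of this solution can be replayed as a short script. This is a sketch that uses the figures from the problem statement; the variable names are illustrative.

```python
instructions = 1_000_000
base_cpi = 2.0                 # CPI with no memory stalls
accesses_per_instr = 1.5
cache_miss_rate = 0.02         # 98% cache hit rate
memory_miss_rate = 0.0001      # 0.01% of main-memory reads go to disk
memory_cycles = 50
disk_cycles = 1_000_000
cycle_ns = 10

# Average penalty of a cache miss: a main-memory access plus the occasional disk access.
miss_penalty = memory_cycles + memory_miss_rate * disk_cycles
cpi = base_cpi + accesses_per_instr * cache_miss_rate * miss_penalty

total_ms = instructions * cpi * cycle_ns / 1e6
disk_ms = (instructions * accesses_per_instr * cache_miss_rate
           * memory_miss_rate * disk_cycles * cycle_ns / 1e6)

print(round(miss_penalty), round(cpi, 2), round(total_ms, 1), round(disk_ms, 1))
```

Under these figures, disk I/O accounts for 30 of the 65 ms, a little under half of the run time.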

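(4) (8 points) A DSM machine with 64 identical processors runs a program. Each processor has a CPI of 1 when accessing local memory, and a remote memory access takes 3000 ns. The processor clock cycle is 20 ns. Compared with the case in which 2% of the instructions require a remote access, how much faster is the machine when there are no remote memory accesses at all?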
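The source gives no solution for this problem. One common way to model it, shown here as a sketch rather than the official answer, is to assume that every instruction needing a remote access stalls for the full remote access time on top of the base CPI. The function name `effective_cpi` is illustrative.

```python
def effective_cpi(base_cpi, remote_fraction, remote_ns, cycle_ns):
    """Base CPI plus the average stall cycles contributed by remote accesses.

    Assumption: each remote access stalls the processor for remote_ns / cycle_ns cycles.
    """
    remote_cycles = remote_ns / cycle_ns
    return base_cpi + remote_fraction * remote_cycles

cpi_local_only = 1.0
cpi_with_remote = effective_cpi(cpi_local_only, 0.02, 3000.0, 20.0)
speedup = cpi_with_remote / cpi_local_only
print(cpi_with_remote, speedup)
```

Under this assumption the machine with no remote accesses is about 4 times faster.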

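(5) (6 points) An instruction spends 20 ns, 16 ns, 18 ns, and 12 ns in four functional units, respectively. If pipelining adds 2 ns of overhead, what is the maximum speedup of the pipelined machine over the unpipelined one?

Solution: (20 + 16 + 18 + 12)/(20 + 2) = 3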
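The reasoning behind this ratio is that the pipelined clock is set by the slowest stage plus the added overhead, while the unpipelined machine needs the sum of all stage times per instruction. A minimal sketch with illustrative variable names:

```python
stage_ns = [20, 16, 18, 12]    # time spent in each functional unit
overhead_ns = 2                # overhead added to each pipeline stage

unpipelined_ns = sum(stage_ns)                   # one instruction at a time
pipelined_cycle_ns = max(stage_ns) + overhead_ns  # slowest stage sets the clock
max_speedup = unpipelined_ns / pipelined_cycle_ns
print(unpipelined_ns, pipelined_cycle_ns, max_speedup)
```

This is the upper bound reached only when the pipeline stays full and never stalls.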

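(6) (5 points) A 2000-byte message is sent over a 100 Mbits/s network. The time of flight is 600 μs, and the sender and receiver overheads are both 200 μs. How much time does the receiver spend?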
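The source gives no solution here, and the wording leaves open whether the total latency or only the receiver's share is wanted. The sketch below computes the total latency under the usual model (sender overhead + time of flight + transmission time + receiver overhead); the function name `total_latency_us` is illustrative.

```python
def total_latency_us(message_bytes, bandwidth_mbit_s, flight_us,
                     sender_overhead_us, receiver_overhead_us):
    """Total message latency in microseconds.

    A bandwidth of B Mbit/s moves B bits per microsecond, so the
    transmission time is (bits in the message) / B microseconds.
    """
    transmission_us = message_bytes * 8 / bandwidth_mbit_s
    return sender_overhead_us + flight_us + transmission_us + receiver_overhead_us

latency_us = total_latency_us(2000, 100, 600, 200, 200)
print(latency_us)
```

The transmission time alone is 160 μs, so under this model the message is fully received 1160 μs after the sender starts.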

IV. Analysis Problems (23 points)

(1) (7 points) List the dependences in the following code, and rewrite it as a parallel program.

for (i=1; i<=100; i=i+1) {
    a[i]=b[i]+c[i];   /* S1 */
    b[i]=a[i]+d[i];   /* S2 */
    c[i+1]=a[i]+e[i]; /* S3 */
}
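The source gives no answer for this problem; the following is one possible reading, not the official solution. S2 and S3 both have a true dependence on S1 through a[i]; S2 has an antidependence on S1 through b[i]; and S1 has a loop-carried true dependence on S3 through c[i+1]. S1 and S3 therefore form a recurrence, while S2 can be split into its own loop whose iterations are independent. The sketch checks that this loop distribution preserves the results; the function names and test data are illustrative, and the second loop is run sequentially here only to verify equivalence.

```python
def original_loop(a, b, c, d, e, n):
    a, b, c = a[:], b[:], c[:]          # work on copies
    for i in range(1, n + 1):
        a[i] = b[i] + c[i]              # S1
        b[i] = a[i] + d[i]              # S2
        c[i + 1] = a[i] + e[i]          # S3
    return a, b, c

def distributed_loop(a, b, c, d, e, n):
    a, b, c = a[:], b[:], c[:]
    # Loop 1: S1 and S3 form a recurrence through c, so it stays sequential.
    for i in range(1, n + 1):
        a[i] = b[i] + c[i]
        c[i + 1] = a[i] + e[i]
    # Loop 2: S2 only reads the finished a[i]; its iterations are independent.
    for i in range(1, n + 1):
        b[i] = a[i] + d[i]
    return a, b, c

n = 100
size = n + 2                            # indices 0 .. n+1 are touched
a0 = [0] * size
b0 = [i % 7 for i in range(size)]
c0 = [i % 5 for i in range(size)]
d0 = [i % 3 for i in range(size)]
e0 = [i % 2 for i in range(size)]
same = original_loop(a0, b0, c0, d0, e0, n) == distributed_loop(a0, b0, c0, d0, e0, n)
print(same)
```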

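(2) (10 points) Consider a five-stage pipelined machine with fetch (F), decode (D), execute (X), memory (M), and write-back (W) stages, each taking one clock cycle. The pipeline has no forwarding or bypassing, but a register read and write in the same clock cycle are forwarded through the register file.

The machine runs the following code:

loop: LW   R1, 0(R2)
      ADDI R1, R1, #1
      SW   R1, 0(R2)
      ADDI R2, R2, #5
      SUB  R4, R3, R2
      BNZ  R4, loop

Assume that the initial value of R3 is R2 + 100 and that all memory accesses hit.

Draw the pipeline timing (space-time) diagram for this program, and compute how many cycles the program takes to execute.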

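(3) (6 points) A program has 6 pages in total, but only 3 pages of physical memory are allocated to it. The program references its pages in the following order:

P1,P2,P3,P4,P2,P3,P5,P5,P4,P6,P3,P3,P2,P1,P5

Draw the replacement tables produced by the LRU and LFU replacement algorithms, and give the hit rate in each case.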
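The source gives no answer for this problem. The LRU half can be checked with a small simulation, sketched below with an illustrative function name `lru_hits`. LFU is left out because its result depends on a tie-breaking rule that the question does not specify.

```python
from collections import OrderedDict

def lru_hits(references, frames):
    """Count hits for LRU replacement with the given number of page frames."""
    resident = OrderedDict()            # keys ordered from least to most recently used
    hits = 0
    for page in references:
        if page in resident:
            hits += 1
            resident.move_to_end(page)  # mark as most recently used
        else:
            if len(resident) >= frames:
                resident.popitem(last=False)   # evict the least recently used page
            resident[page] = True
    return hits

refs = [1, 2, 3, 4, 2, 3, 5, 5, 4, 6, 3, 3, 2, 1, 5]   # P1, P2, ... from the problem
hits = lru_hits(refs, 3)
hit_rate = hits / len(refs)
print(hits, len(refs), round(hit_rate, 3))
```

With 3 frames, LRU hits on 4 of the 15 references, a hit rate of about 26.7%.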

Exam on Computer Architecture 2003

Part I Multiple Choice (24 points, 3 points each)

1. Compared with CISC technology, RISC

(a) has more instructions (b) has more transistors on chip

(c) has a lower CPI (d) has fewer registers

2. A machine using superscalar-superpipeline technology completes 6 instructions per clock and has two pipelines. What is the delay of each superpipeline stage?

(a) 1/4 clock (b) 1/3 clock (c) 1/2 clock (d) 1 clock

3. To reduce the miss rate, which of the following methods is the most effective?

(a) multilevel cache (b) larger cache size

(c) giving priority to read misses over writes (d) small and simple caches

4. A machine can perform the operation (a1, a2, …, an) + (b1, b2, …, bn) simultaneously. What kind of machine is it?

(a) SMP (b) MPP (c) Cluster (d) SIMD

5. Which application is best suited to RAID 3?

(a) E-mail server (b) OLTP (c) multimedia (d) WWW

6. Sending a 5000-byte message on a 1 Gbit/s network takes 440 μs of total latency. The receiver overhead and the sender overhead are both 150 μs. What is the time of flight?

(a) 100 μs (b) 135 μs (c) 140 μs (d) 150 μs

7. A computer system needs 99.99% availability, and its MTTF is 500,000 hours. About how much downtime is allowed?

(a) 5 hours (b) 10 hours (c) 50 hours (d) 100 hours

8. Which benchmark is suitable for OLTP?

(a) SPEC CPU2000 (b) SPECSFS (c) TPC-C (d) TPC-W
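Question 7 in this part rests on the relation availability = MTTF/(MTTF + MTTR). The sketch below solves it for the allowed repair time; the function name `allowed_mttr_hours` is illustrative, and reading "downtime" as the mean time to repair per failure is an assumption.

```python
def allowed_mttr_hours(availability, mttf_hours):
    """Largest mean time to repair that still meets the availability target.

    Derived from availability = MTTF / (MTTF + MTTR).
    """
    return mttf_hours * (1.0 - availability) / availability

mttr_hours = allowed_mttr_hours(0.9999, 500_000)
print(round(mttr_hours, 2))
```

The result is roughly 50 hours, which corresponds to option (c).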

Part II Computation and Analysis

1. An old CPU is replaced with a new CPU, raising the clock rate from 1.2 GHz to 1.6 GHz, and the old disk is replaced with a new disk that doubles the I/O speed. A program takes 10 seconds of CPU time on the old CPU and 6 seconds of I/O time on the old disk. If the program takes 8 seconds on the new machine and the CPI of the old CPU is 1.5,

(1) what is the MIPS rating of the old CPU?

(2) what are the CPI and MIPS rating of the new CPU?

Solution: MIPS = clock rate/(CPI × 10^6) = 1.2 × 10^9/(1.5 × 10^6) = 800

CPU time = (IC × CPI)/clock rate

Old CPU: 10 = (IC × 1.5)/(1.2 × 10^9), so IC = 8 × 10^9

New I/O time = 6/2 = 3 s, so new CPU time = 8 - 3 = 5 s

New CPU: 5 = (IC × CPI)/(1.6 × 10^9), so CPI = 1

MIPS = 1.6 × 10^9/(1 × 10^6) = 1600
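The chain of steps in this solution can be replayed as a script. This is a sketch that assumes CPU time and I/O time do not overlap and that the instruction count is the same on both CPUs; the variable names are illustrative.

```python
old_clock_hz = 1.2e9
new_clock_hz = 1.6e9
old_cpi = 1.5
old_cpu_s = 10.0
old_io_s = 6.0
new_total_s = 8.0

old_mips = old_clock_hz / (old_cpi * 1e6)

# CPU time = IC * CPI / clock rate, solved for the instruction count IC.
instruction_count = old_cpu_s * old_clock_hz / old_cpi

new_io_s = old_io_s / 2.0               # the new disk doubles the I/O speed
new_cpu_s = new_total_s - new_io_s      # assumes CPU and I/O time do not overlap
new_cpi = new_cpu_s * new_clock_hz / instruction_count
new_mips = new_clock_hz / (new_cpi * 1e6)

print(round(old_mips), round(new_cpi, 3), round(new_mips))
```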

2. A two-way set-associative cache has 8 blocks (block 0 to block 7), and each block holds 16 words. The memory has 32 blocks (block 0 to block 31). To which cache block can the data stored at memory address 122h be mapped?

16 × 4 = 64

122 = 64 + 58

So 122h maps to block 1.

3. This exercise uses the classic RISC five-stage integer pipeline (F, D, X, M, W) and assumes that all memory accesses take 1 clock. Use the following code fragment:

Loop: LD    R1, 0(R2)    ; load R1 from address 0+R2
      LD    R3, 0(R4)    ; load R3 from address 0+R4
      DSUB  R1, R3, R1   ; R1=R3-R1
      SD    0(R2), R1    ; store R1 at address 0+R2
      DADDI R2, R2, #6   ; R2=R2+6
      DSUB  R5, R4, R2   ; R5=R4-R2
      BNEZ  R5, Loop     ; branch to Loop if R5 is not 0

Assume that the initial value of R4 is R2+114.

Show the timing of this instruction sequence for the RISC pipeline without any forwarding or bypassing hardware, but assuming that a register read and a write in the same clock cycle "forward" through the register file, as in Figure A.6. Assume that the branch is handled by flushing the pipeline. How many cycles does this loop take to execute?

Solution: the loop runs 114/6 = 19 iterations at 18 clocks each, so 19 × 18 = 342 clocks

4. Suppose a processor sends 80 disk I/O requests per second, and these requests are exponentially distributed. The average seek time of the disk is 6.6 ms and the disk rotates at 10,000 RPM. On average, each I/O takes 0.3 ms to transfer data to or from the disk media. The controller overhead is 0.1 ms. Answer the following questions:

a) What is the average time to service an I/O request?

b) On average, how utilized is the disk?

c) What is the average length of the queue?

d) What is the average response time for a disk request, including the queuing time and disk service time?

Solution: a: average rotational latency = 0.5/(10000/60) s = 3 ms

average service time = 6.6 + 3 + 0.3 + 0.1 = 10 ms

b: disk utilization = 10 ms × 80/s = 0.8

c: queue length = 0.8 × 0.8/0.2 = 3.2

(equivalently, arrival rate × queuing time = 80/s × 10 ms × 0.8/0.2 = 3.2)

d: average response time = 10 + 10 × 0.8/0.2 = 50 ms
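Two steps of this solution are worth spelling out: the rotational latency is half a revolution at the given RPM, and the queue length follows from Little's law (arrival rate × queuing time). A sketch with illustrative variable names:

```python
rpm = 10_000
arrival_rate_per_s = 80

# Half a revolution on average: one revolution takes 60,000 ms / rpm.
rotation_ms = 0.5 * 60_000 / rpm
service_ms = 6.6 + rotation_ms + 0.3 + 0.1    # seek + rotation + transfer + controller

utilization = arrival_rate_per_s * service_ms / 1000.0
wait_ms = service_ms * utilization / (1.0 - utilization)   # M/M/1 time in queue
queue_length = arrival_rate_per_s * wait_ms / 1000.0       # Little's law
response_ms = service_ms + wait_ms

print(round(rotation_ms, 2), round(service_ms, 2), round(utilization, 2),
      round(queue_length, 2), round(response_ms, 1))
```

At 80% utilization a request waits four times as long as it takes to serve, which is why response time grows so quickly as the disk approaches saturation.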

5. The fraction of the original computation that can be sped up by parallel processing is 90%. To achieve a speedup of 5, how many processors are needed?

Solution: 1/(0.9/x + 0.1) = 5, so x = 9

Source: https://www.2haoxitong.net/k/doc/34801470ce1755270722192e453610661ed95aa5.html