PHYD57. Final exam 12 December 2024 QUIZ. SOLUTIONS (41 pt max.), but 3 are for free (cf. below)
__________________________________________________________________________________
Edit this text file. Leave [Y] for Yes, or [N]. Statements may be tricky, so read carefully.
A single word or number may be incorrect. Any "[N]" answer MUST have at least one wrong word or number highlighted by enclosing them in << >> brackets. Please disregard typos.
Symbol % stands for command-line prompt, Ftn means Fortran, and symbol ^ is raising to a power, e.g. 2^3 = 8.
To make up for unintended ambiguity of the questions, 3 points will be added to your result.
__________________________________________________________________________________
[N] If 1 billion single precision floating point numbers (representing 100M particles) are transferred between the CPU and GPU by a data bus at 10 GB/s, then it takes at least <<2 s>> to transfer the data back and forth, to/from GPU.
[N] Programs in CUDA can be <> totally on GPU, without the help of CPU.
[N] Moore's law states that the <> of integrated-circuit processors doubles in about 2 years.
[Y] A double precision, i.e. 8-Byte, float or real number (in C or Ftn) can represent real numbers with accuracy of 15 to 16 digits after the decimal point.
[N] In Linux, %ls ~/one.ext two.ext <> two.f90 that points to the file one.ext residing in the top directory of the user.
[N] Intel compiler icc <> to compile C/C++ programs that include CUDA C constructs into executable programs.
[Y] OpenMP directive-driven language assumes shared memory access, i.e. all the created program threads must have access to the same operational memory (RAM).
[N] In 2003-2004, the rate of improvement of one-core processor performance slowed down considerably, <> the so-called power barrier no longer allowed exponential increase of <> on an integrated circuit.
[Y] Today, performance of supercomputers in Top 500 lists doubles on a timescale two times longer than that timescale in the period 1995-2005.
[N] In 1942, Atanasoff and Berry computer was built to <>. It also solved linear algebra equations, using thousands of vacuum tubes.
[Y] You sometimes could not log in to art-1 Linux node because your IP address was not listed in the range of IP addresses allowed to use ssh.
[N] Supercomputers use up to dozens of MW of electrical power. About <<1/2>> of it is emitted as heat, and 1/2 is used for computation.
[N] Simpson's 1/3 integration method is 4th order, which means that error of a computed definite integral <> as N^4 (N = number of subintervals).
[N] Direct CPU register content manipulation <> be performed in C++ language.
[N] Scanning the memory allocated to a 3-d array in Ftn from start to end, the <> index of the array changes slowest and the <> index fastest. The opposite is true of C/C++.
[Y] Loading of two float (real) numbers from RAM to CPU can last up to 10^2 times longer than their multiplication. Thus bandwidth to/from RAM often is a limiting factor on program execution speed.
[N] Density of transistors on an integrated circuit has <> 14 MT/mm^2 (14 million transistors per mm^2). Total number of transistors exceeded 140 billion.
[N] In gravitational N-body codes, smoothing length is <> distance between particles, and then NaN (not a number) values after division by such distance.
[Y] Had the current slowdown of performance improvements of processors occurred in 1975 rather than 2005, most phones today would not have graphics display, except for maybe 1-line calculator-style ones, and laptops would not play video.
[N] P ~ N^2 C V f, where P = energy usage per second to change the state of N transistors from representing 1s to representing 0s or vice versa, f = clock frequency of processor, C = capacitance and V = voltage on a transistor.
[N] <> says that parallel performance of n processors cannot exceed the single-processor performance by more than a factor 1/(1-q) where q = fraction of the program which cannot be parallelized.
[N] Operating system <> was created at AT&T Bell Labs on PDP-series minicomputers.
[N] %sftp filename.ext in Linux <> filename.ext in current directory against transfer by users other than the file owner.
[Y] Intel corp. produced a microprocessor called 8086 in late 70's. It was a 16-bit CPU. Later processors had compatible instruction sets, called x86_, for instance x86_64 is a family of 64-bit processors by Intel and AMD.
[N] A six-core Intel CPU on art-1 node typically uses <> and <> speed up the single threaded code by more than a factor 6 or so.
[Y] OpenMP has directives for making variables private, i.e. creating copies of variables that are only known to a given thread of the program.
[N] After an OMP-instrumented loop in C or Ftn ends, a separate directive <> to collapse the fork of multiple threads to a single thread.
[Y] Top supercomputer Frontier at Oak Ridge Ntnl Lab uses AMD processors and exceeds the arithmetic performance of 1 EFLOP (exaflop) both nominally and in tests. This means it does 10^18 double precision floating point operations/s.
[N] Symplectic integrators are preferable to RK4 and higher order Runge-Kutta schemes for the <> reason ARM/RISC processor architecture is winning against CISC architecture (complex instruction set computer): <>.
[N] Besides this, symplectic integrators have <>
[Y] The number of bodies N simulated in a single program on UTSC phi cluster using Intel Xeon Phi or on the whole art cluster, can exceed 1 billion. The largest cosmological simulations in the world (e.g., Bolshoi, Millennium XXL) have used an order of magnitude more particles.
[Y] The simulations of gas flow by UTSC graduate Jeffrey Fung have shown that a planet like proto-Earth, growing in a gas disk, sheds 4 counterrotating vortices.
[N] Calculation by Fung and Artymowicz showed that optically <> gas disks surrounding a single star and subject to its radiation pressure, are unconditionally <> and develop large-scale spiral growing modes.
[N] SPH or Smoothed Particle Hydrodynamics needs a fast neighbor-searching component. One solution to this is offered by the so-called <>.
[Y] Type-III migration was discovered in CFD (Computational Fluid Dynamics) simulations. It is a mode of migration of early planets in gas disks, in which they force the gas to flow at high speed across a severely underdense gap surrounding their orbits.
[N] Aerodynamical forces on a wing of an airliner can be computed using <>
[Y] AI taught us that 'prediction is difficult, especially of the future', as one baseball player (nicknamed Yogi Berra) once said.
[Y] Ftn can use handwritten CUDA kernels (routines, threads), or the compiler can turn loops into kernels by writing them automatically.
[N] In practice, one <> achieve a typical speedup by a factor <<10^2>> from a consumer graphics card (GPU) over the performance of a modern CPU program multithreaded with OMP.
[N] Typical memory sizes and floating point processing speeds of small computers at the end of 1970s were expressed in units starting with the prefix "kilo" or 10^3. Nowadays the prefix is "giga" or 10^9. The increase is million-fold. Thanks to this, computers parse and translate spoken words in real time, and <> increased <>.
[Y] Neural nets are capable of balancing a double and even triple pendulum in inverted position, by shifting the base point sideways.
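
Worked check of two quantitative statements in this quiz (the CPU-GPU transfer time and the order of Simpson's 1/3 method). This is only a minimal Python sketch, not part of the exam. It assumes 4-Byte single precision values, 1 GB = 10^9 Bytes, and a bus with no latency or overhead. The function names (transfer_time_s, simpson, simpson_error_ratio) and the test integrand sin(x) on [0, pi] are hypothetical choices made for illustration.

```python
import math


def transfer_time_s(n_values, bytes_per_value=4,
                    bandwidth_bytes_per_s=10e9, round_trip=True):
    """Lower bound on bus transfer time in seconds.

    Assumptions (not from the exam text): 1 GB = 10^9 Bytes and
    no latency or protocol overhead, so this is an idealized minimum.
    """
    one_way = n_values * bytes_per_value / bandwidth_bytes_per_s
    return 2 * one_way if round_trip else one_way


def simpson(f, a, b, n):
    """Composite Simpson's 1/3 rule on n (even) subintervals of [a, b]."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate weights 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def simpson_error_ratio(n=16):
    """Ratio error(n) / error(2n) for the integral of sin(x) on [0, pi].

    For a 4th-order method the error scales as N^(-4), so doubling N
    should shrink the error by a factor close to 2^4 = 16.
    """
    exact = 2.0  # integral of sin(x) over [0, pi]
    err_n = abs(simpson(math.sin, 0.0, math.pi, n) - exact)
    err_2n = abs(simpson(math.sin, 0.0, math.pi, 2 * n) - exact)
    return err_n / err_2n


if __name__ == "__main__":
    # 10^9 floats * 4 Bytes = 4 GB; at 10 GB/s that is 0.4 s each way.
    print(f"round trip: {transfer_time_s(10**9):.1f} s")
    print(f"Simpson error ratio when N doubles: {simpson_error_ratio(16):.2f}")
```

Under these assumptions the round trip takes 0.8 s, which is consistent with the value 2 s being marked as wrong in the first statement. The error ratio comes out close to 16, i.e. the error of Simpson's method falls off as N^(-4) rather than growing with N.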