PHYD57. Midterm exam 1 Mar 2022 - Problems. Solutions. Outline a well commented C(++)/Fortran program. Use as many comment lines as you please, but do explain each line that does something. Small mistakes in syntax will not be counted against you. Random photon path Write in C(++) or Fortran program that simulates a random walk in 3-D of a photon inside a star. The photon starts at the origin of (x,y,z) system and each coordinate changes from -1 cm to +1 cm in each of N = 1024**3 (about 1 billion) steps. To provide the random numbers in the [-1,+1] range with a physically plausible probability distribution, a separate fast procedure called Rnd(tab) is available. That is, if tab is a 1-D float/real array of length 3N, then calling the procedure as "Rnd(tab);" in C, or in Fortran "call Rnd(tab)", you will quickly fill the whole array with random numbers. The program needs to be multithreaded via OpenMP (omp) to make simultaneous use of 16 available CPU threads. Make sure that you construct 1 continuous path, not 16 different paths in your code, and that the code is parallel not serial, yet if run with 1 thread only would return the same results. The final trajectory should be contained in 3 float/real arrays of length N, called x, y, and z. SOLUTION OUTLINE There can be different approaches, but in general the problem is called a parallel scan (equivalent to prefix sum, or integration of a function sampled at N points.) Given enough threads, one can try to implement O(ln N) - type algorithms (see wikipedia on prefix sum), but here, with 16 threads, a more reasonable, simple, and very fast algorithm is simply divide-and-conquer algorithm, where each of the 16 threads does N/16 summations. In real life "summation" may not simply be a summation, but some involved and difficult calculation of its own. However, there is one problem here, since the final result must be one continous result, while each of the 16 pieces of integration (they should start from x=0,y=0,z=0) is a disconnected, discontinuous partial random walks. They need to be connected at the end. I see two varians of a program part to reconnect the pieces. One is actually building a new array of continuous trajectory, the other is essentially the same but apply to the whole block of data, saving a few operations. The second method leaves the partial integrations data as they are (that's memory-efficient) and only on demand provides a value of the position after step i, F(i), by doing the following steps. 1. Compute deltas: delta_k = f((k+1)*N/16) -f(k*N/16) = sum of all steps in segment k, k=0...15. 2. For any step number i at which we want to evaluate the position, compute j = i/16 (Fortran) or i//16; (C/C++) (integer division provides the number j labeling the section, j = 0...15). 3. Sum up all deltas up to j-1, and add the result to the f(i). F(i) is ready. Time needed for integration is of order O(N/16), as is the time needed to complete the construction of the whole tracectory, so the cost is of order O(N/8). This is to be compared with the cost O(N) of sequential calculation in this particular problem which is extremely light on computations. But if, say, evaluating each step takes 100 arithmetic operations, then the amount of time is O(N/16*100 + N/16)= O(N/16*101) and the adjustment cost is only 1% of the essential computation. PHYD57. Midterm Exam 2022. QUIZ. Solutions. Edit this text file. Leave only one, Y[es] or N[o], then sign and submit this file. Statements may be tricky, so read carefully. A single word or number may be incorrect. Any "[N]" answer MUST have at least one wrong word or number highlighted by enclosing them in << >> brackets. Please disregard typos. Symbol % stands for command-line prompt, symbol ^ is a power operation, e.g. 3^2 = 9. To make up for unintended ambiguity of the questions, 2 points will be added to your result. Good luck! [Y] Antikythera Mechanism was the first known digital+analog portable computer of times of lunar and solar eclipses, positions of planets, olympiads etc. [Y] IP packets sent over Internet contain detailed information such as the so-called MAC addresses, allowing unique identification of the sending equipment. [N] Numpy is faster than straight Python because it is <> and pre-compiled in other languages [N] icpc is a <> compiler able to traslate C, C++,<< Fortran and Python>> programs [Y] OpenMP language assumes shared memory access, that is all the threads it creates must have access to the same operational memory (RAM). [N] <>, Atanassov and Berry computer was built by <> motion in air. It also solved linear algebra equations. It used thousands of vacuum tubes. [N] Break-in attempts to Linux can be blocked by removing or changing <> file [Y] Simpson's integration method is 4th order [Y] Assembly program is a series of low-level processor instructions, e.g. using registers [N] A 4-Byte integer can represent all numbers from 0 to 2^{32} <> [N] Scanning the momory allocated to a 3-d array in Fortran from start to end, the <> index of the array changes fastest and the <> index most slowly. [N] Option -openmp of Intel compiler makes sure OpenMP is <> [Y/N - was not discussed in detail so far!] In the same number of clock cycles a CPU can perform 30 aritmetic operations or fetch a randomly positioned number from RAM; thus fetching blocks of RAM into cache [N] Density of transistors on an integrated circuit has <> 1 MT/mm^2 (1 million transistors per mm^2). [N] Double semicolon in the C loop: for (i,j,k=0;;) {(...)} <> [Y] IBM computers in 1960s were mostly leased at high cost to wealthy businesses and universities. In the 1970s, Digital Equipment Corp (DEC) disrupted that model by selling a 100x cheaper minis: PDP-7 to PDP-11. [N] <> is a relatively new language, created for parallel computing of a scientific problems. [Y] Operating system Unix was created at AT&T Bell Labs on PDP-series minicomputers [N] Currently, the <> performance of processors grows by a factor of 2 in about 2 years. [Y] In Linux, a wave (tilde ~) means home directory of user; .. means parent of the current directory [N] %hostname in Linux prints the name of host computer; %ssh does secure <> [N] Command %rm /home/user/output*2.dat removes files in /home/user, such that any <> can be in place of "*". [Y] Intel corp. produced a microprocessor called 8086 in late 70's. It was a 16-bit CPU. Later processors had compatible instruction sets, called x86_, for instance x86_64 is a family of 64-bit processors by Intel and AMD.) [N] Debian, Linux, Fedora, Red Hat etc. are << flavors (versions) of Unix OS >> [Y] OpenMP has directives for making variables private, i.e. creating copies of variables that are only known to a given thread of the program. [N] Single precision floating point number is <<8>> Bytes long, and has 6 to 7 accurate decimal places [N] If you parallelize a loop by OpenMP directive, the threads remain alive and usable after the loop ends. A separate directive needs to be issued to collapse the fork of multiple threads to a single thread. [Y] Charles Babbage obtained in early 19th century a grant to build "Differential Engine" to compute astronomical & math tables. The project was mostly successful but it was not a general calculator that could do */+- operations.