PHYD57 - Preparation for midterm exam 2020

PHYD57 Advanced Comp Methods in Physics - Midterm preparation materials.

Introduction

Review all the materials posted on the course web page before the exam, especially lecture notes and code page, and for the Linux commands + languages + OpenMP, also the Links section at the end of course page.

You'll find below problems intended to give you an idea af the kind of quiz questions and problems that you will encounted at midterm exam on the 18 Oct Fri (11:05-12:05).

You can (and should, as it helps to remember things later) make facts sheets (maximum 4 pages) of hand-written notes. Calculator (any type) encouraged as technical aid. No other materials allowed.

Quiz

There will be about 30 quiz questions in the actual midterm. Instructions below apply to midterm and final exams. Ignore typos and grammar, look for more significant issues. Sometimes only 1 word may be incorrect, so read carefully.

................................................................................
Note: k-th power of x is denoted as x^k.

If you think the statement is fully correct, or at least Yes seems much more
correct than No as a judgment on the truth of the statement, then circle Y.
If you answer is No then circle N and ALSO CIRCLE at least one word which 
is incorrect. If you do not indicate what is wrong, or circle word(s) which 
are correct, you get zero points for that question. 

If you want to leave extra explanation on why you selected Y or N, do it on the
marging or just above the statement. 

Since an occasional ambiguity may happen in the formulation of the question, 
about 7% points will be given for free while marking the quiz. 
................................................................................
 
[Y N] Motivation for many early computer builders was creation of astronomical 
    tables (Wilhelm Schickard, Gottrfied Leibnitz, Charles Babbage).

[Y N] In 1600s Blaise Pascal built machines that could add and subtract numbers,
   to aid	in tax collection. No multiplication or division was implemented.

[Y N] The first known use of binary system in mechanical calculators was the 
   calculator known as 'steped reckoner' by Alan Turing from the beginning of
   20th century. 

[Y N] Babbages unfinished, hand-cranked, Differential Engine contained 8000 parts
 and weighed 5 tons.  

[Y N] The pre-WWII versions of German coding machine ENIGMA were deciphered and
	reverse-engineered by Polish cryptologists Rejewski, Zygalski and Rozycki.  

[Y N] Since information about Colossus was classified until 1970, and other 
 projects like those of K. Zuse were unknown, for a long time it was thought 
 that the first computers were built by the US military (ENIAC in 1946)

[Y N] ENIAC used binary numbers but its programs had no conditional branching

[Y N] The basic principle of von Neumann architecture of our computers is that CPU,
 the central processign unit has access to operating memory storing both program
 and data, but which is physically separate from the memory. 

[Y N] High-level programming languages and compilers first appeared on mainframe
	computers in the late 1950s. Compiler is a programmer who operates the main
	console of the computer.
 
[Y N] IBM computers in 1950s were mostly leased at high cost to wealthy 
	businesses and universities. In late 1990s, Digital Equipment Corp (DEC) 
	disrupted that model by selling a 100x cheaper minicomputer series PDP-8 to 
	PDP-11. 

[Y N] C and later C++, were created mainly for scientific computations. 
 
[Y N] 8086 was the name of a popular Intel 16-bit processor. Subsequnt generations 
	of Intel processors with similar instruction set have a name retaining part of
	that name. For instance, x86_64 means all modern 64-bit processors compatible
	with Intel's instruction set.

[Y N] In Linux, ls -l | more  is a command that pipes the output of directory
	listing command (long format) to 'more' command which shows the output page
	by page or using other size of block of text.

[Y N] In Linux, 'cp files* path1' command will copy all files whose names start 
	with string 'files' to subdirectory of the working dir. named path1.

[Y N] In order to suspend the execution of a foreground program and put it in 
	a background execution mode in Linux, one needs to press CTLR-Z  to get the 
	prompt, then type 'bg' +RETURN. 

[Y N] Root or superuser in Linux is the user with full priviledges. Other users 
	cannot issue certain dangerous commands, and access files restricted for use
	by the root.  
 
[Y N] ASCII is a table specifying how characters (including Latin alphabet and 
	some text formatting symbols) are mapped to unsigned 1-Byte integer values 

[Y N] ifort -openmp    provides access to OpenMP by a fortran program 

[Y N] OpenMP defines 3 things: compiler options, name of its library  
	and OMP compiler pragmas (directives) 

[Y N] Timing of the sections of code in multithreaded applications written in 
	C or Fortran should be done using double precision function omp_get_wtime() 

[Y N] A loop: for(i=0; i < N; i++) { ...work...} must be preceded by OMP 
	pragma #pragma omp parallel for private(i,...), otherwise the loop counter i
	will be a single variable common to all program threads, which will lead to 
	race condition and overwriting of correct values of i. 

[Y N] OMP allows the programmer to specify a different number of threads 
	to be spawned by each fork-join parallel block section, for instance one 
	such section may use 8 threads and another 16 threads.
 
[Y N] After 2003 or 2004, internal clocks processors stopped having gradually 
higher frequencies; stagnation happened at 1-4 GHz frequency, depending on 
the processor type. 

[Y N] CPU powere usage per unit area of IC chip equals P = N f C V, 
where N is the number of cores on the chip, f their clock frequency, 
C is capacitance of one transistor, and V its working voltage.  

[Y N] Multithreading via OpenMP creates fork-join parallel region encompassing 
a loop construct

[Y N] In multiple, nested loops, OpenMP mutithreading should be applied to 
	the innermost loop

[Y N]  The most powerful supercomputers today have arithmetic speed capability 
of order 1000 PetaFLOPS = 1 ExaFLOPS. 

[Y N] Up to several years ago, every ~2 years, the size of all electronic components
etched onto a silicon wafer was decreased by a factor sqrt(2).  

[Y N] We do not need to learn methods of parallel programming if we use efficient
languages like C, Julia or Fortran.

[Y N] In 1999, 1 out of 500 supercomputers was in China. In 2019, 220 (44%) of 
the top 500 supercomputers in the world were in China.

[Y N] The top supercomputers use 10 to 40 MW of power, and cost ~300 million 
dollars. Just like one big airliner.  

[Y N] If 1 million particles interact with each other pairwise, and calculation 
of one interaction takes 100 ns (nanoseconds), then a well-written sequential 
program will take almost 1.5 hours to evaluate all interactions. 

[Y N] Random walk of 10000 steps of length 1 will on average end at location 
100 units away from the starting point.

[Y N] Compiler lists all the errors a program generates during the compilation, 
while the interpreter (such as Python) lists them only after encountering errors
during execution of the script. We thus know all that can go wrong with a 
compiled program before we run it.

[Y N] Arrays in C are zero-based (start with index 0), while in Fortran they are
1-based.

[Y N] In fortran, loop 0,4,8,...,512 starts like:  do  i = 0,512,4

[Y N]  k=1; if(k<=2) k=2*k+1;   This line compiles correctly in C, C++, or F95 code 

[Y N] Scanning RAM allocated to a 3-dimensional array in C/C++ from start to end, 
the first index of the array changes fastest and the last index most slowly. 


[Y N] Buildings on campus are connected with 1 Gbit/s ethernet network 
(one-directional throughput of data is in practice, let's say, 0.8 Gbit/s). 
To transfer a database of 800 million double precision C or Fortran floating 
point numbers between the buildings, one needs almost 11 minutes.

[Y N] I have read all Lecture notes and all the pages linked to our course page.

Practice Problems

There will be only one programming problem in the midterm. (There may be one short other task, too.)

Algorithms should be implemented in C (or C++) or Fortran95. We are looking for correct algorithms and will overlook what we judge as simple syntax errors. For example, if you forget the precise syntax of calling the Fortran subroutime:
call random_number(tab)
and write some other similar name or mistake it for a function and write tab = random_number(), but explain satisfactorily in comments what the function or routine does , your markes will not be reduced.
[REMEMBER to thoroughly comment your code!!! Short, single-line comments should appear in the program. Number and place long explanations outside the program, only putting reference to them in a comment line. Lack of thorough comments will lead to reduced mark.]

Glaring lack of knowledge of basic language constructs such as loops, arrays, conditional and arithmetic statements, writing code in Python, or only outlining the program structure in some metalanguage, will result in reduction of mark down to about 70% of maximum, but not to zero (it is much better to write a Python code, or an awful/buggy Fortran code, than nothing).

_____________________________________________________________________


Problem 1. 

Write a main program that declares a 4-Byte float (or real) two-dimensional
array Tab holding 6 floats representing x,y,z,vx,vy,vz, for N particles,
where N is a constant equal one million. Fill the array with 
values randomly (uniformly) distributed in the the range 0. < val < 1.,
for every variable var = x,y,...

Particles attract each other pairwise with force 
F_ij_x = -(dx) * (dx*dx + dy*dy + dz*dz)**(-3./2.) 
F_ij_y = -(dy) * (dx*dx + dy*dy + dz*dz)**(-3./2.) 
F_ij_z = -(dz) * (dx*dx + dy*dy + dz*dz)**(-3./2.) 
where position of particle i with respect to particle j is a vector
(dx,dy,dz), and F_ij  vector acts on particle i. All particles have mass
equal m = 1 in code units. Time step is equal dt = 0.001 in code units. 

Show that the above equations represent a Newtonian or Coulomb -1/r^2 force law.

Write a function step() containing a code that performs one time step of the 
simulation of the evolution of the system of N particles. The algorithm can be 
Euler (1st order) or leapfrog (2nd order accuracy in time) but in each case, 
meaning of variables and the step-by-step scheme of the algorithm, should be explained.   

Insert OpenMP pragma/directive to parallelize the execution with K threads, 
where K is integer constant defined at the beginning of the program. 



Problem 2. 

Given a double precision type function fun(x), satisfying
fun(a)*fun(b) <= 0.  for [a,b] = [0.,1.], implement in C or Fortran 
the Newton-Raphson search for the root of fun() in interval [a,b]. In other
words, find x such that fun(x)=0.
 
Print information about every iteration of the method, including an estimate 
of the current value of fun(x) and the estimated error of x (assuming fun(x) 
is linear near the zero). 
Final accuracy of root position should be better than 3e-5 * |a-b|. 
If it can't be achieved, appropriate statement after 30 iterations 
should be printed and the program should finish.


Problem 3. 

N = 120 small model cars whose size can be neglected in this problem are
initially distributed uniformly on a frictionless, straight, air track extending 
from x=0.0 to x=L=12 m. Initial position of particle k is
x_k = k * L/N, k = 1,...,N. 
Initial speed is +-10 cm/s, left or right direction being chosen at random. 
Upon encountering a neighbor, cars bounce off elastically (without dissipation 
of energy, therefore preserving speed). Cars fall off the ends of track.

Simulate the motion of cars until the last one falls off the track. 
Optimize the speed of the program using OpenMP with 12 threads. Print final time. 

Note: There is a simple analytical solution. 

Hint: Random number generation in Fortran. You can fill the whole array 
x or a single value x with random number(s) between 0. and 1. like so:    
 call random_number(x);   v = 20.*(x-0.5).
Array v will hold random values +-10.

Hint: Random number generation in C/C++. 
There is a standard library function (so you must 
 #include  in the header of your code) called 'rand()'.
It returns one integer from 0 to a large number much exceeding the 
number of objects you simulate. In C/C++, you randomize the 
array of initial velocities by creating N remainders of division by 2
in a loop. Expression 
 10 * (2 *(rand()%2) -1) 
has a value +-10 with equal probability. 
 


Problem 4. 

Write an N-body code using leapfrog method & direct summation of all forces.
The force formula is -1.0/distance_ij**2, i.e. attractive 1/r-square force
but the constant in numerator is just 1. Equal masses of particles assumed.   
Staring points are a cloud of random points in 3-D, within radius 1.0, all 
particle are at rest. Random number generator Random() is available, it return 
a double prec. number from 0. to 1. 
N = 1e5. You are supposed to parallelize calculation by OpenMP directives
to perform on a multi-core CPU. Don't worry about the output or checks of accuracy. 
Just the Newtonian dynamics equations and their integration for the first 1000
steps with timestep dt = 0.0001.

It's best if you use exact C or Fortran syntax, but you can also write metacode
which is not so strict, just so we know exactly what is being done. 
Include comments to clarify everything!