Course PHYD57. Advanced Computational Methods in Physics

Welcome to the in-person course that will help you become a more knowledgeable and better scientific programmer! You already know basic programming in Python (cf. next section). Now is the time to learn some history and even some physics behind our technical civilization, where computing is helping us everywhere. We will discuss and try out a powerful operating system (Linux) and languages, networking, and the principles of parallel programming that enable you as a scientist to run large-scale simulations of the physical world. We will also discuss some advanced algorithms used in physics. The course includes practical exercises in writing and optimizing programs on a single node of a computational cluster. Although we'll outline the methods, we leave out node-to-node communication and the writing of message-passing programs. You can acquire those skills later as part of a research undergraduate or graduate course.

Brief Intro

2024 Syllabus

TXT file with dates of lectures and exams, topics

Quercus

This page provides the sole access to all the materials, up-to-date SYLLABUS (any changes in schedule will be seen there), texts of assignments, and preliminary grades. The present page will be modified with new contents as the course progresses. Things highlighted in red are newly added or important in the near future.

Quercus will be used for: submission of your term work, announcements via email, your unmoderated discussion, and posting of some old recordings of lectures and tutorials in Media Gallery tab.

Prerequisites

The only crucial prerequisite is programming using basic Python at the level of PHYB57 (earlier PSCB57), Introduction to Computing in Physics.

Assumptions & languages

This course assumes your interest in:
(i) intermediate and advanced numerical methods used in physical sciences;
(ii) HPC (High Performance Computing), including parallelization of algorithms;
(iii) coding in one of the languages more powerful than Python.
In effect, you will learn 1.5 new languages. Why 1.5? You will understand programs (with their arithmetic, iterations/loops, conditional statements, functions/subroutines and their calling, array declaration and usage, pseudorandom numbers, and so on) in both C and Fortran. But you will be required to actively express your ideas (a more demanding task!) in only one of those languages. In other words, we want you to acquire passive knowledge of C(++) AND modern Fortran, and active knowledge of C(++) OR Fortran.
By Fortran I mean modern Fortran: the 90, 95, 2003, 2008, or 2018 (formerly known as 2015) standards; they're the same for our purposes. Avoid books/tutorials on really old, earlier FORTRAN versions (IV, 77), even though Fortran, unlike Python, is virtually backward compatible. If you know Python, Fortran is much easier to learn than C++. It has changed much since its inception, but one thing has remained constant: on general platforms like workstations and laptops, scientific simulations (typically those using multidimensional arrays) have on average always executed somewhat faster when written in Fortran than in any other language, of which thousands have been invented.

C(++) is a few decades younger, and different: it is wordier, has more bells and whistles, and is lower level (we'll discuss what that means). It is better known and thus more marketable outside academia. C is a sine qua non for operating system writers, guardians and hackers. Scientists have a love/hate relationship with C++. Even though they weren't originally meant for each other, they've been together for ages and gave birth to some great implementations and program packages.

Intel and Nvidia/PGI provide both languages in their compiler suites for free.

 

More about this course

Course grading scheme

3 sets of assignments A1-A3      24% (8% ea.)
assignment A4 (group project)    24%
midterm exam                     18%
final exam                       34%
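
As a worked example of how these weights combine into a course mark, here is a minimal Python sketch. Only the weights come from the scheme above; the function name `course_mark` and the sample scores are hypothetical and serve purely as an illustration.

```python
# Weights from the grading scheme above (percent of the course mark).
WEIGHTS = {
    "A1": 8.0, "A2": 8.0, "A3": 8.0,   # three assignment sets, 8% each
    "A4": 24.0,                        # group project
    "midterm": 18.0,
    "final": 34.0,
}

def course_mark(scores):
    """Return the course mark in percent.

    `scores` maps each component name to a score in percent (0-100).
    """
    return sum(WEIGHTS[name] * scores[name] / 100.0 for name in WEIGHTS)

# Hypothetical scores, for illustration only.
example = {"A1": 90, "A2": 80, "A3": 70, "A4": 85, "midterm": 75, "final": 60}
print(f"weights add up to {sum(WEIGHTS.values()):.0f}%")
print(f"course mark: {course_mark(example):.1f}%")
```

With these made-up scores, the assignments contribute 19.2 points, the group project 20.4, the midterm 13.5 and the final exam 20.4.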

Lectures, Tutorials, Office hours, UofT Code of Conduct etc.

Lectures (see syllabus for dates and times) follow the organization outlined in the syllabus. If in-person, you are expected to take your own notes of what's happening on the blackboard, while online you will have the recordings in Media Gallery on Quercus. Informative web pages may sometimes appear during lectures as well; it's good practice to note down keywords or URLs for visiting them later.

Tutorials are NOT meant for the explanation of Lectures, although they sometimes relate to things discussed there. There will be no points for attendance/activity. Negative aspects of not attending and not being active are merciless/automatic enough. We are a small bunch of interested people in this course, whose style I'd place somewhere between undergraduate and graduate.

Office hours: let's discuss things immediately after the lecture and/or tutorial. I'll try to stay until all topics you want to talk about are discussed or until I have to leave. We will see where we can talk; if the classroom is busy, we'll move to one of the rooms on the 5th floor (SW). Other times may be arranged too, depending on my time.

The Code of Behaviour on Academic Matters at UofT should be respected. If you have not read it, you may unknowingly find yourself in trouble. This document succinctly defines conduct that constitutes plagiarism, i.e. misrepresentation of authorship, e.g., pretending that the assigned work is yours and original, while you merely rephrase somebody's code by changing its appearance. It is an equal offense to help somebody plagiarize your own work. The Code mentions how any reasonably grounded suspicions must be addressed by your Instructor, by the Dept. Chair, and so on. Of course group work on assignment 4 is a different matter! Read more here.
In the same vein, using ChatGPT or other "AI" to plagiarize the codes for assignments is a serious offense.

Contact

Email or visit the instructor Prof. Pawel Artymowicz (guide to pronunciation: PAvel ArtyMOvich), in his office SW 506G, 5th floor (Physics & Astrophysics Group) in the old Science Wing. Email: pawel.artymowicz@utoronto.ca, importantly with a subject line "PHYD57", or else it may be accidentally ignored. Send mail from your utor account, not from gmail or facebook (replies are very often lost; it's a problem with the utsc mailer).

Tutorials are given by the lecturer. This course will have a TA/marker: Pejvak Javaheri (his email has the form first.lastname@mail.utoronto.ca).

Course contents

You've done the heavy lifting. You have learned programming in PHYB57 or equivalent, and the basic numerical methods. In this course, you will learn some more algorithms and techniques used in numerical physical sciences. You will discover the world of High Performance Computing (HPC): its history, theory and some practice in high-performance languages.

Practical skills will include basic familiarity with the Linux operating system. That will be done as one of the first things, so you may want to read on the net about Linux before the course starts. You are strongly encouraged but not required to install CentOS or a newer flavor of Linux on your computer (or use macOS, which is not Linux but a related Unix-like system made by Apple Inc.).
You will have access to one node of a cluster housed in the BV building (there is a banner with a photo of it next to the entrance to the DPES office in the EV bldg). It has the CentOS 6 (Linux) operating system and basic software, enough for doing a few exercises on your own. Physical access to the server room is restricted; even I don't have the codes and the key, so if the machine(s) crash because of something (e.g. a power outage), then we are temporarily in trouble. But we won't worry in advance; we'll back up our work on a generally resilient mail and web server called planets. In any case, making the research machine(s) available to you is outside the scope of u/g courses and is an extra benefit given to you by me without explicit or implicit promises. I will argue that you should try to develop, debug, and run codes on your own machine, which today is quite feasible.

On your computer, you will sometimes use your Python3, and will be asked to install compiler(s) for the HPC languages Fortran and/or C(++), and familiarize yourself with how they function. (If you already know both these languages, then it would be great if you could install the Julia compiler and try that new language, created at MIT for purposes consistent with ours, then share your experience.)

In summary, our goals are to help you become a better scientific programmer, able to solve numerically demanding problems on a workstation or a good laptop. While the practice of cluster supercomputing is outside the scope of this undergraduate course, you will hear a lot about it, and even have an option to try it small-scale if you want.

This is a 4th year (upper u/g) course. It explains the issues, tries to interest you in some directions and put you on the right track, but it's your job to move along. This means your own initiative in learning is expected. A great many materials, books and web pages will be suggested to you, and often you will make your own choice of which to use. Start very soon. Happy experimenting with the kind of computing you likely haven't experienced yet.

Assignments

The deadline is at the beginning of the meeting (11 am) on lecture days marked in syllabus. Submission format is explained in the text of assignment 1. Submission late by more than 3 hr earns 50% of the points. No personal exceptions, no group extensions unless announced via email from Quercus. Possible transfer of assignment points in case of emergencies. This policy is motivated mainly by the need to discuss/publish solutions a.s.a.p.

Text of assignment set #1. Due 22 September 11:59pm. Partial solutions.
Text of assignment set #2. Due 4 October. Solutions.
Text of assignment set #3. Additional comments included in red font (re-download!). Due 8 Nov. Solutions of A3.
Assignment set #4. Presentation on 29 November. Group project, from 2 to max. 4 persons per group.

A4 Projects

Download and read! Then ask at least one question after presentation.

1. N-body galaxy project (pdf).   The program in f95

2. Solar Flares (pdf)

3. Smolensk Crash (pdf)

Code page

Continuously updated codes, including some solutions of assignments will be available at:
  http://planets.utsc.utoronto.ca/~pawel/progD57
You will download and run the codes. At first, before you gather experience in compiling and running some of them, you can just take a look at them. You are encouraged to develop them further to do more things, change their output style, etc. That is how programmers learn programming. Nobody but you has the right to do that - please don't share the link to the code page with the world.
Knowledge of structures and methods used in these codes is a course requirement. You must acquire a passive knowledge (i.e., ability to understand) of both basic C and Fortran. You may actively program your assignments in one chosen language, C/C++ or Fortran, unless the assignment explicitly mentions Python.

Volcano on Io. Source code


Books

Please see the guide to literature in the lecture notes section below. Add a secret ingredient at the end of the URL you are viewing to see a web page that is helpful.
Books discussed below should be in UofT library, in electronic form. If not, let me know.
Find books in library catalog. I do not recommend just watching videos on Youtube about programming; most of them are only very remotely relevant to PHYD57.

Our textbooks and materials will be shown in Lecture notes, and will be specific to the current lecture. For instance, Lecture 2 will benefit from reading parts of

N. Schörghofer, "Lessons in Scientific Computing", CRC 2019.
Please focus on subjects not discussed in the introductory course PSCB57, for instance the author's views on different programming languages and how to select the best combination of two complementary ones. If you read a chapter and the content seems really familiar, skip it and begin the next chapter.

For L2 and later lectures, and for the knowledge of Fortran, please browse through:

J. Izaac and J. Wang, "Computational Quantum Mechanics", Springer 2018.
and
P. Turner, T. Arildsen, K. Kavanagh, "Applied Scientific Computing with Python", Springer 2018.
and make sure all the subjects are known to you.

Non-required book for those who want to brush up on Python:

S. Linge, H. Langtangen, "Programming for Computations - Python", Springer 2019 [this 2nd edition uses Python3]

Lecture notes and guides

Programs discussed in lecture notes are available from http://planets.utsc.utoronto.ca/~pawel/progD57

Guide to literature L1-L5, approximately      

Lecture 1       Lecture 2       Lecture 3       Lecture 4
Lecture 5       Lecture 6       Lecture 7       Lecture 8
Lecture 9+10

Two papers recommended for Neural Networks in application to Astronomy,
NNs-spect.pdf, and NN-exoplanets.pdf. Also, look at the NN section of the private, auxiliary page!

Recordings

The course content will of course change somewhat this year. Access is normally blocked (since you will have access via Quercus > Media Gallery): Day 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. The Lecture 1 recording is incomplete; the second (shorter) part is missing. But the missing story of the decryption of German military codes by Polish and then British code-breakers associated with Bletchley Park (British government bureau of codes and ciphers) is told both in the PDF of Lecture 1 notes and in the extensive text and links below (Links > History). We stopped near the end of the lecture notes.

Midterm exam 18 Oct. 2024, starts 11:05

This page: mid-prep-2024.html will help you prepare for this exam.
After the midterm, it's published with solutions here.

Group project (A4)

All the descriptions are now published in assignment 4, well ahead of the due date.

Final exam: now with solutions

The exam (see solutions here) was 34% of course mark: 70% of that for problems, 30% for quiz.

The rest of the information was important before the exam.
You write the final exam online (e.g., from home). This is to allow you not only to create algorithms in C/Fortran, but also to implement them, compile and run, and then describe the results and submit both the source code and descriptions to the Quercus Assignments tab by 17:10, the usual way. The C/Fortran programming will NOT require producing graphics output. Practical knowledge of loop multithreading by OMP and the ability to read or write a simple kernel in CUDA may be required. Writing a code that you cannot successfully debug and run during the exam will earn partial credit, so don't hesitate to submit imperfect programs (although we will not debug them for you :-) ).
Have a computing environment ready ahead of time (editor, compiler) either on your machine or on art-1 (via ssh and sftp, in your phyd57 subdirectory).

In addition, there will be a midterm-style Quiz (with 41 questions, more than the midterm quiz), but this time it will only account for 30% of the final exam score. You will download, edit and submit the quiz solution as a text file. Instead of circling the wrong words, you should enclose them in chevrons like so: << wrong word here >>.

Please also have Python ready for the occasional task that requires plotting of functions, and as an interactive calculator. Your exam may be submitted in separate source code/text description of solutions/text solution of Quiz, and graphics output (screen snapshots by OS, or by your phone).

In case of technical trouble or for clarifying questions, send a text message to my phone. If anything of general interest appears for me to announce to you (like corrections to the text of the exam), then I'll note it in a red font on the web page RIGHT HERE -- therefore, you'll be asked to occasionally refresh/check during the exam, this exam subpage (no solutions).

I can give you these general hints about the final exam:
The exam will consist of 4 problems. One will be to create a code (detailed comments inside the code required!) & run it (include the code output in your solution, upload code & description of solution to Quercus, upload the code to art-1 subdirectory as well, please). It will deal with dynamics of some simple physical system. (This one may take more time than others).
Second problem will be analytical and will be about numerical calculus (differentiation or integration methods). You may take a snapshot of a handwritten solution with the phone and upload to Quercus, or you can send a pdf file from your tablet, even better. There is no need to upload problem 2 solution to art-1.
The third problem is to create an efficient algorithm & implement it in the form of C/Fortran code - a simple one, from the general field of digital signal processing. Put comments about the code inside it, and comments about the algorithmic issues in a separate file uploaded to Quercus (it would be good to have one such file for all problems).
The fourth task will be to debug and improve a code, which will have a few different sorts of errors. The code will be in C and Fortran, compiling and working ok in the phyd57 account on art-1; choose one to debug/improve, and copy it to your subdirectory and/or computer. I'll make the file readable but non-writeable by you, to prevent accidental corruption.

When you modify or create codes, please give the files names that include the problem number and your student number before submitting them to Quercus and/or uploading them to your art-1 subdir. As you know from assignments, infinite updates will be allowed, in case you need them - but indicate in the name that it's an update, for instance by including the approximate time of creation.
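
One possible way to build such file names is sketched below in Python. The pattern (p<problem>_<student number>_<HHMM>.<ext>), the function name `submission_name`, the student number and the date are all hypothetical; this is an illustration, not a format required by the course.

```python
from datetime import datetime

def submission_name(problem, student_number, ext, when=None):
    """Build a file name such as 'p3_1001234567_1432.c'.

    The pattern p<problem>_<student number>_<HHMM>.<ext> is only an
    illustration, not a format prescribed by the course.
    """
    when = when or datetime.now()          # default: time of creation = now
    return f"p{problem}_{student_number}_{when:%H%M}.{ext}"

# Hypothetical student number and time, for illustration only.
print(submission_name(3, "1001234567", "c", datetime(2024, 12, 10, 14, 32)))
```

Including the time in the name means that a later update of the same problem gets a different file name.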

Partial results

Please see the current tally

Links

Advice for Windows users

Please see this advice on linux simulation provided by Daniel.

History

A cartoon from 1980s

http://www.linfo.org/bit.html . Concept and history of bit (b). Please study it, and follow links to other important basic concepts in that short article, which you will be required to understand and define during exams and quizzes, such as: bitrate, bandwidth, data bus, 32- vs. 64-bit systems, ASCII code, pixels, various powers of 10 in computer lingo, and RAM memory. Note that this online encyclopedia is slightly outdated (it dates from 2004); in 2019 "current" hardware capabilities are a bit higher (pun intended).

http://www.linfo.org/byte.html . Concept and history of byte (B). Another bunch of important basic terminology.
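
A few of the concepts listed above (bits, ASCII codes, decimal vs. binary prefixes, bitrate) can be tried out in a short Python session. This is a minimal sketch; the file size and link speed are made-up numbers.

```python
# How many distinct values fit in n bits, and the largest unsigned integer.
for n in (8, 32, 64):
    print(f"{n:2d} bits: {2**n} values, max unsigned = {2**n - 1}")

# ASCII: each character is stored as a small integer (7 bits suffice).
for ch in "Hi!":
    print(ch, ord(ch), format(ord(ch), "07b"))

# Decimal vs. binary prefixes: 1 GB = 10**9 bytes, 1 GiB = 2**30 bytes.
print("1 GB  =", 10**9, "bytes")
print("1 GiB =", 2**30, "bytes")

# Transfer time of a file at a given bitrate (made-up values).
file_bytes = 700 * 10**6          # a 700 MB file
bitrate = 100 * 10**6             # a 100 Mbit/s link
print("transfer time:", file_bytes * 8 / bitrate, "s")
```

Note the factor of 8 in the last line: file sizes are usually quoted in bytes (B), while bitrates are quoted in bits (b) per second.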

https://en.wikipedia.org/wiki/Marian_Rejewski . Are you interested in ENIGMA and its decoding before and during WWII?
Then you have to dig deeper than the related Hollywood movies, which are at times extremely short on real history (better cf. Max Hastings, "The Secret War: Spies, Codes and Guerrillas 1939-1945", London, 2015). ENIGMA machines were used by Hitler's military for coded communications between the headquarters and units in the field. ENIGMA decoding, first achieved in Poland by Marian Rejewski's team, may eventually have preserved millions of lives, by saving England from German invasion, tipping the balance in favor of the Allies, and shortening WWII.

After the pre-war Polish cryptologic breakthroughs, hardware devices called Bomba (cryptographic bomb; or "bombe" in top-secret U.S. Army reports of the time) were built there to implement the decoding algorithms. https://en.wikipedia.org/wiki/Bomba_(cryptography). Copies of ENIGMA machines were also built. Eventually an even faster hacking of the gradually more complex combinations of wheels inside ENIGMAs became necessary.

This was undertaken by the British government using qualitatively the same approach as that invented by the secret Polish Cipher Bureau. An electromechanical version of the Polish Bomba was built in Bletchley Park near London by an outstanding pioneer of computational science, Alan Turing. He also created some statistical theory to aid this work.
The recent biographical movie on A. Turing and the British Enigma-cracking machinery, titled "The Imitation Game", distorts the history, as his team is incorrectly given the sole credit for breaking the ENIGMA code. In fact they were neither the first nor the last to come up with essential breakthroughs. For instance, a half-forgotten team of brilliant engineers of the British Post Office headed by Thomas Flowers, working around the clock for months, constructed the specialized Colossus computer and transferred it to Turing's bureau. That device is shown in the movie, but was neither Turing's invention nor was tasked with deciphering ENIGMA codes! Instead, it dealt with other, non-ENIGMA German codes.

Even more powerful machines were developed during WWII by the American government (Navy), in order to decipher the changed coding procedure used by German submarines. Such devices were the first computers, though not the general-purpose electronic computers we have today. They were mostly electromechanical, that is to say based on relays and automatic switches.

As you see, computing from the beginning was and continues to be very useful. Unfortunately these days it is often used for spying on... you. You do have a smartphone, right? Anyway, enjoy the four ENIGMAtic movies nicely broken down for you on this page.

http://mason.gmu.edu/~montecin/computer-hist-web.htm Computing history

https://cs.uwaterloo.ca/~shallit/Courses/134/history.html Computer science history, with further links

Supercomputers

https://www.top500.org. Supercomputing stats. Explore the site, incl. the most recent list of the fastest supercomps in the world: https://www.top500.org/list/2019/06/ .

• Article about the #1 supercomputer, Fugaku, and where it stands on our way to exaflop computing.

Python

https://wiki.python.org/moin/BeginnersGuide/Programmers Introductions of all sorts to the Python language for beginning coders.
https://docs.python.org/3.0/reference is the Python Language Reference straight from the horse's mouth.
https://docs.python.org/3.0/reference/index.html, Site with docs and resources on Python
https://matplotlib.org/3.1.0/tutorials/introductory/pyplot.html. Tutorial on MatPlotlib and Pyplot
https://docs.scipy.org/doc/numpy/index.html. Tutorial and user guide to NumPy
https://matplotlib.org/3.1.1/tutorials/introductory/pyplot.html. Tutorial on Pyplot (part of Matplotlib based on MATLAB syntax).

Linux and its shells

• You are encouraged but not required to install Linux on your computer, although that is a better option. On art-1 there is an account for you, and you will be shown a password to it; if you later forget it, please ask the prof by email. This system has the CentOS 6 OS available for your exercises. To be able to do anything there, you need to learn basic Linux and, if you're connecting from a Windows system, install an SSH (secure shell) client program such as PuTTY on your machine. ssh is the way to connect securely and work on a remote Linux server, while sftp is a similar client that connects for the purpose of transferring files between two machines on the internet. See more in the references cited on our page of references and literature. For instance, cf. p. 52+ in Membrey's book on CentOS.

https://www.freecodecamp.org/news/the-best-linux-tutorials/. Getting Started with Linux, with links to tutorials.
http://www.linfo.org. Linux info page, including some history pages and explanations of commands. Follow the links.
http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html#toc7 . Programming in bash shell.
https://docs.freebsd.org/44doc/usd/04.csh/paper.html . Programming in csh shell.
https://www.wisdomjobs.com/e-university/shell-scripting-tutorial-174.html . Shell scripting tutorial in different shells.

Languages

http://planets.utsc.utoronto.ca/~pawel/PHYD57/refs/ This is your primary reference. Remember about its confidential, sister page.

Click here. Python vs. Fortran. This discussion on Stackexchange (an often informative resource for coders) sheds some light on what programmers think about the differences between Python and high-level compiled languages - here Fortran (C/C++ is another).

This article proposes that: "It certainly seems likely, in light of the above, that Fortran will remain the fastest option for numerical supercomputing for the foreseeable future—at least if 'fast' refers to the raw speed of compiled code. But there are other reasons for Fortran’s staying power."
I personally prefer Fortran for demanding research and Python and/or another scripting language IDL for small tasks and visualization. C/C++, in which (Unix and) Linux operating systems are written, is very useful too, as it sometimes provides the most direct way to tinker with graphics cards (GPUs) to make them do CUDA. [That is an inside joke for people coming from my part of Europe. CUDA is the extension of C/C++ (and Fortran) that directs data transfer to/from CPU and computation on GPU. Appropriately, the word "cuda" means "miracles" in Slavic languages and in Hungarian (csodak)]. We deal with such miracles in this course. C programs are easily callable from Python (as we will learn), so you can practice multi-language programming and interfacing.
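
To give a flavor of calling compiled C code from Python, here is a minimal sketch that uses the standard `ctypes` module to call `cos` from the C math library. It assumes a Linux or macOS system; the interface taught in the course may well be a different one, so treat this only as one possible illustration.

```python
import ctypes
import ctypes.util

# Locate the C math library. If it cannot be found by name, fall back to
# CDLL(None), which exposes the symbols already loaded into the Python
# process (assumption: Linux or macOS).
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path) if libm_path else ctypes.CDLL(None)

# Declare the C signature: double cos(double).
# Without this, ctypes would assume int arguments and an int return value.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))
```

The same pattern (load a shared library, declare argument and return types, call the function) applies to your own C code once it is compiled into a shared library.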

https://flow.byu.edu/posts/sci-prog-lang provides views similar to my own on the best scientific programming languages (Python+Fortran and C but not C++, Julia in the near future or now, at graduate level). Page by Andrew Ning from Flight, Optimization, and Wind Laboratory.

https://docs.julialang.org/en/v1/. Julia language page

https://towardsdatascience.com/the-serious-downsides-to-the-julia-language-in-1-0-3-e295bc4b4755 . Advantages and a few gripes about Julia language

https://www.fortran.com/the-fortran-company-homepage/fortran-tutorials/. Links to Fortran tutorials & other resources.

https://pages.mtu.edu/~shene/COURSES/cs201/NOTES/fortran.html. Another Fortran tutorial.

https://www.learn-c.org. Interactive C tutorial.

https://beginnersbook.com/2014/01/c-tutorial-for-beginners-with-examples. C Tutorial – Learn C Programming with examples. Also C++.

Just for fun. Semi-serious stuff on computer languages, which teaches you history of programming. Description of a new language called BS. Combines the worst features of all existing languages.

Networking

https://www.youtube.com/playlist?list=PLQVJk9oC5JKp_8F9LPa3Pv67boA80KLm1 . First of 14 videos in a series on networking. See the next ones on YT.

Parallel computing - OpenMP

https://www.openmp.org/wp-content/uploads/omp-hands-on-SC08.pdf OpenMP, "Hands-on intro to OpenMP", a slide show by Intel programmers from SC08, Austin, TX
http://jakascorner.com/blog/2016/04/omp-introduction.html OMP: Introduction
http://jakascorner.com/blog/2016/06/omp-for.html OMP: for loops
http://jakascorner.com/blog/2016/05/omp-for-sections.html OMP: Sections
http://jakascorner.com/blog/2016/06/omp-for-scheduling.html OMP: for loop scheduling (assignment of threads to pieces of work)
http://jakascorner.com/blog/2016/06/omp-for-reduction.html OMP: Reduction
https://www.openmp.org/wp-content/uploads/openmp-4.5.pdf OpenMP, manual by creators, full description
https://www.openmp.org/wp-content/uploads/openmp-4.5.pdf OpenMP, examples in C(++) and Fortran. Lots of code fragments, with some advanced concepts.
https://software.intel.com/en-us/articles/performance-obstacles-for-threading-how-do-they-affect-openmp-code, this page discusses the Performance Obstacles in fork-join parallelism using OpenMP.

Parallel computing - CUDA C and CUDA Fortran

Below you'll find a lot of internet resources. Remember to read materials on our extra page with pdf's (send me email if you forgot how to access it).

CUDA Fortran blog by Greg Ruetsch (author of one of recommended books on extra page.)
CUDA C blog by Mark Harris, NVIDIA
CUDA C presentation - different modes of usage, by Cyril Zeller, NVIDIA.
CUDA C Programming Guide. Detailed. Definitive (By Nvidia!). Read the first 30 pages.
CUFFT library description - useful when you need FFT. When mastered, the world of DSP is yours. Contains code examples.
CUDA nvcc compiler manual
Paper comparing CUDA in Fortran and C with CPU performance, as a function of job size.
Dr. Dobbs blog CUDA, Supercomputing for the Masses. Links to all 12 pieces. A bit old but informative.
CUDA Fortran compiler, PGI ver.18 User Guide
Fortran compiler, PGI ver.18 User Guide
Let's end with a GPU version of a program you already know. To be compiled with PGI compiler pgf90/95 (only that compiler understands CUDA Fortran on art-2): http://planets.utsc.utoronto.ca/~pawel/progD57/tetraDg-3.f95.

Old and new technologies

hpcwire.com is a journal of HPC.
https://www.scientific-computing.com - if you want to dig (a tiny little bit) below the surface level of buzzwords.
https://www.youtube.com/watch?time_continue=414&v=Qm-LLfDPYYQ&feature=emb_logo. The Coming Age of Extreme Heterogeneity ǀ Jeffrey Vetter. ORNL
• Overview of desktop supercomputing, in the past and recently (written in 2021).


last modified: Nov 2024