Welcome to the in-person course that will help you become a more knowledgeable and better scientific programmer! You already know basic programming in Python (cf. next section). Now is the time to learn some history and even some physics behind our technical civilization, where computing is helping us everywhere. We will discuss and try out a powerful operating system (Linux), programming languages, networking, and the principles of parallel programming that enable you as a scientist to run large-scale simulations of the physical world. We will also discuss some advanced algorithms used in physics. The course includes practical exercises in writing and optimizing programs on a single node of a computational cluster. Although we'll outline the methods, we leave out node-to-node communication and the writing of message-passing programs. You can acquire those skills later as part of a research undergrad or grad course.
Quercus will be used for: submission of your term work, announcements via email, your unmoderated discussion, and posting of some old recordings of lectures and tutorials in the Media Gallery tab.
C(++) is a few decades younger, and different: more wordy, with more bells and whistles, it is lower level (we'll discuss what that means). It is better known and thus more marketable outside academia. C is a sine qua non for operating system writers, guardians and hackers. Scientists have a love/hate relationship with C++. Even though they weren't originally meant for each other, they've been together for ages and gave birth to some great implementations and program packages.
Intel and Nvidia/PGI provide both languages in their compiler suites for free.
Tutorials are NOT meant for the explanation of Lectures, although they sometimes relate to things discussed there. There will be no points for attendance/activity. The negative consequences of not attending and not being active are merciless/automatic enough. We are a small bunch of interested people in this course, whose style I'd place somewhere between undergraduate and graduate.
Office hours: let's discuss things immediately after the lecture and/or tutorial. I'll try to stay until all topics you want to talk about are discussed or until I have to leave. We will see where we can talk; if the classroom is busy, we'll move to one of the rooms on the 5th floor (SW). Other times may be arranged too, depending on my time.
The
Code of Behaviour on Academic Matters at UofT should be respected.
If you have not read it, you may unknowingly find yourself in trouble.
This document succinctly defines conduct that constitutes plagiarism,
i.e. misrepresentation of authorship, e.g., claiming that the assigned work is yours
and original, while you merely rephrased somebody's code by changing its
appearance. It is an equal offense to help somebody plagiarize your own work.
The Code mentions how any reasonably grounded suspicions must be addressed
by your Instructor, by Dept. Chair, and so on. Of course group work on assignment 4
is a different matter! Read
more here.
In the same vein, using ChatGPT or other "AI" to plagiarize
the codes for assignments is a serious offense.
Tutorials are given by the lecturer. This course will have a TA/marker: Pejvak Javaheri (his email has the form first.lastname@mail.utoronto.ca).
Practical skills will include basic familiarity with the Linux operating system.
That will be done as one of the first things - you may want to read on the
net about Linux before the course starts. You are strongly encouraged but not
required to install CentOS or a newer flavor of Linux on your computer
(or use macOS, which is not Linux but a related Unix system developed by Apple Inc.).
You will have access to one node of a cluster housed in BV building (there is
a banner with a photo of it next to the entrance to DPES
office in EV bldg). It runs the CentOS 6 (Linux) operating system and
basic software, enough for doing a few exercises on your own.
Physical access to the server room is restricted; even I don't have the codes
and the key, so if the machine(s) crash because of something (e.g. power outage)
then we are temporarily in trouble. But we won't worry in advance; we'll back up
our work on a generally resilient mail and web server called planets.
In any case, making the research machine(s) available to you
is outside the scope of u/g courses and is an extra benefit to you given by me
without explicit or implicit promises. I will argue that you should try to
develop and debug codes, and run them on your own machine, which today is quite
feasible.
On your computer, you will sometimes use Python 3, and will be asked to install compiler(s) for the HPC languages Fortran and/or C(++), and familiarize yourself with how they function. (If you already know both these languages, then it would be great if you could install the Julia compiler and try that new language, created at MIT for purposes consistent with ours, then share your experience.)
In summary, our goals are to help you become a better scientific programmer, able to solve numerically demanding problems on a workstation or a good laptop. While the practice of cluster supercomputing is outside the scope of this undergraduate course, you will hear a lot about it, and even have an option to try it small-scale if you want.
This is a 4th year (upper u/g) course. It explains the issues, tries to interest you in some directions and put you on the right track, but it's your job to move along. This means your own initiative in learning is expected. A great many materials, books and web pages, will be suggested to you, and often you will make your own choice which of them to use. Start very soon. Happy experimenting with the kind of computing you likely haven't experienced yet.
Text of assignment set #1. Due 22 September 11:59pm.
Partial solutions.
Text of assignment set #2. Due 4 October.
Solutions.
Text of assignment set #3
1. N-body galaxy project (pdf). The program in f95
Volcano on Io.
Source code
Our textbooks and materials will be shown in Lecture notes, and will be specific to the current lecture. For instance, Lecture 2 will benefit from reading parts of
N. Schörghofer, "Lessons in Scientific Computing", CRC 2019.
Please focus on subjects not discussed in the introductory course PSCB57, for
instance the author's views on different programming languages and how to select
the best combination of two complementary ones. If you read a chapter and
the content seems really familiar, skip it and begin the next chapter.
For L2 and later lectures, and for the knowledge of Fortran, please browse through:
J. Izaac and J. Wang, "Computational Quantum Mechanics", Springer 2018.
and
P. Turner, T. Arildsen, K. Kavanagh, "Applied Scientific Computing with
Python", Springer 2018.
and make sure all the subjects are known to you.
Non-required book for those who want to brush up on Python:
S. Linge, H. Langtangen, "Programming for Computations - Python", Springer 2019 [this 2nd edition uses Python3]
Guide to literature L1-L5, approximately
Lecture 1
Lecture 2
Lecture 3
Lecture 4
Lecture 5
Lecture 6
Lecture 7
Lecture 8
Lecture 9+10
Two papers recommended for Neural Networks in application to Astronomy,
NNs-spect.pdf, and
NN-exoplanets.pdf. Also, look at the NN section of the private,
auxiliary page!
The rest of the information was important before the exam.
You write the final exam online (e.g., from home). This is to allow you to not only
create algorithms in C/Fortran, but implement them, compile and run, and then describe
the results and submit both the source code and descriptions to Quercus Assignments
tab by 17:10, the usual way. The C/Ftn programming will NOT require producing graphics
output. Practical knowledge of loop multithreading by OMP and ability to read or
write a simple kernel in CUDA may be required. Writing a code that you cannot
successfully debug and run during the exam will earn partial credit, so don't
hesitate to submit imperfect programs (although we will not debug them for you :-).
Have a computing environment ready ahead of time (editor, compiler) either on
your machine or on art-1 (via ssh and sftp, in your phyd57 subdirectory).
In addition, there will be a midterm-style Quiz (with 41 questions, more than the midterm quiz), but this time it will only account for 30% of the final exam score. You will download, edit and submit the quiz solution as a text file. Instead of circling the wrong words, you should enclose them in chevrons like so, $\lt\lt$ wrong word here $\gt\gt$.
Please also have Python ready for the occasional task that requires plotting of functions, and as an interactive calculator. Your exam may be submitted as separate files: source code, text description of solutions, text solution of the Quiz, and graphics output (screen snapshots by OS, or by your phone).
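As a quick readiness check, here is a minimal sketch of the two uses of Python mentioned above. The helper names `sample` and `plot_function`, the output file name `plot.png`, and the choice of Matplotlib are assumptions made for illustration, not course requirements.

```python
import math

def sample(f, a, b, n=200):
    """Return n evenly spaced x values on [a, b] and the matching f(x) values."""
    xs = [a + (b - a) * i / (n - 1) for i in range(n)]
    return xs, [f(x) for x in xs]

def plot_function(f, a, b, filename="plot.png"):
    """Plot f on [a, b] and save the figure to a file.

    Assumes Matplotlib is installed; it is imported here, not at the top,
    so that the rest of the script still works without it.
    """
    import matplotlib
    matplotlib.use("Agg")            # render to a file, no display needed
    import matplotlib.pyplot as plt
    xs, ys = sample(f, a, b)
    plt.plot(xs, ys)
    plt.xlabel("x")
    plt.ylabel("f(x)")
    plt.savefig(filename)
    plt.close()

if __name__ == "__main__":
    # Python as an interactive calculator
    print(math.sqrt(2.0) * math.exp(-1.0))
```

Calling `plot_function(math.sin, 0.0, 2.0 * math.pi)` should then write the graph of sin(x) to `plot.png`, which can be uploaded as graphics output.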
In case of technical trouble or for clarifying questions, send a text message to my phone. If anything of general interest appears for me to announce to you (like corrections to the text of the exam), then I'll note it in a red font on the web page RIGHT HERE -- therefore, you'll be asked to occasionally refresh/check during the exam, this exam subpage (no solutions).
I can give you these general hints about the final exam:
The exam will consist of 4 problems. One will be to create a code (detailed comments
inside the code required!) & run it (include the code output in your solution, upload
code & description of solution to Quercus, upload the code to art-1 subdirectory as
well, please). It will deal with dynamics of some simple physical system.
(This one may take more time than others).
Second problem will be analytical and will be about numerical calculus
(differentiation or integration methods). You may take a snapshot of a handwritten
solution with the phone and upload to Quercus, or you can send a pdf file from your
tablet, even better. There is no need to upload problem 2 solution to art-1.
Third problem is to create an efficient algorithm & implement it in the form of C/F code -
a simple one, from a general field of digital signal processing. Comments about the code
inside it, and about the algorithmic issues in a separate file uploaded to Quercus
(it would be good to have one such file for all problems).
The fourth task will be to debug and improve a code, which will have
a few different sorts of errors. The code will be in C and Fortran,
compiling and working ok in phyd57 account on art-1; choose one to
debug/improve, copy to your subdirectory and/or computer. I'll make the file
readable but non-writeable by you, to prevent accidental corruption.
When you modify or create codes, please give the files names that include the problem number and your student number before submitting them to Quercus and/or uploading them to your art-1 subdir. As you know from assignments, unlimited updates will be allowed, in case you need them - but indicate in the name that it's an update, for instance by including the approximate time of creation.
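One possible naming scheme is sketched below. The layout (the `prob` prefix, the underscores, and an `upd` suffix with the time as HHMM) is a hypothetical convention chosen for illustration; any name that contains the problem number, your student number and, for updates, the approximate time will do.

```python
from datetime import datetime

def submission_name(problem, student_number, ext, update_time=None):
    """Build a file name such as 'prob3_1001234567.c'.

    The layout is a hypothetical convention, not one prescribed by the course.
    Pass a datetime as update_time to mark the file as an update.
    """
    name = f"prob{problem}_{student_number}"
    if update_time is not None:
        name += "_upd" + update_time.strftime("%H%M")
    return f"{name}.{ext}"

if __name__ == "__main__":
    # The student number below is made up.
    print(submission_name(3, "1001234567", "c"))
    print(submission_name(3, "1001234567", "c", datetime.now()))
```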
• http://www.linfo.org/bit.html . Concept and history of bit (b). Please study it, and follow links to other important basic concepts in that short article, which you will be required to understand and define during exams and quizzes, such as: bitrate, bandwidth, data bus, 32- vs. 64-bit systems, ASCII code, pixels, various powers of 10 in computer lingo, and RAM memory. Note that this online encyclopedia is slightly outdated (from 2004); in 2019 "current" hardware capabilities are a bit higher (pun intended).
• http://www.linfo.org/byte.html . Concept and history of byte (B). Another bunch of important basic terminology.
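A few of the concepts from the two articles above can be checked with simple arithmetic. The sketch below is only an illustration; the function names and the sample numbers (a 1 GB file, a 100 Mbit/s link) are made up.

```python
# Decimal (powers of 10) vs. binary (powers of 2) size prefixes
KB, KIB = 10**3, 2**10      # kilobyte vs. kibibyte
GB, GIB = 10**9, 2**30      # gigabyte vs. gibibyte

def address_space_bytes(bits):
    """Number of distinct byte addresses reachable with `bits`-bit pointers."""
    return 2 ** bits

def transfer_time_s(size_bytes, bitrate_bps):
    """Seconds needed to move size_bytes over a link of bitrate_bps (bits/s)."""
    return size_bytes * 8 / bitrate_bps     # 1 byte (B) = 8 bits (b)

if __name__ == "__main__":
    print(address_space_bytes(32) / GIB)    # 32-bit system: 4.0 GiB addressable
    print(transfer_time_s(1 * GB, 100e6))   # 1 GB over 100 Mbit/s: 80.0 s
    print(GIB / GB)                         # a GiB is about 7% larger than a GB
```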
•
https://en.wikipedia.org/wiki/Marian_Rejewski .
Are you interested in ENIGMA and its decoding before and during WWII?
Then you have to dig deeper than the related Hollywood
movies, which are at times extremely short on real history (better cf.
Max Hastings, "The Secret War: Spies, Codes and Guerrillas 1939-1945",
London, 2015). ENIGMA machines were used by Hitler's military for coded
communications between the headquarters and units in the field.
ENIGMA decoding, first achieved in Poland by Marian Rejewski's team,
may eventually have preserved millions of lives, by saving England from
German invasion, tipping the balance in favor of the Allies, and shortening WWII.
After the pre-war Polish cryptologic breakthroughs, hardware devices called Bomba (cryptographic bomb; or "bombe" in top-secret U.S. Army reports of the time) were built there to implement the decoding algorithms. https://en.wikipedia.org/wiki/Bomba_(cryptography). Copies of ENIGMA machines were also built. Eventually an even faster hacking of the gradually more complex combinations of wheels inside ENIGMAs became necessary.
This was undertaken by the British government using qualitatively the same approach
as invented by the secret Polish Cipher Bureau. An electromechanical version of
the Polish Bomba was built in Bletchley Park near London by an outstanding
pioneer of computational science, Alan Turing. He also created some statistical
theory to aid this work.
The recent biographical movie on
A. Turing and the British Enigma-cracking machinery titled "The Imitation
Game" distorts the history, as his team is incorrectly given the sole credit
for breaking the ENIGMA code. In fact they were neither the first nor the last
to come up with essential breakthroughs. For instance, a half-forgotten team
of brilliant engineers of the British Post Office headed by Thomas Flowers
working around the clock for months constructed the specialized Colossus
computer and transferred it to Turing's bureau. That device is shown in the
movie, but was neither Turing's invention nor was tasked with deciphering
ENIGMA codes! Instead, it dealt with other, non-ENIGMA German codes.
Even more powerful machines were developed during WWII by the American government (Navy), in order to decipher the changed coding procedure used by German submarines. Such devices were the first computers, though not the general-purpose electronic computers we have today. They were mostly electromechanical, that is to say based on relays and automatic switches.
As you see, computing was from the beginning, and continues to be, very useful. Unfortunately these days it is often used for spying on... you. You do have a smartphone, right? Anyway, enjoy the four ENIGMAtic movies nicely broken down for you on this page.
• http://mason.gmu.edu/~montecin/computer-hist-web.htm Computing history
• https://cs.uwaterloo.ca/~shallit/Courses/134/history.html Computer science history, with further links
• Article about the #1 supercomputer, Fugaku, and where we are on our way to exaflop computing.
•
https://wiki.python.org/moin/BeginnersGuide/Programmers Introductions of
all sorts to the Python language for beginning coders.
• https://docs.python.org/3.0/reference is the Python Language Reference straight from the horse's mouth.
• https://docs.python.org/3.0/reference/index.html, Site with docs and resources on Python
•
https://matplotlib.org/3.1.0/tutorials/introductory/pyplot.html.
Tutorial on MatPlotlib and Pyplot
•
https://docs.scipy.org/doc/numpy/index.html. Tutorial and user guide to NumPy
• https://matplotlib.org/3.1.1/tutorials/introductory/pyplot.html.
Tutorial on Pyplot (part of Matplotlib based on MATLAB syntax).
• You are encouraged but not required to install Linux on your computer, although that is the better option. On art-1 there is an account for you, and you will be shown its password; if you later forget it, please ask the prof by email. This system runs the CentOS 6 OS, available for your exercises. To be able to do anything there, you need to learn basic Linux and, if you're connecting from a Windows system, install an SSH (secure shell) client program such as PuTTY on your machine. ssh is the way to connect securely and work on a remote Linux server, while sftp is a similar client that connects for the purpose of transferring files between two machines on the internet. See more in the references cited on our page of references and literature. For instance, cf. p. 52+ in Membrey's book on CentOS.
•
https://www.freecodecamp.org/news/the-best-linux-tutorials/.
Getting Started with Linux, with links to tutorials.
• http://www.linfo.org. Linux info page,
including some history pages and explanations of commands. Follow the links.
•
http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html#toc7 . Programming in bash shell.
•
https://docs.freebsd.org/44doc/usd/04.csh/paper.html . Programming in csh shell.
•
https://www.wisdomjobs.com/e-university/shell-scripting-tutorial-174.html .
Shell scripting tutorial in different shells.
• Click here . Python vs. Fortran. This discussion on Stackexchange (an often informative resource for coders) sheds some light on what programmers think about the differences between Python and high-level compiled languages - here Fortran (C/C++ is another).
This article proposes that:
"It certainly seems likely, in light of the above, that Fortran will remain
the fastest option for numerical supercomputing for the foreseeable future—at
least if 'fast' refers to the raw speed of compiled code. But there are other
reasons for Fortran’s staying power."
I personally prefer Fortran for demanding research and Python and/or another
scripting language IDL for small tasks and visualization.
C/C++, in which (Unix and) Linux operating systems are written, is very useful
too, as it sometimes provides the most direct way to tinker with graphics cards
(GPUs) to make them do CUDA. [That is an inside joke for people coming from
my part of Europe. CUDA is the extension of C/C++ (and Fortran) that
directs data transfer to/from CPU and computation on GPU. Appropriately, the word
"cuda" means "miracles" in Slavic languages and in Hungarian (csodak)]. We deal
with such miracles in this course. C programs are easily callable from Python
(as we will learn), so you can practice multi-language programming and interfacing.
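To give a first taste of such interfacing, here is a minimal sketch that calls the C math library's `cos()` from Python through the standard `ctypes` module. It assumes a POSIX system (Linux or macOS); `ctypes` is only one of several possible routes, and the course may use a different one. The wrapper name `c_cos` is made up.

```python
import ctypes
import ctypes.util
import math

# Locate the C math library (libm). On POSIX systems, if nothing is found
# (None), CDLL falls back to the symbols already loaded into the Python process.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C prototype: double cos(double).
# Without this, ctypes would assume an int return value.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

def c_cos(x):
    """Call the compiled C function cos() from Python."""
    return libm.cos(x)

if __name__ == "__main__":
    # The C result should agree with Python's own math.cos
    print(c_cos(1.0), math.cos(1.0))
```

The same pattern (load a shared library, declare argument and return types, call) applies to your own C code once it is compiled into a shared library.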
• https://flow.byu.edu/posts/sci-prog-lang provides views similar to my own on the best scientific programming languages (Python+Fortran and C but not C++, Julia in the near future or now, at graduate level). Page by Andrew Ning from Flight, Optimization, and Wind Laboratory.
• https://docs.julialang.org/en/v1/. Julia language page
• https://towardsdatascience.com/the-serious-downsides-to-the-julia-language-in-1-0-3-e295bc4b4755 . Advantages and a few gripes about Julia language
• https://www.fortran.com/the-fortran-company-homepage/fortran-tutorials/. Links to Fortran tutorials & other resources.
• https://pages.mtu.edu/~shene/COURSES/cs201/NOTES/fortran.html. Another Fortran tutorial.
• https://www.learn-c.org. Interactive C tutorial.
• https://beginnersbook.com/2014/01/c-tutorial-for-beginners-with-examples. C Tutorial – Learn C Programming with examples. Also C++.
• Just for fun. Semi-serious stuff on computer languages, which teaches you history of programming. Description of a new language called BS. Combines the worst features of all existing languages.
•
CUDA Fortran blog by Greg Ruetsch (author of one of the recommended books on
the extra page).
•
CUDA C blog by Mark Harris, NVIDIA
•
CUDA C presentation - different modes of usage, by Cyril Zeller, NVIDIA.
•
CUDA C Programming Guide. Detailed. Definitive (By Nvidia!). Read the
first 30 pages.
•
CUFFT library description - useful when you need FFT.
When mastered, the world of DSP is yours. Contains code examples.
•
CUDA nvcc compiler manual
•
Paper comparing CUDA in Fortran and C with CPU performance, as a function
of job size.
•
Dr. Dobb's blog: CUDA, Supercomputing for the Masses. Links to all 12 pieces. A bit old
but informative.
•
CUDA Fortran compiler, PGI ver.18 User Guide
•
Fortran compiler, PGI ver.18 User Guide
Let's end with a GPU version of a program you already know.
To be compiled with PGI compiler pgf90/95 (only that compiler understands CUDA Fortran
on art-2):
http://planets.utsc.utoronto.ca/~pawel/progD57/tetraDg-3.f95.