PHYD57 Advanced Computational Methods in Physics

Volcano on Io. Source code

Pandemic adaptations

We will meet on Zoom. Your presence online is required (the course is synchronous online until further notice); the meeting times are unchanged. The final exam will also be online: it will be a presentation of your results from a take-home programming project announced 4 weeks earlier.
Welcome to the course that will help you become a more knowledgeable and better scientific programmer!

This page (and not any Quercus or old files scattered around the internet or UTSC servers) provides access to all the materials, including the up-to-date syllabus, assignments, and some preliminary grades. Quercus will be used for announcements via email, submission of your term work and exams, and for posting the recordings of lectures and tutorials (in the Media Gallery tab).

This page will be modified with new contents as the course progresses. Things highlighted in red are newly added or otherwise important.


TXT file with dates of lectures and exams, topics

Course prerequisites

One requirement is programming in basic Python, at the level of PSCB57 Introduction to Scientific Computing. That course had a prerequisite: Calculus II for Mathematical or Physical Sciences [MATA36H3 or MATA37H3]. This implies that you've also had Calculus I. That's all. The rest you'll learn during the course and, by the way, don't expect to be a certified scientific programming genius after it. The field is vast.

Course requirements and languages

This course assumes your interest in: (i) selected intermediate and advanced numerical methods used in physical sciences, (ii) HPC (High Performance Computing) including parallelization of algorithms, as well as (iii) becoming a multilingual programmer (adding 1.5 powerful languages to your toolbox). Why 1.5? Every student who finishes PHYD57 will understand uncomplicated programs (with their arithmetic, loops, conditional statements, functions/subroutines and their calling, array declaration and usage, pseudorandom numbers, and so on) in both C and Fortran. You will be required to actively use (a different and more demanding task!) one of those languages. In other words, we want you to acquire a passive knowledge of C(++) AND modern Fortran, and an active knowledge of C(++) OR Fortran. By Fortran I mean the Fortran 90, 95, 2003, 2008, or 2015 definitions; they're the same for our purposes. Avoid books/tutorials on older versions. They were substantially different, even though new compilers tolerate 99% of programs written in Fortran 77. Starting from your familiarity with Python, Fortran is easier to learn. In addition, many scientific programs (typically using multi-dimensional arrays) execute faster when written in Fortran than in C. On the other hand, C(++) is more marketable outside academia. Intel and Nvidia/PGI provide either language in their compiler suites for free to students.


Code page

Continuously updated codes, including some solutions of assignments, will be available:
You will download and run the codes. At first, before you have much experience in compiling and running some of them, you can just take a look at them. You are encouraged to develop them further to do more things, change their output style, etc. That is how programmers learn programming. Nobody but you has the right to do that - please don't share the link to the code page with the world.
Knowledge of structures and methods used in these codes is a course requirement. You must acquire a passive knowledge (i.e., ability to understand) of both basic C and Fortran. You may actively program your assignments in one chosen language, C/C++ or Fortran, unless the assignment explicitly mentions Python.

About this course

Lectures, Tutorials, Office hours, UofT Code of Conduct etc.

Lectures (see syllabus for dates and times) follow the organization outlined in the syllabus. If in person, you are expected to take your own notes of what's happening on the blackboard, while online you will have the recordings in the Media Gallery on Quercus. Informative web pages may sometimes appear during lectures as well; it's good practice to note down keywords or URLs for visiting them later.

Tutorials are NOT meant for the explanation of Lectures, although they sometimes relate to things discussed there. There will be no points for attendance/activity. The negative consequences of not attending and not being active are merciless/automatic enough. We are a small bunch of interested people in this course, whose style I'd place somewhere between undergraduate and graduate.

Office hours: let's discuss things immediately after the lecture and/or tutorial, I'll try to stay until all topics you want to talk about are discussed or until I really have to go. Other times may be arranged too, depending on my time (I'm teaching one other, larger, astrophysics course.)

The Code of Behaviour on Academic Matters at UofT should be respected. If you have not read it, you may unknowingly find yourself in trouble. This document succinctly defines conduct that constitutes plagiarism, i.e. misrepresentation of authorship, e.g., claiming that the assigned work is yours and original, while you merely rephrased somebody's code by changing its appearance. It is an equal offense to help somebody plagiarize your own work. The Code describes how any reasonably grounded suspicions must be addressed by your Instructor, by the Dept. Chair, and so on. Of course group work on an assignment (if explicitly asked for) is a different matter! Read more here.

More about the course

You've done the heavy lifting. You have learned programming in PSCB57 Intro to Scientific Computing. Moreover, you've learned some basic numerical methods. In this course, you will learn more, slightly more advanced, algorithms and techniques used in numerical physical sciences. You will discover the world of HPC: its history, theory and some practice of High Performance Computing in high-performance languages.

Practical skills that you will develop include basic familiarity with the Linux operating system. That will be done as one of the first things - you may want to read on the net about Linux before the course starts. You are strongly encouraged but not required to install CentOS or another flavor of Linux on your computer (or use macOS, which is a Unix-like system developed by Apple Inc.).
You will have access to one node of a cluster housed in the BV building (there is a banner with a photo of it next to the entrance to the DPES administration office in the EV bldg). It has a CentOS 6 (Linux) system and basic software, enough for doing a few exercises on your own. Physical access to the server room is restricted, even I don't have the codes and key, so if the campus is closed, the IT staff works from home, and the machine(s) go down for their own reasons or because of a nasty power outage, then we're in trouble. But we won't worry in advance, we'll back up our work on a mirror machine and maybe it will stay up. In any case, making the research machine(s) available to you is outside the scope of u/g courses and is an extra benefit to you given by me without any explicit or implicit promises. I will argue that you should try to develop codes and even run them on your own machine, which today is more than feasible. Ask your TA Fergus Horrobin about some former UTSC students who spectacularly succeeded. He is one of them.

On your computer, you will sometimes use your Python3, and will be asked to install compiler(s) for HPC languages Fortran and/or C(++), and familiarize yourself with how they function. (If you already know both these languages, then it would be great if you could install Julia compiler and try that new language, created at MIT for purposes consistent with ours, then share your experience.)

In summary, our goals are to help you become a better scientific programmer, able to solve numerically demanding problems on a workstation or a good laptop. While the practice of cluster computing or supercomputing is outside the scope of this undergraduate course, you will hear a lot about it, and even have an option to try it small-scale if you want.

This is a 4th year (upper u/g) course. It explains the issues, tries to interest you in some directions and put you on the right track, but it's your job to move along. This means your own initiative in learning is expected. Great many materials, books and web pages, will be suggested to you, and often you will make your own choice which of them to use. Start very soon, and happy experimenting with the kind of computing you likely haven't experienced yet.


Email or visit the instructor Prof. Pawel Artymowicz (guide to pronunc.: PAvel ArtyMOvich), office SW 506G, on the 5th floor (Physics & Astrophysics Group) in old Science Wing. Please email remarks and questions to:, importantly with a subject line "PHYD57", or else it may be accidentally misplaced. Tutorials with the lecturer.

We are fortunate to have Fergus Horrobin as TA/marker. He will mostly do the marking, with some individual email contact.


Please see the guide to literature in the lecture notes section below. Add a secret ingredient at the end of the URL you are viewing to see a web page that is helpful.
Books discussed below should be in UofT library, in electronic form. If not, let me know.
Find books in library catalog. I do not recommend just watching videos on Youtube about programming; most of them are only very remotely relevant to PHYD57.

Our textbooks and materials will be shown in Lecture notes, and will be specific to the current lecture. For instance, Lecture 2 will benefit from reading parts of

N. Schörghofer, "Lessons in Scientific Computing", CRC 2019.
Please focus on subjects not discussed in the introductory course PSCB57, for instance the author's views on different programming languages and how to select the best combination of two complementary ones. If you read a chapter and the contents seem really familiar, skip it and begin the next chapter.

For L2 and later lectures, and for the knowledge of Fortran, please browse through:

J. Izaac and J. Wang, "Computational Quantum Mechanics", Springer 2018.
P. Turner, T. Arildsen, K. Kavanagh, "Applied Scientific Computing with Python", Springer 2018.
and make sure all the subjects are known to you.

Non-required book for those who want to brush up on Python:

S. Linge, H. Langtangen, "Programming for Computations - Python", Springer 2019 [unlike 1st ed., this 2nd edition uses Python 3]

Course grading scheme

3 sets of assignments A1-A3              24% (8% ea.)
assignment A4 (group project)            15%
midterm exam (timed, online)             19%
final exam (take-home + presentation)    42%

Lecture notes and guides

Programs discussed in lecture notes are available from

Guide to literature L1-L5, approximately      

Lecture 1       Lecture 2       Lecture 3       Lecture 4
Lecture 5       Lecture 6       Lecture 7       Lecture 8
Lecture 9+10     Lecture 11     Lecture 12     

Two papers recommended for Neural Networks in application to Astronomy,
NNs-spect.pdf, and NN-exoplanets.pdf. Also, look at the NN section of the private, auxiliary page!


Normally blocked (since you have access via Quercus > Media Gallery), but some day this way to access them may be needed: Day 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. The Lecture 1 recording is incomplete; the second (shorter) part is missing. But the missing story of the decryption of German military codes by Polish, and then mostly British, code-breakers associated with the Bletchley Park government bureau of codes and ciphers is told both in the PDF of the Lecture 1 notes and in the extensive text and links below (Links > History). We stopped near the end of the lecture notes.


The deadline is at the beginning of the lecture or tutorial on the days marked in the syllabus. The submission format is explained in the text of assignment 1. A submission late by more than 3 hr earns 50% of the points.

Text of assignment set #1. Due 1 Feb.
Text of assignment set #2. Due 17 Feb.
Text of assignment set #3. Due 20 Mar midnight.
Assignment set #4 . Due 5 Apr. 15% of course grade. In contrast to A1-A3, it can be a group project (1 or 2 students). Do all 3 problems (1 problem per week per 1..2 persons, on average).

Midterm exam

Download and edit this midterm 2022 file:
Then submit it to Quercus Assignments between 15:05 and 15:15.

This page: will help you prepare for this exam.

Solutions of midterm

Final exam 29 Apr from 12:01pm

The exam will be a Zoom meeting devoted to the presentation and discussion of one of the take-home programming projects in C(++) || F95, announced a month before the exam. Everybody will have 25 to 30 min to present, followed by a 10-15 min discussion. We will meet at 12:00 noon (unless you have a time conflict with another exam then -- please send me email!) and continue for up to 3 hrs, with some breaks.

By 5 April choose one of the 4 final projects (projects solutions are strictly individual, even if your choices overlap):
1. Planetary system dynamics. Liapunov stability, statistical study.
2. N-body w/FFT grav: 2-component galaxy + BH (sinking satellite problem)
3. SPH - spiral density waves excited by a protoplanet in a gas disk
4. Mie theory of scattering with application to glories and specters.

Unlike assignments, the project should not be treated as a limited set of tasks, all specified in the text of the assignment. Show your initiative, curiosity and ambition, do interesting computations and tell us about them, even if they deviate from the suggested descriptions below.
An 8-10 page PDF writeup is also required, and is not the same as your presentation. The presentation is a slide show, while the writeup is like an essay or blog entry, starting from the background on the project that others might not know, then the methods you used, code details, tests showing that the code works well and returns meaningful results in some simple cases, then your results (estimate accuracy!) and their analysis to draw physical conclusions about the systems in nature. The writeup, unlike the presentation, needs to include the list of references.

1. Massively parallel study of planets named after a beer named after monks

See L11 for background on chaos in dynamical systems, and the concept of the Liapunov exponent (or its inverse, called the Liapunov time). See previous lectures on the 4th order symplectic integrator that you are required to use. See all CUDA materials on how to write the CUDA kernels that concurrently integrate K >> 1 slightly different systems (differing in initial positions and/or velocities of planets). The calculation should not be serial, i.e. multiple versions of the planetary system described below should be simulated concurrently on CPU or GPU. [The choice is yours, and may warrant a little experimentation first. Namely, it's not clear whether a better overall throughput, measured in the number of simulated system-years per second of wall clock time, is achieved on CPU or GPU, because both calculations need to be done in double floating point precision & GPUs on art-2 are not as fast in double as in single precision.]
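As a rough illustration of the idea (not the course's reference code), here is a minimal NumPy sketch of a 4th order symplectic step built from three leapfrog substeps with Yoshida's weights, advancing K slightly perturbed systems at once. It is an assumption-laden toy: each "system" is a single test particle around the star in units with GM = 1, mutual planet gravity is left out, and the names `star_acc`, `yoshida4_step` and the perturbation size 1e-3 are made up for the example. The seeded generator makes the pseudo-random perturbations repeatable.

```python
import numpy as np

# Yoshida (1990) weights: three leapfrog substeps compose into a 4th-order step.
W1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
W0 = 1.0 - 2.0 * W1  # negative: the middle substep goes backward in time


def star_acc(x, gm=1.0):
    """Acceleration due to the central star only; x has shape (K, 2)."""
    r = np.linalg.norm(x, axis=-1, keepdims=True)
    return -gm * x / r**3


def leapfrog_step(x, v, dt, acc):
    """One kick-drift-kick leapfrog step (2nd order, symplectic)."""
    v = v + 0.5 * dt * acc(x)
    x = x + dt * v
    v = v + 0.5 * dt * acc(x)
    return x, v


def yoshida4_step(x, v, dt, acc):
    """One 4th-order symplectic step composed of three leapfrog substeps."""
    for w in (W1, W0, W1):
        x, v = leapfrog_step(x, v, w * dt, acc)
    return x, v


def specific_energy(x, v, gm=1.0):
    return 0.5 * np.sum(v * v, axis=-1) - gm / np.linalg.norm(x, axis=-1)


# K slightly different systems, advanced together as one array operation.
K = 1000
rng = np.random.default_rng(57)  # fixed seed: repeatable perturbations
x = np.tile([1.0, 0.0], (K, 1)) + 1e-3 * rng.standard_normal((K, 2))
v = np.tile([0.0, 1.0], (K, 1)) + 1e-3 * rng.standard_normal((K, 2))
e0 = specific_energy(x, v)
for _ in range(1000):
    x, v = yoshida4_step(x, v, 0.01, star_acc)
max_rel_err = np.max(np.abs(specific_energy(x, v) / e0 - 1.0))
print(f"max relative energy error over {K} systems: {max_rel_err:.2e}")
```

The same data layout (one array row per system) is what a CUDA kernel or an OpenMP loop over systems would parallelize; the project itself must of course be written in C(++) or Fortran.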

Data for the fiducial (basic) setup are largely specified in the entry of the catalog (enter Trappist-1 into search box). Trappists are Belgian monks making a famous dark beer (visit LCBO if interested). No, they did not observe the exoplanets, but the graduate students doing observations apparently learned a lot about the Trappist beer during their studies. The system has been discovered recently, it has 7 planets, all of them Earth-class and a few in the habitable zone! Find out more about it on the net. Perturbations to the initial conditions away from the given set of parameters will be small and pseudo-random (if they were truly random, you would not be able to repeat the same integration, which you need to do to debug and verify your code).

If you haven't got the knowledge of the Kepler problem (two-body problem) from the ASTC25 course to set up the initial conditions of the simulation, read about Kepler's equation and its iterative solution on the web and ask me questions. Assume that at the start of the calculation, the phase of the motion (the fraction of the full orbital period elapsed since the last pericenter passage) *and* the true anomaly theta (the angle you find in the equation of the ellipse: r(theta) = a(1-e^2)/(1+e cos theta)) are random numbers with uniform distributions (e.g., theta between 0 and 2 pi rad). Both statements cannot be simultaneously strictly true, since the motion on an ellipse is not uniform in time. But since the Trappist planets are on fairly circular orbits, the assumption is justified in practice. Analytically, i.e. with pen and paper, derive from the equation of the ellipse cited above the instantaneous radial and transverse components of the velocity vector at any theta, and convert them to Cartesian coordinates if you use that system for calculations. A hint is that the angular momentum per unit mass is a constant in the elliptic motion: r*r*d(theta)/dt = r*v_theta = const. = sqrt(GMa(1-e^2)). Therefore v_theta follows very easily from the initial r and the parameters of the ellipse, and v_r, which contains the derivative d(theta)/dt, is also easy to compute.
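For orientation only, the two ingredients mentioned above can be sketched in Python as follows: a Newton iteration for Kepler's equation E - e sin E = M, and the conversion of (a, e, theta) to planar Cartesian position and velocity using the conserved specific angular momentum h = sqrt(GMa(1-e^2)). The function names, the choice of Newton's method, and the convention that the pericenter lies on the +x axis are assumptions of this sketch; you are still expected to derive v_r and v_theta yourself.

```python
import math


def solve_kepler(M, e, tol=1e-14, max_iter=50):
    """Solve Kepler's equation E - e*sin(E) = M for E by Newton iteration."""
    E = M if e < 0.8 else math.pi  # common starting guess
    for _ in range(max_iter):
        dE = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= dE
        if abs(dE) < tol:
            break
    return E


def state_from_elements(a, e, theta, gm=1.0):
    """Planar position and velocity at true anomaly theta (pericenter on +x)."""
    p = a * (1.0 - e * e)                 # semi-latus rectum
    r = p / (1.0 + e * math.cos(theta))   # equation of the ellipse
    h = math.sqrt(gm * p)                 # specific angular momentum, r*v_theta
    v_theta = h / r
    v_r = (gm / h) * e * math.sin(theta)  # dr/dt = (dr/dtheta) * h / r^2
    c, s = math.cos(theta), math.sin(theta)
    return r * c, r * s, v_r * c - v_theta * s, v_r * s + v_theta * c


# Example: a = 1, e = 0.3, theta = 1 rad, in units with GM = 1.
x0, y0, vx0, vy0 = state_from_elements(1.0, 0.3, 1.0)
energy = 0.5 * (vx0**2 + vy0**2) - 1.0 / math.hypot(x0, y0)
print(energy)  # should be close to -GM/(2a) = -0.5
```

Checking that the energy equals -GM/(2a) and that x*vy - y*vx equals h is a cheap test of the initial conditions before any integration starts.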

There are actually two ways to quantify the stability: one by watching for close encounters and/or escapes and finding the time at which the change occurs, the other by integrating the shadow systems with a minuscule initial deviation of the positions and velocities, to find the Liapunov exponent. I would like you to use the first one, and the second as an additional indicator, if at all. We want to label a system "unstable" if at any point in time (you can check every time step or every now and then, not as often as every time step) planets approach to within 2 Roche lobe radii (see L12) of any of the approaching planets, or if a planet is located further than 2 times the initial distance between the star and the outermost planet, or closer to the star than the initial radius of the innermost planet. Ideally, the calculation would then stop and a new variant of the system would be started. But it's up to you how to optimize & parallelize. Describe everything in the writeup. Also, in any dynamical calculation describe how you chose the optimum constant time step dt, and how you tested the accuracy of your simulation.
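The first criterion can be sketched as a small check function. This is only one possible reading of the rules above, under stated assumptions: the Roche lobe radius is approximated here by the Hill radius a*(m/(3M))^(1/3) (use the definition from L12 in your project), a pair counts as a close encounter when its separation is below twice the larger of the two lobe radii, and the inner and outer limits `r_min` and `r_max` are supplied by the caller. All names are hypothetical.

```python
import numpy as np


def hill_radius(a, m, m_star):
    """Hill radius, used here as a stand-in for the Roche lobe radius of L12."""
    return a * (m / (3.0 * m_star)) ** (1.0 / 3.0)


def is_unstable(pos, r_lobe, r_min, r_max):
    """Return True if the system violates any of the three stability rules.

    pos    : (n, 3) planet positions relative to the star
    r_lobe : (n,) lobe radius of each planet
    r_min  : inner limit on the distance from the star
    r_max  : outer limit (e.g. 2 x initial distance of the outermost planet)
    """
    r = np.linalg.norm(pos, axis=1)
    if np.any(r > r_max) or np.any(r < r_min):
        return True  # escape, or plunge toward the star
    # Pairwise separations compared with 2 x the larger lobe radius of the pair.
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
    limit = 2.0 * np.maximum(r_lobe[:, None], r_lobe[None, :])
    i, j = np.triu_indices(len(pos), k=1)
    return bool(np.any(d[i, j] < limit[i, j]))


lobes = np.array([0.01, 0.015])
calm = np.array([[1.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
close = np.array([[1.0, 0.0, 0.0], [1.01, 0.0, 0.0]])
print(is_unstable(calm, lobes, 0.5, 3.0), is_unstable(close, lobes, 0.5, 3.0))
```

Calling such a check only every few time steps keeps its O(n^2) cost negligible next to the force evaluation.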

2. N-body galactic encounter

This is essentially an N-body simulation with N on the order of 100 million, in which N stars in a target/host galaxy and in a smaller dwarf galaxy (that has just collided with the host galaxy) feel the attraction of all the other stars. Direct computation of all the interactions is too slow. All gravity forces need to be evaluated on a 3-D grid by the FFT method already outlined in L10 (more details in L12). Zero-padding will be used to avoid aliasing; the size of the full density array will be 512x512x512 or at least 512x512x256 (the flat host galaxy and the nearly coplanar initial orbit of the dwarf one allow a smaller resolution in the "z" direction). Each of the two galaxies hosts a relatively massive particle in the center, representing an SMBH (supermassive black hole).
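To make the role of zero-padding concrete, here is a small NumPy sketch of an isolated (non-periodic) potential solver: the mass grid is embedded in a grid twice as large in each direction and convolved with the point-mass kernel -G/r via FFT. This is a toy on a 16^3 grid, not the L10/L12 implementation; the function name, the use of mass per cell instead of density, and the choice of zero kernel value at zero separation (no self-potential) are assumptions of the example.

```python
import numpy as np


def isolated_potential(mass, h=1.0, G=1.0):
    """Potential of an isolated mass distribution on a cubic grid (n, n, n).

    mass holds the mass in each cell; h is the cell size. The grid is
    zero-padded to (2n)^3 so the periodic FFT convolution cannot wrap
    mass images around the box.
    """
    n = mass.shape[0]
    m = 2 * n
    idx = np.arange(m)
    off = np.where(idx < n, idx, idx - m) * h  # offsets 0..n-1, then -n..-1
    dx, dy, dz = np.meshgrid(off, off, off, indexing="ij")
    r = np.sqrt(dx**2 + dy**2 + dz**2)
    r[0, 0, 0] = np.inf                        # kernel is 0 at zero separation
    green = -G / r
    padded = np.zeros((m, m, m))
    padded[:n, :n, :n] = mass
    phi = np.fft.ifftn(np.fft.fftn(padded) * np.fft.fftn(green)).real
    return phi[:n, :n, :n]


# Test case: a single point mass, whose potential is known exactly.
grid_mass = np.zeros((16, 16, 16))
grid_mass[3, 4, 5] = 2.0
phi = isolated_potential(grid_mass)
print(phi[10, 4, 5])  # expected -G*m/r = -2/7
```

Without the padding, the same convolution would add the attraction of periodic images of the galaxy, which is exactly the aliasing the project text warns about.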

The objective is to write and test the FFT gravity solver first, then test the dynamic integrator (2nd order symplectic, leapfrog), show that both work accurately enough (time step for dynamics integrator must be optimized). Testing involves checking the conservation of those physical quantities that should be constant. Finally, you'll run the code for an extended period of the simulated time, and see what happens with the host galaxy, dwarf galaxy, and their SMBHs. There is no limit how much fun you can have doing the visualization. The minimum is to produce a series of snapshots in 2-D projection(s); plan maximum would be to do animations/videos.
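A possible shape for such a test, shown on a two-body toy problem rather than on a galaxy: integrate with kick-drift-kick leapfrog, track the worst relative energy error, and repeat with half the time step. For a 2nd order scheme the error should drop by about a factor of 4, while the total angular momentum should stay constant to round-off. The masses, initial conditions and function names below are illustrative assumptions, and the forces come from direct summation, not from the FFT solver.

```python
import numpy as np


def accelerations(x, m, G=1.0):
    """Direct-sum gravitational accelerations; x is (n, 2), m is (n,)."""
    d = x[None, :, :] - x[:, None, :]      # d[i, j] = x_j - x_i
    r2 = np.sum(d * d, axis=2)
    np.fill_diagonal(r2, np.inf)           # no self-force
    return G * np.sum(m[None, :, None] * d / r2[:, :, None] ** 1.5, axis=1)


def total_energy(x, v, m, G=1.0):
    kinetic = 0.5 * np.sum(m * np.sum(v * v, axis=1))
    i, j = np.triu_indices(len(m), k=1)
    potential = -G * np.sum(m[i] * m[j] / np.linalg.norm(x[i] - x[j], axis=1))
    return kinetic + potential


def angular_momentum_z(x, v, m):
    return np.sum(m * (x[:, 0] * v[:, 1] - x[:, 1] * v[:, 0]))


def run(dt, t_end=10.0):
    """Leapfrog run; returns (worst relative energy error, |L - L0|)."""
    m = np.array([1.0, 1e-3])
    x = np.array([[0.0, 0.0], [0.7, 0.0]])
    v = np.array([[0.0, 0.0], [0.0, 1.3]])
    e0, l0 = total_energy(x, v, m), angular_momentum_z(x, v, m)
    worst = 0.0
    a = accelerations(x, m)
    for _ in range(int(round(t_end / dt))):
        v = v + 0.5 * dt * a               # kick
        x = x + dt * v                     # drift
        a = accelerations(x, m)
        v = v + 0.5 * dt * a               # kick
        worst = max(worst, abs(total_energy(x, v, m) / e0 - 1.0))
    return worst, abs(angular_momentum_z(x, v, m) - l0)


err_coarse, dl_coarse = run(0.01)
err_fine, dl_fine = run(0.005)
print(err_coarse, err_fine, err_coarse / err_fine)
```

Plotting the energy error against dt on a log-log scale is a simple way to justify the time step you finally choose in the writeup.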

3. SPH project

See lectures 10 and 11 on SPH, both nearest neighbor finding using linked lists, and the SPH algorithm to do gas dynamics.
The best description for the purpose of implementation that I've found is this article in Annual Review of Astronomy and Astrophysics:
Smoothed Particle Hydrodynamics (1992).
In addition, here you have two other descriptions with more details, derivations and comments on implementation:
Monaghan (2005).     Monaghan (1997).
Description from Lagrangian point of view (optional material! I frankly cannot remember which book it is from) Chapter 3 (2010).
You study the response of a disk of gas to a gravitational perturbation. In principle, you can study any disk, such as a disk of gas in a spiral galaxy perturbed by an imposed, known low-amplitude spiral disturbance of gravity from a stellar disk (which you do not need to model, though in real research you would have to, in the way explained in the previous project!). I suggest that you study a protoplanetary gas disk disturbed by the gravity of an embedded protoplanet. You should program the planet to travel on a circle at a speed described by Kepler's laws, but introduce it into the calculation gradually, leaving time for the disk to relax both (i) the initial 'ringing' due to imperfectly calculated initial conditions of force balance, and (ii) the unavoidable oscillations purely due to rapid insertion of a perturber into the disk. Therefore, ramp the mass of the planet linearly up to the full mass (study 2 mass ratios m/M = 0.001, 0.01) over the time t = 0...10 orbital periods. The planet should be in the middle of the disk at radius 1 (if you prefer to use correct physical units instead of rescaled non-dimensional coordinates, then 1 would mean 1 AU = 150e6 km). The disk should extend from 0.45 to 2 (AU). On demand I'll describe the disk setup in more detail. (The disk should be 3-D, and have z/r = 0.05 throughout, where z = vertical scale height.) The mass of the disk should not play a role. The disk should be vertically isothermal; its sound speed should only vary with the cylindrical radial coordinate r. What maximum density contrast in the disk do you obtain after 10, 20, and 50 orbital periods of the planet?
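Three building blocks of this project can be sketched compactly in Python: the standard cubic spline SPH kernel in 3-D, a density sum that finds neighbors through a cell linked list (cells of size 2h, the kernel's support radius), and the linear mass ramp of the planet. This is a minimal illustration under assumptions, not the lecture code: a single constant smoothing length h is used, the function names are invented, and the pure-Python loops are far too slow for a production run.

```python
import itertools
import numpy as np


def w_cubic(r, h):
    """Cubic spline kernel in 3-D, normalized to unit volume integral."""
    q = r / h
    w = np.where(q <= 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
                 np.where(q <= 2.0, 0.25 * (2.0 - q) ** 3, 0.0))
    return w / (np.pi * h**3)


def build_linked_list(pos, cell):
    """head[cell key] -> first particle in that cell; nxt[i] -> next one or -1."""
    keys = np.floor(pos / cell).astype(int)
    head, nxt = {}, np.full(len(pos), -1)
    for i, key in enumerate(map(tuple, keys)):
        nxt[i] = head.get(key, -1)
        head[key] = i
    return head, nxt, keys


def density(pos, mass, h):
    """SPH density sum over the 27 cells surrounding each particle."""
    head, nxt, keys = build_linked_list(pos, 2.0 * h)
    rho = np.zeros(len(pos))
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    for i in range(len(pos)):
        for off in offsets:
            j = head.get(tuple(keys[i] + off), -1)
            while j != -1:                       # walk the chain of this cell
                r = np.linalg.norm(pos[i] - pos[j])
                rho[i] += mass[j] * w_cubic(r, h)
                j = nxt[j]
    return rho


def planet_mass(t, m_full, period, ramp_periods=10.0):
    """Planet mass ramped linearly from 0 to m_full over ramp_periods orbits."""
    return m_full * min(t / (ramp_periods * period), 1.0)


rng = np.random.default_rng(1)
pos = rng.random((200, 3))
mass = np.full(200, 1.0 / 200)
rho = density(pos, mass, h=0.1)
print(rho.mean())
```

Comparing the linked-list result with a brute-force sum over all pairs on a few hundred particles is a quick correctness test before scaling up.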

4. Pilot's glory. Mie theory calculations.

Pilot's glory (a.k.a. Brocken Spectre; cf. wiki article on glories) is a colorful halo surrounding the shadow of an object (observer) cast on the cloud or fog consisting of suspended water particles. It results from a near-180 degrees backward scattering of sunlight off a collection of spherical water droplets in the air. (That's why it is seen surrounding a shadow).
Picture 1    picture 2
It can be predicted. And vice versa: from observation, in principle, we should be able to find out the distribution of droplet sizes corresponding to each picture. You can find more pictures here, as well as here , and choose one to explain.

In the phyd57/mie subdirectory you will find some working fortran subroutines such as mie.f90, predicting the spatial distribution of light scattered off a spherical water drop, for a given optical refraction index (or alternatively dielectric constants) of water, and ratio of the particle size to the wavelength of light. [There are in fact 2 separate inputs, radius of a particle and the wavelength of light, since the refractive index of water at a given wavelength can be computed by a separate little function or procedure.] Using that subroutine or translating it into a C(++) procedure, you will assemble a theoretical color image of pilot's glory using 200 separate wavelengths of visible light, and 100 different sizes of water droplets. (You can use Planck function with temperature 3700 K to model the spectrum of a slightly reddened sunlight.) The distribution of water drop sizes will be Gaussian, characterized by a mean size and half-width of distribution (total width will be +-3 half-widths). Fiducial mean size is lambda = 10 micrometers, and half-width of the distribution function is delta = 3 micrometers. Ideally, you will be able to adjust the two parameters of the size distribution until you find a reasonably good match to the chosen picture. At present, nobody knows what size distribution the cloud shown in the picture had. You may be the first person to know it.
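The averaging described above can be organized roughly as in the following Python sketch: Gaussian weights over 100 droplet sizes within +-3 half-widths of the mean, a Planck weight at 3700 K over 200 visible wavelengths, and a double loop that accumulates the scattered intensity versus angle. Several things here are assumptions: the half-width is treated as the Gaussian sigma, droplets are weighted by number only, the visible range is taken as 380-750 nm, and `toy_scatter` is a placeholder for a real Mie routine (such as a wrapper around mie.f90) whose actual interface is not shown in this text. Converting the resulting spectrum to screen colors is a separate step, not covered here.

```python
import numpy as np

H, C, KB = 6.62607015e-34, 2.99792458e8, 1.380649e-23  # SI constants


def planck(lam, T=3700.0):
    """Planck function B_lambda(T); lam in metres."""
    return 2.0 * H * C**2 / lam**5 / np.expm1(H * C / (lam * KB * T))


def size_weights(mean=10e-6, half_width=3e-6, n=100):
    """Droplet sizes within +-3 half-widths and normalized Gaussian weights."""
    radii = np.linspace(mean - 3 * half_width, mean + 3 * half_width, n)
    w = np.exp(-0.5 * ((radii - mean) / half_width) ** 2)
    return radii, w / w.sum()


def glory_spectrum(scatter, angles, n_lam=200, n_size=100):
    """Size-averaged, Planck-weighted intensity, shape (len(angles), n_lam).

    scatter(radius, lam, angles) must return the scattered intensity at
    each angle for one droplet radius and one wavelength.
    """
    lams = np.linspace(380e-9, 750e-9, n_lam)
    radii, w = size_weights(n=n_size)
    out = np.zeros((len(angles), n_lam))
    for k, lam in enumerate(lams):
        for a, wa in zip(radii, w):
            out[:, k] += wa * scatter(a, lam, angles)
        out[:, k] *= planck(lam)
    return lams, out


def toy_scatter(radius, lam, angles):
    """Placeholder: isotropic unit scattering, NOT Mie theory."""
    return np.ones_like(angles)


angles = np.linspace(170.0, 180.0, 50)   # degrees, near backscattering
lams, spectrum = glory_spectrum(toy_scatter, angles)
print(spectrum.shape)
```

With the placeholder replaced by a real Mie routine, the two parameters of `size_weights` are the knobs to adjust when matching a photograph.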

If you are interested in Mie theory, although this is not strictly required to master for this project, here you have a book about it. There is a fortran code in the appendix, though my old code is a bit better :-)

Partial results

Please see the current tally



A cartoon from the 1980s.
Concept and history of bit (b). Please study it, and follow links to other important basic concepts in that short article, which you will be required to understand and define during exams and quizzes, such as: bitrate, bandwidth, data bus, 32- vs. 64-bit systems, ASCII code, pixels, various powers of 10 in computer lingo, and RAM memory. Note that this online encyclopedia is slightly outdated (from 2004); in 2019 "current" hardware capabilities are a bit higher (pun intended).
Concept and history of byte (B). Another bunch of important basic terminology.
Are you interested in ENIGMA and its decoding before and during WWII?
Then you have to dig deeper than the related Hollywood movies, which are at times extremely short on real history (better cf. Max Hastings, "The Secret War: Spies, Codes and Guerrillas 1939-1945", London, 2015). ENIGMA machines were used by Hitler's military for coded communications between the headquarters and units in the field. ENIGMA decoding, first achieved in Poland by Marian Rejewski's team, may eventually have preserved millions of lives, by saving England from German invasion, tipping the balance in favor of the Allies, and shortening the world war.

After the pre-war Polish cryptologic breakthroughs, hardware devices were built there, implementing the decoding algorithms called Bomba (cryptographic bomb; or "bombe" in top-secret U.S. Army reports of the time). Copies of ENIGMA machines were also built. Eventually an even faster hacking of the gradually more complex combinations of wheels inside ENIGMAs became necessary.

This was undertaken by the British government using qualitatively the same approach as invented by the secret Polish Cipher Bureau. An electromechanical version of the Polish Bomba was built in Bletchley Park near London by an outstanding pioneer of computational science, Alan Turing. He also created some statistical theory to aid this work.
The recent biographical movie on A. Turing and the British Enigma-cracking machinery, titled "The Imitation Game", distorts the history, as his team is incorrectly given the sole credit for breaking the ENIGMA code. In fact they were neither the first nor the last to come up with essential breakthroughs. For instance, a half-forgotten team of brilliant engineers of the British Post Office headed by Thomas Flowers, working around the clock for months, constructed the specialized Colossus computer and transferred it to Turing's bureau. That device is shown in the movie, but was neither Turing's invention nor tasked with deciphering ENIGMA codes! Instead, it dealt with other, non-ENIGMA German codes.

Even more powerful machines were developed during WWII by the American government (Navy), in order to decipher the changed coding procedure used by German submarines. Such devices were the first computers, though not the general-purpose electronic computers we have today. They were mostly electromechanical, that is to say based on relays and automatic switches.

As you see, computing from the beginning was and continues to be very very useful. Unfortunately these days it is often used for spying on... you. You do have a smartphone, right? Anyway, enjoy the four ENIGMAtic movies nicely broken down for you on this page . Computing history Computer science history, with further links

Supercomputers. Supercomputing stats. Explore the site, incl. the most recent list of the fastest supercomputers in the world.

• Article about where the #1 supercomputer, Fugaku, places us on our way to exaflop computing.

Python
Introductions of all sorts to the Python language for beginning coders.
Python Language Reference, straight from the horse's mouth.
Site with docs and resources on Python.
Tutorial on Matplotlib and Pyplot.
Tutorial and user guide to NumPy.
Tutorial on Pyplot (part of Matplotlib based on MATLAB syntax).

Linux and its shells

• You are encouraged but not required to install CentOS on your computer, although that is a better option. On art-2 there is an account for you, and you will be shown a password to it; if you later forget it, please ask the prof by email. This system has CentOS 6 available for your exercises. To be able to do anything there, you need to learn basic Linux and, if you're connecting from a Windows system, to install an SSH (secure shell) client program, PuTTY (or similar), on your machine. ssh is the way to connect securely and work on a remote Linux server, while sftp is a similar client that connects for the purpose of transferring files between two machines on the internet. See more in the references cited on our page of references and literature. For instance, cf. p. 52+ in Membrey's book on CentOS.
Getting Started with Linux, with links to tutorials.
Linux info page, including some history pages and explanations of commands. Follow the links.
Programming in bash shell.
Programming in csh shell.
Shell scripting tutorial in different shells.

Languages This is your primary reference.

Click here. Python vs. Fortran. This discussion on Stackexchange (an often informative resource for coders) sheds some light on what programmers think about the differences between Python and high-level compiled languages - here Fortran (C/C++ is another).

This article proposes: "It certainly seems likely, in light of the above, that Fortran will remain the fastest option for numerical supercomputing for the foreseeable future—at least if 'fast' refers to the raw speed of compiled code. But there are other reasons for Fortran’s staying power."
I personally prefer Fortran for demanding research and Python and/or another scripting language, IDL, for small tasks and visualization. C/C++, in which (Unix and) Linux operating systems are written, is very useful too, as it sometimes provides the most direct way to tinker with graphics cards (GPUs) to make them do CUDA. [That is an inside joke for people coming from my part of Europe. CUDA is the extension of C/C++ (and Fortran) that directs data transfer to/from CPU and computation on GPU. Appropriately, the word "cuda" means "miracles" in Slavic languages and in Hungarian (csodak)]. We deal with such miracles in this course. C programs are easily callable from Python (as we will learn), so you can practice multi-language programming and interfacing.
provides views similar to my own on the best scientific programming languages (Python+Fortran and C but not C++, Julia in the near future or now at graduate level). Page by Andrew Ning from Flight, Optimization, and Wind Laboratory.
Julia language page.
Advantages and a few gripes about Julia language.
Links to Fortran tutorials & other resources.
Another Fortran tutorial.
Interactive C tutorial.
C Tutorial – Learn C Programming with examples. Also C++.

just for fun. Semi-serious stuff on computer languages. Description of a new language called BS. Combines the worst features of all existing languages.

Networking . First of 14 videos in a series on networking. See the next ones on YT.

Parallel computing - OpenMP
OpenMP, "Hands-on intro to OpenMP", a slide show by Intel programmers from SC08, Austin, TX
OMP: Introduction
OMP: for loops
OMP: Sections
OMP: for loop scheduling (assignment of threads to pieces of work)
OMP: Reduction
OpenMP, manual by creators, full description
OpenMP, examples in C(++) and Fortran. Lots of code fragments, with some advanced concepts.
This page discusses the Performance Obstacles in fork-join parallelism using OpenMP.

Parallel computing - CUDA C and CUDA Fortran

Below you'll find a lot of internet resources. Remember to read materials on our extra page with pdf's (send me email if you forgot how to access it).

CUDA Fortran blog by Greg Ruetsch (author of one of recommended books on extra page.)
CUDA C blog by Mark Harris, NVIDIA
CUDA C presentation - different modes of usage, by Cyril Zeller, NVIDIA.
CUDA C Programming Guide. Detailed. Definitive (By Nvidia!). Read the first 30 pages.
CUFFT library description - useful when you need FFT. When mastered, the world of DSP is yours. Contains code examples.
CUDA nvcc compiler manual
Paper comparing CUDA in Fortran and C with CPU performance, as a function of job size.
Dr. Dobbs blog CUDA, Supercomputing for the Masses. Links to all 12 pieces. A bit old but informative.
CUDA Fortran compiler, PGI ver.18 User Guide
Fortran compiler, PGI ver.18 User Guide
Let's end with a GPU version of a program you already know. To be compiled with PGI compiler pgf90/95 (only that compiler understands CUDA Fortran on art-2):

Old and new technologies is a journal of HPC. - if you want to dig (a tiny little bit) below the surface level of buzzwords. The Coming Age of Extreme Heterogeneity ǀ Jeffrey Vetter. ORNL

• Overview of desktop supercomputing, past and present (as of 2021).

last modified: Dec 2021