ParticleTracker, Julia vs Cython vs Numba

@namurphy @rocco8773 @KhalilBryant (please tag anyone who was there as well!),

In yesterday’s notes you guys wrote:

  • Particle Tracker
    * What is easier to package cython vs Julia?
    * Nick and I will review in tandem
    * Which is faster.
    * How difficult is it to put Julia in a pip-installed package?
    * Topic came up because crew at BaPSF are looking into developing a Particle Tracker

And while I couldn’t be there in person, I think it’s a good call to keep this discussion up.

  • Cython will be easier to package for a Python package. It’s not even close - the ecosystem is stable. I do still have to read up on PyJulia, and I’ve seen some packages include it, so it looks like it’s getting there.
  • In a vacuum, Julia code is way easier to write than Cython code is.
  • However, we’re not in a vacuum and some of our functionality is already in Python (mostly magnetostatics, etc). This is additional code we’d have to transfer to Julia, or call from Julia using PyCall.jl (at the risk of them being slow-ish).
  • At the same time, I’m pretty sure Cython is going to be slower than Julia would, because the Cython compiler is not all that smart to optimize stuff, while a ton of effort is going into doing that in Julia.
  • It might be a local news bubble effect, but it seems like everyone’s moving away from writing low level C code; writing stuff in Cython feels like doing exactly that.

There is a third option - numba, that might have been missing in that discussion. Numba’s neat because

  • VERY easy packaging nowadays
  • The code isn’t significantly harder to write
  • Goes well with existing code

But it has its own set of issues, the chief of which is that you often can’t call Python code from JIT code, so you can’t do

with numba.jit:  # not real syntax but bear with me for a minute
    for timestep in iterations:
         interpolate_fields_to_particles()    # currently on scipy.interpolate
         push_particles()                     # JIT atm
         scatter_particle_charge()            # hypothetical code for PIC 
         recalculate_fields()                 # likewise

but instead you have to do this in a pure Python loop. This means that, instead of dropping from the Python interpreter into JIT code once, you drop in and out of it twice per loop execution. This is going to do terrible things to our performance, and I don’t want to do it that way.

So… I don’t know what the best call here is. There isn’t that much code we have in Python that ParticleTracker would rely on (like I said, magnetostatics mostly). Might be best to rewrite this stuff, honestly.

@stanczakdominik Thanks for posting this. Yeah, I didn’t think of numba until after the fact. I guess we really have three options at the moment:

  1. Cython
  2. Julia
  3. numba

In order to decide which to use I think we need to weigh the following:

  1. performance
  2. ease of package integration
    A. difficulty of coding
    B. packaging for deployment
    C. number of required supporting pkgs (via pip & independent)
  3. ease of use for users
    A. what extras do users need to install (e.g. cython needs gcc but I think numpy also needs that so users will already have it…I may be wrong on this)
    B. can users stay withing the plasmapy API

This is not an exhaustive list, just the first things that come to mind. @stanczakdominik How do you think we should proceed? I’m thinking we should go about trying to answer all the above points for the current three options.

Good call. I’ll try to incrementally do overviews for the options over the next few days. I’m decidedly non-hyped up for Cython, though, but it might be a decent idea to rationally evaluate its feasability. Still - we might be short on people to implement it if we go that way.

1 Like

I found this blog post on cython vs numba, but it’s pretty old. 2015. The gist I got from it, cython is easier to distribute, they’re on part to code with, and cython is more type restrictive. That is, if you write a cython function to work w/ 1D arrays then that’s all it will work with (without jumping thru hoops). Numba’s compile just in time allows for more type flexibility. Again, it’s an old post so things have probably evolved a little.

Yeah, 5 years is a long time - looks like the Julia folks were just releasing 0.4 (we’re now at 1.5), and I’m pretty sure that wasn’t nearly as usable.

Before getting into the choice of languages itself, let’s define the scope, first. We want to:

For a given set of particle positions and velocities and external fields:

  • in a time loop:
    1. gather fields at particle locations through one or more of the following:
      • directly specified analytical functions
      • existing Magnetostatics objects (magnetic coils)
      • fields defined on grids (through interpolation)
    2. move and push the particles, updating the velocities and position values for the particles
    3. (optionally, not at every timestep) save the particles’ positions and velocities to some storage (disk?)
    4. (optionally, if/once we go into PIC):
      1. basically histogram the particles (multiplied by shape functions) to grab the charge/current densities;
      2. solve Maxwell’s equations

Once the loop is over, we want to look at the particles’ position and velocity history, maybe plot it, anyway - we want to access them as arrays. Depending on how often we save, These arrays may become larger than memory, so it’d be helpful to read them back in using something like xarray - we want lazy loading/chunking, and the underlying Dask integration would handle that.

I’m also assuming that you’re bound by your machine’s memory for the number of particles in your simulation. We could sort of sidestep that with something like Dask, but let’s skip that for now.

That’s how I see it atm.

1 Like

That sounds like a good test bench procedure to me! I do think the saving to disk will be the bottleneck in the entire process. It would probably be useful to take the temporary memory penalty and copy the array in memory and then save that array to disk as the particle tracker continues on it’s way. How much of the time history to you plan to keep in memory? All of it? Would it be an optional argument?

My suspicion, Julia, Cython, and Numba will not have significantly different execution times. They both seem to get close to the execution time of C. I think the deciding factors for the package will be (1) ease of coding, (2) ease of integration into plasmapy, (3) ease of distribution, and (4) ease of use on the user end. But, well find out!

Yeah, I’m definitely thinking of an interation or time step between trajectory dumps to disk being an optional argument.

There are some existing solutions for an easy way of dumping things like that to disk, the one that comes to mind is https://zarr.readthedocs.io/en/stable/tutorial.html#persistent-arrays. Could be handy.

Anyway,

Numba

performance

  • the particle stepper is pretty easy to write using Numba. However, given that the particle stepper is trivially parallelizable, I don’t actually expect that part to be very slow anyway. It’s the simplest thing to write using just numpy, regardless even of the chosen stepper algorithm.
  • IIRC I have been able to improve the runtimes for magnetostatics with it in the closed particle tracker PR (I’d have to rerun my tests); it came at the cost of some pretty damn ugly hacks, though. Things might be better now that jitclasses have improved a little, but…
  • I would suspect that interpolating the fields to the particles could actually be made faster using Numba rather than using Scipy, at the cost of rewriting some of that code. It would then play nicer with the rest of the functionality, and we could do do something like “between dumps to disk, run the fragment of the full computational loop inside of numba” for an added speed boost.

ease of package integration

difficulty of coding

Basically, you can’t write methods with numba. There are tricks and ways around it - @staticmethods can be JITted - but that includes some awfully ugly hacks.

I think the way to go here might be mixing in a little of Julia style. For the example of calculating the magnetic field from a MagnetoStatic element at a point:

  • have a fast numba private _function that takes arrays and does the computational lifting;
  • in the method of the MagnetoStatic, call that function, do unit handling etc;
  • inside of ParticleTracker, call the raw function.

But this might get complicated, because in Julia you could use the awesome power of multiple dispatch to do this:

calculate_B(point_in_space, wire::GeneralWire) = # use general biot savart on current and position arrays of the wire
calculate_B(point_in_space, wire::MagneticDipole) = # use analytical expression

and then you just call iterate over the MagnetoStatics we have in the simulation and, regardless of their type, call calculate_B. But we can only have a single function under that name, and we can’t specialize on types… there may be ways around this, but I’m not sure yet.

packaging for deployment

Whenever I used it, it was just a matter of pip installing numba and that was it, so I think it’s just a matter of adding it to requirements . Numba uses LLVMLite to do the compilation, and that was difficult to install without conda in the past, but it’s gotten much better.

However, I don’t know what the state of things would be on windows - gcc could be an issue.

ease of use for users

I don’t foresee difficulties beyond the above; but I might have an above average pain threshold for handling installs etc, so I’m not the right person to ask about this! :sweat_smile:

Okay, I’ve been putting off continuing this thread for way too long now.

The truth is, I don’t feel experienced in Cython enough to give that comparison the depth it deserves. So I’m just going to say I’m not doing it.

If we do decide to go with Numba, as I think we should, I thought I should link this here:

It’s a PIC code that runs on Numba, and targets GPUs, even. I think going through their docs and could be a good idea to pick up some design patterns for that.