Basheer Subei edited this page Aug 20, 2015 · 77 revisions

See this issue for current progress on the REPL feature.

Summary

The goal is to incorporate interactive features into the Swift/T language and environment, so that users can interactively run Swift code and monitor its progress, making it easier to learn Swift and to prototype Swift code.

Feature List

  • To run and evaluate multiple Swift/T code snippets.
  • History of commands (In and Out)
  • Code-completion
  • Syntax Highlighting
  • Interactive Namespace that contains variables, function definitions, etc.
  • Session logging and restoring
  • "magic" functions
    • %run to run scripts
    • %reset to reset namespace
    • Monitoring commands
      • overview status including active servers and workers.
      • status of running tasks (like ps)
      • view all data members and inspect them while tasks are running in the background
  • System shell commands (different from Swift app functions, probably just run in IPython)
  • Notebook (web application)

See IPython's main features for reference.
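The %-prefixed "magic" functions in the list above could be recognized and dispatched before input ever reaches the Swift/T compiler. A minimal sketch in Python of that front-end classification step — the `classify` helper and its return shape are illustrative, not part of IPython or Swift/T:

```python
def classify(line):
    """Split REPL input into a magic command or ordinary Swift/T code.

    Returns ("magic", name, arg) for %-prefixed input such as
    "%run script.swift", and ("code", line) for everything else.
    """
    stripped = line.strip()
    if stripped.startswith("%"):
        # "%run script.swift" -> name "run", arg "script.swift"
        name, _, arg = stripped[1:].partition(" ")
        return ("magic", name.lower(), arg.strip())
    return ("code", stripped)
```

A dispatcher would then map names like `run`, `reset`, and `ls` to handlers, and pass everything tagged `"code"` through to STC.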

List of tasks to complete:

  • STC -c flag to omit generating boilerplate.
  • (not sure about this?) Have the STC -c flag generate unique names (for functions and globals) in the generated tic code, maybe using prefixes. When evaluating -O0 tic code with multiple workers, the other worker ranks crash with invalid command name for the SOFT and HARD proc names. I'm assuming that's because they were redefined using uplevel #0.
  • REPL should load the user-input tic file (only proc definitions and global var assignments), then spawn tasks (at max priority) for all other workers to do the same. Then (barrier or not), it should execute any calls in the tic's main.
  • prompt currently only takes commands once. Make it a proper REPL that loops indefinitely and exits when given the %EXIT command.
  • previous user input (scripts) is currently separate from subsequent commands:
    • the global_vars dict gets rewritten every time a command runs. Append to it instead of rewriting.
    • import statements have to be stored and reused implicitly during subsequent commands.
    • function definitions are sent to all workers (in the tic), but they need to be reused implicitly on subsequent script calls. Possibly use ANTLR's Python runtime in the Jupyter kernel to parse out functions and save them.
    • function pruning is not disabled when both -c and -O0 flags are on.
    • subsequent scripts cannot access globals from previous scripts
      • add an extern keyword to STC so it declares a variable without allocating it or storing a value in it, i.e. without generating declare_globals or turbine::store* calls.
      • STC should ignore undefined-variable errors for extern-declared variables.
      • keep track of all global variables seen so far, and implicitly prepend extern declarations for them to every new user script.
      • Remove the four globals HARD, SOFT, RANK, and NODE when -c flag is used. Decided to remove them manually in Tcl REPL.
    • extern statements still cause STC errors inside for loops and other blocks (maybe functions)
    • how would a user redefine global variables and functions in subsequent scripts?
  • deal with multiline input (using %END for now) properly. Taken care of by Jupyter's client-to-kernel mechanism (each line is sent to be compiled).
  • Jupyter kernel for Swift/T
    • kernel should start up a turbine instance that runs the REPL in a (blocking) worker.
    • set up two pipes between the kernel and the REPL.
    • the kernel writes any input it gets to the pipe, and the REPL executes user script coming from pipe.
      • turbine repl gets stuck on the first stdin read (looping over and over). Check the stdin buffering in Tcl; maybe it has to be flushed(?). Also check fconfigure.
    • REPL needs to write back all the output to the pipe, so that the kernel can print it out to stdout.
    • kernel acts as a socket server to send then receive to turbine repl.
    • turbine repl acts as a socket client to receive then send back to kernel.
      • redirect turbine stdout to the socket client connection to send it back to the kernel safely. Also need to figure out how to present that output nicely in the Jupyter client (stdout might work in a terminal, but not in a webapp). Tasks running in the background might print output; we can't asynchronously send that to the client (maybe there's no solution for this, but that's why we have the %ls command).
      • turbine repl should handle errors gracefully and close socket properly.
      • kernel should also close or restart repl if it fails.
    • fix kernel shutdown to terminate turbine (zombie processes).
    • once that's done, we can implement magic commands.
    • deciding whether to continue input or to send to kernel's do_execute(). TEMPORARY FIX DONE
    • syntax highlighting (reuse ANTLR grammar)
    • code completion
    • figure out how to pass number of processes to turbine command when a kernel starts
  • ls command to view the state of all global vars. Use the global_map (a Tcl dict) that gets created whenever create_globals is called (Tim's new commits). Also depends on the task below.
  • only create TDs once (on worker 0), then create globals_map on all workers (put tasks), and set the variable references on all workers (put tasks).
  • double check the multicreate permanent thing.
  • make STC use multicreate_repl instead of declare_globals when -c is passed.
  • the local references to the global TDs are being lost at some point along the way
  • stdout doesn't work when multiple turbine workers are used (even on worker 0).
  • double-check the put task priority thing (make sure the "put to all workers" gives max priority)
  • more formal definition of use cases. Look at things like IPython and how user interacts. Also look at in situ workflows for HPC.
  • proper solutions for the four builtin location globals in STC, so that object files work.
  • having the REPL (an infinite loop) running on rank 0 means that worker is unable to do any work besides the REPL, so there needs to be more than one worker (3 ranks minimum). Is this a problem?
  • ps command to look at status of all ranks
    • possibly also display running/waiting tasks

Notes on IPython and Jupyter:

  • Look at Notebooks. Text files which have embeddable/runnable chunks of code that produce output (perhaps even visual). So maybe instead of a REPL, it's more like a notebook (like Mathematica).
  • "IPython has abstracted and extended the notion of a traditional Read-Evaluate-Print Loop (REPL) environment by decoupling the evaluation into its own process. We call this process a kernel: it receives execution instructions from clients and communicates the results back to them. This decoupling allows us to have several clients connected to the same kernel, and even allows clients and kernels to live on different machines."
  • look at the list of "magics" that IPython uses (meta commands like my %ls).
  • look at the simpler wrapper kernels for Python, and possibly use something like Pexpect to start/control the Turbine session
  • look at how other compiled languages implement REPL, like cling
  • look at GNU Readline as an easier basic alternative
  • IPython has a lot of cool stuff on parallel computing
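One way to prototype the wrapper-kernel idea is to drive the interactive session as a child process over pipes; Pexpect would handle the pty details, but a rough standard-library sketch looks like this. The `%END` sentinel comes from the notes above; the argv, the echo-back convention, and all names here are assumptions:

```python
import subprocess

class SessionWrapper:
    """Sketch of a wrapper that starts an interactive session (e.g. a
    turbine REPL) as a subprocess and feeds it snippets over stdin."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered text pipes
        )

    def execute(self, code: str, sentinel: str = "%END") -> str:
        # Send the snippet followed by the end-of-input marker, then
        # collect output lines until the session echoes the marker back.
        self.proc.stdin.write(code + "\n" + sentinel + "\n")
        self.proc.stdin.flush()
        out = []
        for line in self.proc.stdout:
            if line.strip() == sentinel:
                break
            out.append(line)
        return "".join(out)

    def shutdown(self):
        # Close stdin so the child sees EOF, then terminate and reap it
        # (this is the "fix kernel shutdown / zombie processes" task).
        self.proc.stdin.close()
        self.proc.terminate()
        self.proc.wait()
```

A real kernel would call something like `execute()` from `do_execute()` and stream the result back to the Jupyter client.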