Jupyterhub Tutorial Readthedocs Io en Latest
Tutorial Documentation
Release 1.0
Project Jupyter
1 JupyterHub References
1.1 JupyterHub Cheatsheet
1.2 Timeline of tutorial video
2 Tutorial notebooks
2.1 Getting Started with JupyterHub Tutorial
2.2 Custom Authenticators
2.3 JupyterHub Spawners
2.4 JupyterHub’s API
Getting Started with JupyterHub Tutorial Documentation, Release 1.0
CHAPTER 1
JupyterHub References
Project
JupyterHub
• GitHub
• Documentation online
• PDF download
Tutorial
Authenticators
Spawners
Batchspawner - “Custom Spawner for Jupyterhub to start servers in batch scheduled systems” https://github.com/jupyterhub/batchspawner
Dockerspawner - “Enables JupyterHub to spawn user servers in docker containers.” https://github.com/jupyterhub/dockerspawner
Sudospawner - “enables JupyterHub to spawn single-user servers without being root, by spawning an intermediate process via sudo, which takes actions on behalf of the user” https://github.com/jupyterhub/sudospawner
Proxy
Deployment examples
0:04:17 JupyterHub
0:05:41 Login
0:05:55 Spawner
0:06:27 Proxy
0:07:11 Redirect user
0:07:17 Browser to ask hub for auth
Additional reading for overview: http://jupyterhub.readthedocs.io/en/latest/getting-started.html#overview
1.2.5 Authenticators
• Client Secret
• ./env -> export the variables
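The `./env` step exports the OAuth credentials into the shell before the Hub is started. A minimal pre-flight check could look like the sketch below. It assumes the variable names `GITHUB_CLIENT_ID`, `GITHUB_CLIENT_SECRET` and `OAUTH_CALLBACK_URL`, which should be verified against the installed oauthenticator version; `missing_oauth_vars` is a hypothetical helper and not part of JupyterHub.

```python
import os

# Assumed variable names; check them against your oauthenticator version.
REQUIRED_VARS = ("GITHUB_CLIENT_ID", "GITHUB_CLIENT_SECRET", "OAUTH_CALLBACK_URL")


def missing_oauth_vars(environ=None):
    """Return the required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_oauth_vars()
    if missing:
        print("Not exported yet:", ", ".join(missing))
    else:
        print("All OAuth variables are set.")
```

Running this before launching the Hub catches a forgotten `source ./env` early.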
0:30:44 Tell Jupyter to use oauthenticator
0:32:48 Sign in with GitHub
0:34:20 Specifying users
• PAM ok
• GitHub probably not ok
• user whitelist - put in a Python set in the config file
• admin users - put in a Python set in the config file
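Both settings are plain Python sets, so membership checks are all that is needed. The sketch below is standalone and only illustrates the idea; in `jupyterhub_config.py` the sets would be assigned to the Authenticator's whitelist and admin-user options instead. The user names and the helpers `is_allowed` and `is_admin` are hypothetical.

```python
# Hypothetical user sets; in jupyterhub_config.py these would be assigned to
# the Authenticator's whitelist and admin_users options.
whitelist = {"alice", "bob"}
admin_users = {"alice"}


def is_allowed(username, whitelist):
    """Assumption in this sketch: an empty whitelist admits any authenticated user."""
    if not whitelist:
        return True
    return username in whitelist


def is_admin(username, admin_users):
    """Admins are simply the members of the admin set."""
    return username in admin_users
```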
0:36:26 JupyterHub Custom Authenticators
• PAM - form-based, fairly simple
• Secure Authenticator
• JupyterHub salted hashing functions
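The last bullet refers to salted password hashing. The sketch below shows the general technique with the standard library only. It is not JupyterHub's own implementation, and the iteration count and helper names are assumptions made for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is drawn when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Storing the salt next to the digest means two users with the same password still get different stored values.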
Tutorial notebooks
2.1.1 Resources
• JupyterHub Documentation
• the PDF of PyData London 2016 slidedeck
• the video on YouTube of PyData London 2016 tutorial
• Timeline of video
Introduction
Overview of JupyterHub
0:04:17 JupyterHub
0:05:41 Login
0:05:55 Spawner
0:06:27 Proxy
Installation
JupyterHub Defaults
Configuration of Hub
Authentication
Custom Authenticators
Spawners - DockerSpawner
Custom Spawners
Deployment
Wrap Up
1:11:00 Q & A
1:13:00 Simula deployment with persistence in Hub
        Args:
            handler (tornado.web.RequestHandler): the current request handler
            data (dict): The formdata of the login form.
                         The default form has 'username' and 'password' fields.
        Returns:
            username (str or None): The username of the authenticated user,
                                    or None if Authentication failed
File: ~/dev/jpy/jupyterhub/jupyterhub/auth.py
Type: function
PAM calls out to a library with the given username and password:
In [3]: PAMAuthenticator.authenticate??
Signature: PAMAuthenticator.authenticate(self, handler, data)
Source:
    @gen.coroutine
    def authenticate(self, handler, data):
        """Authenticate with PAM, and return the username if login is successful.
        """
        username = data['username']
        try:
            pamela.authenticate(username, data['password'], service=self.service)
        except pamela.PAMError as e:
            if handler is not None:
                self.log.warning("PAM Authentication failed (%s@%s): %s", username, handler.request.r
            else:
                self.log.warning("PAM Authentication failed: %s", e)
        else:
            return username
File: ~/dev/jpy/jupyterhub/jupyterhub/auth.py
Type: function
Here’s a super advanced Authenticator that does very secure password verification:
In [4]: class SuperSecureAuthenticator(Authenticator):
            def authenticate(self, handler, data):
                username = data['username']
                # check password:
                if data['username'] == data['password']:
                    return username
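The contract is small: `authenticate` receives the login form data and returns the username on success or `None` on failure. The sketch below is slightly less permissive than the example above. It is written as a plain class so that it runs without a Hub; in a deployment it would subclass `Authenticator` instead. `DictAuthenticator` and its `passwords` table are hypothetical, and the plaintext passwords are for illustration only.

```python
import hmac


class DictAuthenticator:
    """Stand-in for an Authenticator subclass backed by a fixed user table."""

    # Hypothetical credentials; never store plaintext passwords in practice.
    passwords = {"alice": "wonderland", "bob": "builder"}

    def authenticate(self, handler, data):
        username = data["username"]
        expected = self.passwords.get(username)
        if expected is None:
            return None  # unknown user
        # Constant-time comparison of the submitted and expected password.
        if hmac.compare_digest(expected.encode("utf-8"), data["password"].encode("utf-8")):
            return username
        return None  # wrong password
```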
2.2.1 Exercise:
    - load_state
    - get_state
    - start
    - stop
    - poll
    """
    db = Any()
    user = Any()
    hub = Any()
    authenticator = Any()
    api_token = Unicode()
    ip = Unicode('127.0.0.1',
        help="The IP address (or hostname) the single-user server should listen on"
    ).tag(config=True)
    start_timeout = Integer(60,
        help="""Timeout (in seconds) before giving up on the spawner.
        This is the timeout for start to return, not the timeout for the server to respond.
        Callers of spawner.start will assume that startup has failed if it takes longer than this.
        start should return when the server process is started and its location is known.
        """
    ).tag(config=True)
    http_timeout = Integer(30,
        help="""Timeout (in seconds) before giving up on a spawned HTTP server
        Once a server has successfully been spawned, this is the amount of time
        we wait before assuming that the server is unable to accept
        connections.
        """
    ).tag(config=True)
    poll_interval = Integer(30,
        help="""Interval (in seconds) on which to poll the spawner."""
    ).tag(config=True)
    _callbacks = List()
    _poll_callback = Any()
    debug = Bool(False,
        help="Enable debug-logging of the single-user server"
    ).tag(config=True)
For example:
        This should coerce form data into the structure expected by self.user_options,
        which must be a dict.
        Instances will receive this data on self.user_options, after passing through this function,
        prior to `Spawner.start`.
        """
        return form_data
    env_keep = List([
        'PATH',
        'PYTHONPATH',
        'CONDA_ROOT',
        'CONDA_DEFAULT_ENV',
        'VIRTUAL_ENV',
        'LANG',
        'LC_ALL',
    ],
        help="Whitelist of environment variables for the subprocess to inherit"
    ).tag(config=True)
    env = Dict(help="""Deprecated: use Spawner.get_env or Spawner.environment
    environment = Dict(
        help="""Environment variables to load for the Spawner.
    cmd = Command(['jupyterhub-singleuser'],
        help="""The command used for starting notebooks."""
    ).tag(config=True)
    args = List(Unicode(),
        help="""Extra arguments to be passed to the single-user server"""
    ).tag(config=True)
    notebook_dir = Unicode('',
        help="""The notebook directory for the single-user server
    default_url = Unicode('',
        help="""The default URL for the single-user server.
    disable_user_config = Bool(False,
        help="""Disable per-user configuration of single-user servers.
        See Also
        --------
        get_state, clear_state
        """
        pass

    def get_state(self):
        """store the state necessary for load_state
        Returns
        -------
        state: dict
            a JSONable dict of state
        """
        state = {}
        return state

    def clear_state(self):
        """clear any state that should be cleared when the process stops
        State that should be preserved across server instances should not be cleared.
    def get_env(self):
        """Return the environment dict to use for the Spawner.
        env['JPY_API_TOKEN'] = self.api_token
        return env

    def get_args(self):
        """Return the arguments to be passed after self.cmd"""
        args = [
            '--user=%s' % self.user.name,
            '--port=%i' % self.user.server.port,
            '--cookie-name=%s' % self.user.server.cookie_name,
            '--base-url=%s' % self.user.server.base_url,
            '--hub-host=%s' % self.hub.host,
            '--hub-prefix=%s' % self.hub.server.base_url,
            '--hub-api-url=%s' % self.hub.api_url,
        ]
        if self.ip:
            args.append('--ip=%s' % self.ip)
        if self.notebook_dir:
            self.notebook_dir = self.notebook_dir.replace("%U",self.user.name)
            args.append('--notebook-dir=%s' % self.notebook_dir)
        if self.default_url:
            self.default_url = self.default_url.replace("%U",self.user.name)
            args.append('--NotebookApp.default_url=%s' % self.default_url)
        if self.debug:
            args.append('--debug')
        if self.disable_user_config:
            args.append('--disable-user-config')
        args.extend(self.args)
        return args
    @gen.coroutine
    def start(self):
        """Start the single-user process"""
        raise NotImplementedError("Override in subclass. Must be a Tornado gen.coroutine.")

    @gen.coroutine
    def stop(self, now=False):
        """Stop the single-user process"""
        raise NotImplementedError("Override in subclass. Must be a Tornado gen.coroutine.")

    @gen.coroutine
    def poll(self):
        """Check if the single-user process is running

    def stop_polling(self):
        """stop the periodic poll"""
        if self._poll_callback:
            self._poll_callback.stop()
            self._poll_callback = None

    def start_polling(self):
        """Start polling periodically
        Explicit termination via the stop method will not trigger the callbacks.
        """
        if self.poll_interval <= 0:
            self.log.debug("Not polling subprocess")
            return
        else:
            self.log.debug("Polling subprocess every %is", self.poll_interval)
        self.stop_polling()
        self._poll_callback = PeriodicCallback(
            self.poll_and_notify,
            1e3 * self.poll_interval
        )
        self._poll_callback.start()
    @gen.coroutine
    def poll_and_notify(self):
        """Used as a callback to periodically poll the process,
        and notify any watchers
        """
        status = yield self.poll()
        if status is None:
            # still running, nothing to do here
            return
        self.stop_polling()

    death_interval = Float(0.1)

    @gen.coroutine
    def wait_for_death(self, timeout=10):
        """wait for the process to die, up to timeout seconds"""
        for i in range(int(timeout / self.death_interval)):
            status = yield self.poll()
            if status is not None:
                break
            else:
                yield gen.sleep(self.death_interval)
File: ~/conda/envs/jupyterhub-tutorial/lib/python3.5/site-packages/jupyterhub/spawner.py
Type: MetaHasTraits
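The three methods a subclass must override are `start`, `stop` and `poll`. The sketch below mirrors that lifecycle with the standard library's `subprocess` module. It assumes the convention that `poll` returns `None` while the process is running and an exit status once it has ended. `ToyProcess` is a hypothetical, synchronous stand-in; real Spawner methods are Tornado coroutines, and reporting `0` before anything was started is a choice made for this sketch.

```python
import subprocess
import sys
import time


class ToyProcess:
    """Synchronous stand-in illustrating the start/poll/stop lifecycle."""

    def __init__(self, cmd):
        self.cmd = cmd
        self.proc = None

    def start(self):
        """Launch the process and return its pid."""
        self.proc = subprocess.Popen(self.cmd)
        return self.proc.pid

    def poll(self):
        """Return None while running, otherwise an exit status."""
        if self.proc is None:
            return 0  # nothing was started, so nothing is running
        return self.proc.poll()

    def stop(self, timeout=10):
        """Terminate the process if it is still running."""
        if self.poll() is None:
            self.proc.terminate()
            self.proc.wait(timeout=timeout)

    def wait_for_death(self, timeout=10, interval=0.1):
        """Poll until the process has exited or the timeout elapses."""
        for _ in range(int(timeout / interval)):
            if self.poll() is not None:
                return True
            time.sleep(interval)
        return False


if __name__ == "__main__":
    toy = ToyProcess([sys.executable, "-c", "import time; time.sleep(30)"])
    toy.start()
    print("running:", toy.poll() is None)
    toy.stop()
    print("stopped:", toy.wait_for_death(timeout=1))
```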
Start is the key method in a Spawner. It’s how we decide how to start the process that will become the single-user
server:
In [3]: LocalProcessSpawner.start??
Signature: LocalProcessSpawner.start(self)
Source:
    @gen.coroutine
    def start(self):
        """Start the process"""
        if self.ip:
            self.user.server.ip = self.ip
        self.user.server.port = random_port()
        cmd = []
        env = self.get_env()
        cmd.extend(self.cmd)
        cmd.extend(self.get_args())
File: ~/conda/envs/jupyterhub-tutorial/lib/python3.5/site-packages/jupyterhub/spawner.py
Type: function
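The listing shows `start` assembling the final command line from `self.cmd` followed by `self.get_args()`. The sketch below reproduces that assembly for a subset of the flags, including the `%U` substitution that `get_args` applies to the notebook directory. `build_command` is a hypothetical standalone function, not part of JupyterHub.

```python
def build_command(cmd, user_name, port, notebook_dir="", extra_args=()):
    """Concatenate the base command with per-user arguments."""
    args = [
        "--user=%s" % user_name,
        "--port=%i" % port,
    ]
    if notebook_dir:
        # %U is replaced by the user's name, as in get_args
        args.append("--notebook-dir=%s" % notebook_dir.replace("%U", user_name))
    args.extend(extra_args)
    return list(cmd) + args
```

Keeping the command as a list rather than a single string avoids shell quoting problems when it is handed to a process launcher.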
Here is an example of a spawner that allows specifying extra arguments to pass to a user’s notebook server, via
.options_form. It results in a form like this:
class DemoFormSpawner(LocalProcessSpawner):
    @default('options_form')
    def _options_form(self):
        default_env = "YOURNAME=%s\n" % self.user.name
        return """
        <label for="args">Extra notebook CLI arguments</label>
        <input name="args" placeholder="e.g. --debug"></input>
        """.format(env=default_env)

    def get_args(self):
        """Return arguments to pass to the notebook server"""
        argv = super().get_args()
        if self.user_options.get('argv'):
            argv.extend(self.user_options['argv'])
        return argv

    def get_env(self):
        """Return environment variable dict"""
        env = super().get_env()
        return env
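`get_args` reads `self.user_options['argv']`, while the form field is named `args`, so something has to translate one into the other. That is the job of `options_from_form`. The sketch below shows one way to do the translation, assuming the form data arrives as a dict mapping field names to lists of strings. `parse_args_field` is a hypothetical standalone function; in a spawner the same body would live in an `options_from_form` override.

```python
import shlex


def parse_args_field(formdata):
    """Turn the submitted 'args' text field into a user_options dict."""
    options = {}
    raw = formdata.get("args", [""])[0].strip()
    if raw:
        # shlex.split honours shell-style quoting, e.g. --opt="a b"
        options["argv"] = shlex.split(raw)
    return options
```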
2.3.1 Exercise:
Write a custom Spawner that allows users to specify environment variables to load into their server.
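One possible starting point for the parsing half of this exercise is sketched below. It assumes a textarea field named `env` (a hypothetical name) holding one `KEY=VALUE` pair per line, and that the form data arrives as a dict of lists of strings. The resulting dict would then be merged into the result of `get_env` in the spawner subclass, which is left to the reader.

```python
def parse_env_field(formdata):
    """Parse KEY=VALUE lines from the 'env' form field into a dict."""
    env = {}
    for line in formdata.get("env", [""])[0].splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env
```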
"""
        container = yield self.get_container()
        if container is None:
            image = image or self.container_image
            if not self.use_internal_ip:
                host_config['port_bindings'] = {8888: (self.container_ip,)}
            host_config.update(self.extra_host_config)
            if extra_host_config:
                host_config.update(extra_host_config)
            host_config = self.client.create_host_config(**host_config)
            create_kwargs.setdefault('host_config', {}).update(host_config)
        else:
            self.log.info(
                "Found existing container '%s' (id: %s)",
                self.container_name, self.container_id[:7])
File: ~/conda/envs/jupyterhub-tutorial/lib/python3.5/site-packages/dockerspawner/dockerspawner.p
Type: function
2.3.2 Exercise:
Subclass DockerSpawner so that users can specify via options_form what docker image to use.
Candidates from the Jupyter docker-stacks repo include:
• jupyter/minimal-singleuser
• jupyter/scipy-singleuser
• jupyter/r-singleuser
• jupyter/datascience-singleuser
• jupyter/pyspark-singleuser
Or, build your own images with
FROM jupyterhub/singleuser
The easiest version will assume that the images are fetched already.
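A sketch of the selection logic for this exercise follows. It assumes a form field named `image` (a hypothetical name) and restricts choices to the images listed above, so that users cannot start arbitrary containers. How the chosen image is passed on to DockerSpawner depends on the installed version and is deliberately left out; `choose_image` is a hypothetical helper.

```python
ALLOWED_IMAGES = (
    "jupyter/minimal-singleuser",
    "jupyter/scipy-singleuser",
    "jupyter/r-singleuser",
    "jupyter/datascience-singleuser",
    "jupyter/pyspark-singleuser",
)


def choose_image(formdata, default="jupyterhub/singleuser"):
    """Return the requested image if it is allowed, else the default."""
    requested = formdata.get("image", [""])[0].strip()
    if not requested:
        return default
    if requested not in ALLOWED_IMAGES:
        raise ValueError("Image %r is not in the allowed list" % requested)
    return requested
```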
Subclass DockerSpawner so that users can specify via options_form a GitHub repository to clone and install, a la
binder.