Next Year

Objectives

  • What AI solutions can be used on what problems?
  • Processing large datasets?
  • Model debugging strategies?

Social impact:

  • Feedback loops
  • Biases

Meta:

  • Make it easy for students to predict the class structure.
  • Make it easy for me to give feedback.

Knowledge

  • Kinds of AI systems
  • Components of systems
  • Mathematical basis (what is backprop, how it works, etc.)
  • How outputs depend on data
  • Worldviews of data ethics
  • History and trajectory of AI?

Skills

  • Select an approach to solve a problem
  • Map problem characteristics onto approach elements
  • Implement approach in code

Dispositions

  • Treat data as both valuable and flawed
  • Treat people as more than just data sources or targets to manipulate
  • Investigate where data came from
    • "you credit the source, but the source doesn't credit their sources at all. So we can't acknowledge the artists who created these skins (thankfulness, humility), have any recourse if some of the skins are inappropriate (accountability), or easily add new skins (replicability)."
  • Think through how a system will affect people.
  • Think about what the data doesn't include, what the algorithm ignores or de-emphasizes.
  • Repeat experiments.
  • Notice and document the decisions you're making.
  • Seek understanding of what a system gets wrong (rather than just reporting numbers)
  • Try variations
  • Experiment-tracking mechanism? Model storage?

High level

  • Start with applications. Gradually deepen the code and concepts.
  • Intentionally overlap application areas and even specific datasets between DATA 202 and CS 344.
  • The material on interpretability and fairness/bias could have come much earlier in the semester.
  • Specifically connect with Reformed tradition

Projects

  • Remind students about the drivetrain approach for projects
  • Ask for a minimal deliverable and a stretch deliverable.
  • Check with Pat and with 396/398 about whether there are elements of project structure that we could deliberately practice.

Materials

  • The fast.ai textbook doesn't encourage spending enough time with the data (EDA)
  • Use more of the HuggingFace stuff instead? Sequence modeling?
  • Is there a canonical simple thing to do with RL? Perhaps we could start with 3 weeks on applications, then spiral into details?

Logistics

Simplify and reduce the number of platforms (the most common student feedback!)

  • Don't bother auto-pushing content to student repos.
  • Use a single compute platform (Colab?)
  • Use a single submission platform (Moodle?)
  • Projects: meet earlier, get more specific
  • AI Lab Hours

Analyze a system for two weeks:

  1. First week: what problem does it solve, and how?
  2. Second week: social and ethical perspectives. Make sure we articulate both sides clearly.

Possible systems

  • Facebook news feed
  • YouTube recommender
  • Facial recognition
  • Prediction of life events (recidivism, loans, admissions, insurance)
  • Google search
  • Predictive text keyboard
  • Text to speech and vice versa
  • GAN
  • Background blurring

Potential Resources

Potential Activities

  • Play with latent space / generation (projection, latent space math, ...); see the sketch after this list
  • Vision to text encoder decoder—with paraphrase setup both ways
  • Write an explanation, share it with your mom, describe how it went, improve the explanation.
  • Actually have a small-group discussion about a perspectival topic (deliberately trying to express a range of perspectives) and summarize it (record it?).
  • Instead of fighting with querying Bing (ch2), just collect a small set of images by hand / from personal photo library
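One way the "latent space math" part of that activity could look: a minimal numpy sketch, where the latent codes and the `decoder()` call are placeholders standing in for a real pretrained generator (not any particular library's API).

```python
import numpy as np

# Stand-in latent codes: in class these would come from a real pretrained
# generator (e.g., a VAE or GAN); here they're just random vectors.
rng = np.random.default_rng(0)
z_a = rng.normal(size=128)  # latent code for image A (placeholder)
z_b = rng.normal(size=128)  # latent code for image B (placeholder)

# Linear interpolation: walk from A to B in latent space.
for t in np.linspace(0.0, 1.0, 5):
    z_mix = (1 - t) * z_a + t * z_b
    # With a real model we'd render it: image = decoder(z_mix)  # decoder() is hypothetical
    print(f"t={t:.2f}  ||z_mix|| = {np.linalg.norm(z_mix):.2f}")

# "Latent space math": analogy-style edits by adding a direction vector,
# e.g., z_edited = z_a + alpha * (z_smiling_mean - z_neutral_mean).
z_direction = rng.normal(size=128)  # placeholder for a learned direction
z_edited = z_a + 0.5 * z_direction
```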

Partially prepped materials I could return to

  • Translation encoder-decoder (was Lab 5)
  • Trebuchet
  • Team Elo
  • Transformers from scratch
  • (see my Colab for more)

Potential Fundamentals

  • Logistic regression as a NN (the same thing as linear regression, just a different loss and activation). Explicitly draw the connections between the sklearn approach and this one; see the sketches after this list.
  • Overfitting a batch (as a debugging check; sketch after this list)
  • Tabular models with fastai
  • CNN model input/output shapes; last-layer representation shape
  • Some sort of VAE
  • LM on image tokens
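A possible sketch for drawing the sklearn ↔ NN connection, assuming sklearn and PyTorch and a toy dataset: the model is the same single linear layer as linear regression, with only the loss/activation swapped.

```python
import torch
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy binary-classification data.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# sklearn's view: a fitted linear model with a sigmoid squashing the output.
sk = LogisticRegression().fit(X, y)

# NN view: the same linear layer (weights + bias, just like linreg).
Xt = torch.tensor(X, dtype=torch.float32)
yt = torch.tensor(y, dtype=torch.float32)
model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.BCEWithLogitsLoss()  # sigmoid + cross-entropy

for _ in range(500):
    opt.zero_grad()
    logits = model(Xt).squeeze(1)
    # With MSELoss and no sigmoid, this loop would be plain linear regression.
    loss = loss_fn(logits, yt)
    loss.backward()
    opt.step()

# The learned weights should be in the same ballpark as sklearn's
# (not identical: sklearn applies L2 regularization by default).
print(sk.coef_, sk.intercept_)
print(model.weight.detach().numpy(), model.bias.detach().numpy())
```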
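And a minimal sketch of the "overfitting a batch" debugging check, assuming PyTorch and fake data: if the wiring (labels, loss, optimizer) is right, the loss on one small batch should drop to near zero.

```python
import torch

# One small batch of fake data: a correctly wired model/loss/optimizer
# should be able to memorize it almost perfectly.
xb = torch.randn(16, 10)
yb = torch.randint(0, 3, (16,))

model = torch.nn.Sequential(
    torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 3))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = torch.nn.CrossEntropyLoss()

for step in range(300):
    opt.zero_grad()
    loss = loss_fn(model(xb), yb)
    loss.backward()
    opt.step()

# Expect a loss near 0; if it plateaus, something upstream is broken
# (labels, loss function, learning rate, frozen parameters, ...).
print(f"final loss on the single batch: {loss.item():.4f}")
```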

Low-Level Stuff

  • More visualization tools (see the sketch after this list).
    • Loss plots
    • Data exploration, e.g., class distribution
    • A "quick snapshot" of a dataset?
      • examples, clustered using pretrained embeddings?
    • Visualizing the results of learning:
  • Batch size: with a smaller batch we can fit a bigger model in memory.
  • sklearn 011: Make sure to ask about the validation set (see Andrew Feikema's submission)
  • Reporting: clearer delineation of problem statement vs student answer.
  • More data wrangling?
  • Clarify to students what "limitations" means (e.g., "time to finish the project" is not a limitation)
  • If using Colab: share the working directory with me? Use underscores instead of dashes in filenames.
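A rough sketch of what the class-distribution and loss-plot pieces could look like, with placeholder data standing in for a real dataset and training run (matplotlib only; the clustered-embeddings "snapshot" idea would need a real pretrained model on top of this).

```python
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: in class these would come from the real dataset and run.
labels = np.random.default_rng(0).choice(
    ["cat", "dog", "bird"], size=300, p=[0.6, 0.3, 0.1])
train_losses = np.exp(-np.linspace(0, 3, 50)) + 0.05
val_losses = np.exp(-np.linspace(0, 3, 50)) + 0.15

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Class distribution: a quick check for imbalance before training.
classes, counts = np.unique(labels, return_counts=True)
ax1.bar(classes, counts)
ax1.set_title("Class distribution")

# Loss plot: training vs. validation curves over epochs.
ax2.plot(train_losses, label="train")
ax2.plot(val_losses, label="val")
ax2.set_xlabel("epoch")
ax2.set_ylabel("loss")
ax2.set_title("Loss curves")
ax2.legend()

plt.tight_layout()
plt.show()
```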

Project Debrief

  • For a team project: what was each member's role?
  • What's your 60-second self-reflection on your performance on the project?
  • Ask students to describe their process: what was challenging, and how did they overcome it?