Social Prediction Tasks for Machine Learning

Building a good challenge problem is a necessary condition for harnessing the effort of the machine learning community.

There are currently good benchmark tasks in images, speech, and computer games, but almost none with strong social components.

The Problem

Machine learning benchmarks need:

  • Clear metrics
  • Large datasets
  • Tasks where progress is measurable

Social prediction is hard to benchmark because:

  • Ground truth is contested
  • Social systems react to predictions about them
  • Outcomes often take years to materialize

Potential Social Tasks

Tasks that use the future as the held-out set:

Academic paper outcomes

  • Reproduction success (re-running the original data and code)
  • Replication outcomes (repeating the study on new data)
  • Retraction prediction (a temporal-split sketch follows this list)
  • Citation trajectories
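
A minimal sketch of what "the future as the held-out set" means for these paper-outcome tasks. Everything here is a placeholder assumption: the file name, column names, cutoff years, features, and the logistic-regression baseline are hypothetical stand-ins, not a real dataset or method:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical dataset: one row per paper, with an outcome column that has
# resolved by now ("retracted_within_5y"). All names are placeholders.
papers = pd.read_csv("papers.csv")

# Strictly chronological split: train on papers old enough that the outcome
# is known, test on a later cohort. No shuffling, so no peeking at the future.
train = papers[papers["year"] <= 2012]
test = papers[(papers["year"] > 2012) & (papers["year"] <= 2015)]

features = ["n_authors", "journal_impact_factor", "n_self_citations"]  # placeholders
model = LogisticRegression(max_iter=1000)
model.fit(train[features], train["retracted_within_5y"])

probs = model.predict_proba(test[features])[:, 1]
print("AUC on the future cohort:", roc_auc_score(test["retracted_within_5y"], probs))
```

The key design choice is that the split is strictly chronological: the test cohort lies entirely in the model's future, so no resolved outcome can leak into training.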

Substance adoption patterns

  • Popularity of compounds on forums like Bluelight or Reddit
  • Whether and when a compound is declared illegal (scheduled)
  • Wikipedia page view trajectories (fetchable as sketched below)
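
Page-view trajectories are straightforward to pull from the public Wikimedia Pageviews REST API (coverage begins July 2015). A sketch, with the article list and date range chosen arbitrarily as examples:

```python
# Sketch: monthly page views for compound articles via the public
# Wikimedia Pageviews REST API.
import requests

API = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
       "en.wikipedia/all-access/all-agents/{article}/monthly/{start}/{end}")

def monthly_views(article, start="2015070100", end="2018010100"):
    url = API.format(article=article, start=start, end=end)
    # Wikimedia asks for a descriptive User-Agent on API requests.
    resp = requests.get(url, headers={"User-Agent": "social-ml-tasks-sketch/0.1"})
    resp.raise_for_status()
    return [(item["timestamp"], item["views"]) for item in resp.json()["items"]]

# Example articles, chosen arbitrarily; a real benchmark would need a
# curated compound list.
for article in ["MDMA", "Kratom"]:
    print(article, monthly_views(article)[:3], "...")
```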

Scientific knowledge updates

  • Which findings will be revised
  • Confidence interval shrinkage patterns (see the labeling sketch after this list)
  • Cross-field citation patterns
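
One way to make "confidence interval shrinkage" concrete as a prediction target: take successive meta-analytic estimates of the same effect and label whether the interval narrows. The numbers below are invented purely for illustration:

```python
# Turning CI shrinkage into a binary label. All numbers are invented.
estimates = [
    (2005, 0.40, 0.10, 0.70),  # (year, point estimate, ci_low, ci_high)
    (2010, 0.35, 0.15, 0.55),
    (2015, 0.32, 0.22, 0.42),
]

widths = [(year, round(hi - lo, 2)) for year, _, lo, hi in estimates]
shrank = all(widths[i + 1][1] < widths[i][1] for i in range(len(widths) - 1))
print(widths)                              # [(2005, 0.6), (2010, 0.4), (2015, 0.2)]
print("CI shrank monotonically:", shrank)  # True -> positive label
```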

Criteria for Good Social ML Tasks

  • Multiple tasks within the same domain: enables transfer learning research
  • Natural held-out sets: the future provides this for predictions
  • Clear feedback: outcomes must eventually become unambiguous (a scoring sketch follows this list)
  • Scale: enough instances to train modern models
  • Resistance to gaming: predictions shouldn’t be self-fulfilling or self-defeating in ways that make the benchmark useless
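
Once outcomes resolve, a proper scoring rule supplies the clear metric and the clear feedback at the same time. A minimal sketch using the Brier score, with made-up forecasts and outcomes:

```python
# Brier score: mean squared error between forecast probabilities and
# resolved 0/1 outcomes. All data below is made up for illustration.
def brier_score(forecasts, outcomes):
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

forecasts = [0.9, 0.2, 0.7, 0.1]  # probabilities issued at prediction time
outcomes  = [1,   0,   0,   0]    # ground truth, known only years later
print(brier_score(forecasts, outcomes))  # ~0.1375 (0 is perfect; lower is better)
```

Because the Brier score is a proper scoring rule, a model cannot improve it by hedging away from its true beliefs, which is one concrete defense against gaming.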

The Meta-Challenge

The meta-challenge is that social prediction itself changes the social system. A model that predicts which papers will be retracted might cause those papers to be scrutinized more carefully—either preventing or accelerating retractions.

This reflexivity is both a bug (for benchmarking) and a feature (for impact). Building ML benchmarks that account for this remains an open problem.

Started on January 1, 2018