Problems with Benchmarks

We’ve seen the possible problem of overfitting
- remember machine learning benchmarks?

Two common approaches are used
- benchmark libraries
  - should include hard problems and expand over time
- random problems
  - should include problems believed to be hard
  - allow unlimited test sets to be constructed
  - prevent “cheating” by hardwiring algorithms to known instances
  - so what’s the problem?
