Chapter 1 - Introduction to Computer Vision
Chapter Goals
- Define what computer vision is
- Understand how it connects with related disciplines
- Explore real-world applications
- Preview the book's approach: physical models, statistical methods, deep learning
Key Concepts
- Computer Vision: using algorithms to extract meaning from images and video
- Input: images or video
- Output: structured data (labels, geometry, etc.)
- Physical Modeling: optics and geometry of image formation
- Statistical Methods: inference, probability, machine learning
- Deep Learning: neural networks for recognition and perception
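The input/output idea above can be made concrete with a toy sketch. It treats a tiny hand-written grayscale grid as the "image" and returns a bounding box as the "structured data". The grid values, the threshold of 128, and the function name `bright_region_bbox` are all illustrative assumptions, not something the chapter specifies.

```python
# Toy illustration of "image in, structured data out".
# The image is a hypothetical 6x6 grayscale grid (0 = dark, 255 = bright).
image = [
    [0, 0,   0,   0, 0, 0],
    [0, 0,   0,   0, 0, 0],
    [0, 0, 200, 220, 0, 0],
    [0, 0, 210, 230, 0, 0],
    [0, 0,   0,   0, 0, 0],
    [0, 0,   0,   0, 0, 0],
]


def bright_region_bbox(img, threshold=128):
    """Return the bounding box of all pixels brighter than `threshold`.

    The result is a dict of inclusive row/column indices, or None when
    no pixel exceeds the threshold. The threshold is an arbitrary choice.
    """
    coords = [
        (r, c)
        for r, row in enumerate(img)
        for c, value in enumerate(row)
        if value > threshold
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return {
        "top": min(rows),
        "left": min(cols),
        "bottom": max(rows),
        "right": max(cols),
    }


result = bright_region_bbox(image)
print(result)
```

Real systems replace the fixed threshold with learned or model-based detectors, but the shape of the problem is the same: pixels go in, a compact description comes out.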
Real-World Applications
- Photography: panoramas, HDR, beautification
- Medical Imaging: segmentation, tumor detection
- Autonomous Vehicles: lanes, pedestrians, signs
- Augmented Reality: pose tracking, anchoring
- Surveillance: tracking, behavior analysis
- Robotics: object detection, scene understanding
Relation to Other Fields
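- Graphics: the inverse process (graphics renders images from scene descriptions; vision recovers scene descriptions from images)
- Image Processing: supplies the low-level operations, such as filtering and edge detection, that vision builds on
- AI / ML: provides the learning methods behind recognition and segmentation
- Robotics: vision supplies the visual input robots need to interact with their surroundings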
- Psychology: inspiration from human vision
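As a small example of the filtering and edge detection that image processing contributes, the sketch below applies a 3x3 Sobel kernel to a toy image containing one vertical step edge. It is written in plain Python for readability. The helper name `horizontal_gradient` and the test image are assumptions made for illustration, and a practical pipeline would typically use a library routine instead.

```python
# Sobel kernel that responds to intensity changes along the x (column) axis.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def horizontal_gradient(img):
    """Cross-correlate `img` with SOBEL_X over interior pixels only.

    No padding is used, so the output is 2 rows and 2 columns smaller
    than the input. The kernel is not flipped, so this is correlation
    rather than strict convolution.
    """
    height, width = len(img), len(img[0])
    out = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            acc = 0
            for dr in range(3):
                for dc in range(3):
                    acc += SOBEL_X[dr][dc] * img[r + dr - 1][c + dc - 1]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 4x6 image: dark on the left, brighter on the right.
step = [[0, 0, 0, 10, 10, 10] for _ in range(4)]
grad = horizontal_gradient(step)
for row in grad:
    print(row)
```

The response is zero in the flat regions and nonzero only where the window straddles the step, which is the sense in which the filter "detects" the edge.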
Challenges
- Ambiguity: a single 2D image is consistent with many different 3D scenes
- Lighting variation and occlusion
- Dataset limitations: restricted coverage and the cost of labeling
- Generalizing across domains
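The 2D-to-3D ambiguity can be shown with a minimal sketch of an ideal pinhole camera, assuming the camera sits at the origin and looks along the positive Z axis. The function name `project`, the focal length of 1.0, and the sample points are illustrative assumptions. Every 3D point along the same ray through the camera center lands on the same image coordinates, so depth cannot be read off a single projection.

```python
def project(point, focal_length=1.0):
    """Ideal pinhole projection of a 3D point (X, Y, Z) onto the image plane.

    Assumes the camera is at the origin looking along +Z, with no lens
    distortion. Returns image-plane coordinates (x, y).
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


near = (1.0, 2.0, 4.0)
far = tuple(3.0 * v for v in near)  # same ray, three times farther away

print(project(near))
print(project(far))
```

Both calls return the same image point, even though the two 3D points are far apart. Resolving this requires extra information, such as a second view, known object size, or learned priors.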
Book Approach
- Combines modeling, optimization, and learning
- Builds modularly from pixels to recognition
- Uses real-world datasets and benchmarks
- Emphasizes reproducibility and practicality
Classroom Add-on
Mini Quiz:
1. What distinguishes computer vision (CV) from image processing?
2. Name two applications that rely on geometry.
3. How is CV connected to AI?
Suggested Exercise:
- Take a photo and list the objects visible in it
- How might a computer detect and label those objects?
- What challenges might it face?