
8.1 Bag of Visual Words

by 쵸빙 2020. 7. 30.

In this lecture, we are going to learn about 'Bag of Visual Words'.

 

 

 

classification pipeline

In the last lecture, we learned what the classical image classification pipeline looks like.

 

What object do these parts belong to?

 

 

An object consists of a collection of local features (a bag of features).

Some local features are very informative.

This representation copes well with occlusion and with changes in scale and rotation.

 

 

 

spatial information can be ignored for object recognition

The spatial information of local features can be ignored for object recognition (i.e., verification).

 

CalTech6 dataset

The 'bag of features' approach works quite well for image-level classification.

It represents an image as a histogram over visual features.
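The histogram idea above can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: it assumes a codebook of visual words already exists (for example, cluster centers learned from training descriptors), and the names `codebook`, `descriptors`, `nearest_word`, and `bovw_histogram` as well as the toy 2-D values are hypothetical.

```python
from collections import Counter

def squared_distance(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_word(descriptor, codebook):
    # Index of the visual word closest to this local descriptor.
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))

def bovw_histogram(descriptors, codebook):
    # Count how many local descriptors fall on each visual word.
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts[k] for k in range(len(codebook))]

# Hypothetical toy data: 3 visual words and 4 local descriptors in 2-D.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (4.8, 5.1), (1.1, 0.8)]

histogram = bovw_histogram(descriptors, codebook)
print(histogram)  # [1, 2, 1]
```

Note that the positions of the descriptors in the image never enter the computation, which is exactly the sense in which spatial information is ignored.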

 

 

Textons

Texture is characterized by the repetition of basic elements, or textons.

 

For stochastic textures, it is the identity of the textons, not their spatial arrangement, that matters.

 

histograms of textons

Textures can be represented as histograms over textons.

Histogram representations are convenient for data retrieval.
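One way to see why histograms are convenient for retrieval: every item becomes a fixed-length vector, so ranking a database against a query reduces to comparing vectors. The sketch below is only an assumption about how this might look. The L1 distance between normalized histograms is an arbitrary choice made for illustration, and the texture names and counts are invented toy data.

```python
def normalize(hist):
    # Turn raw counts into relative frequencies so totals do not matter.
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)

def l1_distance(p, q):
    # Sum of absolute differences between two histograms.
    return sum(abs(a - b) for a, b in zip(p, q))

def retrieve(query, database, top_k=2):
    # Return the names of the top_k entries closest to the query histogram.
    q = normalize(query)
    scored = sorted(database.items(),
                    key=lambda item: l1_distance(q, normalize(item[1])))
    return [name for name, _ in scored[:top_k]]

# Hypothetical texton histograms (3 textons) for three textures.
database = {
    "brick": [8, 1, 1],
    "grass": [1, 7, 2],
    "gravel": [3, 3, 4],
}
query = [7, 2, 1]

ranked = retrieve(query, database)
print(ranked)  # ['brick', 'gravel']
```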

 

 

Vector Space Model

The Vector Space Model (a.k.a. bag-of-words) is commonly used in NLP.

A document (datapoint) is a vector of counts over each word (feature).

 

v_d is the histogram over words for document d.

n( · ) counts the number of occurrences of a word in the document.
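As a small illustration of the count vector, the sketch below builds v_d for one document over a fixed vocabulary. It assumes the vector has one entry n(w, d) per vocabulary word w. The vocabulary, the sample sentence, the whitespace tokenization, and the function name `count_vector` are all hypothetical choices for this example.

```python
from collections import Counter

def count_vector(document, vocabulary):
    # n(w, d): number of times each vocabulary word w occurs in document d.
    # Tokenization here is a plain lowercase whitespace split (an assumption).
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical vocabulary and document.
vocabulary = ["tartan", "robot", "vision", "learning"]
doc = "Vision robot vision learning"

v_d = count_vector(doc, vocabulary)
print(v_d)  # [0, 1, 2, 1]
```

Words outside the vocabulary are simply dropped, and word order plays no role, which mirrors how the visual version discards spatial layout.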

 

To measure the similarity between two documents, we can use any distance, but the cosine distance is fast to compute.

Cosine Distance
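The comparison can be sketched as follows. The lecture's figure is not reproduced here, so this assumes the common convention: cosine similarity is the dot product of the two vectors divided by the product of their lengths, and cosine distance is one minus that value. The function names and the handling of all-zero vectors are assumptions made for this example.

```python
import math

def cosine_similarity(u, v):
    # cos(theta) = (u . v) / (||u|| * ||v||)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: treat an empty histogram as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)

def cosine_distance(u, v):
    # Assumed convention: distance = 1 - similarity.
    return 1.0 - cosine_similarity(u, v)

# Two hypothetical word-count vectors over the same vocabulary.
v_i = [0, 1, 2, 1]
v_j = [0, 2, 4, 2]   # same proportions, twice as long
v_k = [3, 0, 0, 0]   # no words in common with v_i

print(cosine_distance(v_i, v_j))  # close to 0: same direction
print(cosine_distance(v_i, v_k))  # 1.0: orthogonal vectors
```

Because only the angle matters, a document and a longer one with the same word proportions end up at distance close to zero, and the whole computation is a single dot product plus two norms.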

 

In the next lecture, we are going to learn about 'BoW Classification'.
