
Reducing Computational Cost

Nearest-neighbor search has O(N) complexity per query

Infeasible for large datasets

Can we speed it up?

Think of guessing a number between 1 and 10: each yes/no question
can rule out half of the remaining candidates

K-d tree
A k-d tree is a binary tree data structure for organizing a set of points
in a K-dimensional space.
Each internal node is associated with an axis-aligned hyperplane
splitting its associated points into two sub-trees.
Dimensions with high variance are chosen first.
The position of the splitting hyperplane is chosen as the mean/median of
the projected points, which yields a balanced tree.
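A minimal construction sketch in Python, following the rule above (split on the highest-variance dimension at the median); the function and field names are only illustrative, not from the slides.

```python
import numpy as np

def build_kdtree(points):
    """Recursively build a k-d tree from an (N, k) array of points.

    At each node we split on the dimension with the highest variance,
    at the median point, so the resulting tree is roughly balanced.
    """
    if len(points) == 0:
        return None

    dim = int(np.argmax(points.var(axis=0)))    # highest-variance dimension
    points = points[points[:, dim].argsort()]   # sort along that dimension
    mid = len(points) // 2                      # median index -> balanced split

    return {
        "point": points[mid],                   # point on the splitting hyperplane
        "dim": dim,
        "left": build_kdtree(points[:mid]),     # points below the hyperplane
        "right": build_kdtree(points[mid + 1:]) # points above the hyperplane
    }
```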
[Figure: example k-d tree on a set of 2D points, with splitting lines l1-l10 and the corresponding binary tree]

Images: Anna Atramentov

K-d tree construction


Simple 2D example
[Figure: step-by-step k-d tree construction on a simple 2D example with points 1-11, splitting lines l1-l10, and the resulting binary tree]

Slide credit: Anna Atramentov

K-d tree query


[Figure: the 2D example above with a query point q; the query descends the tree to the leaf cell containing q]

Slide credit: Anna Atramentov

K-d tree: Backtracking


Backtracking is necessary because the true nearest neighbor
may not lie in the cell containing the query.
But in some cases, almost all cells need to be inspected.

Figure: A. Moore
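A sketch of the query with backtracking, reusing the node dictionaries from the construction sketch above (illustrative names; Euclidean distance assumed):

```python
import numpy as np

def kdtree_nn(node, query, best=None):
    """Nearest-neighbor query with backtracking.

    Descend into the child cell containing the query first; on the way
    back up, the far child is inspected only if the splitting hyperplane
    is closer to the query than the best distance found so far.
    """
    if node is None:
        return best

    d = np.linalg.norm(query - node["point"])
    if best is None or d < best[0]:
        best = (d, node["point"])              # best (distance, point) so far

    diff = query[node["dim"]] - node["point"][node["dim"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))

    best = kdtree_nn(near, query, best)        # search the query's own cell first
    if abs(diff) < best[0]:                    # backtrack only if the hyperplane is close enough
        best = kdtree_nn(far, query, best)
    return best
```

In practice a tuned library implementation such as scipy.spatial.cKDTree is preferable; the sketch only illustrates the backtracking test.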

Do we need axis-aligned hyperplanes?


A unit normal vector r defines a hyperplane separating the space.

For any point x, define h_r(x) = sign(r · x), i.e. which side of the
hyperplane x falls on.

Hashing by Random Projections


Take random projections of data
Quantize each projection with few bits
[Figure: a feature vector is mapped to a binary code (e.g. 101), one bit per random projection]

Figure: Fergus et al.
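A minimal sketch of this scheme, assuming b random hyperplanes through the origin: each bit records the sign of one projection (names and dimensions are made up for illustration).

```python
import numpy as np

def random_projection_hash(X, R):
    """Hash each row of X (n, d) to a b-bit code using b random hyperplanes.

    R is a (b, d) matrix whose rows are the hyperplane normals r.
    Bit j is 1 if r_j . x >= 0, i.e. x lies on the positive side of
    hyperplane j, quantizing each projection to a single bit.
    """
    return (X @ R.T >= 0).astype(np.uint8)

rng = np.random.default_rng(0)
d, b = 128, 3
R = rng.standard_normal((b, d))        # random projection directions
X = rng.standard_normal((5, d))        # toy feature vectors
codes = random_projection_hash(X, R)   # e.g. a 3-bit code such as [1, 0, 1]
```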

Locality Sensitive Hashing


The basic idea behind LSH is to project the data into a
low-dimensional binary (Hamming) space; that is, each data point
is mapped to a b-bit vector, called the hash key.
Unlike normal hashing, here we want our hashes to cluster,
i.e. to create collisions between similar points.
Each hash function h must satisfy the locality-sensitive hashing
property:
    Pr[h(x_i) = h(x_j)] = sim(x_i, x_j)
where sim(x_i, x_j) ∈ [0, 1] is the similarity function.
In our case, with random-hyperplane hash functions,
sim(x_i, x_j) = 1 − θ(x_i, x_j)/π, where θ is the angle between
the two vectors.

Slide credit: Kristen Grauman et al.

M. Datar, N. Immorlica, P. Indyk, and V. Mirrokni. Locality-Sensitive Hashing Scheme Based on p-Stable Distributions. In SOCG, 2004.
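A small numerical check of the property, assuming the random-hyperplane hash family described above (helper names are hypothetical): over many random hyperplanes, the collision rate for a pair of vectors should approach 1 − θ/π.

```python
import numpy as np

rng = np.random.default_rng(1)

def hash_bit(x, r):
    """One-bit random-hyperplane hash: which side of hyperplane r is x on?"""
    return int(np.dot(r, x) >= 0)

x = rng.standard_normal(64)
y = rng.standard_normal(64)
theta = np.arccos(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

# Estimate Pr[h(x) == h(y)] over many independent random hyperplanes.
trials = 20_000
collisions = sum(hash_bit(x, r) == hash_bit(y, r)
                 for r in rng.standard_normal((trials, 64)))

print(collisions / trials)   # empirical collision probability
print(1 - theta / np.pi)     # sim(x, y) = 1 - theta/pi for this hash family
```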

Approximate Nearest-Neighbor Search


A set of N data points X_i
A hash function h_{r1...rk} maps each point to a binary hash key
Hash table: buckets indexed by keys such as 110101, 110111, 111101
A new query is hashed with the same h_{r1...rk}; we then search the
hash table only over the small set of images in its bucket (<< N)
to produce the results

[Figure: Kristen Grauman et al.]
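A sketch of the table lookup under these assumptions: each database point is stored in a dictionary keyed by its b-bit code, and a new query only scans the points in its own bucket (function names are illustrative).

```python
from collections import defaultdict
import numpy as np

def build_hash_table(X, R):
    """Index each database point under its binary hash key (as a tuple)."""
    table = defaultdict(list)
    codes = (X @ R.T >= 0).astype(np.uint8)
    for i, code in enumerate(codes):
        table[tuple(code)].append(i)
    return table

def query_hash_table(table, X, R, q):
    """Hash the query and rank only the points in its bucket (<< N)."""
    key = tuple((R @ q >= 0).astype(np.uint8))
    candidates = table.get(key, [])             # small candidate set
    return sorted(candidates, key=lambda i: np.linalg.norm(X[i] - q))
```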

Decision Trees

In the k-d tree, we looked at the data x, but not at the labels y.
How can we use the labels?

Benefits of Decision Trees


Feature selection
Interpretable result
What would the decision regions look like?
Somewhere between kNN and linear classifiers

20 Questions Game

Building a Decision Tree


1. Decide on the best attribute on which to split the data
2. Recurse on each child subset (long live recursion!!!)
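A minimal sketch of step 1 for binary attributes (names are illustrative): pick the attribute whose split gives the largest reduction in entropy.

```python
import numpy as np

def entropy(y):
    """Entropy of a vector of class labels y."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_attribute(X, y):
    """Step 1: pick the binary attribute whose split gives the purest children."""
    X = np.asarray(X)                           # (n, d) array of 0/1 attributes
    best_gain, best_j = 0.0, None
    for j in range(X.shape[1]):
        mask = X[:, j] == 1
        if mask.all() or not mask.any():
            continue                            # attribute does not split the data
        child_entropy = (mask.mean() * entropy(y[mask])
                         + (~mask).mean() * entropy(y[~mask]))
        gain = entropy(y) - child_entropy       # information gain of this split
        if gain > best_gain:
            best_gain, best_j = gain, j
    return best_j
```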

Decision trees for Classification


Training time = find a good set of questions
Construct the tree, i.e. pick the question asked at each node of the
tree. This is typically done so as to make each of the child nodes
purer (lower entropy). Each leaf node will be associated with a
subset of the training examples.

Test time
Evaluate the tree by answering the questions sequentially, starting
from the root node. Once a particular leaf node is reached, we
predict the class to be the one with the most training examples at
this node.
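Both phases are available in standard libraries; a toy usage sketch with scikit-learn's DecisionTreeClassifier (the data here is made up for illustration):

```python
from sklearn.tree import DecisionTreeClassifier

# Toy 0/1-feature data: the label simply follows the first attribute.
X_train = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_train = [1, 1, 0, 0]

clf = DecisionTreeClassifier(criterion="entropy")  # choose splits by entropy reduction
clf.fit(X_train, y_train)                          # training time: construct the tree

print(clf.predict([[1, 1], [0, 1]]))               # test time: descend to a leaf, report its majority class
```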
