Information Retrieval From Document Images

Dr. Mallikarjun Hangarge

Karnatak Arts, Science and Commerce College, Bidar

Document Image: Why?
• Paperless Solution
– Efficient transfer
– Organization
– Convenience
• Access to a variety of content
– Universal reader: email, attachments, spreadsheets
– Don’t need original applications
How Do We Acquire Document Images?
• Scanner
• Camera
• Smart Phones
Where Do We Find Them?

Everywhere
What Can We Do With Them?
• Can we access them?
– Search
– Browse
– “Read”
• Can we index and retrieve them?
In their basic form, not really!
• We can
– View
– Print
– Not much else
Why?
[Figure: retrieval system overview. Numbered components: 1. Image ID, 2. Structure, 3. Decomposition, 4. Handwriting, 5. Stamps/Logos, 6. Zone Classification. Other labels: Query, Documents, Layout Similarity, Ranked Results, Images w/ Text, Genre Class, Classification Results.]
[Figure: processing pipeline. Labels include: Page Classification, Enhancement, Handprint Line Detection, Hand/Machine, Signature Detection, Noise, Page Decomposition, Zone Segmentation, Zone Labeling, Images w/o Text, Stamp and Logo Detection, Graphics. Target processing speed in seconds: < .5, .25-3, 1-3, 1-3.]
OCR
• Limited OCR
– Extensive history (other researchers' work)
– Engineering oriented (Hindi OCR, Bangla OCR)
– Needs significant normalization
• Degraded imagery
• Our priority
– Unknown heterogeneous content
– Graphics (logo/seal) and handwritten content
• Focus on developing a system for classification and retrieval of heterogeneous documents
Traditional Approach
• Conversion/OCR + Text Retrieval
– Advantages:
• Builds on prior information retrieval research
• Text retrieval is optimized against relevance
– Disadvantages: text retrieval is only as good as the OCR
• OCR accuracy varies widely
• Poor-quality images or unique fonts
• Character, word, or line segmentation is hard
• Can’t process graphical objects
• Page Structure Difficult to Preserve
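The conversion approach above can be sketched as follows. This is a minimal illustration, not the author's system: it assumes OCR has already produced one text string per page (the `ocr_pages` dictionary is made-up data), builds a plain inverted index, and shows how a single OCR character error hides a page from an exact-match query.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(pages):
    """Map each token to the set of page ids whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in tokenize(text):
            index[token].add(page_id)
    return index


def search(index, query):
    """Return ids of pages that contain every query token (boolean AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical OCR output: page 2 has "retrieval" misread as "retrieva1".
ocr_pages = {
    1: "Document image retrieval with word spotting",
    2: "Logo detection and retrieva1 in document images",
    3: "Handwritten memo, no index terms",
}

index = build_inverted_index(ocr_pages)
hits = search(index, "document retrieval")
print(sorted(hits))  # page 2 is missed because of the OCR error
```

Page 2 is about retrieval, but the misrecognized token never enters the index under the right key, which is the sense in which text retrieval is only as good as the OCR.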
Why (Document) Image Retrieval?
• Wide use of scanners has created huge document image collections:
– Historical documents: books, letters, manuscripts
– Paperless office: memos, letters, notes, forms, etc.
– Images are the common format for all of these
– The amount of content keeps increasing
• Research challenge:
– How can this content be made available to users in a searchable format?
Current Trends in Document Image Retrieval
• Search directly against pixel content:
– Page structure: layout analysis/classification
– Graphical objects: logos, signatures, seals
– Named entities, date fields
– Word spotting
• Drawbacks
– Don’t allow free-form queries
– Algorithms are tested on limited data sets and don’t scale
– Techniques are not compared to text retrieval
– No techniques are tested for user relevance
– Most techniques use training and recognition
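Word spotting, listed above, matches a query word image against candidate word images without recognizing characters. The slides do not say which method is used; the sketch below assumes one common formulation, vertical projection profiles compared with dynamic time warping (DTW). The tiny binary images and all function names are hypothetical.

```python
def column_profile(image):
    """Vertical projection profile: number of ink pixels in each column."""
    return [sum(col) for col in zip(*image)]


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def spot_word(query_image, candidates):
    """Rank candidate word images by DTW distance to the query (best first)."""
    q = column_profile(query_image)
    scored = [(dtw_distance(q, column_profile(img)), name)
              for name, img in candidates.items()]
    return sorted(scored)


# Hypothetical binary word images (1 = ink, 0 = background).
query = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
]
candidates = {
    "same_word_stretched": [
        [1, 1, 0, 1, 0],
        [1, 1, 0, 1, 0],
        [1, 1, 1, 1, 0],
    ],
    "other_word": [
        [0, 1, 1, 0],
        [1, 0, 0, 1],
        [0, 1, 1, 0],
    ],
}

ranked = spot_word(query, candidates)
print(ranked[0][1])  # best match by DTW distance
```

DTW tolerates the horizontal stretching that comes with different fonts or handwriting, which is why the wider copy of the query still ranks first. The query must itself be an image, which illustrates the free-form query drawback noted above.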
Large-Scale Image Access
• Retrieval:
– Features
– Indexing
– Querying and retrieval
• Our contributions
– Testing the potential of global and local attributes for word retrieval from Kannada documents
– Logo Detection in Document Images
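The three retrieval stages above (features, indexing, querying) can be sketched in a few lines. The slides do not define the global and local attributes actually used, so the ones below are assumptions chosen for illustration: aspect ratio and overall ink density as global features, and per-strip ink density as local features. Ranking uses plain Euclidean distance.

```python
import math


def global_features(image):
    """Whole-image attributes: width/height ratio and fraction of ink pixels."""
    height, width = len(image), len(image[0])
    ink = sum(sum(row) for row in image)
    return [width / height, ink / (width * height)]


def local_features(image, zones=2):
    """Ink density inside each of `zones` vertical strips of the image."""
    height, width = len(image), len(image[0])
    feats = []
    for z in range(zones):
        start, end = z * width // zones, (z + 1) * width // zones
        ink = sum(sum(row[start:end]) for row in image)
        feats.append(ink / (height * max(end - start, 1)))
    return feats


def describe(image):
    """Feature stage: concatenate global and local attributes."""
    return global_features(image) + local_features(image)


def build_index(images):
    """Indexing stage: store one feature vector per image name."""
    return {name: describe(img) for name, img in images.items()}


def retrieve(index, image, top_k=3):
    """Query stage: return the names of the top_k nearest indexed images."""
    q = describe(image)
    ranked = sorted(index.items(), key=lambda item: math.dist(q, item[1]))
    return [name for name, _ in ranked[:top_k]]


# Hypothetical binary word images (1 = ink, 0 = background).
collection = {
    "left_heavy": [[1, 1, 0, 0], [1, 1, 0, 0]],
    "right_heavy": [[0, 0, 1, 1], [0, 0, 1, 1]],
    "full": [[1, 1, 1, 1], [1, 1, 1, 1]],
}
feature_index = build_index(collection)
results = retrieve(feature_index, [[1, 1, 0, 0], [1, 0, 0, 0]])
print(results)
```

A linear scan over the index is enough for a toy collection; at large scale the same feature vectors would typically go into an approximate nearest-neighbor structure instead.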
Word Spotting
Logo Detection and Retrieval

K C Santosh et al. 2015
