
10 Things You Didn't Know About KTM

David.Wright@kofax.com Solution Enablement Specialist

What is KTM?


Kofax's Answer to Document Drudgery



What is KTM?

Kofax Intelligent Document Recognition Solution/Toolkit/Platform



The Golden Rule of KTM

Automation?
User productivity?

Benefits of User Productivity

Wholesaler opens its 17th store


Pan-European Wholesaler                            Invoices/person/day   Productivity improvement
Manual processing without Kofax                    800
After 3 months of Accuracy effort by PS            1200                  +50%
After 2 weeks of User Productivity effort by PS    2500                  >3x

The Fallacy of OCR Accuracy


What OCR accuracy do you have?
What is the straight-through processing rate?
How much can we automate?

85% straight-through processing
23 fields: 99.29% field accuracy
6 chars/field: 99.89% character accuracy
What is the cost of the other 15%?

You will lose this deal against an OCR provider because this deal is being fought over features and tech, and not business value.
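The accuracy figures on this slide hang together if errors are assumed to be independent: a field is right only when every character in it is right, and a document passes straight through only when every field on it is right. A minimal sketch of that arithmetic follows; the function names are illustrative, and the independence assumption is not stated on the slide.

```python
def field_accuracy(char_accuracy: float, chars_per_field: int) -> float:
    """Probability that every character in a field is read correctly,
    assuming character errors are independent (an assumption)."""
    return char_accuracy ** chars_per_field


def straight_through_rate(field_acc: float, fields_per_doc: int) -> float:
    """Probability that every field on a document is correct,
    assuming field errors are independent (an assumption)."""
    return field_acc ** fields_per_doc


if __name__ == "__main__":
    per_field = field_accuracy(0.9989, 6)
    per_doc = straight_through_rate(0.9929, 23)
    print(f"field accuracy from 99.89% characters: {per_field:.2%}")
    print(f"straight-through rate from 99.29% fields: {per_doc:.2%}")
```

Under these assumptions, 99.29% field accuracy over 23 fields comes out at roughly 85% straight-through processing, so even very high character accuracy still leaves about 15% of documents for a person to handle.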

Productivity vs Automation

Productivity
Documents/person/day
User focused
Business value
Optimizing core business processes
Usability/comfort
8 hrs/day
Saving $
Limit = ∞

Automation
Accuracy numbers
Technology focused
Impossible to convert to ROI
Technology
Diminishing returns
Limit = 100%

Anyone can do KTM

Classify Separate Extract Folder Validate Learning



All you need is paper and highlighters

Classify Separate Extract Folder Validate



Doing KTM by hand. Paper to Excel.


Classic

vs

Quantum

Schrödinger

Newton

Einstein

Bohr

"God doesn't play dice." "Spooky action at a distance."


Programmable vs Learning Software

Programmable: deterministic, logic, rules, laws, order
Learning: probabilistic, data-driven, machine learning

Analytics

Transition from Determinism to Data-driven/Fuzzy/Quantum

Physics: 1890-1920 (classical to quantum)

Mathematics: 1931 (a system cannot demonstrate its own consistency; Kurt Gödel's incompleteness theorem)

Computer Science: 1970-1990 (machine learning, neural networks, speech recognition, machine translation)

Business: 2000-2020 (big data, analytics, learning systems)

EU's Human Brain Project & USA's BRAIN Initiative in 2013


About €1 billion from the EU over 10 years to build a human brain simulator to push forward brain research and test brain diseases.
$100M from the US government to revolutionize our understanding of the human mind and uncover new ways to treat, prevent, and cure brain disorders.

You don't program it, it learns



Don't program KTM, teach it


Robodog will bite you


Doing KTM by hand. Paper to Excel.


Field Analysis
File: 1.tif 2.tif 3.tif 4.tif 5.tif 6.tif 7.tif 8.tif 9.tif 10.tif 11.tif
Class: CapaLote Duplicata Duplicata Duplicata Duplicata Duplicata Capa
123987-2012 123987-2012 123987-2012 123987-2012 123987-2012 123987-2012
852147-A 1489/1 3112230U 4012391 3065357 F 65357 194.580 000.022.875 112230 194.562
60.000,00 07.248.659/0001-03
15.963,57 17.155.342/0003-45
2.195,30 86.438.280/0001-30
81.045,00 10.932.276/0001-61
1009,11 80.089.964/0001-97
7.981,39 80.089.964/0001-97
48.741,92 76.777.556/0001-50
32.650,74 56.990.526/0001-10
2.195,30 86.438.280/0001-30
7.454,92 03.364.370/0001-46
15/02/2013 01/12/2011 22/12/2011 14/12/2011 27/10/2011 15/02/2013 21/01/2012 19/01/2012 23/01/2012 21/09/2012
Y Y Y Y Y
Nro_DOC NOTA CPFC EMIS VENC EMIT

Nota Fiscal 123987-2012 Nota Fiscal 123987-2012 Nota Fiscal 123987-2012 Nota Fiscal 123987-2012 Nota Fiscal 123987-2012


Overview of fields to extract What a customer typically gives


field format number of characters Document type Loss x New ones have no ref-nr! x Payment Rates multiinvoice Possibly x Multiple more than x Ref-Nr per one per doc Document only for only for validation, validation, x if existing if existing only for only for validation, validation, x if existing if existing Budget Invoice x Validation with

Reference number Debtor Last Name Debtor First Name Debtor Street Debtor House number Debtor Address2 Debtor PostCode Debtor City

numeric

6-7

CIP database CIP database in combination with RefNr CIP database in combination with RefNr CIP database in combination with RefNr CIP database in combination with RefNr CIP database in combination with RefNr CIP database in combination with RefNr CIP database in combination with RefNr CIP database in combination with RefNr. If there is no other number in database, then manual validation.

Text

unlimited

Text

unlimited

Text

unlimited

numeric

unlimited

Text

unlimited

x only for Swiss only validation, if existing only for validation, x if existing only for validation, if existing only for validation, if existing

numeric

Swiss only Swiss only

Text

unlimited

Debtor Telephone

numeric

+42 more rows



The most successful KTM projects focus on the user.


Make your users happy and content.

KTM is their workplace all day every day.

It is the place of encounter and collaboration between human and robot.


Human Computer Interaction


Validation Experience
Result Type      Correct  Valid  User Experience
True Positive    yes      yes    Perfect! Touchless processing. Automation.
False Negative   yes      no     User must press ENTER.
True Negative    no       no     User corrects/types data.
False Positive   no       yes    Loss of trust. Drop of productivity. Bad data leaves Kofax.
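The four result types can be read as the combination of two yes/no facts about a field: whether the extracted value is actually correct, and whether KTM marked it valid. A minimal sketch of that mapping, assuming both flags are available from a benchmark against known-good data; the function names and sample values are hypothetical and not part of KTM.

```python
from collections import Counter


def result_type(correct: bool, valid: bool) -> str:
    """Map the two flags onto the four validation outcomes."""
    if correct and valid:
        return "True Positive"    # touchless processing
    if correct and not valid:
        return "False Negative"   # user must press ENTER to confirm
    if not correct and not valid:
        return "True Negative"    # user corrects or types the data
    return "False Positive"       # bad data leaves the system unnoticed


def summarize(fields):
    """Count outcomes for an iterable of (correct, valid) pairs."""
    return Counter(result_type(c, v) for c, v in fields)


if __name__ == "__main__":
    # Made-up sample flags, for illustration only.
    sample = [(True, True), (True, True), (True, False),
              (False, False), (False, True)]
    for name, count in summarize(sample).items():
        print(f"{name}: {count}")
```

Counting outcomes this way keeps the focus on what the user experiences per field, rather than on a single accuracy percentage.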

KTM Customer Query


This [deal] was sold on the strength of KTM being able to classify and extract data from items received. This was then used to calculate the RETURN on INVESTMENT (ROI) which enabled them to purchase the solution. The ROI was calculated with the reasonable estimate of 65% automated processing. I would expect that we should realistically see 80% to 90% automated processing of inbound items. That said, someone communicated to the client that the best they were going to see was 15% to 20% automated processing. This obviously sent the client reeling that they weren't going to see anything close to their expected ROI and would potentially damage their business and not see the benefits from the system as expected.

So what is a reasonable expectation of KTM?


Reasonable Customer Expectations


KTM should be able to significantly improve user productivity (perhaps 1.5-10x).

KTM will be able to extract information perfectly from readable and known documents.

KTM should be able to learn how to understand readable and unknown documents.

KTM's value is in improving documents/person/day and transactions/second (TPS).

You will have access to near real-time performance graphs that can be used to optimize user experience and data throughput.


Benchmark Before


Benchmark During


Benchmark After


US invoices known vendors


Goals of every KTM Project

1. Human Productivity
2. Eliminate False Positives: bad data leaving Kofax
3. Reduce False Negatives: user pressing ENTER
4. Few True Negatives: OCR accuracy, database problems & learning

Fuzziness is your friend


Kofax brings messy data from the real world into the clear digital world


Fuzziness

Fuzziness is not: Random, Unpredictable, Unreliable, Complex

Fuzziness is: Simple, Learning, Flexible, Tolerant

Fuzzy software you love: Google Autocomplete, Spell checkers, Grammar checkers, Spam filter, "Users who read this book"

Top Names US Census 2005

Rank  Male     Female     Surname
1     JAMES    MARY       SMITH
2     JOHN     PATRICIA   JOHNSON
3     ROBERT   LINDA      WILLIAMS
4     MICHAEL  BARBARA    JONES
5     WILLIAM  ELIZABETH  BROWN
6     DAVID    JENNIFER   DAVIS
7     RICHARD  MARIA      MILLER
8     CHARLES  SUSAN      WILSON
9     JOSEPH   MARGARET   MOORE
10    THOMAS   DOROTHY    TAYLOR

Swiss Forenames

Rank  Forename   Share
1     Peter      0.80%
2     Daniel     0.80%
3     Hans       0.67%
4     Christian  0.60%
5     Thomas     0.53%
6     Walter     0.52%
7     Michel     0.49%
8     Martin     0.46%
9     René       0.45%
10    Markus     0.45%
11    Josef      0.44%
12    Patrick    0.43%
13    André      0.42%
14    Bruno      0.41%
15    Philippe   0.40%
16    Maria      0.40%
17    Andreas    0.40%
18    Roland     0.39%
19    Paul       0.39%
20    Marcel     0.39%
21    Werner     0.37%
22    Antonio    0.36%
23    Pierre     0.35%
24    Urs        0.34%
25    Elisabeth  0.34%


Uses

37,691,912 citizens


KTM is the heart of Kofax.


Touchless Processing


Search & Match Server


KTM Search&Match Server

Database Fuzzy Index

CSV File Database Center Firewall

SQL Database
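The idea of a fuzzy index can be illustrated with Python's standard difflib module: an OCR result with a misread character is matched to the closest entry in a list of known values. This is only a sketch of the general technique, not how the Search & Match Server is implemented; the surname list comes from the US names slide, and the cutoff value is an arbitrary assumption.

```python
import difflib

# Known values, e.g. loaded from a CSV file or an SQL database.
SURNAMES = ["SMITH", "JOHNSON", "WILLIAMS", "JONES", "BROWN",
            "DAVIS", "MILLER", "WILSON", "MOORE", "TAYLOR"]


def fuzzy_lookup(ocr_text, candidates=SURNAMES, cutoff=0.7):
    """Return the closest known value to ocr_text, or None if nothing
    is similar enough. cutoff is an assumed similarity threshold."""
    matches = difflib.get_close_matches(ocr_text.upper(), candidates,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for raw in ["SM1TH", "JOHNS0N", "W1LLIAMS", "XQZ"]:
        print(raw, "->", fuzzy_lookup(raw))
```

A lookup like this tolerates a single misread character without any hand-written correction rules, which is the sense in which fuzziness helps rather than hurts.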


PDF
PDF vs TIFF
PDF/A is an ISO standard (ISO 19005-1:2005, ISO 19005-2:2011) and future-safe.
TIFF: "Thousands of Incompatible File Formats".
Baseline TIFF readers don't have to be able to read Group 4.
Any computer can read a PDF, and Chrome can do so natively. TIFF viewers need to be installed.
PDF has layers, TIFF does not.
Searchable PDF compresses better.
TIFFs can be manipulated.
PDFs have certificates, encryption, DRM, etc.

PDF High Compression Should be in every project

602 kb
553 kb

76 kb (87%)

B&W PDF

117 kb
114 kb

47 kb
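The 87% shown next to the 76 kb figure matches the size reduction relative to the 602 kb file. A small sketch of that arithmetic follows; which file format each size belongs to is not stated here, so the function only compares two sizes.

```python
def size_reduction(original_kb: float, compressed_kb: float) -> float:
    """Fraction by which the file shrank, e.g. 0.87 for an 87% reduction."""
    return 1 - compressed_kb / original_kb


if __name__ == "__main__":
    print(f"602 kb -> 76 kb: {size_reduction(602, 76):.0%}")
    print(f"117 kb -> 47 kb: {size_reduction(117, 47):.0%}")
```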

Atalasoft for PDF generation
