
The Deep Web

Visit www.seminarlinks.blogspot.in to Download


Surface Web

 The surface Web is that portion of the World Wide Web that is indexable by conventional search engines.

 It is also known as the Clearnet, the visible Web or indexable Web.

 Eighty-five percent of Web users use search engines to find needed information, but nearly as high a
percentage cite the inability to find desired information as one of their biggest frustrations.

 A traditional search engine sees only a tiny fraction of the information that's available: about 0.03%
[source: OEDB].
Deep Web - Introduction
 The Deep Web is World Wide Web content that is not part of the Surface Web, which is indexed
by standard search engines.

 It is also called the Deepnet, Invisible Web or Hidden Web.

 It is the fastest-growing category of new information on the Internet.

 It holds 400 to 550 times more public information than the Surface Web.

 Its total quality content is estimated at 1,000 to 2,000 times that of the Surface Web.
History
 Jill Ellsworth used the term invisible Web in 1994 to refer to websites that were not registered
with any search engine.

 Mike Bergman cited a January 1996 article by Frank Garcia:


“It would be a site that's possibly reasonably designed, but they didn't bother to register it with
any of the search engines. So, no one can find them! You're hidden. I call that the invisible Web”.
 Another early use of the term Invisible Web was by Bruce Mount and Matthew B. Koll of Personal
Library Software in 1996.

 The first use of the specific term Deep Web, now generally accepted, occurred in a 2001 study
by Bergman.
How search engines work
 Search engines construct a database of the Web by using programs called spiders or Web crawlers
that begin with a list of known Web pages.

 The spider gets a copy of each page and indexes it, storing useful information that will let the page
be quickly retrieved again later.

 Any hyperlinks to new pages are added to the list of pages to be crawled.

 Eventually all reachable pages are indexed, unless the spider runs out of time or disk space.

 The collection of reachable pages defines the Surface Web.
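
The crawl loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine is implemented: the `LINKS` graph, the page names, and the `crawl` function are all hypothetical, and an in-memory dictionary stands in for real HTTP fetching and HTML parsing. The `max_pages` limit plays the role of the spider running out of time or disk space.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "home": ["about", "catalog"],
    "about": ["home"],
    "catalog": ["item-1", "item-2"],
    "item-1": [],
    "item-2": ["catalog"],
    "orphan-report": ["home"],  # links out, but no page links to it
}


def crawl(seeds, links, max_pages=None):
    """Breadth-first crawl; returns pages in the order they were indexed."""
    frontier = deque(seeds)   # the list of pages still to be crawled
    seen = set(seeds)         # pages already queued, to avoid repeats
    indexed = []
    while frontier:
        if max_pages is not None and len(indexed) >= max_pages:
            break             # the spider ran out of budget
        page = frontier.popleft()
        indexed.append(page)  # "index" the page
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return indexed


surface = crawl(["home"], LINKS)
deep = sorted(set(LINKS) - set(surface))
print("Reached by the crawler:", surface)
print("Never reached:", deep)
```

In this toy graph the page `orphan-report` is never reached because no crawled page links to it, which is one way content ends up outside the Surface Web.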


Contents
 Dynamic Content

 Unlinked content

 Private Web

 Contextual Web

 Limited access content

 Scripted content

 Non-HTML/text content
 Dynamic content
• Dynamic pages are returned in response to a submitted query or are accessible only
through a form.
• They are especially hard to crawl when open-domain input elements (such as text fields) are used.
• Such fields are hard to navigate without domain knowledge.

 Unlinked Content
• Pages that are not linked to by other pages.
• The lack of links may prevent web crawling programs from accessing the content.
• This content is referred to as pages without backlinks (or inlinks).
 Private Web: sites that require registration and login (password-protected resources).

 Contextual Web: pages with content varying for different access contexts (e.g., ranges
of client IP addresses or previous navigation sequence).

 Limited access content: sites that limit access to their pages in a technical way (e.g.,
using the Robots Exclusion Standard, CAPTCHAs, or no-cache Pragma HTTP headers, which
prohibit search engines from browsing them and creating cached copies).
 Scripted content: pages that are only accessible through links produced by JavaScript, as well as
content dynamically downloaded from Web servers via Flash or Ajax solutions.

 Non-HTML/text content: textual content encoded in multimedia (image or video) files or specific
file formats not handled by search engines.
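
As a small illustration of the Robots Exclusion Standard mentioned under limited access content, the sketch below uses Python's standard `urllib.robotparser` module to check which URLs a well-behaved crawler would skip. The `robots.txt` rules, the `example.com` URLs, the `ExampleBot` agent name, and the `may_crawl` helper are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all crawlers out of the member area
# and out of the query-driven search pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, agent="ExampleBot"):
    """Return True if the rules above allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


for url in (
    "http://example.com/index.html",
    "http://example.com/members/profile",
    "http://example.com/search",
):
    print(url, "->", "crawl" if may_crawl(url) else "skip")
```

Pages that a compliant crawler skips in this way are never indexed, so they remain part of the deep Web even though anyone with the URL can open them.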
Deep Potential
 The deep Web is an endless repository for a mind-reeling amount of information.

 It's powerful. It unleashes human nature in all its forms, both good and bad.

 There are engineering databases, financial information of all kinds, medical papers, pictures, illustrations ... the list
goes on, basically, forever.

 For example, construction engineers could potentially search research papers at multiple universities in order to
find the latest and greatest in bridge-building materials.

 Doctors could swiftly locate the latest research on a specific disease.

 The potential is unlimited. The technical challenges are daunting. That's the draw of the deep Web.
Shadow Land
 The deep Web may be a shadow land of untapped potential.

 The bad stuff, as always, gets most of the headlines.

 You can find illegal goods and activities of all kinds through the dark Web.

 That includes illicit drugs, child pornography, stolen credit card numbers, human trafficking, weapons, exotic
animals, copyrighted media and anything else you can think of.

 Theoretically, you could even, say, hire a hit man to kill someone you don't like.

 But you won't find this information with a Google search.

 These kinds of Web sites require you to use special software, such as The Onion Router, more commonly known
as Tor.
The Onion Router (TOR)
 Tor is software, most commonly used through the Tor Browser, that sets up the specific connections you
need to access dark Web sites.

 Critically, it is free software for enabling online anonymity and censorship resistance.

 Onion routing wraps Internet communications in several layers of encryption; each relay along the route
removes one layer, similar to peeling back the layers of an onion.

 Using Tor makes it more difficult to trace Internet activity, including "visits to Web sites, online posts, instant
messages, and other communication forms", back to the user.

 It is intended to protect the personal privacy of users, as well as their freedom and ability to conduct
confidential business by keeping their internet activities from being monitored.
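
The layering idea behind onion routing can be illustrated with a toy sketch. This is not Tor's actual protocol or cryptography: the XOR keystream "cipher" below is a deliberately simple stand-in built from `hashlib`, it is not secure, and the relay keys and function names are hypothetical. It only shows that the sender adds one layer per relay and that the message becomes readable once every relay has removed its own layer.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer; XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: bytes, relay_keys: list) -> bytes:
    """Sender: apply the last relay's layer first, the first relay's last."""
    for key in reversed(relay_keys):
        message = xor_layer(key, message)
    return message


def route(packet: bytes, relay_keys: list) -> bytes:
    """Each relay in turn peels off the layer that matches its own key."""
    for key in relay_keys:
        packet = xor_layer(key, packet)
    return packet


# Hypothetical keys shared between the sender and three relays.
keys = [b"entry-relay", b"middle-relay", b"exit-relay"]
message = b"hello from the sender"

packet = wrap(message, keys)
after_first_hop = xor_layer(keys[0], packet)
delivered = route(packet, keys)
print(delivered)
```

After the first relay has removed its layer, the packet is still unreadable; only after the last layer is removed does the original message reappear. In a real onion-routing network each layer would use proper authenticated encryption and would also carry the address of the next hop.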
The Onion Router (TOR) (continued)
 Instead of seeing domains that end in .com or .org, these hidden sites end in .onion.

 The most infamous of these onion sites was the now-defunct Silk Road, an online marketplace where
users could buy drugs, guns and all sorts of other illegal items.

 The FBI eventually captured Ross Ulbricht, who operated Silk Road, but copycat sites like Black Market
Reloaded are still readily available.

 Tor is the result of research done by the U.S. Naval Research Laboratory, which created Tor for political
dissidents and whistleblowers, allowing them to communicate without fear of reprisal.

 Tor was so effective in providing anonymity for these groups that it didn't take long for the criminally-
minded to start using it as well.
Silk Road Website
U.S. authorities shut down Silk Road after the alleged owner of the site, Ross William Ulbricht,
was arrested.
Money-related transactions
 You may wonder how any money-related transactions can happen when sellers and buyers can't
identify each other.

 That's where Bitcoin comes in.

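 Bitcoin is basically a digital currency secured by cryptography.

 Like regular cash, Bitcoin is good for transactions of all kinds, and notably, it also allows for a degree of
anonymity: transactions are recorded publicly but are tied to addresses rather than names, so a purchase,
illegal or otherwise, is hard to trace to a person.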

 When paired properly with TOR, it's perhaps the closest thing to a foolproof way to buy and sell on
the web.
The Brighter Side of Darkness
 The deep Web is home to alternate search engines, e-mail services, file storage, file sharing, social
media, chat sites, news outlets and whistleblowing sites, as well as sites that provide a safer meeting
ground for political dissidents and anyone else who may find themselves on the fringes of society.

 In an age where NSA-type surveillance is omnipresent and privacy seems like a thing of the past, the
dark Web offers some relief to people who prize their anonymity.

 Bitcoin may not be entirely stable, but it offers privacy, which is something your credit card company
most certainly does not.

 For citizens living in countries with violent or oppressive leaders, the dark Web offers a more secure way
to communicate with like-minded individuals.
Invisible Web Search Tools
• A List of Deep Web Search Engines – Purdue OWL’s Resources to Search the Invisible Web
• Art – Musée du Louvre
• Books Online – The Online Books Page
• Economic and Job Data – FreeLunch.com
• Finance and Investing – Bankrate.com
• General Research – GPO’s Catalog of US Government Publications
• Government Data – Copyright Records (LOCIS)
• International – International Data Base (IDB)
• Law and Politics – THOMAS (Library of Congress)
• Library of Congress – Library of Congress
• Medical and Health – PubMed
• Transportation – FAA Flight Delay Information
Future

 The lines between search engine content and the deep Web have begun to blur, as search services
start to provide access to part or all of once-restricted content.

 An increasing amount of deep Web content is opening up to free search as publishers and libraries
make agreements with large search engines.

 In the future, deep Web content may be defined less by opportunity for search than by access fees or
other types of authentication.
Conclusion
 The deep web will continue to perplex and fascinate everyone who uses the internet.

 It contains an enthralling amount of knowledge that could help us evolve technologically and as a
species when connected to other bits of information.

 And of course, its darker side will always be lurking, too, just as it always does in human nature.

 The deep web speaks to the fathomless, scattered potential of not only the internet, but the human
race, too.
References

 http://computer.howstuffworks.com/internet/basics/how-the-deep-web-works5.htm
 http://oedb.org/ilibrarian/invisible-web/
 http://en.wikipedia.org/wiki/Deep_Web
 http://money.cnn.com/infographic/technology/what-is-the-deep-web/?iid=EL
 http://en.wikipedia.org/wiki/Surface_Web
Thank You